
Improved importance sampling for Beckmann and GGX
Closed, Public

Authored by Brecht Van Lommel (brecht) on Jun 1 2014, 1:42 AM.

Details

Summary

Based on this paper, which came out a few days ago and comes with sample code:

Importance Sampling Microfacet-Based BSDFs using the Distribution of Visible Normals.
E. Heitz and E. d'Eon, EGSR 2014

http://hal.inria.fr/hal-00996995/en

Notes:

  • Still many optimizations and cleanups needed; this is just a proof of concept.
  • The sampling function uses erf while eval uses an approximation to compute D; these should be made consistent. Importance sampling using the approximation may be faster.
  • erf_inv was copied from boost to avoid a compile error, because we compile without RTTI. This function also does not seem to be available in OpenCL. We may end up not needing it if we can use the approximation for sampling.
  • Their sample code (though not this patch) supports anisotropic Beckmann and GGX; it would be nice not to need a different distribution for that.
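
For readers unfamiliar with erf_inv, here is a minimal Python sketch of one way to compute an inverse error function without library support. It is not the boost code copied into the patch. It assumes Winitzki's closed-form approximation (constant a = 0.147) as a starting guess, refined by a couple of Newton steps; the function name and step count are illustrative only.

```python
import math


def erfinv_sketch(y: float, newton_steps: int = 2) -> float:
    """Approximate inverse error function for -1 < y < 1 (illustrative only)."""
    if not -1.0 < y < 1.0:
        raise ValueError("erfinv_sketch is only defined here for -1 < y < 1")
    if y == 0.0:
        return 0.0

    # Winitzki's closed-form approximation, used as the initial guess.
    a = 0.147
    log_term = math.log(1.0 - y * y)
    t = 2.0 / (math.pi * a) + 0.5 * log_term
    x = math.copysign(math.sqrt(math.sqrt(t * t - log_term / a) - t), y)

    # Newton refinement on f(x) = erf(x) - y, with f'(x) = 2/sqrt(pi) * exp(-x^2).
    for _ in range(newton_steps):
        x -= (math.erf(x) - y) * (math.sqrt(math.pi) / 2.0) * math.exp(x * x)
    return x
```

The Newton steps rely on an erf implementation, so on a platform without erf the initial guess alone (or an approximate erf) would have to do.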

Diff Detail

Branch
ggx-sample

Event Timeline

Suzanne with grey background, 2 glossy bounces.

Beckmann

GGX

Basically this eliminates the excessive noise at grazing angles. The comparison uses roughness 0.4 to make the noisy area bigger, but the same happens for the smaller noisy area at lower roughness.

The above renders were done with 4 samples; below are some renders with more samples.

GGX, 16 samples

GGX, 64 samples

Very cool!

Is the plan to evaluate whether this can outright replace the current sampling? From a quick glance through the paper, there do not really seem to be any downsides.
I am asking because I am not sure how to handle all the possible combinations of options, and I am not sure that all subsets and combinations of Cycles options build as it is.

This is not criticism; I am asking how we are going to handle that, also taking into account things like people wanting to do Metropolis, bidir, etc.

The only possible downside seems to be performance; we'll have to see the impact once the code is optimized.

Besides that, there is no problem for things like bidir or MLT: it's only a local change in the BSDF code that other parts of the code do not need to be concerned with. The BSDF converges to the same result, just better importance sampled.
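
To illustrate the last point with a toy case unrelated to the actual BSDF code: the sketch below estimates the integral of x^2 over [0, 1] (exactly 1/3) with uniform sampling and with samples drawn from the pdf p(x) = 2x. The integrand, the pdf and the function name are assumptions chosen for illustration. Both estimators have the same expected value; only the variance differs.

```python
import math
import random


def estimate(sample_count: int, importance: bool, seed: int = 1):
    """Monte Carlo estimate of the integral of x^2 on [0, 1]; returns (mean, variance)."""
    rng = random.Random(seed)
    weights = []
    for _ in range(sample_count):
        u = 1.0 - rng.random()  # uniform in (0, 1], avoids a zero pdf below
        if importance:
            x = math.sqrt(u)  # inverse CDF of p(x) = 2x
            weights.append((x * x) / (2.0 * x))  # f(x) / p(x)
        else:
            x = u  # uniform sampling, p(x) = 1
            weights.append(x * x)
    mean = sum(weights) / sample_count
    variance = sum((w - mean) ** 2 for w in weights) / (sample_count - 1)
    return mean, variance
```

Calling `estimate(20000, False)` and `estimate(20000, True)` should give means close to 1/3 in both cases, with a noticeably smaller variance for the importance-sampled version.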

And yes, this can replace the current Beckmann and GGX sampling; there is no point keeping the old one if performance is close enough.

Here is a test with a scene of mine; noise is definitely better. Beckmann is used here: the upper half of the wall is just glossy, the lower half a Diffuse/Glossy mix.


Brecht Van Lommel (brecht) updated this revision. Jun 1 2014, 2:08 PM

Fix wrong transmission eval, lots of code cleaning.

Brecht Van Lommel (brecht) updated this revision. Jun 1 2014, 2:10 PM

Game of Trigonometry

Nice work. Render time with the latest patch (Game of Trigonometry) is 1:39 min for my scene now ^^

modified bmw file with branched path

64 aa * 16/20 glossy

master (3.03) vs patch (3.09)

master (3.03) vs patch (3.09) would imply a 3.17% slowdown. The authors of the paper seemed to be in the 3-4% region as well: 3844 vs 4000 samples in 12 minutes for one of the images in the paper.
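
The arithmetic behind these percentages can be checked with a short sketch. It assumes the render times are written as minutes.seconds (the thread does not state the unit), and the helper name is made up for illustration. Under that assumption the 3.17% figure is the difference relative to the patched time; relative to master it comes out slightly higher.

```python
def to_seconds(minutes_dot_seconds: str) -> int:
    """Convert a 'M.SS' string to seconds (assumed notation, not confirmed by the thread)."""
    minutes, seconds = minutes_dot_seconds.split(".")
    return int(minutes) * 60 + int(seconds)


master = to_seconds("3.03")  # 183 s
patch = to_seconds("3.09")   # 189 s

slowdown_vs_master = (patch - master) / master  # about 3.28%
slowdown_vs_patch = (patch - master) / patch    # about 3.17%

# Sample counts quoted from the paper for equal render time:
# the lower count is 3.9% below the higher one.
paper_throughput_drop = 1.0 - 3844 / 4000
```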

I can't build it myself, so I'd like to share this scene for testing, since it is hard to get rid of glossy noise without branched PT and cranking up the glossy rays.
I removed the furniture, since it's client work. Once you have tested, please tell me, as I need to take the download down.

I'd suggest checking the false ceiling at the top of the image; it is a glossy wood that is quite noisy with path tracing. If you want to post a comparison here, that is OK with me.
It is already set up and should render in a couple of minutes at 64 spp on CPU. I used indirect clamping, MIS on area lamps, deactivated MIS on the background emitter outside, etc.

interior-glossy-IS

Another render, this time with DOF :)

200 aa * 8 glossy

patch (5.18)

200 aa * 10 glossy

master (5.06)

200 aa * 5 glossy

patch (3.34)

Tested Marco's scene.

58s (vanilla)

62s (patch)

Also tested marcog's scene :)

the key here is "sample all lights"

master 45 sec

patch 47 sec

Thanks for the performance comparisons.

Beckmann is still expected to be significantly slower because of the use of the erf and erf_inv functions. I'm not sure yet if I can replace them. Perhaps a lookup table will be needed, but I'll first see if a closed-form formula is possible. GGX should be better, though some minor improvements are still possible there.

Hi all, I made some tests with this patch and got errors when trying GPU rendering.

/daten/blender-git/build/bin/2.70/scripts/addons/cycles/kernel/closure/../closure/bsdf_microfacet.h(293): error: Within a __device__/__global__ function, only __shared__ variables may be marked "static"

21 errors detected in the compilation of "/tmp/tmpxft_000012db_00000000-6_kernel.cpp1.ii".
CUDA kernel compilation failed, see console for details.

Same error on different line numbers.

Opensuse 13.1/64
Intel i5 3770K
GTX 760 4 GB (Display)
GTX 560Ti 448 Cores
Driver 331.67

Master edfd989

Cheers, mib

Brecht Van Lommel (brecht) updated this revision. Jun 2 2014, 2:20 PM

Some minor optimizations to avoid doing computations twice.

GGX sampling code is roughly as fast as the old code now. However, overall render
time is still a few percent slower because there are fewer wasted samples that end
up terminating the path, and so more rays are traced.
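
To give a feel for what a "wasted sample" is, the sketch below counts how often reflecting the view direction about a sampled microfacet normal yields a direction below the surface. It assumes the old code drew normals proportionally to D(m) cos(theta_m), the common GGX sampling scheme from Walter et al. 2007; that assumption, the function name and the parameters are illustrative and not taken from the patch.

```python
import math
import random


def wasted_fraction(theta_i_deg: float, alpha: float,
                    samples: int = 20000, seed: int = 7) -> float:
    """Fraction of D-sampled GGX normals whose reflection is unusable."""
    rng = random.Random(seed)
    theta_i = math.radians(theta_i_deg)
    wi = (math.sin(theta_i), 0.0, math.cos(theta_i))  # view direction, z is the surface normal
    wasted = 0
    for _ in range(samples):
        u1, u2 = rng.random(), rng.random()
        # Sample a microfacet normal m: tan^2(theta_m) = alpha^2 * u1 / (1 - u1).
        tan_theta_m = alpha * math.sqrt(u1 / (1.0 - u1))
        cos_theta_m = 1.0 / math.sqrt(1.0 + tan_theta_m * tan_theta_m)
        sin_theta_m = cos_theta_m * tan_theta_m
        phi = 2.0 * math.pi * u2
        m = (sin_theta_m * math.cos(phi), sin_theta_m * math.sin(phi), cos_theta_m)
        wi_dot_m = wi[0] * m[0] + wi[1] * m[1] + wi[2] * m[2]
        # z component of the reflected direction wo = 2 (wi . m) m - wi.
        wo_z = 2.0 * wi_dot_m * m[2] - wi[2]
        if wi_dot_m <= 0.0 or wo_z <= 0.0:
            wasted += 1  # back-facing microfacet or reflection below the surface
    return wasted / samples
```

Comparing `wasted_fraction(0.0, 0.4)` with `wasted_fraction(85.0, 0.4)` should show a much larger wasted fraction at the grazing angle, which is where sampling only visible normals helps most.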

Beckmann sampling seems harder to optimize than I hoped. Still stuck with 2x exp,
2x erfinv and 1x erf calls, and BSDF sampling code taking about twice as long as
before.

GPU

Simplify erfinv implementation so it compiles with CUDA/OpenCL.

Render result still differs from CPU (also for GGX, so not because of erfinv)
for an unknown reason.

The latest patch makes my corridor scene render 1s faster (1:38 min now), but it looks different from the previous version?

Game of Trigonometry patch: http://www.pasteall.org/pic/show.php?id=72156
Minor optim patch (latest): http://www.pasteall.org/pic/show.php?id=72158

@Thomas Dinges (dingto): strange, I see no difference in any scenes here.

Hi, I made a few tests again and found a glitch.

Tested with testbuild2 and b460674 patched, GGX and Beckmann, CPU and GPU.

Opensuse 13.1/64
Intel i5 3770K
GTX 760 4 GB (Display)
GTX 560Ti 448 Cores
Driver 331.67

Cheers, mib

Brecht Van Lommel (brecht) updated this revision. Jun 3 2014, 9:23 PM
  • Fix Beckmann giving a different render result in some cases.
  • Use approximate erf implementation.
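
The approximation used in the patch is not shown in this thread. As a stand-in, here is a sketch of a well-known one, Abramowitz and Stegun formula 7.1.26, which needs a single exp call and has a maximum absolute error of about 1.5e-7. The function name is illustrative.

```python
import math


def erf_approx(x: float) -> float:
    """Approximate erf(x) using Abramowitz and Stegun 7.1.26 (illustrative stand-in)."""
    sign = -1.0 if x < 0.0 else 1.0
    x = abs(x)
    t = 1.0 / (1.0 + 0.3275911 * x)
    # Degree-5 polynomial in t, evaluated with Horner's scheme.
    poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
               + t * (-1.453152027 + t * 1.061405429))))
    return sign * (1.0 - poly * math.exp(-x * x))
```

An approximation like this avoids depending on a library erf, which matters on platforms where it is missing or slow.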

Thanks, it renders the corridor scene fine again, 1:36 min now. That is still slower, but I also rendered with just 80 samples (1:18 min), and it still looks better than the original vanilla render. This is scene dependent, but probably not much of an issue for Branched Path, where we can just change the Glossy/Transmission samples.

Brecht Van Lommel (brecht) updated this revision. Jun 4 2014, 1:00 AM

Internal code support for anisotropic Beckmann and GGX reflection. Based on:

Understanding the Masking-Shadowing Function in Microfacet-Based BRDFs
E. Heitz, Research Report 2014

This is not hooked up yet. D549 already contains the code to support multiple
anisotropic distributions, so it can be copied from there later.

Brecht Van Lommel (brecht) updated this revision. Jun 4 2014, 1:03 AM

Remove accidentally committed debugging code.

Brecht Van Lommel (brecht) updated this revision. Jun 11 2014, 7:59 PM
  • Included D549 in this patch.
  • Anisotropic BSDF now supports GGX and Beckmann distributions, Ward has been removed because other distributions are superior.
  • GGX is now the default distribution for glossy and anisotropic nodes, since it looks good, has low noise and is fast to evaluate.
  • Ashikhmin-Shirley is now available in the Glossy BSDF.