
Limit the internal sample index of Sobol-Burley for faster sample generation.
Closed, Public

Authored by Nathan Vegdahl (cessen) on Dec 8 2022, 11:24 PM.

Details

Summary

The limit is derived from the render sample count, so it doesn't impact sampling quality. It's similar in spirit to the adaptive table size in D16561, but in this case it's for performance rather than memory usage.
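As a rough illustration of how such a limit could be derived from the sample count, here is a minimal sketch. It is not the code from this diff: `next_power_of_two_ge` and `sample_index_mask` are hypothetical names, and the sketch assumes the limit is a bit mask covering the next power of two at or above the sample count.

```cpp
#include <cstdint>

// Hypothetical helper: smallest power of two >= n.
// Assumes 1 <= n <= 2^31 so the shift cannot overflow.
inline uint32_t next_power_of_two_ge(uint32_t n)
{
  uint32_t p = 1;
  while (p < n) {
    p <<= 1;
  }
  return p;
}

// Hypothetical helper: mask that keeps only the low index bits needed
// to address `sample_count` samples.
inline uint32_t sample_index_mask(uint32_t sample_count)
{
  return next_power_of_two_ge(sample_count) - 1;
}
```

For a 1024-sample render this gives a mask of 0x3FF, and a non-power-of-two count such as 1000 rounds up to the same mask.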

Diff Detail

Repository
rB Blender

Event Timeline

Nathan Vegdahl (cessen) requested review of this revision. Dec 8 2022, 11:24 PM
Nathan Vegdahl (cessen) created this revision.

In a simple test scene (only lambert shaders and a very low poly count, so sample generation should account for a larger share of the run time) on an x86-64 CPU, this made a 1024-sample render go from a 1:15.69 render time to 1:13.65 (roughly a 2.7% improvement). For comparison, rendering the same scene with the PMJ sampler took 1:11.29.

The performance improvement is somewhat dependent on sample count (lower sample counts have a greater relative perf improvement), but it scales logarithmically, so going to e.g. 4096 samples should only have slightly less relative improvement compared to 1024, and 256 only slightly more.
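To make the logarithmic scaling concrete, here is a minimal sketch. It assumes, purely for illustration, that the per-sample cost tracks the number of index bits that survive the limit; that is not a measured model. `index_bits_for_samples` is a hypothetical helper, not something from the diff.

```cpp
#include <cstdint>

// Hypothetical helper: number of low index bits needed to address
// `sample_count` samples, i.e. ceil(log2(sample_count)).
// Assumes 1 <= sample_count <= 2^31.
inline uint32_t index_bits_for_samples(uint32_t sample_count)
{
  uint32_t bits = 0;
  while ((uint32_t(1) << bits) < sample_count) {
    ++bits;
  }
  return bits;
}
```

Under that assumption, 256, 1024 and 4096 samples keep 8, 10 and 12 index bits respectively, out of 32 without the limit. Each 4x change in sample count only shifts the count by two bits.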

I haven't tested on GPU, so I don't know what the perf numbers look like there.

Btw, I'm happy for this to land either before or after D16443. If there are any conflicts, they should be very minor and easy to rebase either way.

I'll commit this with the optimization to compute reverse_integer_bits(aa_samples_next_ge_power_of_two - 1) in advance.
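A minimal sketch of what computing that value in advance could look like. This is an assumption about the shape of the change, not the committed code: `reverse_bits32` is a portable stand-in for `reverse_integer_bits`, and `round_up_to_power_of_two`, `make_reversed_index_mask` and `limit_reversed_index` are hypothetical names.

```cpp
#include <cstdint>

// Portable 32-bit bit reversal (stand-in for reverse_integer_bits).
inline uint32_t reverse_bits32(uint32_t x)
{
  x = ((x & 0x55555555u) << 1) | ((x >> 1) & 0x55555555u);
  x = ((x & 0x33333333u) << 2) | ((x >> 2) & 0x33333333u);
  x = ((x & 0x0F0F0F0Fu) << 4) | ((x >> 4) & 0x0F0F0F0Fu);
  x = ((x & 0x00FF00FFu) << 8) | ((x >> 8) & 0x00FF00FFu);
  return (x << 16) | (x >> 16);
}

// Hypothetical helper: smallest power of two >= n, for 1 <= n <= 2^31.
inline uint32_t round_up_to_power_of_two(uint32_t n)
{
  uint32_t p = 1;
  while (p < n) {
    p <<= 1;
  }
  return p;
}

// Computed once per render, when the sample count is known.
inline uint32_t make_reversed_index_mask(uint32_t aa_samples)
{
  return reverse_bits32(round_up_to_power_of_two(aa_samples) - 1);
}

// Per-sample work is then a single AND on an index that is assumed to
// already be in bit-reversed form.
inline uint32_t limit_reversed_index(uint32_t reversed_index, uint32_t reversed_mask)
{
  return reversed_index & reversed_mask;
}
```

The point of precomputing is that the bit reversal and power-of-two rounding happen once up front instead of once per generated sample.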

This revision is now accepted and ready to land. Dec 14 2022, 5:01 PM