
Cycles: add math functions for float8
ClosedPublic

Authored by Andrii Symkin (pembem22) on Jul 23 2022, 9:08 PM.

Details

Summary

This patch adds the math functions required for float8, making it possible to use float8 instead of float3 for color data.
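To illustrate the kind of helpers the summary refers to, here is a minimal, portable sketch of element-wise math on an 8-wide float type. `Float8Sketch`, `reduce_add_sketch` and `mix_sketch` are hypothetical names chosen for this example. They are not the functions added by the patch, and the real float8 in intern/cycles/util/types_float8.h may be laid out and vectorized differently.

```cpp
// Hypothetical stand-in for an 8-wide float type; not the Cycles float8.
struct Float8Sketch {
  float v[8];
};

// Element-wise addition of two 8-wide values.
inline Float8Sketch operator+(const Float8Sketch &a, const Float8Sketch &b)
{
  Float8Sketch r;
  for (int i = 0; i < 8; i++) {
    r.v[i] = a.v[i] + b.v[i];
  }
  return r;
}

// Element-wise multiplication of two 8-wide values.
inline Float8Sketch operator*(const Float8Sketch &a, const Float8Sketch &b)
{
  Float8Sketch r;
  for (int i = 0; i < 8; i++) {
    r.v[i] = a.v[i] * b.v[i];
  }
  return r;
}

// Multiply every channel by the same scalar.
inline Float8Sketch operator*(const Float8Sketch &a, float f)
{
  Float8Sketch r;
  for (int i = 0; i < 8; i++) {
    r.v[i] = a.v[i] * f;
  }
  return r;
}

// Horizontal sum of all eight channels.
inline float reduce_add_sketch(const Float8Sketch &a)
{
  float sum = 0.0f;
  for (int i = 0; i < 8; i++) {
    sum += a.v[i];
  }
  return sum;
}

// Linear interpolation between a and b, applied per channel.
inline Float8Sketch mix_sketch(const Float8Sketch &a, const Float8Sketch &b, float t)
{
  return a * (1.0f - t) + b * t;
}
```

Plain loops like these are usually easy for a compiler to vectorize. An actual implementation would more likely use AVX intrinsics on CPUs that support them and fall back to scalar code elsewhere.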

Diff Detail

Repository
rB Blender
Branch
cycles-add-float8-functions (branched from master)
Build Status
Buildable 23134
Build 23134: arc lint + arc unit

Event Timeline

Andrii Symkin (pembem22) requested review of this revision. Jul 23 2022, 9:08 PM
Andrii Symkin (pembem22) created this revision.
Andrii Symkin (pembem22) planned changes to this revision. Jul 23 2022, 9:11 PM
Andrii Symkin (pembem22) added inline comments.
intern/cycles/util/types_float8.h
15–18

There's a problem when compiling on CUDA, caused by the ccl_try_align macro not being defined. Is this macro required here, or should it be replaced with something else?

What is the bigger goal this patch is leading to? How will using float8 help with color operations?

intern/cycles/util/types_float8.h
15–18

On a CPU it is required to be aligned.

I do not think CUDA supports 256-bit operations, so float8 would perhaps be implemented there as a pair of float4. In that case it seems we can skip the alignment qualifier here.

What is the bigger goal this patch is leading to? How will using float8 help with color operations?

This will be used for spectral rendering to calculate 8 wavelengths per ray. I've submitted a similar patch D15318 for float4 as well.

Brecht Van Lommel (brecht) requested changes to this revision. Jul 25 2022, 4:19 PM

Ideally we would not need 8-channel colors, since they will add significant memory overhead on the GPU, or complexity from supporting a different number of channels at runtime. But regardless, it's good to be able to experiment with this.

intern/cycles/util/types_float8.h
15–18

You can do:

#ifdef __KERNEL_GPU__
struct float8
#else
struct ccl_try_align(32) float8
#endif

On any currently supported GPU architecture, implementing this as a pair of float4 should make no performance difference compared to single floats. So probably easiest to just keep it as individual floats.
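The conditional declaration suggested above can be tried outside of Cycles with a small sketch. It assumes the alignment macro expands to a standard `alignas` specifier. `SKETCH_TRY_ALIGN`, `SKETCH_KERNEL_GPU` and `Float8Layout` are hypothetical stand-ins for `ccl_try_align`, `__KERNEL_GPU__` and `float8`, and the real definitions in Cycles may differ.

```cpp
#include <cstddef>

// Hypothetical stand-in for ccl_try_align; the real macro may expand differently.
#define SKETCH_TRY_ALIGN(n) alignas(n)

// SKETCH_KERNEL_GPU is a hypothetical stand-in for __KERNEL_GPU__.
// When it is defined, the struct is declared without an alignment qualifier.
#ifdef SKETCH_KERNEL_GPU
struct Float8Layout
#else
struct SKETCH_TRY_ALIGN(32) Float8Layout
#endif
{
  // Eight individual floats, as suggested for the GPU path.
  float a, b, c, d, e, f, g, h;
};

// The eight floats are packed without padding in both configurations.
static_assert(sizeof(Float8Layout) == 8 * sizeof(float), "unexpected padding");

#ifndef SKETCH_KERNEL_GPU
// On the CPU path the struct is aligned to 32 bytes, as 256-bit aligned loads require.
static_assert(alignof(Float8Layout) == 32, "expected 32-byte alignment");
#endif
```

Compiling once with `-DSKETCH_KERNEL_GPU` and once without it shows that only the alignment changes between the two paths, not the size or the member layout.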

intern/cycles/util/types_float8.h
15–18

That's an interesting aspect of GPU compute. One thing I forgot to mention above is that float4 has a benefit on pre-AVX2 CPUs, which we still support.

For the initial patch, which adds the float8 operations and makes them available on all platforms, that optimization is not essential. We can look into further optimizations after the initial work is done.

intern/cycles/util/types_float8.h
15–18

Fair point, I forgot about pre-AVX2 CPUs, but indeed optimizing for that case is not essential.

  • Format code
  • Fix compilation on CUDA

I ran into some compiler warnings/errors, but will fix those as part of the commit.

This revision is now accepted and ready to land. Jul 25 2022, 5:53 PM
This revision was automatically updated to reflect the committed changes.