There is a generic function to retrieve float and float3 attributes
primitive_attribute_float and primitive_attribute_float3`. Inside
these functions an prioritised if-else construction checked where
the attribute is stored and then retrieved from that location.
Actually the calling function knows already where the data is stored.
So we could simplify this by splitting these functions and remove
the check logic.
This patch splits the primitive_attribute_float? functions into
primitive_surface_attribute_float? and primitive_volume_attribute_float?.
What leads to less branching and more optimum kernels.
This will reduce the compilation time and render time for kernels.
Especially in production scenes there is a lot of benefit.
Impact in compilation times
job | scene_name | previous | new | percentage -------+-----------------+----------+-------+------------ t61513 | empty | 10.63 | 10.66 | 0% t61513 | bmw | 17.91 | 17.65 | 1% t61513 | fishycat | 19.57 | 17.68 | 10% t61513 | barbershop | 54.10 | 24.41 | 55% t61513 | classroom | 17.55 | 16.29 | 7% t61513 | koro | 18.92 | 18.05 | 5% t61513 | pavillion | 17.43 | 16.52 | 5% t61513 | splash279 | 16.48 | 14.91 | 10% t61513 | volume_emission | 36.22 | 21.60 | 40%
Impact in render times
job | scene_name | previous | new | percentage -------+-----------------+----------+--------+------------ 61513 | empty | 21.06 | 20.35 | 3% 61513 | bmw | 198.44 | 190.05 | 4% 61513 | fishycat | 394.20 | 401.25 | -2% 61513 | barbershop | 1188.16 | 912.39 | 23% 61513 | classroom | 341.08 | 340.38 | 0% 61513 | koro | 472.43 | 471.80 | 0% 61513 | pavillion | 905.77 | 899.80 | 1% 61513 | splash279 | 55.26 | 54.86 | 1% 61513 | volume_emission | 62.59 | 61.70 | 1%
There is also a possitive impact when using CPU and CUDA, but they are small.
I didn't split the hair logic from the surface logic due to:
- Hair and surface use same attribute types. It was not clear if it could be splitted when looking at the code only.
- Hair and surface are quick to compile and to read. So the benefit is quite small.