This patch significantly speeds up attribute nodes when multiple
threads are available, especially in linear situations where parallelism
cannot be achieved elsewhere.
I tested on a mesh with 4 million vertices on a Ryzen 7 3700X.
All tests were done on "plain" attributes, where no conversion
is necessary and the data is stored in contiguous arrays.
Each result is an average over ~30 runs.

| Node | Before (ms) | After (ms) | Speedup (×) | Notes |
|---|---|---|---|---|
| Align Rotation to Vector | 651 | 70.8 | 9.19 | "Auto" pivot mode |
| Attribute Math | 60.1 | 10.1 | 5.95 | "Hyperbolic Sine" operation |
| Attribute Color Ramp | 55.2 | 11.1 | 4.97 | |
| Attribute Mix | 36.2 | 7.74 | 4.67 | Color data type |
| Attribute Sample Texture | 430 | 92 | 4.66 | "Clouds" texture |
| Attribute Math | 10.6 | 2.60 | 4.12 | "Add" operation |
| Attribute Randomize | 38.7 | 12.0 | 3.21 | Vector data type |
| Attribute Vector Math | 58.8 | 18.5 | 3.18 | "Refract" operation |
| Attribute Map Range | 28.2 | 6.79 | 4.17 | Vector data type |

The changes are not exhaustive; other nodes could still
be parallelized in the future. The grain size passed to parallel_for
could also be optimized further, though I'd rather make sure that
it isn't too small. I tested a few different values, but also
relied on intuition: a larger grain size for less complex operations
and a smaller one for more complex operations.
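
To illustrate the grain-size trade-off, here is a minimal, self-contained sketch. It does not use Blender's actual parallel_for. The helper `parallel_for_sketch`, the functions `add_attributes` and `sinh_attribute`, and the grain sizes 2048 and 512 are all hypothetical, chosen only to show a cheap operation getting a larger grain than an expensive one. They are not the values used in this patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for a grain-size-aware parallel loop (not Blender's code).
// Splits [0, size) into chunks of at most `grain_size` elements and hands the
// chunks to worker threads. A range that fits in one grain runs on the calling
// thread, so small inputs never pay the threading overhead.
void parallel_for_sketch(const std::size_t size,
                         const std::size_t grain_size,
                         const std::function<void(std::size_t, std::size_t)> &fn)
{
  assert(grain_size > 0);
  if (size == 0) {
    return;
  }
  if (size <= grain_size) {
    fn(0, size);
    return;
  }
  const std::size_t chunks = (size + grain_size - 1) / grain_size;
  const std::size_t hardware_threads = std::max<std::size_t>(
      1, std::thread::hardware_concurrency());
  const std::size_t thread_count = std::min(chunks, hardware_threads);

  std::vector<std::thread> workers;
  workers.reserve(thread_count);
  for (std::size_t t = 0; t < thread_count; t++) {
    workers.emplace_back([&, t]() {
      // Static round-robin assignment: thread t handles chunks t, t + n, t + 2n, ...
      for (std::size_t c = t; c < chunks; c += thread_count) {
        const std::size_t begin = c * grain_size;
        const std::size_t end = std::min(size, begin + grain_size);
        fn(begin, end);
      }
    });
  }
  for (std::thread &worker : workers) {
    worker.join();
  }
}

// Cheap per-element work: a larger (assumed) grain size keeps scheduling
// overhead small relative to the work done in each chunk.
std::vector<float> add_attributes(const std::vector<float> &a, const std::vector<float> &b)
{
  assert(a.size() == b.size());
  std::vector<float> result(a.size());
  parallel_for_sketch(a.size(), 2048, [&](const std::size_t begin, const std::size_t end) {
    for (std::size_t i = begin; i < end; i++) {
      result[i] = a[i] + b[i];
    }
  });
  return result;
}

// More expensive per-element work: a smaller (assumed) grain size still gives
// each chunk enough work and allows more chunks to be spread across threads.
std::vector<float> sinh_attribute(const std::vector<float> &a)
{
  std::vector<float> result(a.size());
  parallel_for_sketch(a.size(), 512, [&](const std::size_t begin, const std::size_t end) {
    for (std::size_t i = begin; i < end; i++) {
      result[i] = std::sinh(a[i]);
    }
  });
  return result;
}
```

Each chunk writes to a disjoint slice of `result`, so no locking is needed. A production implementation would normally use a task scheduler rather than spawning threads per call; the sketch only shows how the grain size bounds the smallest unit of work.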