A lot of users are reporting very poor stroke performance in Sculpt Mode
on high-end processors.
In a demo file @JulienKaspar sent me, I noticed that the PBVH creates
only one leaf node for 49398 vertices at Multires level 1. I believe
the lag comes from the number of stroke steps: a brush with low spacing
and a small radius generates many stroke steps per viewport update. As
there is only one node, these steps can't be multithreaded and each of
them loops over all 50k vertices, making the brush lag.
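To make the reasoning above concrete, here is a minimal toy model in C. It is not Blender's PBVH code: the function names, the 1D vertex layout and the "one task per overlapping leaf" rule are all assumptions made for illustration. It only shows why a single leaf forces every stroke step to visit every vertex on one thread.

```c
#include <assert.h>

/* Hypothetical toy model, NOT Blender's PBVH: vertices are spread evenly
 * along the interval [0, 1) and split evenly into `totnode` leaf nodes.
 * Returns how many vertices one stroke step has to loop over. */
static int verts_visited_per_step(int totvert, int totnode, double center, double radius)
{
  assert(totvert >= 0 && totnode > 0);
  int visited = 0;
  for (int i = 0; i < totnode; i++) {
    /* Extent covered by leaf node i. */
    const double node_min = (double)i / totnode;
    const double node_max = (double)(i + 1) / totnode;
    /* A leaf is processed only if the brush overlaps its bounds. */
    if (center + radius >= node_min && center - radius <= node_max) {
      /* Every vertex of an overlapping leaf is looped over. */
      visited += totvert / totnode + (i < totvert % totnode);
    }
  }
  return visited;
}

/* Assumption of this sketch: each overlapping leaf is one task, so a stroke
 * step can never keep more threads busy than it has overlapping leaves. */
static int usable_threads(int nodes_hit, int threads)
{
  assert(nodes_hit >= 0 && threads > 0);
  return nodes_hit < threads ? nodes_hit : threads;
}
```

In this model, `verts_visited_per_step(49398, 1, 0.53, 0.01)` visits all 49398 vertices and `usable_threads(1, 12)` is 1, while splitting the same vertices into 16 leaves lets the same small brush visit only the leaf it overlaps. Real numbers depend on the mesh layout and on how the PBVH actually splits it.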
When reducing the leaf limit to 1000, the PBVH has 16 nodes for the same
number of vertices, which increases performance a lot.
I'm not sure when the leaf limit was last updated, or whether the
current 10k leaf limit is still acceptable today with 16- and 32-core
CPUs available. If we can't agree on a better default for most users,
maybe we can expose it as a setting (similar to how "Use Threaded
Sculpting" was in the UI).
About the decision to make it 1000 by default, it is just based on
some testing I did on my computer. This patch should be tested on as
many CPUs as possible in case we want to include only one default
instead of exposing it as a setting.
Here is a comparison table measuring 130 samples of the time spent
in the ##paint_stroke_modal## function for the same brush size at
different multires levels:
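As a rough sketch of how such averages and maximums can be collected, the C snippet below accumulates per-call durations. The `TimingStats` struct and its functions are hypothetical helpers, not part of this patch or of Blender, and how each duration is measured around ##paint_stroke_modal## is left out.

```c
#include <assert.h>

/* Hypothetical accumulator for per-call durations. */
typedef struct TimingStats {
  int count;  /* Number of samples recorded so far. */
  double sum; /* Sum of all sample durations. */
  double max; /* Largest sample duration seen so far. */
} TimingStats;

/* Record one measured duration. */
static void timing_stats_add(TimingStats *stats, double sample)
{
  assert(sample >= 0.0);
  stats->count++;
  stats->sum += sample;
  if (sample > stats->max) {
    stats->max = sample;
  }
}

/* Average duration over all recorded samples. */
static double timing_stats_average(const TimingStats *stats)
{
  assert(stats->count > 0);
  return stats->sum / stats->count;
}
```

A zero-initialized `TimingStats stats = {0};` can be fed one sample per call; the "Average" and "Maximum" tables then correspond to `timing_stats_average(&stats)` and `stats.max` for each leaf limit and vertex count.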
On an i7 9750H, 16GB, GTX 1650 4GB:
Average:
| Vertices | 49398 | 818782 | 2897118 |
|----------------|----------|----------|----------|
| 10k leaf limit | 0.004076 | 0.003447 | 0.003299 |
| 1k leaf limit | 0.000681 | 0.000980 | 0.002350 |
| Improvement | 5.98x | 3.52x | 1.40x |
Maximum:
| Vertices | 49398 | 818782 | 2897118 |
|----------------|----------|----------|----------|
| 10k leaf limit | 0.00872 | 0.006726 | 0.008588 |
| 1k leaf limit | 0.001949 | 0.002079 | 0.006291 |
| Improvement | 4.47x | 3.24x | 1.37x |
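The "Improvement" rows are the 10k leaf limit time divided by the 1k leaf limit time for the same vertex count. A minimal C sketch of that arithmetic (the function name is made up for illustration):

```c
#include <assert.h>

/* Hypothetical helper: how many times faster the 1k leaf limit is,
 * given the times measured with the 10k and 1k leaf limits. */
static double improvement_factor(double time_10k, double time_1k)
{
  assert(time_1k > 0.0);
  return time_10k / time_1k;
}
```

For the first column of the "Maximum" table, `improvement_factor(0.00872, 0.001949)` gives roughly 4.47.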