Currently this conversion (which happens when using modifiers in edit
mode, for example) is completely single-threaded. It's harder to
multithread than some other areas because BMesh elements don't always
know their indices (and vice versa), and because the dynamic AoS format
used by BMesh rules out some typical solutions.
This patch proposes to split the operation into two steps. The first
updates the indices of BMesh elements and builds tables for easy
iteration later. It also checks if some optional mesh attributes
should be added. The second uses parallel loops over all elements,
copying attribute values and building the Mesh topology.
Both steps process the different domains in separate threads, although
the first step has to handle faces and loops together. Running one
thread per domain isn't proper data parallelism, but it still helps
because the domains don't affect each other.
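The two steps can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the code in this patch: `Elem`, `FlatMesh`, `build_table`, `copy_values` and `convert` are hypothetical names, a `std::list` stands in for BMesh's element storage, and `std::async` stands in for whatever task scheduler is actually used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <list>
#include <vector>

// Hypothetical stand-in for a BMesh element: it lives in a linked
// structure and its index may be stale before the conversion starts.
struct Elem {
  int index = -1;
  float value = 0.0f;  // Stand-in for a single attribute value.
};

// Step 1: walk the elements once, refresh their indices, and build a
// pointer table so later loops get random access.
std::vector<Elem *> build_table(std::list<Elem> &elems)
{
  std::vector<Elem *> table;
  table.reserve(elems.size());
  int i = 0;
  for (Elem &elem : elems) {
    elem.index = i++;
    table.push_back(&elem);
  }
  return table;
}

// Step 2: copy values into a flat array. Every chunk writes a disjoint
// range of `dst`, so the chunks can run concurrently without locking.
std::vector<float> copy_values(const std::vector<Elem *> &table, std::size_t chunks = 4)
{
  assert(chunks > 0);
  std::vector<float> dst(table.size());
  const std::size_t chunk_size = (table.size() + chunks - 1) / chunks;
  std::vector<std::future<void>> tasks;
  for (std::size_t start = 0; start < table.size(); start += chunk_size) {
    const std::size_t end = std::min(start + chunk_size, table.size());
    tasks.push_back(std::async(std::launch::async, [&table, &dst, start, end]() {
      for (std::size_t i = start; i < end; i++) {
        dst[static_cast<std::size_t>(table[i]->index)] = table[i]->value;
      }
    }));
  }
  for (std::future<void> &task : tasks) {
    task.get();
  }
  return dst;
}

// Hypothetical flat result with two domains only, to keep the sketch short.
struct FlatMesh {
  std::vector<float> verts;
  std::vector<float> edges;
};

// Both steps handle the two domains at the same time: one domain runs in
// an extra task while the calling thread handles the other.
FlatMesh convert(std::list<Elem> &verts, std::list<Elem> &edges)
{
  std::future<std::vector<Elem *>> vert_table_task = std::async(
      std::launch::async, [&verts]() { return build_table(verts); });
  const std::vector<Elem *> edge_table = build_table(edges);
  const std::vector<Elem *> vert_table = vert_table_task.get();

  std::future<std::vector<float>> vert_values_task = std::async(
      std::launch::async, [&vert_table]() { return copy_values(vert_table); });
  FlatMesh mesh;
  mesh.edges = copy_values(edge_table);
  mesh.verts = vert_values_task.get();
  return mesh;
}
```

The first step is inherently serial within a domain because it walks a linked structure, which is why it is only parallel across domains here, while the second step can split each domain into ranges once the tables exist.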
Timings
I tested this on a Ryzen 7 3700X with a 1 million face grid, first
with no extra attributes and then with several color attributes and
vertex groups.
| File | Before | After |
| --- | --- | --- |
| Simple | 190.2 ms | 124.0 ms |
| More Attributes | 253.9 ms | 134.8 ms |
On a Ryzen 9 7950X:
| File | Before | After |
| --- | --- | --- |
| Simple | 101.6 ms | 59.6 ms |
| More Attributes | 149.2 ms | 65.6 ms |
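The optimization pays off more when the BMesh has more attributes: the
speedup is roughly 1.5x for the simple file and 1.9x with more
attributes on the first machine, and roughly 1.7x and 2.3x on the
second. That is further from linear scaling than multithreading other
operations usually gets, which indicates added overhead. I think it's
still worth it, because the user is usually actively interacting with a
mesh in edit mode.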
Potential Future Improvements
- Skip building the lookup tables in the first step when they are already non-dirty
- Store the tables in the BMesh to amortize their creation (discussed in D14938)
- Do a small bit of preprocessing to make CustomData_from_bmesh_block faster (D17141)
- Avoid building tables if there aren't any attributes to transfer
  - To avoid code duplication here, I went with the option that scaled better
- Experiment with different grain sizes
- Apply similar changes to the non "for eval" version of BMesh to Mesh conversion
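For the grain size item in the list above, here is a small sketch of what the parameter controls. It assumes a hypothetical `parallel_for` helper written for this illustration, not the scheduler used by the patch: the range is cut into chunks of at most `grain_size` elements, and a few worker threads pull chunks from a shared counter. A smaller grain size gives better load balancing but more scheduling overhead per element.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Number of chunks a range of `size` elements is split into.
std::size_t chunk_count(std::size_t size, std::size_t grain_size)
{
  grain_size = std::max<std::size_t>(grain_size, 1);
  return (size + grain_size - 1) / grain_size;
}

// Hypothetical grain-size based parallel loop. `fn` receives a half-open
// range [start, end) and may be called from several threads at once.
void parallel_for(std::size_t size,
                  std::size_t grain_size,
                  const std::function<void(std::size_t, std::size_t)> &fn)
{
  grain_size = std::max<std::size_t>(grain_size, 1);
  const std::size_t chunks = chunk_count(size, grain_size);
  const std::size_t workers = std::min<std::size_t>(
      chunks, std::max(1u, std::thread::hardware_concurrency()));

  // Workers claim the next unprocessed chunk until none are left.
  std::atomic<std::size_t> next{0};
  auto work = [&]() {
    for (;;) {
      const std::size_t chunk = next.fetch_add(1);
      if (chunk >= chunks) {
        return;
      }
      const std::size_t start = chunk * grain_size;
      fn(start, std::min(start + grain_size, size));
    }
  };

  std::vector<std::thread> threads;
  for (std::size_t i = 1; i < workers; i++) {
    threads.emplace_back(work);
  }
  work();  // The calling thread takes part as well.
  for (std::thread &thread : threads) {
    thread.join();
  }
}

// Example loop body: each chunk writes its own range of `dst`.
std::vector<int> doubled(const std::vector<int> &src, std::size_t grain_size)
{
  std::vector<int> dst(src.size());
  parallel_for(src.size(), grain_size, [&](std::size_t start, std::size_t end) {
    for (std::size_t i = start; i < end; i++) {
      dst[i] = src[i] * 2;
    }
  });
  return dst;
}
```

The result is the same for any grain size; only the number of chunks changes, so experimenting amounts to timing the same loop with a few different values.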
Extra Timing Information
SIMPLE:
'CustomData Merge': (Average: 5178 ns, Min: 3350 ns)
'Vert 1st Pass': (Average: 26.6 ms, Min: 21.2 ms)
'Face/Loop 1st Pass': (Average: 63.1 ms, Min: 58.4 ms)
'Edge 1st Pass': (Average: 63.0 ms, Min: 56.9 ms)
'Adding Optional Mesh Attributes': (Average: 31965 ns, Min: 23810 ns)
'Face 2nd Pass': (Average: 41.4 ms, Min: 15.0 ms)
'Vert 2nd Pass': (Average: 42.9 ms, Min: 21.3 ms)
'Edge 2nd Pass': (Average: 56.5 ms, Min: 36.5 ms)
'Loop 2nd Pass': (Average: 58.4 ms, Min: 44.1 ms)
'BM_mesh_bm_to_me_for_eval': (Average: 131.1 ms, Min: 123.6 ms)
MORE ATTRIBUTES:
'CustomData Merge': (Average: 4.3 ms, Min: 4.1 ms)
'Vert 1st Pass': (Average: 26.4 ms, Min: 21.3 ms)
'Edge 1st Pass': (Average: 62.6 ms, Min: 58.4 ms)
'Face/Loop 1st Pass': (Average: 64.1 ms, Min: 58.9 ms)
'Adding Optional Mesh Attributes': (Average: 32289 ns, Min: 26410 ns)
'Vert 2nd Pass': (Average: 62.7 ms, Min: 34.0 ms)
'Loop 2nd Pass': (Average: 68.1 ms, Min: 48.0 ms)
'Face 2nd Pass': (Average: 46.8 ms, Min: 17.0 ms)
'Edge 2nd Pass': (Average: 67.7 ms, Min: 42.5 ms)
'BM_mesh_bm_to_me_for_eval': (Average: 145.2 ms, Min: 138.1 ms)