Support multi-threading for `bm_mesh_loops_calc_normals`.
This is done by operating on vertex-loops instead of face-loops.
Single-threaded operation still loops over faces, since iterating
over vertices adds some overhead in the case of custom-normals,
as the order used for accessing loops must be the same as when
iterating over a face's loops.
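
To illustrate why grouping loops by vertex makes the work easy to split between threads, here is a minimal sketch on a toy mesh. It is not the BMesh code: `ToyMesh`, `VertLoopMap`, and all function names are hypothetical, and an integer `face_value` stands in for a face normal. Each vertex writes only to its own loops, so disjoint vertex ranges can be processed independently, whereas the face-order pass in this sketch needs shared "done" state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy mesh: every face corner is a "loop", stored face by face. */
typedef struct ToyMesh {
  int verts_num, faces_num, loops_num;
  const int *loop_vert;  /* Vertex used by each loop. */
  const int *loop_face;  /* Face owning each loop. */
  const int *face_value; /* Stand-in for a face normal. */
} ToyMesh;

/* Vertex -> loops map: loops of vertex `v` are `loops[start[v] .. start[v + 1] - 1]`. */
typedef struct VertLoopMap {
  int *start; /* `verts_num + 1` entries. */
  int *loops; /* `loops_num` entries. */
} VertLoopMap;

/* Build the map with a counting sort (allocation checks omitted for brevity). */
static VertLoopMap vert_loop_map_create(const ToyMesh *m)
{
  VertLoopMap map;
  map.start = calloc((size_t)m->verts_num + 1, sizeof(int));
  map.loops = malloc((size_t)m->loops_num * sizeof(int));
  for (int l = 0; l < m->loops_num; l++) {
    map.start[m->loop_vert[l] + 1]++;
  }
  for (int v = 0; v < m->verts_num; v++) {
    map.start[v + 1] += map.start[v];
  }
  int *cursor = malloc((size_t)m->verts_num * sizeof(int));
  memcpy(cursor, map.start, (size_t)m->verts_num * sizeof(int));
  for (int l = 0; l < m->loops_num; l++) {
    map.loops[cursor[m->loop_vert[l]]++] = l;
  }
  free(cursor);
  return map;
}

/* Accumulate the values of all faces around `v`, then write the result to `v`'s loops only. */
static void vert_calc(const ToyMesh *m, const VertLoopMap *map, int v, int *r_loop_value)
{
  int sum = 0;
  for (int i = map->start[v]; i < map->start[v + 1]; i++) {
    sum += m->face_value[m->loop_face[map->loops[i]]];
  }
  for (int i = map->start[v]; i < map->start[v + 1]; i++) {
    r_loop_value[map->loops[i]] = sum;
  }
}

/* Vertex order: no shared state, so each thread could take its own [v_begin, v_end) range. */
static void calc_by_vert_range(
    const ToyMesh *m, const VertLoopMap *map, int v_begin, int v_end, int *r_loop_value)
{
  assert(0 <= v_begin && v_begin <= v_end && v_end <= m->verts_num);
  for (int v = v_begin; v < v_end; v++) {
    vert_calc(m, map, v, r_loop_value);
  }
}

/* Face order: walks loops as stored in faces and skips vertices already handled,
 * which relies on the shared `vert_done` array. */
static void calc_by_faces(const ToyMesh *m, const VertLoopMap *map, int *r_loop_value)
{
  char *vert_done = calloc((size_t)m->verts_num, 1);
  for (int l = 0; l < m->loops_num; l++) {
    const int v = m->loop_vert[l];
    if (!vert_done[v]) {
      vert_calc(m, map, v, r_loop_value);
      vert_done[v] = 1;
    }
  }
  free(vert_done);
}
```

Calling `calc_by_vert_range` once per chunk of vertices (for example `[0, 3)` and `[3, 6)` on a six-vertex mesh) fills the same per-loop array as a single `calc_by_faces` call; in a threaded version each chunk would be handed to a separate task.
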
From isolated timing tests of `bm_mesh_loops_calc_normals` on high
poly models, this gives a speedup of between 3.5x and 10x,
with larger gains for meshes with custom-normals.
---
Notes:
- This change is limited to multi-threading existing logic. Some other optimizations may be worth looking into, but I would rather do them separately from this patch, as including them would make the patch harder to review and complicate debugging any regressions they could cause.
- For faster overall performance, tagging sharp edges could be multi-threaded as well, perhaps by making it part of `bm_mesh_loops_calc_normals` to avoid a separate blocking loop over all edges.
- Optimizations have been split into a separate patch (which requires this one to be applied first), see: {D11970}.
Other changes made as part of this patch:
- {rB5cd1aaf0808f2bac11fadcf8c351429de54ac68a}
- {rB15cdcb4e9085c3cf35528c2f7e559955b4ff531a}