Draw Manager: Use threading for Mesh extract lines
Closed, Public

Authored by Germano Cavalcante (mano-wii) on Jun 2 2021, 5:18 PM.

Details

Summary

This patch was made on top of D11455 and benchmarked against it.

It follows an idea similar to that seen in BLI_task_parallel_range
whose TaskParallelSettings has a callback to reduce the
"userdata_chunk" (TaskParallelReduceFunc func_reduce).
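The chunk-plus-reduce idea mentioned above can be sketched in standalone C++. This is only an illustration under stated assumptions: `parallel_range_reduce`, `SumChunk` and `sum_range` are hypothetical names, the code uses `std::thread` rather than the BLI task scheduler, and it is not the implementation used by this patch.

```cpp
// Standalone sketch, not Blender code: parallel_range_reduce(), SumChunk and
// sum_range() are hypothetical names that only illustrate chunk + reduce.
#include <algorithm>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

struct SumChunk {
  long long total = 0;  // Per-task accumulator, never shared between threads.
};

// Split [start, stop) across `num_tasks` threads. Each thread owns one chunk;
// `reduce` folds every chunk into the result on the calling thread after join.
template<typename Chunk>
Chunk parallel_range_reduce(int start,
                            int stop,
                            int num_tasks,
                            const std::function<void(int, Chunk &)> &body,
                            const std::function<void(Chunk &, const Chunk &)> &reduce)
{
  assert(num_tasks > 0 && start <= stop);
  std::vector<Chunk> chunks(num_tasks);
  std::vector<std::thread> threads;
  const int per_task = (stop - start + num_tasks - 1) / num_tasks;
  for (int t = 0; t < num_tasks; t++) {
    const int begin = std::min(stop, start + t * per_task);
    const int end = std::min(stop, begin + per_task);
    threads.emplace_back([&, t, begin, end]() {
      for (int i = begin; i < end; i++) {
        body(i, chunks[t]);
      }
    });
  }
  for (std::thread &thread : threads) {
    thread.join();
  }
  Chunk result;
  for (const Chunk &chunk : chunks) {
    reduce(result, chunk);
  }
  return result;
}

// Example use: sum the integers in [start, stop) with `num_tasks` workers.
long long sum_range(int start, int stop, int num_tasks)
{
  return parallel_range_reduce<SumChunk>(
             start,
             stop,
             num_tasks,
             [](int i, SumChunk &chunk) { chunk.total += i; },
             [](SumChunk &into, const SumChunk &from) { into.total += from.total; })
      .total;
}
```

Calling `sum_range(0, 1000, 4)` runs four workers over disjoint sub-ranges and reduces their chunks once at the end, so the per-element loop never touches shared state.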

I didn't like the idea of adding a mutex to ExtractorRunData, but
apparently it does not hurt performance (tested by keeping
extract_lines single-threaded).
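One plausible reason a mutex can stay cheap in this kind of design is that it is taken once per task when results are merged, not once per element. The sketch below illustrates that assumption in standalone C++; `RunData`, `TaskData`, `task_iter`, `task_finish` and `count_lines` are hypothetical stand-ins, not the actual `ExtractorRunData` code.

```cpp
// Standalone sketch, not Blender code: RunData, TaskData and the functions
// below are hypothetical stand-ins for the extractor callbacks.
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

struct RunData {
  std::mutex mutex;    // Guards `line_count` during task_finish().
  int line_count = 0;  // Shared result, owned by the extractor run.
};

struct TaskData {
  int line_count = 0;  // Private to one worker thread.
};

// Hot path: called once per element, touches only thread-local data.
void task_iter(TaskData &task, int /*edge_index*/)
{
  task.line_count++;
}

// Cold path: called once per worker, so the lock is taken rarely.
void task_finish(RunData &run, const TaskData &task)
{
  std::lock_guard<std::mutex> lock(run.mutex);
  run.line_count += task.line_count;
}

int count_lines(int edge_len, int num_threads)
{
  assert(num_threads > 0);
  RunData run;
  std::vector<std::thread> threads;
  for (int t = 0; t < num_threads; t++) {
    threads.emplace_back([&run, t, edge_len, num_threads]() {
      TaskData task;
      // Strided split of the edge range between the workers.
      for (int e = t; e < edge_len; e += num_threads) {
        task_iter(task, e);
      }
      task_finish(run, task);
    });
  }
  for (std::thread &thread : threads) {
    thread.join();
  }
  return run.line_count;
}
```

With `num_threads == 1` the same code path runs with an uncontended lock, which is roughly the situation described by the single-thread test above.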

Benchmarking

                               master                              PATCH
large_mesh_editing             Average: 14.246502 FPS              Average: 15.438118 FPS
                               rdata 9ms iter 31ms (frame 69ms)    rdata 9ms iter 27ms (frame 65ms)
large_mesh_editing_ledge       Average: 14.913622 FPS              Average: 15.856538 FPS
                               rdata 9ms iter 30ms (frame 67ms)    rdata 9ms iter 26ms (frame 63ms)
looptris_test                  Average: 3.970774 FPS               Average: 4.095200 FPS
                               rdata 11ms iter 90ms (frame 235ms)  rdata 12ms iter 87ms (frame 229ms)
subdiv_mesh_cage_and_final     Average: 1.926931 FPS               Average: 1.957404 FPS
                               rdata 7ms iter 39ms (frame 262ms)   rdata 7ms iter 35ms (frame 258ms)
                               rdata 7ms iter 41ms (frame 254ms)   rdata 7ms iter 37ms (frame 250ms)
subdiv_mesh_final_only         Average: 6.575331 FPS               Average: 6.679989 FPS
                               rdata 3ms iter 19ms (frame 145ms)   rdata 3ms iter 18ms (frame 146ms)
subdiv_mesh_final_only_ledge   Average: 6.791831 FPS               Average: 6.723643 FPS
                               rdata 3ms iter 19ms (frame 142ms)   rdata 3ms iter 19ms (frame 142ms)

Note:
extract_tris is now the most time-consuming extract in mesh editing.

Diff Detail

Repository
rB Blender
Branch
master
Build Status
Buildable 15036
Build 15036: arc lint + arc unit

Event Timeline

Germano Cavalcante (mano-wii) requested review of this revision. Jun 2 2021, 5:18 PM
Germano Cavalcante (mano-wii) created this revision.

Yes, this solution closely matches what I did for D11499: GPU: Thread safe index buffer builders.
The main differences are that the solution in D11499 is part of the GPU module and can be used in other areas as well, it needs no locking (although the cost of the lock here is negligible), it uses the same mechanism for single threading, and it uses a single allocation (also negligible).

Note that the solution proposed here would require more care from developers when using custom user data. The API does not provide a clear interface for how to use it, so it might be used incorrectly.
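The lock-free, single-allocation approach described in the comment above can be pictured with a standalone sketch. It assumes that every element writes to fixed slots of one shared buffer, so sub-builders only need to merge their bookkeeping. `IndexBuilder`, `subbuilder_init`, `set_line_verts`, `subbuilder_finish` and `build_lines` are hypothetical names that mimic the idea; they are not the GPU module functions.

```cpp
// Standalone sketch, not the GPU module: IndexBuilder and its functions are
// hypothetical and only mimic the sub-builder idea discussed above.
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

struct IndexBuilder {
  uint32_t *data = nullptr;  // Shared storage, allocated once by the parent.
  uint32_t index_len = 0;    // One past the highest slot written so far.
};

// A sub-builder is a shallow copy: same storage, private bookkeeping.
IndexBuilder subbuilder_init(const IndexBuilder &parent)
{
  return parent;
}

// Each line owns the fixed slots 2 * elem and 2 * elem + 1, so two threads
// handling different elements never write to the same memory.
void set_line_verts(IndexBuilder &builder, uint32_t elem, uint32_t v1, uint32_t v2)
{
  builder.data[2 * elem] = v1;
  builder.data[2 * elem + 1] = v2;
  builder.index_len = std::max<uint32_t>(builder.index_len, 2 * elem + 2);
}

// Only the bookkeeping is merged; the indices are already in place.
void subbuilder_finish(IndexBuilder &parent, const IndexBuilder &sub)
{
  parent.index_len = std::max<uint32_t>(parent.index_len, sub.index_len);
}

std::vector<uint32_t> build_lines(uint32_t line_len, uint32_t num_threads)
{
  assert(num_threads > 0);
  std::vector<uint32_t> storage(2 * line_len);
  IndexBuilder parent;
  parent.data = storage.data();

  std::vector<IndexBuilder> subs(num_threads, subbuilder_init(parent));
  std::vector<std::thread> threads;
  for (uint32_t t = 0; t < num_threads; t++) {
    threads.emplace_back([&subs, t, line_len, num_threads]() {
      for (uint32_t e = t; e < line_len; e += num_threads) {
        set_line_verts(subs[t], e, e, e + 1);
      }
    });
  }
  for (std::thread &thread : threads) {
    thread.join();
  }
  // Merging happens on the calling thread after the join, so no lock is needed.
  for (const IndexBuilder &sub : subs) {
    subbuilder_finish(parent, sub);
  }
  storage.resize(parent.index_len);
  return storage;
}
```

In this sketch the single-threaded case is simply `num_threads == 1`, which goes through the same init and finish steps as the threaded case.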

diff --git a/source/blender/draw/intern/mesh_extractors/extract_mesh_ibo_lines.cc b/source/blender/draw/intern/mesh_extractors/extract_mesh_ibo_lines.cc
index 6237529902b..74a3d3825c5 100644
--- a/source/blender/draw/intern/mesh_extractors/extract_mesh_ibo_lines.cc
+++ b/source/blender/draw/intern/mesh_extractors/extract_mesh_ibo_lines.cc
@@ -42,6 +42,15 @@ static void *extract_lines_init(const MeshRenderData *mr,
   return elb;
 }
 
+static void *extract_lines_task_init(void *_userdata)
+{
+  GPUIndexBufBuilder *elb = static_cast<GPUIndexBufBuilder *>(_userdata);
+  GPUIndexBufBuilder *sub_builder = static_cast<GPUIndexBufBuilder *>(
+      MEM_mallocN(sizeof(*sub_builder), __func__));
+  GPU_indexbuf_subbuilder_init(elb, sub_builder);
+  return sub_builder;
+}
+
 static void extract_lines_iter_poly_bm(const MeshRenderData *UNUSED(mr),
                                        BMFace *f,
                                        const int UNUSED(f_index),
@@ -138,6 +147,14 @@ static void extract_lines_iter_ledge_mesh(const MeshRenderData *mr,
   GPU_indexbuf_set_line_restart(elb, e_index);
 }
 
+static void extract_lines_task_finish(void *_userdata, void *_task_userdata)
+{
+  GPUIndexBufBuilder *elb = static_cast<GPUIndexBufBuilder *>(_userdata);
+  GPUIndexBufBuilder *sub_builder = static_cast<GPUIndexBufBuilder *>(_task_userdata);
+  GPU_indexbuf_subbuilder_finish(elb, sub_builder);
+  MEM_freeN(sub_builder);
+}
+
 static void extract_lines_finish(const MeshRenderData *UNUSED(mr),
                                  struct MeshBatchCache *UNUSED(cache),
                                  void *buf,
@@ -153,13 +170,15 @@ constexpr MeshExtract create_extractor_lines()
 {
   MeshExtract extractor = {0};
   extractor.init = extract_lines_init;
+  extractor.task_init = extract_lines_task_init;
   extractor.iter_poly_bm = extract_lines_iter_poly_bm;
   extractor.iter_poly_mesh = extract_lines_iter_poly_mesh;
   extractor.iter_ledge_bm = extract_lines_iter_ledge_bm;
   extractor.iter_ledge_mesh = extract_lines_iter_ledge_mesh;
+  extractor.task_finish = extract_lines_task_finish;
   extractor.finish = extract_lines_finish;
   extractor.data_type = MR_DATA_NONE;
-  extractor.use_threading = false;
+  extractor.use_threading = true;
   extractor.mesh_buffer_offset = offsetof(MeshBufferCache, ibo.lines);
   return extractor;
 }
@@ -197,13 +216,15 @@ constexpr MeshExtract create_extractor_lines_with_lines_loose()
 {
   MeshExtract extractor = {0};
   extractor.init = extract_lines_init;
+  extractor.task_init = extract_lines_task_init;
   extractor.iter_poly_bm = extract_lines_iter_poly_bm;
   extractor.iter_poly_mesh = extract_lines_iter_poly_mesh;
   extractor.iter_ledge_bm = extract_lines_iter_ledge_bm;
   extractor.iter_ledge_mesh = extract_lines_iter_ledge_mesh;
+  extractor.task_finish = extract_lines_task_finish;
   extractor.finish = extract_lines_with_lines_loose_finish;
   extractor.data_type = MR_DATA_NONE;
-  extractor.use_threading = false;
+  extractor.use_threading = true;
   extractor.mesh_buffer_offset = offsetof(MeshBufferCache, ibo.lines);
   return extractor;
 }
The diff above migrates this to the current master (apply directly on master).
Note that extract_lines_with_lines_loose also needed the task init.

I had to rebase the patch onto the latest master. Not sure what went wrong.

This revision is now accepted and ready to land. Jun 9 2021, 10:48 AM