T88352: Use threaded ibo.tris extraction for single material meshes.
Closed, Public

Authored by Jeroen Bakker (jbakker) on May 18 2021, 4:07 PM.

Details

Summary

This patch adds a specific extraction method when the mesh has only
one material. This method is multi-threaded.

There is a trade-off in this patch: the ibo isn't compressed (it adds
restart indices for hidden faces). So the net gain depends on whether
the threading speedup outweighs the larger GPU buffer upload.
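
As an illustration of the trade-off above, here is a minimal sketch of an uncompressed layout in which every triangle owns a fixed slot of three indices and hidden triangles are filled with a restart index. The names `RESTART_INDEX`, `write_tri_slot` and `uncompressed_index_len` are hypothetical and are not taken from the patch; the marker value is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed restart marker for 32-bit indices (hypothetical name). */
#define RESTART_INDEX 0xFFFFFFFFu

/* Write one triangle into its fixed slot. Hidden triangles are filled with
 * restart indices, so a slot position never depends on any other triangle. */
static void write_tri_slot(uint32_t *indices,
                           uint32_t tri_index,
                           bool hidden,
                           uint32_t v0,
                           uint32_t v1,
                           uint32_t v2)
{
  assert(indices != NULL);
  uint32_t *slot = &indices[(size_t)tri_index * 3];
  if (hidden) {
    slot[0] = slot[1] = slot[2] = RESTART_INDEX;
  }
  else {
    slot[0] = v0;
    slot[1] = v1;
    slot[2] = v2;
  }
}

/* The uncompressed buffer always holds 3 indices per triangle, hidden or not. */
static size_t uncompressed_index_len(size_t tri_len)
{
  return tri_len * 3;
}
```

Because each slot is addressed by the triangle index alone, the slots can be filled in any order. The cost is that hidden triangles still occupy space in the buffer that gets uploaded.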

Subdivided cube

I used a cube subdivided 7 times, with modifiers applied. That gives around 400000 faces.

The test selects some vertices and moves them. During this test the following buffers are updated on each frame:

  • vbo.pos_nor
  • vbo.lnor
  • vbo.edit_data
  • ibo.tris
  • ibo.points

System info:

platform: Linux-5.11.0-7614-generic-x86_64-with-glibc2.33
renderer: AMD SIENNA_CICHLID (DRM 3.40.0, 5.11.0-7614-generic, LLVM 11.0.1)
vendor: AMD
version: 4.6 (Core Profile) Mesa 21.0.1
cpu: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
compiler: gcc version 10.3.0

Timings were measured using DEBUG_TIME in draw_cache_extract_mesh.

master:     rdata 8ms, iter 45ms (frame 153ms)
this patch: rdata 6ms, iter 36ms (frame 132ms)

Diff Detail

Repository
rB Blender

Event Timeline

Jeroen Bakker (jbakker) requested review of this revision. May 18 2021, 4:07 PM
Jeroen Bakker (jbakker) created this revision.
Jeroen Bakker (jbakker) edited the summary of this revision.

Is elb a copy in each core? If not, GPU_indexbuf_set_tri_verts doesn't seem to be thread-safe.

elb is the full buffer. As long as elt_index or mlt_index don't overlap (and they don't), there is no thread-safety issue. As I see it, each thread writes to a different location in the elb. This is how the other threaded extractors work.

What I mean is that within the function we have this, which is not safe when run concurrently:

if (builder->index_len < idx) {
    builder->index_len = idx;
}

Between checking if builder->index_len < idx and setting the value, builder->index_len can change in another thread.
We have "atomic" functions for this, but they generally penalize performance a lot.
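
For illustration only, the check-then-set above could be turned into an atomic maximum with a compare-and-swap loop. This is a minimal sketch using standard C11 `<stdatomic.h>`, not Blender's own atomic functions, and the helper name `atomic_max_u32` is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: raise *index_len to idx if idx is larger.
 * The compare-exchange only stores idx when *index_len still equals the
 * value we last observed, which closes the check-then-set window. */
static void atomic_max_u32(_Atomic uint32_t *index_len, uint32_t idx)
{
  assert(index_len != NULL);
  uint32_t current = atomic_load(index_len);
  /* On failure, `current` is refreshed with the latest stored value and the
   * loop condition re-checks whether an update is still needed. */
  while (current < idx && !atomic_compare_exchange_weak(index_len, &current, idx)) {
  }
}
```

Every call that actually updates the value contends on the same memory location across all threads, which is the kind of overhead the comment above is worried about.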

Ah yes. If that is the case we should store the max index locally and have an option to save it during finish.
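
One possible shape of that idea, as a sketch under assumptions: each worker keeps its own maximum and the shared length is computed once after all workers are done. `ThreadLocalMax`, `local_set_tri` and `finish_index_len` are hypothetical names and not part of the GPU module API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread state: the highest index count this worker wrote. */
typedef struct ThreadLocalMax {
  uint32_t index_len;
} ThreadLocalMax;

/* Write a triangle into its own slot and update only the thread-local
 * maximum, so no shared counter is touched while extracting. */
static void local_set_tri(uint32_t *indices,
                          ThreadLocalMax *local,
                          uint32_t tri_index,
                          uint32_t v0,
                          uint32_t v1,
                          uint32_t v2)
{
  assert(indices != NULL && local != NULL);
  uint32_t start = tri_index * 3;
  indices[start + 0] = v0;
  indices[start + 1] = v1;
  indices[start + 2] = v2;
  if (local->index_len < start + 3) {
    local->index_len = start + 3;
  }
}

/* Run once, single-threaded, after all workers have finished:
 * the final length is the maximum over all thread-local values. */
static uint32_t finish_index_len(const ThreadLocalMax *locals, size_t thread_len)
{
  uint32_t index_len = 0;
  for (size_t i = 0; i < thread_len; i++) {
    if (index_len < locals[i].index_len) {
      index_len = locals[i].index_len;
    }
  }
  return index_len;
}
```

The reduction costs one comparison per thread and needs no atomics, provided it runs only after the workers have been joined.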

Germano Cavalcante (mano-wii) requested changes to this revision. Jun 2 2021, 3:09 PM
This revision now requires changes to proceed. Jun 2 2021, 3:09 PM
Jeroen Bakker (jbakker) edited the summary of this revision.

Updated with latest changes in master.

Jeroen Bakker (jbakker) retitled this revision from "[WIP] T88352: Use threaded ibo.tris extraction for single material meshes." to "T88352: Use threaded ibo.tris extraction for single material meshes." Jun 9 2021, 11:35 AM
Jeroen Bakker (jbakker) edited the summary of this revision.
This revision is now accepted and ready to land. Jun 9 2021, 2:09 PM