Page MenuHome

Smart UV: search optimization
ClosedPublic

Authored by Mikkel Gjoel (mikkelgjoel) on Sep 2 2021, 2:17 PM.
Tokens
"Burninate" token, awarded by AlexeyAdamitsky."100" token, awarded by EAW."Party Time" token, awarded by PratikPB2123."Love" token, awarded by gilberto_rodrigues."Like" token, awarded by dulrich."Pterodactyl" token, awarded by Gavriel5578.

Details

Summary

This patch changes the search for unprocessed faces while generating UV-islands from using a flag on faces, to a local N-wide bitfield. It makes no changes to functionality.

This changes the search for unprocessed faces to only search from the previously found face. Local testing on 1.5mio tri meshes gives a 75x speedup (of the code affected, which is the first half of the work).


The former code would traverse all faces of a mesh until a face was found that had not been processed. This ended up being slow mainly because it had to load face-data to determine the state of the flag. Secondarily, the way it iterated and marked the mesh, it would end up traversing all previously processed faces to find and unprocessed one.


Diff Detail

Repository
rB Blender

Event Timeline

Mikkel Gjoel (mikkelgjoel) requested review of this revision.Sep 2 2021, 2:17 PM
Mikkel Gjoel (mikkelgjoel) created this revision.

removed TODO-comment that was already implemented

75x speedup?!! Nice optimization :)

From what I could understand in the patch. I believe much of the API in BLI_bitmap.h could be used here.
Also you could use some of the functions BLI_math_bits.h.
Thus, all these bitwise operations could be deduplicated.

This can use BLI_bitmap probably.
Also https://wiki.blender.org/wiki/Tools/ClangFormat should be applied.

Nice performance increase.

Mikkel Gjoel (mikkelgjoel) planned changes to this revision.EditedSep 2 2021, 3:08 PM

Hans Goudey kindly pointed out BLI_bitmap, will rewrite to use that instead.

Adds todo for BLI_bitmap, fixes non-msc bitscan path.

75x speedup?!! Nice optimization :)

That was badly phrased. It's a 75x speedup to that _half_ of the code, so only around twice faster overall (as the second half is still slow).

From what I could understand in the patch. I believe much of the API in BLI_bitmat.h could be used here.
Also you could use some of the functions BLI_math_bits.h.
Thus, all these bitwise operations could be deduplicated.

Thanks a lot, will rewrite to use that instead!

Mikkel Gjoel (mikkelgjoel) updated this revision to Diff 41383.EditedSep 2 2021, 8:05 PM
Mikkel Gjoel (mikkelgjoel) edited the summary of this revision. (Show Details)

Rewritten to use BLI_bitmap instead of custom bitmap. Performance looks to be the same.

static char bitscan_forward_bitmap(BLI_bitmap const *const bits, const int bitcount, int *const index )

^ could be considered moved to bitmap.c

source/blender/bmesh/intern/bmesh_query.c
2683

Wouldn't it be more efficient to leave everything set to 0 and use BLI_BITMAP_ENABLE instead of BLI_BITMAP_DISABLE?

2695

I wonder if, instead of using this function, it would be better to check the smallest index used while setting the indices.

int face_index = f_other->head.index;
face_index_min = min_ii(face_index_min, face_index);
2732

Here you can use BLI_BITMAP_TEST.
BLI_BITMAP_TEST_BOOL is for cases like when the value is assigned to a bool.

Campbell Barton (campbellbarton) requested changes to this revision.EditedSep 3 2021, 5:22 AM

Could you post a test file which shows the bottleneck?

The main down-side for this patch is that it's still having to search for the first face, even if the search using a bitmap is faster.
It also requires allocating a an array of faces (since BM_face_at_index is used).

An alternative optimization is to only use one iterator (using BM_iter_new & BM_iter_step directly) instead of the BM_ITER_MESH macro which starts at the beginning each time.
This way, the iteration over faces to find the next un-tagged face to initialize stack would continue where the last iteration left off.

As far as I can see this should be at least as fast as this patch - if not faster since it avoids having to allocate and fill a lookup table for faces (in the case the lookup table has not yet been initialized).


Requesting changes since there is a missing function call, although I'd like to see a comparison with the proposed optimization.


NOTE: The same optimization should be applied to BM_mesh_calc_edge_groups, this can be done after the changes to BM_mesh_calc_face_groups have been accepted.
source/blender/bmesh/intern/bmesh_query.c
2700

BM_mesh_elem_table_ensure must be used before this function is called, since the table may not be initialized.

This revision now requires changes to proceed.Sep 3 2021, 5:22 AM
source/blender/bmesh/intern/bmesh_query.c
2683

I did it this way to be able to use the forward bitscan intrinsic. There is probably a way to invert that, but I am not sure how it would be more efficient?

source/blender/bmesh/intern/bmesh_query.c
2695

Interesting, I am not sure I understand how that would work, given that finding the islands is a scattered traversal of faces?

[x][ ][ ][ ][ ][ ][ ][ ] // <-- first index
[x][ ][x][ ][ ][x][ ][ ] // <-- traversed faces after first (3-face) island

^ what is face_index_min here? How would I find the next face_index_min after processing that one, if only tracking a single index?

@Campbell Barton (campbellbarton) The case where the algorithm scales badly, is with number of islands, as it then traverses the entire mesh many times. Subdividing a cube and running the algorithm with a low Angle Limit shows this quite effectively (subdivided cube attached).

Profiling of the subdivided cube with 5deg Angle Limit

The main down-side for this patch is that it's still having to search for the first face, even if the search using a bitmap is faster.

True, I did consider something like a freelist or adding a hierarchy-layer to the search, but decided to try the simple-stupid thing first. Didn't really feel a need to improve it further when it went from 45s to 0.5s.

It also requires allocating a an array of faces (since BM_face_at_index is used).

I haven't hit a case where it's not already there - any idea how to reproduce that case?

An alternative optimization is to only use one iterator (using BM_iter_new & BM_iter_step directly) instead of the BM_ITER_MESH macro which starts at the beginning each time.
This way, the iteration over faces to find the next un-tagged face to initialize stack would continue where the last iteration left off.

As far as I can see this should be at least as fast as this patch - if not faster since it avoids having to allocate and fill a lookup table for faces (in the case the lookup table has not yet been initialized).

@Campbell Barton (campbellbarton) I will give this suggestion a stab.

Added BM_mesh_elem_table_ensure to ensure face-index array exists (thanks @Campbell Barton (campbellbarton) ).

@Campbell Barton (campbellbarton) Your suggestion solves the same bottleneck, and slightly faster \o/

Mikkel Gjoel (mikkelgjoel) edited the summary of this revision. (Show Details)

I have changed the diff to match this suggestion instead. I am a bit uncertain of the style here, but I have just expanded the BM_ITER_MESH macro and moved the iterator out of the outer-loop.

Mikkel Gjoel (mikkelgjoel) marked an inline comment as done.Sep 3 2021, 11:27 AM
Mikkel Gjoel (mikkelgjoel) marked an inline comment as done.Sep 3 2021, 11:30 AM
Campbell Barton (campbellbarton) requested changes to this revision.Sep 3 2021, 12:58 PM

Looks fine, BM_mesh_calc_edge_groups should be updated too.

source/blender/bmesh/intern/bmesh_query.c
2666

BM_CHECK_TYPE_ELEM_ASSIGN isn't typically used outside of this macro. A regular cast will do. Same again below.

This revision now requires changes to proceed.Sep 3 2021, 12:58 PM
Mikkel Gjoel (mikkelgjoel) marked an inline comment as done.Sep 3 2021, 2:49 PM
Mikkel Gjoel (mikkelgjoel) updated this revision to Diff 41416.EditedSep 3 2021, 2:51 PM

removed unrelated change that snuck into former diff

Mikkel Gjoel (mikkelgjoel) updated this revision to Diff 41422.EditedSep 3 2021, 4:29 PM

Blindly updated BM_mesh_calc_edge_groups() to match BM_mesh_calc_face_groups().

@Campbell Barton (campbellbarton) do you know what functionality uses this function so I can test that it behaves as expected? (can't immediately find it from clicking relevant functionality or trawling the code - appears to have been mostly replaced with the _array version).

Mikkel Gjoel (mikkelgjoel) planned changes to this revision.Sep 3 2021, 4:29 PM
Mikkel Gjoel (mikkelgjoel) requested review of this revision.Sep 4 2021, 10:17 AM

Minor formatting update.

This revision was not accepted when it landed; it landed in state Needs Review.Sep 4 2021, 3:09 PM
This revision was automatically updated to reflect the committed changes.