Page MenuHome

Weld Modifier Connected Mode
ClosedPublic

Authored by Henrik Dick (weasel) on Sep 20 2020, 8:24 PM.
Tokens
"Burninate" token, awarded by Schamph."Love" token, awarded by Raimund58."Burninate" token, awarded by lopoIsaac."Like" token, awarded by duarteframos.

Details

Summary

The Simple Mode from the improvements task (T73139) is implemented for merging along edges. It is now called Connected Mode, while the prior default is now called All.

With the recent performance improvement, the Connected Mode is in some cases only double the speed than the usual merge all strategy but in other cases it may be even faster. The bottleneck is somewhere further down the line of merging geometry.

The need for this patch came from T80897, because the merging in complex solidify is making it really slow. If this patch lands, merging can be removed from solidify without bigger consequences, as this is just a quicker and more advanced algorithm to do the same thing that solidify does slowly after T80897 lands.

Diff Detail

Event Timeline

Henrik Dick (weasel) requested review of this revision.Sep 20 2020, 8:24 PM
Henrik Dick (weasel) created this revision.

Could you submit the performance improvement of the default (Full) Weld Modifier in another patch?
Once the other patch is reviewed and accepted, it can be committed and this one can be updated.

Henrik Dick (weasel) planned changes to this revision.EditedSep 21 2020, 3:10 PM

Waiting for D8972 to land.

Henrik Dick (weasel) updated this revision to Diff 29170.EditedSep 22 2020, 2:14 PM
  • rebase after D8972
  • only include the one feature which is needed to get rid of merge in complex solidify (simple mode)

The other features from the prior revision of this diff, can be added later if needed.

Henrik Dick (weasel) retitled this revision from Weld Performance Improvements (new Simple Mode) to Weld Modifier Simple Mode.Sep 22 2020, 2:16 PM
Henrik Dick (weasel) edited the summary of this revision. (Show Details)
Brecht Van Lommel (brecht) added inline comments.
source/blender/makesrna/intern/rna_modifier.c
6322–6324

Perhaps better naming would be:

  • Connected: Only merged connected vertices within the distance
  • All: Merge all vertices within the distance (slower)

Simple & Full don't really explain what this does, and I think merging only connected geometry may in some cases be the desired behavior rather than just a performance optimization.

Thanks for the patch.
Seeing the code, I have some ideas on how to improve readability/style (mainly of the _ctx_alloc and _ctx_setup functions).
I will work on that later.

These are the warnings raised when compiling:

'v_mask_act': unreferenced formal parameter
'medge': local variable is initialized but not referenced
'totedge': local variable is initialized but not referenced
source/blender/modifiers/intern/MOD_weld.c
401

I'm not sure if "weld_fix_vert_map" is an appropriate name
A comment here would be useful to explain what this function fixes.

2032

It will probably be necessary to create a versioning (in ...\source\blender\blenloader\intern\versioning_290.c) to adapt old files.

Henrik Dick (weasel) marked an inline comment as done.Sep 22 2020, 5:32 PM
Henrik Dick (weasel) added inline comments.
source/blender/makesrna/intern/rna_modifier.c
6322–6324

Agreed. @Germano Cavalcante (mano-wii) as you want to work on the names later anyway, should I do it now or do you want to do this?

source/blender/modifiers/intern/MOD_weld.c
401

Agreed. Just for reference here: The vert_dest_map prior to this function is supposed to converge to the correct value when iterated (for fixpoints). This function will finalize this map by doing the iteration for every vert and writing the result. It also sets all verts that didn't get merged to OUT_OF_CONTEXT.

2032

I dont think so. I made sure current "Full" Mode is mode=0 so when old files get loaded it will automatically choose Full (AFAIK and tested)

Henrik Dick (weasel) marked an inline comment as done.
  • change name to all/connected
  • add comment to function
Henrik Dick (weasel) retitled this revision from Weld Modifier Simple Mode to Weld Modifier Connected Mode.Sep 22 2020, 7:35 PM
Henrik Dick (weasel) edited the summary of this revision. (Show Details)
Germano Cavalcante (mano-wii) requested changes to this revision.Sep 22 2020, 7:40 PM
Germano Cavalcante (mano-wii) added inline comments.
source/blender/modifiers/intern/MOD_weld.c
433

From what I can see here. Neither 'medge' nor 'totegde' are being used.

522

Remove or change this unused parameter to uint UNUSED(v_mask_act).
This avoids warnings on some compilers and makes the code more descriptive.

571

Is it really necessary to allocate these arrays mvert and combined_verts?
Apparently the idea is to get the average coordinate value.
This does not seem to be a good solution as the result may depend on the order in which the vertices are tested.
Also allocating and reading values on different arrays can be inefficient and memory consuming.

Isn't the idea here just to detect the edges that collapse?
That being the case, at first glance I don't really see a good reason for these arrays.

1464

Why was this line removed?

1817

Here instead of a comment, you can use a BLI_assert(MOD_WELD_ALL_MODE).
Thus, in addition to informing what is expected, it can prevent future errors.

This revision now requires changes to proceed.Sep 22 2020, 7:40 PM
Henrik Dick (weasel) marked 3 inline comments as done.
  • cleanup from requested changes
source/blender/modifiers/intern/MOD_weld.c
571

This came from complex solidify. Originally it didn't have that, but the chain merging was quite severe on meshes with ordered vertices like text. Honestly it is also quite bad with the usual weld modifier. D7224 changed that in solidify and I want to keep it like that. If you can think of a different way of getting rid of that chaining with a good result, propose it, but I found this (even though it is order dependend) is fairly good in practice and the order dependency does usually not become a problem (See T75032#896741 for the things I thought of how to solve this). Also the calculated positions could potentially be used as final positions in the future, removing some of the extra overhead.
If that should turn out to be still to slow then an option could be added later to switch between behaviors (also for the "all" weld, if possible).

What I am more concerned with here is, that with very high merging thresholds various spikes appear. I know the reason for it, but it's not trivial to fix.

1464

As weld_mesh should be setup in weld_mesh_context_create I removed vert_groups_map from it and made it an argument to the functions that need it. The problem is that this free will not be called if no vertices where merged, but the memory would still have been allocated. So I wanted both cases in one and put it at the bottom of doWeld.

I think of vert_dest_map as the backbone of the whole modifier, because it is used all throughout. It should not be part of weld_mesh, which is only used for one (conditional) step of the modifier.

I'm not sure I understand the "chain merging" problem.
So I decided to test this patch removing all the changes that at first sight don't seem to be necessary.
Here is the resulting patch:

1From 1f0d0e32f2e2debb4eba2425fab12d73a3754a67 Mon Sep 17 00:00:00 2001
2From: Henrik Dick <weasel>
3Date: Tue, 22 Sep 2020 18:35:40 -0300
4Subject: Weld Modifier Connected Mode
5
6
7diff --git a/source/blender/makesdna/DNA_modifier_types.h b/source/blender/makesdna/DNA_modifier_types.h
8index 9a5b5b8c2ca..e3e4b3017ed 100644
9--- a/source/blender/makesdna/DNA_modifier_types.h
10+++ b/source/blender/makesdna/DNA_modifier_types.h
11@@ -1982,8 +1982,10 @@ typedef struct WeldModifierData {
12 /* Name of vertex group to use to mask, MAX_VGROUP_NAME. */
13 char defgrp_name[64];
14
15+ int mode;
16+
17 short flag;
18- char _pad[6];
19+ char _pad[2];
20 } WeldModifierData;
21
22 /* WeldModifierData->flag */
23@@ -1991,6 +1993,12 @@ enum {
24 MOD_WELD_INVERT_VGROUP = (1 << 0),
25 };
26
27+/* #WeldModifierData->mode */
28+enum {
29+ MOD_WELD_ALL_MODE = 0,
30+ MOD_WELD_CONNECTED_MODE = 1,
31+};
32+
33 typedef struct DataTransferModifierData {
34 ModifierData modifier;
35
36diff --git a/source/blender/makesrna/intern/rna_modifier.c b/source/blender/makesrna/intern/rna_modifier.c
37index 2463f3c409f..8cc1d2333be 100644
38--- a/source/blender/makesrna/intern/rna_modifier.c
39+++ b/source/blender/makesrna/intern/rna_modifier.c
40@@ -6318,6 +6318,12 @@ static void rna_def_modifier_weld(BlenderRNA *brna)
41 StructRNA *srna;
42 PropertyRNA *prop;
43
44+ static const EnumPropertyItem mode_items[] = {
45+ {MOD_WELD_CONNECTED_MODE, "CONNECTED", 0, "Connected", "Only merge along the edges."},
46+ {MOD_WELD_ALL_MODE, "ALL", 0, "All", "Full merge by distance."},
47+ {0, NULL, 0, NULL, NULL},
48+ };
49+
50 srna = RNA_def_struct(brna, "WeldModifier", "Modifier");
51 RNA_def_struct_ui_text(srna, "Weld Modifier", "Weld modifier");
52 RNA_def_struct_sdna(srna, "WeldModifierData");
53@@ -6341,6 +6347,11 @@ static void rna_def_modifier_weld(BlenderRNA *brna)
54 "(0 makes it infinite)");
55 RNA_def_property_update(prop, 0, "rna_Modifier_update");
56
57+ prop = RNA_def_property(srna, "mode", PROP_ENUM, PROP_NONE);
58+ RNA_def_property_enum_items(prop, mode_items);
59+ RNA_def_property_ui_text(prop, "Mode", "Mode defines the merge rule.");
60+ RNA_def_property_update(prop, 0, "rna_Modifier_update");
61+
62 prop = RNA_def_property(srna, "vertex_group", PROP_STRING, PROP_NONE);
63 RNA_def_property_string_sdna(prop, NULL, "defgrp_name");
64 RNA_def_property_ui_text(
65diff --git a/source/blender/modifiers/intern/MOD_weld.c b/source/blender/modifiers/intern/MOD_weld.c
66index d7d24062fc5..a460bd33fe3 100644
67--- a/source/blender/modifiers/intern/MOD_weld.c
68+++ b/source/blender/modifiers/intern/MOD_weld.c
69@@ -1673,43 +1673,81 @@ static Mesh *weldModifier_doWeld(WeldModifierData *wmd, const ModifierEvalContex
70 }
71
72 /* Get overlap map. */
73- /* TODO: For a better performanse use KD-Tree. */
74- struct BVHTreeFromMesh treedata;
75- BVHTree *bvhtree = bvhtree_from_mesh_verts_ex(&treedata,
76- mvert,
77- totvert,
78- false,
79- v_mask,
80- v_mask_act,
81- wmd->merge_dist / 2,
82- 2,
83- 6,
84- 0,
85- NULL,
86- NULL);
87-
88- if (v_mask) {
89- MEM_freeN(v_mask);
90- }
91-
92- if (bvhtree == NULL) {
93- return result;
94+ BVHTreeOverlap *overlap = NULL;
95+ uint overlap_len = 0;
96+ float merge_dist_sq = square_f(wmd->merge_dist);
97+ if (wmd->mode == MOD_WELD_ALL_MODE) {
98+ /* TODO: For a better performanse use KD-Tree. */
99+ struct BVHTreeFromMesh treedata;
100+ BVHTree *bvhtree = bvhtree_from_mesh_verts_ex(&treedata,
101+ mvert,
102+ totvert,
103+ false,
104+ v_mask,
105+ v_mask_act,
106+ wmd->merge_dist / 2,
107+ 2,
108+ 6,
109+ 0,
110+ NULL,
111+ NULL);
112+
113+ if (bvhtree) {
114+ struct WeldOverlapData data;
115+ data.mvert = mvert;
116+ data.merge_dist_sq = merge_dist_sq;
117+
118+ overlap = BLI_bvhtree_overlap_ex(bvhtree,
119+ bvhtree,
120+ &overlap_len,
121+ bvhtree_weld_overlap_cb,
122+ &data,
123+ wmd->max_interactions,
124+ BVH_OVERLAP_RETURN_PAIRS);
125+
126+ free_bvhtree_from_mesh(&treedata);
127+ }
128 }
129+ else {
130+ BLI_assert(wmd->mode == MOD_WELD_CONNECTED_MODE);
131+ overlap_len = 0;
132+ for (MEdge *me = mesh->medge, *me_end = me + mesh->totedge; me < me_end; me++) {
133+ if (v_mask && (!BLI_BITMAP_TEST(v_mask, me->v1) || !BLI_BITMAP_TEST(v_mask, me->v2))) {
134+ me->flag &= ~ME_EDGE_TMP_TAG;
135+ continue;
136+ }
137
138- struct WeldOverlapData data;
139- data.mvert = mvert;
140- data.merge_dist_sq = square_f(wmd->merge_dist);
141+ if (len_squared_v3v3(mvert[me->v1].co, mvert[me->v2].co) <= merge_dist_sq) {
142+ me->flag |= ME_EDGE_TMP_TAG;
143+ overlap_len++;
144+ }
145+ else {
146+ me->flag &= ~ME_EDGE_TMP_TAG;
147+ }
148+ }
149
150- uint overlap_len;
151- BVHTreeOverlap *overlap = BLI_bvhtree_overlap_ex(bvhtree,
152- bvhtree,
153- &overlap_len,
154- bvhtree_weld_overlap_cb,
155- &data,
156- wmd->max_interactions,
157- BVH_OVERLAP_RETURN_PAIRS);
158+ if (overlap_len) {
159+ overlap = MEM_mallocN(sizeof(*overlap) * overlap_len, __func__);
160+ BVHTreeOverlap *overlap_iter = &overlap[0];
161+ for (MEdge *me = mesh->medge, *me_end = me + mesh->totedge; me < me_end; me++) {
162+ if (me->flag & ME_EDGE_TMP_TAG) {
163+ if (me->v1 < me->v2) {
164+ overlap_iter->indexA = me->v1;
165+ overlap_iter->indexB = me->v2;
166+ }
167+ else {
168+ overlap_iter->indexA = me->v2;
169+ overlap_iter->indexB = me->v1;
170+ }
171+ overlap_iter++;
172+ }
173+ }
174+ }
175+ }
176
177- free_bvhtree_from_mesh(&treedata);
178+ if (v_mask) {
179+ MEM_freeN(v_mask);
180+ }
181
182 if (overlap_len) {
183 WeldMesh weld_mesh;
184@@ -1905,7 +1943,7 @@ static Mesh *weldModifier_doWeld(WeldModifierData *wmd, const ModifierEvalContex
185 weld_mesh_context_free(&weld_mesh);
186 }
187
188- MEM_freeN(overlap);
189+ MEM_SAFE_FREE(overlap);
190 return result;
191 }
192
193@@ -1943,10 +1981,17 @@ static void panel_draw(const bContext *UNUSED(C), Panel *panel)
194 PointerRNA ob_ptr;
195 PointerRNA *ptr = modifier_panel_get_property_pointers(panel, &ob_ptr);
196
197+ int mode = RNA_enum_get(ptr, "mode");
198+
199 uiLayoutSetPropSep(layout, true);
200
201+ uiItemR(layout, ptr, "mode", 0, NULL, ICON_NONE);
202 uiItemR(layout, ptr, "merge_threshold", 0, IFACE_("Distance"), ICON_NONE);
203- uiItemR(layout, ptr, "max_interactions", 0, NULL, ICON_NONE);
204+
205+ if (mode == MOD_WELD_ALL_MODE) {
206+ uiItemR(layout, ptr, "max_interactions", 0, NULL, ICON_NONE);
207+ }
208+
209 modifier_vgroup_ui(layout, ptr, &ob_ptr, "vertex_group", "invert_vertex_group", NULL);
210
211 modifier_panel_end(layout, ptr);

It worked very well in my tests and is certainly more efficient.
Can you provide a file that shows the problem mentioned?

Oh right I missed to mention one thing. Solidify used the original vertex positions back then (and now the ones from this new array), so the problem was much worse in that case, as it always merged to the last vertex. In case of the weld modifier it's a bit better because it merges to the center of the merged verts, so it's not order dependend. I mean removing those array would make it faster, but just look at the difference.

Steps to reproduce the problem:
(apply P1649 and run blender)

  1. Open the file in T75032 (F8369236).
  2. Disable the Solidify Modifier. And add a Weld Modifier.
  3. Increase parameter "Resolution Preview U" from 1 to 5 or more and watch.
  4. Now enable solidify again and disable weld (make sure they have the same merge threshold).
  5. repeat step 3. and watch the difference. My current patch has the same behavior as solidify there.

Another way to clearly see the difference is to apply 5 subsurf levels on a cube and then add a weld modifier and slowly turn up the merge threshold.

EDIT:
Hold on, I was just testing it a bit more and it doesn't immediatly seem to show the issue that I though would definitly happen. So I will take some time and investigate further... maybe there is a good way to get rid of something here.

Henrik Dick (weasel) added a comment.EditedSep 23 2020, 1:54 PM

Ok I investigated a bit further. First of all, just removing those two arrays from my implementation does not seem to change the result too much (in the image below, you would not immediatly see the quality difference). It's just a bit more unpredictable and in some special cases really bad (what I call chain merging, observable in F8369236).

But your proposed patch still used overlaps and therefore creates significantly different results. Look at this comparison (same settings and everything):

using overlap
using vert_dest_map directly and the helper arrays

Also note that I could increase the merge distance, but then we would be looking at an empty image on the left and a still very good image at the right.

I also checked the performance by adding a wave modifier and observing the fps. My results were:

  1. There is no obvious difference whether those additional array are used or not. (at least in my testing)
  2. Your patch is in this case 30% faster, but mostly because there are much less vertices in the result that need to get calculated.

I understand that the solution on the right seems to be better. But you are using an unconventional method to get an order-dependent result.
Unnecessarily complicating the code to make the "connected" method behave quite differently from "full".
Merging chain of vertices that overlap in a single point was by design.
If this behavior has to change, it has to affect both methods (full and connected).
And it has to be done preferably in a separate patch.

Ok but then I still need to make that patch, because I need that to get rid of solidifys merge.

Henrik Dick (weasel) planned changes to this revision.Sep 24 2020, 7:38 PM

Waiting for D8995 to land first.

  • rebase

with the recent changes, this patch got much simpler. It also now matches the default behavior.

@Germano Cavalcante (mano-wii) you did appear to dislike the algorithm. Do you have any thoughts on how it could be improved, without going down the patch of the old weld behavior?

Also, since we are introducing a new mode option, we could also make the old bvh behavior accessible again, for compatibility.

  • rebase
  • add default mode value
  • rebase
  • remove unused struct fields

I want to give an overview over the current implementation here.

Connected Mode uses edge connections to merge vertices.
This can be done in three basic ways.

  1. Tag too short edges and then collapse all of them.
    • Order independent
    • Even very big usual meshes can collapse to a single vert if the individual edges are too short (heavy infinite chain merging)
    • result looks much like the BVH Tree solution and is similar in implementation
    • probably the fastest of all (not by much though)
  2. Merge short edges one by one in edge order.
    • edge order dependent as a merged edge will always change the length of adjacent edges of at least one of the merged verts.
    • result looks like the current KD Tree solution and is somewhat similar in implementation.
    • There are multiple ways to merge the too short edges with different results.
      1. Always merge to vertex with lowest index (simplest/fastest implementation) -> heavy infinite chain merging for text objects and curves.
      2. Always merge to the mean of the resulting cluster (current) -> only infinite chain merging for a crafted leaning tower of lire (very very rare and not severe)
      3. Always merge to the mean of the resulting cluster and enlarge the radius of the cluster to contain all merged verts, only merge when the resulting radius is less than the merge threshold -> no infinite chain merging possible (tested)
      4. Always merge to the first merged vertex in the clusters (on first merge, merge to the vert with lower index, new single verts will be merged to that one as well, when two clusters merge, merge to the first created cluster) -> infinite chain merging for edges with less than half the merge length for some edge orders, not for usual text/curves. (theory at this point, will maybe test and add a paste)
    • Loose edges and stray (non-manifold) triangles appear on very high merge thresholds even when starting from smooth manifold surfaces. This happens because sometimes the edge order collapses a loop before ever encountering the inner verts. If the collapsed loop has a far distance to the inside verts, they will stay behind. This happens with all the strategies A-D. Additional filtering would be required to remove these artifacts (I could add a TODO in the code).
  3. Do Incremental Decimation like in http://graphics.stanford.edu/courses/cs468-10-fall/LectureSlides/08_Simplification.pdf
    • like the Decimate Modifier in Collapse Mode but with merge distance instead of a percentage. Would be simpler to do it there I guess.
    • quite heavy, requires sorting of the edges and sorted manipulation (needs fitting data structure)
    • will automatically keep topological shape

2.B. is the currently implemented solution.

Thanks for the info,
But I still feel that this code can be simplified.
I do not see the need to make a solution that seeks order independence and in which the center of each cluster has to be computed in advance.
Kdtree does not do this, so this new solution just needs to be differentiated by performance.

But I tried a different and simplified solution and did not get a satisfactory result.

1diff --git a/source/blender/modifiers/intern/MOD_weld.c b/source/blender/modifiers/intern/MOD_weld.c
2index 7e4897dd3bc..0865ba2ed24 100644
3--- a/source/blender/modifiers/intern/MOD_weld.c
4+++ b/source/blender/modifiers/intern/MOD_weld.c
5@@ -27,7 +27,7 @@
6 * - Review weight and vertex color interpolation.;
7 */
8
9-//#define USE_WELD_DEBUG
10+#define USE_WELD_DEBUG
11 //#define USE_WELD_NORMALS
12 //#define USE_BVHTREEKDOP
13
14@@ -1567,11 +1567,6 @@ static bool bvhtree_weld_overlap_cb(void *userdata, int index_a, int index_b, in
15 }
16 #endif
17
18-struct WeldVertexCluster {
19- float co[3];
20- uint merged_verts;
21-};
22-
23 static Mesh *weldModifier_doWeld(WeldModifierData *wmd, const ModifierEvalContext *ctx, Mesh *mesh)
24 {
25 Mesh *result = mesh;
26@@ -1709,74 +1704,48 @@ static Mesh *weldModifier_doWeld(WeldModifierData *wmd, const ModifierEvalContex
27 #endif
28 else {
29 BLI_assert(wmd->mode == MOD_WELD_CONNECTED_MODE);
30-
31 MEdge *medge, *me;
32- float edgedir[3];
33-
34 medge = mesh->medge;
35 totvert = mesh->totvert;
36 totedge = mesh->totedge;
37
38- struct WeldVertexCluster *vert_clusters = MEM_calloc_arrayN(
39- totvert, sizeof(*vert_clusters), __func__);
40- struct WeldVertexCluster *vc = &vert_clusters[0];
41- for (uint i = 0; i < totvert; i++, vc++) {
42- copy_v3_v3(vc->co, mvert[i].co);
43+ for (uint i = 0; i < totvert; i++) {
44+ vert_dest_map[i] = OUT_OF_CONTEXT;
45 }
46- float merge_dist_sq = square_f(wmd->merge_dist);
47-
48- range_vn_u(vert_dest_map, totvert, 0);
49
50 /* Collapse Edges that are shorter than the threshold. */
51+ const float merge_dist = wmd->merge_dist;
52 me = &medge[0];
53 for (uint i = 0; i < totedge; i++, me++) {
54+ uint v, v_dst;
55 uint v1 = me->v1;
56 uint v2 = me->v2;
57-
58- while (v1 != vert_dest_map[v1]) {
59- v1 = vert_dest_map[v1];
60- }
61- while (v2 != vert_dest_map[v2]) {
62- v2 = vert_dest_map[v2];
63- }
64- if (v1 == v2) {
65- continue;
66- }
67 if (v_mask && (!BLI_BITMAP_TEST(v_mask, v1) || !BLI_BITMAP_TEST(v_mask, v2))) {
68 continue;
69 }
70- if (v1 > v2) {
71- SWAP(uint, v1, v2);
72+
73+ uint v1_dst = vert_dest_map[v1];
74+ uint v2_dst = vert_dest_map[v2];
75+ if (v1_dst != OUT_OF_CONTEXT && v2_dst != OUT_OF_CONTEXT) {
76+ continue;
77 }
78- struct WeldVertexCluster *v1_cluster = &vert_clusters[v1];
79- struct WeldVertexCluster *v2_cluster = &vert_clusters[v2];
80-
81- sub_v3_v3v3(edgedir, v2_cluster->co, v1_cluster->co);
82- const float dist_sq = len_squared_v3(edgedir);
83- if (dist_sq <= merge_dist_sq) {
84- float influence = (v2_cluster->merged_verts + 1) /
85- (float)(v1_cluster->merged_verts + v2_cluster->merged_verts + 2);
86- madd_v3_v3fl(v1_cluster->co, edgedir, influence);
87-
88- v1_cluster->merged_verts += v2_cluster->merged_verts + 1;
89- vert_dest_map[v2] = v1;
90- vert_kill_len++;
91+ else if (v1_dst != OUT_OF_CONTEXT) {
92+ v = v2;
93+ v_dst = v1_dst;
94 }
95- }
96-
97- MEM_freeN(vert_clusters);
98-
99- for (uint i = 0; i < totvert; i++) {
100- if (i == vert_dest_map[i]) {
101- vert_dest_map[i] = OUT_OF_CONTEXT;
102+ else if (v2_dst != OUT_OF_CONTEXT) {
103+ v = v1;
104+ v_dst = v2_dst;
105 }
106 else {
107- uint v = i;
108- while (v != vert_dest_map[v] && vert_dest_map[v] != OUT_OF_CONTEXT) {
109- v = vert_dest_map[v];
110- }
111- vert_dest_map[v] = v;
112- vert_dest_map[i] = v;
113+ v = v1;
114+ v_dst = v2;
115+ }
116+
117+ if (compare_v3v3(mvert[v].co, mvert[v_dst].co, merge_dist)) {
118+ vert_dest_map[v] = v_dst;
119+ vert_dest_map[v_dst] = v_dst;
120+ vert_kill_len++;
121 }
122 }
123 }

So there must be a lot more involved.

Can you show the results of profiling, for us to see how much advantage this patch brings?

I tried your patch. It's as you said not satisfactory, but I did the profiling nevertheless.
Turns out your patch runs on average roughly (surprise) half as fast as the current solution of this patch.
(obviously that only applies for my test object and not for any object, see below)
The reason for that is, that the geometry created by your merge algorithm is more (and more complicated?),
so more time is spent in the second stage of the weld modifier.

This goes to show, that as long as the algorithm is in linear execution time and cares for quality, performance will not be the issue.

I didn't find the code of the KD Tree yet, because there was some trickery in the files which made it hard to find.
Nevertheless I think the KD Tree does not need the average positions to avoid chain merging, because it is evaluated
over all vertices. Therefore it can disperse the vertices into clusters of bounded size. When only working with edges
it is necessary to keep track of the cluster size in some other way. Especially because edges may also merge whole clusters.

I also want to add a screenshot of the loose edges and stray triangles that I was talking about.
This screenshot shows the current patch. It is also the object on which I tested the performance difference.

I also gained more confidence in the current patch while writing the summary. I think it should be fine (the best solution)
and these artifacts can be fixed later if needed as they will not be a problem in most common cases AFAICT.

I think we can accept these loose edges for now.
But I still don't see the need for these struct WeldVertexCluster *vert_clusters.
I made a comparison without it and the result didn't seem to be that different.

I don't see much of the benefit of calculating the center of the clusters, the result is not perfect in both cases.

diff --git a/source/blender/modifiers/intern/MOD_weld.c b/source/blender/modifiers/intern/MOD_weld.c
index 7e4897dd3bc..7372f440514 100644
--- a/source/blender/modifiers/intern/MOD_weld.c
+++ b/source/blender/modifiers/intern/MOD_weld.c
@@ -1567,11 +1567,6 @@ static bool bvhtree_weld_overlap_cb(void *userdata, int index_a, int index_b, in
 }
 #endif
 
-struct WeldVertexCluster {
-  float co[3];
-  uint merged_verts;
-};
-
 static Mesh *weldModifier_doWeld(WeldModifierData *wmd, const ModifierEvalContext *ctx, Mesh *mesh)
 {
   Mesh *result = mesh;
@@ -1717,13 +1712,7 @@ static Mesh *weldModifier_doWeld(WeldModifierData *wmd, const ModifierEvalContex
     totvert = mesh->totvert;
     totedge = mesh->totedge;
 
-    struct WeldVertexCluster *vert_clusters = MEM_calloc_arrayN(
-        totvert, sizeof(*vert_clusters), __func__);
-    struct WeldVertexCluster *vc = &vert_clusters[0];
-    for (uint i = 0; i < totvert; i++, vc++) {
-      copy_v3_v3(vc->co, mvert[i].co);
-    }
-    float merge_dist_sq = square_f(wmd->merge_dist);
+    float merge_dist = wmd->merge_dist;
 
     range_vn_u(vert_dest_map, totvert, 0);
 
@@ -1748,24 +1737,13 @@ static Mesh *weldModifier_doWeld(WeldModifierData *wmd, const ModifierEvalContex
       if (v1 > v2) {
         SWAP(uint, v1, v2);
       }
-      struct WeldVertexCluster *v1_cluster = &vert_clusters[v1];
-      struct WeldVertexCluster *v2_cluster = &vert_clusters[v2];
 
-      sub_v3_v3v3(edgedir, v2_cluster->co, v1_cluster->co);
-      const float dist_sq = len_squared_v3(edgedir);
-      if (dist_sq <= merge_dist_sq) {
-        float influence = (v2_cluster->merged_verts + 1) /
-                          (float)(v1_cluster->merged_verts + v2_cluster->merged_verts + 2);
-        madd_v3_v3fl(v1_cluster->co, edgedir, influence);
-
-        v1_cluster->merged_verts += v2_cluster->merged_verts + 1;
+      if (compare_v3v3(mvert[v1].co, mvert[v2].co, merge_dist)) {
         vert_dest_map[v2] = v1;
         vert_kill_len++;
       }
     }
 
-    MEM_freeN(vert_clusters);
-
     for (uint i = 0; i < totvert; i++) {
       if (i == vert_dest_map[i]) {
         vert_dest_map[i] = OUT_OF_CONTEXT;

Here is a comparison for with and without using mean position (with your paste). You can see minor differences in the mesh in the lower part of the video (less streched triangles on the left), but for text objects the difference is really clear.
The reason text objects show this effect of chain merging entire regions when the edges are short, is the element order. It is completely sorted.

This Testfile contains the usual cases where some algorithms break down. I didn't do the leaning tower of lire example, because it is a lot of work to craft it with the right element order.

In fact with text objects the result is better.
The code in general is fine in my opinion.

This revision is now accepted and ready to land.Oct 7 2020, 2:25 PM

Minor changes to weld modifier connected mode.

  • Order "All" first in the enum since it's the default.
  • Reduce variable scope
  • Use const.
  • Replace calloc with malloc since most data was being immediately written to.
  • Re-use padding variable instead of adding padding to struct.