Page MenuHome

Fix T93090: Crash with data transfer modifier and geometry nodes.
ClosedPublic

Authored by Jacques Lucke (JacquesLucke) on Nov 22 2021, 8:24 PM.

Details

Summary

Think I found the root cause of the bug. Some notes:

  • The normal layers are computed correctly when GEO-bird_body is evaluated.
  • Later in the BKE_object_get_evaluated_mesh function that is called by the transfer modifier, the mesh was copied. The copy created by mesh_copy_data does not include the normal layers because CD_MASK_NORMAL is not in CD_MASK_MESH or CD_MASK_DERIVEDMESH.
  • The copy happens because in another thread geometry nodes is using the evaluated mesh of GEO-bird_body while evaluating another object. Since now the mesh is shared (or rather the MeshComponent actually), the mesh has to be copied when geometry_set_eval->get_mesh_for_write is called.
  • Mid term, BKE_object_get_evaluated_mesh should probably return a const mesh. Theoretically, it should only be changed when updating caches (like the batch cache) which are safe to update, because they are derived data and do not actually change the mesh. Adding derived data to a mesh should generally be protected by a mutex. For batch cache that might not be necessary when we know that only the main thread adds the batch cache. There seems to be some other code that modifies the mesh returned by BKE_object_get_evaluated_mesh, that needs to be investigated further. It's also a bit weird that BKE_object_get_evaluated_mesh takes a const Object * but returns a non-const Mesh *.
  • For now just use get_mesh_for_read with a const_cast instead of using get_mesh_for_write. That should match the old behavior without GeometrySet.

Diff Detail

Repository
rB Blender
Branch
fix-missing-normals-layer (branched from master)
Build Status
Buildable 18876
Build 18876: arc lint + arc unit

Event Timeline

Jacques Lucke (JacquesLucke) requested review of this revision.Nov 22 2021, 8:24 PM
Jacques Lucke (JacquesLucke) created this revision.
Jacques Lucke (JacquesLucke) edited the summary of this revision. (Show Details)

Seems ok, I guess this get_mesh_for_read is expected to have the same level of constness as object->runtime.data_eval anyway.

Later there can later be a refactor in master to make BKE_object_get_evaluated_mesh return const Mesh*.

Thanks for the investigation. A const return value from this function is a good goal for 3.1, but agreed that this is the right call for 3.0.

This revision is now accepted and ready to land.Nov 22 2021, 9:12 PM

Modifiers request specific customdata for their own mesh through their requiredDataMask callback (see e.g. https://developer.blender.org/diffusion/B/browse/master/source/blender/modifiers/intern/MOD_normal_edit.c$662 for a simpler example than the datatransfer one).

The modifier stack is suppose to ensure that this data is available and up to date before calling the modifier.

When the modifier needs specific customdata from its other operand(s), this is defined from its updateDepsgraph callback, using calls to DEG_add_customdata_mask (see e.g. https://developer.blender.org/diffusion/B/browse/master/source/blender/modifiers/intern/MOD_datatransfer.c$131). The depsgraph is then responsible to evaluate those operand meshes with the proper options to get those data in the final evaluated result.

Why this is not working properly with geometry node I don't know... But imho from a quick look at this patch it looks like a quick and dirty hack, not a proper solution?

Not saying that this is not a valid 'fix' for 3.0 btw. But we also need a proper fix for 3.1.

Why this is not working properly with geometry node I don't know... But imho from a quick look at this patch it looks like a quick and dirty hack, not a proper solution?

All of what you describe is working. It's just that there was a incorrect copy of the mesh made in BKE_object_get_evaluated_mesh that did not include the data that was requested by other objects (because mesh_copy_data does not copy all custom data layers). This fix is actually to not copy the mesh which is what this patch does.

The "proper" solution is to return a const Mesh * from BKE_object_get_evaluated_mesh to make it more obvious that this copy is not necessary.

OK, think I understand better now, thx.
Would be interesting though to know which "areas expect to be able to modify the evaluated mesh" then... duplicating the mesh in that call was also a huge potential performance issue.

Would be interesting though to know which "areas expect to be able to modify the evaluated mesh" then... duplicating the mesh in that call was also a huge potential performance issue.

It's mostly the batch cache, and some remaining places that add derived data to the mesh but aren't protected by a lock. There's also a couple in the paint code. Definitely worth resolving soon.