Page MenuHome

Fix T88026: resolve ownership ambiguity with shared physics pointers
AbandonedPublic

Authored by Campbell Barton (campbellbarton) on May 11 2021, 3:10 PM.

Details

Summary

This is a proof of concept patch that fixes T88026, using similar logic to other pointers that are shared between original & evaluated data.

This patch implements the following logic:

  • The shared pointer for rigidbody and softbody is only ever copied when performing ID duplication into same Main (not when NO_MAIN tag).
  • The depsgraph temporarily sets this shared pointer and clears it before freeing the ID data.
  • This way when the shared pointer is set, it can always be freed without the possibility of causing a double-free or having to use ID tags to detect if it should be freed or not (as is currently done). This is error prone as any code that changes tags would need to run logic that update pointer ownership too (which seems unnecessarily complicated)

Open Topics

There are some things to consider with this patch.

  • Should we consider this shared pointer to be part of the object that should not be lost (just as we would not want to loose custom-properties or material list).

    Currently this pointer contains point-cache for softbody, which is written/read to a file.
  • Accessing the shared pointer on the NO_MAIN copy requires looking up the original ID.

    So BKE_libblock_management_main_add (currently unused) for example, won't get the shared pointer set unless there are explicit checks to do so. (currently the shared pointers aren't handled either).
  • This leaves the shared pointer set to NULL, while I couldn't cause a case where this caused a NULL pointer de-reference, it's possible there are situations where the data can be NULL where as before it wouldn't have been.

Diff Detail

Repository
rB Blender
Branch
TEMP-T88026-FIX (branched from master)
Build Status
Buildable 14505
Build 14505: arc lint + arc unit

Event Timeline

Campbell Barton (campbellbarton) requested review of this revision.May 11 2021, 3:10 PM
Campbell Barton (campbellbarton) created this revision.

Added developers who have been involved in this area - if you prefer not to review this patch, feel free to remove yourselves.

Campbell Barton (campbellbarton) edited the summary of this revision. (Show Details)

The shared pointer for rigidbody and softbody is only ever copied when performing regular ID duplication (not when NO_MAIN tag is used).

I might be missing something, but I think the whole reason shared exists is to be able to share a point cache between an original and evaluated datablock. So that evaluated physics can be written into the point cache of the original datablock. And so not copying over that pointer would break that.

The shared pointer for rigidbody and softbody is only ever copied when performing regular ID duplication (not when NO_MAIN tag is used).

I might be missing something, but I think the whole reason shared exists is to be able to share a point cache between an original and evaluated datablock. So that evaluated physics can be written into the point cache of the original datablock. And so not copying over that pointer would break that.

The sharing between copied & evaluated still works (in my tests at least) - it's just that the with this patch it's not done for all evaluated/duplicated data-blocks, only copy-on-write objects, managed by the depsgraph.

This seems to follow the depsgraph design (at least whats already there) most closely.

It might end up being simpler to flag ownership as you suggested though, In general I'd rather avoid having different ownership rules for how evaluated ID's manage shared pointers.

Campbell Barton (campbellbarton) edited the summary of this revision. (Show Details)

Ah, I see now that indeed it will still be set for the depsgraph.

Should we consider this shared pointer to be part of the object that should not be lost (just as we would not want to loose custom-properties or material list).

Not sure when you are thinking of this being lost. But generally we should preserve the point cache settings when duplicating objects or making object instances real?

This leaves the shared pointer set to NULL, while I couldn't cause a case where this caused a NULL pointer de-reference, it's possible there are situations where the data can be NULL where as before it wouldn't have been.

Functions like rna_SoftBodyModifier_point_cache_get dereference this pointer without checking. I don't know if there is a situation where it can be made to crash right now, but to me it seems like a fragile assumption.

Is there a specific reason we can't use LIB_TAG_COPIED_ON_WRITE instead of LIB_TAG_NO_MAIN everywhere for the point cache sharing?

Ah, I see now that indeed it will still be set for the depsgraph.

Should we consider this shared pointer to be part of the object that should not be lost (just as we would not want to loose custom-properties or material list).

Not sure when you are thinking of this being lost. But generally we should preserve the point cache settings when duplicating objects or making object instances real?

Currently there aren't easy ways to trigger this (unless you count running rna_SoftBodyModifier_point_cache_get on evaluated data), if BKE_libblock_management_main_add was used anywhere this could cause a NULL shared pointer.

If we support loading data into the local Main using BLO_library_temp_load_id, that would cause issues too.

This leaves the shared pointer set to NULL, while I couldn't cause a case where this caused a NULL pointer de-reference, it's possible there are situations where the data can be NULL where as before it wouldn't have been.

Functions like rna_SoftBodyModifier_point_cache_get dereference this pointer without checking. I don't know if there is a situation where it can be made to crash right now, but to me it seems like a fragile assumption.

Right, either we would need to add NULL checks everywhere, or at least ensure it exists before accessing, in that case it would loose the point cache from the original.

Is there a specific reason we can't use LIB_TAG_COPIED_ON_WRITE instead of LIB_TAG_NO_MAIN everywhere for the point cache sharing?

We could, we'd need to either add LIB_ID_CREATE_COPIED_ON_WRITE or tag the ID LIB_TAG_COPIED_ON_WRITE before copying data to the destination.

Looking into this, the flag is never cleared at the moment, so as long as it's documented that clearing this flag changes pointer allocation logic, this could work (I'll look into it).

This is an alternative fix that ties the LIB_TAG_COPIED_ON_WRITE flag with shared pointer use: P2115

It has the advantage that depsgraph COW copies are an exception, and they're the only case where physics pointers are shared.