Page MenuHome

Fix T81077 id_management test on macOS
ClosedPublic

Authored by Ankit Meel (ankitm) on Oct 22 2020, 7:30 PM.

Details

Summary

This looks like a optimizer bug where it makes wrong assumptions.
The code inside lib_id_delete:264 on rBafd13710b897cc1c11b
for (id = last_remapped_id->next; id; id = id->next) {..}
is not executed in release/relwithdebinfo builds.

This can be "fixed" by several ways:

  • Adding a line that prints the last_remapped_id->name right before the said for-loop starts.
  • Turning off optimization for the whole function id_delete: #pragma clang optimize off/on Ray Molenkamp
  • Marking last_remapped_id volatile. Julian Eisel
  • Marking tagged_deleted_ids volatile. But it adds a warning when calling BLI_addtail: discards volatile qualifier. Discovered by accident.

Fix T81077

Diff Detail

Repository
rB Blender

Event Timeline

Ankit Meel (ankitm) requested review of this revision.Oct 22 2020, 7:30 PM
Ankit Meel (ankitm) created this revision.
Ankit Meel (ankitm) edited the summary of this revision. (Show Details)
Ankit Meel (ankitm) edited the summary of this revision. (Show Details)Oct 22 2020, 7:32 PM
Ankit Meel (ankitm) edited the summary of this revision. (Show Details)Oct 22 2020, 7:34 PM
Ankit Meel (ankitm) retitled this revision from Fix id_management test on macOS to Fix T81077 id_management test on macOS.
Ankit Meel (ankitm) edited the summary of this revision. (Show Details)

Just for context, repeating what I said earlier in the chat - this is what I think goes on:

  • The crash makes sense to me now, the "MECube" (which is the default cube from the factory-startup I think) is tagged properly for deletion, but not the "OBCube" - because the loop isn't executed correctly.
  • The optimizer realizes that tagged_deleted_ids is collected the same way on each iteration (which is based on ID.tag) and it operates on the same data, so it assumes it will always yield the same result.
  • It does not realise however that ID.tag is changed within the loop, somewhere deeper down the call-stack (here in fact https://developer.blender.org/diffusion/B/browse/master/source/blender/blenkernel/intern/lib_remap.c$158).

Note that this should be reported to the Clang team. The Godbolt links we used in the chat will hopefully be enough, although if my analysis above is correct, the code on Godbolt won't show the error, as the part of ID.tag being changed indirectly in the loop is missing. Maybe a simple (non-inline!) function that just modifies ID.tag will be enough.

Bastien Montagne (mont29) requested changes to this revision.Oct 24 2020, 4:34 PM

Think this is a reasonable work around OSX compiler bug for now. But please do explain why we need this variable to be volatile for now.

source/blender/blenkernel/intern/lib_id_delete.c
264

This requires a comment stating that it is a temporary work around to address a bug in OSX Clang compiler optimizer.

There is no reasons for this variable to be volatile in itself otherwise...

This revision now requires changes to proceed.Oct 24 2020, 4:34 PM
source/blender/blenkernel/intern/lib_id_delete.c
264

Should I add a preprocessor directive __APPLE__?

Ankit Meel (ankitm) updated this revision to Diff 30330.EditedOct 24 2020, 7:05 PM
  • add comment

never mind the preprocessor comment.. it breaks formatting,

diff --git a/source/blender/blenkernel/intern/lib_id_delete.c b/source/blender/blenkernel/intern/lib_id_delete.c
index 25c48479ef9..03315453e61 100644
--- a/source/blender/blenkernel/intern/lib_id_delete.c
+++ b/source/blender/blenkernel/intern/lib_id_delete.c
@@ -263,7 +263,11 @@ static void id_delete(Main *bmain, const bool do_tagged_deletion)
       ID *id, *id_next;
       /* Marked volatile to avoid a macOS Clang optimization bug. See T81077.
        * #last_remapped_id.next is assumed to be NULL by optimizer which is wrong. */
-      volatile ID *last_remapped_id = tagged_deleted_ids.last;
+#ifdef __APPLE__
+
+      volatile
+#endif
+          ID *last_remapped_id = tagged_deleted_ids.last;
       keep_looping = false;
 
       /* First tag and remove from Main all datablocks directly from target lib.
Ankit Meel (ankitm) marked an inline comment as done.Oct 24 2020, 7:16 PM

Should this be committed to 2.91 release also ?

Yes, it should also be comited to 2.91, obviously.

source/blender/blenkernel/intern/lib_id_delete.c
264

am not sure it would be worth it, this should not have impact on other platforms?

This revision is now accepted and ready to land.Oct 26 2020, 9:14 AM
Ankit Meel (ankitm) edited the summary of this revision. (Show Details)Oct 26 2020, 10:28 AM
This revision was automatically updated to reflect the committed changes.
source/blender/blenkernel/intern/lib_id_delete.c
264

I have a hard time believing that this is apple specific, afaik Apple doesn't do such in-depth changes to their version of Clang. With some configurations or versions of Clang the error may still happen on Linux.
So getting rid of the #ifdef was the right decision I think.

Ankit Meel (ankitm) marked 2 inline comments as done.Oct 26 2020, 11:36 AM
Ankit Meel (ankitm) added inline comments.
source/blender/blenkernel/intern/lib_id_delete.c
264

__APPLE__ would've covered macOS clang from both LLVM and Apple.
I thought of it just to express intent.
But it breaks formatting D9315#231100 , and adds nothing more than verbosity.
Marking comments as done.

Ankit Meel (ankitm) marked an inline comment as done.Oct 26 2020, 8:17 PM

Bug report on clang/llvm tracker : https://bugs.llvm.org/show_bug.cgi?id=47984