
SIGSEGV on running a frame_change_pre handler
Closed, ArchivedPublic

Description

System Information
System: Linux 4.11.4, x86_64 GNU/Linux
Processor AMD Ryzen 7
Advanced Micro Devices, Inc. [AMD/ATI] Hawaii PRO [Radeon R9 290/390] (rev 80)

Blender Version
Broken: 2.78a, 2.79a, 2.79b

Short description of error
SIGSEGV when running a frame_change_pre handler in the presence of a rigid body model. The problem disappears when most objects are deleted, when the frame_change handler is not installed, or when the (Python) string formatting is not used. Tested on 2.78a, 2.79a, 2.79 and a build from the git repository (hash: 96fba1e1016).

Exact steps for others to reproduce the error

  • load attached .blend file
  • run script
  • animate

Typically crashes while rendering, somewhere between frames 2 and 18.

The problem appears to involve a race condition in dag_get_node() (blender-git/blender/source/blender/blenkernel/intern/depsgraph.c). Guarding the function with the (existing) spinlock threaded_update_lock solves the problem, but I'm new to the project and don't have enough of an overview to decide whether it's a fix or a work-around.

DagNode *dag_get_node(DagForest *forest, void *fob)
{
	DagNode *node;

+	BLI_spin_lock(&threaded_update_lock);
	node = dag_find_node(forest, fob);
	if (!node)
		node = dag_add_node(forest, fob);
+	BLI_spin_unlock(&threaded_update_lock);

	return node;
}

Event Timeline

Can't get it to crash here.
(framehandler works fine and I played the animation in viewport as well as rendered the whole sequence...)

Do you also have issues when using the new dependency graph? (I'm not sure issues in the old dependency graph have a high chance of being fixed.)
(run blender from the commandline with --enable-new-depsgraph)

Race conditions are tricky to reproduce. Just tried that with 2.79b and it crashes with and without the new depsgraph, though the stack-trace is different.


On my computer this scenario crashes quite predictably, though on different frames. If you want me to try other versions, I'll gladly oblige.

Same thing with 2.78 (with --enable-new-depsgraph)

Cloned a fresh copy from the repository... Same thing. (with and without new-deps-graph)

Reinstate spinlock (as indicated above): problem gone, file renders correctly.

Besides... It's a classic case where some form of coordination is needed if the routine is called from concurrent threads.
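
As a language-neutral illustration of that point, here is a minimal sketch of the same find-or-add pattern with the lookup and the insert held under one lock. It is written in Python so it runs standalone; it is not Blender code, and the names NodeRegistry, get_node, hammer and run_demo are hypothetical.

```python
import threading


class NodeRegistry:
    """Toy stand-in for a find-or-add lookup table (not Blender code)."""

    def __init__(self):
        self._nodes = {}
        self._lock = threading.Lock()
        self.created = 0  # how many nodes were actually constructed

    def get_node(self, key):
        # Hold the lock across BOTH the lookup and the insert, so no other
        # thread can modify the table between the two steps.
        with self._lock:
            node = self._nodes.get(key)
            if node is None:
                node = {"key": key}
                self._nodes[key] = node
                self.created += 1
            return node


def hammer(registry, keys, out):
    # Each worker asks for every key and records the node it was given.
    for key in keys:
        out.append(registry.get_node(key))


def run_demo(n_threads=8, n_keys=100):
    registry = NodeRegistry()
    keys = list(range(n_keys))
    results = [[] for _ in range(n_threads)]
    threads = [
        threading.Thread(target=hammer, args=(registry, keys, results[i]))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return registry, results


if __name__ == "__main__":
    registry, results = run_demo()
    # With the lock in place, each key is created once and shared by all threads.
    print(registry.created)
```

Without the lock, two threads could both miss the lookup for the same key and both insert, which is the check-then-act hazard the patch above guards against. Whether a lock at this level is the right fix for Blender itself is a separate question.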

This comment was removed by Renate Meijer (kleuske).

Ok. I asked gdb. It appears that for some reason the Entry pointer is corrupted (e = 0x68).

Thread 44 "blender" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff96eff700 (LWP 15570)]
0x000055555852ae15 in ghash_lookup_entry_ex (bucket_index=2709, key=0x7fff9e26c408, gh=0x7fffd9d17568)
    at /home/kleuske/blender-git/blender/source/blender/blenlib/intern/BLI_ghash.c:399
399 if (UNLIKELY(gh->cmpfp(key, e->key) == false)) {
(gdb) print gh
$1 = (GHash *) 0x7fffd9d17568
(gdb) print *gh
$2 = {hashfp = 0x55555852d236 <BLI_ghashutil_ptrhash>, cmpfp = 0x55555852d251 <BLI_ghashutil_ptrcmp>, buckets = 0x7fff9d662008, entrypool = 0x7fffcff913c8,
  nbuckets = 4099, limit_grow = 3074, limit_shrink = 768, cursize = 10, size_min = 0, nentries = 1669, flag = 0}
(gdb) print e
$3 = (Entry *) 0x68
(gdb) bt
#0 0x000055555852ae15 in ghash_lookup_entry_ex (bucket_index=2709, key=0x7fff9e26c408, gh=0x7fffd9d17568)
    at /home/kleuske/blender-git/blender/source/blender/blenlib/intern/BLI_ghash.c:399
#1 ghash_lookup_entry (key=0x7fff9e26c408, gh=0x7fffd9d17568) at /home/kleuske/blender-git/blender/source/blender/blenlib/intern/BLI_ghash.c:436
#2 BLI_ghash_lookup (gh=0x7fffd9d17568, key=0x7fff9e26c408) at /home/kleuske/blender-git/blender/source/blender/blenlib/intern/BLI_ghash.c:807
#3 0x00005555580ed30f in dag_find_node (forest=0x7fffcff90588, fob=0x7fff9e26c408) at /home/kleuske/blender-git/blender/source/blender/blenkernel/intern/depsgraph.c:1108
#4 0x00005555580ed448 in dag_get_node (forest=0x7fffcff90588, fob=0x7fff9e26c408) at /home/kleuske/blender-git/blender/source/blender/blenkernel/intern/depsgraph.c:1150
#5 0x00005555580eb670 in build_dag_object (dag=0x7fffcff90588, scenenode=0x7fff9d684608, scene=0x7fffa07fe008, ob=0x7fff9e26c408, mask=61)
    at /home/kleuske/blender-git/blender/source/blender/blenkernel/intern/depsgraph.c:558
#6 0x00005555580ecec2 in build_dag (bmain=0x7fffa077bf08, sce=0x7fffa07fe008, mask=61)
    at /home/kleuske/blender-git/blender/source/blender/blenkernel/intern/depsgraph.c:993
#7 0x00005555580ee6ef in dag_scene_build (bmain=0x7fffa077bf08, sce=0x7fffa07fe008) at /home/kleuske/blender-git/blender/source/blender/blenkernel/intern/depsgraph.c:1667
#8 0x00005555580eeadf in DAG_scene_relations_update (bmain=0x7fffa077bf08, sce=0x7fffa07fe008)
    at /home/kleuske/blender-git/blender/source/blender/blenkernel/intern/depsgraph.c:1785
#9 0x000055555824bd63 in BKE_scene_update_for_newframe_ex (eval_ctx=0x7fffa30db4d8, bmain=0x7fffa077bf08, sce=0x7fffa07fe008, lay=1, do_invisible_flush=true)
    at /home/kleuske/blender-git/blender/source/blender/blenkernel/intern/scene.c:1995
#10 0x0000555557a19510 in RE_engine_render (re=0x7fff9790d008, do_all=0) at /home/kleuske/blender-git/blender/source/blender/render/intern/source/external_engine.c:663
#11 0x0000555557a346c9 in do_render_3d (re=0x7fff9790d008) at /home/kleuske/blender-git/blender/source/blender/render/intern/source/pipeline.c:1559
#12 0x0000555557a359a7 in do_render_fields_blur_3d (re=0x7fff9790d008) at /home/kleuske/blender-git/blender/source/blender/render/intern/source/pipeline.c:1948
#13 0x0000555557a37877 in do_render_composite_fields_blur_3d (re=0x7fff9790d008) at /home/kleuske/blender-git/blender/source/blender/render/intern/source/pipeline.c:2631
#14 0x0000555557a385c9 in do_render_all_options (re=0x7fff9790d008) at /home/kleuske/blender-git/blender/source/blender/render/intern/source/pipeline.c:2901
#15 0x0000555557a3b1cd in RE_BlenderAnim (re=0x7fff9790d008, bmain=0x7fffa077bf08, scene=0x7fffa07fe008, camera_override=0x0, lay_override=0, sfra=1, efra=700, tfra=1)
    at /home/kleuske/blender-git/blender/source/blender/render/intern/source/pipeline.c:3824
#16 0x000055555791a047 in render_startjob (rjv=0x7fffcffb2188, stop=0x7fff98cebbbc, do_update=0x7fff98cebbba, progress=0x7fff98cebbc0)
    at /home/kleuske/blender-git/blender/source/blender/editors/render/render_internal.c:605
#17 0x00005555573a6785 in do_job_thread (job_v=0x7fff98cebb48) at /home/kleuske/blender-git/blender/source/blender/windowmanager/intern/wm_jobs.c:337
#18 0x00005555585a3042 in tslot_thread_start (tslot_p=0x7fff9d69c508) at /home/kleuske/blender-git/blender/source/blender/blenlib/intern/threads.c:253
#19 0x00007ffff4fe5494 in start_thread (arg=0x7fff96eff700) at pthread_create.c:333
#20 0x00007ffff2754acf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
(gdb)

Bastien Montagne (mont29) lowered the priority of this task from 90 to Normal. Jul 22 2018, 12:39 PM

This has nothing to do with the dependency graph; it is a design flaw in handlers: they can run from multiple threads (for example, the render thread and the viewport thread), but they do not guarantee any thread safety. In 2.8, with the renderer having its own depsgraph, some things will be easier (it will be possible to avoid thread conflicts in this case). The issue is that we do not have a reliable Python API in 2.8 yet; it is yet to be re-introduced.
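
For script authors, one defensive measure is to serialize the Python side of a handler with a lock. The sketch below is illustrative only and makes several assumptions: the handler body (recording the frame number) is hypothetical, since the reporter's script is not shown, and the names serialized, on_frame_change and seen_frames are invented here. It only keeps the handler's own Python state consistent; it does not protect Blender's internal data structures and is not a confirmed fix for this crash.

```python
import functools
import threading

# One lock shared by every wrapped handler (hypothetical helper, not a Blender API).
_handler_lock = threading.Lock()

# Hypothetical shared state that the handler mutates.
seen_frames = []


def serialized(handler):
    """Wrap a handler so that only one thread runs its body at a time."""

    @functools.wraps(handler)
    def wrapper(*args, **kwargs):
        with _handler_lock:
            return handler(*args, **kwargs)

    return wrapper


@serialized
def on_frame_change(scene, *_):
    # Extra positional arguments are ignored, so the same function accepts
    # either a scene alone or a scene plus further arguments.
    seen_frames.append(scene.frame_current)


try:
    import bpy  # only available inside Blender
except ImportError:
    bpy = None

if bpy is not None:
    # Register the wrapped handler when running inside Blender.
    bpy.app.handlers.frame_change_pre.append(on_frame_change)
```

Outside Blender the registration step is skipped, so the wrapper can be exercised with any object that has a frame_current attribute.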

Assigning to Campbell. He introduced handlers and he is into python APIs of any sort.

Sybren A. Stüvel (sybren) changed the task status from Confirmed to Needs Information from User. Feb 4 2020, 3:52 PM

I can't seem to reproduce this on my machine with Blender 2.82 alpha @ 5dc1183580e932870064b44246e8fb750a8d806e, but as this is about a race condition that doesn't mean it's fixed for sure.

@Renate Meijer (kleuske) Could you give this another test with a daily build from https://builder.blender.org/ ?

Jacques Lucke (JacquesLucke) claimed this task.

I'm not able to reproduce the issue.

Will close this for now. @Renate Meijer (kleuske), feel free to reopen this report if you are still able to reproduce the issue in the latest version. Please also provide a (simple) test file for the latest Blender version.