The crash happens at external_engine.h:560
GHashIterator pa_iter;
GHASH_ITER (pa_iter, re->parts) {
RenderPart *pa = BLI_ghashIterator_getValue(&pa_iter); // HERE
// pa_iter.curEntry is 0xFFFFFFFFFFFFFFEF
It seems that sometimes render parts may be initialized on one thread while simultaneously being read on another. The render info drawing thread locks the read mutex, but the rendering thread doesn't lock a write mutex when doing the initialization at RE_engine_render:841.
Stack trace of the crashing thread:
> blender.exe!RE_engine_get_current_tiles(Render * re, int * r_total_tiles, bool * r_needs_free) Line 560 C
  blender.exe!draw_render_info(const bContext * C, Scene * scene, Image * ima, ARegion * region, float zoomx, float zoomy) Line 104 C
  blender.exe!draw_image_main(const bContext * C, ARegion * region) Line 1037 C
  blender.exe!image_main_region_draw(const bContext * C, ARegion * region) Line 677 C
  blender.exe!ED_region_do_draw(bContext * C, ARegion * region) Line 543 C
  blender.exe!wm_draw_window_offscreen(bContext * C, wmWindow * win, bool stereo) Line 689 C
  blender.exe!wm_draw_window(bContext * C, wmWindow * win) Line 812 C
  blender.exe!wm_draw_update(bContext * C) Line 1013 C
  blender.exe!WM_main(bContext * C) Line 482 C
  blender.exe!main(int argc, const unsigned char * * UNUSED_argv_c) Line 530 C
  [External Code]
Possible stack trace of the other thread (edited; the real trace is of course quite random, but I did catch something like this):
> blender.exe!RE_engine_render(Render * re, int do_all) Line 841 C
  [Inline Frame] blender.exe!do_render_3d(Render *) Line 1137 C
  blender.exe!do_render(Render * re) Line 1217 C
  blender.exe!do_render_composite(Render * re) Line 1354 C
  blender.exe!do_render_all_options(Render * re) Line 1617 C
  blender.exe!RE_RenderAnim(Render * re, Main * bmain, Scene * scene, ViewLayer * single_layer, Object * camera_override, int sfra, int efra, int tfra) Line 2593 C
  blender.exe!render_startjob(void * rjv, short * stop, short * do_update, float * progress) Line 653 C
  blender.exe!do_job_thread(void * job_v) Line 398 C
Both threads are working with the same render parts object at the same time. RE_engine_render inits render parts without a lock (see the patch).
I must admit that I'm not qualified to debug this kind of code, so this patch needs a thorough review. It is meant more as an illustration for the bug report than as a proper fix. I also don't know how to test this: the issue is quite hard to reproduce as it is, and it seems to go away as soon as anything is changed in RE_engine_get_current_tiles or RE_engine_render. I couldn't reproduce the crash with this proposed fix, but I also can't reproduce it after placing a breakpoint or a printf in the code of RE_engine_render or RE_engine_get_current_tiles that works with the render parts.