Task isolation in task pool causes deadlock.
Closed, Resolved | Public | BUG

Description

We found this deadlock with the new multi-threaded geometry nodes evaluator, but it can also be reproduced without geometry nodes.
In both cases all threads get deadlocked in BLI_task_pool_work_and_wait.

Steps to reproduce with geometry nodes

  1. Undo the change in rB223c6e1ead2 (to enable threading again).
  2. Start Blender with --threads 2. The number of threads is important.
  3. Load the file below.
  4. Duplicate the cube.
  5. Deadlock (Blender freezes). Call stack of both threads: P2137.

Steps to reproduce without geometry nodes

  1. Apply P2140.
  2. Compile and start Blender.
  3. Deadlock almost immediately as Blender starts.

The bug is related to task isolation. The small change below removes the deadlock, but it would reintroduce the issues that were fixed by rB08ac4d3d71dee9fc4ec7f878e57de59c87115280.

diff --git a/source/blender/blenlib/intern/task_pool.cc b/source/blender/blenlib/intern/task_pool.cc
index 6404f5264cc..0c07e1f6204 100644
--- a/source/blender/blenlib/intern/task_pool.cc
+++ b/source/blender/blenlib/intern/task_pool.cc
@@ -115,7 +115,7 @@ class Task {
   void operator()() const
   {
 #ifdef WITH_TBB
-    tbb::this_task_arena::isolate([this] { run(pool, taskdata); });
+    run(pool, taskdata);
 #else
     run(pool, taskdata);
 #endif

Read the task isolation docs to understand what tbb::this_task_arena::isolate does. At first I didn't understand how this could lead to a deadlock, but this document explains it in the "Oh No! Work Isolation Can Cause Its Own Correctness Issues!" section.
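
To make the general idea more concrete, here is a toy model in standard C++. It is not TBB, not Blender's task pool, and it does not claim to reproduce the exact sequence seen in P2137 or P2140. It assumes only the rule that a thread waiting inside an isolated region may run tasks spawned in that same region, and that a thread outside any region may run any task. All names (ToyTask, ToyWorker, can_run, can_make_progress, toy_demo) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model only: NOT TBB and NOT Blender's task pool. All names are hypothetical.
// Tag 0 means "no isolation"; any other value names one isolated region.
struct ToyTask {
  int isolation_tag;  // Region the task was spawned in.
};

struct ToyWorker {
  int isolation_tag;  // Region the worker is blocked in while it waits.
};

// Assumed rule: an isolated waiting worker may only run tasks from its own region.
// A worker that is not isolated may run any task.
bool can_run(const ToyWorker &worker, const ToyTask &task)
{
  return worker.isolation_tag == 0 || worker.isolation_tag == task.isolation_tag;
}

// True if at least one waiting worker is allowed to run at least one pending task.
bool can_make_progress(const std::vector<ToyWorker> &workers,
                       const std::vector<ToyTask> &pending)
{
  return std::any_of(workers.begin(), workers.end(), [&](const ToyWorker &worker) {
    return std::any_of(pending.begin(), pending.end(), [&](const ToyTask &task) {
      return can_run(worker, task);
    });
  });
}

// Walks through three hand-built situations with two waiting workers.
void toy_demo()
{
  const std::vector<ToyWorker> isolated_workers = {ToyWorker{1}, ToyWorker{2}};
  const std::vector<ToyWorker> plain_workers = {ToyWorker{0}, ToyWorker{0}};

  // Both workers wait inside their own regions (1 and 2), but the only pending
  // task was spawned in region 3: nobody may run it, so the model is stuck.
  const std::vector<ToyTask> foreign_only = {ToyTask{3}};
  assert(!can_make_progress(isolated_workers, foreign_only));

  // Same workers, but a task from region 2 is also pending: worker 2 can run it.
  const std::vector<ToyTask> foreign_and_own = {ToyTask{3}, ToyTask{2}};
  assert(can_make_progress(isolated_workers, foreign_and_own));

  // Without isolation (every tag is 0) any waiting worker can run any task.
  const std::vector<ToyTask> untagged = {ToyTask{0}};
  assert(can_make_progress(plain_workers, untagged));
}
```

Calling toy_demo() returns silently when all three assertions hold. Within this model, the last case loosely corresponds to the effect of the diff above: with no isolation tags, a waiting thread is never barred from picking up pending work, which is also why the protection that isolation was added for is lost.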

Event Timeline

Hans Goudey (HooglyBoogly) changed the task status from Needs Triage to Confirmed. May 27 2021, 3:01 AM
Hans Goudey (HooglyBoogly) triaged this task as High priority.
Hans Goudey (HooglyBoogly) changed the subtype of this task from "Report" to "Bug".

This is interesting. In my testing, there is always a deadlock once the number of objects reaches the number of threads I give Blender. Using all of my 16 threads, it only fails when I tweak the translation after I add the 16th cube.

Hans Goudey (HooglyBoogly) lowered the priority of this task from High to Normal. May 27 2021, 3:35 PM
Jacques Lucke (JacquesLucke) renamed this task from "Deadlock when new geometry nodes evaluator is used." to "Task isolation in task pool causes deadlock." May 27 2021, 4:04 PM
Jacques Lucke (JacquesLucke) updated the task description.