Differential D11502
glTF baking time optimization via parallelization
Authored by Ernest Lee (iFire) on Jun 4 2021, 4:28 PM.
Details
Found through profiling by the Godot Engine project. In my test case, glTF2 export went from 15 minutes to 5 minutes.
Event Timeline

Hello,

Can you explain what this change is supposed to do? Is it simply that the for loop is distributed across multiple processors so that multiple curves can be calculated at the same time? If so, I believe this is a good thing :)

@Julien DUROURE (julien) This file is in the collada module because it was originally created for the Collada importer/exporter. I am happy to see that this code is now reused elsewhere :)

I had a choice of parallelizing per object, per bone, or per curve keyframes, so I moved it to the function closest to the work. The curve baking in Blender still runs sequentially, curve by curve; within each curve, the keyframes are divided among all the cores.

@Ernest Lee (iFire) I suspect it might be more efficient for each core to handle one entire curve at a time. I am not sure, though; did you try that? Or maybe I misunderstand how it works.

@Ernest Lee (iFire) Can you provide your test case and brief instructions for running the test? I would like to debug this, as it sounds a bit odd that you can improve the runtime by a factor of 3 that way. If that is true, then maybe the code itself is not efficient. @Sybren A. Stüvel (sybren) I am not sure whether we can use OMP wherever we want. Are there any rules about when it's allowed, or is there another method to use in Blender?

Here's the test I'm imagining.
For location two it'll be:

/* Add curves on Object->material actions */
object_type = BC_ANIMATION_TYPE_MATERIAL;
#ifdef WITH_OMP
#pragma omp parallel for
#endif
for (int a = 0; a < ob->totcol; a++) {

Searching the existing code for WITH_OMP shows that it's currently only used in extern/quadriflow. I don't think we should add more uses. Parallelizing parts of the code that benefit from it is fine, but from what I can tell OpenMP is on its way out.

I also don't think that a test on a single machine with a single glTF file is a good way to test the performance impact of a code change.

@Ernest Lee (iFire) If you want to take this further and work on improving the patch, please take a look at the Ingredients of a Patch as described on our wiki. I would also welcome a description of how this change in the Collada code can impact the glTF exporter.
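
For readers following the thread, the two granularities discussed above (splitting the keyframes of one curve across cores versus giving each core a whole curve) can be sketched roughly as below. This is only an illustration, not the code from the patch: `sample_curve`, `Baked`, and both `bake_*` functions are hypothetical names, and the sketch assumes that sampling a curve at a frame is a pure function with no shared state, which the real baking code may not satisfy. The pragmas are guarded by the standard `_OPENMP` macro, so the code still builds and runs serially without OpenMP.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

/* Hypothetical stand-in for sampling one animation curve at one frame.
 * Assumed to be a pure function: no shared state, no scene evaluation. */
static double sample_curve(int curve_index, int frame)
{
  return std::sin(0.01 * frame) * (curve_index + 1);
}

/* One row of baked samples per curve. */
using Baked = std::vector<std::vector<double>>;

/* Strategy A: curves are visited one after another, and the frames of
 * each curve are divided among the cores. */
Baked bake_parallel_over_frames(int curve_count, int frame_count)
{
  Baked out(curve_count, std::vector<double>(frame_count));
  for (int c = 0; c < curve_count; c++) {
#ifdef _OPENMP
#  pragma omp parallel for
#endif
    for (int f = 0; f < frame_count; f++) {
      out[c][f] = sample_curve(c, f); /* Each iteration writes its own element. */
    }
  }
  return out;
}

/* Strategy B: each core bakes entire curves, so a parallel region is
 * entered once instead of once per curve. */
Baked bake_parallel_over_curves(int curve_count, int frame_count)
{
  Baked out(curve_count, std::vector<double>(frame_count));
#ifdef _OPENMP
#  pragma omp parallel for
#endif
  for (int c = 0; c < curve_count; c++) {
    for (int f = 0; f < frame_count; f++) {
      out[c][f] = sample_curve(c, f); /* Each thread owns whole rows. */
    }
  }
  return out;
}
```

Both functions produce the same output; which one is faster likely depends on how many curves there are relative to the number of cores and on how expensive a single sample is, so it would need to be measured on more than one file and machine.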