The function vec_roll_to_mat3_normalized() has a bug, as described in T82455. This differential only refactors the code so that it becomes clearer what the function does and how the bug can be fixed. It is not supposed to introduce any functional changes.
Diff Detail
- Repository: rB Blender
Event Timeline
@Gaia Clary (gaiaclary) do you feel up to adding some unit tests for this function? That way it's much easier to document certain edge cases, and to show that this change doesn't have a negative impact on existing situations. The test could include the examples from T23954, T31333, T27675, and T30438, to ensure that these situations actually remain fixed.
| source/blender/blenkernel/intern/armature.c | |
|---|---|
| 2171–2172 | Not for this patch, but this comment is a bit mind-boggling to me. |
I have lots of questions about this particular bit of code, and I hope now that you've dived into it you can help answer them.
> due to float precision errors, we can have nor = (0.0, 0.99999994, 0.0)...
Why is this bad? What is theta? There is no comment that explains it. It's first defined as 1+nor[1] (so in this example theta is almost 2.0), but later it's redefined as nor[0]*nor[0] + nor[2]*nor[2] (so in this example it is 0.0).
I have found other situations where nor == (x, 0.99999994, z) with x and/or z very close to 0, but not exactly 0. However, the test (nor[0] || nor[2]) can be true for such tiny numbers.
A zero check for floats is usually bad, so changing those to some epsilon-around-zero check would likely be better, indeed.
Because of this I believe it is better to not only check that x or z is non-zero, but also to make sure that nor[0] * nor[0] + nor[2] * nor[2] is above THETA_THRESHOLD_NEGY_CLOSE.
This would create a tight coupling between that check and the redefinition of theta as nor[0]*nor[0] + nor[2]*nor[2]. I think having a separate variable with a more descriptive name would be better. That way, if the new variable is too small, the approach that uses the old variable can still be used. This would also make it possible to declare both theta and the new variable as const.
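A minimal sketch of what that separation could look like, with placeholder threshold names and values (not the actual patch):

```c
#include <stdbool.h>

/* Placeholder thresholds for illustration; the real values belong in
 * armature.c and are not specified here. */
#define SAFE_THRESHOLD 1.0e-5f
#define CRITICAL_THRESHOLD 1.0e-9f

/* Keep theta and the new quantity in separate const variables, so the
 * threshold check and the fallback formula stay decoupled. */
static bool can_use_general_formula(const float nor[3])
{
  const float x = nor[0];
  const float y = nor[1];
  const float z = nor[2];

  const float theta = 1.0f + y;          /* ~2 at +Y, ~0 at -Y */
  const float theta_alt = x * x + z * z; /* squared distance from the Y axis */

  /* Both quantities vanish only when nor is (almost) exactly -Y. */
  return theta > SAFE_THRESHOLD || theta_alt > CRITICAL_THRESHOLD;
}
```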
Remark: I was once able to create a three-bone situation by hand which shows the described failure, but I could not reproduce the situation ever since. I guess the problem is mathematical imprecision. However, I see this issue very frequently when using a rig generated with Python, where the bones are combined with a rather complicated set of constraints. Depending on the generator parameters, the rig occasionally implodes due to this issue.
This makes it even more important to write some automated tests. These will just use a vector and a float as input, instead of requiring an entire rig.
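For illustration, a hedged sketch of such a test, using plain C asserts rather than Blender's actual test harness (the tolerances here are made up):

```c
#include <assert.h>
#include <math.h>

/* Prototype as declared in Blender's BKE_armature.h. */
void vec_roll_to_mat3_normalized(const float nor[3], const float roll, float r_mat[3][3]);

int main(void)
{
  /* The problematic near -Y normal from the report. */
  const float nor[3] = {0.0f, -0.99999994f, 0.0f};
  float mat[3][3];

  vec_roll_to_mat3_normalized(nor, 0.0f, mat);

  /* Assuming Blender's convention that mat[1] holds the bone's Y axis,
   * it must reproduce the input vector within a small tolerance. */
  assert(fabsf(mat[1][0] - nor[0]) < 1e-5f);
  assert(fabsf(mat[1][1] - nor[1]) < 1e-5f);
  assert(fabsf(mat[1][2] - nor[2]) < 1e-5f);
  return 0;
}
```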
This seems to make sense, yes, but it would indeed be nice to have some regression tests for this now...
refactor vec_roll_to_mat3_normalized() for easier reading.
in particular:
- added three const floats x, y, z to replace the usages of nor[0], nor[1], nor[2]
- replaced the #defines with static const float numbers and renamed the variables to more descriptive names
Below is a Python script that generates a rig that suffers from this bug. Run the script, then switch from edit mode to pose mode and watch "bad bone" disappear, because its pose bone matrix is completely broken. The problem originates in the extremely small bone roll values, which ultimately trigger the bug:
```python
import bpy

bone_data = [
    ["good bone", (0.0, 0.0, 1.2038891315460205), (0.0, 0.01536799967288971, 1.3546786308288574), -4.1202467175263267e-16],
    ["bad bone", (0.0, 0.01536799967288971, 1.3546786308288574), (0.0, 0.0, 1.2038891315460205), -1.570796257510665e-07],
]

def make_armature(ctx, name):
    data = bpy.data.armatures.new(name)
    obj = bpy.data.objects.new(name, data)
    ctx.collection.objects.link(obj)
    ctx.view_layer.objects.active = obj
    bpy.ops.object.mode_set(mode='EDIT')

    parent = None
    for bname, head, tail, roll in bone_data:
        bone = obj.data.edit_bones.new(bname)
        if parent:
            bone.parent = parent
        parent = bone
        bone.head = head
        bone.tail = tail
        bone.roll = roll
    return obj

armobj = make_armature(bpy.context, "armature")
```

I added a simple workaround to our add-on that just limits the precision of the bone roll to 6 digits. This is enough to keep Blender from doing bad stuff. The correct fix, though, is to apply this differential to master :)
- Refactor function for easier reading.
- revert fix and only keep the refactored code with no functional changes
| source/blender/blenkernel/intern/armature.c | |
|---|---|
| 2159–2160 | Any particular reason this is static? The good thing about #define is that it's all handled by the preprocessor. |
| 2159–2160 | Comments in C files should be /* */ style. |
| 2159–2160 | Comments should be a full sentence, so start with a capital and end with a period. |
| 2160 | I think the change in name to "safe" and "critical" is really nice. |
| 2171–2172 | These don't change value any more, so you can declare them as const: `const float theta = 1.0f + y;` and `const float theta_alt = x * x + z * z;` |
| 2172 | Calculating theta_alt can be deferred to the else clause where it's actually used. Might save a few CPU ticks (although the compiler may optimize this anyway); see the sketch after this table. |
| 2173 | Instead of "due to float precision errors, we can have nor = (0.0, 0.99999994, 0.0)..." I think it's nicer to remove the "..." and explain why this is a problem. |
| 2177–2178 | Instead of -Y I think not (0, -1, 0) is clearer. Or "the -Y vector", whichever you prefer. |
| 2202 | "very close to" includes "at zero distance", so "nor is very close to -Y" is enough. |
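For illustration, a hedged skeleton of the deferral suggested at line 2172 (names and thresholds are placeholders; the surrounding matrix construction is elided):

```c
/* Hypothetical skeleton: theta_alt is only computed in the branch that
 * actually needs it. (As the reply below notes, this stops working once
 * the bug fix needs theta_alt in the if clause as well.) */
static int is_near_negative_y(const float nor[3])
{
  const float theta = 1.0f + nor[1];

  if (theta > 1.0e-5f) { /* placeholder threshold */
    /* General case: theta_alt is never computed here. */
    return 0;
  }

  /* Deferred: only computed for vectors close to -Y. */
  const float theta_alt = nor[0] * nor[0] + nor[2] * nor[2];
  return theta_alt <= 1.0e-9f; /* placeholder threshold */
}
```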
- revert fix and only keep the refactored code with no functional changes
- Added proposed changes as described in Differential comments
| source/blender/blenkernel/intern/armature.c | |
|---|---|
| 2159–2160 | Maybe it was a misunderstanding of a comment from Sergey. I removed the 'static'. I still prefer to have it as a regular variable. |
| 2172 | This is true for now, but when the bug fix is applied, we need theta_alt already in the if clause. |
- Refactor function for easier reading.
- take care that no functional changes are made during refactoring.