Page MenuHome

Fix T81704
AbandonedPublic

Authored by Joseph Brandenburg (TheAngerSpecialist) on Oct 17 2020, 7:06 PM.

Details

Summary
  1. Description of the problem:

See T81704 --
Basically, if you have a Spline IK chain that is fit exactly to a curve, then change the Y-Scale Mode from "Fit Curve" to anything else, the curve will shrink. Repeatedly applying the pose as rest pose will *annihilate* your rig by shrinking the bones over and over again.

  1. Description of the proposed solution, and a motivation as to why this is the best solution:

I solve the problem by removing the evaluation's incorrect method of factoring in curve_scale and re-implementing it. Instead of comparing the bone chain's edit-mode length with the spline's total length, I compare it to the chain's length if it were fit to the curve. This means that if the user applies the pose with Fit To Curve and switches to Bone Original, the pose remains the same instead of getting shorter. Finally, I added some code to the Spline IK evaluation that finds the correct point by iterating until the error is below a threshold.

  1. List of alternative solutions, and a motivation as to why these are undesirable.

I would like to solve this problem with sphere intersection, but I don't know how.

  1. Limitations of the proposed solution.

This will, of course, change the behavior of existing rigs -- but seeing as the scaling caused by the old behavior rendered the feature unusable (at least for me), I don't know how many people this will affect. It's also kinda ugly and bloated, but it seems to perform well enough (700 duplicates of a spline IK chain with keyframes has a 24fps framerate in my test). Still, with a loop that may have many iterations and lots of lookups in the spline for points, it's conceivable that performance could be impacted.

  1. Mock-up of the proposed user interface and a description of how users are going to interact with the new feature.

This will not change the UI :)

Diff Detail

Repository
rB Blender
Branch
master
Build Status
Buildable 10774
Build 10774: arc lint + arc unit

Event Timeline

Joseph Brandenburg (TheAngerSpecialist) retitled this revision from Remove SplineIKEvaluationState curve_length; to Fix T81704 by removing SplineIKEvaluationState.curve_length;.Oct 17 2020, 7:19 PM
Joseph Brandenburg (TheAngerSpecialist) edited the summary of this revision. (Show Details)
  • Remove unnecesary lines of code added in the process of figuring

I don't know what the intention of this curve_scale was, but it was
causing incorrect scaling on the bones in the spline because it
was being used as a weight for distance to the next point, and
for whatever reason it's always less than 1.0 -- so changing
from "Fit-To-Curve" to any other Y-scale mode caused the bones
to shrink.

I assume that this parameter was supposed to adjust the spline IK
for changes in the spline-object's scale, but all it really did
was make the constraint sensitive to the curve resolution and
NURBS order.

Perhaps it was meant as a correction for the error due to curve
resolution versus limit surface. But that isn't really useful!

Finally, I think perhaps it is a good idea to make Spline IK
correct for the spline's scale. But that's for another patch!

Joseph Brandenburg (TheAngerSpecialist) retitled this revision from Fix T81704 by removing SplineIKEvaluationState.curve_length; to Fix T81704.Oct 17 2020, 8:36 PM
Joseph Brandenburg (TheAngerSpecialist) edited the summary of this revision. (Show Details)

OK, so here's a problem:


This is the master branch

With my patch

So removing those lines of code is *definitely* not enough to fix the problem! I think I now understand what they were for.

I think then, that the problem with the curve_length parameter is that it isn't being adjusted correctly. The parameter is necessary for applying the difference in scale for bone chains that are shorter than the respective spline, but because the bones are not exactly the same shape as the spline, the factor is always incorrect. So, it's very similar to my initial guess about the cause of the problem -- the length of the bone-chain will always be different than the length of the curve because the curve is, well, curvier than the bones.

So what I need to do is a) find a better way of comparing the length or b) find a way of making curve_length equal to 1.0 when the rig is in rest-pose.

Re-implement state->curve_length

I've reimplemented the factoring for curve/chain length. However,
My implementation isn't quite right, either. I think it is closer
than the old one. I've left the print statements in for now, so
that anyone who tries this patch can see the effect.

Anyhow, it's clear to me that even if I scale the ctime paramater
where_on_curve correctly based on the length of the chain in
edit mode and the length it would be if it were set to
Fit To Curve, the result will be "wrong". I need to find out how
to measure the error and adjust accordingly.

Why is this happening? I think it is because the actual WS distance
between two points on a curve is hard to predict. I'm starting to
think it's not possible to solve this problem without using
sphere intersection, a bisect search, or some sort of iterative
method to get the exact value.

  • Account for individual segment error;

I've added some code to account for the error in each individual

segments' length. This makes the result of the constraint much
more accurate. It still isn't perfect, but it's a huge improvement
over the original behavior.

In my next commit I will attempt to make this bit of code iterative until the error is below some arbitrary threshold.

Note: I'm not sure this is the right approach, to be honest! I've come around full-circle to thinking that evaluating the curve as a mesh and doing a sphere intersection is the right thing to do, but I don't know how to do that in C. I could write up a Python mock-up, though.

Here's edit mode:

Pose mode without the patch:


Error for each bone:

-0.00824883
 0.00447076
 0.00106042
-0.01599806

And Pose Mode, with my patch:


Error for each bone:

0.00015642
2.7e-06
0.00027897
2.2e-07

So as you can see, the result of the patch is much less error-prone in the case of a bone chain that is shorter than the spline. Still not *quite* precise enough for my liking!

Here's the result of the patch when the bones are fit to the curve in edit mode:

Without the patch:

-0.0155875
-0.01621791
-0.02047623
-0.02022866

The error is so bad that it's easily visible to the naked eye.

With the patch:

-9e-08
0.0
-3.6e-07
-7e-08

In this case, even I would be satisfied with this level of precision in one of my rigs.

One more thing -- I've left all the print statements in so far. I want smarter brains to have an easy time checking things. Of course they'll be removed in the end! But after this is really, truly, actually fixed.

  • Segment Length compensation is now iterative.
  • Formatting

I think this really does solve the bug completely, but I think it is an ugly solution to the problem. The error values are now well below the four-significant-figure threshold that I use to check my own work. As long as it works, I can use it for my own needs, regardless of whether it's a pretty solution 😏 !

I know that I need to correct a lot of code style things. my next commit will be a make format and then after that I will need feedback from more experienced developers, since I don't know how to do some of the improvements I still want to do. In particular -- cleaning up anything in the math, using Infinity in the spot where it is needed, and Float Epilson or whatever instead of the 0.00001f.

Finally, I think that it's possible to calculate all of this a single time when initializing the tree, but right now most of the calculation occurs for every bone that is evaluated, every time. I don't know how to make the dynamic-length list I would need for this (It would need to be the length of the number of bones in the chain... and from what I understand structs need to have a known, fixed size when they are defined).

  • Remove prints and a commented-out line of code.

If you want to check the logic, I can add them back in.

Here is the test file used in the images above.

  • Added sanity checks and tweaked some values
  • Format
Sybren A. Stüvel (sybren) requested changes to this revision.Oct 19 2020, 4:01 PM

That's a LOT of text. Do you think you could maybe summarise it in the patch description?

I'm having a hard time understanding how Spline IK is supposed to be working. The manual doesn't even explain what it does exactly. It does mention very broadly that "the bone chain is now controlled by the curve", but there is no clear definition of what that means.

What would you think of the following approach?

  • Update the manual so that it actually explains what Spline IK does,
  • create a unit test that tests that Spline IK is doing the right thing,
  • show that when this patch is applied, the unit test succeeds.
source/blender/blenkernel/intern/armature_update.c
214

How can a field named "scale" represent a position?

241

This comment is nice, in the sense that it tries to explain what's going on. I don't understand it, though.

  • Why is the comparison to the spline length necessary?
  • What is "the Fit-To-Curve length"? Is that the total length of the bones in the chain after they've been fitted to the curve?
  • To keep terminology clear, only use "the length of the chain" as the number of bones in the chain, like what you'd set in the constraint parameters.
  • What does it mean to compare lengths directly? How would you compare lengths in any other way than "length1 < length2"?
  • Why does the resolution of the curve have anything to do with its length?
251

Why is this length "ideal"?

252–256

Avoid overly short abbreviations, and name the variables to illustrate what role they have, rather than what kind of data they contain. For example, bone_tail is clearer than point1.

256

No need to describe what a variable is used for. That should be clear from the code.

261

Try to avoid one-letter variables. bone_index is much more expressive than i.

262

factor can be const.

265

Why is this copy necessary?

266

The distance between two points can be computed directly with len_v2v2().

267

Declare seglen on this line, that way it can be const:

const int seglen = len_v3v3(p1, p2);
271

The entire above block of new code should be in its own function. It can then be given a clear name that explains its purpose/function.

276–277

The extra parentheses can be removed.

276–277

What happens when tree->totlength == 0.0f? In that case state->curve_scale is undefined.

This revision now requires changes to proceed.Oct 19 2020, 4:01 PM
source/blender/blenkernel/intern/armature_update.c
214

This is the name of the parameter from before the patch. I think "curve_length" is a better name, but I didn't want to change anything that was there before. This field doesn't refer to a position, it's the total length of the curve.

241
  1. TLDR; To correctly adjust the bone chain for scale.

Comparison to the spline length is necessary because of how Blender gets the points on the curve -- it measures each chain's total World-Space length, then records the "factor" along the chain of bones where each joint is (normalized from 0-1). The where_on_path function is used to lookup the point on the curve at that location along the curve. Now, for Fit-To-Curve, this works quite well because there is no need to preserve the scale of any of the bones. However, "Bone Original" and "None" need to multiply the lookup parameter by the quotient of the curve's length and the chain's WS length in order to adjust the points along the curve where the function gets the points. So if the chain of bones is 2m and the u=curve is 3m, the adjustment value should be 0.6667. This is stored in curve_scale and multiplied with the base values.

  1. Yes.
  2. By "compare lengths directly" I mean compare the WS length of the chain with the WS length of the curve. The values should be 1/1 == 1 if the bone-chain is fit to the curve already (The user applied the pose with "Fit To Curve" and then switched to "Bone Original").
  3. This one isn't very easy to explain without a picture. But basically, it's because a bone that stretches between two points is a straight line, whereas a curve probably isn't a straight line. The section of the curve is always a curved line, an arc, so the bones can be the same "length" as far as the user cares -- the bones follow the curve from the start till the end -- but Blender will always read the chain as shorter than the curve and adjust the points too much.
265

Copying from vec is necessary because vec is a float[4] and the point is a float[3] -- where_on_path returns the point and the tilt as a x,y,z + w. This is what other parts of the code do, so I just copied them.

276–277

It's initialized to 1.0f in line 238. I think I do have some 0-division checks to do... I don't know how to do error handling in C, and I think it's better to Look Before You Leap in this language, right? Thanks 😄

Joseph Brandenburg (TheAngerSpecialist) marked an inline comment as done.
  • Address comments, organize and clean up code

Still to do:

  • create a unit test and update the manual.
  • test corner cases
    • Specifically -- if the error for a segment *increases* when trying to correct it, then it should fall back to the value it had before any corrections were made. This can happen in cases where the curve is deformed in an extreme way.
  • experiment with another method
  • clean, clean, clean.
Joseph Brandenburg (TheAngerSpecialist) marked 12 inline comments as done.
  • Revert to master and rename the IK state's curve_scale to adjustment
  • Rename IK Tree's points to more descriptive joint_bindings
  • Implement new joint binding algorithm that sets the joint bindings so that the initial pose will not scale the bones, modify adjustment code to use the new adjustment calculation correctly.

Specifically:

  • First, check if the curve is available,
  • Then, for each bone, search forward in the curve to find the bounds for the binary search (since there could be two or more points on the curve that intersect the circle, we want only the first one.)
  • Once the bounds have been set, do a binary search until exactly the correct point is found. If the bounds are not set correctly, this result may be wrong, although this is unlikely.
  • Once the joint bindings are set, they are adjusted again in the case of "Even Divisions" -- this does not attempt to factor in bones' length, although I can make it average the length of the bones' first and then calculate the joint bindings with the average length of all of the bones instead of the length of the current bone. This is a design detail that I haven't thought about much.
  • At this point, the joints are bound. In splineik_evaluate_init, to adjust the curve, we compare the length of the segments between the spline-segments with the length of all of the bones together -- when the curve is in rest-pose, this value should be exactly 1.0. When the curve is posed, it will change.
  • in the evaluation function, finally, there's a small change to the way adjustment works. Now, instead of scaling the joint bindings down, since they were always normalized to 0.0-1.0 before, we can scale them up or down depending on the situation. For example, if Fit To Curve is enabled, then the values are normalized and used directly. Otherwise, they are adjusted by adjustment (and by posescale if Bone Original is enabled). This will be 1.0 if the rig is in rest pose, but it will be more or less if it isn't.

This is a pretty difficult behavior to explain succinctly, so feel free to poke me in blender.chat with questions. Anyhow, there's still a lot of testing and checking to do - sanity checks, testing for corner cases, etc. As an added bonus, we can use a similar method to get exact bone-scale preservation if needed (this method only 100% guarantees the bones' scale when the rig is in rest pose).

EDIT: I can confirm that this crashes in some cases, e.g. changing a shape-key on the curve... will have to spend some time carefully enumerating these situations and solving the problems. Probably won't be able to spend much time on this for a few days, goal is to have a 💯% fix by Dec 17th meeting.

  • Updated NULL check, should be more reliable now;
  • Added some code that hacks around a bug that seems to occur non-deterministicly, and some commented-out-code for demonstrating the bug.

To elaborate:
I think I've fixed the T81704. Fixed. Solved. Terminated. However, because my fix relies on runtime data to do anything, it sometimes doesn't work. Part of the reason is, sometimes the runtime data isn't there. That's no good! But the other part of the problem is -- sometimes the NULL check is passed, but it fails later on in another function or code-block. If you un-comment my test and compile it, you'll see that the iteration where the NULL-check fails is random and non-deterministic. Someone in blender.chat suggested it may be due to another thread.

Anyways, the worst thing that this can do is create error messages in the console that should be suppressed. The real problem is that runtime data doesn't seem to exist until some user-interaction in the UI. I'm not sure if this is reliable within Python, for instance. I wish there were a way of getting the data without relying on runtime, or at the very least, to keep a copy of it around to use instead, in case there isn't a valid runtime for use immediately.

I think I'm at the point where I donno what else to do. I think my solution to the bug is essentially the right way to do it, unless we decide to use a mesh-approximation of the curve, instead. Making it work without any hitches will require changes to other parts of Blender, maybe even depsgraph code, and I don't feel comfortable doing that without guidance.

OH! I also think the behavior is still bad if the chain is longer than the curve.. will need to investigate that situation further.

source/blender/blenkernel/intern/armature_update.c
137

Uncomment this block of code to see the problems I mentioned in my comment -- something causes this NULL check to fail after it has already succeeded, and after a random number of iterations. This doesn't seem to be caused by my patch.


OK, here's something -- try and duplicate the curve&armature. Crashes. 😭

source/blender/blenkernel/intern/armature_update.c
74

What is tar? Understanding this function's parameters requires knowing where it is used, which is bad API design. I'm guessing it's the curve object, but why not just name it curve_object or curve_ob then?

Same for p1 and p2, these are too short names that can't be understood without further investigation. They're also return parameters, which means they should be prefixed with r_ and be the last parameters.

76

throw → return

77–80

Don't mix up int and bool like this, unless there is a good reason to. This function doesn't compute some error metric, so adding them together is weird. If you really need the number of where_on_path calls that returned an error, a better name would be num_errors.

83–84

This is quite a hard function to understand. I would expect a curve-sphere intersection function to receive a curve and a sphere in its parameters. Where are they?

109–111

Return parameters should be listed last.

111

Instead of tar, name the Object * for what it is in this function.

112

Distance to what?

113

Start what? Point along the curve? Distance to the start position?

115

What is precision? And what units is it expressed in? It looks more like a maximum number of iterations, rather than the precision.

I want to help out a bit more than the notes I just left, but in order to do that it helps if the code is a bit more transparent in what it does & how it works.

I'm resigning as reviewer, as T81704 has been fixed already. Feel free to close the patch ("Abandon Revision", which sounds slightly more dramatic).

I'm closing the patch, as it hasn't been updated in over a year, and addresses an already-closed issue.