Page MenuHome

Geometry Nodes: Deterministic anonymous attribute lifetimes.
ClosedPublic

Authored by Jacques Lucke (JacquesLucke) on Dec 23 2022, 6:40 PM.

Details

Summary

This patch does a few things. For the overall design, see below.

  • Refactor AnonymousAttributeID to remove the reference counts that controlled the attribute lifetime (there is still a reference count, but that's for the struct itself, not for the referenced data).
  • Introduce AnonymousAttributePropagationInfo which is used to pass information to algorithms about which anonymous attributes should be propagated. This had to be added in many places that relied on anonymous attribute reference counts before.
  • Inference anonymous attribute relations in node groups (this extends rB0f8487f640ed: Geometry Nodes: add more details about field handling to node declaration).
  • Add new methods to GeoNodeExecParams to deterministically determine which anonymous attributes should be created/propagated.
  • More processing on the lazy-function graph for a geometry node group that dynamically determines lifetimes of anonymous attributes.

No functional changes are expected.

Possible future improvements:

  • Avoid using MultiValueMap in some cases. Using some kind of bit vectors should result in better performance there but are harder to get correct right now.
  • The number of passes over the node tree can be reduced in the relations-inferencing code, but that's harder to get correct, and I'd rather to do that separately.
  • Simplify computing socket usages. Currently, they can be a bit redundant sometimes (statements like A or (A or B) can be simplified to just A or B).
  • Take anonymous attribute propagation info into account in more places. Some nodes still propagate more attributes than they have to (e.g. subdivision).

Deterministic Anonymous Attribute Lifetimes

The purpose of this document is to provide some context to the anonymous attributes lifetimes refactor.

What are anonymous attributes?

Anonymous attributes are one of the two key concepts that were introduced in Geometry Nodes in Blender 3.0. Together with fields they were originally proposed on devtalk as a solution to severe usability problems.

Anonymous attributes are attributes that don't have a name (that's visible to the user). This has a couple of benefits:

  • Users don't have to specify the name to use an attribute created by a node.
  • Attribute name collisions can be avoided automatically.
  • It's possible to detect where they are used and their creation and deletion can therefore be automated. Note that this is not generally possible with named attributes, because they may even be used after export in other software. Named attributes are expected to stay around when they are not removed explicitly.

Numerous built-in nodes create anonymous attributes and output fields that reference them. In the image below are a few. All the non-geometry output sockets reference anonymous attributes.

What is their lifetime and why do we care?

Since anonymous attributes are created and removed automatically, the question arises where that actually happens in practice. The lifetime is essentially the set of geometry sockets when contain the anonymous attribute. It begins when the attribute is created and ends where the attribute is removed. Note that while there is a single node that creates the attribute, there can be multiple that delete it.

The general goal is to make anonymous attribute lifetimes as short as possible, because unnecessary attributes on a geometry are just wasteful. That's because they take up memory and might have to be processed by later nodes.

For example, the _expected_ lifetime of the Top attribute in the image below ends at the input of the extrude node. The attribute should not be propagated to the extruded mesh, because it's not required anymore. As we'll see, the _expected_ lifetimes does not match the _actual_ lifetime of anonymous attributes currently. Often the _actual_ lifetime is (a bit) longer than the _expected_ lifetime.

Why are their lifetimes not deterministic currently?

To get the field and anonymous attributes systems up and running quickly, we used a relatively simple approach to deal with the lifetimes: reference counting. Essentially, for every anonymous attribute a reference count is allocated, that keeps track of how many geometries contain the attribute, and how many fields reference the attribute. Whenever a node propagates attributes from one geometry to another, it checks whether any field still references anonymous attributes. If one is found that is not referenced anymore, it is not propagated.

While this approach is relatively simple to implement, it has some issues:

  • It is not deterministic. The same node may sometimes propagate an anonymous attribute in one evaluation, but not in the next, even though nothing else changed. That's because the order in which nodes are evaluated is not deterministic (nodes are evaluated on multiple threads). This has the effect that sometimes a field that references an attribute has or has not been freed before the node is executed.
  • The actual lifetimes of attributes is almost always a too long. For example, in the image above, the extrude node currently does propagate the Top selection from the Cylinder node, even though it's not necessary (the Subdivide Mesh node does not though). That's because the field referencing the attribute still exists while the extrude node is evaluated.

Another issue is that it can happen that (non-deterministically) an anonymous attribute is created sometimes, but sometimes not. For example, in the image below the Normal output might be created sometimes, even if the Switch input is true (and hence it would not be needed). That's again caused by non-deterministic node evaluation order. If the Distribute Points on Faces node is executed before the Switch node, it does not know yet whether the Normal output will be required or not. Therefore, it has to create the attribute, because adding it later is not possible.

Why do we want them to be deterministic?

Deterministic lifetimes are necessary for different kinds of caching. That is because non-deterministic lifetimes lead to nodes doing different things even when the inputs are exactly the same. Cache invalidation is already hard enough when you know all the data that feeds into the computation of a geometry. It's essentially impossible when not all data sources are known and they are non-deterministic.

This applies to planned automated caching in geometry nodes, as well as simulation caching that is being worked on.

There are other benefits to deterministic attribute lifetimes:

  • We can debug and visualize them. Showing them to users may make it easier to understand the benefit of anonymous attributes over named attributes. We can also show hints to users when a link does not make sense because an attribute referenced by a field does not exist on the geometry it is evaluated on.
  • Solving the problem deterministically also means that we can make sure the expected and actual lifetimes match always or at least more often.

Naive Solution

There is one very simple solution that would make anonymous attributes lifetimes deterministic: always create all anonymous attributes possible and never delete them before during geometry nodes evaluation. While this works, it causes a lot of overhead that can make geometry nodes much slower than necessary.

Knowing about this solution is useful, even though it is not practical. That's because it is still _correct_ in the sense that the user-observable behavior of geometry nodes (except for performance) is the same as before. This shows that better anonymous attribute lifetimes are just an (important) optimization. Therefore, the actual lifetimes don't always have to be perfect if they are at least as long as the expected lifetimes. In common cases it should be possible to achieve perfect lifetimes, but it's not obvious that it is always possible. Some times it may be hard to analyze node groups which contain many switch nodes.

How to achieve better lifetimes deterministically?

There are multiple ways to do this which vary in their complexity and efficiency. For example, switch nodes could be ignored completely in the lifetime analysis. This would simplify the solution, but causes actual lifetimes to be much longer than necessary in many cases. One could also try to find "global" solutions, i.e. first inline all nested nodes into a single node tree and then do analysis there. This can have a very large upfront cost that's especially bad when many of the nodes aren't actually evaluated (because they are disabled with a Switch node for example). Below, I'll explain the solution I currently have in mind without wandering off into alternatives too much, to keep the document more focused.

A key requirement for me was that the analysis can happen on the node-group level. This way it can be parallelized, and the same groups don't have to be analyzed multiple times as would be necessary with "global" solutions. It's also easier to cache the results of the analysis.

Extra Inputs for Determinism

To make the behavior of nodes deterministic (and therefore also the attribute lifetimes), two kinds of inputs are added to nodes. Note that these are not visible to users, they are just added to an internal graph data structure.

  • For every output field socket that contains anonymous attributes created in the node, add a boolean input that controls whether the anonymous attributes referenced by the output should be created.
  • For every geometry output that propagates attributes from input geometries, add an input that controls which anonymous attributes are propagated. (This is better than inserting nodes that remove anonymous attributes because sometimes the same node wants to use the attribute during field evaluation, but does not have to propagate it.)

These inputs are added to all relevant built-in nodes as well node groups.

Static Analysis

The remaining and challenging part is to pass is to determine when these extra inputs are necessary and what to pass into them. For both we need more information about the behavior of nodes. More specifically, 4 main relations between sockets of a node are required.

  • Green: Relation from input to output geometry that can propagate attributes.
  • Yellow: Relation between field and geometry input that indicates that a field is evaluated on a specific geometry (and therefore that the geometry is expected to have all anonymous attributes referenced by the field).
  • Blue: Relation between field and geometry output that indicates that the node created a new anonymous attribute that is referenced by the field and is available on the geometry.
  • Orange: Relation between field input and field output that indicates that the output field might reference the same anonymous attributes that are referenced by the input.

Again, all of these relations have to exist for built-in nodes as well as node groups. These relations already allow a good amount of static analysis but it's important to keep in mind that they are not always perfect. If a node group uses many switch nodes, it's possible to essentially create all possible relations even though most of them are never used in practice. Either way, these relations offer a good starting point for further dynamic analysis (while the node tree is evaluated) and also allow building UI features.

We do not expect users to set up these relations for node groups manually. Therefore, we have to inference these relations for node groups based on the node setup. Luckily, this seems to be possible in linear time (one has to iterate over all nodes more than once though).

With the added inputs and the inferenced relations, it is already possible to do not a terrible job at determining anonymous attribute lifetimes. However, there are still fairly simple scenarios where anonymous attributes are created/propagated unnecessarily. For example, consider the case below:

In this case, we'd only want to generate the normal attribute in the distribute node in case the Use Normals group input is true. However, that is not known statically. So somehow we want to let the distribute node know to respect the Use Normals input.

Dynamic Analysis

To make the setup above work as expected, the Generate node group has tell its parent node group that the Normals input is only used when the Use Normals input is true. Note that the condition could also be more complex (e.g. the Normals input is only used when some integer input has a specific value and a specific output socket is used).

Generally, we want to know which inputs a node group will use depending on other input values and which outputs are used. So it's easiest to just add some additional internal sockets to group nodes. For every original output socket, there is input that indicates whether the output is used. For every original input socket, there is an output that indicates whether the input used.

It's important to note that this can in fact generate cycles in the graph. In most cases, that's not a problem, because the it's not actually a data dependency, it just looks like a cycle because the node group is not inlined. The evaluator can evaluate that just fine. However, it's also possible that there are actual data dependencies. That happens e.g. when Use Normals is true if and only if the Distribute Points on Faces node outputs more than 10 points. In this case, the cycle has to be broken by always creating the anonymous attribute for the normals.

Attribute Propagation

So far we only looked at determining whether an anonymous attribute should be created. Now, we just have to know at which point the lifetime ends. As mentioned before, there is not a single point where the lifetime ends. For example, in the node tree below, the lifetime of the cylinder top selection and at two points. In the Extrude Mesh and in the Subdivide Mesh node. The Dual Mesh node is expected to propagate the attribute.

With some static analysis based on the relations from above, we can determine which attributes should be propagated in each node.

In many cases, a node has to propagate attributes referenced by multiple fields. In the example below, the Subdivide Mesh node should propagate the anonymous attributes referenced by the Selection and Offset group input. Furthermore, it should propagate the attributes that have to exist on the Geometry group output.


Diff Detail

Repository
rB Blender
Branch
anonymous-attribute-refactor (branched from master)
Build Status
Buildable 25242
Build 25242: arc lint + arc unit

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
  • Merge branch 'master' into anonymous-attribute-refactor
  • progress

The new system is in place, the old system is out, and tests pass (locally). Now just gotta do a fair amount of cleanup.

https://builder.blender.org/admin/#/builders/136/builds/322

  • Merge branch 'master' into anonymous-attribute-refactor
  • Cleanup pass

I didn't look at the actual lifetime inferencing logic yet.

source/blender/blenkernel/BKE_anonymous_attribute_id.hh
16–20
70–79

It might be nice to generalize attribute set/info structs a bit more so they can take over the roles of the various "names to propagate" and "names to skip" variables scattered around.

Though I guess these are mostly created by the new lifetime system, so that may be a bit tricky?

79

Seems like this is mostly passed by const reference anyway. Maybe the shared_ptr could be justified in a comment?

source/blender/blenkernel/BKE_curve_to_mesh.hh
5

Best to use a forward declaration here?

source/blender/blenkernel/BKE_customdata.h
276–277

This function is unused now

source/blender/blenkernel/BKE_node_runtime.hh
669–704

These new bNodeSocket functions could be added in a separate commit

source/blender/blenkernel/intern/node_tree_anonymous_attributes.cc
2

What do you think about moving this to nodes/geometry? Not sure if it's "bad" for blenkernel to call a function in nodes. Or did you think about making this into a more general lifetime analysis system in the future?

330–332

Probably on your list, but ideally we wouldn't need to use the socket display shape to retrieve the field status

source/blender/blenlib/BLI_hash_tables.hh
371 ↗(On Diff #58916)

What about using a Span for hashing rather than the vector directly? Would be a bit more reusable, and including BLI_span.hh here is probably a bit better than BLI_vector.hh.

source/blender/functions/FN_field.hh
88 ↗(On Diff #58916)

Do you have any idea for how to make this stand out a little less as just related to geometry nodes? I guess there's a risk of preemptive generalization, but maybe there's some middle ground.

Jacques Lucke (JacquesLucke) marked 8 inline comments as done.
  • Merge branch 'master' into anonymous-attribute-refactor
  • improve comment
  • cleanup
  • remove unused functions
  • add todo
  • comment
source/blender/blenkernel/BKE_anonymous_attribute_id.hh
70–79

Yes, think I mentioned that to you before. Didn't do that for now because the semantics are not always obvious and this would result in an even larger refactor.

With anonymous attributes, AnonymousAttributePropagationInfo can really be treated like a "hint" (and that is fine as long as it is deterministic). I noticed that we have a quite a few places that should make use of it but don't currently (e.g. mesh subdivision). For named attributes, treating the propagation info as a hint doesn't really work so well, because the calling code might make assumptions about which named attributes are not propagated.

source/blender/blenkernel/BKE_node_runtime.hh
669–704

Yes. There are a few more things in this patch that can be committed separately. Will do that in a bit.

source/blender/blenkernel/intern/node_tree_anonymous_attributes.cc
2

I'm not sure. Think for now I'll keep it here for consistency with the field inferencing.

330–332

Yes. Should be some other run-time data.
Think I will extract this into a separate function but keep using display shape for now. Mainly because there is no alternative currently, and we are already using the display shape for that purpose in node_link_is_field_link).

source/blender/blenlib/BLI_hash_tables.hh
371 ↗(On Diff #58916)

It has to exist for Vector and Span actually. That's because hash tables look this up using DefaultEquality<KeyType>. So when KeyType is a Vector, then there needs to be some DefaultEquality<Vector<...>>. It might be possible to move these declarations around a bit, but didn't seem worth it right now (BLI_vector is included everywhere in the end anyway).

source/blender/functions/FN_field.hh
88 ↗(On Diff #58916)

We could potentially iterate over all (recursive) field inputs instead. Will try that.

  • Merge branch 'master' into anonymous-attribute-refactor
  • Merge branch 'master' into anonymous-attribute-refactor
  • Merge branch 'master' into anonymous-attribute-refactor
  • cleanup
  • fix and cleanup extracting field inputs

Some feedback on the socket declaration part of the patch:

  • We need a function like .field_on(0) for the instance on points node
  • The different between "auto" and "all" seems a bit arbitrary. I'd suggest just using "all" since it's a bit more specific about the actual behavior. "Auto" is a bit like "default", it's not very helpful unless you already know what it's doing.
    • Either that or "all/auto" could be removed completely and the function could just be .field(). Since the default behavior is safe anyway, that might be best.
  • I like the "field_on" naming. It's concise but sounds natural
  • It's unfortunate to put all of this geometry nodes related stuff in the generic socket declaration system used for all node tree types. The alternatives aren't so appealing either though. Either this could become generic enough that it's not specific for geometry nodes (probably not realistic), or node declarations could become virtual, with these on a geometry nodes subclass (lots of code duplication).
    • Given the other options, it would be reasonable not to solve this now, but I feel like we should have some idea, since it's a sore spot in the declaration design.
  • I'd change propagate_from_auto -> propagate_all. Simpler, more natural sounding, and gives the same information
  • cleanup
  • Merge branch 'master' into anonymous-attribute-refactor
  • Committed parts of this patch to master and merged master back into the patch.
  • Merge branch 'master' into anonymous-attribute-refactor
  • Merge branch 'master' into anonymous-attribute-refactor
  • cleanup
Jacques Lucke (JacquesLucke) retitled this revision from Geometry Nodes: Deterministic anonymous attribute lifetimes (WIP). to Geometry Nodes: Deterministic anonymous attribute lifetimes..Jan 4 2023, 6:22 PM
Jacques Lucke (JacquesLucke) edited the summary of this revision. (Show Details)
Jacques Lucke (JacquesLucke) edited the summary of this revision. (Show Details)

I ran out of brainpower 2/3 of the way through geometry_nodes_lazy_function.cc. I find it impressive that you managed to get all that working.
To me the code seems reasonable-- it doesn't look obviously wrong anywhere, I agree with the style/formatting choices, and I don't see any obvious ways to improve it.
I'd have to spend a lot more time to actually give a meaningful opinion on some of the choices though.
I will say I appreciated wherever there was some prose (comments) interspersed in the code. Though I'm sure that's also personal thing.


Running the performance benchmarks, I found that the mouse house file is about 2.5x slower with this patch applied. Some performance regression there wouldn't be surprising, but that probably deserves some investigation-- it probably shouldn't slow down by that much.

FileMasterPatch
Erindale_Flower_Shop1.2804s1.8751s
Erindale_Mouse_House0.3961s1.1576s
source/blender/blenkernel/BKE_anonymous_attribute_id.hh
31

"Also it is intrinsicly reference counted in order to ..."

It would help to give some reasoning here

source/blender/blenkernel/intern/node_tree_anonymous_attributes.cc
173–218

Does this section have any similarity to find_linked_group_inputs? It seems relatively similar (finding a set of output sockets connected to some input). But I'm sure I'm missing something.

Each of these algorithms is relatively simple to understand in isolation, but I find it sort of hard to see common patterns. Maybe there aren't as many as I expect!

source/blender/nodes/NOD_geometry_exec.hh
286

Seems fine to use StringRef here instead of StringRefNull. Also maybe return early?

source/blender/nodes/intern/geometry_nodes_lazy_function.cc
702–704

Can you add a brief description for what these mean?

1019

Suggestion:
LazyFunctionForJoiningAnonymousAttributeSets -> LazyFunctionForAnonymousAttributeSetJoin

Similar for LazyFunctionForExtractingAnonymousAttributeSet above

1490

Some short comment used as a label for this code section would be helpful for the reader IMO

Running the performance benchmarks, I found that the mouse house file is about 2.5x slower with this patch applied. Some performance regression there wouldn't be surprising, but that probably deserves some investigation-- it probably shouldn't slow down by that much.

True, didn't happen to me when I last checked, but it's quite possible but one of my last minute changes caused that. Will investigate.

Testing the functionality now, this setup seems to still compute the rotation. (from printing in the code and not observing a time difference when muting the set position node). Based on the code it seems like this is meant to work.

Jacques Lucke (JacquesLucke) edited the summary of this revision. (Show Details)
  • fix performance regression
Jacques Lucke (JacquesLucke) marked 6 inline comments as done.
  • improve support for muted nodes
  • improve comment about intrinsic reference counting
  • cleanup

Thanks for adding the possible future improvements section. I guess the documentation will end up in the wiki?

"No functional changes" besides some performance improvements from not propagating some anonymous attributes, right?

This revision is now accepted and ready to land.Jan 5 2023, 1:07 AM

I guess the documentation will end up in the wiki?

Yeah, can put it there.

"No functional changes" besides some performance improvements from not propagating some anonymous attributes, right?

Depends on your definition of "functional" (see functional vs. non-functional).

source/blender/blenkernel/intern/node_tree_anonymous_attributes.cc
173–218

They are relatively similar. I found it easier to reason about the code without abstracting it more though. Abstracting find_linked_group_inputs away worked a bit better, because it only depended on one of the relations. Here we depend on three and they are traversed in opposite directions..