Textures: prototype multi-function texture nodes CPU evaluation
Draft, Public

Authored by Brecht Van Lommel (brecht) on Jun 9 2022, 4:31 PM.
This is a draft revision that has not yet been submitted for review.

Details

Reviewers
None
Summary

The basic idea is to implement new texture nodes that are just a subset of
shader nodes, re-using geometry nodes code as much as possible.

This is very early, the main purpose was for me to get acquainted with all the
new data structures that geometry nodes introduced, and find design issues to
be solved.

Design doc: T98940: Texture nodes CPU evaluation design

Diff Detail

Repository
rB Blender
Branch
tex-field (branched from master)
Build Status
Buildable 22479
Build 22479: arc lint + arc unit

Event Timeline

Switch to MFProcedure.

This seems to be significantly slower than the previous version of the patch. In a simple graph with a Noise Texture evaluation, the time spent evaluating the noise itself went from 70% to 15% of the execution time. If we can get big batches everywhere this may become negligible, but this still seems much slower than it could be.

Well, there are certainly ways to make it faster to evaluate a single element, but that's never really a bottleneck for geometry nodes. When you want to evaluate single elements and optimize for latency, then using MultiFunction is the wrong approach. It just makes too many high-level decisions during evaluation that wouldn't be necessary when evaluating single elements.
I can reproduce your measurements and tried to expand on them a bit. First I had to disable a couple of optimizations to make the performance comparisons useful: P3001.
That's because the procedure executor currently assumes that multi-functions without inputs produce a constant value and optimizes for that (that could be changed for the case when a multi-function uses context, but it would be better if the "context" is passed in as an input to the multi-function).

I measured around 7us overhead for evaluating a single noise (which takes about 1us). This is obviously fairly bad. The key point, and what the multi-function system is optimized for, is that when computing 10, 100 or 1000 noise values, the overhead from the executor is still around 7-10us. So it generally does not grow with the data size. That also means that, especially for noise, "big batches" don't actually need to be that "big" to make the overhead negligible. For simple math functions like "add", the batches would need to be bigger to make the overhead negligible, of course.


I can see why you want to evaluate single texture values to make it work with existing code (instead of batching). I don't think it's generally possible to optimize MultiFunction for that use case while also keeping it optimized for many elements. Oftentimes one has to make a trade-off that favors one over the other. Possible ways forward regarding single values vs. batches:

  • Refactor some existing users of the texture system that matter for us now to use batches.
  • Implement a simple procedure executor that is optimized for evaluating a single element (it will still have a lot of overhead compared to using a system that's actually optimized for that use case).
  • Build a separate function evaluation system that's optimized for evaluating single values (but it really shouldn't be used when batches are processed). That involves defining a function calling convention, a way to compose functions into a bigger function and a way to execute that composed function efficiently. We could in theory write the function nodes in such a way that code for the different function systems is generated automatically.
  • Accept that it is slow until we support batching. Maybe don't merge to master until some batching exists.

At a higher level, using fields (FN_field.hh) for textures still seems quite reasonable, and it would actually eliminate the need to build the multi-function procedure manually, because that is done by the field system. It should be noted, though, that this increases the *constant* overhead further, especially because the current field evaluator is not written for the use case where the same field is evaluated many times in different contexts. That can and should probably be improved.

The field system can also be extended to be useful when the latency of single-element evaluation is important and when code for a GPU is generated. While FieldOperation only contains a multi-function currently, it could contain other ways to process the data.


At an even higher level, it might even make sense to share code that evaluates geometry nodes and texture nodes. More specifically, I'm talking about T98492, which I'm currently working on. This could even be used to evaluate shader nodes. Note that for shaders and textures this evaluator would only be used to create a Field or something similar, which then has to be executed by another system (e.g. multi-functions). The benefit is that this other system doesn't have to deal with node-groups, type conversions etc. anymore, because that is done when the Field is constructed.


I can see that it can be difficult to get a good intuition for the purpose of these different abstraction levels (multi-functions, fields, geometry-nodes-evaluator, ...) but I do think that this separation allows us to write stable and high performance systems. If you have any questions, just ask.

It would be useful if you could share your test file. It took quite a while to figure out how to use the texture node system.

For reference, D15198 is the first time we are using a FieldContext that is not based on a GeometryComponentFieldContext.

Work towards using fields, geometry components and batched execution.

Still hacky and incomplete in many places, took a while to get my head around all the geometry nodes data structures and how they fit together.

  • Switch from TextureComponent to TextureFieldInput, rebase for new attribute API
  • Batched texture evaluation for 2D image painting

Nice to see batching working!
I can do a pass on the patch next week. Mainly want to try to make it use the FieldEvaluator. Maybe will also change how TextureCustomDataAttributeProvider works. It's not obvious to me that this has to go through the attribute API, which isn't really meant for accessing derived data. Might be much simpler to access the data without the attribute API.

I started working on that in D15460. Lots of stuff that was part of this patch is still missing, but maybe it helps a bit already anyway. Is there a way to test evaluating the texture on a geometry currently?