The idea is to avoid storing the whole array of closures when we only need
it to sum certain closure types later. Instead, we sum the closure weight
at the point where we would add the closure to storage, without actually
adding it.
The idea is simple but a bit tricky to implement, because we don't want
to introduce a whole new bunch of API functions, especially for SVM node
evaluation.
The current approach is to tag ShaderData with a special flag saying that
we want shader evaluation to sum a certain type of closure. The sum is done
in the SVM BSDF code (note: OSL is not yet supported). After that we can
read the summed weight directly from ShaderData.
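To make the flag idea concrete, here is a minimal sketch of it. All names
(ShaderDataSketch, FLAG_SUM_TRANSPARENT, emit_closure) are made up for
illustration and are not the actual kernel API, and the weight is a plain
float here only to keep the example short.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins; not the real kernel types.
enum ClosureKind { KIND_DIFFUSE, KIND_GLOSSY, KIND_TRANSPARENT };

// Hypothetical flag: "sum transparent closures instead of storing them".
constexpr uint32_t FLAG_SUM_TRANSPARENT = 1u << 0;

struct Closure {
  ClosureKind kind;
  float weight;
};

struct ShaderDataSketch {
  uint32_t flag = 0;
  float closure_sum = 0.0f;  // accumulated weight of the requested kind
  int num_closure = 0;
  Closure closure[8];
};

// Called by node evaluation whenever it wants to emit a closure.
inline void emit_closure(ShaderDataSketch &sd, ClosureKind kind, float weight)
{
  if (sd.flag & FLAG_SUM_TRANSPARENT) {
    // Summing mode: accumulate the weight, never touch the closure array.
    if (kind == KIND_TRANSPARENT) {
      sd.closure_sum += weight;
    }
    return;
  }
  // Regular mode: store the closure if there is room left.
  if (sd.num_closure < 8) {
    sd.closure[sd.num_closure++] = {kind, weight};
  }
}
```

The caller sets the flag before evaluation and reads closure_sum afterwards;
the node code itself does not need a second set of entry points.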
Now, to avoid code duplication, we use a trick: we allocate stack memory
the size of ShaderData without its closure array and pass it around as if
it were a proper ShaderData. Closure allocation has a guard so that we don't
try to allocate closures in such a ShaderData, but other than that nothing
prevents you from shooting yourself in the foot.
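A rough sketch of the two pieces involved, the size computation and the
allocation guard. The names (ShaderDataFull, FLAG_NO_CLOSURE_STORAGE,
closure_alloc_guarded) and the array size are assumptions for illustration
only. The sketch deliberately stops short of casting a truncated buffer to
the full type, since that part relies on layout and is not something strict
C++ guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins; not the real kernel types.
struct ClosureSketch {
  int type;
  float weight;
};

constexpr int MAX_CLOSURE_SKETCH = 64;  // assumed value, for illustration

struct ShaderDataFull {
  uint32_t flag;
  float closure_sum;
  int num_closure;
  // Must stay the last member, so that cutting the struct here keeps
  // every other field intact.
  ClosureSketch closure[MAX_CLOSURE_SKETCH];
};

// Hypothetical flag marking storage that has no closure array behind it.
constexpr uint32_t FLAG_NO_CLOSURE_STORAGE = 1u << 0;

// Bytes needed for a "lite" ShaderData: everything before the closure array.
constexpr std::size_t LITE_SIZE = offsetof(ShaderDataFull, closure);

// Guarded allocation: refuses to hand out a slot when the ShaderData is
// marked as having no closure storage, or when the array is full.
inline ClosureSketch *closure_alloc_guarded(ShaderDataFull *sd)
{
  if (sd->flag & FLAG_NO_CLOSURE_STORAGE) {
    return nullptr;
  }
  if (sd->num_closure >= MAX_CLOSURE_SKETCH) {
    return nullptr;
  }
  return &sd->closure[sd->num_closure++];
}
```

The guard is the only thing standing between a lite ShaderData and a write
past the end of its storage, which is why every allocation path has to go
through it.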
This silent substitution of a "lite" version of ShaderData is what makes
me unhappy about this patch, and I would welcome a discussion of possible
alternatives.
- Maybe we should introduce ShaderDataLite instead and have a special API to deal with it?
(It would be annoying to copy-paste all the shader_setup functions, though.)
- Have a persistent flag in ShaderData, so we always know whether we are dealing with a proper ShaderData or a lite one?
This would at least make the checks in the official API more robust, because then we couldn't accidentally reset the sum flag and try to use the closure array.
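The persistent flag option could look roughly like this. It is only a
sketch of the idea; the flag names, the mask and the helper functions are
all hypothetical and not part of this patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flags; the values are arbitrary.
constexpr uint32_t RUNTIME_SUM_CLOSURES = 1u << 0;  // per-evaluation flag
constexpr uint32_t PERSISTENT_IS_LITE = 1u << 31;   // set once at setup

// Flags that must survive any reset of the runtime state.
constexpr uint32_t PERSISTENT_MASK = PERSISTENT_IS_LITE;

struct ShaderDataTagged {
  uint32_t flag = 0;
  int num_closure = 0;
};

// Clears runtime flags but keeps the persistent ones.
inline void reset_runtime_flags(ShaderDataTagged &sd)
{
  sd.flag &= PERSISTENT_MASK;
}

// API-level check: closures may only be stored in a proper ShaderData.
inline bool can_store_closures(const ShaderDataTagged &sd)
{
  return (sd.flag & PERSISTENT_IS_LITE) == 0;
}
```

With this split, resetting the sum flag by accident no longer turns a lite
ShaderData into something that looks like a proper one.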
All in all, this seems a promising direction. On 1080 it gives about 210MB
of VRAM savings (which we can later reuse for shadow_blocked(), so don't
get too excited), and it also appears to give a few percent speedup in the
koro scene.