Page MenuHome

Metal: Remove Vec3 packing from uniform buffer generation as this causes UBO misalignment in Metal.
ClosedPublic

Authored by Jason Fielder (jason_apple) on Dec 8 2022, 12:22 PM.

Diff Detail

Repository
rB Blender

Event Timeline

Jason Fielder (jason_apple) requested review of this revision.Dec 8 2022, 12:22 PM
Jason Fielder (jason_apple) created this revision.
Clément Foucault (fclem) requested changes to this revision.Dec 8 2022, 6:36 PM

As far as I can see, the issue is that you cannot pack float after float3 because of the explicit float3 alignment to 16 bytes. I suggest this much simpler solution:

diff --git a/source/blender/gpu/intern/gpu_uniform_buffer.cc b/source/blender/gpu/intern/gpu_uniform_buffer.cc
index 5d9e2af631e..96b085dee2d 100644
--- a/source/blender/gpu/intern/gpu_uniform_buffer.cc
+++ b/source/blender/gpu/intern/gpu_uniform_buffer.cc
@@ -56,6 +56,10 @@ static eGPUType get_padded_gpu_type(LinkData *link)
 {
   GPUInput *input = (GPUInput *)link->data;
   eGPUType gputype = input->type;
+  /* Metal cannot pack floats after vec3. */
+  if (GPU_backend_get_type() == GPU_BACKEND_METAL) {
+    return (gputype == GPU_VEC3) ? GPU_VEC4 : gputype;
+  }
   /* Unless the vec3 is followed by a float we need to treat it as a vec4. */
   if (gputype == GPU_VEC3 && (link->next != nullptr) &&
       (((GPUInput *)link->next->data)->type != GPU_FLOAT)) {
@@ -89,6 +93,11 @@ static void buffer_from_list_inputs_sort(ListBase *inputs)
   /* Order them as mat4, vec4, vec3, vec2, float. */
   BLI_listbase_sort(inputs, inputs_cmp);
 
+  /* Metal cannot pack floats after vec3. */
+  if (GPU_backend_get_type() == GPU_BACKEND_METAL) {
+    return;
+  }
+
   /* Creates a lookup table for the different types; */
   LinkData *inputs_lookup[MAX_UBO_GPU_TYPE + 1] = {nullptr};
   eGPUType cur_type = static_cast<eGPUType>(MAX_UBO_GPU_TYPE + 1);

Also this patch contains a lot of unrelated modifications.

This revision now requires changes to proceed.Dec 8 2022, 6:36 PM

As far as I can see, the issue is that you cannot pack float after float3 because of the explicit float3 alignment to 16 bytes. I suggest this much simpler solution:

diff --git a/source/blender/gpu/intern/gpu_uniform_buffer.cc b/source/blender/gpu/intern/gpu_uniform_buffer.cc
index 5d9e2af631e..96b085dee2d 100644
--- a/source/blender/gpu/intern/gpu_uniform_buffer.cc
+++ b/source/blender/gpu/intern/gpu_uniform_buffer.cc
@@ -56,6 +56,10 @@ static eGPUType get_padded_gpu_type(LinkData *link)
 {
   GPUInput *input = (GPUInput *)link->data;
   eGPUType gputype = input->type;
+  /* Metal cannot pack floats after vec3. */
+  if (GPU_backend_get_type() == GPU_BACKEND_METAL) {
+    return (gputype == GPU_VEC3) ? GPU_VEC4 : gputype;
+  }
   /* Unless the vec3 is followed by a float we need to treat it as a vec4. */
   if (gputype == GPU_VEC3 && (link->next != nullptr) &&
       (((GPUInput *)link->next->data)->type != GPU_FLOAT)) {
@@ -89,6 +93,11 @@ static void buffer_from_list_inputs_sort(ListBase *inputs)
   /* Order them as mat4, vec4, vec3, vec2, float. */
   BLI_listbase_sort(inputs, inputs_cmp);
 
+  /* Metal cannot pack floats after vec3. */
+  if (GPU_backend_get_type() == GPU_BACKEND_METAL) {
+    return;
+  }
+
   /* Creates a lookup table for the different types; */
   LinkData *inputs_lookup[MAX_UBO_GPU_TYPE + 1] = {nullptr};
   eGPUType cur_type = static_cast<eGPUType>(MAX_UBO_GPU_TYPE + 1);

Also this patch contains a lot of unrelated modifications.

Thanks for the feedback! Tested these changes locally and lightweight fix does address the issue at hand. I can't recall if there were any other cases where types had incompatible alignment, though this appears to be working correctly, so will update patch.

I will also strip out the other changes and move them into a separate patch for clarity.

Michael Parkin-White (MichaelPW) retitled this revision from Metal: Generate uniform buffers using Metal compatible alignment. Enable multilayered rendering. to Metal: Remove Vec3 packing from uniform buffer generation as this causes UBO misalignment in Metal..Dec 14 2022, 5:55 PM
Michael Parkin-White (MichaelPW) edited the summary of this revision. (Show Details)
Jason Fielder (jason_apple) edited the summary of this revision. (Show Details)
  • Fix typo in rB2c5a525d6409
  • Fix mismatching PTX function declarations for OSL intrinsics with string parameters
  • Cleanup: Remove unused headers in asset files
  • Fix asset loading indicator in node add menu disappearing too early
  • Fix FallbackCyclesBlitShader compilation error
  • deps_builder: add missing freetype dep to harfbuzz
  • Fix compiler error on MSVC after a243a9dc79eb
  • Geometry Nodes: support storing 2d vector attributes
  • Metal: Remove Vec3 packing from uniform buffer generation as this causes UBO misalignment in Metal.

Seems like the diff got broken. I will commit only the changes in the gpu_uniform_buffer.cc.

This revision is now accepted and ready to land.Dec 15 2022, 10:56 AM