Page MenuHome

Metal: MTLShader and MTLShaderGenerator implementation.
ClosedPublic

Authored by Thomas Dinges (dingto) on Jul 28 2022, 5:48 PM.

Details

Summary

Full support for translation and compilation of shaders in Metal, using GPUShaderCreateInfo. Includes render pipeline state creation and management, enabling all standard GPU viewport rendering features in Metal.

Authored by Apple: Michael Parkin-White, Marco Giordano

Ref T96261

Diff Detail

Repository
rB Blender

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
source/blender/gpu/metal/mtl_shader.hh
54–78

Could we use a shared header for these? similar to draw_shader_shared.h.

158
210

This is weird to have it here. Would have thought this would go more into a metal_state.hh conceptually. It has no direct link to shaders excepth the MTLVertexDescriptor.

216
328

Is there a need for it to be stored? I cannot find any usage of it.

330

I could not find any use of this. Is that for shader entry point?

348–356

This is interesting. Maybe we should try to remove the root cause. But that' fine for now.

382

Wait. We already have gpu::Shader::interface as a public member, so you already have access to it. Why another member with diferent type? I see that you have a owns_interface_ but there don't seem to be any case where this is false. Is Interface sharing planned?

385–393

If we can expect compute shader support for the initial version of the Metal backend, I would rather not add this.

397

When is that false? I think it is better to always assume they are separated. We do require different files with different suffix.
If this is for internal metal utility shaders, I would prefer they stick to the global rule of separated files.

398–416

All of this is concerning shader creation. I do not see any benefit of keeping them inside the struct that is supposed to be reserved for actual data needed for drawing. Source debugging should go through GPULogParser.

You can create a temporary MTLShaderBuilder object that contains all of them. It should get freed after compilation.

429

Not sure if this is also shader creation only.

462–468

Not sure I understand fully what this entails. I would have though that push constants would fit 1 UBO. Is that for cases where it doesn't fit?

539–543

Members should be above all functions.

source/blender/gpu/metal/mtl_shader.mm
48–52

Except from context_ all of these should be inside the header on the members declaration.

source/blender/gpu/metal/mtl_shader_generator.mm
581

This needs to take into account case when there is one or more blank char between the type and the parenthesis. i.e: mat3 (.

1803
source/blender/gpu/shaders/metal/mtl_shader_defines.msl
763

Why the define to a single use function?

999

Why add metal namespace here?

1066

Note: I suggest to always add a newline at the end of shader files since they all gets concatenated. Sometime you can end up with #define in the first line of a following file and this would break compilation because directive would be preceded by another char.

1071–1082

So all of this is only needed because of the mat4_to_mat3 case?

Maybe we can make an exception and enforce the use of a mat4_to_mat3. We could even catch the error in the log parser and suggest the workaround.

source/blender/gpu/metal/mtl_shader.hh
424

Would have thought this would be given by the MTLState instead of living inside the MTLShader.

source/blender/gpu/metal/mtl_shader.mm
277

I would prefer this is set through member initialization.
Also, prefer nullptr.

279

Can we remove this loop if we only allow 1 ubo for all push_constants?

285

Same here.

353

Could also test for location >= interface_->get_total_uniforms().

359

This is already tested by the warning check.

363–364

Shouldn't these be checked at interface creation time? Seems redundant to check for every single uniform set.

408

Move the return statement inside the case brackets.

423

Makes it fit in 1 line.

There extra parenthesis are everywhere in this function.

479–480

Already checked above.

540

Say, how bad is it if we just #define main my_function_entry_point and declare all the variation upfront in the create info? Leaving only the GLSL path but you still use it with msl source. This shouldn't be a problem with the glsl parser since it is only called when source is not using ShaderCreateInfo.

Also realistically, how much perf are we loosing if we use an uber shader with enum push constant parameters instead of the specialized shaders? I guess not a lot since theses (the metal specific shader) are only used for data conversion which happens on read and texture upload (which are infrequent).

Trying to reduce complexity here by removing different paths.

594
694

What's up with all these empty lines? This is triggering my OCD 😆

782

Comment formating.

797

Comment style

809–810

comment style

845

Just set MTL_pointsize to the right value and set the values once.

Submitting first round of addressed feedback. Some of my replies are overly detailed, but aiming to provide additional context if it is useful.
I'll continue working through a first pass of changes where straight forward.

source/blender/gpu/CMakeLists.txt
475

Ah right, can keep this in mind. I originally had these two lists kept separate, but appended the Metal shader sources onto this list as a number of these shaders are used in create info.

Perhaps best to split this section further, separating out MSL shaders from GLSL shaders used by the Metal backend.

These GLSL shaders could be common between all APIs, though they are not required by the Gl backend, as support for Depth-write etc; is hidden behind the API.

source/blender/gpu/GPU_shader.h
17 ↗(On Diff #54141)

This may have been a left-over from the previously proposed uniform addressing change which was changed in the end.

It is used throughout a few places in the backend, just as a more explicit check for uniform addresses being valid. Though happy to remove its use, and replace GPU_UNIFORM_IS_VALID use cases with a direct check, as is done in other places in Blender, if this is preferred?

source/blender/gpu/intern/gpu_context_private.hh
52

I can clarify this comment. This is to deal with a source collision, which happens particularly if shader compilation is occurring on different threads for the same shader source. This is a problem with the OS-level global shader cache, and only affects certain older OS's, rather than anything inside Blender itself.

The context ID being a number ensures we can still benefit from shader caching between runs of the application, as this ID is appended onto the end of the source to make it slightly different, though ensuring that the same source hashes will exist between runs.

Unfortunately using the pointer would produce a different result between re-runs meaning full shader compilation would happen every time.

source/blender/gpu/intern/gpu_shader.cc
69–70

Ah right, will update.

source/blender/gpu/intern/gpu_shader_create_info.hh
105–138

These additional types were added to provide direct type mapping between actual input vertex types and their corresponding type inside a Metal shader. Given Metal does not support as many implicit type conversions as Metal, it is more convenient and optimal to allow data to be directly passed.

Two use-cases for this are in the edit_mesh_vert shader. Where input vertex data can be in several forms, but is used as follows in the Geometry shader alternative path (not included in this PR):

.vertex_in(0, Type::VEC3, "pos")
.vertex_in(1, Type::UCHAR4, "data")
 .vertex_in(2, Type::VEC3_101010I2, "vnor")

or

.vertex_in(0, Type::VEC3, "pos")
.vertex_in(1, Type::UCHAR4, "data")
.vertex_in(2, Type::VEC3, "vnor")

vs original:

.vertex_in(0, Type::VEC3, "pos")
.vertex_in(1, Type::IVEC4, "data")
.vertex_in(2, Type::VEC3, "vnor")

For the "data" vertex attributes, these are often encoded as a UChar4 in the vertex format:

GPU_vertformat_attr_add(&format, "data", GPU_COMP_U8, 4, GPU_FETCH_INT);

Conversion from a UChar4 vertex attribute input data to a UInt4 vertex attribute in the shader is invalid in Metal via implicit conversion. Therefore, under the Metal path, changing the input type in the Shader create info to UCHAR4 allows us to directly read that data as-is.

The GLSL path then has a macro to map char types to integer types, such that implicit shader conversion still works in the same way as the original version. e.g. #define uchar4 uvec4

This could likely be handled in a different way, and a number of these types were added for thoroughness, though given these are not supported by GLSL directly, perhaps a different approach is best.

It's also worth noting that vec3_1010102_Inorm is also not a valid literal type in Metal, but this was a type alias, again converted back by macro, to explicitly allow direct data mapping, specifically in this case for SSBO vertex fetch, which needs to interpret a compressed vector manually.

In GLSL, this macro converted the type directly to vec3, to allow implicit vertex assembly to populate the values:
#define vec3_1010102_Inorm vec3

For Metal, this instead uses an int as a backing store in the shader, and a special vertex attribute read routine to unpack into a vec3 during ssbo vertex fetch, as vertex assembly isn't used in this mode:
#define vec3_1010102_Inorm int

640

Following separate discussions on BlenderChat and to make a more permanent note here, the overall geometry shader alternative path could certainly be done in another way.

In this case, the Create-info change is to register the data which would be specified in the current:
#pragma USE_SSBO_VERTEX_FETCH(TriangleList, 3) which aims to match the geometry shader declaration.

Though it was a slight oversight of me not completely remove the other path for detecting the pragma string.

These data points aim to match the same input as the geometry shader. The aim is that for each input primitive, we specify the type of output primitive and number of desired output vertices.

i.e. for 3D poly line, to take an example where your input primitive type is LineList. To generate a wide line using a quad, we would use TriangleList as the output primitive type, and specify 6 vertices to create two triangles per input line. Under the hood, when this draw call is executed, rather than executing a line draw, the backend will execute a draw using TriangleList and multiply the output vertices by the number of input primitives to invoke that many instances of the vertex shader.
(So if we had a wide line with 6 line primitives, we would end up invoking a TriangleList draw call with 36 vertices instead).

There are other cases such as Workbench shadows where the output vertex count can be up to 12 vertices at most.

However, yes, at the moment, use of this line implies use of SSBO vertex fetch mode being enabled, in much the same way as if .geometry_layout( or .geometry_source( are specified.

Happy to either add an explicit create-info line to enable this, or also add a boolean param to specify whether it is enabled if this is preferred?

source/blender/gpu/intern/gpu_shader_interface.hh
45–46 ↗(On Diff #54141)

The main reasoning for this was simply because the name buffer was being written to from several different utility functions during construction in the MetalBackend variant, and such needed a counter to track the current location and size. There is also use of a name buffer utility function name_buffer_copystr, for adding a string to the name buffer safely during construction, and reducing code duplication.

In the GLSL implementation, name_buffer_ population only appears to happen from a single place (GLShaderInterface constructor) and thus the counters are local to that code.

In practise, all of the uses of these member functions are inside MTLShader::bake_shader_interface, which populates and finalizes a shader interface after GLSL conversion is complete.

So I propose we could either:

  1. Move those variables inside mtl_shader_interface, knowing they are only used in one place.
  2. Perhaps easier, Move the name copy outside of add_input_attribute/add_uniform style functions, and keep a local counter inside bake_shader_interface. We would instead directly pass the name buffer offset into those functions. The name buffer utility functions would then be static, local only to the bake_shader_interface function.

Number 2) would effectively match GL and ensure it is not editable after construction and size compaction. Number 1) I perhaps slightly cleaner/easier to follow, but would not enforce the same strict guarantees.

source/blender/gpu/metal/kernels/compute_texture_read.msl
77

This was a previous oversight from the other review. Double is not supported as a type in Metal, so the type must be kept as float.

In the case of val being > 1.0, this is a good point, as in these cases, this value would overflow. Perhaps need to understand the exact use-cases of this, and whether values are stored in the zero-one range. Should val likely be clamped?

Is there an actual use case which may hit this, with a higher-precision colour target being read as UINT?

source/blender/gpu/GPU_shader.h
17 ↗(On Diff #54141)

Yes. It is prefered.

source/blender/gpu/intern/gpu_context_private.hh
52

Well my issue is that a new context is created for each render context. So does this mean doing multiple render will always create a new shader in the OS cache? The material thumbnails also use a temporary context if I remember correctly.

source/blender/gpu/intern/gpu_shader_create_info.hh
105–138

Okay, I don't have issue with them being added. But it might be confusing for user of the API as these are reserved for internal Metal backend usage. Make sure to make that clear inside enum class Type by separating the new types with a comment.

We might use them directly in the future if they prove to reduce code complexity.

640

Since the shaders are different implementation alltogether, I would like to have different create infos. The workaround one should replace the standard one just like that draw_modelmat = draw_modelmat_legacy; in gpu_shader_create_info.cc.

So the create_info API should not have any change to support this.

source/blender/gpu/intern/gpu_shader_interface.hh
45–46 ↗(On Diff #54141)

I would go for number 2).

source/blender/gpu/metal/kernels/compute_texture_read.msl
77

You are right, it is unlikely to be the case if we make sure to enforce correct API usage in the backend.

Still have a number of outstanding items to address, but sharing these feedback items for now. Thank you for the detail! In the process of applying refactors and fixing up issues.

source/blender/gpu/intern/gpu_context_private.hh
52

In most cases, this is avoided, and once the cache is populated with several permutations, there are no issues on future runs.

However, I'll add an OS version guard for this, and only apply the context ID to the affected OS, as it is still needed to avoid crashing. (Even guarding compilation on the app-side can be problematic, as shading cache is still handled outside of the app).
It may be the case that the Metal backend will not be supported on these versions anyway, as our minimum supported OS for the required feature set of Metal 2.2 is macOS 10.15.

If it is preferred, the ContextID can be moved inside the Metal backend?

source/blender/gpu/metal/mtl_context.mm
260

The size was likely excessive, so have reduced. It is used for any place where a vertex buffer may be required but is not yet ready. This is often a fallback, though there are a few cases where UBOs are bound but no data has yet been uploaded. (This is usually for cases where the UBO is not actually accessed, i.e. lighting data where no lights are present, but the binding still needs to be satisfied).

So in practise, it only needs to be as big as the largest possible UBO.

I believe 4k should be sufficient, though this may still be overkill.
If UBOS are truly unaccessed, then the actual size is somewhat irrelevant, though Metal API validation wants to ensure buffers are big enough for the backing struct in the shader.

source/blender/gpu/metal/mtl_shader.hh
210

Yeah, this struct could likely be moved to a more global location. The thinking of having it local to the shader class was that this contains a subset of global state which feeds PSO creation (which is shader specific), and is then used as a key in the hash map of previously compiled PSOs.

In this case, PSO backend compilation combines the intermediate representation of the compiled shader library with the render pipeline state to compile the final shader program which will execute on the GPU.

Currently, this struct has data copied in prior to PSO creation to ensure everything is up to date based on the current pipeline state. This mechanism could still happen in the same way, to avoid double-population of state.

There is also a separation between this struct and the MTLGlobalPipelineState/MTLRenderPassState, as this one contains only the hashable subset which feeds the MTLRenderPipelineDescriptor for shader-specific PSO creation.

The reason this is also a member struct rather than a local one is such that MTLBatch and MTLImmediate can directly populate the Vertex descriptor in this struct. However, this could most likely be replaced with a global vertex descriptor in a global copy of this struct.

328

Yeah, you're right, this appears to be stored unnecessarily, and if it ever did need to be fetched, it could just be found from the corresponding key in the pso_cache_ lookup.

330

This is a reference to a specialised variant of the vertex/fragment function for the current shader. Storing it here is primarily just a means of keeping a reference to the resource. Such that it can be retained while in use in a PSO, and released when the shader object (and pso_cache_) is destroyed.

382

When the backend was first implemented, interface sharing, and external interface creation was supported, though since everything now goes through the GLSL pipeline, rather than using hand-written MSL shaders, this is now no longer needed.

It was kept as a convenience to avoid repeated casting. Though I imagine the compiler probably negates any overhead with this. I can remove this.

385–393

Compute support is still something needing attention. It should not be too complicated and should be more straight forward than the existing shader translation. Though I can remove this as soon as that support is in place.

Would it be okay to keep this as a stop-gap for the time being, to enable testing while compute features are worked on?

397

Yeah, we can remove this and just always use separate files. This did exist from a time where internal metal vertex/fragment shaders shared a single source.

398–416

Yeah this is true, though the shader libraries and function entry point names are required prior to PSO (backend) compilation which happens upon a draw, based on the current rendering state.

A shader will potentially have several PSO permutations based on the pipeline state at the time it is used. This compilation is deferred until draw time, as this state is generally unknown up-front.

We do not however require any of the original string-based sources however, so I'll place these inside an MTLShaderBuilder struct and free these after a shader is finalized.

Will also refactor debugging to use GPULogParser.

429

This is also required for PSO creation (backend shader compilation). When layered rendering is used, primitive type needs to be explicitly specified.

Usually, specifying primitive type explicitly means that each time a different primitive type is used with a given shader, a PSO permutation needs to be compiled. We can instead use "unspecified" which allows this to be any type for most cases, reducing the number of backend compilations.

462–468

This was perhaps over-engineering for cases where push constants (general shader uniforms) may be stored in several different UBOs, to allow optimisations such as only updating/re-binding ones which had been modified.

Left this in as UBO's had the Frequency field, which seemed like it could extend to uniforms in future. I.e. if certain classes of uniforms only get updated per scene, then having split UBOs could have the advantage of less data being encoded in the command buffer between similar draw calls.

However, as it stands, this code does not have an active use-case. So it can be removed and just replaced with a single data store. As all uniforms (push_constants) currently get written to the first data block only.
If the data size exceeds the limit of 4KB (which none do currently, but path added for robustness), then the push constant data is bound as a buffer, using a temporary scratch buffer allocation, rather than using Metal's setBytes call, which allows data to be encoded in the command stream.

source/blender/gpu/metal/mtl_shader.mm
279

Yep, as per above, will remove and replace with a single block.

845

I believe this needs to be handled for different point size states, and this is also used to disable global point-size, if it is dynamically disabled. This will generate various PSO specialisations for each case, if the same shader is used with both global point size and programmatic point size.

source/blender/gpu/shaders/metal/mtl_shader_defines.msl
758

Yes, M_PI_F is declared globally. Currently it does not appear to collide, though I can replace this definition with a pre-computed constant.

Do pyGPU shaders include common_math_lib as default? On this note, are there any good test-cases for pyGPU, would be useful to have a range of extensions which use the features to test the implementation.

763

Good point, this was unnecessary, have removed.

936–939

This was thrown in In-case it was needed, and also used in a few experiments, but will remove it as I imagine it is not needed.

999

Ah yes, this is also no longer needed. This was carry-over after previously using some functions from the Metal headers, but this is ultimately unnecessary, especially also given the shader explicitly uses the metal namespace.

1071–1082

To simplify the code, we could remove mat4_to_mat3 and have the conversion directly in the MAT3 function.

Currently, a find-and-replace expression is used to detect casts to mat3( and replaces those with MAT3 so as to use these function calls, such that all cases can be handled. But yes, the functions here are only wrappers around the constructor, as find-and-replace is not advanced enough to know the input parameter type.

I believe a previous patch had replaced mat3(mat4 m) cases to use mat4_to_mat3 directly, though this was removed for compatibility.

Would be more explicit however to have this in the high level shaders, as mat4_to_mat3 does cause a type truncation.

source/blender/gpu/intern/gpu_shader_create_info.hh
640

Right, so currently they do have different create info's (again not included in this PR), though like with geometry shaders, those new create-info's require some configuration to declare how the SSBO vertex fetch mode invokes a draw.

I can remove these configurations in gpu_shader_create_info.cc if this is not desired, though this would mean needing to use something like the existing pragma inside the non-geometry-shader variant of the source:
#pragma USE_SSBO_VERTEX_FETCH(TriangleList, 3)

For a little bit of context, this is an example of create info for gpu_shader_3D_polyline using a non-geometry-shader variant which relies on SSBO vertex fetch:

GPU_SHADER_CREATE_INFO(gpu_shader_3D_polyline)
    .define("SMOOTH_WIDTH", "1.0")
    .push_constant(Type::MAT4, "ModelViewProjectionMatrix")
    .push_constant(Type::VEC2, "viewportSize")
    .push_constant(Type::FLOAT, "lineWidth")
    .push_constant(Type::BOOL, "lineSmooth")
    .vertex_in(0, Type::VEC3, "pos")
    .vertex_out(gpu_shader_3D_polyline_iface)
    .geometry_layout(PrimitiveIn::LINES, PrimitiveOut::TRIANGLE_STRIP, 4)
    .geometry_out(gpu_shader_3D_polyline_iface)
    .fragment_out(0, Type::VEC4, "fragColor")
    .vertex_source("gpu_shader_3D_polyline_vert.glsl")
    .geometry_source("gpu_shader_3D_polyline_geom.glsl")
    .fragment_source("gpu_shader_3D_polyline_frag.glsl")
    .additional_info("gpu_srgb_to_framebuffer_space");

GPU_SHADER_CREATE_INFO(gpu_shader_3D_polyline_no_geom)
    .define("SMOOTH_WIDTH", "1.0")
    .push_constant(Type::MAT4, "ModelViewProjectionMatrix")
    .push_constant(Type::VEC2, "viewportSize")
    .push_constant(Type::FLOAT, "lineWidth")
    .push_constant(Type::BOOL, "lineSmooth")
    .vertex_in(0, Type::VEC3, "pos")
    .vertex_out(gpu_shader_3D_polyline_iface)
    .ssbo_vertex_fetch_out(PrimitiveOut::TRIANGLES, 6)
    .fragment_out(0, Type::VEC4, "fragColor")
    .vertex_source("gpu_shader_3D_polyline_vert_no_geom.glsl")
    .fragment_source("gpu_shader_3D_polyline_frag.glsl")
    .additional_info("gpu_srgb_to_framebuffer_space");

Perhaps best to remove this from this PR anyhow, and instead use a future PR which has a little more context for this solution?

source/blender/gpu/metal/mtl_shader.mm
540

Not quite sure I understand the first comment exactly, do you mean pre-declaring all Metal entry points up-front for every possible type of shader?

The generated entry-point wrapper for a GLSL shader is fairly complex as is, and pretty unique to each GLSL shader, depending on the combination of inputs/outputs and GLSL globals used.

For the second comment on having an uber shader using push constants instead of specialising, this is certainly possible, though following knowledge of performance overhead incurred from using Uber-shaders, these tend to increase local register pressure, which does negatively impact performance on Apple Silicon Hardware.

While on the flip side you can save compilation overhead, shader "specialisation" often needs to happen anyway for unique vertex descriptors, whenever the vertex descriptor changes. Given this PSO compilation is required anyway for differing vertex descriptors, the specialisation won't really have any additional negative impact.

So TL;DR as far as I can tell, this wouldn't actually reduce the number of paths we need, just change it from using uniform data vs specialisation constants, which we need to use anyway to support other features.

Though I do understand the sentiment of wanting to minimise complexity.

Rebased with latest master.

source/blender/gpu/intern/gpu_context_private.hh
52

If it is preferred, the ContextID can be moved inside the Metal backend?

Absolutely. This makes it clearer that this is Metal related. Also would put a bit more context (not the full context, maybe a link to a more in depth comment) in the comment about why this is needed.

source/blender/gpu/metal/mtl_context.mm
260

This is usually for cases where the UBO is not actually accessed, i.e. lighting data where no lights are present, but the binding still needs to be satisfied

This should trigger an assert. You shouldn't have to satisfy bindings that are user fault. We check for that on debug builds but not release if I remember correctly.

This was already crashing on some Mac in the past with Opengl. So I do expect that even pyGPU users will fulfill their bindings.

Also, if this is really needed, this should definitely use calloc and discard it right away. The memset is not even the right size (should be sizeof(null_data) which is sizeof(uint32_t) * NULL_BUF_SIZE) . But you still allocate the UBO with NULL_BUF_SIZE which I guess is in bytes? then why uint32_t? would be happy if all of this wasn't necessary :D

source/blender/gpu/metal/mtl_shader.hh
385–393

Ok to keep that for the moment then.

462–468

Just as a note, the Frequency is only for binding frequency. It is not about update frequency which in almost all cases should be once per frame. If there is an update frequency flag it should be on UBO creation. But we haven't one for the moment.

So I agree that you can remove it.

source/blender/gpu/metal/mtl_shader.mm
540

Well my point was that we could avoid using source file using multiple entry points. Removing the need of this set_vertex_function_name() which seems to only be of use for the internal MSL source since all GLSL sources are using main() as entry point.

The uber shader question / suggestion was related to the aforementioned MSL source shaders, which (if I'm not mistake), are very small as they only do conversion operations. So I don't think a uber shader in this case would make a performance difference.

Nevertheless, if shader variants are needed for this case, can't they just be declared as separate ShaderCreateInfo?

I'm failling to see the use of it outside of the backend.

845

I meant that the code is not straightforward to read and could be written in a more simple way keeping the same behavior.

source/blender/gpu/shaders/metal/mtl_shader_defines.msl
758

I would prefer precomputed constant.

Do pyGPU shaders include common_math_lib as default?

They don't. They would have to declare their own M_PI_F constant.

On this note, are there any good test-cases for pyGPU

I will ask the right person and foward you the answer. But to my knowledge, there is only the official extensions that we maintain. And I would be there is not a lot of custom shaders in it.

Updated the patch with the latest diff against the correct hash.

I think it can land in this state. The remaining comments are either nitpicks or things to be addressed in the future.

@Michael Parkin-White (MichaelPW) If you plan to still add some changes to this patch let me know. Otherwise I will merge it tomorrow or this weekend.

I think it can land in this state. The remaining comments are either nitpicks or things to be addressed in the future.

@Michael Parkin-White (MichaelPW) If you plan to still add some changes to this patch let me know. Otherwise I will merge it tomorrow or this weekend.

Thanks Clément, I'm happy to land as-is and iterate on improvements if that is okay. If there are any key outstanding things that I have missed, happy to iterate on those and submit another round.

The one thing I have not actioned from this feedback is using GPULogParser to handle the Metal shader errors. This is a little more intricate and involved than would first seem, as the error message semantics are quite different to the ones returned by the GLSL compiler. Similarly, as the shader is wrapped and modified verses the input source, returning errors on specific lines may not be as useful as it is with the GLSL shaders, where errors match up exactly.

This is perhaps something that will require a bit more refactoring and feature modification at the higher level, to handle the cases for MSL compilation errors more elegantly.

The one thing I have not actioned from this feedback is using GPULogParser to handle the Metal shader errors. This is a little more intricate and involved than would first seem, as the error message semantics are quite different to the ones returned by the GLSL compiler. Similarly, as the shader is wrapped and modified verses the input source, returning errors on specific lines may not be as useful as it is with the GLSL shaders, where errors match up exactly.

This is perhaps something that will require a bit more refactoring and feature modification at the higher level, to handle the cases for MSL compilation errors more elegantly.

I don't think this is a huge problem. For now, we could do without the error line (it is not mandatory for an error to be printed) and could just forward the errors the compiler outputs.

Later on, we could consider adding markers (as comments) inside the generated shader to know which file we are in.

This revision is now accepted and ready to land.Sep 1 2022, 11:57 AM