Details
- Reviewers
Clément Foucault (fclem) Sergey Sharybin (sergey) Brecht Van Lommel (brecht) - Maniphest Tasks
- T92212: Cycles Metal device
T96261: Metal Viewport - Commits
- rB009f7de619e6: Cleanup: use better matching integer types for graphics interop handle
rBb132e3b3ce1c: Cycles: use GPU module for viewport display
Diff Detail
- Repository
- rB Blender
- Branch
- viewport_commits/GPU_cycles_blit_2
- Build Status
Buildable 23887 Build 23887: arc lint + arc unit
Event Timeline
| intern/cycles/blender/display_driver.cpp | ||
|---|---|---|
| 90 | Compilation errors are logged to console ? | |
| 119 | If the shader was bound using GPU_shader_bind one would intuitively expect GPU_shader_unbind be used to unbind it. | |
| 204 | I do not see the point of such shortening. | |
| 248 | Use min and max from the Cycles math.h utilities. | |
| 286 | This is not really an ID anymore. | |
| 289 | ||
| 305 | Don't just shorten existing variables. | |
| 343 | Why the 256 bytes is so important? What does it mean "to enable allocation" ? | |
| 344 | While currently this is OK, more correct would be be to use size_t instead of uint here. | |
| 345 | Use max from the ccl namespace. | |
| 394 | ||
| 397 | ||
| 498 | Is this something planned for this patch? | |
| 654–656 | Would it make sense, or even possible, to allocate fences as part of the rest of the GPU resources (as opposite of doing on-demand) ? | |
| 740 | Why was this changed to the immediate mode emulation? | |
| source/blender/gpu/metal/mtl_state.hh | ||
| 95 | ||
Overall seems fine. There are some stylistic comments. Also, seems that the patch needs to be update against the latest master branch: i had some merge conflicts in the Metal texture header. Didn't look into details though.
Tested on Linux and Mac and did not see any issues. Although, not sure if the Metal code path was actually used on Mac.
Thank you for the feedback! I'll address the requested changes.
| intern/cycles/blender/display_driver.cpp | ||
|---|---|---|
| 90 | GPU module shader compilation should have its own error logger via gl_shader_log.cc for OpenGL. Would this suffice, or is other logging preferred? Thanks! | |
| 343 | For Metal, minimum buffer allocation size generally needs to be 256 bytes. This detail could be moved to the backend, but the key is to ensure that we can create a valid allocation, if the is what the remaining code expects. This is also a catch-all against a zero-sized guard, to ensure that the path functions as expected. The alternative here would be to guard creation if either dimension ends up as zero. Let me know if there is a preference either way? | |
| 344 | Will update, thanks | |
| 498 | Ah yes, this is possibly left open to discussion on whether GPUVertBuf via GPUBatch is preferred over GPU_Immediate? Given this is effectively just drawing a quad, the performance overhead of GPU_Immediate is negligible and this also simplifies the implementation detail. However, using GPUVertBuf may be preferred to keep the sentiment of the original code. Please let me know what the preference is? | |
| source/blender/render/intern/engine.cc | ||
| 1267 | Ah yes, this is probably a safe bet. However the pattern does seem to vary a little. comp->gl_context = WM_opengl_context_create(); comp->gpu_context = GPU_context_create(NULL); GPU_context_active_set(NULL); WM_opengl_context_activate(DST.gl_context); GPU_context_active_set(DST.gpu_context); Though ensuring this context is active prior to GPU Context creation appears to make more sense, to ensure the backing OpenGL context is active before resources are created. Should be reasonable to modify the code accordingly. | |
| intern/cycles/blender/display_driver.cpp | ||
|---|---|---|
| 740 | This was done for simplicity of the code, as it would seem the cost overhead of performing GPUBatch rendering with a pre-baked vertex buffer is more expensive than using GPUImmediate for very small vertex counts, especially if they are subject to data changing relatively frequently. However, please let me know if you prefer this to be written using GPUBatch + GPUVertBuf, as this can be changed. | |
| source/blender/render/intern/engine.cc | ||
| 1267 | From testing, adding a WM_opengl_context_activate(comp->gl_context) before GPU_context_creation fails on the assertion: BLI_assert(GPU_framebuffer_active_get() == GPU_framebuffer_back_get());``` due to having no valid GPU context active at the time.
```void WM_opengl_context_activate(void *context)
{
BLI_assert(GPU_framebuffer_active_get() == GPU_framebuffer_back_get());
GHOST_ActivateOpenGLContext((GHOST_ContextHandle)context);
}As DRW_opengl_context_release() called above disables the active GPU_context, which is later restored in the call to: DRW_opengl_context_activate Given the code does not immediately require usage of the newly created GPU_context, then perhaps this is not a problem as-is. However, if we did want to ensure the backing GL context was active during context creation, then we could:
Or, please let me know if there is an alternative preferred approach for this. Thanks! | |
| intern/cycles/blender/display_driver.cpp | ||
|---|---|---|
| 90 | Use the GPU module's own error logging is fine, since this is code part of Blender. | |
| 343 | Ensuring GPU_pixel_buffer allocates at least 256 in the backend seems needed for consistency between platforms then, since small buffers could run into the same issue, not just zero sized ones. If we need additional changes to reject zero sized buffers in the GPU API I don't know. The main thing is that we either reject or gracefully handle such cases in the GPU API, the same way on platforms. I personally don't have a preference, not familiar enough with the work on the GPU API. | |
| 498 | I think GPU immediate mode is fine if the code is simpler and performance is not impacted, as in this case. We just could not use it before with direct OpenGL calls. At least that is my understanding, unless @Clément Foucault (fclem) wants to move away from GPU_immediate.h in general. | |
| source/blender/render/intern/engine.cc | ||
| 1267 | I think we can leave the code as is then for this patch. Not sure what exactly the plan is for Metal contexts, but in general this code should be refactored at some point. Removing "opengl" from the names, consistent order of creation/activation, platform consistency so we don't need #ifndef _WIN32 and better comments explaining the different types of contexts and how to use them. But that's beyond the scope of this patch. | |
| intern/cycles/blender/display_driver.cpp | ||
|---|---|---|
| 562–564 | Can become simply GPU_fence_wait(gpu_render_sync_); (without check) ? | |
| 617–626 | Use DCHECK_NOTNULL/CHECK_NOTNULL check that something is not null. | |
| 740 | Thanks for the explanation. If there is no speed penalty on both OpenGL and Metal, and the GPU_immediate.h is not planned to be deprecated with the coming development in the GPU module your suggested code is indeed simpler. | |
| 922–923 | Add an assert, or check for nullptr and return false? Also, if the sync objects are created here, checks before sync calls can be removed in the file? | |
Looks good to me now, also seems to work fine in testing.
Will set @Clément Foucault (fclem) as blocking reviewer for the gpu module changes.
Also, let's land this in master after we branch for Blender 3.4 (October 26), since there is not good reason to have it in that release and there is some risk of breaking things. Better leave it for Blender 3.5 where we have more time for testing.