Page MenuHome

T61752: Compile Kernels During Scene Update
ClosedPublic

Authored by Jeroen Bakker (jbakker) on Feb 28 2019, 4:14 PM.

Details

Summary

The main goals of this change is faster starting with foreground
rendering. User will be able to continue working when new kernels are
needed.

This patch will build optimized kernels during the update process of
the scene. When these optimized kernels are not available (yet) an AO
kernel will be used for foreground/interactive mode.

These AO kernels are really fast to compile (3-7 seconds) and can be
reused by all scenes. When the optimized kernals become available we
will switch to these optimized kernels.

When adding features to the scene that required recompilation of the
optimized kernels we switch back to the AO kernels until the new
optimized kernels become available.

In background mode the AO kernels will not be used.
Some kernels are being used during Scene update (displace, background
light). When these kernels are being used the process can halt until
these become available.

Diff Detail

Repository
rB Blender
Branch
arcpatch-D4428 (branched from blender2.7)
Build Status
Buildable 3064
Build 3064: arc lint + arc unit

Event Timeline

Jeroen Bakker (jbakker) planned changes to this revision.Feb 28 2019, 4:22 PM
Jeroen Bakker (jbakker) added inline comments.
intern/cycles/kernel/kernel_types.h
131

Remove __BRANCHED_PATH__ As it will not be used

intern/cycles/render/session.cpp
901

Use an ENUM

USES_PREVIEW_KERNEL_WAITING_FOR_OPTIMIZED_KERNEL,
USES_PREVIEW_KERNEL_OPTIMIZED_KERNEL_AVAILABLE
USES_OPTIMIZED_KERNEL

We could provide feedback via progress

  • Check all scenes.
  • Perform stress tests

D4428

  • better state evaluation of the current kernel
  • remove branched_path from the preview kernel
  • Codestyle: remove empty lines
Brecht Van Lommel (brecht) requested changes to this revision.Mar 6 2019, 4:17 PM
Brecht Van Lommel (brecht) added inline comments.
intern/cycles/device/device.h
60–63

Maybe rename OPTIMIZED to FEATURE, perhaps slightly less confusing since we are going from a preview to the real render kernel, not for an unoptimized to an optimized one.

intern/cycles/device/opencl/opencl.h
368–377

This could be deduplicated with the real kernel, by putting these in a struct.

intern/cycles/device/opencl/opencl_split.cpp
420–421

Shorten this line:

OpenCLProgram& program_split = device->use_preview_kernels ? device->program_preview_split : device->program_split;
cl_kernel kernel_state_buffer_size = program_split(ustring("path_trace_state_buffer_size"));
471–472

Same here.

681

Only need preview kernels when background == false.

899–900

This could be deduplicated with the real kernel.

929–935

Code style: should look like this with * at the start of each line:

/* Do not switch kernels for background renderings
 * We do foreground rendering but use the preview kernels
 * Check for the optimized kernels
intern/cycles/kernel/kernel_types.h
126

This line should not be dependent on preview kernel being used or not.

Instead do something like this:

#  if defined(__KERNEL_OPENCL_AMD__) || defined(__KERNEL_OPENCL_INTEL_CPU__)
#    __CL_USE_NATIVE__
#  endif

#  ifdef __KERNEL_OPENCL_PREVIEW__
...
#  else
...
#  endif
intern/cycles/render/session.cpp
905

Rename to Compiling render kernels.

908

I think this can also have Compiling render kernels as message. The "waiting" thing is more an implementation detail.

intern/cycles/util/util_task.cpp
155

This unlock is not needed, can just return num == 0.

This revision now requires changes to proceed.Mar 6 2019, 4:17 PM
Jeroen Bakker (jbakker) marked 11 inline comments as done.
  • D4428: Background Compilation OpenCL Kernels
intern/cycles/device/opencl/opencl_split.cpp
929–935

I will go over the code to address this code style in a separate commit.

Brecht Van Lommel (brecht) requested changes to this revision.Mar 7 2019, 12:11 PM

Testing this I noticed a few issues:

  • When you enable the rendered viewport with the kernel cache cleared, it compiles the preview and final kernel in parallel. The result for me is that the preview kernel is followed almost immediately by the final kernel and not very useful.
  • Then if you enable hair, it spends some time compiling and then shows the preview kernel followed by the final kernel. Maybe it's best not to revert to the preview kernel at all, but if we do it should be immediate.
  • If you start Blender again with both kernels cached, it still flashes briefly with the preview kernel. It should be able to detect that the final kernel is available immediately.
This revision now requires changes to proceed.Mar 7 2019, 12:11 PM
Jeroen Bakker (jbakker) updated this revision to Diff 14045.EditedMar 7 2019, 3:32 PM

D4428: Split loading and compiling of OpenCL

1. The user might see a white flash when kernels are available on disk,
but not yet in the device cache. To remove this flash we first load the
kernels via the main thread. Only when the kernels could not be loaded
via disk or cache we start the background compile jobs.

2. Removed the switch back to preview kernels when a feature would
need new kernels. We will monitor user feedback to see if we need to
change this to a better solution.

3. Kernels are compiled in two phases. First the required kernels
When the required kernels are available the feature kernels are
compiled.

Note to see the mentioned artifacts you need to use a very small scene (default cube)

Jeroen Bakker (jbakker) planned changes to this revision.Mar 7 2019, 4:21 PM

barbershop viewport rendering is failing. It cannot load path_init. previously it used to compile it and it was available but this should be fixed.

  • Codestyle: remove empty lines
  • D4428: Background Compilation OpenCL Kernels
  • Codestyle
  • D4428: Split loading and compiling of OpenCL
  • D4428: Small fixes
  • Codestyle: remove empty lines
  • D4428: Background Compilation OpenCL Kernels
  • Codestyle
  • D4428: Split loading and compiling of OpenCL
  • D4428: Small fixes
  • D4428: Base program was not always compiled

Does this solve the mentioned issues now, or is further work needed?

I can't tell from the "updated this revision to" messages.

This patch is ready for review. Sorry for the confusion

Rebased with latest blender2.7 branch

Harbormaster completed remote builds in B3110: Diff 14131.

Compile times before the preview kernel shows up is like this for me:

Kernel compilation of split_do_volume finished in 1.23s.
Kernel compilation of split_subsurface_scatter finished in 1.38s.
Kernel compilation of base finished in 0.80s.
Kernel compilation of split_direct_lighting finished in 1.32s.
Kernel compilation of split_lamp_emission finished in 1.49s.
Kernel compilation of split_shadow_blocked_dl finished in 1.71s.
Kernel compilation of split_indirect_background finished in 1.47s.
Kernel compilation of split_holdout_emission_blurring_pathtermination_ao finished in 1.71s.
Kernel compilation of split_shader_eval finished in 1.47s.
Kernel compilation of split_shadow_blocked_ao finished in 1.84s.
Kernel compilation of split_bundle finished in 3.00s.
Kernel compilation of background finished in 8.05s.
...
Kernel compilation of split_bundle finished in 7.80s.
Kernel compilation of split_shader_eval finished in 9.90s.
Kernel compilation of split_holdout_emission_blurring_pathtermination_ao finished in 11.77s.
Kernel compilation of split_indirect_background finished in 12.62s.
Kernel compilation of split_lamp_emission finished in 12.77s.
Kernel compilation of split_direct_lighting finished in 13.27s.
Kernel compilation of split_shadow_blocked_ao finished in 13.60s.
Kernel compilation of split_shadow_blocked_dl finished in 17.48s.
Kernel compilation of split_subsurface_scatter finished in 19.33s.

Seems like that background includes all features and slows down the whole thing?

Automatically switching from the preview to the full kernel also seems broken when rendering has finished by the time the full kernel is available. This can be reproduced with the default cube and preview samples set to 1.

Other than that is seems to be working well now.

I will alter the state engine in the session.cpp so the system restarts when kernels become available. I agree that this is undesired behavior for the user.

The background kernel is used for MIS during scene update. As MIS is by default on this kernel is required during the scene update. The same happens with true displacement. I chose in the current implementation if there will be a freeze to have it as soon as possible so the rest of the user experience would be least intrusive.

Since D4485 there are 2 variations of the background kernels (Experimental and Supported)
For displacement there are 8 Variations (Experimental, patch evaluation, use shader raytrace)

I thought about the following solutions:

  1. (Current situation) During preview we will wait for these kernels (there are only 8 variations) this will have a pause during the update_scene before the first sample is calculated. But will lead to no freezes later on.
  2. Rebuild scene in active session when render kernels become available. Will freeze the user interaction.
  3. Rebuild in a different session and switch sessions: When doing right this is the most friendly in user interaction, but leads to increase of memory. Pro is earlier feedback of the preview kernels as real displacement will not be used during preview. most impact in session.cpp and blender_session.cpp. Will lead to more visibility of this patch to other parts of Cycles.

I am not found of having multiple sessions due to the memory requirements and how to cleanly solve this in the current codebase.
I am also not found of freezing after a pixel has been drawn.

Please advice on a different approach and the next step.

As discussed face2face with @Brecht Van Lommel (brecht)

For the background kernel it would be nice that

  1. no MIS will be used when the background is a static color.
  2. no background kernel will be needed for the preview kernel.

We don't see these changes as part of this patch as they impact the update of the scene a lot; we leave current implementation in place.

Reset after feature kernel becomes available

Reset the session when feature kernels become available and the preview kernels have finished rendering.

This revision is now accepted and ready to land.Mar 15 2019, 4:01 PM

Tested and could not find any more issues, looks good to go.

This revision was automatically updated to reflect the committed changes.