Page MenuHome

Reduce lock stepping caused by profiling on CPU.
AbandonedPublic

Authored by William Leeson (leesonw) on Nov 11 2021, 11:32 AM.

Details

Summary

Only use profiling when it is enabled otherwise don't use it.

Some performance data on the three patches

patchJuans Scene UI 256 samplesJuans Scene bg 256 samplesjunkshop UIjunkshop bg
No patch6:16.594:05.372:08.481:59.7
D131874:12.153:57.362:07.251:58.16
D131854.11.183:54.742:07.441:58.03
D131904:12.393:55.422:07.621:58.68

UI - means rendered from within Blender
bg - means rendered from the command line using blender -b scene.blend -f 1

Diff Detail

Repository
rB Blender
Branch
slowCPU (branched from master)
Build Status
Buildable 18590
Build 18590: arc lint + arc unit

Event Timeline

William Leeson (leesonw) requested review of this revision.Nov 11 2021, 11:32 AM
William Leeson (leesonw) created this revision.
Sergey Sharybin (sergey) requested changes to this revision.Nov 11 2021, 11:49 AM

For the performance related changes I'd expect to see some quantification and how it was obtained.
My guess that this is part of T92601. Does this change bring any performance improvement?

intern/cycles/integrator/path_trace.cpp
402

Avoid white-space changes.

781

Same as above.

intern/cycles/integrator/path_trace.h
60

Here and in other places the indentation is broken: tabs are used instead of spaces.
You might want to configure the IDE to use spaces, and also re-ensure clang-format is used.

intern/cycles/integrator/path_trace_work_cpu.cpp
83–86

Code style.

89

Avoid dead code.

116–119

Code style.

intern/cycles/integrator/shader_eval.cpp
109–113

This will make it impossible to quickly cancel evaluation, Might not be that problematic for F12 renders, but for viewport it could introduce unwanted lag in refresh rate.

If this is a hotspot we should consider doing something similar to PathTraceCPU where the check is done based on boolean flag.

This revision now requires changes to proceed.Nov 11 2021, 11:49 AM

For the case where profiling is enabled we should try to improve performance, but that can be done separate from this bugfix. Probably we can only mark the state as active/inactive PathTraceWorkCPU::render_samples, and do the intialization/merging once per render.

intern/cycles/integrator/path_trace.h
60

I don't think we have to pass in use_profiling. I think instead Profiler::add_state and Profiler::remove_state can check if the profiler has been started, and if not, do nothing.

William Leeson (leesonw) marked 7 inline comments as done.

Removed dead code and restored cancel.

Some performance data on the three patches

patchJuans Scene UI 256 samplesJuans Scene bg 256 samplesjunkshop UIjunkshop bg
No patch6:16.594:05.372:08.481:59.7
D131874:12.153:57.362:07.251:58.16
D131854.11.183:54.742:07.441:58.03
D131904:12.393:55.422:07.621:58.68

UI - means rendered from within Blender
bg - means rendered from the command line using blender -b scene.blend -f 1

Did you accidentally swap some columns? Not sure why this patch would make UI rendering slower otherwise.

Thanks for the number! Suggest moving them to the patch description (which will also end up in the commit message, communicating results to those who follow those logs ;).

The background mode speedup seems impressive! But what's up with the UI case form the first column becoming slower?

If we can avoid passing use_profiling that'd be great.

We can avoid passing use_profiling and avoid the per-thread overhead by adding an Profiler::active() { return worker != nullptr; } method and then checking if (device_->profiler.active()).

Yes @Sergey Sharybin (sergey) I had swapped the columns not sure what happened there. @Brecht Van Lommel (brecht) I'll try the method to the device to see what happens.

It was decided to go with D13190 instead.