
Cycles: Fix kernel compilation errors with certain drivers
ClosedPublic

Authored by Sergey Sharybin (sergey) on Mar 8 2017, 1:48 PM.

Details

Summary

Was using AMD CPU platform to get things compiled here, but similar issues
were happening with Intel OpenCL as well.

Don't currently have access to Intel OpenCL to verify that, but while this
patch solves the compilation error for the AMD CPU platform, it still behaves
in a really flaky way: sometimes rendering just gets stuck or renders emptiness.

Doesn't seem to be an issue with the patch, since CUDA Split, NVIDIA OpenCL
Split and AMD OpenCL Split are still rendering things correctly here.

This patch might also fix the compilation error with older OpenCL drivers on
Linux, but with those we cannot guarantee anything.

Diff Detail

Repository
rB Blender

Event Timeline

Sergey Sharybin (sergey) retitled this revision to Cycles: Fix kernel compilation errors with certain drivers.
Sergey Sharybin (sergey) updated this object.

Looks fine I guess, would be nice to deduplicate things more but not so important.

intern/cycles/kernel/kernels/cpu/kernel_cpu_impl.h
181 ↗(On Diff #8401)

Maybe this and the previous can be combined?

192 ↗(On Diff #8401)

Don't need the ;

Sergey Sharybin (sergey) edited edge metadata.

Update patch against latest master

Why bother?

Currently the amdgpu driver only supports GCN 1.2+ cards; all other cards
are in a semi-unsupported state, and are fully unsupported by AMD itself.
There is no chance of amdgpu supporting them any time soon.

In the meantime this patch allows fglrx to compile the kernels. There
might be some rendering artifacts which we probably won't be able to
solve, but for some users OpenCL might work just fine.

This patch is very useful for working on the kernel on the train with a laptop, thank you. But it's very slow using the AMD 17.3.1 driver with an Intel CPU as the device. The default cube at 50% HD and 10 spp takes 8 minutes to render. Would it be possible to make it render at a more normal speed?

I'm not sure why it causes a slowdown; this patch is more about getting things to work than about speed. How does the speed compare to 2.78c?

In any case, we are in a very restricted environment here, where we need the latest drivers to get decent speed on newer hardware. If the performance on older drivers (which are declared deprecated without bringing any usable replacement) is so poor, I'm afraid we can't do much from the Blender side.

Update against latest master and also try using safer changes around ccl_local

@Sergey Sharybin (sergey) thanks for the update, I will test and report. The slowdown is with the latest driver, but with the CPU device. GPU is as fast as usual. But when working on a laptop, it's great to be able to compile the kernel for CPU. With master, it won't compile. With this patch it compiles. The only problem is that OpenCL CPU is (hopefully was) approximately 400x slower than a normal CPU render. It may be a driver bug, but Luxcore works great on OpenCL CPU on the same machine with the same driver.

The new version has the same problem: it works but is insanely slow, making testing nearly impossible. Can someone else test on an Intel CPU with Radeon Crimson 17.3.1 (driver from March 2017)?

Not entirely happy with how locals are done, but not sure there's really a good way to go. Maybe have DEFINE_SPLIT_KERNEL_FUNCTION_LOCALS_$N and have kernel functions like kernel_func(KernelGlobals *kg, ccl_local unsigned int* local_1, unsigned int* local_2, ...)? That way we get errors if the number of locals falls out of sync.
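To make the suggestion above concrete, here is a minimal CPU-side sketch. It is not the code from this patch: the `KernelGlobals` stand-in, the empty `ccl_local` define, the kernel names `one_local` / `two_locals`, and the `kernel_cpu_` wrapper prefix are all assumptions made up for illustration. The point is only that each `_LOCALS_$N` macro declares exactly N locals and forwards them, so a kernel whose signature expects a different number of locals fails to compile.

```cpp
#include <cassert>

// Stand-ins for the real Cycles definitions (hypothetical, illustration only).
struct KernelGlobals {
  unsigned int result;
};
#define ccl_local /* would be __local on OpenCL; assumed empty on CPU */

// Kernel bodies receive their local buffers as explicit pointer arguments.
inline void kernel_one_local(KernelGlobals *kg, ccl_local unsigned int *local_1)
{
  *local_1 += 1;
  kg->result = *local_1;
}

inline void kernel_two_locals(KernelGlobals *kg,
                              ccl_local unsigned int *local_1,
                              ccl_local unsigned int *local_2)
{
  *local_1 = 2;
  *local_2 = 3;
  kg->result = *local_1 + *local_2;
}

// One wrapper macro per local count. Using the wrong macro for a kernel
// gives a "too few/many arguments" compile error instead of silent misuse.
#define DEFINE_SPLIT_KERNEL_FUNCTION_LOCALS_1(name) \
  inline void kernel_cpu_##name(KernelGlobals *kg) \
  { \
    ccl_local unsigned int local_1 = 0; \
    kernel_##name(kg, &local_1); \
  }

#define DEFINE_SPLIT_KERNEL_FUNCTION_LOCALS_2(name) \
  inline void kernel_cpu_##name(KernelGlobals *kg) \
  { \
    ccl_local unsigned int local_1 = 0; \
    ccl_local unsigned int local_2 = 0; \
    kernel_##name(kg, &local_1, &local_2); \
  }

DEFINE_SPLIT_KERNEL_FUNCTION_LOCALS_1(one_local)
DEFINE_SPLIT_KERNEL_FUNCTION_LOCALS_2(two_locals)

// Usage: each wrapper owns its locals and forwards them to the kernel.
inline unsigned int run_demo()
{
  KernelGlobals kg = {0};
  kernel_cpu_one_local(&kg);
  assert(kg.result == 1);
  kernel_cpu_two_locals(&kg);
  return kg.result;
}
```

Swapping the two macro invocations (for example `DEFINE_SPLIT_KERNEL_FUNCTION_LOCALS_1(two_locals)`) would stop the build at the forwarding call, which is the "errors if the number of locals falls out of sync" behaviour asked for. Type mismatches would be caught the same way if the macros also took the local's type as a parameter.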

Otherwise, though, I think we should get this into master, as we keep getting reports related to this.

Made it harder to screw up atomics/locals, their number and type

lgtm

intern/cycles/kernel/kernels/cuda/kernel_split.cu
96 ↗(On Diff #8426)

Don't need this, I think.

This revision is now accepted and ready to land. Mar 16 2017, 11:21 AM

Remove unused macro

This revision was automatically updated to reflect the committed changes.