
CUDA ERROR: Out of memory in cuLaunchKernel when rendering on GPU GTX 580 Branched Path Tracing / probably sm_20 issue
Closed, Archived (Public)

Description

System Information
Windows 7 x64, OS X 10.9, OS X 10.10.3 (the issue is OS independent). GPU: GTX 580 1.5GB + GTS 450 1GB

Blender Version
Broken: 2.74-000dfc0 official release, 2.74-9e9a3cb, 2.73a-bbf09d9
Worked: 2.69 r60995

Short description of error
CUDA ERROR: Out of memory in cuLaunchKernel when rendering on GPU GTX 580 1.5GB and using Branched Path Tracing

Exact steps for others to reproduce the error
Open Blender, load factory settings, enable CUDA GPU in Blender settings.
Switch to Cycles render engine, enable Compute Device: GPU.
Switch to Branched Path Tracing, enable square samples, set AA Render samples to 8, Diffuse to 3, Glossy to 3.
Select the Lamp and set its Samples to 8.
Hit Render

- scene created in 2.74: crashes
- scene created in 2.69: renders OK
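
With square samples enabled, Cycles squares the sample values entered in the UI, so the settings in the steps above are heavier than they look. A minimal sketch of that arithmetic, assuming the squaring applies to the AA, diffuse, glossy and lamp values alike. The function and dictionary names are hypothetical and not part of Blender's API:

```python
def effective_samples(ui_values, square_samples=True):
    """Return the sample counts actually used for the given UI values."""
    if not square_samples:
        return dict(ui_values)
    # Assumption: every sample value is squared when the option is on.
    return {name: count * count for name, count in ui_values.items()}


# Values taken from the reproduction steps above.
ui_values = {"aa": 8, "diffuse": 3, "glossy": 3, "lamp": 8}
effective = effective_samples(ui_values)

for name, count in effective.items():
    print(f"{name}: {ui_values[name]} in the UI, {count} effective")
```

This only illustrates the sample counts involved in the test scene; it makes no claim about how they relate to memory use.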

Event Timeline

Assigning to Sergey to check, but I'm not sure we can do much about this; the kernel changed quite a bit between 2.69 and 2.74…

Bastien Montagne (mont29) lowered the priority of this task from 90 to Normal. Apr 24 2015, 2:30 PM

When I was doing some more tests, I noticed that Branched Path Tracing uses much more vRAM than Normal Path Tracing. Is that normal?

2.69 Path Tracing: 372MB vRAM
2.69 Branched Path Tracing: 751MB vRAM
2.74 Path Tracing: 848MB vRAM
2.74 Branched Path Tracing: 1263MB vRAM

I managed to render the scene in Blender 2.74 by unplugging all monitor cables from the GTX 580 and using it only for rendering.
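
The figures above can be put side by side to see how little room is left on this card. A rough sketch, assuming the GTX 580's 1.5 GB of VRAM means 1536 MB and that the reported figures are directly comparable. The variable names are illustrative only:

```python
CARD_VRAM_MB = 1536  # assumption: a 1.5 GB GTX 580 exposes 1536 MB

# vRAM usage reported above, keyed by (Blender version, integrator).
usage_mb = {
    ("2.69", "path"): 372,
    ("2.69", "branched"): 751,
    ("2.74", "path"): 848,
    ("2.74", "branched"): 1263,
}

for (version, integrator), used in sorted(usage_mb.items()):
    headroom = CARD_VRAM_MB - used
    print(f"{version} {integrator:<9}: {used:>5} MB used, {headroom:>5} MB left")

# Growth between releases for each integrator.
growth_path = usage_mb[("2.74", "path")] - usage_mb[("2.69", "path")]
growth_branched = usage_mb[("2.74", "branched")] - usage_mb[("2.69", "branched")]
print(f"path tracing grew by {growth_path} MB, branched by {growth_branched} MB")
```

Under these assumptions, branched path tracing in 2.74 leaves 273 MB free, which would be consistent with the render succeeding only once the card no longer drives a display.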

Yes, it is normal that branched path tracing can use more VRAM. It's a different code path in the kernel, which requires a different amount of memory per GPU thread.

Some optimization happened during this release cycle, so you might want to have a look into a newer build: https://builder.blender.org/download/

I'll have a closer look into the report this week, but it's probably just increased VRAM usage caused by more features being enabled in the GPU kernel, and the only way to solve that is by splitting the kernel (work on this is actually in progress).

Thank you for explaining the problem. I'll try the latest build of Blender and see what happens. I only wonder why the scene rendered OK on 2 GTS cards with 1GB VRAM; they have CUDA compute capability 2.1, while the GTX 580 has 2.0. Thank you once again :)

Sergey Sharybin (sergey) changed the task status from Unknown Status to Unknown Status. May 1 2015, 11:44 AM

Older shading models might require more registers to do the same operations.

Also, I checked the kernel statistics: the branched path tracing kernel indeed has higher memory requirements. It is probably possible to reduce the memory footprint there, but that is better done after the kernel split work is fully finished (because splitting the kernel should already make the GPU happier).

So thanks for the report, but this is just an optimization that is yet to happen.