Just bouncing some ideas via code, it's a working patch though. More info in the related bug (T41222).
Details
- Reviewers: Sergey Sharybin (sergey)
- Commits:
  - rBS983cbafd1877: Final Fix T41222 Blender gives weird ouput when baking (4096*4096) resolution…
  - rBSa48b372b0442: Fix T41222 Blender gives weird output when baking (4096*4096) resolution on GPU
  - rC7fcc57348b97: Final Fix T41222 Blender gives weird ouput when baking (4096*4096) resolution…
  - rC9ced308636a7: Fix T41222 Blender gives weird output when baking (4096*4096) resolution on GPU
  - rB983cbafd1877: Final Fix T41222 Blender gives weird ouput when baking (4096*4096) resolution…
  - rBa48b372b0442: Fix T41222 Blender gives weird output when baking (4096*4096) resolution on GPU
Diff Detail
- Repository: rB Blender
Event Timeline
Generally the idea seems fine; a bit of cleanup and I think it could go in.
| intern/cycles/render/bake.cpp | ||
|---|---|---|
| 132 | A bit arbitrary, it seems; where did the numbers come from? Just throwing ideas:
| |
| 133–134 | Add spaces around operators; the same applies to some cases below. Also, it reads nicely at the moment, but it seems you need to indent the whole loop body. | |
| 185 | Does it mean the progress bar will go from 0 to 1 for every "tile"? | |
| intern/cycles/render/bake.cpp | ||
|---|---|---|
| 132 | That was just to confirm that 3k * 3k would render while 4k * 4k would not. Taking the size from somewhere else is indeed the way to go. | |
| 133–134 | It is indented, but Phabricator doesn't show those code changes. | |
| 185 | Yes it does, and it's the main issue to be handled before this patch is to be considered for real. One thing that could work is to get the total count (all shader parts times their respective num_tasks) before starting the loop. | |
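The "total count before starting the loop" idea can be sketched roughly as follows. This is only an illustration under assumptions: `ShaderPart`, `total_task_count` and `bake_progress` are hypothetical names, not Cycles types or functions, and the real patch may organize this differently.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one shader part of the bake; not a Cycles type.
struct ShaderPart {
    int num_tasks;  // number of sub-tasks this part is split into
};

// Sum the sub-tasks of all parts up front, so progress can be reported
// against one global total instead of restarting at 0 for every part.
inline int total_task_count(const std::vector<ShaderPart> &parts)
{
    int total = 0;
    for (const ShaderPart &part : parts) {
        assert(part.num_tasks >= 0);
        total += part.num_tasks;
    }
    return total;
}

// Progress in [0, 1] after `done` sub-tasks have finished overall.
// An empty bake is reported as already complete.
inline float bake_progress(int done, int total)
{
    return (total > 0) ? static_cast<float>(done) / static_cast<float>(total) : 1.0f;
}
```

With a running `done` counter shared across all parts, the bar advances monotonically from 0 to 1 once per bake rather than once per "tile".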
| intern/cycles/render/bake.cpp | ||
|---|---|---|
| 185 | Would it work if we implement splitting in the add_task for shader jobs in CUDA device? | |
- fixing progress bar
The only missing part is to use scene tiles (or similar) instead of the hardcoded 512x512 (shader_limit).
@Sergey Sharybin (sergey) I went ahead and made the changes to get the progress bar working without having to implement that in CUDA specifically.
| intern/cycles/render/bake.cpp | ||
|---|---|---|
| 185 | That would be the ideal solution. This patch was actually meant to illustrate that. Though since I have no CUDA at hand (nor am I very knowledgeable in CUDA coding), I took this proof-of-concept approach. | |
- using tiles size as shader limit
@Sergey Sharybin (sergey) does Cycles have its own 'next power of two' function?
Anyway, it should work.
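For reference, a 'next power of two' helper is commonly written with the bit-smearing trick below. This is a generic sketch, not code taken from the Cycles sources; the name `next_power_of_two` and the 32-bit unsigned type are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not taken from the Cycles sources.
// Returns the smallest power of two that is >= x (returns 1 for x == 0).
inline uint32_t next_power_of_two(uint32_t x)
{
    // Larger inputs have no 32-bit power of two above them.
    assert(x <= 0x80000000u);
    if (x <= 1) {
        return 1;
    }
    // Smear the highest set bit of (x - 1) into all lower bits,
    // then add one to land on the next power of two.
    x--;
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return x + 1;
}
```

Values that are already powers of two map to themselves, so a 512 tile size stays at 512 while 513 rounds up to 1024.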
Using the tile size was just an idea. It works for the CUDA case, but for CPU would it mean doing the split on the baker side and then again on the device side?
Did you check if implementing task split for CUDA is easy/workable here?
| intern/cycles/render/bake.cpp | ||
|---|---|---|
| 143–144 | Why not keep it where it used to be and avoid a rather obscure loop? I don't really think you'll notice the non-linearity in the progress. | |
- Cycles Bake: split CUDA tasks for SHADER (baking)
UNTESTED! We need to see whether this works for CUDA, in particular for big files like 4k. This does not take tile size into consideration, so it's simply a matter of testing whether it fails with big images (4k * 4k) on the GPUs that were failing before.
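The splitting itself can be pictured as cutting one large shader evaluation into fixed-size ranges. The sketch below is an assumption-laden illustration: `ShaderChunk` and `split_shader_task` are hypothetical names, and the actual CUDA device code in the patch may look quite different.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical description of one sub-range of a shader task.
struct ShaderChunk {
    int offset;  // index of the first element in this chunk
    int size;    // number of elements in this chunk
};

// Split `total` elements into consecutive chunks of at most `shader_limit`
// elements each, so no single launch has to process the whole image at once.
inline std::vector<ShaderChunk> split_shader_task(int total, int shader_limit)
{
    assert(total >= 0 && shader_limit > 0);
    std::vector<ShaderChunk> chunks;
    int offset = 0;
    while (offset < total) {
        // The last chunk may be smaller than the limit.
        const int size = std::min(shader_limit, total - offset);
        chunks.push_back({offset, size});
        offset += size;
    }
    return chunks;
}
```

For a 4096 * 4096 bake with the 512 * 512 limit mentioned earlier in the thread, this yields 64 chunks of equal size.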
Hi, the new patch works on the 560Ti, on the 760, and on both cards together with a 4096 image size.
Tested with tile settings of 512 and 1024.
If I bake with both cards, the progress bar reaches the end immediately and doesn't change until the end of the bake process.
Opensuse 13.1/64
Intel i5 3770K
GTX 760 4 GB (Display)
GTX 560Ti 1.28 GB 448 Cores
Driver 340.24
Cheers, mib
EDIT: 2.71 is 3x faster on the GTX 760 than patched master; even my i5 is faster than baking with both CUDA cards.
Final patch, including the revert from master plus a squash of the other changes (2 commits in the end).
@Sergey Sharybin (sergey) @Wolfgang Faehnle (mib2berlin) can you do a final test on that?
I wasn't going to say anything but this is really disappointing. Dalai, you said on IRC today that you feel bad for people who are using CPUs to render, and yet you are still going to push this change through even though with a 2.7 GHz i5 I can't even consider rendering indirect lighting at 4096x4096. I can't even render indirect at 2048x2048. It would take me hours per model and now this is being made worse because someone with a GPU is having problems rendering at texture sizes that are completely unnecessary and unreasonable.
I don't have the option to upgrade the GPU on my iMac and most people are not going to be able to upgrade their laptops either. This is really bad news for a lot of people just to satisfy a few.
Blender developers have gone out of their way to make Blender work on almost any computer, but with Cycles bake many people would have to replace perfectly good machines just to use it because NVIDIA GPU owners are being given priority over everyone else.
Could you please at least post a patch that reverts this change so people can re-apply things the way they are now?
@marc dion (marcclintdion) I don't follow. Did you try the patch? Did it prevent a scene from baking? Is there a performance comparison? I said this patch adds some overhead to CPU baking, but nothing relevant if you bump the tile size.
Hi, my tests on CPU are fine on vanilla master hash 0b64126 with 4096x4096.
Bake combined, environment, AO, indirect diffuse in < 35 seconds.
Indeed 2.71 is faster; some bakes take only 50% of the time compared to master hash 0b64126 (combined).
As 4096x4096 is rare, it is not a problem; 2048 bakes in a few seconds on CPU.
Opensuse 13.1/64
Intel i5 3770K
GTX 760 4 GB (Display)
GTX 560Ti 1.28 GB 448 Cores
Driver 340.24
Cheers, mib
You can always tweak the tile size; I don't think it's such a huge issue. Using 2x memory is much worse IMO. It's the same behavior you are actually already experiencing with the final render and tile size.
Also, it's not just a 4K issue as far as I know; baking to four 1K images could also be a real issue. And one more thing: baking several 4K textures might be rather an issue on CPU as well.
So if it all works, and even if you need to tweak tiles for maximum boost, I think it's good to go.
After talking with dfelinto, I retested with 1024 and 2048 tiles and got bake times similar to the 2.71 release.
Cheers, mib
My reaction was overblown, I'm embarrassed about that. I had noticed that if I try to bake 20+ models all at the same time that only about a dozen of them would actually bake and the rest of the textures were the same as they were before the bake. From what I gather, this is supposed to be a solution to that problem.