Page MenuHome

Cycles denoising on CPU buffer overflow
Closed, ArchivedPublicBUG

Description

System Information
Operating system: any

Blender Version
Broken: (master, as of now, rB35c707684befb2fe823ab1bfa002b34785c31841), intern/cycles/kernel/filter/filter_nlm_cpu.h

Short description of error
When doing denoising sometimes cycles might read values that are out of memory buffer

Exact steps for others to reproduce the error
Issue lies in methods:
kernel_filter_nlm_calc_difference, method, part: int aligned_lowx = rect.x & (~3);
kernel_filter_nlm_update_output, method, part: int aligned_lowx = round_down(rect.x, 4);

This happens when aligned_lowx + dx becomes less than zero.

For example if method get rect.x = 5 value and dx is -5, then after rounding down aligned_lowx becomes 4, and after that aligned_lowx + dx becomes -1, then when reading image we get something like this load4_u(image, -1):

ccl_device_inline void kernel_filter_nlm_update_output(int dx, int dy, ..., int4 rect, ...) {
  nlm_blur_horizontal(difference_image, temp_image, rect, stride, f);

  int aligned_lowx = round_down(rect.x, 4); // Becomes 4
  for (int y = rect.y; y < rect.w; y++) {
    for (int x = aligned_lowx; x < rect.z; x += 4) {
      int4 x4 = make_int4(x) + make_int4(0, 1, 2, 3);
      int4 active = (x4 >= make_int4(rect.x)) & (x4 < make_int4(rect.z));

      int idx_p = y * stride + x, idx_q = (y + dy) * stride + (x + dx); // idx_q - becomes -1

      float4 weight = load4_a(temp_image, idx_p);
      load4_a(accum_image, idx_p) += mask(active, weight);

      float4 val = load4_u(image, idx_q); // we try to read image at position -1
      if (channel_offset) {
        val += load4_u(image, idx_q + channel_offset);
        val += load4_u(image, idx_q + 2 * channel_offset);
        val *= 1.0f / 3.0f;
      }

      load4_a(out_image, idx_p) += mask(active, weight * val);
    }
  }
}

Possible fix would be (but I am not sure if it is correct in case of denosing algorithm):

int aligned_lowx = round_down(rect.x, 4);
if (aligned_lowx + dx < 0) {
    aligned_lowx += 4;
}

Not sure I can reproduce easily as I was cross-compiling cycles and then running it on ARM.

Event Timeline

Tautvydas Andrikys (esminis) renamed this task from Cycles denoising on GPU buffer overflow to Cycles denoising on CPU buffer overflow.Oct 10 2019, 11:57 AM
Tautvydas Andrikys (esminis) updated the task description. (Show Details)
Ankit Meel (ankitm) changed the task status from Needs Triage to Needs Information from User.Feb 25 2020, 6:17 PM

Is it still an issue ? Can it be reliably redone ?

I could reproduce it reliably, now I have not checked again, but from code I still see that there is no changes to mitigate this in at least kernel_filter_nlm_update_output method.

So, if calls to kernel_filter_nlm_update_output were not changed to make sure out of bounds coordinates are not passed in, then yes issue still exists.

Richard Antalik (ISS) changed the task status from Needs Information from User to Needs Information from Developers.Feb 26 2020, 10:37 PM
Richard Antalik (ISS) changed the task status from Needs Information from Developers to Needs Information from User.Mar 20 2020, 12:54 AM

I would recommend sending patch via https://developer.blender.org/differential/diff/create/

Or please provide .blend file with reproducible case.

Brecht Van Lommel (brecht) changed the task status from Needs Information from User to Confirmed.Mar 25 2020, 11:03 PM
Brecht Van Lommel (brecht) changed the subtype of this task from "Report" to "Bug".

I will try once I have time for this, as this probably is not so easy to reproduce using .blend

Also I doubt it is good idea for me to submit patch as this is essentially a quick-fix and not a proper one (as I don`t fully know how to properly clip coordinates here)

The NLM denoiser was removed in Blender 3.0, so this problem will not longer happpen there.