Blender crashes accessing the OpenCL settings with ROCm AMD drivers
Closed, Archived (Public)

Description

System Information
Operating system: Linux-5.14.12-tkg-cacule-llvm-x86_64-AMD_Ryzen_7_3800X_8-Core_Processor-with-glibc2.33 64 Bits
Graphics card: AMD Radeon RX 5600 XT (NAVI10, DRM 3.42.0, 5.14.12-tkg-cacule-llvm, LLVM 12.0.1) AMD 4.6 (Core Profile) Mesa 21.2.2

Blender Version
Broken: version: 2.93.5, branch: master, commit date: 2021-10-05 12:04, hash: rBa791bdabd0b2
Broken: version: blender-2.93.6-candidate+v293.0930a70e8110-linux.x86_64-release
Worked: none

Short description of error
Accessing the OpenCL settings crashes Blender.

Exact steps for others to reproduce the error
Navigating to Edit -> Preferences -> System -> OpenCL always crashes Blender.

Tested with both 2.93.5 from blender.org and 2.93.6 daily from blender.org (10/17/2021)

Output in console:

./blender --debug --debug-gpu --debug-gpu-force-workarounds --debug-cycles --verbose 9  --debug-value 9
Switching to fully guarded memory allocator.
Blender 2.93.5
Build: 2021-10-06 06:26:10 Linux release
argv[0] = ./blender
argv[1] = --debug
argv[2] = --debug-gpu
argv[3] = --debug-gpu-force-workarounds
argv[4] = --debug-cycles
argv[5] = --verbose
argv[6] = 9
argv[7] = --debug-value
argv[8] = 9
Read prefs: /home/slavius/.config/blender/2.93/config/userpref.blend
read file
  Version 293 sub 18 date 2021-04-16 16:00 hash 463b38b0e0b0

GL: Forcing workaround usage and disabling extensions.
    OpenGL identification strings
    vendor: AMD
    renderer: AMD Radeon RX 5600 XT (NAVI10, DRM 3.42.0, 5.14.12-tkg-cacule-llvm, LLVM 12.0.1)
    version: 4.6 (Core Profile) Mesa 21.2.2

I1017 14:23:25.093932 68007 blender_python.cpp:195] Debug flags initialized to:
CPU flags:
  AVX2       : True
  AVX        : True
  SSE4.1     : True
  SSE3       : True
  SSE2       : True
  BVH layout : EMBREE
  Split      : False
CUDA flags:
  Adaptive Compile : False
OptiX flags:
  CUDA streams : 1
OpenCL flags:
  Device type    : ALL
  Debug          : False
  Memory limit   : 0
I1017 14:23:33.636139 68007 device_cuda.cpp:56] CUEW initialization failed: Error opening the library
I1017 14:23:34.619366 68007 device_opencl.cpp:48] CLEW initialization succeeded.
mesa: CommandLine Error: Option 'help-list' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
[1]    68007 IOT instruction (core dumped)
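
The `Option 'help-list' registered more than once` message is what LLVM typically prints when two copies of its command-line option registry end up in the same process, for instance if Mesa and the ROCm OpenCL runtime each bring their own LLVM. That this is the cause here is an assumption; the log alone does not confirm it. A minimal sketch for checking it: list the LLVM-looking shared objects mapped into a process, given the text of `/proc/<pid>/maps`. The function name `llvm_libraries` and the file-name heuristic are illustrative only, and a statically linked LLVM copy would not show up this way.

```python
import re

def llvm_libraries(maps_text):
    """Return sorted, de-duplicated paths of mapped files whose file name
    looks like an LLVM shared object (heuristic, name-based match)."""
    found = set()
    for line in maps_text.splitlines():
        # Layout of /proc/<pid>/maps: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a path
        path = fields[5].strip()
        name = path.rsplit("/", 1)[-1]
        # Match on the file name only, not on directory components.
        if re.search(r"llvm", name, re.IGNORECASE) and ".so" in name:
            found.add(path)
    return sorted(found)
```

On a running process this could be called as `llvm_libraries(open("/proc/<pid>/maps").read())`, with the PID filled in. More than one distinct path in the result would be consistent with the duplicate-registration theory.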

dmesg output:

[   34.503901] ------------[ cut here ]------------
[   34.503903] WARNING: CPU: 13 PID: 242 at drivers/gpu/drm/ttm/ttm_bo.c:409 ttm_bo_put+0x3fc/0x430 [ttm]
[   34.503908] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq af_packet amdgpu snd_hda_codec_realtek iommu_v2 snd_hda_codec_generic gpu_sched ledtrig_audio drm_ttm_helper mousedev snd_usb_audio ttm rapl btusb snd_hda_intel drm_kms_helper snd_intel_dspcfg btrtl snd_usbmidi_lib btbcm snd_hda_codec sysimgblt snd_rawmidi btintel input_leds syscopyarea snd_hda_core snd_seq_device led_class snd_hwdep sysfillrect joydev fb_sys_fops backlight snd_pcm bluetooth snd_timer cfbimgblt snd cfbcopyarea rfkill pcspkr cfbfillrect ecdh_generic fb mc soundcore i2c_piix4 fbdev k10temp ecc igb dca squashfs zlib_inflate loop vfat fat sch_fq_codel fuse configfs usb_storage hid_jabra hid_generic usbhid hid crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel xhci_pci xhci_pci_renesas xhci_hcd pinctrl_amd nct6775 hwmon_vid efivarfs unix ipv6
[   34.503937] CPU: 13 PID: 242 Comm: kworker/13:1 Not tainted 5.14.12-tkg-cacule-llvm #1
[   34.503939] Hardware name: Micro-Star International Co., Ltd. MS-7B93/MPG X570 GAMING PRO CARBON WIFI (MS-7B93), BIOS 1.A0 10/30/2020
[   34.503940] Workqueue: kfd_process_wq kfd_process_wq_release [amdgpu]
[   34.503975] RIP: 0010:ttm_bo_put+0x3fc/0x430 [ttm]
[   34.503977] Code: 00 4c 89 fa 5b 41 5c 41 5d 41 5e 41 5f 5d e9 9b 05 cf cc 8d 48 ff 09 c1 78 2e 4c 89 f7 5b 41 5c 41 5d 41 5e 41 5f 5d ff 62 f8 <0f> 0b e9 2e fc ff ff 48 89 d7 be 03 00 00 00 5b 41 5c 41 5d 41 5e
[   34.503978] RSP: 0018:ffffc900005c3ce0 EFLAGS: 00010202
[   34.503979] RAX: 0000000000000001 RBX: ffff8881cd42aa00 RCX: 0000000080400034
[   34.503980] RDX: ffff8881057779b8 RSI: ffffea0007350a80 RDI: ffff888105777858
[   34.503981] RBP: ffff88810a16d830 R08: 0000000000000001 R09: 0000000000000000
[   34.503982] R10: ffff8881cd42a880 R11: ffffffff00000000 R12: ffff88810a16d800
[   34.503982] R13: dead000000000122 R14: ffff88810a16d830 R15: ffff8881529e5258
[   34.503983] FS:  0000000000000000(0000) GS:ffff888feef40000(0000) knlGS:0000000000000000
[   34.503984] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   34.503985] CR2: 00007f2a7c062000 CR3: 0000000679e0a000 CR4: 00000000003506a0
[   34.503986] Call Trace:
[   34.503988]  amdgpu_amdkfd_gpuvm_free_memory_of_gpu+0x27f/0x2b0 [amdgpu]
[   34.504018]  kfd_process_device_free_bos+0xa1/0xd0 [amdgpu]
[   34.504049]  kfd_process_wq_release+0x280/0x340 [amdgpu]
[   34.504085]  process_one_work+0x2a5/0x3d0
[   34.504088]  worker_thread+0x34d/0x620
[   34.504090]  ? schedule+0xbe/0x240
[   34.504093]  kthread+0x212/0x240
[   34.504095]  ? rcu_free_pool+0x30/0x30
[   34.504097]  ? kthreadd+0x2a0/0x2a0
[   34.504098]  ret_from_fork+0x1f/0x30
[   34.504100] ---[ end trace c7b6a1315f7a2fd5 ]---

Installed packages:

$ eix --installed -c -r "amdgpu|rocm|ocm|mesa|xf86\-video|opencl|\-roc"
[I] dev-libs/opencl-icd-loader (2021.06.30@10/13/2021): Official Khronos OpenCL ICD Loader
[I] dev-libs/rocm-comgr (4.3.0(0/4.3)@10/14/2021): Radeon Open Compute Code Object Manager
[I] dev-libs/rocm-device-libs (4.3.0(0/4.3)@10/14/2021): Radeon Open Compute Device Libraries
[I] dev-libs/rocm-opencl-runtime (4.3.0(0/4.3)@10/14/2021): Radeon Open Compute OpenCL Compatible Runtime
[I] dev-util/opencl-headers (2021.06.30@10/17/2021): Unified C language headers for the OpenCL API
[I] dev-util/rocm-cmake (4.3.0(0/4.3)@10/14/2021): Radeon Open Compute CMake Modules
[I] media-libs/mesa (21.2.2@10/14/2021): OpenGL-like graphic library for Linux
[I] sys-devel/llvm-roc (4.3.0-r1@10/13/2021): Radeon Open Compute llvm,lld,clang
[I] virtual/opencl (3-r1@10/13/2021): Virtual for OpenCL API
[I] x11-apps/mesa-progs (8.4.0@10/03/2021): Mesa's OpenGL utility and demo programs (glxgears and glxinfo)
[I] x11-drivers/xf86-video-amdgpu (21.0.0@10/03/2021): Accelerated Open Source driver for AMDGPU cards
[I] x11-drivers/xf86-video-ati (19.1.0@09/28/2021): ATI video driver

I'm using the ROCm 4.3.0 drivers from AMD.

Event Timeline

Jesse Yurkovich (deadpin) changed the task status from Needs Triage to Needs Information from User. Oct 17 2021, 10:40 PM

Hello,

The AMDGPU PRO driver does not work: it fails to find an OpenCL device.

I have tested amdgpu-pro-opencl driver versions 20.40.1147286 and 21.30.1290604. Neither works, and clinfo detects 0 GPU cards.

clinfo output:

$ clinfo
Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (3302.5)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
  Platform Extensions function suffix             AMD
  Platform Host timer resolution                  1ns

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 0

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No devices found in platform
$ ./blender --debug --debug-gpu --debug-gpu-force-workarounds --debug-cycles --verbose 9  --debug-value 9
Switching to fully guarded memory allocator.
Blender 2.93.5
Build: 2021-10-06 06:26:10 Linux release
argv[0] = ./blender
argv[1] = --debug
argv[2] = --debug-gpu
argv[3] = --debug-gpu-force-workarounds
argv[4] = --debug-cycles
argv[5] = --verbose
argv[6] = 9
argv[7] = --debug-value
argv[8] = 9
Read prefs: /home/slavius/.config/blender/2.93/config/userpref.blend
read file
  Version 293 sub 18 date 2021-04-16 16:00 hash 463b38b0e0b0

GL: Forcing workaround usage and disabling extensions.
    OpenGL identification strings
    vendor: AMD
    renderer: AMD Radeon RX 5600 XT (NAVI10, DRM 3.42.0, 5.14.13-tkg-cacule-llvm, LLVM 12.0.1)
    version: 4.6 (Core Profile) Mesa 21.2.2

I1018 09:03:22.994125 48766 blender_python.cpp:195] Debug flags initialized to:
CPU flags:
  AVX2       : True
  AVX        : True
  SSE4.1     : True
  SSE3       : True
  SSE2       : True
  BVH layout : EMBREE
  Split      : False
CUDA flags:
  Adaptive Compile : False
OptiX flags:
  CUDA streams : 1
OpenCL flags:
  Device type    : ALL
  Debug          : False
  Memory limit   : 0
read file
  Version 293 sub 18 date 2021-04-16 16:00 hash 463b38b0e0b0
I1018 09:03:32.333719 48766 device_opencl.cpp:48] CLEW initialization succeeded.
I1018 09:03:32.353891 48766 opencl_util.cpp:957] Enumerating devices for platform AMD Accelerated Parallel Processing.
I1018 09:03:32.353911 48766 opencl_util.cpp:964] Ignoring platform AMD Accelerated Parallel Processing, failed to fetch of devices: CL_DEVICE_NOT_FOUND
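
The `clinfo` output above shows one platform exposing zero devices, which matches the `CL_DEVICE_NOT_FOUND` message in the Blender log. A minimal sketch for pulling those counts out of saved `clinfo` text, for example when comparing driver versions. The function name `count_platforms_and_devices` is illustrative, and the parser assumes the human-readable layout shown above, where the counts sit on lines starting with `Number of platforms` and `Number of devices`.

```python
def count_platforms_and_devices(clinfo_text):
    """Return (platform_count, [device_count, ...]) parsed from clinfo's
    default human-readable output; one list entry per platform section."""
    platforms = 0
    devices = []
    for line in clinfo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Number of platforms"):
            platforms = int(stripped.split()[-1])
        elif stripped.startswith("Number of devices"):
            devices.append(int(stripped.split()[-1]))
    return platforms, devices
```

For the output in this comment the result would be one platform with a device list of `[0]`.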
Jesse Yurkovich (deadpin) changed the task status from Needs Information from User to Needs Triage. Oct 18 2021, 9:08 AM
Thomas Dinges (dingto) closed this task as Archived. Nov 19 2021, 10:18 AM

OpenCL rendering support was removed in Blender 3.0.
The combination of the limited Cycles kernel implementation, driver bugs, and a stalled OpenCL standard
has made maintenance too difficult. Thanks for your report, but it's unlikely that there will be further fixes for OpenCL.

For AMD GPUs, there is a new backend based on the HIP platform.
In Blender 3.0, it is supported on Windows with RDNA and RDNA2 generation discrete graphics cards.
This includes Radeon RX 5000 and RX 6000 series GPUs. Driver version Radeon Pro 21.Q4 or newer is required.

https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles
https://code.blender.org/2021/11/next-level-support-for-amd-gpus/
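
As a rough illustration of the support conditions stated in this comment, the check below encodes them for Blender 3.0 only. The function name `hip_supported_in_blender_3_0` and its parameters are hypothetical; the driver-version requirement is not modelled, and nothing here queries real hardware.

```python
def hip_supported_in_blender_3_0(os_name, gpu_architecture, discrete=True):
    """Return True if the configuration meets the Blender 3.0 HIP
    conditions quoted above: Windows, RDNA or RDNA2, discrete card."""
    supported_architectures = {"RDNA", "RDNA2"}
    return (
        os_name.lower() == "windows"
        and gpu_architecture.upper() in supported_architectures
        and bool(discrete)
    )
```

The reporter's setup (Linux with an RX 5600 XT, an RDNA card) would return `False` because of the operating system, not the GPU.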

Thank you. I'm waiting until HIP arrives on Linux so I can test it with Blender 3.0.