CWE-476 (NULL Pointer Dereference)

CVE-2023-52738: drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

First published: Tue May 21 2024

In the Linux kernel, the following vulnerability has been resolved:

drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

Currently amdgpu calls drm_sched_fini() from the fence driver sw fini routine - such function is expected to be called only after the respective init function - drm_sched_init() - was executed successfully.

Happens that we faced a driver probe failure in the Steam Deck recently, and the function drm_sched_fini() was called even without its counter-part had been previously called, causing the following oops:

    amdgpu: probe of 0000:04:00.0 failed with error -110
    BUG: kernel NULL pointer dereference, address: 0000000000000090
    PGD 0 P4D 0
    Oops: 0002 [#1] PREEMPT SMP NOPTI
    CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
    Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
    RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
    [...]
    Call Trace:
     <TASK>
     amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
     amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
     amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
     devm_drm_dev_init_release+0x49/0x70
     [...]

To prevent that, check if the drm_sched was properly initialized for a given ring before calling its fini counter-part.

Notice ideally we'd use sched.ready for that; such field is set as the latest thing on drm_sched_init(). But amdgpu seems to "override" the meaning of such field - in the above oops for example, it was a GFX ring causing the crash, and the sched.ready field was set to true in the ring init routine, regardless of the state of the DRM scheduler. Hence, we ended-up using sched.ops as per Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/
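The guard described in the commit message can be illustrated with a small userspace sketch. This is not the kernel patch: `mock_sched`, `mock_sched_ops`, `mock_sched_init()`, `mock_sched_fini()` and `mock_ring_teardown()` are hypothetical stand-ins invented for illustration, and the assertion inside `mock_sched_fini()` only models the state that the real drm_sched_fini() relies on. The point it tries to show is why testing `ops` (set only by a successful init) is a safer precondition than testing `ready` (which the driver may set on its own).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real DRM scheduler structures. */
struct mock_sched_ops {
    const char *name;
};

struct mock_sched {
    const struct mock_sched_ops *ops; /* set only by mock_sched_init() */
    bool ready;                       /* may be set by the driver independently */
};

const struct mock_sched_ops mock_ops = { "mock" };

/* Counts how often the fini routine actually ran. */
int fini_calls = 0;

/* Models a successful scheduler init: ops is populated, ready is set last. */
void mock_sched_init(struct mock_sched *s)
{
    assert(s != NULL);
    s->ops = &mock_ops;
    s->ready = true;
}

/*
 * Models the fini routine: it assumes init ran before. Calling it on a
 * never-initialized scheduler trips the assertion, standing in for the
 * NULL pointer dereference seen in the oops.
 */
void mock_sched_fini(struct mock_sched *s)
{
    assert(s != NULL && s->ops != NULL);
    s->ops = NULL;
    s->ready = false;
    fini_calls++;
}

/*
 * Guarded teardown: only call fini if init populated ops.
 * Returns true when fini was run, false when it was skipped.
 * Note that 'ready' is deliberately NOT used as the guard.
 */
bool mock_ring_teardown(struct mock_sched *s)
{
    assert(s != NULL);
    if (s->ops == NULL)
        return false;
    mock_sched_fini(s);
    return true;
}
```

With this shape, a ring whose `ready` flag was set by the driver but whose scheduler was never initialized is skipped during teardown, while a properly initialized one is finalized exactly once.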

Credit: 416baaa9-dc9f-4396-8d5f-8c081fb06d67

Affected Software       Affected Version        How to fix
Red Hat Kernel-devel
Linux Kernel            >=5.14.10, <5.15.94
Linux Kernel            >=5.16, <=6.1.12
Linux Kernel            =6.2-rc1
Linux Kernel            =6.2-rc2
Linux Kernel            =6.2-rc3
Linux Kernel            =6.2-rc4
Linux Kernel            =6.2-rc5
Linux Kernel            =6.2-rc6
Linux Kernel            =6.2-rc7
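
As a rough aid for checking a release version against the numeric ranges listed above, the comparison could be sketched as follows. The names `kver`, `kver_parse()`, `kver_cmp()` and `kver_affected()` are hypothetical, and the bounds are copied from this advisory's list rather than from any other source. The sketch handles plain `major.minor[.patch]` strings only; release-candidate tags such as 6.2-rc3 are rejected by the parser and have to be matched against the list by name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper type for a plain numeric kernel version. */
struct kver {
    int major, minor, patch;
};

/* Three-way comparison: negative, zero or positive like strcmp(). */
int kver_cmp(struct kver a, struct kver b)
{
    if (a.major != b.major)
        return a.major < b.major ? -1 : 1;
    if (a.minor != b.minor)
        return a.minor < b.minor ? -1 : 1;
    if (a.patch != b.patch)
        return a.patch < b.patch ? -1 : 1;
    return 0;
}

/*
 * Parse "major.minor" or "major.minor.patch". Returns false for anything
 * with trailing text (for example "-rc3"), so such strings are never
 * silently treated as a final release.
 */
bool kver_parse(const char *s, struct kver *out)
{
    int major = 0, minor = 0, patch = 0, pos = 0, pos2 = 0;

    assert(s != NULL && out != NULL);
    if (sscanf(s, "%d.%d%n", &major, &minor, &pos) != 2)
        return false;
    if (s[pos] == '.') {
        if (sscanf(s + pos, ".%d%n", &patch, &pos2) != 1)
            return false;
        pos += pos2;
    }
    if (s[pos] != '\0' || major < 0 || minor < 0 || patch < 0)
        return false;
    out->major = major;
    out->minor = minor;
    out->patch = patch;
    return true;
}

/* Ranges as listed in this advisory: [5.14.10, 5.15.94) and [5.16, 6.1.12]. */
bool kver_affected(struct kver v)
{
    const struct kver lo1 = { 5, 14, 10 }, hi1 = { 5, 15, 94 };
    const struct kver lo2 = { 5, 16, 0 },  hi2 = { 6, 1, 12 };

    if (kver_cmp(v, lo1) >= 0 && kver_cmp(v, hi1) < 0)
        return true;
    if (kver_cmp(v, lo2) >= 0 && kver_cmp(v, hi2) <= 0)
        return true;
    return false;
}
```

A distribution kernel may carry backported fixes under an older version number, so a match here is only a hint to check the vendor's own advisory.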


Frequently Asked Questions

  • What is the severity of CVE-2023-52738?

    CVE-2023-52738 is categorized with moderate severity due to potential system instability.

  • How do I fix CVE-2023-52738?

    To address CVE-2023-52738, upgrade your Linux kernel to a version that includes the patch for this vulnerability.

  • Which versions of the Linux kernel are affected by CVE-2023-52738?

    CVE-2023-52738 affects Linux kernel versions from 5.14.10 up to, but not including, 5.15.94; versions from 5.16 through 6.1.12; and the 6.2 release candidates rc1 through rc7.

  • What impact does CVE-2023-52738 have on Linux systems?

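    CVE-2023-52738 can cause a kernel oops (a NULL pointer dereference in drm_sched_fini()) when the amdgpu driver fails to probe and drm_sched_fini() is called for a ring whose scheduler was never initialized with drm_sched_init().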

  • Is CVE-2023-52738 specific to any particular Linux distributions?

    No. CVE-2023-52738 is a flaw in the upstream kernel's amdgpu driver, so it affects any distribution that ships one of the vulnerable kernel versions. The advisory lists Red Hat Kernel-devel among the affected software.
