aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu
Commit message (Collapse)AuthorAgeFilesLines
...
| | * drm/amdgpu: Use cached partition mode, if validLijo Lazar2025-07-151-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | For current partition mode queries, return the mode cached in partition manager whenever it's valid. Signed-off-by: Lijo Lazar <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| * | Merge tag 'drm-misc-next-2025-07-17' of ↵Dave Airlie2025-07-206-9/+38
| |\ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for 6.17: UAPI Changes: Cross-subsystem Changes: Core Changes: - mode_config: Change fb_create prototype to pass the drm_format_info and avoid redundant lookups in drivers - sched: kunit improvements, memory leak fixes, reset handling improvements - tests: kunit EDID update Driver Changes: - amdgpu: Hibernation fixes, structure lifetime fixes - nouveau: sched improvements - sitronix: Add Sitronix ST7567 Support - bridge: - Make connector available to bridge detect hook - panel: - More refcounting changes - New panels: BOE NE14QDM Signed-off-by: Dave Airlie <[email protected]> From: Maxime Ripard <[email protected]> Link: https://lore.kernel.org/r/20250717-efficient-kudu-of-fantasy-ff95e0@houat
| | * drm/amdgpu: Fix missing unlocking in an error path in amdgpu_userq_create()Christophe JAILLET2025-07-161-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If kasprintf() fails, some mutex still need to be released to avoid locking issue, as already done in all other error handling path. Fixes: c03ea34cbf88 ("drm/amdgpu: add support of debugfs for mqd information") Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Christophe JAILLET <[email protected]> Link: https://lore.kernel.org/all/366557fa7ca8173fd78c58336986ca56953369b9.1752087753.git.christophe.jaillet@wanadoo.fr/ Signed-off-by: Alex Deucher <[email protected]>
| | * drm/amdgpu: Pass along the format info from .fb_create() to ↵Ville Syrjälä2025-07-161-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | drm_helper_mode_fill_fb_struct() Plumb the format info from .fb_create() all the way to drm_helper_mode_fill_fb_struct() to avoid the redundant lookup. Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Thomas Zimmermann <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
| | * drm: Allow the caller to pass in the format info to ↵Ville Syrjälä2025-07-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | drm_helper_mode_fill_fb_struct() Soon all drivers should have the format info already available in the places where they call drm_helper_mode_fill_fb_struct(). Allow it to be passed along into drm_helper_mode_fill_fb_struct() instead of doing yet another redundant lookup. Start by always passing in NULL and still doing the extra lookup. The actual changes to avoid the lookup will follow. Done with cocci (with some manual fixups): @@ identifier dev, fb, mode_cmd; expression get_format_info; @@ void drm_helper_mode_fill_fb_struct(struct drm_device *dev, struct drm_framebuffer *fb, + const struct drm_format_info *info, const struct drm_mode_fb_cmd2 *mode_cmd) { ... - fb->format = get_format_info; + fb->format = info ?: get_format_info; ... } @@ identifier dev, fb, mode_cmd; @@ void drm_helper_mode_fill_fb_struct(struct drm_device *dev, struct drm_framebuffer *fb, + const struct drm_format_info *info, const struct drm_mode_fb_cmd2 *mode_cmd); @@ expression dev, fb, mode_cmd; @@ drm_helper_mode_fill_fb_struct(dev, fb + ,NULL ,mode_cmd); Cc: Alex Deucher <[email protected]> Cc: Liviu Dudau <[email protected]> Cc: Russell King <[email protected]> Cc: Inki Dae <[email protected]> Cc: Seung-Woo Kim <[email protected]> Cc: Kyungmin Park <[email protected]> Cc: Patrik Jakobsson <[email protected]> Cc: Rob Clark <[email protected]> Cc: Abhinav Kumar <[email protected]> Cc: Dmitry Baryshkov <[email protected]> Cc: Sean Paul <[email protected]> Cc: Marijn Suijten <[email protected]> Cc: Lyude Paul <[email protected]> Cc: Danilo Krummrich <[email protected]> Cc: Tomi Valkeinen <[email protected]> Cc: Thierry Reding <[email protected]> Cc: Mikko Perttunen <[email protected]> Cc: Gerd Hoffmann <[email protected]> Cc: Dmitry Osipenko <[email protected]> Cc: Gurchetan Singh <[email protected]> Cc: Chia-I Wu <[email protected]> Cc: Zack Rusin <[email protected]> Cc: Broadcom internal kernel review list <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Reviewed-by: Thomas Zimmermann <[email protected]> Reviewed-by: Laurent Pinchart <[email protected]> Reviewed-by: Dmitry Baryshkov <[email protected]> Reviewed-by: Liviu Dudau <[email protected]> Acked-by: Alex Deucher <[email protected]> Acked-by: Rodrigo Vivi <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
| | * drm: Pass the format info to .fb_create()Ville Syrjälä2025-07-162-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass along the format information from the top to .fb_create() so that we can avoid redundant (and somewhat expensive) lookups in the drivers. Done with cocci (with some manual fixups): @@ identifier func =~ ".*create.*"; identifier dev, file, mode_cmd; @@ struct drm_framebuffer *func( struct drm_device *dev, struct drm_file *file, + const struct drm_format_info *info, const struct drm_mode_fb_cmd2 *mode_cmd) { ... ( - const struct drm_format_info *info = drm_get_format_info(...); | - const struct drm_format_info *info; ... - info = drm_get_format_info(...); ) <... - if (!info) - return ...; ...> } @@ identifier func =~ ".*create.*"; identifier dev, file, mode_cmd; @@ struct drm_framebuffer *func( struct drm_device *dev, struct drm_file *file, + const struct drm_format_info *info, const struct drm_mode_fb_cmd2 *mode_cmd) { ... } @find@ identifier fb_create_func =~ ".*create.*"; identifier dev, file, mode_cmd; @@ struct drm_framebuffer *fb_create_func( struct drm_device *dev, struct drm_file *file, + const struct drm_format_info *info, const struct drm_mode_fb_cmd2 *mode_cmd); @@ identifier find.fb_create_func; expression dev, file, mode_cmd; @@ fb_create_func(dev, file + ,info ,mode_cmd) @@ expression dev, file, mode_cmd; @@ drm_gem_fb_create(dev, file + ,info ,mode_cmd) @@ expression dev, file, mode_cmd; @@ drm_gem_fb_create_with_dirty(dev, file + ,info ,mode_cmd) @@ expression dev, file_priv, mode_cmd; identifier info, fb; @@ info = drm_get_format_info(...); ... fb = dev->mode_config.funcs->fb_create(dev, file_priv + ,info ,mode_cmd); @@ identifier dev, file_priv, mode_cmd; @@ struct drm_mode_config_funcs { ... struct drm_framebuffer *(*fb_create)(struct drm_device *dev, struct drm_file *file_priv, + const struct drm_format_info *info, const struct drm_mode_fb_cmd2 *mode_cmd); ... }; v2: Fix kernel docs (Laurent) Fix commit msg (Geert) Cc: Alex Deucher <[email protected]> Cc: Liviu Dudau <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Russell King <[email protected]> Cc: Inki Dae <[email protected]> Cc: Seung-Woo Kim <[email protected]> Cc: Kyungmin Park <[email protected]> Cc: Patrik Jakobsson <[email protected]> Cc: Chun-Kuang Hu <[email protected]> Cc: Philipp Zabel <[email protected]> Cc: Rob Clark <[email protected]> Cc: Abhinav Kumar <[email protected]> Cc: Dmitry Baryshkov <[email protected]> Cc: Sean Paul <[email protected]> Cc: Marijn Suijten <[email protected]> Cc: Marek Vasut <[email protected]> Cc: Stefan Agner <[email protected]> Cc: Lyude Paul <[email protected]> Cc: Danilo Krummrich <[email protected]> Cc: Tomi Valkeinen <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Gerd Hoffmann <[email protected]> Cc: Kieran Bingham <[email protected]> Cc: Biju Das <[email protected]> Cc: Sandy Huang <[email protected]> Cc: "Heiko Stübner" <[email protected]> Cc: Andy Yan <[email protected]> Cc: Thierry Reding <[email protected]> Cc: Mikko Perttunen <[email protected]> Cc: Dave Stevenson <[email protected]> Cc: "Maíra Canal" <[email protected]> Cc: Raspberry Pi Kernel Maintenance <[email protected]> Cc: Dmitry Osipenko <[email protected]> Cc: Gurchetan Singh <[email protected]> Cc: Chia-I Wu <[email protected]> Cc: Zack Rusin <[email protected]> Cc: Broadcom internal kernel review list <[email protected]> Cc: Oleksandr Andrushchenko <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Laurent Pinchart <[email protected]> Reviewed-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Thomas Zimmermann <[email protected]> Reviewed-by: Dmitry Baryshkov <[email protected]> Acked-by: Liviu Dudau <[email protected]> Reviewed-by: Laurent Pinchart <[email protected]> Acked-by: Alex Deucher <[email protected]> Acked-by: Rodrigo Vivi <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
| | * drm/sched: Rename DRM_GPU_SCHED_STAT_NOMINAL to DRM_GPU_SCHED_STAT_RESETMaíra Canal2025-07-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Among the scheduler's statuses, the only one that indicates an error is DRM_GPU_SCHED_STAT_ENODEV. Any status other than DRM_GPU_SCHED_STAT_ENODEV signifies that the operation succeeded and the GPU is in a nominal state. However, to provide more information about the GPU's status, it is needed to convey more information than just "OK". Therefore, rename DRM_GPU_SCHED_STAT_NOMINAL to DRM_GPU_SCHED_STAT_RESET, which better communicates the meaning of this status. The status DRM_GPU_SCHED_STAT_RESET indicates that the GPU has hung, but it has been successfully reset and is now in a nominal state again. Reviewed-by: Philipp Stanner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maíra Canal <[email protected]>
| | * drm/amdgpu: Fix lifetime of struct amdgpu_task_info after ring resetAndré Almeida2025-07-101-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a ring reset happens, amdgpu calls drm_dev_wedged_event() using struct amdgpu_task_info *ti as one of the arguments. After using *ti, a call to amdgpu_vm_put_task_info(ti) is required to correctly track its lifetime. However, it's called from a place that the ring reset path never reaches due to a goto after drm_dev_wedged_event() is called. Move amdgpu_vm_put_task_info() bellow the exit label to make sure that it's called regardless of the code path. amdgpu_vm_put_task_info() can only accept a valid address or NULL as argument, so initialise *ti to make sure we can call this function if *ti isn't used. Fixes: a72002cb181f ("drm/amdgpu: Make use of drm_wedge_task_info") Reported-by: Dave Airlie <[email protected]> Closes: https://lore.kernel.org/dri-devel/CAPM=9tz0rQP8VZWKWyuF8kUMqRScxqoa6aVdwWw9=5yYxyYQ2Q@mail.gmail.com/ Reviewed-by: Christian König <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: André Almeida <[email protected]>
| | * drm/amdgpu: do not resume device in thaw for normal hibernationSamuel Zhang2025-07-101-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For normal hibernation, GPU do not need to be resumed in thaw since it is not involved in writing the hibernation image. Skip resume in this case can reduce the hibernation time. On VM with 8 * 192GB VRAM dGPUs, 98% VRAM usage and 1.7TB system memory, this can save 50 minutes. Signed-off-by: Samuel Zhang <[email protected]> Tested-by: Mario Limonciello <[email protected]> Reviewed-by: Mario Limonciello <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mario Limonciello <[email protected]>
| | * drm/amdgpu: move GTT to shmem after eviction for hibernationSamuel Zhang2025-07-101-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When hibernate with data center dGPUs, huge number of VRAM BOs evicted to GTT and takes too much system memory. This will cause hibernation fail due to insufficient memory for creating the hibernation image. Move GTT BOs to shmem in KMD, then shmem to swap disk in kernel hibernation code to make room for hibernation image. Signed-off-by: Samuel Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mario Limonciello <[email protected]>
| * | Merge tag 'amd-drm-next-6.17-2025-07-11' of ↵Simona Vetter2025-07-1122-233/+253
| |\ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.17-2025-07-11: amdgpu: - Clean up function signatures - GC 10 KGQ reset fix - SDMA reset cleanups - Misc fixes - LVDS fixes - UserQ fix amdkfd: - Reset fix Signed-off-by: Simona Vetter <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
| | * drm/amdgpu: fix use-after-free in amdgpu_userq_suspend+0x51a/0x5a0Vitaly Prosyak2025-07-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ +0.000020] BUG: KASAN: slab-use-after-free in amdgpu_userq_suspend+0x51a/0x5a0 [amdgpu] [ +0.000817] Read of size 8 at addr ffff88812eec8c58 by task amd_pci_unplug/1733 [ +0.000027] CPU: 10 UID: 0 PID: 1733 Comm: amd_pci_unplug Tainted: G W 6.14.0+ #2 [ +0.000009] Tainted: [W]=WARN [ +0.000003] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020 [ +0.000004] Call Trace: [ +0.000004] <TASK> [ +0.000003] dump_stack_lvl+0x76/0xa0 [ +0.000011] print_report+0xce/0x600 [ +0.000009] ? srso_return_thunk+0x5/0x5f [ +0.000006] ? kasan_complete_mode_report_info+0x76/0x200 [ +0.000007] ? kasan_addr_to_slab+0xd/0xb0 [ +0.000006] ? amdgpu_userq_suspend+0x51a/0x5a0 [amdgpu] [ +0.000707] kasan_report+0xbe/0x110 [ +0.000006] ? amdgpu_userq_suspend+0x51a/0x5a0 [amdgpu] [ +0.000541] __asan_report_load8_noabort+0x14/0x30 [ +0.000005] amdgpu_userq_suspend+0x51a/0x5a0 [amdgpu] [ +0.000535] ? stop_cpsch+0x396/0x600 [amdgpu] [ +0.000556] ? stop_cpsch+0x429/0x600 [amdgpu] [ +0.000536] ? __pfx_amdgpu_userq_suspend+0x10/0x10 [amdgpu] [ +0.000536] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? kgd2kfd_suspend+0x132/0x1d0 [amdgpu] [ +0.000542] amdgpu_device_fini_hw+0x581/0xe90 [amdgpu] [ +0.000485] ? down_write+0xbb/0x140 [ +0.000007] ? __mutex_unlock_slowpath.constprop.0+0x317/0x360 [ +0.000005] ? __pfx_amdgpu_device_fini_hw+0x10/0x10 [amdgpu] [ +0.000482] ? __kasan_check_write+0x14/0x30 [ +0.000004] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? up_write+0x55/0xb0 [ +0.000007] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? blocking_notifier_chain_unregister+0x6c/0xc0 [ +0.000008] amdgpu_driver_unload_kms+0x69/0x90 [amdgpu] [ +0.000484] amdgpu_pci_remove+0x93/0x130 [amdgpu] [ +0.000482] pci_device_remove+0xae/0x1e0 [ +0.000008] device_remove+0xc7/0x180 [ +0.000008] device_release_driver_internal+0x3d4/0x5a0 [ +0.000007] device_release_driver+0x12/0x20 [ +0.000004] pci_stop_bus_device+0x104/0x150 [ +0.000006] pci_stop_and_remove_bus_device_locked+0x1b/0x40 [ +0.000005] remove_store+0xd7/0xf0 [ +0.000005] ? __pfx_remove_store+0x10/0x10 [ +0.000006] ? __pfx__copy_from_iter+0x10/0x10 [ +0.000006] ? __pfx_dev_attr_store+0x10/0x10 [ +0.000006] dev_attr_store+0x3f/0x80 [ +0.000006] sysfs_kf_write+0x125/0x1d0 [ +0.000004] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? __kasan_check_write+0x14/0x30 [ +0.000005] kernfs_fop_write_iter+0x2ea/0x490 [ +0.000005] ? rw_verify_area+0x70/0x420 [ +0.000005] ? __pfx_kernfs_fop_write_iter+0x10/0x10 [ +0.000006] vfs_write+0x90d/0xe70 [ +0.000005] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? __pfx_vfs_write+0x10/0x10 [ +0.000004] ? local_clock+0x15/0x30 [ +0.000008] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? __kasan_slab_free+0x5f/0x80 [ +0.000005] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? __kasan_check_read+0x11/0x20 [ +0.000004] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? fdget_pos+0x1d3/0x500 [ +0.000007] ksys_write+0x119/0x220 [ +0.000005] ? putname+0x1c/0x30 [ +0.000006] ? __pfx_ksys_write+0x10/0x10 [ +0.000007] __x64_sys_write+0x72/0xc0 [ +0.000006] x64_sys_call+0x18ab/0x26f0 [ +0.000006] do_syscall_64+0x7c/0x170 [ +0.000004] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? __pfx___x64_sys_openat+0x10/0x10 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? __kasan_check_read+0x11/0x20 [ +0.000003] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? fpregs_assert_state_consistent+0x21/0xb0 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? syscall_exit_to_user_mode+0x4e/0x240 [ +0.000005] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? do_syscall_64+0x88/0x170 [ +0.000003] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? irqentry_exit+0x43/0x50 [ +0.000004] ? srso_return_thunk+0x5/0x5f [ +0.000004] ? exc_page_fault+0x7c/0x110 [ +0.000006] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000006] RIP: 0033:0x7480c0b14887 [ +0.000005] Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 [ +0.000005] RSP: 002b:00007fff142b0058 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ +0.000006] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007480c0b14887 [ +0.000003] RDX: 0000000000000001 RSI: 00007480c0e7365a RDI: 0000000000000004 [ +0.000003] RBP: 00007fff142b0080 R08: 0000563b2e73c170 R09: 0000000000000000 [ +0.000003] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff142b02f8 [ +0.000003] R13: 0000563b159a72a9 R14: 0000563b159a9d48 R15: 00007480c0f19040 [ +0.000008] </TASK> [ +0.000445] Allocated by task 427 on cpu 5 at 29.342331s: [ +0.000011] kasan_save_stack+0x28/0x60 [ +0.000006] kasan_save_track+0x18/0x70 [ +0.000006] kasan_save_alloc_info+0x38/0x60 [ +0.000005] __kasan_kmalloc+0xc1/0xd0 [ +0.000006] __kmalloc_cache_noprof+0x1bd/0x430 [ +0.000007] amdgpu_driver_open_kms+0x172/0x760 [amdgpu] [ +0.000493] drm_file_alloc+0x569/0x9a0 [ +0.000007] drm_client_init+0x1b7/0x410 [ +0.000007] drm_fbdev_client_setup+0x174/0x470 [ +0.000006] drm_client_setup+0x8a/0xf0 [ +0.000006] amdgpu_pci_probe+0x510/0x10c0 [amdgpu] [ +0.000483] local_pci_probe+0xe7/0x1b0 [ +0.000006] pci_device_probe+0x5bf/0x890 [ +0.000006] really_probe+0x1fd/0x950 [ +0.000005] __driver_probe_device+0x307/0x410 [ +0.000006] driver_probe_device+0x4e/0x150 [ +0.000005] __driver_attach+0x223/0x510 [ +0.000006] bus_for_each_dev+0x102/0x1a0 [ +0.000005] driver_attach+0x3d/0x60 [ +0.000006] bus_add_driver+0x309/0x650 [ +0.000005] driver_register+0x13d/0x490 [ +0.000006] __pci_register_driver+0x1ee/0x2b0 [ +0.000006] rfcomm_dlc_clear_state+0x69/0x220 [rfcomm] [ +0.000011] do_one_initcall+0x9c/0x3e0 [ +0.000007] do_init_module+0x29e/0x7f0 [ +0.000006] load_module+0x5c75/0x7c80 [ +0.000006] init_module_from_file+0x106/0x180 [ +0.000006] idempotent_init_module+0x377/0x740 [ +0.000006] __x64_sys_finit_module+0xd7/0x180 [ +0.000006] x64_sys_call+0x1f0b/0x26f0 [ +0.000006] do_syscall_64+0x7c/0x170 [ +0.000005] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000013] Freed by task 1733 on cpu 5 at 59.907086s: [ +0.000011] kasan_save_stack+0x28/0x60 [ +0.000006] kasan_save_track+0x18/0x70 [ +0.000005] kasan_save_free_info+0x3b/0x60 [ +0.000005] __kasan_slab_free+0x54/0x80 [ +0.000006] kfree+0x127/0x470 [ +0.000006] amdgpu_driver_postclose_kms+0x455/0x760 [amdgpu] [ +0.000493] drm_file_free.part.0+0x5b1/0xba0 [ +0.000006] drm_file_free+0x13/0x30 [ +0.000006] drm_client_release+0x1c4/0x2b0 [ +0.000006] drm_fbdev_ttm_fb_destroy+0xd2/0x120 [drm_ttm_helper] [ +0.000007] put_fb_info+0x97/0xe0 [ +0.000007] unregister_framebuffer+0x197/0x380 [ +0.000005] drm_fb_helper_unregister_info+0x94/0x100 [ +0.000005] drm_fbdev_client_unregister+0x3c/0x80 [ +0.000007] drm_client_dev_unregister+0x144/0x330 [ +0.000006] drm_dev_unregister+0x49/0x1b0 [ +0.000006] drm_dev_unplug+0x4c/0xd0 [ +0.000006] amdgpu_pci_remove+0x58/0x130 [amdgpu] [ +0.000484] pci_device_remove+0xae/0x1e0 [ +0.000008] device_remove+0xc7/0x180 [ +0.000007] device_release_driver_internal+0x3d4/0x5a0 [ +0.000006] device_release_driver+0x12/0x20 [ +0.000007] pci_stop_bus_device+0x104/0x150 [ +0.000006] pci_stop_and_remove_bus_device_locked+0x1b/0x40 [ +0.000006] remove_store+0xd7/0xf0 [ +0.000006] dev_attr_store+0x3f/0x80 [ +0.000005] sysfs_kf_write+0x125/0x1d0 [ +0.000006] kernfs_fop_write_iter+0x2ea/0x490 [ +0.000006] vfs_write+0x90d/0xe70 [ +0.000006] ksys_write+0x119/0x220 [ +0.000006] __x64_sys_write+0x72/0xc0 [ +0.000006] x64_sys_call+0x18ab/0x26f0 [ +0.000005] do_syscall_64+0x7c/0x170 [ +0.000006] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000012] The buggy address belongs to the object at ffff88812eec8000 which belongs to the cache kmalloc-rnd-07-4k of size 4096 [ +0.000016] The buggy address is located 3160 bytes inside of freed 4096-byte region [ffff88812eec8000, ffff88812eec9000) [ +0.000023] The buggy address belongs to the physical page: [ +0.000009] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12eec8 [ +0.000007] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ +0.000005] flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) [ +0.000007] page_type: f5(slab) [ +0.000008] raw: 0017ffffc0000040 ffff888100054500 dead000000000122 0000000000000000 [ +0.000005] raw: 0000000000000000 0000000080040004 00000000f5000000 0000000000000000 [ +0.000006] head: 0017ffffc0000040 ffff888100054500 dead000000000122 0000000000000000 [ +0.000005] head: 0000000000000000 0000000080040004 00000000f5000000 0000000000000000 [ +0.000006] head: 0017ffffc0000003 ffffea0004bbb201 ffffffffffffffff 0000000000000000 [ +0.000005] head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 [ +0.000005] page dumped because: kasan: bad access detected [ +0.000010] Memory state around the buggy address: [ +0.000009] ffff88812eec8b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000012] ffff88812eec8b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000011] >ffff88812eec8c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000011] ^ [ +0.000010] ffff88812eec8c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000011] ffff88812eec8d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000011] ================================================================== The use-after-free occurs because a delayed work item (`suspend_work`) may still be pending or running when resources it accesses are freed during device removal or file close. The previous code used `flush_work(&fpriv->evf_mgr.suspend_work.work)`, which does not wait for delayed work that has not yet started. As a result, the delayed work could run after its memory was freed, causing a use-after-free. By switching to `flush_delayed_work(&fpriv->evf_mgr.suspend_work)`, we ensure that the kernel waits for both queued and delayed work to finish before freeing memory, closing this race. Fixes: adba0929736a ("drm/amdgpu: Fix Illegal opcode in command stream Error") Signed-off-by: Vitaly Prosyak <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | * Revert "drm/amdgpu: fix slab-use-after-free in amdgpu_userq_mgr_fini"Vitaly Prosyak2025-07-072-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 5fb90421fa0fbe0a968274912101fe917bf1c47b. The original patch moved `amdgpu_userq_mgr_fini()` to the driver's `postclose` callback, which is called after `drm_gem_release()` in the DRM file cleanup sequence.If a user application crashes or aborts without cleaning up its user queues, 'drm_gem_release()` may free GEM objects that are still referenced by active user queues, leading to use-after-free. By reverting, we ensure that user queues are disabled and cleaned up before any GEM objects are released, preventing this class of bug. However, this reintroduces a race during PCI hot-unplug, where device removal can race with per-file cleanup, leading to use-after-free in suspend/unplug paths. This will be fixed in the next patch. Fixes: 5fb90421fa0f ("drm/amdgpu: fix slab-use-after-free in amdgpu_userq_mgr_fini+0x70c") Signed-off-by: Vitaly Prosyak <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | * drm/amdgpu/sdma: allow caller to handle kernel rings in engine resetAlex Deucher2025-07-075-22/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a parameter to amdgpu_sdma_reset_engine() to let the caller handle the kernel rings. This allows the kernel rings to back up their unprocessed state if the reset comes in via the drm scheduler rather than KFD. Reviewed-by: Jesse Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | * drm/amdgpu/sdma: consolidate engine reset handlingAlex Deucher2025-07-074-25/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | Move the force completion handling into the common engine reset function. No need to duplicate it for every IP version. Reviewed-by: Jesse Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | * drm/amdgpu: Add a noverbose flag to psp_wait_forLijo Lazar2025-07-0710-118/+130
| | | | | | | | | | | | | | | | | | | | | | | | | | | For extended wait with retries on a PSP register value, add a noverbose flag to avoid excessive error messages on each timeout. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Asad Kamal <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | * drm/amdgpu/gfx10: fix KGQ reset sequenceAlex Deucher2025-07-071-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | Need to reinit the ring before remapping it and all of the KIQ handling needs to be within the kiq lock. Fixes: 1741281a157f ("drm/amdgpu/gfx10: add ring reset callbacks") Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | * drm/amdgpu: Pass adev pointer to functionsLijo Lazar2025-07-075-61/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass amdgpu device context instead of drm device context to some amdgpu_device_* functions. DRM device context is not required in those functions. No functional change. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| * | drm/amdgpu: fix the logic to validate fpriv and root boSunil Khatri2025-07-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the smatch warning, smatch warnings: drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:2146 amdgpu_pt_info_read() error: we previously assumed 'fpriv' could be null (see line 2146) "if (!fpriv && !fpriv->vm.root.bo)", It has to be an OR condition rather than an AND which makes an NULL dereference in case fpriv is NULL. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Sunil Khatri <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Christian König <[email protected]> Signed-off-by: Christian König <[email protected]>
| * | drm/amdgpu: fix MQD debugfs undefined symbol when DEBUG_FS=nSunil Khatri2025-07-091-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix undefined reference to amdgpu_mqd_info_fops during debugfs_create_file if DEBUG_FS=n Signed-off-by: Sunil Khatri <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Christian König <[email protected]> Signed-off-by: Christian König <[email protected]>
| * | Merge remote-tracking branch 'drm/drm-next' into drm-misc-nextMaarten Lankhorst2025-07-0899-1500/+2285
| |\| | | | | | | | | | | | | | | | Pull in drm-intel-next for the updates to drm panic handling. Signed-off-by: Maarten Lankhorst <[email protected]>
| | * Merge tag 'amd-drm-next-6.17-2025-07-01' of ↵Dave Airlie2025-07-0499-1500/+2285
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.17-2025-07-01: amdgpu: - FAMS2 fixes - OLED fixes - Misc cleanups - AUX fixes - DMCUB updates - SR-IOV hibernation support - RAS updates - DP tunneling fixes - DML2 fixes - Backlight improvements - Suspend improvements - Use scaling for non-native modes on eDP - SDMA 4.4.x fixes - PCIe DPM fixes - SDMA 5.x fixes - Cleaner shader updates for GC 9.x - Remove fence slab - ISP genpd support - Parition handling rework - SDMA FW checks for userq support - Add missing firmware declaration - Fix leak in amdgpu_ctx_mgr_entity_fini() - Freesync fix - Ring reset refactoring - Legacy dpm verbosity changes amdkfd: - GWS fix - mtype fix for ext coherent system memory - MMU notifier fix - gfx7/8 fix radeon: - CS validation support for additional GL extensions - Bump driver version for new CS validation checks From: Alex Deucher <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dave Airlie <[email protected]>
| | | * drm/amdgpu/sdma6: add more ucode version checks for userq supportAlex Deucher2025-06-301-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fill in the SDMA ucode version checks for more SDMA 6.x parts. v2: squash in fixes (Alex) Reviewed-by: Jesse Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: Remove useless timeout error messageYiPeng Chai2025-06-301-10/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The timeout is only used to interrupt polling and not need to print a error message. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: Fix code style issueCe Sun2025-06-301-18/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cocci warnings: (new ones prefixed by >>) >> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:6088:8-9: Unneeded variable: "r". Return "0" on line 6141 Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Ce Sun <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: refine ras error injection when eeprom initialization failedganglxie2025-06-302-6/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when eeprom initialization failed, we still support ras error injection, and reserve bad pages, but do not save bad pages to eeprom Signed-off-by: ganglxie <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: Fix error with dev_info_once usageLijo Lazar2025-06-303-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes error with dev_info_once usage in amdgpu_device_asic_has_dc_support. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Fixes: a3e510fd69c3 ("drm/amdgpu: Convert from DRM_* to dev_*") Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: Use correct severity for BP threshold exceed eventXiang Liu2025-06-301-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The severity of CPER for BP threshold exceed event should be set as CPER_SEV_FATAL to match the OOB implementation. Signed-off-by: Xiang Liu <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: move scheduler wqueue handling into callbacksAlex Deucher2025-06-3019-21/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the scheduler wqueue stopping and starting into the ring reset callbacks. On some IPs we have to reset an engine which may have multiple queues. Move the wqueue handling into the backend so we can handle them as needed based on the type of reset available. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: move guilty handling into ring resetsAlex Deucher2025-06-303-51/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move guilty logic into the ring reset callbacks. This allows each ring reset callback to better handle fence errors and force completions in line with the reset behavior for each IP. It also allows us to remove the ring guilty callback since that logic now lives in the reset callback. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: move force completion into ring resetsAlex Deucher2025-06-3021-30/+152
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the force completion handling into each ring reset function so that each engine can determine whether or not it needs to force completion on the jobs in the ring. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: rework queue reset scheduler interactionChristian König2025-06-301-15/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stopping the scheduler for queue reset is generally a good idea because it prevents any worker from touching the ring buffer. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: update ring reset function signatureAlex Deucher2025-06-3022-26/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Going forward, we'll need more than just the vmid. Add the guilty amdgpu_fence. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: remove job parameter from amdgpu_fence_emit()Alex Deucher2025-06-303-24/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | What we actually care about is the amdgpu_fence object so pass that in explicitly to avoid possible mistakes in the future. The job_run_counter handling can be safely removed at this point as we no longer support job resubmission. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdkfd: add hqd_sdma_get_doorbell callbacks for gfx7/8Alex Deucher2025-06-302-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These were missed when support was added for other generations. The callbacks are called unconditionally so we need to make sure all generations have them. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4304 Link: https://github.com/ROCm/ROCm/issues/4965 Fixes: bac38ca8c475 ("drm/amdkfd: implement per queue sdma reset for gfx 9.4+") Cc: Jonathan Kim <[email protected]> Reported-by: Johl Brown <[email protected]> Reviewed-by: Jonathan Kim <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: Fix memory leak in amdgpu_ctx_mgr_entity_finiLin.Cao2025-06-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | patch dd64956685fa ("drm/amdgpu: Remove duplicated "context still alive" check") removed ctx put, which will cause amdgpu_ctx_fini() cannot be called and then cause some finished fence that added by amdgpu_ctx_add_fence() cannot be released and cause memleak. Fixes: dd64956685fa ("drm/amdgpu: Remove duplicated "context still alive" check") Signed-off-by: Lin.Cao <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: indent an if statementDan Carpenter2025-06-301-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "return true;" line wasn't indented. Also checkpatch likes when we align the && conditions. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: Use dma_buf from GEM object instanceThomas Zimmermann2025-06-303-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid dereferencing struct drm_gem_object.import_attach for the imported dma-buf. The dma_buf field in the GEM object instance refers to the same buffer. Prepares to make import_attach an implementation detail of PRIME. Reviewed-by: Christian König <[email protected]> Signed-off-by: Thomas Zimmermann <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: Test for imported buffers with drm_gem_is_imported()Thomas Zimmermann2025-06-306-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of testing import_attach for imported GEM buffers, invoke drm_gem_is_imported() to do the test. v2: - keep amdgpu_bo_print_info() as-is (Christian) Reviewed-by: Christian König <[email protected]> Signed-off-by: Thomas Zimmermann <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: Convert from DRM_* to dev_*Lijo Lazar2025-06-3012-219/+320
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert from generic DRM_* to dev_* calls to have device context info. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Asad Kamal <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: Include sdma_4_4_4.binKent Russell2025-06-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This got missed during SDMA 4.4.4 support. Signed-off-by: Kent Russell <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amd: Include <linux/export.h> when neededAndré Almeida2025-06-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the following compile time warning when building with W=1: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing Acked-by: Christian König <[email protected]> Signed-off-by: André Almeida <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amd: Do not include <linux/export.h> when unusedAndré Almeida2025-06-301-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the following compile time warning when building with W=1: warning: EXPORT_SYMBOL() is not used, but #include <linux/export.h> is present Acked-by: Christian König <[email protected]> Signed-off-by: André Almeida <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu/sdma5.x: suspend KFD queues in ring resetAlex Deucher2025-06-302-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SDMA 5.x only supports engine soft reset which resets all queues on the engine. As such, we need to suspend KFD queues around resets like we do for SDMA 4.x. Reviewed-by: Jesse Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amd: Fix spelling mistake "correctalbe" -> "correctable"Colin Ian King2025-06-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a spelling mistake in a pr_info message. Fix it. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu/sdma7: add ucode version checks for userq supportAlex Deucher2025-06-241-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | SDMA 7.0.0/1: 7836028 Reviewed-by: Jesse Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu/sdma6: add ucode version checks for userq supportAlex Deucher2025-06-241-3/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SDMA 6.0.0 version 24 SDMA 6.0.2 version 21 SDMA 6.0.3 version 25 Reviewed-by: Jesse Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: Add more checks to PSP mailboxLijo Lazar2025-06-249-61/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of checking the response flag, use status mask also to check against any unexpected failures like a device drop. Also, log error if waiting on a psp response fails/times out. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: Convert init_mem_ranges into common helpersHawking Zhang2025-06-243-184/+191
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They can be shared across multiple products Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| | | * drm/amdgpu: Generalize is_multi_chiplet with a common helper v2Hawking Zhang2025-06-242-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is not necessary to be ip generation specific v2: rename the helper to is_multi_aid (Lijo) Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>