path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
author: Jesse.Zhang <[email protected]> 2025-05-09 09:18:16 +0000
committer: Alex Deucher <[email protected]> 2025-05-13 13:32:25 +0000
commit: 648a0dc0d78c369233b16878e4f351efe7fd8df6
tree: 0e4625bf01e204a222773f0f9002aa6144c7abbf /drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
parent: drm/amdgpu: Fix the kernel panic caused by RAS records exceed threshold
drm/amdgpu: Fix user queue deadlock by reordering mutex locking
This resolves a deadlock between user queue management and GPU reset
paths by enforcing consistent lock ordering.

The deadlock occurred when:
1. Process exit path (amdgpu_userq_mgr_fini) would:
   - Take uqm->userq_mutex
   - Then try to take adev->userq_mutex for list operations
2. GPU reset path (amdgpu_userq_pre_reset) would:
   - Take adev->userq_mutex first (for list traversal)
   - Then take uqm->userq_mutex

The solution establishes a strict top-down locking order:
1. Always take adev->userq_mutex before any uqm->userq_mutex
2. Maintain this order consistently across all code paths

Changes made:
- Reordered locking in amdgpu_userq_mgr_fini() to take device lock first
- Kept existing proper order in amdgpu_userq_pre_reset()
- Simplified the fini flow by removing redundant operations

This prevents circular dependencies while maintaining thread safety
during both normal operation and GPU reset scenarios.

Fixes: 4ce60dbada96 ("drm/amdgpu: store userq_managers in a list in adev")
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Arvind Yadav <[email protected]>
Signed-off-by: Jesse Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
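
The locking order described above can be illustrated with a small user-space sketch. This is not the kernel code from this commit: it uses POSIX pthread mutexes instead of kernel mutexes, and every name in it (demo_dev, demo_mgr, demo_mgr_fini, demo_pre_reset, demo_run) is hypothetical, chosen only to mirror the roles of adev, uqm, amdgpu_userq_mgr_fini() and amdgpu_userq_pre_reset(). Both paths take the device-wide lock first and the per-manager lock second, so neither can hold one lock while waiting for the other in the opposite order.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_MGRS 4

struct demo_dev;

/* Hypothetical stand-in for a per-process user queue manager (uqm). */
struct demo_mgr {
	pthread_mutex_t userq_mutex;	/* inner lock: taken second */
	struct demo_dev *dev;
	int queues_stopped;
};

/* Hypothetical stand-in for the device (adev) holding the manager list. */
struct demo_dev {
	pthread_mutex_t userq_mutex;	/* outer lock: always taken first */
	struct demo_mgr *mgrs[MAX_MGRS];
};

/* Process exit path: device lock first, then the manager lock. */
static void demo_mgr_fini(struct demo_mgr *mgr)
{
	struct demo_dev *dev = mgr->dev;

	pthread_mutex_lock(&dev->userq_mutex);
	pthread_mutex_lock(&mgr->userq_mutex);
	for (size_t i = 0; i < MAX_MGRS; i++)
		if (dev->mgrs[i] == mgr)
			dev->mgrs[i] = NULL;	/* unregister from the device */
	pthread_mutex_unlock(&mgr->userq_mutex);
	pthread_mutex_unlock(&dev->userq_mutex);
}

/* Reset path: same order, device lock first, then each manager lock. */
static void demo_pre_reset(struct demo_dev *dev)
{
	pthread_mutex_lock(&dev->userq_mutex);
	for (size_t i = 0; i < MAX_MGRS; i++) {
		struct demo_mgr *mgr = dev->mgrs[i];

		if (!mgr)
			continue;
		pthread_mutex_lock(&mgr->userq_mutex);
		mgr->queues_stopped = 1;
		pthread_mutex_unlock(&mgr->userq_mutex);
	}
	pthread_mutex_unlock(&dev->userq_mutex);
}

static void *fini_thread(void *arg)
{
	demo_mgr_fini(arg);
	return NULL;
}

static void *reset_thread(void *arg)
{
	demo_pre_reset(arg);
	return NULL;
}

/* Runs both paths concurrently; returns how many managers stay registered. */
int demo_run(void)
{
	struct demo_dev dev;
	struct demo_mgr mgrs[2];
	pthread_t t_fini, t_reset;
	int remaining = 0;

	pthread_mutex_init(&dev.userq_mutex, NULL);
	for (size_t i = 0; i < MAX_MGRS; i++)
		dev.mgrs[i] = NULL;
	for (size_t i = 0; i < 2; i++) {
		pthread_mutex_init(&mgrs[i].userq_mutex, NULL);
		mgrs[i].dev = &dev;
		mgrs[i].queues_stopped = 0;
		dev.mgrs[i] = &mgrs[i];
	}

	pthread_create(&t_fini, NULL, fini_thread, &mgrs[0]);
	pthread_create(&t_reset, NULL, reset_thread, &dev);
	pthread_join(t_fini, NULL);
	pthread_join(t_reset, NULL);

	for (size_t i = 0; i < MAX_MGRS; i++)
		if (dev.mgrs[i])
			remaining++;

	for (size_t i = 0; i < 2; i++)
		pthread_mutex_destroy(&mgrs[i].userq_mutex);
	pthread_mutex_destroy(&dev.userq_mutex);
	return remaining;
}
```

Calling demo_run() from a main() and linking with -pthread runs both paths in parallel. If demo_mgr_fini() took the manager lock before the device lock, the two threads could each hold one lock and wait on the other, which is the circular dependency the commit message describes.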
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c')
0 files changed, 0 insertions, 0 deletions