aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
Commit message (Collapse)AuthorAgeFilesLines
* amd/amdkfd: correct mem limit calculation for small APUsYifan Zhang2025-09-091-12/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current mem limit check leaks some GTT memory (reserved_for_pt reserved_for_ras + adev->vram_pin_size) for small APUs. Since carveout VRAM is tunable on APUs, there are three case regarding the carveout VRAM size relative to GTT: 1. 0 < carveout < gtt apu_prefer_gtt = true, is_app_apu = false 2. carveout > gtt / 2 apu_prefer_gtt = false, is_app_apu = false 3. 0 = carveout apu_prefer_gtt = true, is_app_apu = true It doesn't make sense to check below limitation in case 1 (default case, small carveout) because the values in the below expression are mixed with carveout and gtt. adev->kfd.vram_used[xcp_id] + vram_needed > vram_size - reserved_for_pt - reserved_for_ras - atomic64_read(&adev->vram_pin_size) gtt: kfd.vram_used, vram_needed, vram_size carveout: reserved_for_pt, reserved_for_ras, adev->vram_pin_size In case 1, vram allocation will go to gtt domain, skip vram check since ttm_mem_limit check already cover this allocation. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit fa7c99f04f6dd299388e9282812b14e95558ac8e)
* amd/amdkfd: Trigger segfault for early userptr unmmappingShane Xiao2025-05-071-0/+12
| | | | | | | | | | | | | | | If applications unmap the memory before destroying the userptr, it needs trigger a segfault to notify user space to correct the free sequence in VM debug mode. v2: Send gpu access fault to user space v3: Report gpu address to user space, remove unnecessary params v4: update pr_err into one line, remove userptr log info Signed-off-by: Shane Xiao <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: use GFP_NOWAIT for memory allocationsChristian König2025-03-211-4/+4
| | | | | | | | | | | | | In the critical submission path memory allocations can't wait for reclaim since that can potentially wait for submissions to finish. Finally clean that up and mark most memory allocations in the critical path with GFP_NOWAIT. The only exception left is the dma_fence_array() used when no VMID is available, but that will be cleaned up later on. Signed-off-by: Christian König <[email protected]> Acked-by: Srinivasan Shanmugam <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: remove unnecessary cpu domain validationJames Zhu2025-03-051-6/+0
| | | | | | | | before move to GTT domain. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: remove all KFD fences from the BO on releaseChristian König2025-02-211-30/+22
| | | | | | | | | | | | | Remove all KFD BOs from the private dma_resv object. This prevents the KFD from being evict unecessarily when an exported BO is released. Signed-off-by: Christian König <[email protected]> Signed-off-by: James Zhu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Reviewed-and-tested-by: James Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: Fix pasid value leakXiaogang Chen2025-02-131-21/+0
| | | | | | | | | | | | Curret kfd does not allocate pasid values, instead uses pasid value for each vm from graphic driver. So should not prevent graphic driver from releasing pasid values since the values are allocated by graphic driver, not kfd driver anymore. This patch does not stop graphic driver release pasid values. Fixes: 8544374c0f82 ("drm/amdkfd: Have kfd driver use same PASID values from graphic driver") Signed-off-by: Xiaogang Chen <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: add a new flag to manage where VRAM allocations goAlex Deucher2025-02-131-8/+8
| | | | | | | | | | | | | | On big and small APUs we send KFD VRAM allocations to GTT since the carve out is either non-existent or relatively small. However, if someone sets the carve out size to be relatively large, we may end up using GTT rather than VRAM. No change of logic with this patch, but it allows the driver to determine which logic to use based on the carve out size in the future. Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: Have kfd driver use same PASID values from graphic driverXiaogang Chen2025-02-131-21/+0
| | | | | | | | | | | | | | | | | Current kfd driver has its own PASID value for a kfd process and uses it to locate vm at interrupt handler or mapping between kfd process and vm. That design is not working when a physical gpu device has multiple spatial partitions, ex: adev in CPX mode. This patch has kfd driver use same pasid values that graphic driver generated which is per vm per pasid. These pasid values are passed to fw/hardware. We do not need change interrupt handler though more pasid values are used. Also, pasid values at log are replaced by user process pid; pasid values are not exposed to user. Users see their process pids that have meaning in user space. Signed-off-by: Xiaogang Chen <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Failed to check various return codeAndrew Martin2024-12-181-7/+7
| | | | | | | | | Clean up code to quiet the compiler on us failing to check the return code. Signed-off-by: Andrew Martin <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: Not restore userptr buffer if kfd process has been removedXiaogang Chen2024-10-221-4/+7
| | | | | | | | | When kfd process has been terminated not restore userptr buffer after mmu notifier invalidates a range. Signed-off-by: Xiaogang Chen <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: Fix an eviction fence leakLang Yu2024-10-071-2/+2
| | | | | | | | | | Only creating a new reference for each process instead of each VM. Fixes: 9a1c1339abf9 ("drm/amdkfd: Run restore_workers on freezable WQs") Suggested-by: Felix Kuehling <[email protected]> Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: remove amdgpu_pin_restricted()Christian König2024-09-181-1/+1
| | | | | | | | | We haven't used the functionality to pin BOs in a certain range at all while the driver existed. Just nuke it. Signed-off-by: Christian König <[email protected]> Acked-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: fix a race in kfd_mem_export_dmabuf()Al Viro2024-09-101-9/+3
| | | | | | | | | | | | | | | Using drm_gem_prime_handle_to_fd() to set dmabuf up and insert it into descriptor table, only to have it looked up by file descriptor and remove it from descriptor table is not just too convoluted - it's racy; another thread might have modified the descriptor table while we'd been going through that song and dance. Switch kfd_mem_export_dmabuf() to using drm_gem_prime_handle_to_dmabuf() and leave the descriptor table alone... Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Al Viro <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: Ensure user queue buffers residencyPhilip Yang2024-07-231-2/+12
| | | | | | | | | | | | | | Add atomic queue_refcount to struct bo_va, return -EBUSY to fail unmap BO from the GPU if the bo_va queue_refcount is not zero. Create queue to increase the bo_va queue_refcount, destroy queue to decrease the bo_va queue_refcount, to ensure the queue buffers mapped on the GPU when queue is active. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: Refactor queue wptr_bo GART mappingPhilip Yang2024-07-231-2/+3
| | | | | | | | | | | | | | | | | | | Add helper function kfd_queue_acquire_buffers to get queue wptr_bo reference from queue write_ptr if it is mapped to the KFD node with expected size. Add wptr_bo to structure queue_properties because structure queue is allocated after queue buffers are validated, then we can remove wptr_bo parameter from pqm_create_queue. Rename structure queue wptr_bo_gart to hold wptr_bo reference for GART mapping and umapping. Move MES wptr_bo_gart mapping to init_user_queue, the same location with queue ctx_bo GART mapping. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: kfd_bo_mapped_dev support partitionPhilip Yang2024-07-231-2/+3
| | | | | | | | | | | Change amdgpu_amdkfd_bo_mapped_to_dev to use drm_priv as parameter instead of adev, to support spatial partition. This is only used by CRIU checkpoint restore now. No functional change. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Estimate RAS reservation when report capacity v2Hawking Zhang2024-06-051-2/+7
| | | | | | | | | | | Add estimate of how much vram we need to reserve for RAS when caculating the total available vram. v2: apply the change to MP0 v13_0_2 and v13_0_14 Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: simplify APU VRAM handlingAlex Deucher2024-05-291-8/+8
| | | | | | | | | | | | | | With commit 89773b85599a ("drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs") big and small APU "VRAM" handling in KFD was unified. Since AMD_IS_APU is set for both big and small APUs, we can simplify the checks in the code. v2: clean up a few more places (Lang) Acked-by: Felix Kuehling <[email protected]> Reviewed-by: Lang Yu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/kfd: Correct pinned buffer handling at kfd restore and validate processXiaogang Chen2024-05-171-4/+5
| | | | | | | | | | This reverts commit 8a774fe912ff ("drm/amdgpu: avoid restore process run into dead loop") since buffer got pinned is not related whether it needs mapping And skip buffer validation at kfd driver if the buffer has been pinned. Signed-off-by: Xiaogang Chen <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: Remove arbitrary timeout for hmm_range_faultPhilip Yang2024-05-131-1/+4
| | | | | | | | | | | | | | | | On system with khugepaged enabled and user cases with THP buffer, the hmm_range_fault may takes > 15 seconds to return -EBUSY, the arbitrary timeout value is not accurate, cause memory allocation failure. Remove the arbitrary timeout value, return EAGAIN to application if hmm_range_fault return EBUSY, then userspace libdrm and Thunk will call ioctl again. Change EAGAIN to debug message as this is not error. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: Let VRAM allocations go to GTT domain on small APUsLang Yu2024-05-081-9/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | Small APUs(i.e., consumer, embedded products) usually have a small carveout device memory which can't satisfy most compute workloads memory allocation requirements. We can't even run a Basic MNIST Example with a default 512MB carveout. https://github.com/pytorch/examples/tree/main/mnist. Error Log: "torch.cuda.OutOfMemoryError: HIP out of memory. Tried to allocate 84.00 MiB. GPU 0 has a total capacity of 512.00 MiB of which 0 bytes is free. Of the allocated memory 103.83 MiB is allocated by PyTorch, and 22.17 MiB is reserved by PyTorch but unallocated" Though we can change BIOS settings to enlarge carveout size, which is inflexible and may bring complaint. On the other hand, the memory resource can't be effectively used between host and device. The solution is MI300A approach, i.e., let VRAM allocations go to GTT. Then device and host can flexibly and effectively share memory resource. v2: Report local_mem_size_private as 0. (Felix) Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: handle duplicate BOs in reserve_bo_and_cond_vmsLang Yu2024-05-021-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Observed on gfx8 ASIC where KFD_IOC_ALLOC_MEM_FLAGS_AQL_QUEUE_MEM is used. Two attachments use the same VM, root PD would be locked twice. [ 57.910418] Call Trace: [ 57.793726] ? reserve_bo_and_cond_vms+0x111/0x1c0 [amdgpu] [ 57.793820] amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu+0x6c/0x1c0 [amdgpu] [ 57.793923] ? idr_get_next_ul+0xbe/0x100 [ 57.793933] kfd_process_device_free_bos+0x7e/0xf0 [amdgpu] [ 57.794041] kfd_process_wq_release+0x2ae/0x3c0 [amdgpu] [ 57.794141] ? process_scheduled_works+0x29c/0x580 [ 57.794147] process_scheduled_works+0x303/0x580 [ 57.794157] ? __pfx_worker_thread+0x10/0x10 [ 57.794160] worker_thread+0x1a2/0x370 [ 57.794165] ? __pfx_worker_thread+0x10/0x10 [ 57.794167] kthread+0x11b/0x150 [ 57.794172] ? __pfx_kthread+0x10/0x10 [ 57.794177] ret_from_fork+0x3d/0x60 [ 57.794181] ? __pfx_kthread+0x10/0x10 [ 57.794184] ret_from_fork_asm+0x1b/0x30 Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: Evict BO itself for contiguous allocationPhilip Yang2024-04-301-1/+18
| | | | | | | | | | | | If the BO pages pinned for RDMA is not contiguous on VRAM, evict it to system memory first to free the VRAM space, then allocate contiguous VRAM space, and then move it from system memory back to VRAM. v6: user context should use interruptible call (Felix) Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Support contiguous VRAM allocationPhilip Yang2024-04-301-0/+4
| | | | | | | | | | | | | | RDMA device with limited scatter-gather ability requires contiguous VRAM buffer allocation for RDMA peer direct support. Add a new KFD alloc memory flag and store as bo alloc flag AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS. When pin this bo to export for RDMA peerdirect access, this will set TTM_PL_FLAG_CONTIFUOUS flag, and ask VRAM buddy allocator to get contiguous VRAM. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Fix VRAM memory accountingMukul Joshi2024-04-261-1/+1
| | | | | | | | | | Subtract the VRAM pinned memory when checking for available memory in amdgpu_amdkfd_reserve_mem_limit function since that memory is not available for use. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: make sure VM is ready for updating operationsLang Yu2024-04-191-14/+20
| | | | | | | | | | | | | | | | | | | When page table BOs were evicted but not validated before updating page tables, VM is still in evicting state, amdgpu_vm_update_range returns -EBUSY and restore_process_worker runs into a dead loop. v2: Split the BO validation and page table update into two separate loops in amdgpu_amdkfd_restore_process_bos. (Felix) 1.Validate BOs 2.Validate VM (and DMABuf attachments) 3.Update page tables for the BOs validated above Fixes: 50661eb1a2c8 ("drm/amdgpu: Auto-validate DMABuf imports in compute VMs") Signed-off-by: Lang Yu <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Fix leak when GPU memory allocation failsMukul Joshi2024-04-191-0/+1
| | | | | | | | | Free the sync object if the memory allocation fails for any reason. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Handle duplicate BOs during process restoreMukul Joshi2024-03-201-4/+10
| | | | | | | | | | | | In certain situations, some apps can import a BO multiple times (through IPC for example). To restore such processes successfully, we need to tell drm to ignore duplicate BOs. While at it, also add additional logging to prevent silent failures when process restore fails. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* amd/amdkfd: remove unused parameterEric Huang2024-02-281-2/+1
| | | | | | | | | | The adev can be found from bo by amdgpu_ttm_adev(bo->tbo.bdev), and adev is also not used in the function amdgpu_amdkfd_map_gtt_bo_to_gart(). Signed-off-by: Eric Huang <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: reserve the BO before validating itLang Yu2024-01-311-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a warning. v2: Avoid unmapping attachment repeatedly when ERESTARTSYS. v3: Lock the BO before accessing ttm->sg to avoid race conditions.(Felix) [ 41.708711] WARNING: CPU: 0 PID: 1463 at drivers/gpu/drm/ttm/ttm_bo.c:846 ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.708989] Call Trace: [ 41.708992] <TASK> [ 41.708996] ? show_regs+0x6c/0x80 [ 41.709000] ? ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.709008] ? __warn+0x93/0x190 [ 41.709014] ? ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.709024] ? report_bug+0x1f9/0x210 [ 41.709035] ? handle_bug+0x46/0x80 [ 41.709041] ? exc_invalid_op+0x1d/0x80 [ 41.709048] ? asm_exc_invalid_op+0x1f/0x30 [ 41.709057] ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu] [ 41.709185] ? ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.709197] ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu] [ 41.709337] ? srso_alias_return_thunk+0x5/0x7f [ 41.709346] kfd_mem_dmaunmap_attachment+0x9e/0x1e0 [amdgpu] [ 41.709467] amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x56/0x80 [amdgpu] [ 41.709586] kfd_ioctl_unmap_memory_from_gpu+0x1b7/0x300 [amdgpu] [ 41.709710] kfd_ioctl+0x1ec/0x650 [amdgpu] [ 41.709822] ? __pfx_kfd_ioctl_unmap_memory_from_gpu+0x10/0x10 [amdgpu] [ 41.709945] ? srso_alias_return_thunk+0x5/0x7f [ 41.709949] ? tomoyo_file_ioctl+0x20/0x30 [ 41.709959] __x64_sys_ioctl+0x9c/0xd0 [ 41.709967] do_syscall_64+0x3f/0x90 [ 41.709973] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Fixes: 101b8104307e ("drm/amdkfd: Move dma unmapping after TLB flush") Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Auto-validate DMABuf imports in compute VMsFelix Kuehling2024-01-151-17/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DMABuf imports in compute VMs are not wrapped in a kgd_mem object on the process_info->kfd_bo_list. There is no explicit KFD API call to validate them or add eviction fences to them. This patch automatically validates and fences dymanic DMABuf imports when they are added to a compute VM. Revalidation after evictions is handled in the VM code. v2: * Renamed amdgpu_vm_validate_evicted_bos to amdgpu_vm_validate * Eliminated evicted_user state, use evicted state for VM BOs and user BOs * Fixed and simplified amdgpu_vm_fence_imports, depends on reserved BOs * Moved dma_resv_reserve_fences for amdgpu_vm_fence_imports into amdgpu_vm_validate, outside the vm->status_lock * Added dummy version of amdgpu_amdkfd_bo_validate_and_fence for builds without KFD v4: Eliminate amdgpu_vm_fence_imports. It's not needed because the reservation with its fences is shared with the export, as long as all imports are from KFD, with the exports already reserved, validated and fenced by the KFD restore worker. v5: Reintroduced separate evicted_user state to simplify the state machine and CS error handling when amdgpu_vm_validate is called without a ticket. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: Fix sparse __rcu annotation warningsFelix Kuehling2024-01-091-2/+2
| | | | | | | | | | | Properly mark kfd_process->ef as __rcu and consistently use the right accessor functions. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* Merge tag 'drm-msm-next-2023-12-15' of ↵Dave Airlie2023-12-191-4/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://gitlab.freedesktop.org/drm/msm into drm-next Updates for v6.8: Core: - Add support for SDM670, SM8650 - Handle the CFG interconnect to fix the obscure hangs / timeouts on register write - Kconfig fix for QMP dependency - DT schema fixes DPU: - Add support for SDM670, SM8650 - Enable SmartDMA on SM8350 and SM8450 - Correct UBWC settings for SC8280XP - Fix catalog settings for SC8180X - Actually make use of the version to switch between QSEED3/3LITE/4 scalers - Use devres-managed and drm-managed allocations where appropriate - misc other fixes - Enabled YUV writeback on SC7280, SM8250 - Enabled writeback on SM8350, SM8450 - CRC fix when encoder is selected as the input source - other misc fixes MDP4: - Use devres-managed and drm-managed allocations where appropriate - flush vblank event on CRTC disable MDP5: - Use devres-managed and drm-managed allocations where appropriate DP: - Add support for SM8650 - Enable PM runtime support - Merge msm-specific debugfs dir with the generic one - Described DisplayPort on SM8150 in DeviceTree bindings - Moved dp_display_get_next_bridge() to probe() DSI: - Add support for SM8650 - Enable PM runtime support GPU/GEM: - demote userspace triggerable warnings to debug - add GEM object metadata UAPI - move GPU devcoredumps to GPU device - fix hangcheck to skip retired submits - expose UBWC config to userspace - fix a680 chip-id - drm_exec conversion - drm/ci: remove rebase-merge directory (to unblock CI) [airlied: fix drm_exec/amd interaction] Signed-off-by: Dave Airlie <[email protected]> From: Rob Clark <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGs9auYqmo-7NSd9FsbNBCDf7aBevd=4xkcF3A5G_OGvMQ@mail.gmail.com
| * drm/exec: Pass in initial # of objectsRob Clark2023-12-101-4/+4
| | | | | | | | | | | | | | | | | | In cases where the # is known ahead of time, it is silly to do the table resize dance. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Christian König <[email protected]> Patchwork: https://patchwork.freedesktop.org/patch/568338/
* | drm/amdkfd: Import DMABufs for interop through DRMFelix Kuehling2023-12-131-20/+44
| | | | | | | | | | | | | | | | | | | | | | | | Use drm_gem_prime_fd_to_handle to import DMABufs for interop. This ensures that a GEM handle is created on import and that obj->dma_buf will be set and remain set as long as the object is imported into KFD. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Ramesh Errabolu <[email protected]> Reviewed-by: Xiaogang.Chen <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* | drm/amdkfd: Export DMABufs from KFD using GEM handlesFelix Kuehling2023-12-131-7/+26
| | | | | | | | | | | | | | | | | | | | Create GEM handles for exporting DMABufs using GEM-Prime APIs. The GEM handles are created in a drm_client_dev context to avoid exposing them in user mode contexts through a DMABuf import. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Ramesh Errabolu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* | drm/amdkfd: Run restore_workers on freezable WQsFelix Kuehling2023-11-291-19/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make restore workers freezable so we don't have to explicitly flush them in suspend and GPU reset code paths, and we don't accidentally try to restore BOs while the GPU is suspended. Not having to flush restore_work also helps avoid lock/fence dependencies in the GPU reset case where we're not allowed to wait for fences. A side effect of this is, that we can now have multiple concurrent threads trying to signal the same eviction fence. Rework eviction fence signaling and replacement to account for that. The GPU reset path can no longer rely on restore_process_worker to resume queues because evict/restore workers can run independently of it. Instead call a new restore_process_helper directly. This is an RFC and request for testing. v2: - Reworked eviction fence signaling - Introduced restore_process_helper v3: - Handle unsignaled eviction fences in restore_process_bos Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Tested-by: Emily Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* | drm/amdgpu: update mappings not managed by KFDFelix Kuehling2023-11-171-6/+22
|/ | | | | | | | | | | | | When restoring after an eviction, use amdgpu_vm_handle_moved to update BO VA mappings in KFD VMs that are not managed through the KFD API. This should allow using the render node API to create more flexible memory mappings in KFD VMs. v2: rebase on drm_exec changes (Alex) Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* Merge tag 'drm-next-2023-11-07' of git://anongit.freedesktop.org/drm/drmLinus Torvalds2023-11-081-31/+48
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull more drm updates from Dave Airlie: "Geert pointed out I missed the renesas reworks in my main pull, so this pull contains the renesas next work for atomic conversion and DT support. It also contains a bunch of amdgpu and some small ssd13xx fixes. renesas: - atomic conversion - DT support ssd13xx: - dt binding fix for ssd132x - Initialize ssd130x crtc_state to NULL. amdgpu: - Fix RAS support check - RAS fixes - MES fixes - SMU13 fixes - Contiguous memory allocation fix - BACO fixes - GPU reset fixes - Min power limit fixes - GFX11 fixes - USB4/TB hotplug fixes - ARM regression fix - GFX9.4.3 fixes - KASAN/KCSAN stack size check fixes - SR-IOV fixes - SMU14 fixes - PSP13 fixes - Display blend fixes - Flexible array size fixes amdkfd: - GPUVM fix radeon: - Flexible array size fixes" * tag 'drm-next-2023-11-07' of git://anongit.freedesktop.org/drm/drm: (83 commits) drm/amd/display: Enable fast update on blendTF change drm/amd/display: Fix blend LUT programming drm/amd/display: Program plane color setting correctly drm/amdgpu: Query and report boot status drm/amdgpu: Add psp v13 function to query boot status drm/amd/swsmu: remove fw version check in sw_init. drm/amd/swsmu: update smu v14_0_0 driver if and metrics table drm/amdgpu: Add C2PMSG_109/126 reg field shift/masks drm/amdgpu: Optimize the asic type fix code drm/amdgpu: fix GRBM read timeout when do mes_self_test drm/amdgpu: check recovery status of xgmi hive in ras_reset_error_count drm/amd/pm: only check sriov vf flag once when creating hwmon sysfs drm/amdgpu: Attach eviction fence on alloc drm/amdkfd: Improve amdgpu_vm_handle_moved drm/amd/display: Increase frame warning limit with KASAN or KCSAN in dml2 drm/amd/display: Avoid NULL dereference of timing generator drm/amdkfd: Update cache info for GFX 9.4.3 drm/amdkfd: Populate cache info for GFX 9.4.3 drm/amdgpu: don't put MQDs in VRAM on ARM | ARM64 drm/amdgpu/smu13: drop compute workload workaround ...
| * drm/amdgpu: Attach eviction fence on allocFelix Kuehling2023-11-031-31/+48
| | | | | | | | | | | | | | | | | | | | Instead of attaching the eviction fence when a KFD BO is first mapped, attach it when it is allocated or imported. This in preparation to allow KFD BOs to be mapped using the render node API. Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* | Merge tag 'drm-next-2023-10-31-1' of git://anongit.freedesktop.org/drm/drmLinus Torvalds2023-11-011-11/+46
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm updates from Dave Airlie: "Highlights: - AMD adds some more upcoming HW platforms - Intel made Meteorlake stable and started adding Lunarlake - nouveau has a bunch of display rework in prepartion for the NVIDIA GSP firmware support - msm adds a7xx support - habanalabs has finished migration to accel subsystem Detail summary: kernel: - add initial vmemdup-user-array core: - fix platform remove() to return void - drm_file owner updated to reflect owner - move size calcs to drm buddy allocator - let GPUVM build as a module - allow variable number of run-queues in scheduler edid: - handle bad h/v sync_end in EDIDs panfrost: - add Boris as maintainer fbdev: - use fb_ops helpers more - only allow logo use from fbcon - rename fb_pgproto to pgprot_framebuffer - add HPD state to drm_connector_oob_hotplug_event - convert to fbdev i/o mem helpers i915: - Enable meteorlake by default - Early Xe2 LPD/Lunarlake display enablement - Rework subplatforms into IP version checks - GuC based TLB invalidation for Meteorlake - Display rework for future Xe driver integration - LNL FBC features - LNL display feature capability reads - update recommended fw versions for DG2+ - drop fastboot module parameter - added deviceid for Arrowlake-S - drop preproduction workarounds - don't disable preemption for resets - cleanup inlines in headers - PXP firmware loading fix - Fix sg list lengths - DSC PPS state readout/verification - Add more RPL P/U PCI IDs - Add new DG2-G12 stepping - DP enhanced framing support to state checker - Improve shared link bandwidth management - stop using GEM macros in display code - refactor related code into display code - locally enable W=1 warnings - remove PSR watchdog timers on LNL amdgpu: - RAS/FRU EEPROM updatse - IP discovery updatses - GC 11.5 support - DCN 3.5 support - VPE 6.1 support - NBIO 7.11 support - DML2 support - lots of IP updates - use flexible arrays for bo list handling - W=1 fixes - Enable seamless boot in more cases - Enable context type property for HDMI - Rework GPUVM TLB flushing - VCN IB start/size alignment fixes amdkfd: - GC 10/11 fixes - GC 11.5 support - use partial migration in GPU faults radeon: - W=1 Fixes - fix some possible buffer overflow/NULL derefs nouveau: - update uapi for NO_PREFETCH - scheduler/fence fixes - rework suspend/resume for GSP-RM - rework display in preparation for GSP-RM habanalabs: - uapi: expose tsc clock - uapi: block access to eventfd through control device - uapi: force dma-buf export to PAGE_SIZE alignments - complete move to accel subsystem - move firmware interface include files - perform hard reset on PCIe AXI drain event - optimise user interrupt handling msm: - DP: use existing helpers for DPCD - DPU: interrupts reworked - gpu: a7xx (a730/a740) support - decouple msm_drv from kms for headless devices mediatek: - MT8188 dsi/dp/edp support - DDP GAMMA - 12 bit LUT support - connector dynamic selection capability rockchip: - rv1126 mipi-dsi/vop support - add planar formats ast: - rename constants panels: - Mitsubishi AA084XE01 - JDI LPM102A188A - LTK050H3148W-CTA6 ivpu: - power management fixes qaic: - add detach slice bo api komeda: - add NV12 writeback tegra: - support NVSYNC/NHSYNC - host1x suspend fixes ili9882t: - separate into own driver" * tag 'drm-next-2023-10-31-1' of git://anongit.freedesktop.org/drm/drm: (1803 commits) drm/amdgpu: Remove unused variables from amdgpu_show_fdinfo drm/amdgpu: Remove duplicate fdinfo fields drm/amd/amdgpu: avoid to disable gfxhub interrupt when driver is unloaded drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systems drm/amdgpu: Retrieve CE count from ce_count_lo_chip in EccInfo table drm/amdgpu: Identify data parity error corrected in replay mode drm/amdgpu: Fix typo in IP discovery parsing drm/amd/display: fix S/G display enablement drm/amdxcp: fix amdxcp unloads incompletely drm/amd/amdgpu: fix the GPU power print error in pm info drm/amdgpu: Use pcie domain of xcc acpi objects drm/amd: check num of link levels when update pcie param drm/amdgpu: Add a read to GFX v9.4.3 ring test drm/amd/pm: call smu_cmn_get_smc_version in is_mode1_reset_supported. drm/amdgpu: get RAS poison status from DF v4_6_2 drm/amdgpu: Use discovery table's subrevision drm/amd/display: 3.2.256 drm/amd/display: add interface to query SubVP status drm/amd/display: Read before writing Backlight Mode Set Register drm/amd/display: Disable SYMCLK32_SE RCO on DCN314 ...
| * drm/amdgpu: Correctly use bo_va->ref_count in compute VMsXiaogang Chen2023-10-131-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is needed to correctly handle BOs imported into compute VM from gfx. Both kfd and gfx should use same bo_va and set bo_va->ref_count correctly when map the Bos into same VM, otherwise we may trigger kernel general protection when iterate mappings over bo_va's valids or invalids list. Signed-off-by: Felix Kuehling <[email protected]> Signed-off-by: Xiaogang Chen <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Ramesh Errabolu <[email protected]> Tested-by: Xiaogang Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| * drm/amdgpu: Rework KFD memory max limitsRajneesh Bhardwaj2023-10-041-2/+8
| | | | | | | | | | | | | | | | | | | | To allow bigger allocations specially on systems such as GFXIP 9.4.3 that use GTT memory for VRAM allocations, relax the limits to maximize ROCm allocations. Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Rajneesh Bhardwaj <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| * drm/amdkfd: Move dma unmapping after TLB flushPhilip Yang2023-09-261-4/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise GPU may access the stale mapping and generate IOMMU IO_PAGE_FAULT. Move this to inside p->mutex to prevent multiple threads mapping and unmapping concurrently race condition. After kfd_mem_dmaunmap_attachment is removed from unmap_bo_from_gpuvm, kfd_mem_dmaunmap_attachment is called if failed to map to GPUs, and before free the mem attachment in case failed to unmap from GPUs. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| * drm/amdgpu: Add EXT_COHERENT memory allocation flagsDavid Francis2023-09-201-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These flags (for GEM and SVM allocations) allocate memory that allows for system-scope atomic semantics. On GFX943 these flags cause caches to be avoided on non-local memory. On all other ASICs they are identical in functionality to the equivalent COHERENT flags. Corresponding Thunk patch is at https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88 Reviewed-by: David Yat Sin <[email protected]> Signed-off-by: David Francis <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
| * drm/amdkfd: Share the original BO for GTT mappingPhilip Yang2023-08-301-2/+3
| | | | | | | | | | | | | | | | | | | | If mGPUs is on same IOMMU group, or is ram direct mapped, then mGPUs can share the original BO for GTT mapping dma address, without creating new BO from export/import dmabuf. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* | drm/amdkfd: reserve a fence slot while locking the BOChristian König2023-10-231-1/+1
|/ | | | | | | | | | Looks like the KFD still needs this. Signed-off-by: Christian König <[email protected]> Fixes: 8abc1eb2987a ("drm/amdkfd: switch over to using drm_exec v3") Acked-by: Alex Deucher <[email protected]> Acked-by: Felix Kuehling <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
* drm/amdgpu: Remove duplicated includesGUO Zihua2023-08-151-1/+0
| | | | | | | | Remove duplicated includes in amdgpu_amdkfd_gpuvm.c and amdgpu_ttm.c. Resolves checkincludes message. Signed-off-by: GUO Zihua <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* Merge tag 'amd-drm-next-6.6-2023-07-28' of ↵Daniel Vetter2023-08-041-1/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.6-2023-07-28: amdgpu: - Lots of checkpatch cleanups - GFX 9.4.3 updates - Add USB PD and IFWI flashing documentation - GPUVM updates - RAS fixes - DRR fixes - FAMS fixes - Virtual display fixes - Soft IH fixes - SMU13 fixes - Rework PSP firmware loading for other IPs - Kernel doc fixes - DCN 3.0.1 fixes - LTTPR fixes - DP MST fixes - DCN 3.1.6 fixes - SubVP fixes - Display bandwidth calculation fixes - VCN4 secure submission fixes - Allow building DC on RISC-V - Add visible FB info to bo_print_info - HBR3 fixes - Add PSP 14.0 support - GFX9 MCBP fix - GMC10 vmhub index fix - GMC11 vmhub index fix - Create a new doorbell manager - SR-IOV fixes amdkfd: - Cleanup CRIU dma-buf handling - Use KIQ to unmap HIQ - GFX 9.4.3 debugger updates - GFX 9.4.2 debugger fixes - Enable cooperative groups fof gfx11 - SVM fixes radeon: - Lots of checkpatch cleanups Merge conflicts: - drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c The switch to drm eu helpers in 8a206685d36f ("drm/amdgpu: use drm_exec for GEM and CSA handling v2") clashed with the cosmetic cleanups from 30953c4d000b ("drm/amdgpu: Fix style issues in amdgpu_gem.c"). I kept the former since the cleanup up code is gone. - drivers/gpu/drm/amd/amdgpu/atom.c. adf64e214280 ("drm/amd: Avoid reading the VBIOS part number twice") removed code that 992b8fe106ab ("drm/radeon: Replace all non-returning strlcpy with strscpy") polished. From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] [sima: some merge conflict wrangling as noted] Signed-off-by: Daniel Vetter <[email protected]>
| * drm/amdgpu: use a macro to define no xcp partition caseGuchun Chen2023-07-181-1/+2
| | | | | | | | | | | | | | | | | | | | ~0 as no xcp partition is used in several places, so improve its definition by a macro for code consistency. Suggested-by: Christian König <[email protected]> Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>