aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h
Commit message (Collapse)AuthorAgeFilesLines
* drm/amdgpu: Reduce dequeue retry timeout for gfx9 familyHarish Kasiviswanathan2025-03-141-2/+3
| | | | | | | | | | | Dequeue retry timeout controls the interval between checks for unmet conditions. On MI series, reduce this from 0x40 to 0x1 (~ 1 uS). The cost of additional bandwidth consumed by CP when polling memory shouldn't be substantial. Signed-off-by: Harish Kasiviswanathan <[email protected]> Reviewed-by: Jonathan Kim <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: implement per queue sdma reset for gfx 9.4+Jonathan Kim2025-03-051-0/+2
| | | | | | | | | | | | | | To reset hung SDMA queues on GFX 9.4+ for the GFX9 family, a soft reset must be issued through SMU. Since soft resets will reset an entire SDMA engine, use a common KGD call to do the reset as the KGD will handle avoiding a reset of in flight GFX and paging queues on that engine. In addition, create a common call for all reset types to simplify the handling of module parameter settings that block gpu resets. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: support per-queue reset on gfx9Jonathan Kim2024-08-061-0/+9
| | | | | | | | | | | | | | | | | | Support per-queue reset for GFX9. The recommendation is for the driver to target reset the HW queue via a SPI MMIO register write. Since this requires pipe and HW queue info and MEC FW is limited to doorbell reports of hung queues after an unmap failure, scan the HW queue slots defined by SET_RESOURCES first to identify the user queue candidates to reset. Only signal reset events to processes that have had a queue reset. If queue reset fails, fall back to GPU reset. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: Fix reg offset for setting CWSR grace periodMukul Joshi2023-09-111-2/+1
| | | | | | | | | | | | | This patch fixes the case where the code currently passes absolute register address and not the reg offset, which HWS expects, when sending the PM4 packet to set/update CWSR grace period. Additionally, cleanup the signature of build_grace_period_packet_info function as it no longer needs the inst parameter. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Jonathan Kim <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: add kfd2kgd debugger callbacks for GC v9.4.3Eric Huang2023-07-121-3/+7
| | | | | | | | | | Implement the similarities as GC v9.4.2, and the difference for GC v9.4.3 HW spec, i.e. xcc instance. Signed-off-by: Jonathan Kim <[email protected]> Signed-off-by: Eric Huang <[email protected]> Reviewed-by: Jonathan Kim <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: add debug set and clear address watch points operationJonathan Kim2023-06-091-0/+8
| | | | | | | | | | | | | | | Shader read, write and atomic memory operations can be alerted to the debugger as an address watch exception. Allow the debugger to pass in a watch point to a particular memory address per device. Note that there exists only 4 watch points per devices to date, so have the KFD keep track of what watch points are allocated or not. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: add debug wave launch mode operationJonathan Kim2023-06-091-0/+3
| | | | | | | | | | Allow the debugger to set wave behaviour on to either normally operate, halt at launch, trap on every instruction, terminate immediately or stall on allocation. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: add debug wave launch override operationJonathan Kim2023-06-091-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | This operation allows the debugger to override the enabled HW exceptions on the device. On debug devices that only support the debugging of a single process, the HW exceptions are global and set through the SPI_GDBG_TRAP_MASK register. Because they are global, only address watch exceptions are allowed to be enabled. In other words, the debugger must preserve all non-address watch exception states in normal mode operation by barring a full replacement override or a non-address watch override request. For multi-process debugging, all HW exception overrides are per-VMID so all exceptions can be overridden or fully replaced. In order for the debugger to know what is permissible, returned the supported override mask back to the debugger along with the previously enable overrides. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: add configurable grace period for unmap queuesJonathan Kim2023-06-091-0/+6
| | | | | | | | | | | | | | | | | | The HWS schedule allows a grace period for wave completion prior to preemption for better performance by avoiding CWSR on waves that can potentially complete quickly. The debugger, on the other hand, will want to inspect wave status immediately after it actively triggers preemption (a suspend function to be provided). To minimize latency between preemption and debugger wave inspection, allow immediate preemption by setting the grace period to 0. Note that setting the preepmtion grace period to 0 will result in an infinite grace period being set due to a CP FW bug so set it to 1 for now. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: add gfx10 hw debug mode enable and disable callsJonathan Kim2023-06-091-0/+28
Similar to GFX9 debug devices, set the hardware debug mode by draining the SPI appropriately prior the mode setting request. Because GFX10 has waves allocated by the work group boundary and each SE's SPI instances do not communicate, the SPI drain time is much longer. This long drain time will be fixed for GFX11 onwards. Also remove a bunch of deprecated misplaced references for GFX10.3. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>