aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
Commit message (Collapse)AuthorAgeFilesLines
* drm/amdgpu: move reset support type checks into the callerAlex Deucher2025-07-171-3/+0
| | | | | | | | Rather than checking in the callbacks, check if the reset type is supported in the caller. Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/gfx9.4.3: re-emit unprocessed state on kcq resetAlex Deucher2025-07-171-7/+2
| | | | | | | | Re-emit the unprocessed state after resetting the queue. Reviewed-by: Jesse Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Check SQ_CONFIG register support on SRIOVTony Yi2025-07-161-1/+3
| | | | | | | | | On SRIOV environments, check if RLCG supports SQ_CONFIG register programming. Signed-off-by: Tony Yi <[email protected]> Reviewed-by: Zhigang Luo <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: clean up GC reset functionsAlex Deucher2025-07-161-5/+9
| | | | | | | Make them consistent and use the reset flags. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/gfx9.4.3: fix kiq locking in KCQ resetAlex Deucher2025-07-151-2/+1
| | | | | | | | | The ring test needs to be inside the lock. Fixes: 4c953e53cc34 ("drm/amdgpu/gfx_9.4.3: wait for reset done before remap") Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: Jiadong Zhu <[email protected]>
* drm/amdgpu: move scheduler wqueue handling into callbacksAlex Deucher2025-06-301-0/+3
| | | | | | | | | | | Move the scheduler wqueue stopping and starting into the ring reset callbacks. On some IPs we have to reset an engine which may have multiple queues. Move the wqueue handling into the backend so we can handle them as needed based on the type of reset available. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: move force completion into ring resetsAlex Deucher2025-06-301-1/+6
| | | | | | | | | | Move the force completion handling into each ring reset function so that each engine can determine whether or not it needs to force completion on the jobs in the ring. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update ring reset function signatureAlex Deucher2025-06-301-1/+2
| | | | | | | | Going forward, we'll need more than just the vmid. Add the guilty amdgpu_fence. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdkfd: Support chain runlists of XNACK+/XNACK-Amber Lin2025-05-161-0/+31
| | | | | | | | | | | | If the MEC firmware supports chaining runlists of XNACK+/XNACK- processes, set SQ_CONFIG1 chicken bit and SET_RESOURCES bit 28. When the MEC/HWS supports it, KFD checks the XNACK+/XNACK- processes mix happens or not. If it does, enter over-subscription. Signed-off-by: Amber Lin <[email protected]> Reviewed-by: Philip Yang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Add GFX 9.5.0 support for per-queue/pipe resetJesse.Zhang2025-05-131-3/+7
| | | | | | | | | | | | | | | | | | | This patch enables per-queue and per-pipe reset functionality for GFX IP v9.5.0 when using MEC firmware version 21 (0x15) or later. This change: 1. Refactors the pipe reset support check in gfx_v9_4_3_pipe_reset_support() to use the compute_supported_reset flags instead of hardcoding version checks. 2. Adds support for GFX9.5.0 (IP 9.5.0) with MEC firmware version >= 21 to enable per-queue and per-pipe reset capabilities. v2: Replaced mec version check with !!(adev->gfx.compute_supported_reset & AMDGPU_RESET_TYPE_PER_PIPE) (Lijo) Signed-off-by: Asad Kamal <[email protected]> Signed-off-by: Jesse Zhang <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/gfx9.4.3: dump full CP packet header FIFOsAlex Deucher2025-04-071-15/+37
| | | | | | | | | In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]> Reviewed-by: Sunil Khatri <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: stop unmapping MQD for kernel queues v3Christian König2025-03-261-49/+10
| | | | | | | | | | | | | | This looks unnecessary and actually extremely harmful since using kmap() is not possible while inside the ring reset. Remove all the extra mapping and unmapping of the MQDs. v2: also fix debugfs v3: fix coding style typo Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Parse all deferred errors with UMC aca handleXiang Liu2025-03-261-5/+3
| | | | | | | | We should only increase the deferred errors in UMC block. Signed-off-by: Xiang Liu <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Use correct gfx deferred error countXiang Liu2025-03-261-3/+4
| | | | | | | | | | In the case of parsing GFX deferred error from SMU corrected error channel, the error count should be set to 1 instead of parsing from MISC0 register, which is 0. Signed-off-by: Xiang Liu <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Decode deferred error type in gfx aca bank parserXiang Liu2025-03-211-3/+2
| | | | | | | | | | | | | | In the case of injecting uncorrected error with background workload, the deferred error among uncorrected errors need to be specified by checking the deferred and poison bits of status register. v2: refine checking for deferred error v2: log possiable DEs among CEs v2: generate CPER records for DEs among UEs Signed-off-by: Xiang Liu <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: set CP_HQD_PQ_DOORBELL_CONTROL.DOORBELL_MODE to 1 for sriov ↵Emily Deng2025-03-141-1/+1
| | | | | | | | | | multiple vf. In sriov multiple vf, Set CP_HQD_PQ_DOORBELL_CONTROL.DOORBELL_MODE to 1 to read WPTR from MQD. Signed-off-by: Emily Deng <[email protected]> Acked-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: fix the gb_addr_config_fields init value mismatchShiwu Zhang2025-03-071-5/+1
| | | | | | | | | | | | | | For gfx_v9_4_3 specifically, before regGB_ADDR_CONFIG is overwritten in gfx hw_init it is read out to popluate the gb_addr_config_fields in the sw_init stage, which causes mismatch. Fix it by using the golden value in sw_init as well. v2: This is a driver-set golden reg and keep as it is (Lijo) Signed-off-by: Shiwu Zhang <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: retire ip init code specific for A0 revShiwu Zhang2025-03-071-12/+1
| | | | | | | | | For aqua_vanjaram, A0 HW is retired so remove the code specific for it in gfx ip init. Signed-off-by: Shiwu Zhang <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Decode deferred error type in aca bank parserXiang Liu2025-02-271-2/+3
| | | | | | | | | | | In the case of poison inband log, the error type need to be specified by checking the deferred or poison bit of status register. v2: check both deferred and poison bit Signed-off-by: Xiang Liu <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Remove redundant logic in GC v9.4.3Lijo Lazar2025-02-251-16/+1
| | | | | | | | | GFXOFF check is not needed for GC v9.4.3. Also, save/restore list is available by default. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in is_idleSunil Khatri2025-02-251-3/+3
| | | | | | | | | Update the *handle to amdgpu_ip_block ptr for all functions pointers of is_idle. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in get_clockgating_stateSunil Khatri2025-02-191-2/+2
| | | | | | | | | Update the *handle to amdgpu_ip_block ptr for all functions pointers of get_clockgating_state. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Include ACA error type in aca bankHawking Zhang2025-02-171-0/+2
| | | | | | | | | | | | | | | ACA error types managed by driver a direct 1:1 correspondence with those managed by firmware. To address this, for each ACA bank, include both the ACA error type and the ACA SMU type. This addition is useful for creating CPER records. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Yang Wang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Clean up GFX v9.4.3 IP version checksLijo Lazar2025-02-131-52/+16
| | | | | | | | | Remove unnecessary IP version checks for GFX 9.4.3 and similar variants. Wrap checks inside meaningful function. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Asad Kamal <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Fix for MEC SJT FW Load Fail on VFyfeng12025-01-061-2/+7
| | | | | | | | | Users might switch to ROCM build does not include MEC SJT FW and driver needs to consider this case.w Signed-off-by: yfeng1 <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: drop the amdgpu_device argument from amdgpu_ib_freePierre-Eric Pelloux-Prayer2024-12-181-1/+1
| | | | | | | | | It's unused. Signed-off-by: Pierre-Eric Pelloux-Prayer <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: move common ACA ipid defines into amdgpu_aca.hYang Wang2024-12-101-4/+0
| | | | | | | | | move common ACA ipid defines into amdgpu_aca.h file. Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amd: Add the capability to mark certain firmware as "required"Mario Limonciello2024-12-101-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | Some of the firmware that is loaded by amdgpu is not actually required. For example the ISP firmware on some SoCs is optional, and if it's not present the ISP IP block just won't be initialized. The firmware loader core however will show a warning when this happens like this: ``` Direct firmware load for amdgpu/isp_4_1_0.bin failed with error -2 ``` To avoid confusion for non-required firmware, adjust the amd-ucode helper to take an extra argument indicating if the firmware is required or optional. On optional firmware use firmware_request_nowarn() instead of request_firmware() to avoid the warnings. Reviewed-by: Alex Deucher <[email protected]> Link: https://lore.kernel.org/amd-gfx/[email protected]/T/#t Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Apply gc v9_5_0 golden settingsHawking Zhang2024-12-101-6/+10
| | | | | | | | Apply gc v9_5_0 golden settings. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Asad Kamal <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: add initial support for gfx950Le Ma2024-12-101-0/+3
| | | | | | | | add gfx950 basic support Signed-off-by: Le Ma <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/gfx: add gfx950 microcodeLe Ma2024-12-101-0/+2
| | | | | | | | Add firmware declarations. Signed-off-by: Le Ma <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Remove gfxoff usageLijo Lazar2024-12-101-4/+0
| | | | | | | | | GFXOFF is not valid for these IP versions. Also, SDMA v4.4.2 is not in GFX domain. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: pass ip_block in set_clockgating_stateBoyuan Zhang2024-12-101-2/+2
| | | | | | | | | | | | Pass ip_block instead of adev in set_clockgating_state() callback functions. Modify set_clockgating_state()for all correspoding ip blocks. v2: remove all changes for is_idle(), remove type casting Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Sunil Khatri <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: pass ip_block in set_powergating_stateBoyuan Zhang2024-12-101-1/+1
| | | | | | | | | | | | | | | | Pass ip_block instead of adev in set_powergating_state callback function. Modify set_powergating_state ip functions for all correspoding ip blocks. v2: fix a ip block index error. v3: remove type casting Signed-off-by: Boyuan Zhang <[email protected]> Suggested-by: Christian König <[email protected]> Acked-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Sunil Khatri <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: reduce the mmio writes in kiq settingPrike Liang2024-12-101-3/+1
| | | | | | | | | | There's no need to perform the two MMIO writes in the KIQ Setting registers programmed period, and reducing the MMIO writes will save the driver loading time. Signed-off-by: Prike Liang <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: use sjt mec fw on gfx943 for sriovVictor Zhao2024-12-051-2/+8
| | | | | | | | | | Use second jump table in sriov for live migration or mulitple VF support so different VF can load different version of MEC as long as they support sjt Signed-off-by: Victor Zhao <[email protected]> Reviewed-by: Yang Wang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Add sysfs interface for gc reset mask[email protected]2024-11-081-0/+13
| | | | | | | | | | | | | | | | | | | | | | | Add two sysfs interfaces for gfx and compute: gfx_reset_mask compute_reset_mask These interfaces are read-only and show the resets supported by the IP. For example, full adapter reset (mode1/mode2/BACO/etc), soft reset, queue reset, and pipe reset. V2: the sysfs node returns a text string instead of some flags (Christian) v3: add a generic helper which takes the ring as parameter and print the strings in the order they are applied (Christian) check amdgpu_gpu_recovery before creating sysfs file itself, and initialize supported_reset_types in IP version files (Lijo) v4: Fixing uninitialized variables (Tim) Signed-off-by: Jesse Zhang <[email protected]> Suggested-by: Alex Deucher <[email protected]> Reviewed-by: Tim Huang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Group gfx sysfs functionsLijo Lazar2024-11-041-5/+0
| | | | | | | | | Make amdgpu_gfx_sysfs_init/fini functions as common entry points for all gfx related sysfs nodes. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* Revert "drm/amdgpu/gfx9: put queue resets behind a debug option"Alex Deucher2024-10-221-6/+0
| | | | | | | | | | | | This reverts commit 7c1a2d8aba6cadde0cc542b2d805edc0be667e79. Extended validation has completed successfully, so enable these features by default. Acked-by: Jiadong Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: Jonathan Kim <[email protected]> Cc: Jiadong Zhu <[email protected]>
* drm/amdgpu: Show current compute partition on VFLijo Lazar2024-10-151-8/+4
| | | | | | | | | Enable sysfs node for current compute partition mode on VFs also. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Vignesh Chander <[email protected]> Tested-by: Vignesh Chander <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: optimize fn gfx_v9_4_3_ring_insert_nopSunil Khatri2024-10-151-4/+1
| | | | | | | | | | | Optimize gfx_v9_4_3_ring_insert_nop() to call optimized version of amdgpu_ring_insert_nop instead of calling amdgpu_ring_write for number of nop times. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in hw_finiSunil Khatri2024-10-071-5/+3
| | | | | | | | | | | | | Update the *handle to amdgpu_ip_block ptr for all functions pointers of hw_fini. Also update the ip_block ptr where ever needed as there were cyclic dependency of hw_fini on suspend and some followed clean up. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in hw_initSunil Khatri2024-10-071-3/+3
| | | | | | | | | | | | | | Update the *handle to amdgpu_ip_block ptr for all functions pointers of hw_init. Also update the ip_block ptr where ever needed as there were cyclic dependency of hw_init on resume. v2: squash in isp fix Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in resumeSunil Khatri2024-10-071-2/+2
| | | | | | | | | Update the *handle to amdgpu_ip_block ptr for all functions pointers of resume. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in suspendSunil Khatri2024-10-071-2/+4
| | | | | | | | | Update the *handle to amdgpu_ip_block ptr for all functions pointers of suspend. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in wait_for_idleSunil Khatri2024-10-071-3/+3
| | | | | | | | | Update the *handle to amdgpu_ip_block ptr for all functions pointers of wait_for_idle. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in soft_resetSunil Khatri2024-10-011-2/+2
| | | | | | | | | Update the *handle to amdgpu_ip_block ptr for all functions pointers of soft_reset. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in sw_finiSunil Khatri2024-10-011-2/+2
| | | | | | | | | update the *handle to amdgpu_ip_block ptr for all functions pointers of sw_fini. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in sw_initSunil Khatri2024-10-011-2/+2
| | | | | | | | | update the *handle to amdgpu_ip_block ptr for all functions pointers of sw_init. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in late_initSunil Khatri2024-10-011-2/+2
| | | | | | | | | Update the ptr handle to amdgpu_ip_block ptr in all the functions of late_init function ptr. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>