| Commit message (Collapse) | Author | Age | Files | Lines |
| |\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull drm fixes from Daniel Vetter:
"Dave's VPN to the big machine died, so it's on me to do fixes pr this
and next week while everyone else is at plumbers.
- big pile of amd fixes, but mostly for hw support newly added in 6.7
- i915 fixes, mostly minor things
- qxl memory leak fix
- vc4 uaf fix in mock helpers
- syncobj fix for DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE"
* tag 'drm-next-2023-11-10' of git://anongit.freedesktop.org/drm/drm: (78 commits)
drm/amdgpu: fix error handling in amdgpu_vm_init
drm/amdgpu: Fix possible null pointer dereference
drm/amdgpu: move UVD and VCE sched entity init after sched init
drm/amdgpu: move kfd_resume before the ip late init
drm/amd: Explicitly check for GFXOFF to be enabled for s0ix
drm/amdgpu: Change WREG32_RLC to WREG32_SOC15_RLC where inst != 0 (v2)
drm/amdgpu: Use correct KIQ MEC engine for gfx9.4.3 (v5)
drm/amdgpu: add smu v13.0.6 pcs xgmi ras error query support
drm/amdgpu: fix software pci_unplug on some chips
drm/amd/display: remove duplicated argument
drm/amdgpu: correct mca debugfs dump reg list
drm/amdgpu: correct acclerator check architecutre dump
drm/amdgpu: add pcs xgmi v6.4.0 ras support
drm/amdgpu: Change extended-scope MTYPE on GC 9.4.3
drm/amdgpu: disable smu v13.0.6 mca debug mode by default
drm/amdgpu: Support multiple error query modes
drm/amdgpu: refine smu v13.0.6 mca dump driver
drm/amdgpu: Do not program PF-only regs in hdp_v4_0.c under SRIOV (v2)
drm/amdgpu: Skip PCTL0_MMHUB_DEEPSLEEP_IB write in jpegv4.0.3 under SRIOV
drm: amd: Resolve Sphinx unexpected indentation warning
...
|
| | |\
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.7-2023-11-10:
amdgpu:
- SR-IOV fixes
- DMCUB fixes
- DCN3.5 fixes
- DP2 fixes
- SubVP fixes
- SMU14 fixes
- SDMA4.x fixes
- Suspend/resume fixes
- AGP regression fix
- UAF fixes for some error cases
- SMU 13.0.6 fixes
- Documentation fixes
- RAS fixes
- Hotplug fixes
- Scheduling entity ordering fix
- GPUVM fixes
Signed-off-by: Daniel Vetter <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When clearing the root PD fails we need to properly release it again.
Signed-off-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
mem = bo->tbo.resource may be NULL in amdgpu_vm_bo_update.
Fixes: 180253782038 ("drm/ttm: stop allocating dummy resources during BO creation")
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We need kernel scheduling entities to deal with handle clean up
if apps are not cleaned up properly. With commit 56e449603f0ac5
("drm/sched: Convert the GPU scheduler to variable number of run-queues")
the scheduler entities have to be created after scheduler init, so
change the ordering to fix this.
v2: Leave logic in UVD and VCE code
Fixes: 56e449603f0a ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The kfd_resume needs to touch GC registers to enable the interrupts,
it needs to be done before GFXOFF is enabled to ensure that the GFX is
not off and GC registers can be touched. So move kfd_resume before the
amdgpu_device_ip_late_init which enables the CGPG/GFXOFF.
Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If a user has disabled GFXOFF this may cause problems for the suspend
sequence. Ensure that it is enabled in amdgpu_acpi_is_s0ix_active().
The system won't reach the deepest state but it also won't hang.
Signed-off-by: Mario Limonciello <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
W/RREG32_RLC is hardedcoded to use instance 0. W/RREG32_SOC15_RLC
should be used instead when inst != 0.
v2: rebase
Signed-off-by: Victor Lu <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
amdgpu_kiq_wreg/rreg is hardcoded to use MEC engine 0.
Add an xcc_id parameter to amdgpu_kiq_wreg/rreg, define W/RREG32_XCC
and amdgpu_device_xcc_wreg/rreg to use the new xcc_id parameter.
Using amdgpu_sriov_runtime to determine whether to access via kiq or
RLC is sufficient for now.
v5: add condition in amdgpu_device_xcc_w/rreg, remove trace func call
v4: avoid using amdgpu_sriov_w/rreg
v3: use W/RREG32_XCC to handle non-kiq case
v2: define amdgpu_device_xcc_wreg/rreg instead of changing parameters
of amdgpu_device_wreg/rreg
Signed-off-by: Victor Lu <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
add pcs xgmi ras error query support for smu v13.0.6.
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When software 'pci unplug' using IGT is executed we got a sysfs directory
entry is NULL for differant ras blocks like hdp, umc, etc.
Before call 'sysfs_remove_file_from_group' and 'sysfs_remove_group'
check that 'sd' is not NULL.
[ +0.000001] RIP: 0010:sysfs_remove_group+0x83/0x90
[ +0.000002] Code: 31 c0 31 d2 31 f6 31 ff e9 9a a8 b4 00 4c 89 e7 e8 f2 a2 ff ff eb c2 49 8b 55 00 48 8b 33 48 c7 c7 80 65 94 82 e8 cd 82 bb ff <0f> 0b eb cc 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90
[ +0.000001] RSP: 0018:ffffc90002067c90 EFLAGS: 00010246
[ +0.000002] RAX: 0000000000000000 RBX: ffffffff824ea180 RCX: 0000000000000000
[ +0.000001] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ +0.000001] RBP: ffffc90002067ca8 R08: 0000000000000000 R09: 0000000000000000
[ +0.000001] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ +0.000001] R13: ffff88810a395f48 R14: ffff888101aab0d0 R15: 0000000000000000
[ +0.000001] FS: 00007f5ddaa43a00(0000) GS:ffff88841e800000(0000) knlGS:0000000000000000
[ +0.000002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000001] CR2: 00007f8ffa61ba50 CR3: 0000000106432000 CR4: 0000000000350ef0
[ +0.000001] Call Trace:
[ +0.000001] <TASK>
[ +0.000001] ? show_regs+0x72/0x90
[ +0.000002] ? sysfs_remove_group+0x83/0x90
[ +0.000002] ? __warn+0x8d/0x160
[ +0.000001] ? sysfs_remove_group+0x83/0x90
[ +0.000001] ? report_bug+0x1bb/0x1d0
[ +0.000003] ? handle_bug+0x46/0x90
[ +0.000001] ? exc_invalid_op+0x19/0x80
[ +0.000002] ? asm_exc_invalid_op+0x1b/0x20
[ +0.000003] ? sysfs_remove_group+0x83/0x90
[ +0.000001] dpm_sysfs_remove+0x61/0x70
[ +0.000002] device_del+0xa3/0x3d0
[ +0.000002] ? ktime_get_mono_fast_ns+0x46/0xb0
[ +0.000002] device_unregister+0x18/0x70
[ +0.000001] i2c_del_adapter+0x26d/0x330
[ +0.000002] arcturus_i2c_control_fini+0x25/0x50 [amdgpu]
[ +0.000236] smu_sw_fini+0x38/0x260 [amdgpu]
[ +0.000241] amdgpu_device_fini_sw+0x116/0x670 [amdgpu]
[ +0.000186] ? mutex_lock+0x13/0x50
[ +0.000003] amdgpu_driver_release_kms+0x16/0x40 [amdgpu]
[ +0.000192] drm_minor_release+0x4f/0x80 [drm]
[ +0.000025] drm_release+0xfe/0x150 [drm]
[ +0.000027] __fput+0x9f/0x290
[ +0.000002] ____fput+0xe/0x20
[ +0.000002] task_work_run+0x61/0xa0
[ +0.000002] exit_to_user_mode_prepare+0x150/0x170
[ +0.000002] syscall_exit_to_user_mode+0x2a/0x50
Cc: Hawking Zhang <[email protected]>
Cc: Luben Tuikov <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Christian Koenig <[email protected]>
Signed-off-by: Vitaly Prosyak <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
avoid driver to touch invalid mca reg.
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
So driver doesn't touch invalid aca entries.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
add pcs xgmi v6.4.0 ras support
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Change local memory type to MTYPE_UC on revision id 0
Signed-off-by: David Yat Sin <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Direct error query mode and firmware error query mode
are supported for now.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
refine smu mca driver to support query ras error from pmfw path.
- correct gfx smu bank hwid (from mp5 to smu bank)
- retire unused callback function in amdgpu_mca_smu_funcs{}
- add new mca_bank_set{} structure to collect mca bank
- move enum mca_reg_idx into amdgpu_mca.h header
- add mca status register field decode macro
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The following regs can only be programmed by the PF:
HDP_MISC_CNTL
HDP_NONSURFACE_BASE
HDP_NONSURFACE_BASE_HI
v2: update commit message
Signed-off-by: Victor Lu <[email protected]>
Reviewed-by: Samir Dhume <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
PCTL0_MMHUB_DEEPSLEEP_IB is blocked for VF access
Signed-off-by: Victor Lu <[email protected]>
Reviewed-by: Samir Dhume <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
correct smu v13.0.0 umc ras error check
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
WREG32/RREG32_SOC15_IP_NO_KIQ and amdgpu_virt_kiq_reg_write_reg_wait
are not using the correct rlcg interface or mec engine, respectively.
Add xcc instance parameter to them.
v4: Use GET_INST and squash commit with:
"drm/amdgpu: Add xcc_inst param to amdgpu_virt_kiq_reg_write_reg_wait"
v3: xcc not needed for MMMHUB
v2: rebase
Signed-off-by: Victor Lu <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The "rlcg_reg_access_supported" flag is missing. Add it back in.
Signed-off-by: Victor Lu <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
correct following amdgpu ip block version information:
- gfx_v9_4_3
- sdma_v4_4_2
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
set_xgmi_plpd_mode may be unsupported and this isn't error, no need to
print warning for it.
v2: add ret2 to save the status of psp_ras_trigger_error.
Suggested-by: [email protected]
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Otherwise userspace can spam the logs by using incorrect input values.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We should not leak the pointer where we couldn't grab the reference
on to the caller because it can be that the error handling still
tries to put the reference then.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The default AGP settings were overwriting the IP selected
ones since the default was getting set after the IP ones
were selected.
Fixes: de59b69932e6 ("drm/amdgpu/gmc: set a default disable value for AGP")
Link: https://lists.freedesktop.org/archives/amd-gfx/2023-November/100966.html
Tested-by: Mikhail Gavrilov <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: Mikhail Gavrilov <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Reset/query RAS error status and count.
v2: use XGMI IP version instead of WAFL version.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The UE registe list is larger than CE list.
Reported-by: [email protected]
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Stanley.Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Doorbell rptr/wptr can be set through multiple ways including direct
register initialization. Disable doorbell during hw_fini once the ring
is disabled so that during next module reload direct initialization
takes effect. Also, move the direct initialization after minor update is
set to 1 since rptr/wptr are reinitialized back to 0 which could be
lower than the previous doorbell value (ex: cases like module reload).
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Reviewed-by: Asad Kamal <[email protected]>
Tested-by: Asad Kamal <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Set the default reset method to mode2 for SMU IP v14.0.0
Signed-off-by: Jiadong Zhu <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The purpose of this patch is to disable XNACK or set XNACK OFF mode
on SRIOV platform which doesn't support it.
This will prevent user-space application to fail or result into
unexpected behaviour whenever the application need to run test-case
in XNACK ON mode.
Signed-off-by: Surbhi Kakarya <[email protected]>
Reviewed-by: Shaoyun Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | |\ \
| | |/
| |/|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-fixes for v6.7-rc1:
qxl:
- qxl memory leak fix.
syncobj:
- Fix waiting for DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE
vc4:
- Fix UAF in mock helpers
Signed-off-by: Daniel Vetter <[email protected]>
From: Maarten Lankhorst <[email protected]>
[sima: Stitch together both changelogs from Maarten. Also because of
branch history this contains a few more bugfixes which are already in
v6.6, but I didn't feel like this justifies some backmerge since there
wasn't any real conflict.]
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
| |\| |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Pull more drm updates from Dave Airlie:
"Geert pointed out I missed the renesas reworks in my main pull, so
this pull contains the renesas next work for atomic conversion and DT
support.
It also contains a bunch of amdgpu and some small ssd13xx fixes.
renesas:
- atomic conversion
- DT support
ssd13xx:
- dt binding fix for ssd132x
- Initialize ssd130x crtc_state to NULL.
amdgpu:
- Fix RAS support check
- RAS fixes
- MES fixes
- SMU13 fixes
- Contiguous memory allocation fix
- BACO fixes
- GPU reset fixes
- Min power limit fixes
- GFX11 fixes
- USB4/TB hotplug fixes
- ARM regression fix
- GFX9.4.3 fixes
- KASAN/KCSAN stack size check fixes
- SR-IOV fixes
- SMU14 fixes
- PSP13 fixes
- Display blend fixes
- Flexible array size fixes
amdkfd:
- GPUVM fix
radeon:
- Flexible array size fixes"
* tag 'drm-next-2023-11-07' of git://anongit.freedesktop.org/drm/drm: (83 commits)
drm/amd/display: Enable fast update on blendTF change
drm/amd/display: Fix blend LUT programming
drm/amd/display: Program plane color setting correctly
drm/amdgpu: Query and report boot status
drm/amdgpu: Add psp v13 function to query boot status
drm/amd/swsmu: remove fw version check in sw_init.
drm/amd/swsmu: update smu v14_0_0 driver if and metrics table
drm/amdgpu: Add C2PMSG_109/126 reg field shift/masks
drm/amdgpu: Optimize the asic type fix code
drm/amdgpu: fix GRBM read timeout when do mes_self_test
drm/amdgpu: check recovery status of xgmi hive in ras_reset_error_count
drm/amd/pm: only check sriov vf flag once when creating hwmon sysfs
drm/amdgpu: Attach eviction fence on alloc
drm/amdkfd: Improve amdgpu_vm_handle_moved
drm/amd/display: Increase frame warning limit with KASAN or KCSAN in dml2
drm/amd/display: Avoid NULL dereference of timing generator
drm/amdkfd: Update cache info for GFX 9.4.3
drm/amdkfd: Populate cache info for GFX 9.4.3
drm/amdgpu: don't put MQDs in VRAM on ARM | ARM64
drm/amdgpu/smu13: drop compute workload workaround
...
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Query boot status and report boot errors. A follow
up change is needed to stop GPU initialization if boot
fails.
v2: only invoke the call for dGPU (Le/Lijo)
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add psp v13 function to query boot status.
v2: limit the use case to dGPU only (Lijo)
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use a new struct array to define the asic information which
asic type needs to be fixed.
Signed-off-by: Ma Jun <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use a proper MEID to make sure the CP_HQD_* and CP_GFX_HQD_* registers
can be touched when initialize the compute and gfx mqd in mes_self_test.
Otherwise, we expect no response from CP and an GRBM eventual timeout.
Signed-off-by: Tim Huang <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Handle xgmi hive case.
Suggested-by: Hawking Zhang <[email protected]>
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Stanley.Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Instead of attaching the eviction fence when a KFD BO is first mapped,
attach it when it is allocated or imported. This in preparation to allow
KFD BOs to be mapped using the render node API.
Signed-off-by: Felix Kuehling <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Let amdgpu_vm_handle_moved update all BO VA mappings of BOs reserved by
the caller. This will be useful for handling extra BO VA mappings in
KFD VMs that are managed through the render node API.
v2: rebase against drm_exec changes (Alex)
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Issues were reported with commit 1cfb4d612127
("drm/amdgpu: put MQDs in VRAM") on an ADLINK Ampere
Altra Developer Platform (AVA developer platform).
Various ARM systems seem to have problems related
to PCIe and MMIO access. In this case, I'm not sure
if this is specific to the ADLINK platform or ARM
in general. Seems to be some coherency issue with
VRAM. For now, just don't put MQDs in VRAM on ARM.
Link: https://lists.freedesktop.org/archives/amd-gfx/2023-October/100453.html
Fixes: 1cfb4d612127 ("drm/amdgpu: put MQDs in VRAM")
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
AMD dGPUs have integrated FW that runs as soon as the
device gets power and initializes the board (determines
the amount of memory, provides configuration details to
the driver, etc.). For direct PCIe attached cards this
happens as soon as power is applied and normally completes
well before the OS has even started loading. However, with
hotpluggable ports like USB4, the driver needs to wait for
this to complete before initializing the device.
This normally takes 60-100ms, but could take longer on
some older boards periodically due to memory training.
Retry for up to a second. In the non-hotplug case, there
should be no change in behavior and this should complete
on the first try.
v2: adjust test criteria
v3: adjust checks for the masks, only enable on removable devices
v4: skip bif_fb_en check
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
[Why] During suspend, if GFX DPM is enabled and GFXOFF feature is
enabled the system may get hung. So, it is suggested to disable
GFXOFF feature during suspend and enable it after resume.
[How] Update the code to disable GFXOFF feature during suspend and enable
it after resume.
[ 311.396526] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
[ 311.396530] amdgpu 0000:03:00.0: amdgpu: Fail to disable dpm features!
[ 311.396531] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
Acked-by: Yang Wang <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Perry Yuan <[email protected]>
Signed-off-by: Kun Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
It's only valid on Intel systems with the Intel VSEC.
Use dev_is_removable() instead. This should do the right
thing regardless of the platform.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The ATRM ACPI method is for fetching the dGPU vbios rom
image on laptops and all-in-one systems. It should not be
used for external add in cards. If the dGPU is thunderbolt
connected, don't try ATRM.
v2: pci_is_thunderbolt_attached only works for Intel. Use
pdev->external_facing instead.
v3: dev_is_removable() seems to be what we want
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Since they were moved to VRAM, we need to use the IO
variants of memcpy.
Fixes: 1cfb4d612127 ("drm/amdgpu: put MQDs in VRAM")
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Switch from mode-1 reset to mode-2 for poison consumption.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Stanley.Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
GFX doorbell range should be set after flr otherwise the gfx doorbell
range will be overlap with MEC.
v2: remove "amdgpu_sriov_vf" and "amdgpu_in_reset" check, and add grbm
select for the case of 2 gfx rings.
Signed-off-by: Lin.Cao <[email protected]>
Acked-by: ZhenGuo Yin <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
remove unused macro HW_REV
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|