| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
atom bios header files are not required in these files.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
Fix the following errors reported by checkpatch:
ERROR: that open brace { should be on the previous line
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
No special requirement in RAS injection for the two versions, switch to
use default injection interface.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Stanley.Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
User can specify injected instances by the mask. For backward
compatibility, the mask value is incorporated into sub block index
without interface change of RAS TA.
User uses logical mask and driver should convert it to physical value
before sending it to RAS TA.
v2: update parameter name.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Stanley.Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1.Modify gfx block to fit for the unified ras block data and ops.
2.Change amdgpu_gfx_ras_funcs to amdgpu_gfx_ras, and the corresponding variable name remove _funcs suffix.
3.Remove the const flag of gfx ras variable so that gfx ras block can be able to be inserted into amdgpu device ras block link list.
4.Invoke amdgpu_ras_register_ras_block function to register gfx ras block into amdgpu device ras block link list.
5.Remove the redundant code about gfx in amdgpu_ras.c after using the unified ras block.
6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of gfx versions. If .ras_late_init and .ras_fini had been defined by the selected gfx version, the defined functions will take effect; if not defined, default fill with amdgpu_gfx_ras_late_init and amdgpu_gfx_ras_fini.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: John Clements <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
SDP RdRspStatus/WrRspStatus or first parity error on
RdRsp data can cause system fatal error in arcturus.
GPU will be freezed in such case.
Driver needs to harvest these error information before
reset the GPU. Check error type to avoid harvest normal
gcea/mmea information.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Stanley Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
gfx ras is only available in cerntain ip generations.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Dennis Li <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GCEA/MMHUB EA error should not result to DF freeze, this is
fixed in next generation, but for some reasons the GCEA/MMHUB
EA error will result to DF freeze in previous generation,
diver should avoid to indicate GCEA/MMHUB EA error as hw fatal
error in kernel message by read GCEA/MMHUB err status registers.
Changed from V1:
make query_ras_error_status function more general
make read mmhub er status register more friendly
Changed from V2:
move ras error status query function into do_recovery workqueue
Changed from V3:
remove useless code from V2, print GCEA error status
instance number
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
When resetting EDC related register, all CUs needs to be visited,
otherwise, garbage data from EDC register of missed SEs would present.
Signed-off-by: Guchun Chen <[email protected]>
Reviewed-by: Dennis Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Besides the intended change, commit 4cc1178e166a ("drm/amdgpu: replace DRM
prefix with PCI device info for gfx/mmhub") also set the source files
mmhub_v1_0.c and gfx_v9_4.c to be executable, i.e., changed fromold mode
644 to new mode 755.
Commit 241b2ec9317e ("drm/amd/display: Add dcn30 Headers (v2)") added the
four header files {dpcs,dcn}_3_0_0_{offset,sh_mask}.h as executable, i.e.,
mode 755.
Set to the usual modes for source and headers files and clean up those
mistakes. No functional change.
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Lukas Bulwahn <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
Prefix RAS message printing in gfx/mmhub with PCI device info,
which assists the debug in multiple GPU case.
Reviewed-by: Guchun Chen <[email protected]>
Signed-off-by: Dennis Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
| |
prefix RAS error related dmesg print with pci device info
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
disallow the logical to be enabled on platforms that
don't support gfx ras at this stage, like sriov skus,
dgpu with legacy ras.etc
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Monk Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
| |
GFX ras error counters are dirty ones after cold reboot
Read operation is needed to reset them to 0
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Guchun Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clang warns:
../drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c:967:35: warning: implicit
conversion from enumeration type 'enum amdgpu_ras_block' to different
enumeration type 'enum ta_ras_block' [-Wenum-conversion]
block_info.block_id = info->head.block;
~ ~~~~~~~~~~~^~~~~
1 warning generated.
Use the function added in commit 828cfa29093f ("drm/amdgpu: Fix amdgpu
ras to ta enums conversion") that handles this conversion explicitly.
Fixes: 4c461d89db4f ("drm/amdgpu: add RAS support for the gfx block of Arcturus")
Link: https://github.com/ClangBuiltLinux/linux/issues/849
Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
Implement functions to do the RAS error injection and
query EDC counter.
Signed-off-by: Dennis Li <[email protected]>
Reviewed-by: Guchun Chen <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|