| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The doorbells should already be reserved, just enable them.
Signed-off-by: Christian König <[email protected]>
Acked-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Disable overflow and enable full drain. This makes fault handling on ring 1
much more reliable since we don't generate back pressure any more.
Signed-off-by: Christian König <[email protected]>
Acked-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The buffers should be cleared when possible but we also don't want
buffer creation to fail in the rare case where the ring isn't ready
during the call. This could happen during some suspend/resume sequences.
Cc: Christian König <[email protected]>
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The dumb_create API isn't intended for high performance rendering
and it's more useful for userspace (ie. IGT) to have them precleared.
The bonus here is that we also won't needlessly leak whatever was
previously in VRAM, but it also probably wasn't sensitive if it was
going through this API.
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:405:2-3: Unneeded semicolon
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:435:2-3: Unneeded semicolon
Remove unneeded semicolon.
Generated by: scripts/coccinelle/misc/semicolon.cocci
CC: xinhui pan <[email protected]>
Signed-off-by: kbuild test robot <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
add ras post init function.
Do some initialization after all IP have finished their late init.
Add new member flags which will control the ras work flow.
For now, vbios enable ras for us on boot. That might change in the
future.
So there should be a flag from vbios to tell us if ras is enabled or not
on boot. Looks like there is no such info now.
Other bits of the flags are reserved to control other parts of ras.
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
add drm info output if ras initialized successfully.
add ras atomfirmware sanity check.
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
lockdep need a static key.
Previously we set ignore bit to avoid the warning.
Now call sysfs_attr_init to initialize the static key.
Signed-off-by: xinhui pan <[email protected]>
Reviewed-and-Tested-by: Andrey Grodzovsky <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Unzero char is accepted by sscanf, so when data is structure but
unexpectedly return error invalid;
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently, it is not clear how ras is supported. Both software and
hardware can set the supported. That is confusing.
Fix it by adding new member hw_supported.
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Set ignore bit to satisfy locpdep.
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Ta is optional, so check if ta firmware is loaded or not.
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The descriptions of modinfo wrongly show two parameters
for each feature(see below). This patch can fix this
incorrect outputs.
parm: amdgpu_ras_enable:Enable RAS features on the GPU (0 = disable, 1 = enable, -1 = auto (default))
parm: ras_enable:int
parm: amdgpu_ras_mask:Mask of RAS features to enable (default 0xffffffff), only valid when ras_enable == 1
parm: ras_mask:uint
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: xinhui pan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
query sram ecc capability via amdgpu_atomfirmware_ecc_default_enabled
query ecc availability via amdgpu_atomfirmware_sram_ecc_supported
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
sram ecc capability could be get from firmware_capability field in firmwareinfo table
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
ecc default status (enabled or disabled) could be get from umc_config field in umc_info table
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Suspend will put irq, so resume need get irq back.
And in the same time, skip other ras initialization.
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
RAS ECC event will combine with GPU reset event, due to
ECC interrupts are caused by uncorrectable error that triggers
GPU reset.
v2: Fix misleading-indentation warning
v3: fix build with CONFIG_HSA_AMD disabled
Signed-off-by: Eric Huang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently, the debugfs control node can't parse bash-like commands.
Now add such support for any tester that uses scripts.
v2: squash in fixes for input validation
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
gpu reset is not stable on vega20 A1.
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Add a query for userspace to check which RAS features
are enabled.
v2: squash in warning fix
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Add AMDGPU_CTX_QUERY2_FLAGS_RAS_CE/UE which indicate if any error happened
between previous query and this query.
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Register ecc interrupts and ecc interrupt handler on gfx9.
Add ras support on gfx9
v2: squash in warning fix
Signed-off-by: Feifei Xu <[email protected]>
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
register IH, enable ras features on sdma.
create sysfs debugfs file for sdma.
Signed-off-by: xinhui pan <[email protected]>
Signed-off-by: Feifei Xu <[email protected]>
Signed-off-by: Eric Huang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Mark vram pages with errors as bad and prevent the driver
from using them.
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
allow userspace enable/disable ras
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
add obj management.
add feature control.
add debugfs infrastructure.
add sysfs infrastructure.
add IH infrastructure.
add recovery infrastructure.
It is a framework. Other IPs need call amdgpu_ras_xxx function instead of
psp_ras_xxx functions.
v2: squash in warning fixes
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Add trigger_error and cure_posion.
Acked-by: Hawking Zhang <[email protected]>
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Add ras fw loading, init, terminate.
Add ras cmd submit helper.
Add ras feature enable/disable common function.
v2: squash in unused variable warning fix
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Define the driver side interface for ras ta.
Acked-by: Hawking Zhang <[email protected]>
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Allow RAS feature enable/disable via boot parameter.
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Output the ta fw, aka xgmi/ras, via debugfs.
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Add ras fw part, xgmi and ras fw are combined together in ta binary.
Reading the data from the info is not implemented yet.
v2: squash in "drm/amdgpu: fix NULL pointer when ta is missing"
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Use function parameter mc as the second parameter of amdgpu_gmc_vram_location,
so codes look more consistent.
Signed-off-by: Oak Zeng <[email protected]>
Reviewed-by: Christian Konig <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Problem:
Using SDMA for TLB invalidation in certain ASICs exposed a problem
of IB pool not being ready while SDMA already up on Init and already
shutt down while SDMA still running on Fini. This caused
IB allocation failure. Temproary fix was commited into a
bringup branch but this is the generic fix.
Fix:
Init IB pool rigth after GMC is ready but before SDMA is ready.
Do th opposite for Fini.
v2: Remove restriction on SDMA early init and move amdgpu_ib_pool_fini
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Andrey Grodzovsky <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Applied vdci flush workaround for Vega20 BACO.
Signed-off-by: Evan Quan <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Driver vote low to high pstate switch whenever there is an outstanding
XGMI mapping request. Driver vote high to low pstate when all the
outstanding XGMI mapping is terminated.
Signed-off-by: shaoyunl <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Adjust vram base offset for XGMI mapping when update the PT entry so
the address will fall into correct XGMI aperture for peer device
Signed-off-by: shaoyunl <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
For each device a file xgmi_device_id is created.
On the first device a subdirectory named xgmi_hive_info is created,
It contains a file named hive_id and symlinks named node 1-4 linking
to each device in the hive.
v2: Return error codes instead of '-1' and few misspellings.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Only GMC9 supports true huge pages, but we can still free invalid mappings
on GMC8.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Acked-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Not needed any more since we now free PDs/PTs on demand.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Acked-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When something is unmapped we now free the affected PDs/PTs again.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Acked-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Let's start to allocate VM PDs/PTs on demand instead of pre-allocating
them during mapping.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Acked-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Instead of providing it from outside figure out the ats status in the
function itself from the data structures.
v2: simplify finding the right level
v3: partially revert changes from v2, more cleanup and split code
into more functions.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Acked-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This way we only deal with the real BO in here.
v2: use a do { ... } while loop instead
v3: fix NULL pointer in v2
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Acked-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|