path: root/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
Commit message | Author | Age | Files | Lines
* drm/amdgpu/mes11: make MES_MISC_OP_CHANGE_CONFIG failure non-fatal | Alex Deucher | 2025-08-29 | 1 | -2/+3
    If the firmware is too old, just warn and return success.

    Fixes: 27b791514789 ("drm/amdgpu/mes: keep enforce isolation up to date")
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4414
    Cc: [email protected]
    Reviewed-by: Shaoyun.liu <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 9f28af76fab0948b59673f69c10aeec47de11c60)
    Cc: [email protected]
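The "warn and return success" behaviour described above can be sketched roughly as follows. This is only an illustration, not the driver code: `demo_change_config`, `DEMO_MIN_FW_FOR_CHANGE_CONFIG` and the version value are hypothetical, since the log does not state the actual firmware threshold.

```c
#include <stdio.h>

/* Hypothetical threshold; the real minimum firmware version is not given in the log. */
#define DEMO_MIN_FW_FOR_CHANGE_CONFIG 0x63u

/* Stand-ins for a packet submission that succeeds or fails. */
static int demo_submit_ok(void)   { return 0; }
static int demo_submit_fail(void) { return -1; }

/*
 * Returns 0 on success, negative on failure.
 * Firmware that is too old only produces a warning and is treated as success.
 */
static int demo_change_config(unsigned int fw_version, int (*submit)(void))
{
    if (fw_version < DEMO_MIN_FW_FOR_CHANGE_CONFIG) {
        fprintf(stderr, "warning: MES firmware 0x%x too old, skipping change-config\n",
                fw_version);
        return 0; /* non-fatal: the caller continues as if the request succeeded */
    }
    return submit();
}
```

With this shape, a failing submission is still reported on new firmware, while old firmware never blocks initialization or resume.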
* drm/amdgpu/mes: add compatibility checks for set_hw_resource_1 | Alex Deucher | 2025-06-24 | 1 | -4/+6
    Seems some older MES firmware versions do not properly support this packet.
    Add back some of the compatibility checks.

    v2: switch to fw version check (Shaoyun)

    Fixes: f81cd793119e ("drm/amd/amdgpu: Fix MES init sequence")
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4295
    Cc: Shaoyun Liu <[email protected]>
    Reviewed-by: shaoyun.liu <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 0180e0a5dd5c6ff118043ee42dbbbddaf881f283)
    Cc: [email protected]
* drm/amdgpu/mes: consolidate on a single mes reset callback | Alex Deucher | 2025-04-30 | 1 | -4/+4
    Use the legacy one as it covers both kernel queues and user queues.

    Reviewed-by: Sunil Khatri <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: remove more unused functions | Alex Deucher | 2025-04-30 | 1 | -26/+0
    These were leftover from mes bring up and are unused.

    Reviewed-by: Sunil Khatri <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Fix API status offset for MES queue reset | Jesse.Zhang | 2025-04-30 | 1 | -1/+1
    The mes_v11_0_reset_hw_queue and mes_v12_0_reset_hw_queue functions were
    using the wrong union type (MESAPI__REMOVE_QUEUE) when getting the offset
    for api_status. Since these functions handle queue reset operations, they
    should use MESAPI__RESET union instead.

    This fixes the polling of API status during hardware queue reset operations
    in the MES for both v11 and v12 versions.

    Signed-off-by: Jesse Zhang <[email protected]>
    Reviewed-By: Shaoyun.liu <[email protected]>
    Reviewed-by: Prike Liang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
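Why the union type matters for the status offset can be shown with a small sketch. The two unions below are invented for illustration (their fields are not the real MESAPI__REMOVE_QUEUE or MESAPI__RESET layouts); the point is only that `offsetof` for a member with the same name differs between packet types, so polling at the offset taken from the wrong type reads the wrong location. Anonymous structs require C11.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status block shared by both demo packets. */
struct demo_api_status {
    uint64_t fence_addr;
    uint64_t fence_value;
};

/* Hypothetical "remove queue" packet layout. */
union demo_remove_queue {
    struct {
        uint32_t header;
        uint32_t doorbell_offset;
        uint64_t gang_context_addr;
        struct demo_api_status api_status;
    };
    uint32_t max_dwords[64];
};

/* Hypothetical "reset" packet layout: more fields precede api_status. */
union demo_reset {
    struct {
        uint32_t header;
        uint32_t flags;
        uint64_t gang_context_addr;
        uint32_t doorbell_offset;
        uint32_t pad;
        uint64_t mqd_addr;
        struct demo_api_status api_status;
    };
    uint32_t max_dwords[64];
};

/* Offset that would be used (incorrectly) for a reset packet before the fix. */
static size_t demo_remove_status_offset(void)
{
    return offsetof(union demo_remove_queue, api_status);
}

/* Offset that matches the packet actually being submitted. */
static size_t demo_reset_status_offset(void)
{
    return offsetof(union demo_reset, api_status);
}
```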
* drm/amdgpu/mes11: add conversion for priority levels | Alex Deucher | 2025-04-21 | 1 | -2/+19
    Convert driver priority levels to MES11 priority levels. At the moment
    they are the same, but they may not always be.

    Reviewed-by: Sunil Khatri <[email protected]>
    Reviewed-by: Jesse.Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
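A conversion layer of this kind is usually an explicit switch between two enums, even while the values coincide. The sketch below assumes made-up enum names (`demo_drv_priority`, `demo_fw_priority`); the real driver and firmware enumerators are not shown in this log.

```c
/* Hypothetical driver-side priority levels. */
enum demo_drv_priority {
    DEMO_DRV_PRIO_LOW,
    DEMO_DRV_PRIO_NORMAL,
    DEMO_DRV_PRIO_MEDIUM,
    DEMO_DRV_PRIO_HIGH,
    DEMO_DRV_PRIO_REALTIME
};

/* Hypothetical firmware-side priority levels (currently the same values). */
enum demo_fw_priority {
    DEMO_FW_PRIO_LOW,
    DEMO_FW_PRIO_NORMAL,
    DEMO_FW_PRIO_MEDIUM,
    DEMO_FW_PRIO_HIGH,
    DEMO_FW_PRIO_REALTIME
};

/*
 * Map each driver level explicitly instead of casting, so that a future
 * change to either enum cannot silently shift priorities.
 */
static enum demo_fw_priority demo_convert_priority(enum demo_drv_priority prio)
{
    switch (prio) {
    case DEMO_DRV_PRIO_LOW:      return DEMO_FW_PRIO_LOW;
    case DEMO_DRV_PRIO_NORMAL:   return DEMO_FW_PRIO_NORMAL;
    case DEMO_DRV_PRIO_MEDIUM:   return DEMO_FW_PRIO_MEDIUM;
    case DEMO_DRV_PRIO_HIGH:     return DEMO_FW_PRIO_HIGH;
    case DEMO_DRV_PRIO_REALTIME: return DEMO_FW_PRIO_REALTIME;
    default:                     return DEMO_FW_PRIO_NORMAL; /* unknown input: fall back */
    }
}
```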
* drm/amdgpu: adjust enforce_isolation handling | Alex Deucher | 2025-04-11 | 1 | -1/+1
    Switch from a bool to an enum and allow more options for enforce isolation.
    There are now 3 modes of operation:
    - Disabled (0)
    - Enabled (serialization and cleaner shader) (1)
    - Enabled in legacy mode (no serialization or cleaner shader) (2)
    This provides better flexibility for more use cases.

    Acked-by: Srinivasan Shanmugam <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
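The three modes listed above map naturally onto an enum. The following is a sketch with hypothetical names (`demo_isolation_mode` and its helpers are not the driver's identifiers); only the numeric values 0, 1 and 2 and their meanings come from the commit message.

```c
#include <stdbool.h>

/* Values follow the commit message; the identifiers are hypothetical. */
enum demo_isolation_mode {
    DEMO_ISOLATION_DISABLED = 0, /* no isolation */
    DEMO_ISOLATION_ENABLED  = 1, /* serialization and cleaner shader */
    DEMO_ISOLATION_LEGACY   = 2  /* isolation without serialization or cleaner shader */
};

/* Isolation is active in both the full and the legacy mode. */
static bool demo_isolation_active(enum demo_isolation_mode mode)
{
    return mode != DEMO_ISOLATION_DISABLED;
}

/* Only the full mode serializes work and runs the cleaner shader. */
static bool demo_uses_cleaner_shader(enum demo_isolation_mode mode)
{
    return mode == DEMO_ISOLATION_ENABLED;
}
```

A plain bool could not distinguish the second and third cases, which is the flexibility the commit refers to.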
* drm/amdgpu/mes11: use the device value for enforce isolation | Alex Deucher | 2025-04-11 | 1 | -1/+1
    Use the local setting rather than the global parameter.

    Acked-by: Srinivasan Shanmugam <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: centralize gfx_hqd mask management | Alex Deucher | 2025-04-08 | 1 | -13/+3
    Move it to amdgpu_mes to align with the compute and sdma hqd masks.
    No functional change.

    v2: rebase on new changes
    v3: misc optimizations

    Reviewed-by: Prike Liang <[email protected]>
    Reviewed-by: Sunil Khatri <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Remove the MES self test | Arunpravin Paneer Selvam | 2025-04-08 | 1 | -13/+1
    Remove the MES self test as it conflicts with the userqueue fence interrupts.

    v2: (Christian)
    - remove the amdgpu_mes_self_test() function and any now unused code.

    Signed-off-by: Arunpravin Paneer Selvam <[email protected]>
    Acked-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: fix MES GFX mask | Arvind Yadav | 2025-04-08 | 1 | -2/+13
    The current MES GFX mask prevents FW from enabling oversubscription.
    This patch does the following:
    - Fixes the mask values and adds a description for the same
    - Removes the central mask setup and makes it IP specific, as it would be
      different when the number of pipes and queues are different.

    v2: squash in fix from Shashank

    Cc: Christian König <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Acked-by: Christian König <[email protected]>
    Signed-off-by: Shashank Sharma <[email protected]>
    Signed-off-by: Arvind Yadav <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amd/amdgpu: Fix typo | Ananta Srikar | 2025-04-07 | 1 | -1/+1
    Fixes a typo in the word "version" in an error message.

    Signed-off-by: Ananta Srikar <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: optimize MES pipe FW version fetching | Alex Deucher | 2025-04-07 | 1 | -0/+4
    Don't fetch it again if we already have it. It seems the registers don't
    reliably have the value at resume in some cases.

    Fixes: 028c3fb37e70 ("drm/amdgpu/mes11: initiate mes v11 support")
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4083
    Reviewed-by: Mario Limonciello <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected]
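The "don't fetch it again" idea amounts to caching the first value read. A minimal sketch, assuming a hypothetical `demo_mes` structure and a register reader that returns garbage (here 0) on later reads, as the commit says can happen at resume:

```c
#include <stdint.h>

#define DEMO_MAX_PIPES 2

/* Hypothetical per-device state; 0 means "version not fetched yet". */
struct demo_mes {
    uint32_t fw_version[DEMO_MAX_PIPES];
};

static int demo_reads; /* counts how often the register was actually read */

/* Stand-in register read: valid only the first time, unreliable afterwards. */
static uint32_t demo_read_reg(int pipe)
{
    demo_reads++;
    return demo_reads == 1 ? 0x54u + (uint32_t)pipe : 0u;
}

/* Fetch the version once and keep the cached value on later calls. */
static void demo_get_fw_version(struct demo_mes *mes, int pipe,
                                uint32_t (*read_reg)(int pipe))
{
    if (mes->fw_version[pipe])
        return; /* already known: skip the possibly unreliable register */
    mes->fw_version[pipe] = read_reg(pipe);
}
```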
* drm/amd/amdgpu: Fix MES init sequence | Shaoyun Liu | 2025-03-14 | 1 | -30/+29
    When MES is being used, the set_hw_resource_1 API is required to
    initialize the MES internal context correctly.

    Signed-off-by: Shaoyun Liu <[email protected]>
    Acked-by: Alex Deucher <[email protected]>
    Reviewed-by: Srinivasan Shanmugam <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: drop amdgpu_mes_suspend()/amdgpu_mes_resume() calls | Alex Deucher | 2025-02-27 | 1 | -13/+1
    They are noops on GFX11 for most firmware versions. KFD already handles
    its own queues and they should already be unmapped at this point so even
    if this runs, it's not doing anything.

    Reviewed-by: Shaoyun.liu <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: keep enforce isolation up to date | Alex Deucher | 2025-02-25 | 1 | -0/+4
    Re-send the mes message on resume to make sure the mes state is up to date.

    Fixes: 8521e3c5f058 ("drm/amd/amdgpu: limit single process inside MES")
    Acked-by: Srinivasan Shanmugam <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: Shaoyun Liu <[email protected]>
    Cc: Srinivasan Shanmugam <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: correct the name of mes_pipe structure | Likun Gao | 2025-02-25 | 1 | -7/+7
    Correct the structure name admgpu_mes_pipe to amdgpu_mes_pipe.

    Signed-off-by: Likun Gao <[email protected]>
    Reviewed-by: Mukul Joshi <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: allocate hw_resource_1 buffer once | Alex Deucher | 2025-02-17 | 1 | -26/+24
    Allocate the buffer at sw init time so we don't alloc and free it for
    every suspend/resume or reset cycle.

    Reviewed-by: Shaoyun.liu <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: Add cleaner shader fence address handling in MES for GFX11 | Srinivasan Shanmugam | 2025-02-13 | 1 | -6/+15
    This commit introduces enhancements to the handling of the cleaner shader
    fence in the AMDGPU MES driver:

    - The MES (Micro Engine Scheduler) now sends a PM4 packet to the KIQ
      (Kernel Interface Queue) to request the cleaner shader, ensuring that
      requests are handled in a controlled manner and avoiding race conditions.
    - The CP (Command Processor) firmware has been updated to use a private
      bus for accessing specific registers, avoiding unnecessary operations
      that could lead to issues in VF (Virtual Function) mode.
    - The cleaner shader fence memory address is now set correctly in the
      `mes_set_hw_res_pkt` structure, allowing for proper synchronization of
      the cleaner shader execution.

    Cc: lin cao <[email protected]>
    Cc: Jingwen Chen <[email protected]>
    Cc: Christian König <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Suggested-by: Shaoyun Liu <[email protected]>
    Reviewed by: Shaoyun.liu <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: fix set_hw_resources_1 calculation | Alex Deucher | 2025-02-13 | 1 | -1/+1
    It's GPU page size not CPU page size. In most cases they are the same,
    but not always. This can lead to overallocation on systems with larger
    pages.

    Cc: Srinivasan Shanmugam <[email protected]>
    Cc: Christian König <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
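The overallocation mentioned above follows directly from rounding a size up to the wrong page granularity. A small arithmetic sketch, assuming a 4 KiB GPU page and a 64 KiB CPU page (the constants and `demo_align_up` are illustrative, not taken from the driver):

```c
#include <stdint.h>

#define DEMO_GPU_PAGE_SIZE 4096u  /* assumed GPU page size */
#define DEMO_CPU_PAGE_64K  65536u /* example of a larger CPU page size */

/* Round size up to the next multiple of page (page must be non-zero). */
static uint64_t demo_align_up(uint64_t size, uint64_t page)
{
    return (size + page - 1) / page * page;
}
```

For a 0x5000-byte buffer, aligning to the GPU page size leaves it at 0x5000 bytes, while aligning to a 64 KiB CPU page rounds it up to 0x10000 bytes, more than three times what the firmware needs.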
* drm/amdgpu: add support for GC IP version 11.5.3 | Tim Huang | 2025-02-13 | 1 | -0/+2
    This initializes GC IP version 11.5.3.

    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Yifan Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: reduce the mmio writes in kiq setting | Prike Liang | 2024-12-10 | 1 | -3/+1
    There's no need to perform two MMIO writes while programming the KIQ
    setting registers, and reducing the MMIO writes will save driver loading
    time.

    Signed-off-by: Prike Liang <[email protected]>
    Acked-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amd/amdgpu: limit single process inside MES | Shaoyun Liu | 2024-11-12 | 1 | -0/+15
    This lets MES limit the user queues to a single process.

    Signed-off-by: Shaoyun Liu <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amd/amdgpu: Increase MES log buffer to dump mes scratch data | shaoyunl | 2024-11-11 | 1 | -1/+11
    MES internal scratch data is useful for mes debug. It can only be located
    in VRAM, so change the allocation type and increase the size for mes 11.

    Signed-off-by: shaoyunl <[email protected]>
    Acked-by: Feifei Xu <[email protected]>
    Acked-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Clean the functions pointer set as NULL | Sunil Khatri | 2024-10-22 | 1 | -2/+0
    We don't need to set the function pointers that aren't needed to NULL, as
    global structure members are by default set to zero or NULL for pointers.

    Cc: Leo Liu <[email protected]>
    Signed-off-by: Sunil Khatri <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in hw_fini | Sunil Khatri | 2024-10-07 | 1 | -14/+13
    Update the *handle to amdgpu_ip_block ptr for all function pointers of
    hw_fini.

    Also update the ip_block ptr wherever needed, as there was a cyclic
    dependency of hw_fini on suspend, and some follow-up clean up.

    Signed-off-by: Sunil Khatri <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in hw_init | Sunil Khatri | 2024-10-07 | 1 | -7/+13
    Update the *handle to amdgpu_ip_block ptr for all function pointers of
    hw_init.

    Also update the ip_block ptr wherever needed, as there was a cyclic
    dependency of hw_init on resume.

    v2: squash in isp fix

    Signed-off-by: Sunil Khatri <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in resume | Sunil Khatri | 2024-10-07 | 1 | -2/+2
    Update the *handle to amdgpu_ip_block ptr for all function pointers of
    resume.

    Signed-off-by: Sunil Khatri <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in suspend | Sunil Khatri | 2024-10-07 | 1 | -2/+2
    Update the *handle to amdgpu_ip_block ptr for all function pointers of
    suspend.

    Signed-off-by: Sunil Khatri <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in sw_fini | Sunil Khatri | 2024-10-01 | 1 | -2/+2
    Update the *handle to amdgpu_ip_block ptr for all function pointers of
    sw_fini.

    Signed-off-by: Sunil Khatri <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in sw_init | Sunil Khatri | 2024-10-01 | 1 | -2/+2
    Update the *handle to amdgpu_ip_block ptr for all function pointers of
    sw_init.

    Signed-off-by: Sunil Khatri <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in late_init | Sunil Khatri | 2024-10-01 | 1 | -2/+2
    Update the handle ptr to amdgpu_ip_block ptr for all function pointers of
    late_init.

    Signed-off-by: Sunil Khatri <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: update the handle ptr in early_init | Sunil Khatri | 2024-10-01 | 1 | -2/+2
    Update the handle ptr to amdgpu_ip_block ptr for all function pointers of
    early_init.

    Signed-off-by: Sunil Khatri <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: update mes_reset_queue function to support sdma queue | Jiadong Zhu | 2024-09-26 | 1 | -1/+26
    Reset sdma queue through mmio based on me_id and queue_id.

    v2: simplify callflows and register calculation.

    Signed-off-by: Jiadong Zhu <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: reduce timeout | Alex Deucher | 2024-09-18 | 1 | -1/+1
    The firmware timeout is 2s. Reduce the driver timeout to 2.1 seconds to
    avoid back pressure on queue submissions.

    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3627
    Fixes: f7c161a4c250 ("drm/amdgpu: increase mes submission timeout")
    Acked-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected]
* drm/amdgpu/mes11: Indent an if statment | Dan Carpenter | 2024-09-06 | 1 | -1/+1
    Indent the "break" statement one more tab.

    Signed-off-by: Dan Carpenter <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: implement mmio queue reset for gfx11 | Jiadong Zhu | 2024-09-02 | 1 | -0/+80
    Implement queue reset for graphics and compute queues.

    v2: use amdgpu_gfx_rlc funcs to enter/exit safe mode.
    v3: use gfx_v11_0_request_gfx_index_mutex()
    v4: fix mutex handling

    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Jiadong Zhu <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: add mes mapping legacy queue switch | Jack Xiao | 2024-08-29 | 1 | -15/+34
    Old mes11 firmware has an issue mapping legacy queues, so add a flag to
    switch whether mes maps legacy queues.

    Fixes: f9d8c5c7855d ("drm/amdgpu/gfx: enable mes to map legacy queue support")
    Reported-by: Andrew Worsley <[email protected]>
    Link: https://lists.freedesktop.org/archives/amd-gfx/2024-August/112773.html
    Signed-off-by: Jack Xiao <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Implement MES Suspend and Resume APIs for GFX11 | Mukul Joshi | 2024-08-21 | 1 | -2/+30
    Add implementation for MES Suspend and Resume APIs to unmap/map all queues
    for GFX11. Support for GFX12 will be added when the corresponding firmware
    support is in place.

    Signed-off-by: Mukul Joshi <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Reviewed-by: Harish Kasiviswanathan <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: add API for user queue reset | Alex Deucher | 2024-08-16 | 1 | -0/+21
    Add API for resetting user queues.

    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes12: configure two pipes hardware resources | Jack Xiao | 2024-08-13 | 1 | -5/+2
    Configure two pipes with different hardware resources.

    Signed-off-by: Jack Xiao <[email protected]>
    Acked-by: Alex Deucher <[email protected]>
    Reviewed-by: Hawking Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: add multiple mes ring instances support | Jack Xiao | 2024-08-13 | 1 | -17/+17
    Add multiple mes ring instances in mes structure to support multiple mes
    pipes.

    Signed-off-by: Jack Xiao <[email protected]>
    Acked-by: Alex Deucher <[email protected]>
    Reviewed-by: Hawking Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: add API for legacy queue reset | Alex Deucher | 2024-08-13 | 1 | -0/+33
    Add API for resetting kernel queues.

    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: fix mes ring buffer overflow | Jack Xiao | 2024-08-13 | 1 | -4/+14
    Wait until there is enough room in the ring before writing mes packets, to
    avoid ring buffer overflow.

    v2: squash in sched_hw_submission fix

    Fixes: de3246254156 ("drm/amdgpu: cleanup MES11 command submission")
    Fixes: fffe347e1478 ("drm/amdgpu: cleanup MES12 command submission")
    Signed-off-by: Jack Xiao <[email protected]>
    Acked-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
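Waiting for room before writing can be sketched with a simple ring using free-running read and write counters. Everything here (`demo_ring`, the 8-dword size, the poll limit) is a hypothetical simplification and not the amdgpu ring implementation; in a real driver the read pointer advances as the firmware consumes packets.

```c
#include <stdint.h>

#define DEMO_RING_DWORDS 8u /* must be a power of two */
#define DEMO_RING_MASK   (DEMO_RING_DWORDS - 1u)

struct demo_ring {
    uint32_t buf[DEMO_RING_DWORDS];
    uint32_t wptr; /* free-running write counter (producer) */
    uint32_t rptr; /* free-running read counter (consumer) */
};

/* Number of dwords that can be written without overwriting unread data. */
static uint32_t demo_ring_free(const struct demo_ring *ring)
{
    return DEMO_RING_DWORDS - (ring->wptr - ring->rptr);
}

/*
 * Write a packet only once there is room for all of it.
 * Returns 0 on success, -1 if room never appeared within max_polls checks.
 */
static int demo_ring_write(struct demo_ring *ring, const uint32_t *pkt,
                           uint32_t ndw, unsigned int max_polls)
{
    uint32_t i;

    while (demo_ring_free(ring) < ndw) {
        if (max_polls == 0)
            return -1; /* give up instead of overflowing the ring */
        max_polls--;
    }

    for (i = 0; i < ndw; i++)
        ring->buf[(ring->wptr + i) & DEMO_RING_MASK] = pkt[i];
    ring->wptr += ndw;
    return 0;
}
```

Checking the free space for the whole packet up front means a packet is never half-written on top of entries the consumer has not yet read.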
* drm/amdgpu: increase mes log buffer size for gfx12 | Michael Chen | 2024-07-27 | 1 | -0/+2
    MES firmware requires larger log buffer for gfx12. Allocate proper buffer
    respectively for gfx11 and gfx12.

    Signed-off-by: Michael Chen <[email protected]>
    Acked-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: update opcode strings | Alex Deucher | 2024-07-12 | 1 | -0/+3
    Add new packet.

    Reviewed-by: Hawking Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: add firmware for GC IP v11.5.2 | Tim Huang | 2024-07-02 | 1 | -0/+2
    This patch is to add firmware for GC 11.5.2.

    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Yifan Zhang <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: cleanup MES11 command submission | Christian König | 2024-06-19 | 1 | -28/+48
    The approach of having a separate WB slot for each submission doesn't
    really work well and for example breaks GPU reset.

    Use a status query packet for the fence update instead. Since those should
    always succeed, we can use the fence of the original packet to signal the
    state of the operation.

    While at it, clean up the coding style.

    Fixes: eef016ba8986 ("drm/amdgpu/mes11: Use a separate fence per transaction")
    Reviewed-by: Mukul Joshi <[email protected]>
    Signed-off-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: fix kiq ring ready flag | Jack Xiao | 2024-05-08 | 1 | -1/+2
    The kiq ring test has overwritten the ready flag; it needs to be disabled
    after gfx hw init.

    Signed-off-by: Jack Xiao <[email protected]>
    Acked-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: increase waiting time for engine ready | Jack Xiao | 2024-04-30 | 1 | -1/+1
    The mes schq engine requires more waiting time for engine ready before
    packet submission.

    Signed-off-by: Jack Xiao <[email protected]>
    Reviewed-by: Yifan Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>