aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
Commit message (Collapse)AuthorAgeFilesLines
* drm/amdgpu: adjust enforce_isolation handlingAlex Deucher2025-04-111-1/+1
| | | | | | | | | | | | Switch from a bool to an enum and allow more options for enforce isolation. There are now 3 modes of operation: - Disabled (0) - Enabled (serialization and cleaner shader) (1) - Enabled in legacy mode (no serialization or cleaner shader) (2) This provides better flexibility for more use cases. Acked-by: Srinivasan Shanmugam <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: make more vmids available when disable_kq=1Alex Deucher2025-04-081-1/+1
| | | | | | | | | If we don't have kernel queues, the vmids can be used by the MES for user queues. Acked-by: Prike Liang <[email protected]> Reviewed-by: Sunil Khatri <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: update hqd masks when disable_kq is setAlex Deucher2025-04-081-3/+3
| | | | | | | | | Make all resources available to user queues. Reviewed-by: Prike Liang <[email protected]> Suggested-by: Sunil Khatri <[email protected]> Reviewed-by: Sunil Khatri <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: warn on unexpected pipe numbersAlex Deucher2025-04-081-4/+19
| | | | | | | Warn if the number of pipes exceeds what the MES supports. Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: centralize gfx_hqd mask managementAlex Deucher2025-04-081-0/+21
| | | | | | | | | | | | Move it to amdgpu_mes to align with the compute and sdma hqd masks. No functional change. v2: rebase on new changes v3: misc optimizations Reviewed-by: Prike Liang <[email protected]> Reviewed-by: Sunil Khatri<[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: remove unused functionsAlex Deucher2025-04-081-800/+0
| | | | | | | Leftover from the MES self tests that were removed previously. Reviewed-by: Mukul Joshi <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Modify the MES process va end limitChristian König2025-04-081-2/+1
| | | | | | | | | Modify the MES process va end limit to max pfn. Signed-off-by: Christian König <[email protected]> Signed-off-by: Arunpravin Paneer Selvam <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Remove the MES self testArunpravin Paneer Selvam2025-04-081-169/+0
| | | | | | | | | | | | Remove MES self test as this conflicts the userqueue fence interrupts. v2:(Christian) - remove the amdgpu_mes_self_test() function and any now unused code. Signed-off-by: Arunpravin Paneer Selvam <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: fix MES GFX maskArvind Yadav2025-04-081-3/+0
| | | | | | | | | | | | | | | | | Current MES GFX mask prevents FW to enable oversubscription. This patch does the following: - Fixes the mask values and adds a description for the same - Removes the central mask setup and makes it IP specific, as it would be different when the number of pipes and queues are different. v2: squash in fix from Shashank Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Shashank Sharma <[email protected]> Signed-off-by: Arvind Yadav <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: clean up SDMA HQD loopAlex Deucher2025-03-211-5/+3
| | | | | | | | Follow the same logic as the other IP types. Reviewed-by: Prike Liang <[email protected]> Acked-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: enable compute pipes across all MECAlex Deucher2025-03-211-2/+1
| | | | | | | | | Enable pipes on both MECs for MES. Fixes: 745f46b6a99f ("drm/amdgpu: enable mes v12 self test") Acked-by: Lijo Lazar <[email protected]> Reviewed-by: Prike Liang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: drop MES 10.x leftoversAlex Deucher2025-03-211-4/+1
| | | | | | | | | Leftover from MES bring up. There is no production MES support for MES 10.x. The rest of the MES 10.x code has already been removed so drop this. Acked-by: Prike Liang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: optimize compute loop handlingAlex Deucher2025-03-211-1/+1
| | | | | | | | Break when we get to the end of the supported pipes rather than continuing the loop. Reviewed-by: Shaoyun.liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: use GFP_NOWAIT for memory allocationsChristian König2025-03-211-2/+2
| | | | | | | | | | | | | In the critical submission path memory allocations can't wait for reclaim since that can potentially wait for submissions to finish. Finally clean that up and mark most memory allocations in the critical path with GFP_NOWAIT. The only exception left is the dma_fence_array() used when no VMID is available, but that will be cleaned up later on. Signed-off-by: Christian König <[email protected]> Acked-by: Srinivasan Shanmugam <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: keep enforce isolation up to dateAlex Deucher2025-02-251-1/+19
| | | | | | | | | | | | Re-send the mes message on resume to make sure the mes state is up to date. Fixes: 8521e3c5f058 ("drm/amd/amdgpu: limit single process inside MES") Acked-by: Srinivasan Shanmugam <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: Shaoyun Liu <[email protected]> Cc: Srinivasan Shanmugam <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amd: Add the capability to mark certain firmware as "required"Mario Limonciello2024-12-101-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | Some of the firmware that is loaded by amdgpu is not actually required. For example the ISP firmware on some SoCs is optional, and if it's not present the ISP IP block just won't be initialized. The firmware loader core however will show a warning when this happens like this: ``` Direct firmware load for amdgpu/isp_4_1_0.bin failed with error -2 ``` To avoid confusion for non-required firmware, adjust the amd-ucode helper to take an extra argument indicating if the firmware is required or optional. On optional firmware use firmware_request_nowarn() instead of request_firmware() to avoid the warnings. Reviewed-by: Alex Deucher <[email protected]> Link: https://lore.kernel.org/amd-gfx/[email protected]/T/#t Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amd/amdgpu: limit single process inside MESShaoyun Liu2024-11-121-0/+23
| | | | | | | | This is for MES to limit only one process for the user queues Signed-off-by: Shaoyun Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amd/amdgpu: Increase MES log buffer to dump mes scratch datashaoyunl2024-11-111-1/+1
| | | | | | | | | | MES internal scratch data is useful for mes debug, it can only located in VRAM, change the allocation type and increase size for mes 11 Signed-off-by: shaoyunl <[email protected]> Acked-by: Feifei Xu <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: fix return random value when multiple threads read registers via ↵chongli22024-11-081-17/+13
| | | | | | | | | | | | | | | | | | mes. The currect code use the address "adev->mes.read_val_ptr" to store the value read from register via mes. So when multiple threads read register, multiple threads have to share the one address, and overwrite the value each other. Assign an address by "amdgpu_device_wb_get" to store register value. each thread will has an address to store register value. Signed-off-by: chongli2 <[email protected]> Reviewed-by: Emily Deng <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: fetch fw version from firmware headerAlex Deucher2024-11-051-0/+5
| | | | | | | | | | | We need this prior to the firmware being loaded so fetch from the header. v2: fetch directly from the firmware v3: store both fw versions Reviewed-by: Srinivasan Shanmugam <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amd/amdgpu: Fix double unlock in amdgpu_mes_add_ringSrinivasan Shanmugam2024-10-151-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch addresses a double unlock issue in the amdgpu_mes_add_ring function. The mutex was being unlocked twice under certain error conditions, which could lead to undefined behavior. The fix ensures that the mutex is unlocked only once before jumping to the clean_up_memory label. The unlock operation is moved to just before the goto statement within the conditional block that checks the return value of amdgpu_ring_init. This prevents the second unlock attempt after the clean_up_memory label, which is no longer necessary as the mutex is already unlocked by this point in the code flow. This change resolves the potential double unlock and maintains the correct mutex handling throughout the function. Fixes below: Commit d0c423b64765 ("drm/amdgpu/mes: use ring for kernel queue submission"), leads to the following Smatch static checker warning: drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1240 amdgpu_mes_add_ring() warn: double unlock '&adev->mes.mutex_hidden' (orig line 1213) drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 1143 int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id, 1144 int queue_type, int idx, 1145 struct amdgpu_mes_ctx_data *ctx_data, 1146 struct amdgpu_ring **out) 1147 { 1148 struct amdgpu_ring *ring; 1149 struct amdgpu_mes_gang *gang; 1150 struct amdgpu_mes_queue_properties qprops = {0}; 1151 int r, queue_id, pasid; 1152 1153 /* 1154 * Avoid taking any other locks under MES lock to avoid circular 1155 * lock dependencies. 1156 */ 1157 amdgpu_mes_lock(&adev->mes); 1158 gang = idr_find(&adev->mes.gang_id_idr, gang_id); 1159 if (!gang) { 1160 DRM_ERROR("gang id %d doesn't exist\n", gang_id); 1161 amdgpu_mes_unlock(&adev->mes); 1162 return -EINVAL; 1163 } 1164 pasid = gang->process->pasid; 1165 1166 ring = kzalloc(sizeof(struct amdgpu_ring), GFP_KERNEL); 1167 if (!ring) { 1168 amdgpu_mes_unlock(&adev->mes); 1169 return -ENOMEM; 1170 } 1171 1172 ring->ring_obj = NULL; 1173 ring->use_doorbell = true; 1174 ring->is_mes_queue = true; 1175 ring->mes_ctx = ctx_data; 1176 ring->idx = idx; 1177 ring->no_scheduler = true; 1178 1179 if (queue_type == AMDGPU_RING_TYPE_COMPUTE) { 1180 int offset = offsetof(struct amdgpu_mes_ctx_meta_data, 1181 compute[ring->idx].mec_hpd); 1182 ring->eop_gpu_addr = 1183 amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset); 1184 } 1185 1186 switch (queue_type) { 1187 case AMDGPU_RING_TYPE_GFX: 1188 ring->funcs = adev->gfx.gfx_ring[0].funcs; 1189 ring->me = adev->gfx.gfx_ring[0].me; 1190 ring->pipe = adev->gfx.gfx_ring[0].pipe; 1191 break; 1192 case AMDGPU_RING_TYPE_COMPUTE: 1193 ring->funcs = adev->gfx.compute_ring[0].funcs; 1194 ring->me = adev->gfx.compute_ring[0].me; 1195 ring->pipe = adev->gfx.compute_ring[0].pipe; 1196 break; 1197 case AMDGPU_RING_TYPE_SDMA: 1198 ring->funcs = adev->sdma.instance[0].ring.funcs; 1199 break; 1200 default: 1201 BUG(); 1202 } 1203 1204 r = amdgpu_ring_init(adev, ring, 1024, NULL, 0, 1205 AMDGPU_RING_PRIO_DEFAULT, NULL); 1206 if (r) 1207 goto clean_up_memory; 1208 1209 amdgpu_mes_ring_to_queue_props(adev, ring, &qprops); 1210 1211 dma_fence_wait(gang->process->vm->last_update, false); 1212 dma_fence_wait(ctx_data->meta_data_va->last_pt_update, false); 1213 amdgpu_mes_unlock(&adev->mes); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1214 1215 r = amdgpu_mes_add_hw_queue(adev, gang_id, &qprops, &queue_id); 1216 if (r) 1217 goto clean_up_ring; ^^^^^^^^^^^^^^^^^^ 1218 1219 ring->hw_queue_id = queue_id; 1220 ring->doorbell_index = qprops.doorbell_off; 1221 1222 if (queue_type == AMDGPU_RING_TYPE_GFX) 1223 sprintf(ring->name, "gfx_%d.%d.%d", pasid, gang_id, queue_id); 1224 else if (queue_type == AMDGPU_RING_TYPE_COMPUTE) 1225 sprintf(ring->name, "compute_%d.%d.%d", pasid, gang_id, 1226 queue_id); 1227 else if (queue_type == AMDGPU_RING_TYPE_SDMA) 1228 sprintf(ring->name, "sdma_%d.%d.%d", pasid, gang_id, 1229 queue_id); 1230 else 1231 BUG(); 1232 1233 *out = ring; 1234 return 0; 1235 1236 clean_up_ring: 1237 amdgpu_ring_fini(ring); 1238 clean_up_memory: 1239 kfree(ring); --> 1240 amdgpu_mes_unlock(&adev->mes); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1241 return r; 1242 } Fixes: d0c423b64765 ("drm/amdgpu/mes: use ring for kernel queue submission") Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Hawking Zhang <[email protected]> Suggested-by: Jack Xiao <[email protected]> Reported by: Dan Carpenter <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Jack Xiao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: update mes_reset_queue function to support sdma queueJiadong Zhu2024-09-261-1/+1
| | | | | | | | | | Reset sdma queue through mmio based on me_id and queue_id. v2: simplify callflows and register calculation. Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: fix queue reset issue by mmioJesse Zhang2024-09-061-0/+1
| | | | | | | | Initialize the queue type before resetting the queue using mmio. Signed-off-by: Jesse Zhang <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: implement amdgpu_mes_reset_hw_queue_mmioJiadong Zhu2024-09-021-0/+20
| | | | | | | | | | | The reset_queue api could be used from kfd or kgd. v2: add use_mmio parameter for mes_reset_legacy_queue. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: modify mes api for mmio queue resetJiadong Zhu2024-09-021-1/+2
| | | | | | | | | | | Add me/pipe/queue parameters for queue reset input. v2: fix build (Alex) Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Implement MES Suspend and Resume APIs for GFX11Mukul Joshi2024-08-211-34/+37
| | | | | | | | | | | Add implementation for MES Suspend and Resume APIs to unmap/map all queues for GFX11. Support for GFX12 will be added when the corresponding firmware support is in place. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: add API for user queue resetAlex Deucher2024-08-161-0/+43
| | | | | | | Add API for resetting user queues. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes12: configure two pipes hardware resourcesJack Xiao2024-08-131-30/+47
| | | | | | | | | Configure two pipes with different hardware resources. Signed-off-by: Jack Xiao <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes12: load unified mes fw on pipe0 and pipe1Jack Xiao2024-08-131-1/+1
| | | | | | | | | Enable unified mes firmware to load on pipe0 and pipe1. Signed-off-by: Jack Xiao <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: add multiple mes ring instances supportJack Xiao2024-08-131-1/+3
| | | | | | | | | | Add multiple mes ring instances in mes structure to support multiple mes pipes. Signed-off-by: Jack Xiao <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: add API for legacy queue resetAlex Deucher2024-08-131-0/+24
| | | | | | | Add API for resetting kernel queues. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amd: Use a constant format string for amdgpu_ucode_requestArnd Bergmann2024-08-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Multiple files in amdgpu call amdgpu_ucode_request() with a fw_name variable that the compiler cannot check for being a valid format string, as seen by enabling the (default-disabled) -Wformat-security option: drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c: In function 'amdgpu_mes_init_microcode': drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1517:61: error: format not a string literal and no format arguments [-Werror=format-security] 1517 | r = amdgpu_ucode_request(adev, &adev->mes.fw[pipe], fw_name); | ^~~~~~~ drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c: In function 'amdgpu_uvd_sw_init': drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:263:9: error: format not a string literal and no format arguments [-Werror=format-security] 263 | r = amdgpu_ucode_request(adev, &adev->uvd.fw, fw_name); | ^ drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c: In function 'amdgpu_vce_sw_init': drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:161:9: error: format not a string literal and no format arguments [-Werror=format-security] 161 | r = amdgpu_ucode_request(adev, &adev->vce.fw, fw_name); | ^ drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c: In function 'amdgpu_umsch_mm_init_microcode': drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c:590:9: error: format not a string literal and no format arguments [-Werror=format-security] 590 | r = amdgpu_ucode_request(adev, &adev->umsch_mm.fw, fw_name); | ^ drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c: In function 'amdgpu_cgs_get_firmware_info': drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c:417:72: error: format not a string literal and no format arguments [-Werror=format-security] 417 | err = amdgpu_ucode_request(adev, &adev->pm.fw, fw_name); | ^~~~~~~ drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In function 'load_dmcu_fw': drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:2221:9: error: format not a string literal and no format arguments [-Werror=format-security] 2221 | r = amdgpu_ucode_request(adev, &adev->dm.fw_dmcu, fw_name_dmcu); | ^ drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In function 'dm_init_microcode': drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:5147:9: error: format not a string literal and no format arguments [-Werror=format-security] 5147 | r = amdgpu_ucode_request(adev, &adev->dm.dmub_fw, fw_name_dmub); | ^ Change these all to use a "%s" format with the actual name as an argument, to let the compiler prove this to be correct. Fixes: e5a7d047f41b ("drm/amd: Use `amdgpu_ucode_*` helpers for CGS") Fixes: 52215e2a5d4a ("drm/amd: Use `amdgpu_ucode_*` helpers for VCE") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: increase mes log buffer size for gfx12Michael Chen2024-07-271-3/+3
| | | | | | | | | MES firmware requires larger log buffer for gfx12. Allocate proper buffer respectively for gfx11 and gfx12. Signed-off-by: Michael Chen <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: remove amdgpu_mes_fence_wait_polling()Alex Deucher2024-06-191-12/+0
| | | | | | | No longer used so remove it. Reviewed-by: Mukul Joshi <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* Revert "drm/amdgpu: Add missing locking for MES API calls"Mukul Joshi2024-06-191-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 3612702852acbded39233b1600c8d9f47e40139f. This is causing a BUG message during suspend. [ 61.603542] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283 [ 61.603550] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2028, name: kworker/u64:14 [ 61.603553] preempt_count: 1, expected: 0 [ 61.603555] RCU nest depth: 0, expected: 0 [ 61.603557] Preemption disabled at: [ 61.603559] [<ffffffffc08a3261>] amdgpu_gfx_disable_kgq+0x61/0x160 [amdgpu] [ 61.603789] CPU: 9 PID: 2028 Comm: kworker/u64:14 Tainted: G W 6.8.0+ #7 [ 61.603795] Workqueue: events_unbound async_run_entry_fn [ 61.603801] Call Trace: [ 61.603803] <TASK> [ 61.603806] dump_stack_lvl+0x37/0x50 [ 61.603811] ? amdgpu_gfx_disable_kgq+0x61/0x160 [amdgpu] [ 61.604007] dump_stack+0x10/0x20 [ 61.604010] __might_resched+0x16f/0x1d0 [ 61.604016] __might_sleep+0x43/0x70 [ 61.604020] mutex_lock+0x1f/0x60 [ 61.604024] amdgpu_mes_unmap_legacy_queue+0x6d/0x100 [amdgpu] [ 61.604226] gfx11_kiq_unmap_queues+0x3dc/0x430 [amdgpu] [ 61.604422] ? srso_alias_return_thunk+0x5/0xfbef5 [ 61.604429] amdgpu_gfx_disable_kgq+0x122/0x160 [amdgpu] [ 61.604621] gfx_v11_0_hw_fini+0xda/0x100 [amdgpu] [ 61.604814] gfx_v11_0_suspend+0xe/0x20 [amdgpu] [ 61.605008] amdgpu_device_ip_suspend_phase2+0x135/0x1d0 [amdgpu] [ 61.605175] amdgpu_device_suspend+0xec/0x180 [amdgpu] Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: refine mes firmware loadingYang Wang2024-06-141-4/+2
| | | | | | | | | | | | v1: refine mes firmware loading v2: use dev_info instead of DRM_INFO Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Add missing locking for MES API callsMukul Joshi2024-06-141-0/+12
| | | | | | | | | Add missing locking at a few places when calling MES APIs to ensure exclusive access to MES queue. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Kent Russell <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: add uni_mes fw loading supportJack Xiao2024-05-021-1/+4
| | | | | | | | Add the unified mes firmware loading support. Signed-off-by: Jack Xiao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: enable mes v12 self testJack Xiao2024-05-021-1/+1
| | | | | | | | | 1. fix available compute queue to use 2. enable mes v12 self test Signed-off-by: Jack Xiao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: init mes ucode name for gfx v12Likun Gao2024-05-021-1/+2
| | | | | | | | | Keep gfx v12 mes fw name to gc_12_x_x_mes.bin and gc_12_x_x_mes1.bin. Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: add mes mapping legacy queue supportJack Xiao2024-04-261-0/+22
| | | | | | | | | Add mes mapping legacy queue framework support. Signed-off-by: Jack Xiao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes11: Use a separate fence per transactionAlex Deucher2024-04-261-0/+12
| | | | | | | | We can't use a shared fence location because each transaction should be considered independently. Reviewed-by: Shaoyun.liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu/mes: fix use-after-free issueJack Xiao2024-04-261-0/+1
| | | | | | | | | | | | Delete fence fallback timer to fix the ramdom use-after-free issue. v2: move to amdgpu_mes.c Signed-off-by: Jack Xiao <[email protected]> Acked-by: Lijo Lazar <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu : Increase the mes log buffer size as per new MES FW versionshaoyunl2024-04-101-3/+2
| | | | | | | | | From MES version 0x54, the log entry increased and require the log buffer size to be increased. The 16k is maximum size agreed Signed-off-by: shaoyunl <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu : Add mes_log_enable to control mes log featureshaoyunl2024-04-101-1/+4
| | | | | | | | | The MES log might slow down the performance for extra step of log the data, disable it by default and introduce a parameter can enable it when necessary Signed-off-by: shaoyunl <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Fix 'fw_name' buffer size to prevent truncations in ↵Srinivasan Shanmugam2024-03-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | amdgpu_mes_init_microcode The snprintf function is used to write a formatted string into fw_name. The format of the string is "amdgpu/%s_mes%s.bin", where %s is replaced by the string in ucode_prefix and the second %s is replaced by either "_2" or "1" depending on the condition pipe == AMDGPU_MES_SCHED_PIPE. The length of the string "amdgpu/%s_mes%s.bin" is 16 characters plus the length of ucode_prefix and the length of the string "_2" or "1". The size of ucode_prefix is 30, so the maximum length of ucode_prefix is 29 characters (since one character is needed for the null terminator). Therefore, the maximum possible length of the string written into fw_name is 16 + 29 + 2 = 47 characters. The size of fw_name is 40, so if the length of the string written into fw_name is more than 39 characters (since one character is needed for the null terminator), it will be truncated by the snprintf function, and thus warnings will be seen. By increasing the size of fw_name to 50, we ensure that fw_name is large enough to hold the maximum possible length of the string, so the snprintf function will not truncate the output. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c: In function ‘amdgpu_mes_init_microcode’: drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1482:66: warning: ‘%s’ directive output may be truncated writing up to 1 bytes into a region of size between 0 and 29 [-Wformat-truncation=] 1482 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes%s.bin", | ^~ drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1482:17: note: ‘snprintf’ output between 16 and 46 bytes into a destination of size 40 1482 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes%s.bin", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1483 | ucode_prefix, | ~~~~~~~~~~~~~ 1484 | pipe == AMDGPU_MES_SCHED_PIPE ? "" : "1"); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1477:66: warning: ‘%s’ directive output may be truncated writing 1 byte into a region of size between 0 and 29 [-Wformat-truncation=] 1477 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes%s.bin", | ^~ 1478 | ucode_prefix, 1479 | pipe == AMDGPU_MES_SCHED_PIPE ? "_2" : "1"); | ~~~ drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1477:17: note: ‘snprintf’ output between 17 and 46 bytes into a destination of size 40 1477 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes%s.bin", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1478 | ucode_prefix, | ~~~~~~~~~~~~~ 1479 | pipe == AMDGPU_MES_SCHED_PIPE ? "_2" : "1"); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1477:66: warning: ‘%s’ directive output may be truncated writing 2 bytes into a region of size between 0 and 29 [-Wformat-truncation=] 1477 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes%s.bin", | ^~ 1478 | ucode_prefix, 1479 | pipe == AMDGPU_MES_SCHED_PIPE ? "_2" : "1"); | ~~~~ drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1477:17: note: ‘snprintf’ output between 18 and 47 bytes into a destination of size 40 1477 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes%s.bin", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1478 | ucode_prefix, | ~~~~~~~~~~~~~ 1479 | pipe == AMDGPU_MES_SCHED_PIPE ? "_2" : "1"); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1489:62: warning: ‘_mes.bin’ directive output may be truncated writing 8 bytes into a region of size between 4 and 33 [-Wformat-truncation=] 1489 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes.bin", | ^~~~~~~~ drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1489:17: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 40 1489 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes.bin", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1490 | ucode_prefix); | ~~~~~~~~~~~~~ Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Suggested-by: Lijo Lazar <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: cleanup unused variableShashank Sharma2024-03-201-7/+3
| | | | | | | | | | | This patch removes an unused input variable in the MES doorbell function. Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Shashank Sharma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Only create mes event log debugfs when mes is enabledshaoyunl2024-02-071-3/+3
| | | | | | | | | | Skip the debugfs file creation for mes event log if the GPU doesn't use MES. This to prevent potential kernel oops when user try to read the event log in debugfs on a GPU without MES Signed-off-by: shaoyunl <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* drm/amdgpu: Enable seq64 manager and fix bugsArunpravin Paneer Selvam2024-01-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | - Enable the seq64 mapping sequence. - Fix wflinfo va conflict and other bugs. v1: - The seq64 area needs to be included in the AMDGPU_VA_RESERVED_SIZE otherwise the areas will conflict with user space allocations (Alex) - It needs to be mapped read only in the user VM (Alex) v2: - Instead of just one define for TOP/BOTTOM reserved space separate them into two (Christian) - Fix the CPU and VA calculations and while at it also cleanup error handling and kerneldoc (Christian) Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Arunpravin Paneer Selvam <[email protected]> Reviewed-by: Christian König <[email protected]>
* Merge tag 'drm-msm-next-2023-12-15' of ↵Dave Airlie2023-12-191-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://gitlab.freedesktop.org/drm/msm into drm-next Updates for v6.8: Core: - Add support for SDM670, SM8650 - Handle the CFG interconnect to fix the obscure hangs / timeouts on register write - Kconfig fix for QMP dependency - DT schema fixes DPU: - Add support for SDM670, SM8650 - Enable SmartDMA on SM8350 and SM8450 - Correct UBWC settings for SC8280XP - Fix catalog settings for SC8180X - Actually make use of the version to switch between QSEED3/3LITE/4 scalers - Use devres-managed and drm-managed allocations where appropriate - misc other fixes - Enabled YUV writeback on SC7280, SM8250 - Enabled writeback on SM8350, SM8450 - CRC fix when encoder is selected as the input source - other misc fixes MDP4: - Use devres-managed and drm-managed allocations where appropriate - flush vblank event on CRTC disable MDP5: - Use devres-managed and drm-managed allocations where appropriate DP: - Add support for SM8650 - Enable PM runtime support - Merge msm-specific debugfs dir with the generic one - Described DisplayPort on SM8150 in DeviceTree bindings - Moved dp_display_get_next_bridge() to probe() DSI: - Add support for SM8650 - Enable PM runtime support GPU/GEM: - demote userspace triggerable warnings to debug - add GEM object metadata UAPI - move GPU devcoredumps to GPU device - fix hangcheck to skip retired submits - expose UBWC config to userspace - fix a680 chip-id - drm_exec conversion - drm/ci: remove rebase-merge directory (to unblock CI) [airlied: fix drm_exec/amd interaction] Signed-off-by: Dave Airlie <[email protected]> From: Rob Clark <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGs9auYqmo-7NSd9FsbNBCDf7aBevd=4xkcF3A5G_OGvMQ@mail.gmail.com