diff options
| author | Linus Torvalds <[email protected]> | 2025-05-28 16:46:39 +0000 |
|---|---|---|
| committer | Linus Torvalds <[email protected]> | 2025-05-28 16:46:39 +0000 |
| commit | b08494a8f7416e5f09907318c5460ad6f6e2a548 (patch) | |
| tree | f7c5615c5a5f02e9cb9033883938788e223a6960 /drivers/gpu/drm/amd/amdgpu/mes_userqueue.c | |
| parent | Merge tag 'media/v6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mc... (diff) | |
| parent | drm/nouveau/tegra: Fix error pointer vs NULL return in nvkm_device_tegra_reso... (diff) | |
| download | kernel-b08494a8f7416e5f09907318c5460ad6f6e2a548.tar.gz kernel-b08494a8f7416e5f09907318c5460ad6f6e2a548.zip | |
Merge tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
"As part of building up nova-core/nova-drm pieces we've brought in some
rust abstractions through this tree, aux bus being the main one, with
devres changes also in the driver-core tree. Along with the drm core
abstractions and enough nova-core/nova-drm to use them. This is still
all stub work under construction, to build the nova driver upstream.
The other big NVIDIA related one is nouveau adds support for
Hopper/Blackwell GPUs, this required a new GSP firmware update to
570.144, and a bunch of rework in order to support multiple fw
interfaces.
There is also the introduction of an asahi uapi header file as a
precursor to getting the real driver in later, but to unblock
userspace mesa packages while the driver is trapped behind rust
enablement.
Otherwise it's the usual mixture of stuff all over, amdgpu, i915/xe,
and msm being the main ones, and some changes to vsprintf.
new drivers:
- bring in the asahi uapi header standalone
- nova-drm: stub driver
rust dependencies (for nova-core):
- auxiliary
- bus abstractions
- driver registration
- sample driver
- devres changes from driver-core
- revocable changes
core:
- add Apple fourcc modifiers
- add virtio capset definitions
- extend EXPORT_SYNC_FILE for timeline syncobjs
- convert to devm_platform_ioremap_resource
- refactor shmem helper page pinning
- DP powerup/down link helpers
- extended %p4cc in vsprintf.c to support fourcc prints
- change vsprintf %p4cn to %p4chR, remove %p4cn
- Add drm_file_err function
- IN_FORMATS_ASYNC property
- move sitronix from tiny to their own subdir
rust:
- add drm core infrastructure rust abstractions
(device/driver, ioctl, file, gem)
dma-buf:
- adjust sg handling to not cache map on attach
- allow setting dma-device for import
- Add a helper to sort and deduplicate dma_fence arrays
docs:
- updated drm scheduler docs
- fbdev todo update
- fb rendering
- actual brightness
ttm:
- fix delayed destroy resv object
bridge:
- add kunit tests
- convert tc358775 to atomic
- convert drivers to devm_drm_bridge_alloc
- convert rk3066_hdmi to bridge driver
scheduler:
- add kunit tests
panel:
- refcount panels to improve lifetime handling
- Powertip PH128800T004-ZZA01
- NLT NL13676BC25-03F, Tianma TM070JDHG34-00
- Himax HX8279/HX8279-D DDIC
- Visionox G2647FB105
- Sitronix ST7571
- ZOTAC rotation quirk
vkms:
- allow attaching more displays
i915:
- xe3lpd display updates
- vrr refactor
- intel_display struct conversions
- xe2hpd memory type identification
- add link rate/count to i915_display_info
- cleanup VGA plane handling
- refactor HDCP GSC
- fix SLPC wait boosting reference counting
- add 20ms delay to engine reset
- fix fence release on early probe errors
xe:
- SRIOV updates
- BMG PCI ID update
- support separate firmware for each GT
- SVM fix, prelim SVM multi-device work
- export fan speed
- temp disable d3cold on BMG
- backup VRAM in PM notifier instead of suspend/freeze
- update xe_ttm_access_memory to use GPU for non-visible access
- fix guc_info debugfs for VFs
- use copy_from_user instead of __copy_from_user
- append PCIe gen5 limitations to xe_firmware document
amdgpu:
- DSC cleanup
- DC Scaling updates
- Fused I2C-over-AUX updates
- DMUB updates
- Use drm_file_err in amdgpu
- Enforce isolation updates
- Use new dma_fence helpers
- USERQ fixes
- Documentation updates
- SR-IOV updates
- RAS updates
- PSP 12 cleanups
- GC 9.5 updates
- SMU 13.x updates
- VCN / JPEG SR-IOV updates
amdkfd:
- Update error messages for SDMA
- Userptr updates
- XNACK fixes
radeon:
- CIK doorbell cleanup
nouveau:
- add support for NVIDIA r570 GSP firmware
- enable Hopper/Blackwell support
nova-core:
- fix task list
- register definition infrastructure
- move firmware into own rust module
- register auxiliary device for nova-drm
nova-drm:
- initial driver skeleton
msm:
- GPU:
- ACD (adaptive clock distribution) for X1-85
- drop fictional address_space_size
- improve GMU HFI response time out robustness
- fix crash when throttling during boot
- DPU:
- use single CTL path for flushing on DPU 5.x+
- improve SSPP allocation code for better sharing
- Enabled SmartDMA on SM8150, SC8180X, SC8280XP, SM8550
- Added SAR2130P support
- Disabled DSC support on MSM8937, MSM8917, MSM8953, SDM660
- DP:
- switch to new audio helpers
- better LTTPR handling
- DSI:
- Added support for SA8775P
- Added SAR2130P support
- HDMI:
- Switched to use new helpers for ACR data
- Fixed old standing issue of HPD not working in some cases
amdxdna:
- add dma-buf support
- allow empty command submits
renesas:
- add dma-buf support
- add zpos, alpha, blend support
panthor:
- fail properly for NO_MMAP bos
- add SET_LABEL ioctl
- debugfs BO dumping support
imagination:
- update DT bindings
- support TI AM68 GPU
hibmc:
- improve interrupt handling and HPD support
virtio:
- add panic handler support
rockchip:
- add RK3588 support
- add DP AUX bus panel support
ivpu:
- add heartbeat based hangcheck
mediatek:
- prepares support for MT8195/99 HDMIv2/DDCv2
anx7625:
- improve HPD
tegra:
- speed up firmware loading
* tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernel: (1627 commits)
drm/nouveau/tegra: Fix error pointer vs NULL return in nvkm_device_tegra_resource_addr()
drm/xe: Default auto_link_downgrade status to false
drm/xe/guc: Make creation of SLPC debugfs files conditional
drm/i915/display: Add check for alloc_ordered_workqueue() and alloc_workqueue()
drm/i915/dp_mst: Work around Thunderbolt sink disconnect after SINK_COUNT_ESI read
drm/i915/ptl: Use everywhere the correct DDI port clock select mask
drm/nouveau/kms: add support for GB20x
drm/dp: add option to disable zero sized address only transactions.
drm/nouveau: add support for GB20x
drm/nouveau/gsp: add hal for fifo.chan.doorbell_handle
drm/nouveau: add support for GB10x
drm/nouveau/gf100-: track chan progress with non-WFI semaphore release
drm/nouveau/nv50-: separate CHANNEL_GPFIFO handling out from CHANNEL_DMA
drm/nouveau: add helper functions for allocating pinned/cpu-mapped bos
drm/nouveau: add support for GH100
drm/nouveau: improve handling of 64-bit BARs
drm/nouveau/gv100-: switch to volta semaphore methods
drm/nouveau/gsp: support deeper page tables in COPY_SERVER_RESERVED_PDES
drm/nouveau/gsp: init client VMMs with NV0080_CTRL_DMA_SET_PAGE_DIRECTORY
drm/nouveau/gsp: fetch level shift and PDE from BAR2 VMM
...
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/mes_userqueue.c')
| -rw-r--r-- | drivers/gpu/drm/amd/amdgpu/mes_userqueue.c | 355 |
1 files changed, 355 insertions, 0 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c b/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c new file mode 100644 index 000000000000..d6f50b13e2ba --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c @@ -0,0 +1,355 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright 2024 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + */ +#include "amdgpu.h" +#include "amdgpu_gfx.h" +#include "mes_userqueue.h" +#include "amdgpu_userq_fence.h" + +#define AMDGPU_USERQ_PROC_CTX_SZ PAGE_SIZE +#define AMDGPU_USERQ_GANG_CTX_SZ PAGE_SIZE + +static int +mes_userq_map_gtt_bo_to_gart(struct amdgpu_bo *bo) +{ + int ret; + + ret = amdgpu_bo_reserve(bo, true); + if (ret) { + DRM_ERROR("Failed to reserve bo. ret %d\n", ret); + goto err_reserve_bo_failed; + } + + ret = amdgpu_ttm_alloc_gart(&bo->tbo); + if (ret) { + DRM_ERROR("Failed to bind bo to GART. ret %d\n", ret); + goto err_map_bo_gart_failed; + } + + amdgpu_bo_unreserve(bo); + bo = amdgpu_bo_ref(bo); + + return 0; + +err_map_bo_gart_failed: + amdgpu_bo_unreserve(bo); +err_reserve_bo_failed: + return ret; +} + +static int +mes_userq_create_wptr_mapping(struct amdgpu_userq_mgr *uq_mgr, + struct amdgpu_usermode_queue *queue, + uint64_t wptr) +{ + struct amdgpu_bo_va_mapping *wptr_mapping; + struct amdgpu_vm *wptr_vm; + struct amdgpu_userq_obj *wptr_obj = &queue->wptr_obj; + int ret; + + wptr_vm = queue->vm; + ret = amdgpu_bo_reserve(wptr_vm->root.bo, false); + if (ret) + return ret; + + wptr &= AMDGPU_GMC_HOLE_MASK; + wptr_mapping = amdgpu_vm_bo_lookup_mapping(wptr_vm, wptr >> PAGE_SHIFT); + amdgpu_bo_unreserve(wptr_vm->root.bo); + if (!wptr_mapping) { + DRM_ERROR("Failed to lookup wptr bo\n"); + return -EINVAL; + } + + wptr_obj->obj = wptr_mapping->bo_va->base.bo; + if (wptr_obj->obj->tbo.base.size > PAGE_SIZE) { + DRM_ERROR("Requested GART mapping for wptr bo larger than one page\n"); + return -EINVAL; + } + + ret = mes_userq_map_gtt_bo_to_gart(wptr_obj->obj); + if (ret) { + DRM_ERROR("Failed to map wptr bo to GART\n"); + return ret; + } + + queue->wptr_obj.gpu_addr = amdgpu_bo_gpu_offset_no_check(wptr_obj->obj); + return 0; +} + +static int convert_to_mes_priority(int priority) +{ + switch (priority) { + case AMDGPU_USERQ_CREATE_FLAGS_QUEUE_PRIORITY_NORMAL_LOW: + default: + return AMDGPU_MES_PRIORITY_LEVEL_NORMAL; + case AMDGPU_USERQ_CREATE_FLAGS_QUEUE_PRIORITY_LOW: + return AMDGPU_MES_PRIORITY_LEVEL_LOW; + case AMDGPU_USERQ_CREATE_FLAGS_QUEUE_PRIORITY_NORMAL_HIGH: + return AMDGPU_MES_PRIORITY_LEVEL_MEDIUM; + case AMDGPU_USERQ_CREATE_FLAGS_QUEUE_PRIORITY_HIGH: + return AMDGPU_MES_PRIORITY_LEVEL_HIGH; + } +} + +static int mes_userq_map(struct amdgpu_userq_mgr *uq_mgr, + struct amdgpu_usermode_queue *queue) +{ + struct amdgpu_device *adev = uq_mgr->adev; + struct amdgpu_userq_obj *ctx = &queue->fw_obj; + struct amdgpu_mqd_prop *userq_props = queue->userq_prop; + struct mes_add_queue_input queue_input; + int r; + + memset(&queue_input, 0x0, sizeof(struct mes_add_queue_input)); + + queue_input.process_va_start = 0; + queue_input.process_va_end = adev->vm_manager.max_pfn - 1; + + /* set process quantum to 10 ms and gang quantum to 1 ms as default */ + queue_input.process_quantum = 100000; + queue_input.gang_quantum = 10000; + queue_input.paging = false; + + queue_input.process_context_addr = ctx->gpu_addr; + queue_input.gang_context_addr = ctx->gpu_addr + AMDGPU_USERQ_PROC_CTX_SZ; + queue_input.inprocess_gang_priority = AMDGPU_MES_PRIORITY_LEVEL_NORMAL; + queue_input.gang_global_priority_level = convert_to_mes_priority(queue->priority); + + queue_input.process_id = queue->vm->pasid; + queue_input.queue_type = queue->queue_type; + queue_input.mqd_addr = queue->mqd.gpu_addr; + queue_input.wptr_addr = userq_props->wptr_gpu_addr; + queue_input.queue_size = userq_props->queue_size >> 2; + queue_input.doorbell_offset = userq_props->doorbell_index; + queue_input.page_table_base_addr = amdgpu_gmc_pd_addr(queue->vm->root.bo); + queue_input.wptr_mc_addr = queue->wptr_obj.gpu_addr; + + amdgpu_mes_lock(&adev->mes); + r = adev->mes.funcs->add_hw_queue(&adev->mes, &queue_input); + amdgpu_mes_unlock(&adev->mes); + if (r) { + DRM_ERROR("Failed to map queue in HW, err (%d)\n", r); + return r; + } + + DRM_DEBUG_DRIVER("Queue (doorbell:%d) mapped successfully\n", userq_props->doorbell_index); + return 0; +} + +static int mes_userq_unmap(struct amdgpu_userq_mgr *uq_mgr, + struct amdgpu_usermode_queue *queue) +{ + struct amdgpu_device *adev = uq_mgr->adev; + struct mes_remove_queue_input queue_input; + struct amdgpu_userq_obj *ctx = &queue->fw_obj; + int r; + + memset(&queue_input, 0x0, sizeof(struct mes_remove_queue_input)); + queue_input.doorbell_offset = queue->doorbell_index; + queue_input.gang_context_addr = ctx->gpu_addr + AMDGPU_USERQ_PROC_CTX_SZ; + + amdgpu_mes_lock(&adev->mes); + r = adev->mes.funcs->remove_hw_queue(&adev->mes, &queue_input); + amdgpu_mes_unlock(&adev->mes); + if (r) + DRM_ERROR("Failed to unmap queue in HW, err (%d)\n", r); + return r; +} + +static int mes_userq_create_ctx_space(struct amdgpu_userq_mgr *uq_mgr, + struct amdgpu_usermode_queue *queue, + struct drm_amdgpu_userq_in *mqd_user) +{ + struct amdgpu_userq_obj *ctx = &queue->fw_obj; + int r, size; + + /* + * The FW expects at least one page space allocated for + * process ctx and gang ctx each. Create an object + * for the same. + */ + size = AMDGPU_USERQ_PROC_CTX_SZ + AMDGPU_USERQ_GANG_CTX_SZ; + r = amdgpu_userq_create_object(uq_mgr, ctx, size); + if (r) { + DRM_ERROR("Failed to allocate ctx space bo for userqueue, err:%d\n", r); + return r; + } + + return 0; +} + +static int mes_userq_mqd_create(struct amdgpu_userq_mgr *uq_mgr, + struct drm_amdgpu_userq_in *args_in, + struct amdgpu_usermode_queue *queue) +{ + struct amdgpu_device *adev = uq_mgr->adev; + struct amdgpu_mqd *mqd_hw_default = &adev->mqds[queue->queue_type]; + struct drm_amdgpu_userq_in *mqd_user = args_in; + struct amdgpu_mqd_prop *userq_props; + int r; + + /* Structure to initialize MQD for userqueue using generic MQD init function */ + userq_props = kzalloc(sizeof(struct amdgpu_mqd_prop), GFP_KERNEL); + if (!userq_props) { + DRM_ERROR("Failed to allocate memory for userq_props\n"); + return -ENOMEM; + } + + if (!mqd_user->wptr_va || !mqd_user->rptr_va || + !mqd_user->queue_va || mqd_user->queue_size == 0) { + DRM_ERROR("Invalid MQD parameters for userqueue\n"); + r = -EINVAL; + goto free_props; + } + + r = amdgpu_userq_create_object(uq_mgr, &queue->mqd, mqd_hw_default->mqd_size); + if (r) { + DRM_ERROR("Failed to create MQD object for userqueue\n"); + goto free_props; + } + + /* Initialize the MQD BO with user given values */ + userq_props->wptr_gpu_addr = mqd_user->wptr_va; + userq_props->rptr_gpu_addr = mqd_user->rptr_va; + userq_props->queue_size = mqd_user->queue_size; + userq_props->hqd_base_gpu_addr = mqd_user->queue_va; + userq_props->mqd_gpu_addr = queue->mqd.gpu_addr; + userq_props->use_doorbell = true; + userq_props->doorbell_index = queue->doorbell_index; + userq_props->fence_address = queue->fence_drv->gpu_addr; + + if (queue->queue_type == AMDGPU_HW_IP_COMPUTE) { + struct drm_amdgpu_userq_mqd_compute_gfx11 *compute_mqd; + + if (mqd_user->mqd_size != sizeof(*compute_mqd)) { + DRM_ERROR("Invalid compute IP MQD size\n"); + r = -EINVAL; + goto free_mqd; + } + + compute_mqd = memdup_user(u64_to_user_ptr(mqd_user->mqd), mqd_user->mqd_size); + if (IS_ERR(compute_mqd)) { + DRM_ERROR("Failed to read user MQD\n"); + r = -ENOMEM; + goto free_mqd; + } + + userq_props->eop_gpu_addr = compute_mqd->eop_va; + userq_props->hqd_pipe_priority = AMDGPU_GFX_PIPE_PRIO_NORMAL; + userq_props->hqd_queue_priority = AMDGPU_GFX_QUEUE_PRIORITY_MINIMUM; + userq_props->hqd_active = false; + userq_props->tmz_queue = + mqd_user->flags & AMDGPU_USERQ_CREATE_FLAGS_QUEUE_SECURE; + kfree(compute_mqd); + } else if (queue->queue_type == AMDGPU_HW_IP_GFX) { + struct drm_amdgpu_userq_mqd_gfx11 *mqd_gfx_v11; + + if (mqd_user->mqd_size != sizeof(*mqd_gfx_v11) || !mqd_user->mqd) { + DRM_ERROR("Invalid GFX MQD\n"); + r = -EINVAL; + goto free_mqd; + } + + mqd_gfx_v11 = memdup_user(u64_to_user_ptr(mqd_user->mqd), mqd_user->mqd_size); + if (IS_ERR(mqd_gfx_v11)) { + DRM_ERROR("Failed to read user MQD\n"); + r = -ENOMEM; + goto free_mqd; + } + + userq_props->shadow_addr = mqd_gfx_v11->shadow_va; + userq_props->csa_addr = mqd_gfx_v11->csa_va; + userq_props->tmz_queue = + mqd_user->flags & AMDGPU_USERQ_CREATE_FLAGS_QUEUE_SECURE; + kfree(mqd_gfx_v11); + } else if (queue->queue_type == AMDGPU_HW_IP_DMA) { + struct drm_amdgpu_userq_mqd_sdma_gfx11 *mqd_sdma_v11; + + if (mqd_user->mqd_size != sizeof(*mqd_sdma_v11) || !mqd_user->mqd) { + DRM_ERROR("Invalid SDMA MQD\n"); + r = -EINVAL; + goto free_mqd; + } + + mqd_sdma_v11 = memdup_user(u64_to_user_ptr(mqd_user->mqd), mqd_user->mqd_size); + if (IS_ERR(mqd_sdma_v11)) { + DRM_ERROR("Failed to read sdma user MQD\n"); + r = -ENOMEM; + goto free_mqd; + } + + userq_props->csa_addr = mqd_sdma_v11->csa_va; + kfree(mqd_sdma_v11); + } + + queue->userq_prop = userq_props; + + r = mqd_hw_default->init_mqd(adev, (void *)queue->mqd.cpu_ptr, userq_props); + if (r) { + DRM_ERROR("Failed to initialize MQD for userqueue\n"); + goto free_mqd; + } + + /* Create BO for FW operations */ + r = mes_userq_create_ctx_space(uq_mgr, queue, mqd_user); + if (r) { + DRM_ERROR("Failed to allocate BO for userqueue (%d)", r); + goto free_mqd; + } + + /* FW expects WPTR BOs to be mapped into GART */ + r = mes_userq_create_wptr_mapping(uq_mgr, queue, userq_props->wptr_gpu_addr); + if (r) { + DRM_ERROR("Failed to create WPTR mapping\n"); + goto free_ctx; + } + + return 0; + +free_ctx: + amdgpu_userq_destroy_object(uq_mgr, &queue->fw_obj); + +free_mqd: + amdgpu_userq_destroy_object(uq_mgr, &queue->mqd); + +free_props: + kfree(userq_props); + + return r; +} + +static void +mes_userq_mqd_destroy(struct amdgpu_userq_mgr *uq_mgr, + struct amdgpu_usermode_queue *queue) +{ + amdgpu_userq_destroy_object(uq_mgr, &queue->fw_obj); + kfree(queue->userq_prop); + amdgpu_userq_destroy_object(uq_mgr, &queue->mqd); +} + +const struct amdgpu_userq_funcs userq_mes_funcs = { + .mqd_create = mes_userq_mqd_create, + .mqd_destroy = mes_userq_mqd_destroy, + .unmap = mes_userq_unmap, + .map = mes_userq_map, +}; |
