path: root/drivers/iommu/intel/svm.c
* iommu/vt-d: Check if SVA is supported when attaching the SVA domain (Jason Gunthorpe, 2025-03-10; 1 file changed, -0/+43)
  Attaching an SVA domain should fail if SVA is not supported, so move the check for SVA support out of IOMMU_DEV_FEAT_SVA and into attach. Also perform the check when allocating an SVA domain, to match other drivers.
  Signed-off-by: Jason Gunthorpe <[email protected]>
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Reviewed-by: Yi Liu <[email protected]>
  Tested-by: Zhangfei Gao <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Make intel_svm_set_dev_pasid() support domain replacement (Yi Liu, 2024-11-08; 1 file changed, -2/+3)
  Make intel_svm_set_dev_pasid() support replacement.
  Signed-off-by: Yi Liu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Consolidate the struct dev_pasid_info add/remove (Yi Liu, 2024-11-08; 1 file changed, -21/+7)
  domain_add_dev_pasid() and domain_remove_dev_pasid() are added to consolidate the adding and removing of struct dev_pasid_info; this also covers the cache tag assign/unassign. It prepares for adding domain replacement for a PASID, where the set_dev_pasid callbacks need to deal with the dev_pasid_info for both the old and the new domain, and these two helpers make that easier. intel_iommu_set_dev_pasid() and intel_svm_set_dev_pasid() are updated to use the helpers.
  Reviewed-by: Lu Baolu <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Signed-off-by: Yi Liu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu: Pass old domain to set_dev_pasid op (Yi Liu, 2024-11-08; 1 file changed, -1/+2)
  To support domain replacement for a PASID, the underlying iommu driver needs to know the old domain so that it can clean up the existing attachment. It is much more convenient for the iommu layer to pass down the old domain. Otherwise, iommu drivers would need to track domains for PASIDs by themselves, which would duplicate code among the drivers, or they would rely on group->pasid_array to get the domain, which may not always be the correct one.
  Suggested-by: Jason Gunthorpe <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Reviewed-by: Lu Baolu <[email protected]>
  Reviewed-by: Nicolin Chen <[email protected]>
  Reviewed-by: Vasant Hegde <[email protected]>
  Signed-off-by: Yi Liu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
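  A rough sketch of what the entry above describes: the set_dev_pasid op in iommu_domain_ops gains the old domain as an extra parameter. The surrounding members are omitted and the exact layout is an approximation, not a copy of the upstream header.

      /* Sketch only: set_dev_pasid now receives the previously attached
       * domain so the driver can tear down the old attachment itself. */
      struct iommu_domain_ops {
              /* ... other ops elided ... */
              int (*set_dev_pasid)(struct iommu_domain *domain,
                                   struct device *dev, ioasid_t pasid,
                                   struct iommu_domain *old);
              /* ... */
      };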
* iommu/vt-d: Separate page request queue from SVM (Joel Granados, 2024-11-05; 1 file changed, -397/+0)
  IO page faults are no longer dependent on CONFIG_INTEL_IOMMU_SVM. Move all Page Request Queue (PRQ) functions that handle prq events to a new file, drivers/iommu/intel/prq.c. The page_req_dsc struct is now declared in drivers/iommu/intel/prq.c. No functional changes are intended. This is a preparation patch to enable the use of IO page faults outside the SVM/PASID use cases.
  Signed-off-by: Joel Granados <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Add qi_batch for dmar_domain (Lu Baolu, 2024-09-02; 1 file changed, -1/+4)
  Introduces a qi_batch structure to hold batched cache invalidation descriptors on a per-dmar_domain basis. A fixed-size descriptor array is used for simplicity. The qi_batch is allocated when the first cache tag is added to the domain and freed during iommu_free_domain().
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Tina Zhang <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
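  A minimal sketch of the qi_batch idea from the entry above. The field names and the batch depth here are assumptions for illustration, not the upstream definitions.

      /* Sketch: a fixed-size array of queued-invalidation descriptors kept
       * per dmar_domain; descriptors are accumulated and submitted in one
       * batch, and the batch is flushed whenever the array fills up. */
      #define QI_MAX_BATCHED_DESC     16      /* assumed batch depth */

      struct qi_batch {
              struct qi_desc descs[QI_MAX_BATCHED_DESC]; /* batched descriptors */
              unsigned int index;     /* next free slot; flush when it reaches the end */
      };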
* iommu/vt-d: Add support for static identity domain (Lu Baolu, 2024-09-02; 1 file changed, -1/+1)
  Software determines VT-d hardware support for passthrough translation by inspecting the capability register. If passthrough translation is not supported, the device is instructed to use a DMA domain as its default domain. Add a global static identity domain with guaranteed attach semantics for IOMMUs that support passthrough translation mode. The global static identity domain is a dummy domain without a corresponding dmar_domain structure; consequently, the device's info->domain will be NULL while the identity domain is attached. Refactor the code accordingly.
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'core' and 'x86/vt-d' into next (Joerg Roedel, 2024-05-13; 1 file changed, -297/+86)
* iommu/vt-d: Remove struct intel_svm (Lu Baolu, 2024-04-26; 1 file changed, -73/+26)
  The struct intel_svm was used to keep attached-device info for an SVA domain. Since an SVA domain is a kind of iommu_domain, struct dmar_domain should centralize all info of an SVA domain, including the info of attached devices. Therefore, retire struct intel_svm and clean up the code. Besides, register the mmu notifier callback in the domain_alloc_sva() callback, which allows the memory management notifier lifetime to follow the lifetime of the iommu_domain. Call mmu_notifier_put() in the domain free path and defer the real free to the mmu free_notifier callback.
  Co-developed-by: Tina Zhang <[email protected]>
  Signed-off-by: Tina Zhang <[email protected]>
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Remove intel_svm_dev (Lu Baolu, 2024-04-26; 1 file changed, -93/+37)
  The intel_svm_dev data structure used in the sva implementation for the Intel IOMMU driver stores information about a device attached to an SVA domain. It is a duplicate of dev_pasid_info that serves the same purpose. Replace intel_svm_dev with dev_pasid_info and clean up the use of intel_svm_dev.
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Use cache helpers in arch_invalidate_secondary_tlbs (Lu Baolu, 2024-04-26; 1 file changed, -71/+10)
  The arch_invalidate_secondary_tlbs callback is called in the SVA mm notification path. It invalidates all or a range of caches after the CPU page table is modified. Use the cache tag helpers in this path. mm_types defines vm_end as the first byte after the end address, which differs from the iommu gather API, hence convert the end parameter from the mm_types convention to the iommu gather scheme before calling the cache_tag helper.
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
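  A minimal sketch of the end-address conversion described in the entry above: mm ranges are half-open [start, end) with end pointing one byte past the range, while the gather-style helpers want the last byte inside the range. The wrapper name and the exact flush-helper call are assumptions, not the actual driver code.

      /* Sketch: convert an exclusive mm-style end into the inclusive end
       * expected by the cache-tag range flush before invoking it. */
      static void example_flush_secondary_tlbs(struct dmar_domain *domain,
                                               unsigned long start,
                                               unsigned long end)
      {
              /* 'end' is first byte after the range, so pass 'end - 1' */
              cache_tag_flush_range(domain, start, end - 1, 0);
      }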
* iommu/vt-d: Add cache tag assignment interface (Lu Baolu, 2024-04-26; 1 file changed, -1/+9)
  A caching tag is a combination of tags used by the hardware to cache various translations. Whenever a mapping in a domain is changed, the IOMMU driver should invalidate the caches with the caching tags. The VT-d specification describes caching tags in section 6.2.1, Tagging of Cached Translations. Add an interface to assign caching tags to an IOMMU domain when it is attached to a RID or PASID, and to unassign caching tags when the domain is detached from the RID or PASID. All caching tags are listed in the per-domain tag list and are protected by a dedicated lock. In addition to the basic IOTLB and devTLB caching tag types, NESTING_IOTLB and NESTING_DEVTLB tag types are also introduced. These tags are used for caches that store translations for DMA accesses through a nested user domain; they are affected by changes to mappings in the parent domain.
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
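  A simplified sketch of the caching-tag bookkeeping the entry above describes. The real structure carries more context (iommu, device, domain id, reference count), so treat this layout as an assumption for illustration only.

      /* Sketch: the four tag types named above, plus a minimal per-domain
       * list node; walking the domain's tag list drives invalidation. */
      enum cache_tag_type {
              CACHE_TAG_IOTLB,                /* IOMMU TLB entries */
              CACHE_TAG_DEVTLB,               /* device TLB (ATS) entries */
              CACHE_TAG_NESTING_IOTLB,        /* IOTLB for nested user domains */
              CACHE_TAG_NESTING_DEVTLB,       /* devTLB for nested user domains */
      };

      struct cache_tag {
              struct list_head node;          /* linked in the per-domain tag list */
              enum cache_tag_type type;
              ioasid_t pasid;                 /* PASID the tag applies to (or the RID default) */
      };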
* iommu/vt-d: Remove private data use in fault message (Jingqi Liu, 2024-04-26; 1 file changed, -60/+15)
  According to Intel VT-d specification revision 4.0, the "Private Data" field has been removed from Page Request/Response. Since the private data field is not used in the fault message, remove the related definitions in the page request descriptor and remove the related code in the page request/response handler, as Intel hasn't shipped any products which support private data in the page request message.
  Signed-off-by: Jingqi Liu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Remove debugfs use of private data field (Jingqi Liu, 2024-04-26; 1 file changed, -9/+0)
  Since the page fault report and response have been tracked by ftrace, the users can easily calculate the time used for a page fault handling. There's no need to expose the similar functionality in debugfs. Hence, remove the corresponding operations in debugfs.
  Signed-off-by: Jingqi Liu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: add wrapper functions for page allocations (Pasha Tatashin, 2024-04-15; 1 file changed, -6/+5)
  In order to improve observability and accountability of the IOMMU layer, we must account for the number of pages that are allocated by functions that call directly into the buddy allocator. This is achieved by first wrapping the allocation-related functions into separate inline functions in a new file, drivers/iommu/iommu-pages.h. Convert all page allocation calls under iommu/intel to use these new functions.
  Signed-off-by: Pasha Tatashin <[email protected]>
  Acked-by: David Rientjes <[email protected]>
  Tested-by: Bagas Sanjaya <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
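  A simplified sketch of the kind of wrapper the entry above introduces. The function and counter names here are illustrative and are not the actual iommu-pages.h API; only alloc_pages_node() and page_address() are standard kernel calls.

      /* Sketch: allocate pages for IOMMU page tables through one wrapper so
       * the allocation is accounted instead of hitting the buddy allocator
       * untracked. The counter here stands in for the real accounting. */
      static atomic_long_t iommu_pages_allocated;     /* illustrative counter */

      static inline void *iommu_alloc_pages_node_sketch(int nid, gfp_t gfp,
                                                        int order)
      {
              struct page *page = alloc_pages_node(nid, gfp | __GFP_ZERO, order);

              if (!page)
                      return NULL;
              atomic_long_add(1UL << order, &iommu_pages_allocated);
              return page_address(page);
      }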
* iommu/vt-d: Allocate local memory for page request queue (Jacob Pan, 2024-04-12; 1 file changed, -1/+1)
  The page request queue is per IOMMU, its allocation should be made NUMA-aware for performance reasons.
  Fixes: a222a7f0bb6c ("iommu/vt-d: Implement page request handling")
  Signed-off-by: Jacob Pan <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
* Merge branches 'arm/mediatek', 'arm/renesas', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next (Joerg Roedel, 2024-03-08; 1 file changed, -56/+20)
* iommu: Make iommu_report_device_fault() return void (Lu Baolu, 2024-02-16; 1 file changed, -12/+7)
  As the iommu_report_device_fault() has been converted to auto-respond a page fault if it fails to enqueue it, there's no need to return a code in any case. Make it return void.
  Suggested-by: Jason Gunthorpe <[email protected]>
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu: Make iopf_group_response() return void (Lu Baolu, 2024-02-16; 1 file changed, -16/+2)
  The iopf_group_response() should return void, as nothing can do anything with the failure. This implies that ops->page_response() must also return void; this is consistent with what the drivers do. The failure paths, which are all integrity validations of the fault, should be WARN_ON'd, not return codes. If the iommu core fails to enqueue the fault, it should respond to the fault directly by calling ops->page_response() instead of returning an error number and relying on the iommu drivers to do so. Consolidate the error fault handling code in the core.
  Co-developed-by: Jason Gunthorpe <[email protected]>
  Signed-off-by: Jason Gunthorpe <[email protected]>
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu: Separate SVA and IOPF (Lu Baolu, 2024-02-16; 1 file changed, -1/+0)
  Add CONFIG_IOMMU_IOPF for page fault handling framework and select it from its real consumer. Move iopf function declaration from iommu-sva.h to iommu.h and remove iommu-sva.h as it's empty now. Consolidate all SVA related code into iommu-sva.c:
    - Move iommu_sva_domain_alloc() from iommu.c to iommu-sva.c.
    - Move sva iopf handling code from io-pgfault.c to iommu-sva.c.
  Consolidate iommu_report_device_fault() and iommu_page_response() into io-pgfault.c. Export iopf_free_group() and iopf_group_response() for iopf handlers implemented in modules. Some functions are renamed with more meaningful names. No other intentional functionality changes.
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Tested-by: Yan Zhao <[email protected]>
  Tested-by: Longfang Liu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu: Merge iommu_fault_event and iopf_fault (Lu Baolu, 2024-02-16; 1 file changed, -3/+2)
  The iommu_fault_event and iopf_fault data structures store the same information about an iopf fault. They are also used in the same way. Merge these two data structures into a single one to make the code more concise and easier to maintain.
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Reviewed-by: Yi Liu <[email protected]>
  Tested-by: Yan Zhao <[email protected]>
  Tested-by: Longfang Liu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Use device rbtree in iopf reporting path (Lu Baolu, 2024-03-01; 1 file changed, -8/+9)
  The existing I/O page fault handler currently locates the PCI device by calling pci_get_domain_bus_and_slot(). This function searches the list of all PCI devices until the desired device is found. To improve lookup efficiency, replace it with device_rbtree_find() to search the device within the probed device rbtree. The I/O page fault is initiated by the device, which does not have any synchronization mechanism with the software to ensure that the device stays in the probed device tree. Theoretically, a device could be released by the IOMMU subsystem after device_rbtree_find() and before iopf_get_dev_fault_param(), which would cause a use-after-free problem. Add a mutex to synchronize the I/O page fault reporting path and the IOMMU release device path. This lock doesn't introduce any performance overhead, as the conflict between I/O page fault reporting and device releasing is very rare.
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
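  A pattern sketch of the lookup-under-lock idea described in the entry above. The function and field names are illustrative; the actual prq handler and lock name in the driver may differ.

      /* Sketch: hold one mutex across the rbtree lookup and the fault
       * report, and take the same mutex in release_device, so the device
       * cannot be freed between the lookup and the report. */
      static void example_handle_prq_event(struct intel_iommu *iommu, u16 rid,
                                           struct iopf_fault *fault)
      {
              struct device *dev;

              mutex_lock(&iommu->iopf_lock);
              dev = device_rbtree_find(iommu, rid);
              if (dev)
                      iommu_report_device_fault(dev, fault);
              mutex_unlock(&iommu->iopf_lock);
              /* if no device was found, the caller responds to the fault directly */
      }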
* iommu/vt-d: Merge intel_svm_bind_mm() into its caller (Tina Zhang, 2024-03-01; 1 file changed, -11/+3)
  intel_svm_set_dev_pasid() is the only caller of intel_svm_bind_mm(). Merge them and remove intel_svm_bind_mm(). No functional change intended.
  Signed-off-by: Tina Zhang <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Remove initialization for dynamically heap-allocated rcu_head (Tina Zhang, 2024-03-01; 1 file changed, -1/+0)
  The rcu_head structures allocated dynamically in the heap don't need any initialization. Therefore, remove the init_rcu_head().
  Signed-off-by: Tina Zhang <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Remove treatment for revoking PASIDs with pending page faults (Tina Zhang, 2024-03-01; 1 file changed, -7/+0)
  Commit 2f26e0a9c986 ("iommu/vt-d: Add basic SVM PASID support") added a special treatment to mandate that no page faults may be outstanding for the PASID after intel_svm_unbind_mm() is called, as the PASID will be released and reused after unbind. This is no longer necessary, as the driver's remove_dev_pasid path already ensures there are no outstanding page faults:
    - Tear down the pasid entry, which guarantees that new page faults for the PASID will be rejected by the iommu hardware.
    - All outstanding page faults have been responded to.
    - All hardware pending faults are drained in intel_drain_pasid_prq().
  Remove this unnecessary code.
  Signed-off-by: Tina Zhang <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
* Merge branches 'apple/dart', 'arm/rockchip', 'arm/smmu', 'virtio', 'x86/vt-d', 'x86/amd' and 'core' into next (Joerg Roedel, 2024-01-03; 1 file changed, -24/+10)
* iommu/vt-d: Remove mm->pasid in intel_sva_bind_mm() (Tina Zhang, 2023-12-12; 1 file changed, -7/+7)
  The pasid is passed in as a parameter through the .set_dev_pasid() callback. Thus, intel_sva_bind_mm() can directly use it instead of retrieving the pasid value from mm->pasid.
  Suggested-by: Lu Baolu <[email protected]>
  Reviewed-by: Lu Baolu <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Signed-off-by: Tina Zhang <[email protected]>
  Signed-off-by: Jason Gunthorpe <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Refactor device_to_iommu() to retrieve iommu directly (Lu Baolu, 2023-12-19; 1 file changed, -17/+3)
  The device_to_iommu() helper was originally designed to look up the DMAR ACPI table to retrieve the iommu device and the request ID for a given device. However, it was also being used in other places where there was no need to look up the ACPI table at all. Retrieve the iommu device directly from the per-device iommu private data in functions called after the device is probed. Rename the original device_to_iommu() function to a more meaningful name, device_lookup_iommu(), to avoid misusing it.
  Signed-off-by: Lu Baolu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Fix incorrect cache invalidation for mm notification (Lu Baolu, 2023-11-27; 1 file changed, -0/+26)
  Commit 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs") moved the secondary TLB invalidations into the TLB invalidation functions to ensure that all secondary TLB invalidations happen at the same time as the CPU invalidation, and added a flush-all type of secondary TLB invalidation for the batched mode, where a range of [0, -1UL) is used to indicate that the range extends to the end of the address space. However, using an end address of -1UL caused an overflow in the Intel IOMMU driver, where the end address was rounded up to the next page. As a result, both the IOTLB and the device ATC were not invalidated correctly. Add a flush-all helper function and call it when the invalidation range is from 0 to -1UL, ensuring that the entire caches are invalidated correctly.
  Fixes: 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs")
  Cc: [email protected]
  Cc: Huang Ying <[email protected]>
  Cc: Alistair Popple <[email protected]>
  Tested-by: Luo Yuzhang <[email protected]> # QAT
  Tested-by: Tony Zhu <[email protected]> # DSA
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Reviewed-by: Alistair Popple <[email protected]>
  Signed-off-by: Lu Baolu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
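  The overflow the entry above fixes is easy to see with a small arithmetic sketch. The helper names are hypothetical; only the rounding behaviour is the point: with an exclusive end of -1UL (ULONG_MAX), rounding up to the next page boundary wraps to 0, so a naive page count flushes nothing.

      static void example_flush_all(void) { /* flush the whole address space */ }
      static void example_flush_range(unsigned long start, unsigned long pages) { }

      /* Sketch: check for the flush-all sentinel before doing range math. */
      static void example_invalidate(unsigned long start, unsigned long end)
      {
              unsigned long aligned_end, pages;

              if (start == 0 && end == -1UL) {
                      example_flush_all();    /* hypothetical flush-all helper */
                      return;
              }

              aligned_end = ALIGN(end, PAGE_SIZE);    /* ALIGN(-1UL, PAGE_SIZE) == 0 */
              pages = (aligned_end - (start & PAGE_MASK)) >> PAGE_SHIFT;
              example_flush_range(start, pages);      /* hypothetical range flush */
      }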
* Merge tag 'iommu-updates-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu (Linus Torvalds, 2023-09-01; 1 file changed, -55/+7)
  Pull iommu updates from Joerg Roedel:
  "Core changes:
    - Consolidate probe_device path
    - Make the PCI-SAC IOVA allocation trick PCI-only
  AMD IOMMU:
    - Consolidate PPR log handling
    - Interrupt handling improvements
    - Refcount fixes for amd_iommu_v2 driver
  Intel VT-d driver:
    - Enable idxd device DMA with pasid through iommu dma ops
    - Lift RESV_DIRECT check from VT-d driver to core
    - Miscellaneous cleanups and fixes
  ARM-SMMU drivers:
    - Device-tree binding updates:
       - Add additional compatible strings for Qualcomm SoCs
       - Allow ASIDs to be configured in the DT to work around Qualcomm's broken hypervisor
       - Fix clocks for Qualcomm's MSM8998 SoC
    - SMMUv2:
       - Support for Qualcomm's legacy firmware implementation featured on at least MSM8956 and MSM8976
       - Match compatible strings for Qualcomm SM6350 and SM6375 SoC variants
    - SMMUv3:
       - Use 'ida' instead of a bitmap for VMID allocation
  Rockchip IOMMU:
    - Lift page-table allocation restrictions on newer hardware
  Mediatek IOMMU:
    - Add MT8188 IOMMU Support
  Renesas IOMMU:
    - Allow PCIe devices
  .. and the usual set of cleanups and smaller fixes"
  * tag 'iommu-updates-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (64 commits)
    iommu: Explicitly include correct DT includes
    iommu/amd: Remove unused declarations
    iommu/arm-smmu-qcom: Add SM6375 SMMUv2
    iommu/arm-smmu-qcom: Add SM6350 DPU compatible
    iommu/arm-smmu-qcom: Add SM6375 DPU compatible
    iommu/arm-smmu-qcom: Sort the compatible list alphabetically
    dt-bindings: arm-smmu: Fix MSM8998 clocks description
    iommu/vt-d: Remove unused extern declaration dmar_parse_dev_scope()
    iommu/vt-d: Fix to convert mm pfn to dma pfn
    iommu/vt-d: Fix to flush cache of PASID directory table
    iommu/vt-d: Remove rmrr check in domain attaching device path
    iommu: Prevent RESV_DIRECT devices from blocking domains
    dmaengine/idxd: Re-enable kernel workqueue under DMA API
    iommu/vt-d: Add set_dev_pasid callback for dma domain
    iommu/vt-d: Prepare for set_dev_pasid callback
    iommu/vt-d: Make prq draining code generic
    iommu/vt-d: Remove pasid_mutex
    iommu/vt-d: Add domain_flush_pasid_iotlb()
    iommu: Move global PASID allocation from SVA to core
    iommu: Generalize PASID 0 for normal DMA w/o PASID
    ...
* iommu/vt-d: Make prq draining code generic (Lu Baolu, 2023-08-09; 1 file changed, -15/+2)
  Currently draining page requests and responses for a pasid is part of SVA implementation. This is because the driver only supports attaching an SVA domain to a device pasid. As we are about to support attaching other types of domains to a device pasid, the prq draining code becomes generic.
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Jacob Pan <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Remove pasid_mutex (Lu Baolu, 2023-08-09; 1 file changed, -40/+5)
  The pasid_mutex was used to protect the paths of set/remove_dev_pasid(). It duplicates iommu_sva_lock. Remove it to avoid duplicate code.
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Jacob Pan <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* mmu_notifiers: rename invalidate_range notifier (Alistair Popple, 2023-08-18; 1 file changed, -4/+4)
  There are two main use cases for mmu notifiers. One is by KVM, which uses mmu_notifier_invalidate_range_start()/end() to manage a software TLB. The other is to manage hardware TLBs, which need to use the invalidate_range() callback because HW can establish new TLB entries at any time. Hence using start/end() can lead to memory corruption, as these callbacks happen too soon/late during page unmap. mmu notifier users should therefore either use the start()/end() callbacks or the invalidate_range() callbacks. To make this usage clearer, rename the invalidate_range() callback to arch_invalidate_secondary_tlbs() and update documentation.
  Link: https://lkml.kernel.org/r/6f77248cd25545c8020a54b4e567e8b72be4dca1.1690292440.git-series.apopple@nvidia.com
  Signed-off-by: Alistair Popple <[email protected]>
  Suggested-by: Jason Gunthorpe <[email protected]>
  Acked-by: Catalin Marinas <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Cc: Andrew Donnellan <[email protected]>
  Cc: Chaitanya Kumar Borah <[email protected]>
  Cc: Frederic Barrat <[email protected]>
  Cc: Jason Gunthorpe <[email protected]>
  Cc: John Hubbard <[email protected]>
  Cc: Kevin Tian <[email protected]>
  Cc: Michael Ellerman <[email protected]>
  Cc: Nicholas Piggin <[email protected]>
  Cc: Nicolin Chen <[email protected]>
  Cc: Robin Murphy <[email protected]>
  Cc: Sean Christopherson <[email protected]>
  Cc: SeongJae Park <[email protected]>
  Cc: Tvrtko Ursulin <[email protected]>
  Cc: Will Deacon <[email protected]>
  Cc: Zhi Wang <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
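  A sketch of how a secondary-TLB user wires the renamed callback described in the entry above. The handler and ops names are illustrative, not the VT-d driver's actual implementation; the callback arguments follow the mmu_notifier_ops convention (notifier, mm, half-open range).

      /* Sketch: a hardware-TLB user implements the renamed callback so the
       * IOMMU (secondary) TLBs are invalidated together with the CPU TLBs. */
      static void example_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
                                                         struct mm_struct *mm,
                                                         unsigned long start,
                                                         unsigned long end)
      {
              /* invalidate IOMMU translations for [start, end) here */
      }

      static const struct mmu_notifier_ops example_mmu_notifier_ops = {
              .arch_invalidate_secondary_tlbs = example_arch_invalidate_secondary_tlbs,
      };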
* iommu: Remove ioasid infrastructure (Jason Gunthorpe, 2023-03-31; 1 file changed, -1/+0)
  This has no use anymore, delete it all.
  Reviewed-by: Kevin Tian <[email protected]>
  Reviewed-by: Lu Baolu <[email protected]>
  Signed-off-by: Jason Gunthorpe <[email protected]>
  Signed-off-by: Jacob Pan <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/ioasid: Rename INVALID_IOASID (Jacob Pan, 2023-03-31; 1 file changed, -1/+1)
  INVALID_IOASID and IOMMU_PASID_INVALID are duplicated. Rename INVALID_IOASID and consolidate since we are moving away from IOASID infrastructure.
  Reviewed-by: Dave Jiang <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Reviewed-by: Lu Baolu <[email protected]>
  Signed-off-by: Jacob Pan <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Add IOMMU perfmon overflow handler support (Kan Liang, 2023-02-03; 1 file changed, -1/+1)
  When a counter is enabled to count events and an event occurrence causes the counter value to increment and roll over to or past zero, this is termed a counter overflow. The overflow can trigger an interrupt, and the IOMMU perfmon needs to handle this case properly. New HW IRQs are allocated for each IOMMU device for perfmon; the IRQ IDs are after the SVM range. In the overflow handler, the counter is not frozen. It is very unlikely that the same counter overflows again during the period, but it is possible that other counters overflow at the same time. Read the overflow register at the end of the handler and check whether there are more.
  Signed-off-by: Kan Liang <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Remove sva from intel_svm_dev (Lu Baolu, 2023-02-03; 1 file changed, -14/+9)
  After commit be51b1d6bbff ("iommu/sva: Refactoring iommu_sva_bind/unbind_device()"), the iommu driver doesn't need to return an iommu_sva pointer anymore. Remove the sva field from intel_svm_dev and clean up the code accordingly.
  Reviewed-by: Kevin Tian <[email protected]>
  Signed-off-by: Lu Baolu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Remove users from intel_svm_dev (Lu Baolu, 2023-02-03; 1 file changed, -34/+26)
  It was used as a reference counter of an existing bond between a device and a user application's memory address space. Commit be51b1d6bbff ("iommu/sva: Refactoring iommu_sva_bind/unbind_device()") has added this in the iommu core. Remove it to avoid duplicate code.
  Reviewed-by: Kevin Tian <[email protected]>
  Signed-off-by: Lu Baolu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Remove unused fields in svm structures (Lu Baolu, 2023-02-03; 1 file changed, -2/+0)
  They aren't used anywhere. Remove them to avoid dead code.
  Reviewed-by: Kevin Tian <[email protected]>
  Signed-off-by: Lu Baolu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Remove include/linux/intel-svm.h (Lu Baolu, 2023-02-03; 1 file changed, -1/+0)
  There's no need to have a public header for Intel SVA implementation. The device driver should interact with Intel SVA implementation via the IOMMU generic APIs.
  Reviewed-by: Kevin Tian <[email protected]>
  Signed-off-by: Lu Baolu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd (Linus Torvalds, 2022-12-14; 1 file changed, -88/+57)
  Pull iommufd implementation from Jason Gunthorpe:
  "iommufd is the user API to control the IOMMU subsystem as it relates to managing IO page tables that point at user space memory. It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO container) which is the VFIO specific interface for a similar idea.
  We see a broad need for extended features, some being highly IOMMU device specific:
    - Binding iommu_domain's to PASID/SSID
    - Userspace IO page tables, for ARM, x86 and S390
    - Kernel bypassed invalidation of user page tables
    - Re-use of the KVM page table in the IOMMU
    - Dirty page tracking in the IOMMU
    - Runtime Increase/Decrease of IOPTE size
    - PRI support with faults resolved in userspace
  Many of these HW features exist to support VM use cases - for instance the combination of PASID, PRI and Userspace IO Page Tables allows an implementation of DMA Shared Virtual Addressing (vSVA) within a guest. Dirty tracking enables VM live migration with SRIOV devices and PASID support allow creating "scalable IOV" devices, among other things.
  As these features are fundamental to a VM platform they need to be uniformly exposed to all the driver families that do DMA into VMs, which is currently VFIO and VDPA"
  For more background, see the extended explanations in Jason's pull request: https://lore.kernel.org/lkml/[email protected]/
  * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (62 commits)
    iommufd: Change the order of MSI setup
    iommufd: Improve a few unclear bits of code
    iommufd: Fix comment typos
    vfio: Move vfio group specific code into group.c
    vfio: Refactor dma APIs for emulated devices
    vfio: Wrap vfio group module init/clean code into helpers
    vfio: Refactor vfio_device open and close
    vfio: Make vfio_device_open() truly device specific
    vfio: Swap order of vfio_device_container_register() and open_device()
    vfio: Set device->group in helper function
    vfio: Create wrappers for group register/unregister
    vfio: Move the sanity check of the group to vfio_create_group()
    vfio: Simplify vfio_create_group()
    iommufd: Allow iommufd to supply /dev/vfio/vfio
    vfio: Make vfio_container optionally compiled
    vfio: Move container related MODULE_ALIAS statements into container.c
    vfio-iommufd: Support iommufd for emulated VFIO devices
    vfio-iommufd: Support iommufd for physical VFIO devices
    vfio-iommufd: Allow iommufd to be used in place of a container fd
    vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent()
    ...
* iommu: Rename iommu-sva-lib.{c,h} (Lu Baolu, 2022-11-03; 1 file changed, -1/+1)
  Rename iommu-sva-lib.c[h] to iommu-sva.c[h] as it contains all code for SVA implementation in iommu core.
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Jean-Philippe Brucker <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Tested-by: Zhangfei Gao <[email protected]>
  Tested-by: Tony Zhu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu: Remove SVA related callbacks from iommu ops (Lu Baolu, 2022-11-03; 1 file changed, -49/+0)
  These ops have been deprecated. There's no need for them anymore. Remove them to avoid dead code.
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Jean-Philippe Brucker <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Reviewed-by: Yi Liu <[email protected]>
  Tested-by: Zhangfei Gao <[email protected]>
  Tested-by: Tony Zhu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Add SVA domain support (Lu Baolu, 2022-11-03; 1 file changed, -0/+47)
  Add support for SVA domain allocation and provide an SVA-specific iommu_domain_ops. This implementation is based on the existing SVA code; possible cleanup and refactoring are left for incremental changes later. The VT-d driver will also need to support setting a DMA domain to a PASID of a device. The current SVA implementation uses different data structures to track the domain and device PASID relationship; that's the reason why we need to check the domain type in the remove_dev_pasid callback. Eventually we'll consolidate the data structures and remove the need for the domain type check.
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Reviewed-by: Yi Liu <[email protected]>
  Tested-by: Tony Zhu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu: Remove SVM_FLAG_SUPERVISOR_MODE support (Lu Baolu, 2022-11-03; 1 file changed, -42/+13)
  The current kernel DMA with PASID support is based on SVA with the flag SVM_FLAG_SUPERVISOR_MODE. The IOMMU driver binds the kernel memory address space to a PASID of the device, and the device driver programs the device with kernel virtual addresses (KVA) for DMA access. There have been security and functional issues with this approach:
    - The lack of IOTLB synchronization upon kernel page table updates (vmalloc, module/BPF loading, CONFIG_DEBUG_PAGEALLOC etc.).
    - Other than slightly more protection, using kernel virtual addresses (KVA) has little advantage over physical addresses. There are also no use cases yet where DMA engines need kernel virtual addresses for in-kernel DMA.
  This removes SVM_FLAG_SUPERVISOR_MODE support from the IOMMU interface. The device drivers are suggested to handle kernel DMA with PASID through the kernel DMA APIs. The drvdata parameter in iommu_sva_bind_device() and all callbacks is not needed anymore; clean them up as well.
  Link: https://lore.kernel.org/linux-iommu/[email protected]/
  Signed-off-by: Jacob Pan <[email protected]>
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Jason Gunthorpe <[email protected]>
  Reviewed-by: Jean-Philippe Brucker <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Reviewed-by: Fenghua Yu <[email protected]>
  Tested-by: Zhangfei Gao <[email protected]>
  Tested-by: Tony Zhu <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
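  A before/after sketch of the drvdata cleanup mentioned at the end of the entry above. Treat the exact prototypes as approximations of the declarations rather than verbatim copies.

      /* before this change (approximate): callers passed an opaque drvdata
       * pointer that was threaded through to the driver callbacks:
       *   struct iommu_sva *iommu_sva_bind_device(struct device *dev,
       *                                           struct mm_struct *mm,
       *                                           void *drvdata);
       * after: the drvdata parameter is gone from the API and the callbacks. */
      struct iommu_sva *iommu_sva_bind_device(struct device *dev,
                                              struct mm_struct *mm);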
* iommu/vt-d: Fix PCI device refcount leak in prq_event_thread() (Yang Yingliang, 2022-12-02; 1 file changed, -5/+9)
  As the comment of pci_get_domain_bus_and_slot() says, it returns a pci device with the refcount incremented; when finished using it, the caller must decrease the reference count by calling pci_dev_put(). So call pci_dev_put() after using the 'pdev' to avoid a refcount leak. Besides, if the 'pdev' is null or intel_svm_prq_report() returns an error, there is no need to trace this fault.
  Fixes: 06f4b8d09dba ("iommu/vt-d: Remove unnecessary SVA data accesses in page fault path")
  Suggested-by: Lu Baolu <[email protected]>
  Signed-off-by: Yang Yingliang <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
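  A pattern sketch of the refcount rule from the entry above, not the actual prq_event_thread() code. The reporting helper is hypothetical; pci_get_domain_bus_and_slot() and pci_dev_put() are the real PCI core calls being paired.

      static void example_report_fault(struct pci_dev *pdev) { /* hypothetical */ }

      /* Sketch: the lookup returns the device with its refcount raised, so
       * every successful lookup must be balanced with pci_dev_put(). */
      static void example_report_prq_fault(int seg, unsigned int bus,
                                           unsigned int devfn)
      {
              struct pci_dev *pdev = pci_get_domain_bus_and_slot(seg, bus, devfn);

              if (!pdev)
                      return;                 /* device gone: nothing to report or trace */

              example_report_fault(pdev);     /* hypothetical reporting helper */
              pci_dev_put(pdev);              /* drop the reference taken by the lookup */
      }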
* iommu/vt-d: Add a fix for devices need extra dtlb flush (Jacob Pan, 2022-12-02; 1 file changed, -1/+4)
  QAT devices on Intel Sapphire Rapids and Emerald Rapids have a defect in the address translation service (ATS). These devices may inadvertently issue ATS invalidation completion before posted writes initiated with a translated address that utilized translations matching the invalidation address range, violating the invalidation completion ordering. This patch adds an extra device TLB invalidation for the affected devices; it is needed to ensure that no more posted writes with translated addresses follow the invalidation completion. Therefore, the ordering is preserved and data corruption is prevented. Device TLBs are invalidated under the following six conditions:
    1. Device driver does DMA API unmap IOVA
    2. Device driver unbinds a PASID from a process, sva_unbind_device()
    3. PASID is torn down, after the PASID cache is flushed, e.g. process exit_mmap() due to crash
    4. Under SVA usage, called by mmu_notifier.invalidate_range() where the VM has to free pages that were unmapped
    5. Userspace driver unmaps a DMA buffer
    6. Cache invalidation in vSVA usage (upcoming)
  For #1 and #2, device drivers are responsible for stopping DMA traffic before unmap/unbind. For #3, the iommu driver gets an mmu_notifier to invalidate the TLB the same way as a normal user unmap, which will do an extra invalidation; the dTLB invalidation after the PASID cache flush does not need an extra invalidation. Therefore, we only need to deal with #4 and #5 in this patch. #1 is also covered by this patch due to the common code path with #5.
  Tested-by: Yuzhang Luo <[email protected]>
  Reviewed-by: Ashok Raj <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Signed-off-by: Jacob Pan <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Rename cap_5lp_support to cap_fl5lp_support (Yi Liu, 2022-09-26; 1 file changed, -1/+1)
  This renaming better describes that the capability is for the first-level page table (a.k.a. first-stage page table since VT-d spec 3.4).
  Signed-off-by: Yi Liu <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Lu Baolu <[email protected]>
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Remove unnecessary SVA data accesses in page fault path (Lu Baolu, 2022-09-26; 1 file changed, -53/+7)
  The existing I/O page fault handling code accesses the per-PASID SVA data structures. This is unnecessary and makes the fault handling code only suitable for SVA scenarios. This removes the SVA data accesses from the I/O page fault reporting and responding code, so that the fault handling code could be generic.
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>
* iommu/vt-d: Refactor iommu information of each domain (Lu Baolu, 2022-07-15; 1 file changed, -1/+1)
  When a DMA domain is attached to a device, it needs to allocate a domain ID from its IOMMU. Currently, the domain ID information is stored in two static arrays embedded in the domain structure. This can lead to memory waste when the driver is running on a small platform. This optimizes these static arrays by replacing them with an xarray and consuming memory on demand.
  Signed-off-by: Lu Baolu <[email protected]>
  Reviewed-by: Kevin Tian <[email protected]>
  Reviewed-by: Steve Wahl <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Joerg Roedel <[email protected]>