path: root/tools/testing/radix-tree
* tools/testing/radix-tree: test maple tree chaining mas_preallocate() calls (Liam R. Howlett, 2025-07-10; 1 file, -0/+12)

  Test calling mas_preallocate() multiple times in a row after adjusting
  the maple state. This ensures that new calls to mas_preallocate() will
  change the number of allocated nodes.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Liam R. Howlett <[email protected]>
  Acked-by: Lorenzo Stoakes <[email protected]>
  Cc: Hailong Liu <[email protected]>
  Cc: "Liam R. Howlett" <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Cc: Peng Zhang <[email protected]>
  Cc: Sidhartha Kumar <[email protected]>
  Cc: Suren Baghdasaryan <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
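  A minimal sketch of the pattern this test exercises, assuming the
  three-argument mas_preallocate(); the tree, ranges and entry value are
  illustrative:

    MA_STATE(mas, &tree, 10, 20);

    mas_preallocate(&mas, xa_mk_value(1), GFP_KERNEL); /* allocates for 10-20 */
    mas_set_range(&mas, 0, 100);                       /* adjust the maple state */
    mas_preallocate(&mas, xa_mk_value(1), GFP_KERNEL); /* must recompute node count */
    mas_destroy(&mas);                                 /* drop unused preallocations */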
* testing/radix-tree/maple: increase readers and reduce delay for faster machines (Liam R. Howlett, 2025-07-10; 1 file, -3/+4)

  Faster machines may not see the initial or updated value in the race
  condition. Reduce the delay so that faster machines are less likely to
  fail testing of the race conditions.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Liam R. Howlett <[email protected]>
  Reviewed-by: Lorenzo Stoakes <[email protected]>
  Cc: Hailong Liu <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Cc: Peng Zhang <[email protected]>
  Cc: Sidhartha Kumar <[email protected]>
  Cc: Suren Baghdasaryan <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* maple_tree: add sufficient height (Sidhartha Kumar, 2025-05-12; 1 file, -0/+28)

  In order to support rebalancing and spanning stores using less than the
  worst-case number of nodes, we need to track more than just the vacant
  height. Using only the vacant height to reduce the worst-case maple
  node allocation count can lead to a shortage of nodes in the following
  scenarios.

  For rebalancing writes, when a leaf node becomes insufficient, it may
  be combined with a sibling into a single node. This means that the
  parent node, which has entries for these children, will lose one entry.
  If this parent node was just meeting the minimum number of entries,
  losing one entry will now make it insufficient. This leads to a cascade
  of rebalancing at different levels and can lead to more node
  allocations than the vacant height alone would predict.

  For spanning writes, a similar situation occurs. At the location at
  which a spanning write is detected, a number of ancestor nodes may
  similarly need to be rebalanced into a smaller number of nodes, and the
  same cascading situation can occur.

  To use less than the full height of the tree for the number of
  allocations, we also need to track the height at which a non-leaf node
  cannot become insufficient. This means that even if a rebalance occurs
  in a child of this node, the node currently has enough entries that it
  can lose one without any further action. This field is stored in the
  maple write state as the sufficient height.

  In mas_prealloc_calc(), when figuring out how many nodes to allocate,
  we check whether the vacant node is lower in the tree than a sufficient
  node (i.e. has a larger height value). If it is, we cannot use the
  vacant height and must instead use the difference between the tree
  height and the sufficient height as the basis for the number of nodes
  needed; a sketch of this calculation follows below.

  An off-by-one bug was also discovered in mast_overflow(), which was
  using >= rather than >. This caused extra iterations of the
  mas_spanning_rebalance() loop and led to unneeded allocations. A test
  is also added to check that the number of allocations is correct.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Sidhartha Kumar <[email protected]>
  Reviewed-by: Liam R. Howlett <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Cc: Wei Yang <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
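  A hedged sketch of the calculation described above; the helper and
  parameter names are illustrative, not the exact kernel identifiers:

    /* Per the convention above, a node lower in the tree has a larger
     * height value, so a vacant node below the sufficient node cannot
     * be relied on by itself. */
    static int prealloc_levels(int height, int vacant_height,
                               int sufficient_height)
    {
            if (sufficient_height && vacant_height > sufficient_height)
                    return height - sufficient_height;

            return height - vacant_height;
    }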
* maple_tree: use vacant nodes to reduce worst case allocations (Sidhartha Kumar, 2025-05-12; 1 file, -8/+71)

  In order to determine the store type for a maple tree operation, a walk
  of the tree is done through mas_wr_walk(). This function descends the
  tree until a spanning write is detected or we reach a leaf node. While
  descending, keep track of the height at which we encounter a node with
  available space. This is done by checking if mas->end is less than the
  number of slots a given node type can fit.

  Now that the height of the vacant node is tracked, we can use the
  difference between the height of the tree and the height of the vacant
  node to know how many levels we will have to propagate creating new
  nodes. Update mas_prealloc_calc() to consider the vacant height and
  reduce the number of worst-case allocations.

  Rebalancing and spanning stores are not supported and fall back to
  using the full height of the tree for allocations.

  Update preallocation testing assertions to take into account vacant
  height.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Sidhartha Kumar <[email protected]>
  Reviewed-by: Liam R. Howlett <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Cc: Wei Yang <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
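  A hedged sketch of the descent-time bookkeeping; mas->end and mt_slots
  are the fields the description refers to, while the local names are
  assumptions:

    /* While descending in mas_wr_walk(): remember the last height at
     * which the node still had room for another entry. */
    if (mas->end < mt_slots[wr_mas->type])
            vacant_height = height;  /* this node can absorb one more entry */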
* maple_tree: use height and depth consistently (Sidhartha Kumar, 2025-05-12; 1 file, -0/+19)

  For the maple tree, the root node is defined to have a depth of 0 and a
  height of 1. Each level down from the root, these values are
  incremented by 1. Various code paths define a root with depth 1, which
  is inconsistent with this definition. Modify the code to be consistent
  with the definition.

  In mas_spanning_rebalance(), l_mas.depth was being used to track the
  height based on the number of iterations done in the main loop. This
  information was then used in mas_put_in_tree() to set the height.
  Rather than overload the l_mas.depth field to track height, simply keep
  track of the height in the local variable new_height and directly pass
  this to mas_wmb_replace(), which will pass it into mas_put_in_tree().
  This allows us to remove the writes to l_mas.depth.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Sidhartha Kumar <[email protected]>
  Reviewed-by: Liam R. Howlett <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Cc: Wei Yang <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* xarray: add xas_try_split() to split a multi-index entry (Zi Yan, 2025-03-18; 1 file, -0/+1)

  Patch series "Buddy allocator like (or non-uniform) folio split", v10.

  This patchset adds a new buddy allocator like (or non-uniform) large
  folio split from an order-n folio to order-m with m < n. It reduces:

  1. the total number of after-split folios, from 2^(n-m) to n-m+1;
  2. the amount of memory needed for a multi-index xarray split, from
     2^(n/6-m/6) to n/6-m/6, assuming XA_CHUNK_SHIFT=6;
  3. it also keeps more large folios after a split: instead of all
     order-m folios, the result ranges from order-(n-1) down to order-m.

  For example, to split an order-9 folio to order-0, folio split
  generates 10 (or 11 for anonymous memory) folios instead of 512,
  allocates 1 xa_node instead of 8, and leaves 1 order-8, 1 order-7, ...,
  1 order-1 and 2 order-0 folios (or 4 order-0 for anonymous memory)
  instead of 512 order-0 folios.

  Instead of duplicating the existing split_huge_page*() code,
  __folio_split() is introduced as the shared backend code for both
  split_huge_page_to_list_to_order() and folio_split(). __folio_split()
  can support both uniform split and buddy allocator like (or
  non-uniform) split. All existing split_huge_page*() users can be
  gradually converted to use folio_split() if possible. In this patchset,
  I converted truncate_inode_partial_folio() to use folio_split().

  The xfstests quick group passed for both tmpfs and xfs. I also
  semi-replicated Hugh's test[12] and ran it without any issue for almost
  24 hours.

  This patch (of 8):

  A preparation patch for non-uniform folio split, which always splits a
  folio in half iteratively, and for minimal xarray entry split.

  Currently, xas_split_alloc() and xas_split() always split all slots
  from a multi-index entry. They cost the same number of xa_node as the
  to-be-split slots. For example, to split an order-9 entry, which takes
  2^(9-6)=8 slots, assuming XA_CHUNK_SHIFT is 6 (!CONFIG_BASE_SMALL),
  8 xa_node are needed. Instead, xas_try_split() is intended to be used
  iteratively to split the order-9 entry into 2 order-8 entries, then
  split one order-8 entry, based on the given index, into 2 order-7
  entries, ..., and split one order-1 entry into 2 order-0 entries. When
  splitting the order-6 entry and a new xa_node is needed, xas_try_split()
  will try to allocate one if possible. As a result, xas_try_split() would
  only need 1 xa_node instead of 8.

  When a new xa_node is needed during the split, xas_try_split() can try
  to allocate one, but no more. -ENOMEM will be returned if a node cannot
  be allocated. -EINVAL will be returned if a sibling node is split or a
  cascade split happens, where two or more new nodes are needed; these
  are not supported by xas_try_split().

  xas_split_alloc() and xas_split() split an order-9 entry to order-0:

          ---------------------------------
          |   |   |   |   |   |   |   |   |
          | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
          |   |   |   |   |   |   |   |   |
          ---------------------------------
            |   |        ...      |   |
            V   V                 V   V
    ----------- -----------   ----------- -----------
    | xa_node | | xa_node | ... | xa_node | | xa_node |
    ----------- -----------   ----------- -----------

  xas_try_split() splits an order-9 entry to order-0:

          ---------------------------------
          |   |   |   |   |   |   |   |   |
          | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
          |   |   |   |   |   |   |   |   |
          ---------------------------------
            |
            |
            V
        -----------
        | xa_node |
        -----------

  Link: https://lkml.kernel.org/r/[email protected]
  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Zi Yan <[email protected]>
  Cc: Baolin Wang <[email protected]>
  Cc: David Hildenbrand <[email protected]>
  Cc: Hugh Dickins <[email protected]>
  Cc: John Hubbard <[email protected]>
  Cc: Kefeng Wang <[email protected]>
  Cc: Kirill A. Shutemov <[email protected]>
  Cc: Miaohe Lin <[email protected]>
  Cc: Matthew Wilcox <[email protected]>
  Cc: Ryan Roberts <[email protected]>
  Cc: Yang Shi <[email protected]>
  Cc: Yu Zhao <[email protected]>
  Cc: Zi Yan <[email protected]>
  Cc: Kairui Song <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
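  A hedged usage sketch of one iteration of the pattern described above;
  the locking context and the variable names (mapping, index, old_order,
  new_order, entry) are assumptions:

    XA_STATE_ORDER(xas, &mapping->i_pages, index, new_order);
    int ret = 0;

    xas_lock(&xas);
    xas_try_split(&xas, entry, old_order);  /* split toward xas' order */
    if (xas_error(&xas))
            ret = xas_error(&xas);  /* -ENOMEM or -EINVAL, as above */
    xas_unlock(&xas);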
* Xarray: do not return sibling entries from xas_find_marked() (Kemeng Shi, 2025-01-25; 1 file, -0/+4)

  Patch series "Fixes and cleanups to xarray", v5.

  This series contains some random fixes and cleanups to xarray. Patches
  1-2 are fixes and patches 3-6 are cleanups. More details can be found
  in the respective patches.

  This patch (of 5):

  Similar to the issue fixed in commit cbc02854331ed ("XArray: Do not
  return sibling entries from xa_load()"), we may return sibling entries
  from xas_find_marked() as follows:

    Thread A:                     Thread B:
    xa_store_range(xa, entry, 6, 7, gfp);
    xa_set_mark(xa, 6, mark)
                                  XA_STATE(xas, xa, 6);
                                  xas_find_marked(&xas, 7, mark);
                                  offset = xas_find_chunk(xas, advance, mark);
                                  [offset is 6, which points to a valid entry]
    xa_store_range(xa, entry, 4, 7, gfp);
                                  entry = xa_entry(xa, node, 6);
                                  [entry is a sibling of 4]
                                  if (!xa_is_node(entry))
                                          return entry;

  Skip sibling entries like xas_find() does to protect callers from
  seeing sibling entries from xas_find_marked(); otherwise a caller may
  use a sibling entry as a valid entry and crash the kernel. (A sketch of
  the fix follows below.)

  Besides, the load_race() test is modified to catch the mentioned issue;
  the modified load_race() only passes after this fix is merged.

  Here is an example of how this bug could be triggered in tmpfs with
  large folios enabled in the mapping. Let's take a look at the racers
  involved:

  1. How pages can be created and dirtied in a shmem file:

     write
       ksys_write
         vfs_write
           new_sync_write
             shmem_file_write_iter
               generic_perform_write
                 shmem_write_begin
                   shmem_get_folio
                     shmem_allowable_huge_orders
                     shmem_alloc_and_add_folios
                       shmem_alloc_folio
                       __folio_set_locked
                       shmem_add_to_page_cache
                         XA_STATE_ORDER(..., index, order)
                         xas_store()
                 shmem_write_end
                   folio_mark_dirty()

  2. How dirty pages can be deleted in a shmem file:

     ioctl
       do_vfs_ioctl
         file_ioctl
           ioctl_preallocate
             vfs_fallocate
               shmem_fallocate
                 shmem_truncate_range
                   shmem_undo_range
                     truncate_inode_folio
                       filemap_remove_folio
                         page_cache_delete
                           xas_store(&xas, NULL);

  3. How dirty pages can be locklessly searched:

     sync_file_range
       ksys_sync_file_range
         __filemap_fdatawrite_range
           filemap_fdatawrite_wbc
             do_writepages
               writeback_use_writepage
                 writeback_iter
                   writeback_get_folio
                     filemap_get_folios_tag
                       find_get_entry
                         folio = xas_find_marked()
                         folio_try_get(folio)

  The kernel will crash as follows:

    1. Create                 2. Search                 3. Delete

    /* write pages 2,3 */
    write
      ...
      shmem_write_begin
        XA_STATE_ORDER(xas, i_pages, index = 2, order = 1)
        xa_store(&xas, folio)
      shmem_write_end
        folio_mark_dirty()

                              /* sync pages 2 and 3 */
                              sync_file_range
                                ...
                                find_get_entry
                                  folio = xas_find_marked()
                                  /* offset will be 2 */
                                  offset = xas_find_chunk()

                                                        /* delete pages 2,3 */
                                                        ioctl
                                                          ...
                                                          xas_store(&xas, NULL);

    /* write pages 0-3 */
    write
      ...
      shmem_write_begin
        XA_STATE_ORDER(xas, i_pages, index = 0, order = 2)
        xa_store(&xas, folio)
      shmem_write_end
        folio_mark_dirty(folio)

                              /* get sibling entry from offset 2 */
                              entry = xa_entry(.., 2)
                              /* use sibling entry as folio,
                                 crashing the kernel */
                              folio_try_get(folio)

  Link: https://lkml.kernel.org/r/[email protected]
  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Kemeng Shi <[email protected]>
  Cc: Matthew Wilcox <[email protected]> [English fixes]
  Signed-off-by: Andrew Morton <[email protected]>
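  A hedged sketch of the shape of the fix, with the surrounding scan loop
  abbreviated:

    entry = xa_entry(xas->xa, node, offset);
    if (xa_is_sibling(entry))
            continue;  /* skip like xas_find(); never return a sibling */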
* maple_tree: add some alloc node test case (Jiazi Li, 2024-11-07; 1 file, -0/+22)

  Add some maple_tree alloc node test cases.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Jiazi Li <[email protected]>
  Signed-off-by: Liam R. Howlett <[email protected]>
  Suggested-by: Liam R. Howlett <[email protected]>
  Cc: Wei Yang <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* maple_tree: add regression test for spanning store bug (Lorenzo Stoakes, 2024-10-17; 1 file, -0/+84)

  Add a regression test to assert that performing a spanning store which
  consumes the entirety of the rightmost leaf node does not result in
  maple tree corruption. The test does this by building a tree of 3
  levels and establishing a store which ultimately results in a spanning
  store of this nature.

  Link: https://lkml.kernel.org/r/30cdc101a700d16e03ba2f9aa5d83f2efa894168.1728314403.git.lorenzo.stoakes@oracle.com
  Signed-off-by: Lorenzo Stoakes <[email protected]>
  Acked-by: Vlastimil Babka <[email protected]>
  Reviewed-by: Liam R. Howlett <[email protected]>
  Reviewed-by: Wei Yang <[email protected]>
  Cc: Bert Karwatzki <[email protected]>
  Cc: Matthew Wilcox <[email protected]>
  Cc: Mikhail Gavrilov <[email protected]>
  Cc: Sidhartha Kumar <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* maple_tree: check for MA_STATE_BULK on setting wr_rebalance (Sidhartha Kumar, 2024-10-17; 1 file, -0/+26)

  It is possible for a bulk operation (MA_STATE_BULK is set) to enter the
  new_end < mt_min_slots[type] case and set wr_rebalance as a store type.
  This is incorrect, as bulk stores do not rebalance per write, but
  rather after all of the writes are done, through the
  mas_bulk_rebalance() path. Therefore, add a check to make sure
  MA_STATE_BULK is not set before we return wr_rebalance as the store
  type.

  Also add a test to make sure wr_rebalance is never the store type when
  doing bulk operations via mas_expected_entries().

  This is a hotfix for this rc; however, it has no userspace effects, as
  there are no users of the bulk insertion mode.

  Link: https://lkml.kernel.org/r/[email protected]
  Fixes: 5d659bbb52a2 ("maple_tree: introduce mas_wr_store_type()")
  Suggested-by: Liam Howlett <[email protected]>
  Signed-off-by: Sidhartha <[email protected]>
  Reviewed-by: Wei Yang <[email protected]>
  Reviewed-by: Liam Howlett <[email protected]>
  Cc: Matthew Wilcox <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
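  A hedged sketch of the added guard; the control flow is illustrative of
  the store-type decision described above:

    if (new_end < mt_min_slots[wr_mas->type]) {
            /* Bulk stores rebalance once at the end via
             * mas_bulk_rebalance(), not per write. */
            if (!(mas->mas_flags & MA_STATE_BULK))
                    return wr_rebalance;
    }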
* Merge tag 'memblock-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock (Linus Torvalds, 2024-09-25; 1 file, -1/+1)

  Pull memblock updates from Mike Rapoport:

  - new memblock_estimated_nr_free_pages() helper to replace
    totalram_pages(), which is less accurate when
    CONFIG_DEFERRED_STRUCT_PAGE_INIT is set

  - fixes for memblock tests

  * tag 'memblock-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
    s390/mm: get estimated free pages by memblock api
    kernel/fork.c: get estimated free pages by memblock api
    mm/memblock: introduce a new helper memblock_estimated_nr_free_pages()
    memblock test: fix implicit declaration of function 'strscpy'
    memblock test: fix implicit declaration of function 'isspace'
    memblock test: fix implicit declaration of function 'memparse'
    memblock test: add the definition of __setup()
    memblock test: fix implicit declaration of function 'virt_to_phys'
    tools/testing: abstract two init.h into common include directory
    memblock tests: include export.h in linkage.h as kernel does
    memblock tests: include memory_hotplug.h in mmzone.h as kernel does
| * tools/testing: abstract two init.h into common include directory (Wei Yang, 2024-08-06; 2 files, -3/+1)

  Currently two test suites each define their own init.h, which is a
  little redundant. Create an init.h in the common include directory and
  merge the two into it.

  Signed-off-by: Wei Yang <[email protected]>
  CC: Mike Rapoport <[email protected]>
  CC: Liam R. Howlett <[email protected]>
  Link: https://lore.kernel.org/r/[email protected]
  Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
* | maple_tree: remove mas_destroy() from mas_nomem() (Sidhartha Kumar, 2024-09-02; 1 file, -4/+6)

  Separate the call to mas_destroy() from mas_nomem() so we can check for
  no-memory errors without destroying the current maple state in
  mas_store_gfp(). We then add calls to mas_destroy() to callers of
  mas_nomem().

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Sidhartha Kumar <[email protected]>
  Reviewed-by: Liam R. Howlett <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Cc: Suren Baghdasaryan <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
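  A hedged sketch of the resulting caller pattern; mas_wr_store_entry()
  stands in for whatever store step the caller performs:

    retry:
            mas_wr_store_entry(&wr_mas);
            if (unlikely(mas_nomem(mas, gfp)))
                    goto retry;       /* state preserved; retry the store */

            mas_destroy(mas);         /* now explicitly the caller's job */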
* | maple_tree: introduce mas_wr_store_type() (Sidhartha Kumar, 2024-09-02; 1 file, -0/+36)

  Introduce mas_wr_store_type(), which will set the correct store type
  based on a walk of the tree. In mas_wr_node_store() the <= min_slots
  condition is changed to <, since if new_end is equal to mt_min_slots
  then there is not enough room.

  mas_prealloc_calc() is also introduced to abstract the calculation used
  to determine the number of nodes needed for a store operation.

  In this change, a call to mas_reset() is removed from the error case of
  mas_prealloc(). This is only needed in the MA_STATE_REBALANCE case of
  mas_destroy(), so we can move the call to mas_reset() directly into
  mas_destroy().

  Also, add a test case to validate that the order in which we check the
  store types is correct. This test models a vma expanding and then
  shrinking, which is part of the boot process.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Sidhartha Kumar <[email protected]>
  Cc: Liam R. Howlett <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Cc: Suren Baghdasaryan <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* | maple_tree: add test to replicate low memory race conditions (Sidhartha Kumar, 2024-09-02; 1 file, -0/+63)

  Add new callback fields to the userspace implementation of struct
  kmem_cache. This allows for executing callback functions in order to
  further test low memory scenarios where node allocation is retried.

  This callback can help test race conditions by calling a function when
  a low memory event is tested.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Sidhartha Kumar <[email protected]>
  Reviewed-by: Liam R. Howlett <[email protected]>
  Cc: Matthew Wilcox <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
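  A hedged sketch of the kind of fields described; the names are
  illustrative, not the exact ones in the test shim:

    struct kmem_cache {
            /* ... existing userspace shim fields ... */
            void (*callback)(void *);  /* run on a simulated low-memory event */
            void *private;             /* context handed to the callback */
    };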
* | tools: separate out shared radix-tree components (Lorenzo Stoakes, 2024-09-02; 21 files, -481/+14)

  The core components contained within the radix-tree tests which provide
  shims for kernel headers and access to the maple tree are useful for
  testing other things, so separate them out and make the radix-tree
  tests dependent on the shared components.

  This lays the groundwork for adding VMA tests of the newly introduced
  vma.c file.

  Link: https://lkml.kernel.org/r/1ee720c265808168e0d75608e687607d77c36719.1722251717.git.lorenzo.stoakes@oracle.com
  Signed-off-by: Lorenzo Stoakes <[email protected]>
  Acked-by: Vlastimil Babka <[email protected]>
  Reviewed-by: Liam R. Howlett <[email protected]>
  Cc: Alexander Viro <[email protected]>
  Cc: Brendan Higgins <[email protected]>
  Cc: Christian Brauner <[email protected]>
  Cc: David Gow <[email protected]>
  Cc: Eric W. Biederman <[email protected]>
  Cc: Jan Kara <[email protected]>
  Cc: Kees Cook <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Cc: Rae Moar <[email protected]>
  Cc: SeongJae Park <[email protected]>
  Cc: Shuah Khan <[email protected]>
  Cc: Suren Baghdasaryan <[email protected]>
  Cc: Pengfei Xu <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* Merge tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux (Linus Torvalds, 2024-07-26; 2 files, -25/+2)

  Pull bitmap updates from Yury Norov:
  "Random fixes"

  * tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux:
    riscv: Remove unnecessary int cast in variable_fls()
    radix tree test suite: put definition of bitmap_clear() into lib/bitmap.c
    bitops: Add a comment explaining the double underscore macros
    lib: bitmap: add missing MODULE_DESCRIPTION() macros
    cpumask: introduce assign_cpu() macro
| * radix tree test suite: put definition of bitmap_clear() into lib/bitmap.c (Wei Yang, 2024-07-10; 2 files, -25/+2)

  In the tools/ directory, the function bitmap_clear() is currently only
  used in the object file tools/testing/radix-tree/xarray.o. Instead of
  keeping a bitmap.c containing only the bitmap_clear() definition in the
  radix-tree's own directory, it is more proper to put it in the common
  lib/ directory.

  Sync the kernel definition and link the related libs; no functional
  change is expected.

  Signed-off-by: Wei Yang <[email protected]>
  CC: Matthew Wilcox <[email protected]>
  CC: Yury Norov <[email protected]>
  Signed-off-by: Yury Norov <[email protected]>
* | tools/testing/radix-tree/idr-test: add missing MODULE_DESCRIPTION define (Sidhartha Kumar, 2024-06-29; 1 file, -0/+1)

  Userspace builds of the radix-tree testing suite fail because of the
  patch "KUnit: add missing MODULE_DESCRIPTION() macros for
  lib/test_*.ko". Add the proper defines to
  tools/testing/radix-tree/idr-test.c so MODULE_DESCRIPTION has a
  definition. This allows the build to succeed.

  Link: https://lkml.kernel.org/r/[email protected]
  Fixes: f069e33dafe1 ("KUnit: add missing MODULE_DESCRIPTION() macros for lib/test_*.ko")
  Signed-off-by: Sidhartha Kumar <[email protected]>
  Reviewed-by: Liam R. Howlett <[email protected]>
  Cc: Jeff Johnson <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* | tools/testing/radix-tree: add missing MODULE_DESCRIPTION definition (Sidhartha Kumar, 2024-06-25; 2 files, -0/+2)

  Userspace builds of the radix-tree testing suite fail because of the
  commit "test_maple_tree: add the missing MODULE_DESCRIPTION() macro".
  Add the proper defines to tools/testing/radix-tree/maple.c and
  tools/testing/radix-tree/xarray.c so MODULE_DESCRIPTION has a
  definition. This allows the build to succeed.

  Link: https://lkml.kernel.org/r/[email protected]
  Fixes: 9f8090e8c4d1 ("test_maple_tree: add the missing MODULE_DESCRIPTION() macro")
  Signed-off-by: Sidhartha Kumar <[email protected]>
  Reviewed-by: Liam R. Howlett <[email protected]>
  Cc: Jeff Johnson <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* tools: fix userspace compilation with new test_xarray changes (Luis Chamberlain, 2024-05-06; 1 file, -0/+2)

  Patch series "test_xarray: couple of fixes for v6-9-rc6", v2.

  Here are a couple of fixes which should be merged into the queue for
  v6.9-rc6. The first one was reported by Liam; after fixing that I
  noticed an issue with a test, and a fix for that is in the second
  patch.

  This patch (of 2):

  Liam reported that compiling test_xarray for userspace was broken. I
  was not even aware that was possible, but you can run these tests in
  userspace with:

    make -C tools/testing/radix-tree
    ./tools/testing/radix-tree/xarray

  Add the two helpers we need to fix compilation. We don't need a
  userspace schedule(), so just make it do nothing.

  Link: https://lkml.kernel.org/r/[email protected]
  Link: https://lkml.kernel.org/r/[email protected]
  Fixes: a60cc288a1a2 ("test_xarray: add tests for advanced multi-index use")
  Signed-off-by: Luis Chamberlain <[email protected]>
  Reported-by: "Liam R. Howlett" <[email protected]>
  Cc: Daniel Gomez <[email protected]>
  Cc: Darrick J. Wong <[email protected]>
  Cc: Dave Chinner <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Cc: Pankaj Raghav <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
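  A hedged sketch of the kind of shim this adds for the userspace build;
  the commit only says schedule() should do nothing:

    /* The test harness does not need to yield; a no-op is enough. */
    #define schedule()  do { } while (0)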
* maple_tree: fix warning comparing pointer to 0 (Jiapeng Chong, 2023-12-20; 1 file, -3/+3)

  Avoid comparing a pointer-type value with 0 to make the code clearer.

    ./tools/testing/radix-tree/maple.c:34142:15-16: WARNING comparing pointer to 0.

  Link: https://lkml.kernel.org/r/[email protected]
  Reported-by: Abaci Robot <[email protected]>
  Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7696
  Signed-off-by: Jiapeng Chong <[email protected]>
  Cc: Liam R. Howlett <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* sync mm-stable with mm-hotfixes-stable to pick up depended-upon changes (Andrew Morton, 2023-12-20; 1 file, -1/+1)
| * maple_tree: do not preallocate nodes for slot stores (Sidhartha Kumar, 2023-12-20; 1 file, -1/+1)

  mas_preallocate() defaults to requesting 1 node for preallocation and
  then, depending on the type of store, will update the request variable.
  There isn't a check for a slot store type, so slot stores were
  preallocating the default 1 node. Slot stores do not require any
  additional nodes, so add a check for the slot store case that will
  bypass node_count_gfp(). Update the tests to reflect that slot stores
  do not require allocations.

  User-visible effects of this bug include increased memory usage from
  the unneeded node that was allocated.

  Link: https://lkml.kernel.org/r/[email protected]
  Fixes: 0b8bb544b1a7 ("maple_tree: update mas_preallocate() testing")
  Signed-off-by: Sidhartha Kumar <[email protected]>
  Cc: Liam R. Howlett <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Cc: Peng Zhang <[email protected]>
  Cc: <[email protected]> [6.6+]
  Signed-off-by: Andrew Morton <[email protected]>
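  A hedged sketch of the shape of the fix; the predicate name is
  hypothetical and stands in for whatever condition identifies a slot
  store:

    /* A slot store rewrites slots within the existing node, so skip the
     * default request of one node. */
    if (store_is_slot_store)            /* hypothetical predicate */
            return 0;                   /* nothing to preallocate */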
* | maple_tree: remove mas_searchable() (Liam R. Howlett, 2023-12-12; 1 file, -1/+3)

  Now that the status of the maple state is outside of the node, the
  mas_searchable() function can be dropped for easier open-coding of what
  is going on.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Liam R. Howlett <[email protected]>
  Cc: Peng Zhang <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* | maple_tree: separate ma_state node from status (Liam R. Howlett, 2023-12-12; 1 file, -11/+15)

  The maple tree node is overloaded to keep status as well as the active
  node. This, unfortunately, results in a re-walk on underflow or
  overflow. Since the maple state has room, the status can be placed in
  its own enum in the structure. Once an underflow/overflow is detected,
  certain modes can restore the status to active and others may need to
  re-walk just that one node to see the entry.

  The status being an enum has the benefit of detecting unhandled status
  in switch statements.

  [[email protected]: fix comments about MAS_*]
  Link: https://lkml.kernel.org/r/[email protected]
  [[email protected]: update forking to separate maple state and node]
  Link: https://lkml.kernel.org/r/[email protected]
  [[email protected]: fix mas_prev() state separation code]
  Link: https://lkml.kernel.org/r/[email protected]
  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Liam R. Howlett <[email protected]>
  Cc: Peng Zhang <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
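  A hedged sketch of the separation; the enum values follow the states
  the maple tree uses, but treat the exact layout as illustrative:

    enum maple_status {
            ma_active,
            ma_start,
            ma_root,
            ma_none,
            ma_pause,
            ma_overflow,
            ma_underflow,
            ma_error,
    };

    struct ma_state {
            struct maple_tree *tree;
            struct maple_enode *node;  /* no longer doubles as status */
            enum maple_status status;  /* status now lives in its own field */
            /* ... */
    };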
* | maple_tree: add end of node tracking to the maple state (Liam R. Howlett, 2023-12-12; 1 file, -0/+1)

  Analysis of the mas_for_each() iteration showed that there is a
  significant time spent finding the end of a node. This time can be
  greatly reduced if the end of the node is cached in the maple state.
  Care must be taken to update & invalidate as necessary.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Liam R. Howlett <[email protected]>
  Cc: Peng Zhang <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
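  A hedged sketch of the cached field and its upkeep; mas_data_end() is
  the existing helper that computes the value being cached:

    struct ma_state {
            /* ... */
            unsigned char end;  /* cached offset of the node's last entry */
    };

    /* Set when walking to a node; must be updated or invalidated on any
     * modification of the node. */
    mas->end = mas_data_end(mas);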
* | maple_tree: move debug check to __mas_set_range() (Liam R. Howlett, 2023-12-12; 1 file, -1/+1)

  __mas_set_range() was created to shortcut resetting the maple state,
  and a debug check was added to the caller (the vma iterator) to ensure
  the internal maple state remains safe to use. Move the debug check from
  the vma iterator into the maple tree itself so other users do not
  incorrectly use the advanced maple state modification.

  Fallout from this change includes a large amount of debug setup that
  needed to be moved earlier in the header, and the maple_tree.h
  radix-tree test code needed to move its inclusion of the header to
  after the atomic define. None of these changes are functional.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Liam R. Howlett <[email protected]>
  Cc: Peng Zhang <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* | maple_tree: skip other tests when BENCH is enabled (Peng Zhang, 2023-12-11; 1 file, -0/+2)

  Skip other tests when BENCH is enabled so that performance can be
  measured in user space.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Peng Zhang <[email protected]>
  Reviewed-by: Liam R. Howlett <[email protected]>
  Cc: Christian Brauner <[email protected]>
  Cc: Jonathan Corbet <[email protected]>
  Cc: Mateusz Guzik <[email protected]>
  Cc: Mathieu Desnoyers <[email protected]>
  Cc: Matthew Wilcox <[email protected]>
  Cc: Michael S. Tsirkin <[email protected]>
  Cc: Mike Christie <[email protected]>
  Cc: Nicholas Piggin <[email protected]>
  Cc: Peter Zijlstra <[email protected]>
  Cc: Suren Baghdasaryan <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* | maple_tree: add test for mtree_dup() (Peng Zhang, 2023-12-11; 1 file, -0/+361)

  Add a test for mtree_dup(). Test by duplicating different maple trees
  and then comparing the two trees. Includes tests for duplicating full
  trees and for memory allocation failures on different nodes.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Peng Zhang <[email protected]>
  Reviewed-by: Liam R. Howlett <[email protected]>
  Cc: Christian Brauner <[email protected]>
  Cc: Jonathan Corbet <[email protected]>
  Cc: Mateusz Guzik <[email protected]>
  Cc: Mathieu Desnoyers <[email protected]>
  Cc: Matthew Wilcox <[email protected]>
  Cc: Michael S. Tsirkin <[email protected]>
  Cc: Mike Christie <[email protected]>
  Cc: Nicholas Piggin <[email protected]>
  Cc: Peter Zijlstra <[email protected]>
  Cc: Suren Baghdasaryan <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* | radix tree test suite: align kmem_cache_alloc_bulk() with kernel behavior. (Peng Zhang, 2023-12-11; 1 file, -12/+33)

  When kmem_cache_alloc_bulk() fails to allocate, leave the freed
  pointers in the array. This enables a more accurate simulation of the
  kernel's behavior and allows for testing potential double-free
  scenarios.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Peng Zhang <[email protected]>
  Reviewed-by: Liam R. Howlett <[email protected]>
  Cc: Christian Brauner <[email protected]>
  Cc: Jonathan Corbet <[email protected]>
  Cc: Mateusz Guzik <[email protected]>
  Cc: Mathieu Desnoyers <[email protected]>
  Cc: Matthew Wilcox <[email protected]>
  Cc: Michael S. Tsirkin <[email protected]>
  Cc: Mike Christie <[email protected]>
  Cc: Nicholas Piggin <[email protected]>
  Cc: Peter Zijlstra <[email protected]>
  Cc: Suren Baghdasaryan <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
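  A hedged sketch of the adjusted failure path; the names mirror the
  kernel slab API, but the body is illustrative of the test shim:

    for (i = 0; i < nr; i++) {
            list[i] = kmem_cache_alloc(cachep, gfp);
            if (!list[i]) {
                    kmem_cache_free_bulk(cachep, i, list);
                    /* Deliberately leave the now-freed pointers in
                     * list[] so double-free bugs can be exercised. */
                    return 0;
            }
    }
    return nr;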
* radix tree test suite: fix allocation calculation in kmem_cache_alloc_bulk() (Liam R. Howlett, 2023-10-18; 1 file, -2/+2)

  The bulk allocation iterates through an array but, for each entry,
  allocates enough memory for the entire bulk allocation instead of for a
  single array entry. Only allocate an array element of the size set in
  the kmem_cache.

  Link: https://lkml.kernel.org/r/[email protected]
  Fixes: cc86e0c2f306 ("radix tree test suite: add support for slab bulk APIs")
  Signed-off-by: Liam R. Howlett <[email protected]>
  Reported-by: Christophe JAILLET <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
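  A hedged before/after sketch of the sizing bug; variable names are
  assumptions:

    /* before: every element got memory for the whole bulk request */
    p[i] = malloc(size * cachep->size);

    /* after: one array element gets one object of the cache's size */
    p[i] = malloc(cachep->size);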
* Merge tag 'xarray-6.6' of git://git.infradead.org/users/willy/xarray (Linus Torvalds, 2023-09-09; 1 file, -2/+66)

  Pull xarray fixes from Matthew Wilcox:

  - Fix a bug encountered by people using bittorrent where they'd get
    NULL pointer dereferences on page cache lookups when using XFS

  - Two documentation fixes

  * tag 'xarray-6.6' of git://git.infradead.org/users/willy/xarray:
    idr: fix param name in idr_alloc_cyclic() doc
    xarray: Document necessary flag in alloc functions
    XArray: Do not return sibling entries from xa_load()
| * XArray: Do not return sibling entries from xa_load() (Matthew Wilcox (Oracle), 2023-07-28; 1 file, -2/+66)

  It is possible for xa_load() to observe a sibling entry pointing to
  another sibling entry. An example:

    Thread A:                     Thread B:
    xa_store_range(xa, entry, 188, 191, gfp);
                                  xa_load(xa, 191);
                                  entry = xa_entry(xa, node, 63);
                                  [entry is a sibling of 188]
    xa_store_range(xa, entry, 184, 191, gfp);
                                  if (xa_is_sibling(entry))
                                  offset = xa_to_sibling(entry);
                                  entry = xa_entry(xas->xa, node, offset);
                                  [entry is now a sibling of 184]

  It is sufficient to go around this loop until we hit a non-sibling
  entry. Sibling entries always point earlier in the node, so we are
  guaranteed to terminate this search.

  Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
  Fixes: 6b24ca4a1a8d ("mm: Use multi-index entries in the page cache")
  Cc: [email protected]
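  A hedged sketch of the loop the fix describes; since sibling entries
  always point earlier in the node, the loop terminates:

    while (xa_is_sibling(entry)) {
            offset = xa_to_sibling(entry);
            entry = xa_entry(xas->xa, node, offset);
    }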
* | merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes (Andrew Morton, 2023-08-21; 1 file, -1/+1)
| * | radix tree test suite: fix incorrect allocation size for pthreads (Colin Ian King, 2023-08-04; 1 file, -1/+1)

  Currently the pthread allocation for each array item is based on the
  size of a pthread_t pointer when it should be the size of the pthread_t
  structure, so the allocation under-allocates the correct size. Fix this
  by using the size of each element in the pthreads array.

  Static analysis with cppcheck reported:

    tools/testing/radix-tree/regression1.c:180:2: warning: Size of
    pointer 'threads' used instead of size of its data. [pointerSize]

  Link: https://lkml.kernel.org/r/[email protected]
  Fixes: 1366c37ed84b ("radix tree test harness")
  Signed-off-by: Colin Ian King <[email protected]>
  Cc: Konstantin Khlebnikov <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Cc: <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
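  A hedged sketch of the fix cppcheck pointed at; sizing by the array's
  element type avoids the pointer/structure confusion:

    pthread_t *threads;

    /* before: malloc(nr_threads * sizeof(pthread_t *)) under-allocates */
    threads = malloc(nr_threads * sizeof(*threads));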
* | maple_tree: update mas_preallocate() testing (Liam R. Howlett, 2023-08-18; 1 file, -11/+16)

  Since the mas_preallocate() calculation has been updated to be more
  precise, the testing must also be updated to check for what is
  expected.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Liam R. Howlett <[email protected]>
  Cc: Peng Zhang <[email protected]>
  Cc: Suren Baghdasaryan <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* | maple_tree: re-introduce entry to mas_preallocate() arguments (Liam R. Howlett, 2023-08-18; 1 file, -16/+16)

  The current preallocation strategy is to preallocate the absolute
  worst-case allocation for a tree modification. The entry (or NULL) is
  needed to know how many nodes are needed to write to the tree. Start by
  adding the argument to the mas_preallocate() definition.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Liam R. Howlett <[email protected]>
  Cc: Peng Zhang <[email protected]>
  Cc: Suren Baghdasaryan <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* | maple_tree: add test for expanding range in RCU mode (Peng Zhang, 2023-08-18; 1 file, -0/+75)

  Add a test for expanding a range in RCU mode. If we use the fast path
  of the slot store to expand a range in RCU mode, this test will fail.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Peng Zhang <[email protected]>
  Reviewed-by: Liam R. Howlett <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* maple_tree: fix node allocation testing on 32 bit (Liam R. Howlett, 2023-07-17; 1 file, -3/+3)

  Internal node counting was altered and the 64 bit test was updated;
  however, the 32 bit test was missed. Restore the 32 bit test to a
  functional state.

  Link: https://lore.kernel.org/linux-mm/CAMuHMdV4T53fOw7VPoBgPR7fP6RYqf=CBhD_y_vOg53zZX_DnA@mail.gmail.com/
  Link: https://lkml.kernel.org/r/[email protected]
  Fixes: 541e06b772c1 ("maple_tree: remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk()")
  Signed-off-by: Liam R. Howlett <[email protected]>
  Cc: <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* Merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes. (Andrew Morton, 2023-06-23; 1 file, -2/+3)
| * radix-tree: move declarations to header (Arnd Bergmann, 2023-06-12; 1 file, -2/+3)

  The xarray.c file contains the only call to radix_tree_node_rcu_free(),
  and it comes with its own extern declaration for it. This means the
  function definition causes a missing-prototype warning:

    lib/radix-tree.c:288:6: error: no previous prototype for 'radix_tree_node_rcu_free' [-Werror=missing-prototypes]

  Instead, move the declaration for this function to a new header that
  can be included by both, and do the same for the radix_tree_node_cachep
  variable, which has the same underlying problem but does not cause a
  warning with gcc.

  [[email protected]: fix building radix tree test suite]
  Link: https://lkml.kernel.org/r/[email protected]
  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Arnd Bergmann <[email protected]>
  Signed-off-by: Peng Zhang <[email protected]>
  Cc: Matthew Wilcox (Oracle) <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
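  A hedged sketch of the shared internal header; the file name and exact
  contents are assumptions based on the description above:

    /* lib/radix-tree.h -- shared by lib/radix-tree.c and lib/xarray.c */
    extern struct kmem_cache *radix_tree_node_cachep;
    void radix_tree_node_rcu_free(struct rcu_head *head);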
* | maple_tree: add __init and __exit to test module (Liam R. Howlett, 2023-06-09; 2 files, -71/+73)

  The test functions are not needed after the module is removed, so mark
  them as such. Add __exit to the module removal function. Some other
  variables have been marked static const as well.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Liam R. Howlett <[email protected]>
  Suggested-by: Andrew Morton <[email protected]>
  Cc: David Binderman <[email protected]>
  Cc: Peng Zhang <[email protected]>
  Cc: Sergey Senozhatsky <[email protected]>
  Cc: Vernon Yang <[email protected]>
  Cc: Wei Yang <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* | maple_tree: make test code work without debug enabled (Liam R. Howlett, 2023-06-09; 1 file, -1/+0)

  The test code is less useful without debug, but can still do general
  validations. Define mt_dump(), mas_dump() and mas_wr_dump() as no-ops
  if debug is not enabled, and document in the test module information
  that more information can be obtained with another kernel config
  option. MT_BUG_ON() will report failures without tree dumps, and the
  output will be less useful.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Liam R. Howlett <[email protected]>
  Cc: David Binderman <[email protected]>
  Cc: Peng Zhang <[email protected]>
  Cc: Sergey Senozhatsky <[email protected]>
  Cc: Vernon Yang <[email protected]>
  Cc: Wei Yang <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
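  A hedged sketch of the no-op fallbacks when debug support is not
  configured; the guard macro name is an assumption:

    #ifndef CONFIG_DEBUG_MAPLE_TREE
    #define mt_dump(mt, fmt)        do { } while (0)
    #define mas_dump(mas)           do { } while (0)
    #define mas_wr_dump(wr_mas)     do { } while (0)
    #endif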
* | maple_tree: add format option to mt_dump() (Liam R. Howlett, 2023-06-09; 1 file, -6/+6)

  Allow different formatting strings to be used when dumping the tree.
  Currently supports hex and decimal.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Liam R. Howlett <[email protected]>
  Cc: David Binderman <[email protected]>
  Cc: Peng Zhang <[email protected]>
  Cc: Sergey Senozhatsky <[email protected]>
  Cc: Vernon Yang <[email protected]>
  Cc: Wei Yang <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* | maple_tree: avoid unnecessary ascending (Liam R. Howlett, 2023-06-09; 1 file, -0/+4)

  The maple tree node limits are implied by the parent. When walking up
  the tree, the limit may not be known until a slot that does not have
  implied limits is encountered. However, if the node is the left-most or
  right-most node, walking up to find that limit can be skipped.

  This commit also fixes the debug/testing code that was not setting the
  limit on walking down the tree, as that optimization is not compatible
  with this change.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Liam R. Howlett <[email protected]>
  Reviewed-by: Peng Zhang <[email protected]>
  Cc: David Binderman <[email protected]>
  Cc: Sergey Senozhatsky <[email protected]>
  Cc: Vernon Yang <[email protected]>
  Cc: Wei Yang <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* maple_tree: add a test case to check maple_alloc (Peng Zhang, 2023-04-18; 1 file, -0/+24)

  Add a test case to check whether the number of maple_alloc structures
  is actually equal to mas->alloc->total.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Peng Zhang <[email protected]>
  Cc: Liam R. Howlett <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* maple_tree: fix write memory barrier of nodes once dead for RCU mode (Liam R. Howlett, 2023-04-06; 1 file, -0/+16)

  During the development of the maple tree, the strategy for freeing
  multiple nodes changed and, in the process, the pivots were reused to
  store pointers to dead nodes. To ensure readers see accurate pivots,
  the writers need to mark the nodes as dead and call smp_wmb() to ensure
  any readers can identify the node as dead before using the pivot
  values.

  There were two places where the old method of marking the node as dead
  without smp_wmb() was being used, which resulted in RCU readers seeing
  the wrong pivot value before seeing that the node was dead. Fix this
  race condition by using mte_set_node_dead(), which has the smp_wmb()
  call to ensure the race is closed.

  Add a WARN_ON() to the ma_free_rcu() call to ensure all nodes being
  freed are marked as dead, ensuring there are no other call paths
  besides the two updated paths.

  This is necessary for the RCU mode of the maple tree.

  Link: https://lkml.kernel.org/r/[email protected]
  Fixes: 54a611b60590 ("Maple Tree: add new data structure")
  Signed-off-by: Liam R. Howlett <[email protected]>
  Signed-off-by: Suren Baghdasaryan <[email protected]>
  Cc: <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
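  A hedged sketch of the helper the fixed paths switch to, mirroring the
  described behavior of mte_set_node_dead():

    static inline void mte_set_node_dead(struct maple_enode *mn)
    {
            mte_to_node(mn)->parent = ma_parent_ptr(mte_to_node(mn));
            smp_wmb();  /* readers must see the dead mark before any
                         * reused pivot values */
    }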
* maple_tree: remove the parameter entry of mas_preallocate (Vernon Yang, 2023-02-03; 1 file, -16/+16)

  The parameter entry of mas_preallocate is not used, so drop it.

  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Vernon Yang <[email protected]>
  Cc: Liam Howlett <[email protected]>
  Cc: Matthew Wilcox <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
* maple_tree: remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk() (Liam Howlett, 2023-01-19; 1 file, -9/+9)

  Preallocations are common in the VMA code to avoid allocating under
  certain locking conditions. The preallocations must also cover the
  worst-case scenario. Removing the GFP_ZERO flag from kmem_cache_alloc()
  (and the bulk variant) calls will reduce the amount of time spent
  zeroing memory that may not be used. Only zero out the necessary area
  to keep track of the allocations in the maple state. Zero the entire
  node prior to using it in the tree.

  This required internal changes to node counting on allocation, so the
  test code is also updated.

  This restores some micro-benchmark performance: up to +9% in mmtests
  mmap1 by my testing, and +10% to +20% in the mmap, mmapaddr and
  mmapmany tests reported by Red Hat.

  Link: https://bugzilla.redhat.com/show_bug.cgi?id=2149636
  Link: https://lkml.kernel.org/r/[email protected]
  Signed-off-by: Liam Howlett <[email protected]>
  Reported-by: Jirka Hladky <[email protected]>
  Suggested-by: Matthew Wilcox (Oracle) <[email protected]>
  Signed-off-by: Andrew Morton <[email protected]>
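  A hedged before/after sketch of the allocation change described above:

    /* before: the node arrived pre-zeroed */
    node = kmem_cache_alloc(maple_node_cache, gfp | __GFP_ZERO);

    /* after: plain allocation; zero the whole node only right before it
     * is actually used in the tree */
    node = kmem_cache_alloc(maple_node_cache, gfp);
    memset(node, 0, sizeof(*node));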