diff options
| author | Jinjiang Tu <[email protected]> | 2025-06-27 12:57:46 +0000 |
|---|---|---|
| committer | Andrew Morton <[email protected]> | 2025-07-20 02:26:15 +0000 |
| commit | 9f1e8cd0b7c4c944e9921b52a6661b5eda2705ab (patch) | |
| tree | ab0b26ee36e0f9b0047c0e0bc557a1899570865f | |
| parent | Revert "sched/numa: add statistics of numa balance task" (diff) | |
| download | kernel-9f1e8cd0b7c4c944e9921b52a6661b5eda2705ab.tar.gz kernel-9f1e8cd0b7c4c944e9921b52a6661b5eda2705ab.zip | |
mm/vmscan: fix hwpoisoned large folio handling in shrink_folio_list
In shrink_folio_list(), the hwpoisoned folio may be large folio, which
can't be handled by unmap_poisoned_folio(). For THP, try_to_unmap_one()
must be passed with TTU_SPLIT_HUGE_PMD to split huge PMD first and then
retry. Without TTU_SPLIT_HUGE_PMD, we will trigger null-ptr deref of
pvmw.pte. Even we passed TTU_SPLIT_HUGE_PMD, we will trigger a
WARN_ON_ONCE due to the page isn't in swapcache.
Since UCE is rare in real world, and race with reclaimation is more rare,
just skipping the hwpoisoned large folio is enough. memory_failure() will
handle it if the UCE is triggered again.
This happens when memory reclaim for large folio races with
memory_failure(), and will lead to kernel panic. The race is as
follows:
cpu0 cpu1
shrink_folio_list memory_failure
TestSetPageHWPoison
unmap_poisoned_folio
--> trigger BUG_ON due to
unmap_poisoned_folio couldn't
handle large folio
[[email protected]: add comment to unmap_poisoned_folio()]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jinjiang Tu <[email protected]>
Fixes: 1b0449544c64 ("mm/vmscan: don't try to reclaim hwpoison folio")
Reported-by: [email protected]
Closes: https://lore.kernel.org/all/[email protected]/
Acked-by: David Hildenbrand <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Acked-by: Zi Yan <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
| -rw-r--r-- | mm/memory-failure.c | 4 | ||||
| -rw-r--r-- | mm/vmscan.c | 8 |
2 files changed, 12 insertions, 0 deletions
diff --git a/mm/memory-failure.c b/mm/memory-failure.c index b91a33fb6c69..225dddff091d 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1561,6 +1561,10 @@ static int get_hwpoison_page(struct page *p, unsigned long flags) return ret; } +/* + * The caller must guarantee the folio isn't large folio, except hugetlb. + * try_to_unmap() can't handle it. + */ int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill) { enum ttu_flags ttu = TTU_IGNORE_MLOCK | TTU_SYNC | TTU_HWPOISON; diff --git a/mm/vmscan.c b/mm/vmscan.c index f8dfd2864bbf..424412680cfc 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1138,6 +1138,14 @@ retry: goto keep; if (folio_contain_hwpoisoned_page(folio)) { + /* + * unmap_poisoned_folio() can't handle large + * folio, just skip it. memory_failure() will + * handle it if the UCE is triggered again. + */ + if (folio_test_large(folio)) + goto keep_locked; + unmap_poisoned_folio(folio, folio_pfn(folio), false); folio_unlock(folio); folio_put(folio); |
