aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* chore(bcm2712-iommu): sync source code from rpi kernel repoHEADv6.17mainsaturneric2025-10-172-201/+289
|
* chore(bcm2712-ds): clean up unused device configurationssaturneric2025-10-171-1/+14
|
* refactor(rpi_poe_power): update power supply configurationsaturneric2025-10-171-1/+1
| | | | | * Change `psy_cfg.of_node` to `psy_cfg.fwnode` for better compatibility * Align with recent driver updates and improve code clarity
* chore(macb): remove unused shutdown functionsaturneric2025-10-171-13/+0
| | | | | * The `macb_shutdown` function was not utilized in the codebase. * Cleaning up unused code improves maintainability and readability.
* chore(iommu): clean up unused code sectionssaturneric2025-10-171-2/+0
| | | | | * Removed redundant comments and unused macros * Improves code readability and maintainability
* refactor(gpio): change set functions to return intsaturneric2025-10-171-2/+3
| | | | | | | * Update `bgpio_set_direct` and `bgpio_set_multiple_direct` to return an integer instead of void for better error handling. * This change allows the caller to check for success or failure of the operations.
* chore(Makefile): remove unused device tree overlay filesaturneric2025-10-173-133/+757
| | | | | * Deleted `bcm2712-rpi-5-b-ovl-rp1.dts` as it is no longer needed. * Updated `Makefile` to reflect the removal of the overlay.
* Merge tag 'v6.17'saturneric2025-10-1612558-239966/+483679
|\ | | | | | | Linux 6.17
| * Linux 6.17Linus Torvalds2025-09-281-1/+1
| |
| * Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linuxLinus Torvalds2025-09-281-1/+1
| |\ | | | | | | | | | | | | | | | | | | | | | | | | Pull ARM fix from Russell King: "Just one fix to the module freeing function that was declared __weak when it should not have been. Thanks to Petr Pavlu for spotting this" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: ARM: 9458/1: module: Ensure the override of module_arch_freeing_init()
| | * ARM: 9458/1: module: Ensure the override of module_arch_freeing_init()Petr Pavlu2025-09-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function module_arch_freeing_init() defined in arch/arm/kernel/module.c is supposed to override a weak function of the same name defined in kernel/module/main.c. However, the ARM override is also marked as weak, which means that selecting the correct function unnecessarily depends on the order in which object files with both functions are passed to the linker. Although it happens to be correct at the moment, the proper pattern is to make the ARM override a strong definition. Fixes: cdcb07e45a91 ("ARM: 8975/1: module: fix handling of unwind init sections") Signed-off-by: Petr Pavlu <[email protected]> Signed-off-by: Russell King (Oracle) <[email protected]>
| * | Merge tag 'i2c-for-6.17-rc8' of ↵Linus Torvalds2025-09-284-9/+14
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: - various MAINTAINERS updates - fix an off-by-one error in riic - fix k1 DT schema to allow validation - rtl9300: fix faulty merge conflict resolution * tag 'i2c-for-6.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: rtl9300: Drop unsupported I2C_FUNC_SMBUS_I2C_BLOCK MAINTAINERS: add entry for SpacemiT K1 I2C driver MAINTAINERS: Add me as maintainer of Synopsys DesignWare I2C driver MAINTAINERS: delete email for Tharun Kumar P dt-bindings: i2c: spacemit: extend and validate all properties i2c: riic: Allow setting frequencies lower than 50KHz MAINTAINERS: Remove myself as Synopsys DesignWare I2C maintainer MAINTAINERS: Update email address for Qualcomm's I2C GENI maintainers
| | * | i2c: rtl9300: Drop unsupported I2C_FUNC_SMBUS_I2C_BLOCKSven Eckelmann2025-09-271-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While applying the patch for commit ede965fd555a ("i2c: rtl9300: remove broken SMBus Quick operation support"), a conflict was incorrectly solved by adding the I2C_FUNC_SMBUS_I2C_BLOCK feature flag. But the code to handle I2C_SMBUS_I2C_BLOCK_DATA requests will be added by a separate commit. Fixes: ede965fd555a ("i2c: rtl9300: remove broken SMBus Quick operation support") Signed-off-by: Sven Eckelmann <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
| | * | MAINTAINERS: add entry for SpacemiT K1 I2C driverTroy Mitchell2025-09-271-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a MAINTAINERS entry for the SpacemiT K1 I2C driver and its DT binding. Signed-off-by: Troy Mitchell <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
| | * | MAINTAINERS: Add me as maintainer of Synopsys DesignWare I2C driverMika Westerberg2025-09-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I volunteered as maintainer of the DesignWare I2C driver, so update my entry from reviewer to maintainer. Signed-off-by: Mika Westerberg <[email protected]> Acked-by: Jarkko Nikula <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
| | * | MAINTAINERS: delete email for Tharun Kumar PWolfram Sang2025-09-271-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The email address bounced. I couldn't find a newer one in recent git history, so delete this email entry. Signed-off-by: Wolfram Sang <[email protected]>
| | * | dt-bindings: i2c: spacemit: extend and validate all propertiesYixun Lan2025-09-231-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the K1 I2C properties by including generic i2c-controller schema. and this will enable it to do the DT validation check later. Signed-off-by: Yixun Lan <[email protected]> Reviewed-by: Troy Mitchell <[email protected]> Tested-by: Alex Elder <[email protected]> Acked-by: Conor Dooley <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
| | * | i2c: riic: Allow setting frequencies lower than 50KHzClaudiu Beznea2025-09-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The MR1.CKS field is 3 bits wide and all the possible values (from 0 to 7) are valid. This is true for all the SoCs currently integrated in upstream Linux. Take into account CKS=7 which allows setting bus frequencies lower than 50KHz. This may be useful at least for debugging. Fixes: d982d6651419 ("i2c: riic: remove clock and frequency restrictions") Signed-off-by: Claudiu Beznea <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
| | * | MAINTAINERS: Remove myself as Synopsys DesignWare I2C maintainerJarkko Nikula2025-09-231-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | My address is going to bounce soon and I won't have access to the Synopsys datasheets so I'm going step down being a maintainer for this driver. Signed-off-by: Jarkko Nikula <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
| | * | MAINTAINERS: Update email address for Qualcomm's I2C GENI maintainersMukesh Kumar Savaliya2025-09-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update email address to @oss.qualcomm.com for both the maintainers from qualcomm, Viken dadhani and Mukesh Kumar Savaliya. Signed-off-by: Mukesh Kumar Savaliya <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
| * | | Merge tag 'trace-v6.17-rc7' of ↵Linus Torvalds2025-09-282-1/+14
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix buffer overflow in osnoise_cpu_write() The allocated buffer to read user space did not add a nul terminating byte after copying from user the string. It then reads the string, and if user space did not add a nul byte, the read will continue beyond the string. Add a nul terminating byte after reading the string. - Fix missing check for lockdown on tracing There's a path from kprobe events or uprobe events that can update the tracing system even if lockdown on tracing is activate. Add a check in the dynamic event path. - Add a recursion check for the function graph return path Now that fprobes can hook to the function graph tracer and call different code between the entry and the exit, the exit code may now call functions that are not called in entry. This means that the exit handler can possibly trigger recursion that is not caught and cause the system to crash. Add the same recursion checks in the function exit handler as exists in the entry handler path. * tag 'trace-v6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: fgraph: Protect return handler from recursion loop tracing: dynevent: Add a missing lockdown check on dynevent tracing/osnoise: Fix slab-out-of-bounds in _parse_integer_limit()
| | * | | tracing: fgraph: Protect return handler from recursion loopMasami Hiramatsu (Google)2025-09-271-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | function_graph_enter_regs() prevents itself from recursion by ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), which is called at the exit, does not prevent such recursion. Therefore, while it can prevent recursive calls from fgraph_ops::entryfunc(), it is not able to prevent recursive calls to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. This can lead an unexpected recursion bug reported by Menglong. is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> kprobe_multi_link_exit_handler -> is_endbr. To fix this issue, acquire ftrace_test_recursion_trylock() in the __ftrace_return_to_handler() after unwind the shadow stack to mark this section must prevent recursive call of fgraph inside user-defined fgraph_ops::retfunc(). This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer"), because before that fgraph was only used from the function graph tracer. Fprobe allowed user to run any callbacks from fgraph after that commit. Reported-by: Menglong Dong <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") Cc: [email protected] Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/175852292275.307379.9040117316112640553.stgit@devnote2 Signed-off-by: Masami Hiramatsu (Google) <[email protected]> Acked-by: Jiri Olsa <[email protected]> Tested-by: Menglong Dong <[email protected]> Acked-by: Menglong Dong <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
| | * | | tracing: dynevent: Add a missing lockdown check on dyneventMasami Hiramatsu (Google)2025-09-231-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since dynamic_events interface on tracefs is compatible with kprobe_events and uprobe_events, it should also check the lockdown status and reject if it is set. Cc: Mathieu Desnoyers <[email protected]> Link: https://lore.kernel.org/175824455687.45175.3734166065458520748.stgit@devnote2 Signed-off-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
| | * | | tracing/osnoise: Fix slab-out-of-bounds in _parse_integer_limit()Wang Liang2025-09-231-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When config osnoise cpus by write() syscall, the following KASAN splat may be observed: BUG: KASAN: slab-out-of-bounds in _parse_integer_limit+0x103/0x130 Read of size 1 at addr ffff88810121e3a1 by task test/447 CPU: 1 UID: 0 PID: 447 Comm: test Not tainted 6.17.0-rc6-dirty #288 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x55/0x70 print_report+0xcb/0x610 kasan_report+0xb8/0xf0 _parse_integer_limit+0x103/0x130 bitmap_parselist+0x16d/0x6f0 osnoise_cpus_write+0x116/0x2d0 vfs_write+0x21e/0xcc0 ksys_write+0xee/0x1c0 do_syscall_64+0xa8/0x2a0 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> This issue can be reproduced by below code: const char *cpulist = "1"; int fd=open("/sys/kernel/debug/tracing/osnoise/cpus", O_WRONLY); write(fd, cpulist, strlen(cpulist)); Function bitmap_parselist() was called to parse cpulist, it require that the parameter 'buf' must be terminated with a '\0' or '\n'. Fix this issue by adding a '\0' to 'buf' in osnoise_cpus_write(). Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 17f89102fe23 ("tracing/osnoise: Allow arbitrarily long CPU string") Signed-off-by: Wang Liang <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
| * | | | Merge tag 'spi-fix-v6.17-rc7' of ↵Linus Torvalds2025-09-282-17/+37
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A few final driver specific fixes that have been sitting in -next for a bit. The OMAP issue is likely to come up very infrequently since mixed configuration SPI buses are rare and the Cadence issue is specific to SoCFPGA systems" * tag 'spi-fix-v6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: omap2-mcspi: drive SPI_CLK on transfer_setup() spi: cadence-qspi: defer runtime support on socfpga if reset bit is enabled
| | * | | | spi: omap2-mcspi: drive SPI_CLK on transfer_setup()Bastien Curutchet (Schneider Electric)2025-09-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the cached contents of the CHCONF register doesn't have the FORCE bit set, the setup() function failed to set the relevant idle state of the SPI_CLK pin. In such case, the SPI_CLK's idle state is reached later with set_cs(), but it's too late for the first SPI transfer which fails since the CS is asserted before the clock reaching its idle state. Add a first write in setup() that always sets the FORCE bit. Keep the current write afterwards to ensure the FORCE bit won't stay in the cached contents of the CHCONF register unless it's intended. Signed-off-by: Bastien Curutchet (Schneider Electric) <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]>
| | * | | | spi: cadence-qspi: defer runtime support on socfpga if reset bit is enabledKhairul Anuar Romli2025-09-181-17/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enabling runtime PM allows the kernel to gate clocks and power to idle devices. On SoCFPGA, a warm reset does not fully reinitialize these domains.This leaves devices suspended and powered down, preventing U-Boot or the kernel from reusing them after a warm reset, which breaks the boot process. Fixes: 4892b374c9b7 ("mtd: spi-nor: cadence-quadspi: Add runtime PM support") CC: [email protected] # 6.12+ Signed-off-by: Khairul Anuar Romli <[email protected]> Signed-off-by: Adrian Ng Ho Yin <[email protected]> Reviewed-by: Niravkumar L Rabara <[email protected]> Reviewed-by: Matthew Gerlach <[email protected]> Link: https://patch.msgid.link/910aad68ba5d948919a7b90fa85a2fadb687229b.1757491372.git.khairul.anuar.romli@altera.com Signed-off-by: Mark Brown <[email protected]>
| * | | | | Merge tag 'mm-hotfixes-stable-2025-09-27-22-35' of ↵Linus Torvalds2025-09-289-21/+49
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "7 hotfixes. 4 are cc:stable and the remainder address post-6.16 issues or aren't considered necessary for -stable kernels. 6 of these fixes are for MM. All singletons, please see the changelogs for details" * tag 'mm-hotfixes-stable-2025-09-27-22-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: include/linux/pgtable.h: convert arch_enter_lazy_mmu_mode() and friends to static inlines mm/damon/sysfs: do not ignore callback's return value in damon_sysfs_damon_call() mailmap: add entry for Bence Csókás fs/proc/task_mmu: check p->vec_buf for NULL kmsan: fix out-of-bounds access to shadow memory mm/hugetlb: fix copy_hugetlb_page_range() to use ->pt_share_count mm/hugetlb: fix folio is still mapped when deleted
| | * | | | | include/linux/pgtable.h: convert arch_enter_lazy_mmu_mode() and friends to ↵Andrew Morton2025-09-251-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | static inlines commit c519c3c0a113 ("mm/kasan: avoid lazy MMU mode hazards") introduced the use of arch_enter_lazy_mmu_mode(), which results in the compiler complaining about "statement has no effect", when __HAVE_ARCH_LAZY_MMU_MODE is not defined in include/linux/pgtable.h The exact warning/error is: In file included from ./include/linux/kasan.h:37, from mm/kasan/shadow.c:14: mm/kasan/shadow.c: In function kasan_populate_vmalloc_pte: ./include/linux/pgtable.h:247:41: error: statement with no effect [-Werror=unused-value] 247 | #define arch_enter_lazy_mmu_mode() (LAZY_MMU_DEFAULT) | ^ mm/kasan/shadow.c:322:9: note: in expansion of macro arch_enter_lazy_mmu_mode> 322 | arch_enter_lazy_mmu_mode(); | ^~~~~~~~~~~~~~~~~~~~~~~~ switching these "functions" to static inlines fixes this up. Fixes: c519c3c0a113 ("mm/kasan: avoid lazy MMU mode hazards") Reported-by: Balbir Singh <[email protected]> Closes: https://lkml.kernel.org/r/[email protected] Cc: Alexander Gordeev <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
| | * | | | | mm/damon/sysfs: do not ignore callback's return value in ↵Akinobu Mita2025-09-251-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | damon_sysfs_damon_call() The callback return value is ignored in damon_sysfs_damon_call(), which means that it is not possible to detect invalid user input when writing commands such as 'commit' to /sys/kernel/mm/damon/admin/kdamonds/<K>/state. Fix it. Link: https://lkml.kernel.org/r/[email protected] Fixes: f64539dcdb87 ("mm/damon/sysfs: use damon_call() for update_schemes_stats") Signed-off-by: Akinobu Mita <[email protected]> Reviewed-by: SeongJae Park <[email protected]> Cc: <[email protected]> [6.14+] Signed-off-by: Andrew Morton <[email protected]>
| | * | | | | mailmap: add entry for Bence CsókásBence Csókás2025-09-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I will be leaving Prolan this week. You can reach me by my personal email for now. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Bence Csókás <[email protected]> Cc: Arnd Bergmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
| | * | | | | fs/proc/task_mmu: check p->vec_buf for NULLJakub Acs2025-09-251-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the PAGEMAP_SCAN ioctl is invoked with vec_len = 0 reaches pagemap_scan_backout_range(), kernel panics with null-ptr-deref: [ 44.936808] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI [ 44.937797] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] [ 44.938391] CPU: 1 UID: 0 PID: 2480 Comm: reproducer Not tainted 6.17.0-rc6 #22 PREEMPT(none) [ 44.939062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [ 44.939935] RIP: 0010:pagemap_scan_thp_entry.isra.0+0x741/0xa80 <snip registers, unreliable trace> [ 44.946828] Call Trace: [ 44.947030] <TASK> [ 44.949219] pagemap_scan_pmd_entry+0xec/0xfa0 [ 44.952593] walk_pmd_range.isra.0+0x302/0x910 [ 44.954069] walk_pud_range.isra.0+0x419/0x790 [ 44.954427] walk_p4d_range+0x41e/0x620 [ 44.954743] walk_pgd_range+0x31e/0x630 [ 44.955057] __walk_page_range+0x160/0x670 [ 44.956883] walk_page_range_mm+0x408/0x980 [ 44.958677] walk_page_range+0x66/0x90 [ 44.958984] do_pagemap_scan+0x28d/0x9c0 [ 44.961833] do_pagemap_cmd+0x59/0x80 [ 44.962484] __x64_sys_ioctl+0x18d/0x210 [ 44.962804] do_syscall_64+0x5b/0x290 [ 44.963111] entry_SYSCALL_64_after_hwframe+0x76/0x7e vec_len = 0 in pagemap_scan_init_bounce_buffer() means no buffers are allocated and p->vec_buf remains set to NULL. This breaks an assumption made later in pagemap_scan_backout_range(), that page_region is always allocated for p->vec_buf_index. Fix it by explicitly checking p->vec_buf for NULL before dereferencing. Other sites that might run into same deref-issue are already (directly or transitively) protected by checking p->vec_buf. Note: From PAGEMAP_SCAN man page, it seems vec_len = 0 is valid when no output is requested and it's only the side effects caller is interested in, hence it passes check in pagemap_scan_get_args(). This issue was found by syzkaller. Link: https://lkml.kernel.org/r/[email protected] Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs") Signed-off-by: Jakub Acs <[email protected]> Reviewed-by: Muhammad Usama Anjum <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Lorenzo Stoakes <[email protected]> Cc: Jinjiang Tu <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Penglei Jiang <[email protected]> Cc: Mark Brown <[email protected]> Cc: Baolin Wang <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Andrei Vagin <[email protected]> Cc: "Michał Mirosław" <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
| | * | | | | kmsan: fix out-of-bounds access to shadow memoryEric Biggers2025-09-252-3/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Running sha224_kunit on a KMSAN-enabled kernel results in a crash in kmsan_internal_set_shadow_origin(): BUG: unable to handle page fault for address: ffffbc3840291000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 1810067 P4D 1810067 PUD 192d067 PMD 3c17067 PTE 0 Oops: 0000 [#1] SMP NOPTI CPU: 0 UID: 0 PID: 81 Comm: kunit_try_catch Tainted: G N 6.17.0-rc3 #10 PREEMPT(voluntary) Tainted: [N]=TEST Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014 RIP: 0010:kmsan_internal_set_shadow_origin+0x91/0x100 [...] Call Trace: <TASK> __msan_memset+0xee/0x1a0 sha224_final+0x9e/0x350 test_hash_buffer_overruns+0x46f/0x5f0 ? kmsan_get_shadow_origin_ptr+0x46/0xa0 ? __pfx_test_hash_buffer_overruns+0x10/0x10 kunit_try_run_case+0x198/0xa00 This occurs when memset() is called on a buffer that is not 4-byte aligned and extends to the end of a guard page, i.e. the next page is unmapped. The bug is that the loop at the end of kmsan_internal_set_shadow_origin() accesses the wrong shadow memory bytes when the address is not 4-byte aligned. Since each 4 bytes are associated with an origin, it rounds the address and size so that it can access all the origins that contain the buffer. However, when it checks the corresponding shadow bytes for a particular origin, it incorrectly uses the original unrounded shadow address. This results in reads from shadow memory beyond the end of the buffer's shadow memory, which crashes when that memory is not mapped. To fix this, correctly align the shadow address before accessing the 4 shadow bytes corresponding to each origin. Link: https://lkml.kernel.org/r/[email protected] Fixes: 2ef3cec44c60 ("kmsan: do not wipe out origin when doing partial unpoisoning") Signed-off-by: Eric Biggers <[email protected]> Tested-by: Alexander Potapenko <[email protected]> Reviewed-by: Alexander Potapenko <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Marco Elver <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
| | * | | | | mm/hugetlb: fix copy_hugetlb_page_range() to use ->pt_share_countJane Chu2025-09-252-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 59d9094df3d79 ("mm: hugetlb: independent PMD page table shared count") introduced ->pt_share_count dedicated to hugetlb PMD share count tracking, but omitted fixing copy_hugetlb_page_range(), leaving the function relying on page_count() for tracking that no longer works. When lazy page table copy for hugetlb is disabled, that is, revert commit bcd51a3c679d ("hugetlb: lazy page table copies in fork()") fork()'ing with hugetlb PMD sharing quickly lockup - [ 239.446559] watchdog: BUG: soft lockup - CPU#75 stuck for 27s! [ 239.446611] RIP: 0010:native_queued_spin_lock_slowpath+0x7e/0x2e0 [ 239.446631] Call Trace: [ 239.446633] <TASK> [ 239.446636] _raw_spin_lock+0x3f/0x60 [ 239.446639] copy_hugetlb_page_range+0x258/0xb50 [ 239.446645] copy_page_range+0x22b/0x2c0 [ 239.446651] dup_mmap+0x3e2/0x770 [ 239.446654] dup_mm.constprop.0+0x5e/0x230 [ 239.446657] copy_process+0xd17/0x1760 [ 239.446660] kernel_clone+0xc0/0x3e0 [ 239.446661] __do_sys_clone+0x65/0xa0 [ 239.446664] do_syscall_64+0x82/0x930 [ 239.446668] ? count_memcg_events+0xd2/0x190 [ 239.446671] ? syscall_trace_enter+0x14e/0x1f0 [ 239.446676] ? syscall_exit_work+0x118/0x150 [ 239.446677] ? arch_exit_to_user_mode_prepare.constprop.0+0x9/0xb0 [ 239.446681] ? clear_bhb_loop+0x30/0x80 [ 239.446684] ? clear_bhb_loop+0x30/0x80 [ 239.446686] entry_SYSCALL_64_after_hwframe+0x76/0x7e There are two options to resolve the potential latent issue: 1. warn against PMD sharing in copy_hugetlb_page_range(), 2. fix it. This patch opts for the second option. While at it, simplify the comment, the details are not actually relevant anymore. Link: https://lkml.kernel.org/r/[email protected] Fixes: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count") Signed-off-by: Jane Chu <[email protected]> Reviewed-by: Harry Yoo <[email protected]> Acked-by: Oscar Salvador <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Jann Horn <[email protected]> Cc: Liu Shixin <[email protected]> Cc: Muchun Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
| | * | | | | mm/hugetlb: fix folio is still mapped when deletedJinjiang Tu2025-09-251-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Migration may be raced with fallocating hole. remove_inode_single_folio will unmap the folio if the folio is still mapped. However, it's called without folio lock. If the folio is migrated and the mapped pte has been converted to migration entry, folio_mapped() returns false, and won't unmap it. Due to extra refcount held by remove_inode_single_folio, migration fails, restores migration entry to normal pte, and the folio is mapped again. As a result, we triggered BUG in filemap_unaccount_folio. The log is as follows: BUG: Bad page cache in process hugetlb pfn:156c00 page: refcount:515 mapcount:0 mapping:0000000099fef6e1 index:0x0 pfn:0x156c00 head: order:9 mapcount:1 entire_mapcount:1 nr_pages_mapped:0 pincount:0 aops:hugetlbfs_aops ino:dcc dentry name(?):"my_hugepage_file" flags: 0x17ffffc00000c1(locked|waiters|head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f4(hugetlb) page dumped because: still mapped when deleted CPU: 1 UID: 0 PID: 395 Comm: hugetlb Not tainted 6.17.0-rc5-00044-g7aac71907bde-dirty #484 NONE Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015 Call Trace: <TASK> dump_stack_lvl+0x4f/0x70 filemap_unaccount_folio+0xc4/0x1c0 __filemap_remove_folio+0x38/0x1c0 filemap_remove_folio+0x41/0xd0 remove_inode_hugepages+0x142/0x250 hugetlbfs_fallocate+0x471/0x5a0 vfs_fallocate+0x149/0x380 Hold folio lock before checking if the folio is mapped to avold race with migration. Link: https://lkml.kernel.org/r/[email protected] Fixes: 4aae8d1c051e ("mm/hugetlbfs: unmap pages if page fault raced with hole punch") Signed-off-by: Jinjiang Tu <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Muchun Song <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
| * | | | | | Merge tag 'trace-tools-v6.17-rc5' of ↵Linus Torvalds2025-09-271-2/+2
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull rtla tool fixes from Steven Rostedt: - Fix a buffer overflow in actions_parse() The "trigger_c" variable did not account for the nul byte when determining its size - Fix a compare that had the values reversed actions_destroy() is supposed to reallocate when len is greater than the current size, but the compare was testing if size is greater than the new length * tag 'trace-tools-v6.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rtla/actions: Fix condition for buffer reallocation rtla: Fix buffer overflow in actions_parse
| | * | | | | | rtla/actions: Fix condition for buffer reallocationWander Lairson Costa2025-09-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The condition to check if the actions buffer needs to be resized was incorrect. The check `self->size >= self->len` would evaluate to true on almost every call to `actions_new()`, causing the buffer to be reallocated unnecessarily each time an action was added. Fix the condition to `self->len >= self.size`, ensuring that the buffer is only resized when it is actually full. Cc: John Kacur <[email protected]> Cc: Luis Goncalves <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Chang Yin <[email protected]> Cc: Costa Shulyupin <[email protected]> Cc: Crystal Wood <[email protected]> Cc: Gabriele Monaco <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 6ea082b171e00 ("rtla/timerlat: Add action on threshold feature") Signed-off-by: Wander Lairson Costa <[email protected]> Reviewed-by: Tomas Glozar <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
| | * | | | | | rtla: Fix buffer overflow in actions_parseIvan Pravdin2025-09-271-1/+1
| | | |_|_|/ / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, tests 3 and 13-22 in tests/timerlat.t fail with error: *** buffer overflow detected ***: terminated timeout: the monitored command dumped core The result of running `sudo make check` is tests/timerlat.t (Wstat: 0 Tests: 22 Failed: 11) Failed tests: 3, 13-22 Files=3, Tests=34, 140 wallclock secs ( 0.07 usr 0.01 sys + 27.63 cusr 27.96 csys = 55.67 CPU) Result: FAIL Fix buffer overflow in actions_parse to avoid this error. After this change, the tests results are tests/hwnoise.t ... ok tests/osnoise.t ... ok tests/timerlat.t .. ok All tests successful. Files=3, Tests=34, 186 wallclock secs ( 0.06 usr 0.01 sys + 41.10 cusr 44.38 csys = 85.55 CPU) Result: PASS Link: https://lore.kernel.org/164ffc2ec8edacaf1295789dad82a07817b6263d.1757034919.git.ipravdin.official@gmail.com Fixes: 6ea082b171e0 ("rtla/timerlat: Add action on threshold feature") Signed-off-by: Ivan Pravdin <[email protected]> Reviewed-by: Tomas Glozar <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
| * | | | | | Merge tag 'riscv-for-linus-v6.17-rc8' of ↵Linus Torvalds2025-09-262-4/+21
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: - A race-free implementation of pudp_huge_get_and_clear() (based on the x86 code) - A MAINTAINERS update to my E-mail address * tag 'riscv-for-linus-v6.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: MAINTAINERS: Update Paul Walmsley's E-mail address riscv: Use an atomic xchg in pudp_huge_get_and_clear()
| | * | | | | | MAINTAINERS: Update Paul Walmsley's E-mail addressPaul Walmsley2025-09-241-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | My experiment with using corporate Gmail for Linux kernel list interaction has come to an end. For my MAINTAINERS entries that use that E-mail address, let's switch those to use the k.org E-mail forwarding. Signed-off-by: Paul Walmsley <[email protected]> Signed-off-by: Paul Walmsley <[email protected]>
| | * | | | | | riscv: Use an atomic xchg in pudp_huge_get_and_clear()Alexandre Ghiti2025-09-241-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure we return the right pud value and not a value that could have been overwritten in between by a different core. Fixes: c3cc2a4a3a23 ("riscv: Add support for PUD THP") Cc: [email protected] Signed-off-by: Alexandre Ghiti <[email protected]> Link: https://lore.kernel.org/r/[email protected] [[email protected]: use xchg rather than atomic_long_xchg; avoid atomic op for !CONFIG_SMP like x86] Signed-off-by: Paul Walmsley <[email protected]>
| * | | | | | | Merge tag 'x86-urgent-2025-09-26' of ↵Linus Torvalds2025-09-263-1/+24
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Fix a CPU topology code regression that caused the mishandling of certain boot command line options, and re-enable CONFIG_PTDUMP on i386 that was mistakenly turned off in the Kconfig" * tag 'x86-urgent-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/topology: Implement topology_is_core_online() to address SMT regression x86/Kconfig: Reenable PTDUMP on i386
| | * | | | | | | x86/topology: Implement topology_is_core_online() to address SMT regressionThomas Gleixner2025-09-222-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Christian reported that commit a430c11f4015 ("intel_idle: Rescan "dead" SMT siblings during initialization") broke the use case in which both 'nosmt' and 'maxcpus' are on the kernel command line because it onlines primary threads, which were offline due to the maxcpus limit. The initially proposed fix to skip primary threads in the loop is inconsistent. While it prevents the primary thread to be onlined, it then onlines the corresponding hyperthread(s), which does not really make sense. The CPU iterator in cpuhp_smt_enable() contains a check which excludes all threads of a core, when the primary thread is offline. The default implementation is a NOOP and therefore not effective on x86. Implement topology_is_core_online() on x86 to address this issue. This makes the behaviour consistent between x86 and PowerPC. Fixes: a430c11f4015 ("intel_idle: Rescan "dead" SMT siblings during initialization") Fixes: f694481b1d31 ("ACPI: processor: Rescan "dead" SMT siblings during initialization") Closes: https://lore.kernel.org/linux-pm/[email protected]/ Reported-by: Christian Loehle <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Rafael J. Wysocki (Intel) <[email protected]> Tested-by: Christian Loehle <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/[email protected]
| | * | | | | | | x86/Kconfig: Reenable PTDUMP on i386Alexander Popov2025-09-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit f9aad622006bd64c ("mm: rename GENERIC_PTDUMP and PTDUMP_CORE") has broken PTDUMP and the Kconfig options that use it on ARCH=i386, including CONFIG_DEBUG_WX. CONFIG_GENERIC_PTDUMP was renamed into CONFIG_ARCH_HAS_PTDUMP, but it was mistakenly moved from "config X86" to "config X86_64". That made PTDUMP unavailable for i386. Move CONFIG_ARCH_HAS_PTDUMP back to "config X86" to fix it. [ bp: Massage commit message. ] Fixes: f9aad622006bd64c ("mm: rename GENERIC_PTDUMP and PTDUMP_CORE") Signed-off-by: Alexander Popov <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Cc: [email protected]
| * | | | | | | | Merge tag 'sched-urgent-2025-09-26' of ↵Linus Torvalds2025-09-264-46/+35
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Fix two dl_server regressions: a race that can end up leaving the dl_server stuck, and a dl_server throttling bug causing lag to fair tasks" * tag 'sched-urgent-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Fix dl_server behaviour sched/deadline: Fix dl_server getting stuck
| | * | | | | | | | sched/deadline: Fix dl_server behaviourPeter Zijlstra2025-09-253-24/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | John reported undesirable behaviour with the dl_server since commit: cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling"). When starving fair tasks on purpose (starting spinning FIFO tasks), his fair workload, which often goes (briefly) idle, would delay fair invocations for a second, running one invocation per second was both unexpected and terribly slow. The reason this happens is that when dl_se->server_pick_task() returns NULL, indicating no runnable tasks, it would yield, pushing any later jobs out a whole period (1 second). Instead simply stop the server. This should restore behaviour in that a later wakeup (which restarts the server) will be able to continue running (subject to the CBS wakeup rules). Notably, this does not re-introduce the behaviour cccb45d7c4295 set out to solve, any start/stop cycle is naturally throttled by the timer period (no active cancel). Fixes: cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling") Reported-by: John Stultz <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: John Stultz <[email protected]>
| | * | | | | | | | sched/deadline: Fix dl_server getting stuckPeter Zijlstra2025-09-254-22/+2
| | | |_|_|_|/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | John found it was easy to hit lockup warnings when running locktorture on a 2 CPU VM, which he bisected down to: commit cccb45d7c429 ("sched/deadline: Less agressive dl_server handling"). While debugging it seems there is a chance where we end up with the dl_server dequeued, with dl_se->dl_server_active. This causes dl_server_start() to return without enqueueing the dl_server, thus it fails to run when RT tasks starve the cpu. When this happens, dl_server_timer() catches the '!dl_se->server_has_tasks(dl_se)' case, which then calls replenish_dl_entity() and dl_server_stopped() and finally return HRTIMER_NO_RESTART. This ends in no new timer and also no enqueue, leaving the dl_server 'dead', allowing starvation. What should have happened is for the bandwidth timer to start the zero-laxity timer, which in turn would enqueue the dl_server and cause dl_se->server_pick_task() to be called -- which will stop the dl_server if no fair tasks are observed for a whole period. IOW, it is totally irrelevant if there are fair tasks at the moment of bandwidth refresh. This removes all dl_se->server_has_tasks() users, so remove the whole thing. Fixes: cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling") Reported-by: John Stultz <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: John Stultz <[email protected]>
| * | | | | | | | Merge tag 'locking-urgent-2025-09-26' of ↵Linus Torvalds2025-09-262-3/+5
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Fix a PI-futexes race, and fix a copy_process() futex cleanup bug" * tag 'locking-urgent-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Use correct exit on failure from futex_hash_allocate_default() futex: Prevent use-after-free during requeue-PI
| | * | | | | | | | futex: Use correct exit on failure from futex_hash_allocate_default()Sebastian Andrzej Siewior2025-09-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | copy_process() uses the wrong error exit path from futex_hash_allocate_default(). After exiting from futex_hash_allocate_default(), neither tasklist_lock nor siglock has been acquired. The exit label bad_fork_core_free unlocks both of these locks which is wrong. The next exit label, bad_fork_cancel_cgroup, is the correct exit. sched_cgroup_fork() did not allocate any resources that need to freed. Use bad_fork_cancel_cgroup on error exit from futex_hash_allocate_default(). Fixes: 7c4f75a21f636 ("futex: Allow automatic allocation of process wide futex hash") Reported-by: [email protected] Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Steven Rostedt (Google) <[email protected]> Closes: https://lore.kernel.org/all/[email protected]
| | * | | | | | | | futex: Prevent use-after-free during requeue-PISebastian Andrzej Siewior2025-09-201-2/+4
| | |/ / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syzbot managed to trigger the following race: T1 T2 futex_wait_requeue_pi() futex_do_wait() schedule() futex_requeue() futex_proxy_trylock_atomic() futex_requeue_pi_prepare() requeue_pi_wake_futex() futex_requeue_pi_complete() /* preempt */ * timeout/ signal wakes T1 * futex_requeue_pi_wakeup_sync() // Q_REQUEUE_PI_LOCKED futex_hash_put() // back to userland, on stack futex_q is garbage /* back */ wake_up_state(q->task, TASK_NORMAL); In this scenario futex_wait_requeue_pi() is able to leave without using futex_q::lock_ptr for synchronization. This can be prevented by reading futex_q::task before updating the futex_q::requeue_state. A reference on the task_struct is not needed because requeue_pi_wake_futex() is invoked with a spinlock_t held which implies a RCU read section. Even if T1 terminates immediately after, the task_struct will remain valid during T2's wake_up_state(). A READ_ONCE on futex_q::task before futex_requeue_pi_complete() is enough because it ensures that the variable is read before the state is updated. Read futex_q::task before updating the requeue state, use it for the following wakeup. Fixes: 07d91ef510fb1 ("futex: Prevent requeue_pi() lock nesting issue on RT") Reported-by: [email protected] Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/