aboutsummaryrefslogtreecommitdiffstats
path: root/tools/lib
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of ↵Linus Torvalds2025-02-253-39/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix tools/ quiet build Makefile infrastructure that was broken when working on tools/perf/ without testing on other tools/ living utilities. * tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: tools: Remove redundant quiet setup tools: Unify top-level quiet infrastructure
| * tools: Remove redundant quiet setupCharlie Jenkins2025-02-183-39/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Q is exported from Makefile.include so it is not necessary to manually set it. Reviewed-by: Jiri Olsa <[email protected]> Signed-off-by: Charlie Jenkins <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Acked-by: Quentin Monnet <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Benjamin Tissoires <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: John Fastabend <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: KP Singh <[email protected]> Cc: Lukasz Luba <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Mykola Lysenko <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Song Liu <[email protected]> Cc: Stanislav Fomichev <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Yonghong Song <[email protected]> Cc: Zhang Rui <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
* | Merge tag 'perf-tools-for-v6.14-2025-01-21' of ↵Linus Torvalds2025-01-247-103/+48
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf-tools updates from Namhyung Kim: "There are a lot of changes in the perf tools in this cycle. build: - Use generic syscall table to generate syscall numbers on supported archs - This also enables to get rid of libaudit which was used for syscall numbers - Remove python2 support as it's deprecated for years - Fix issues on static build with libzstd perf record: - Intel-PT supports "aux-action" config term to pause or resume tracing in the aux-buffer. Users can start the intel_pt event as "started-paused" and configure other events to control the Intel-PT tracing: # perf record --kcore -e intel_pt/aux-action=start-paused/ \ -e syscalls:sys_enter_newuname/aux-action=resume/ \ -e syscalls:sys_exit_newuname/aux-action=pause/ -- uname This requires kernel support (which was added in v6.13) perf lock: - 'perf lock contention' command has an ability to symbolize locks in dynamically allocated objects using slab cache name when it runs with BPF. Those dynamic locks would have "&" prefix in the name to distinguish them from ordinary (static) locks # perf lock con -abl -E 5 sleep 1 contended total wait max wait avg wait address symbol 2 1.95 us 1.77 us 975 ns ffff9d5e852d3498 &task_struct (mutex) 1 1.18 us 1.18 us 1.18 us ffff9d5e852d3538 &task_struct (mutex) 4 1.12 us 354 ns 279 ns ffff9d5e841ca800 &kmalloc-cg-512 (mutex) 2 859 ns 617 ns 429 ns ffffffffa41c3620 delayed_uprobe_lock (mutex) 3 691 ns 388 ns 230 ns ffffffffa41c0940 pack_mutex (mutex) This also requires kernel/BPF support (which was added in v6.13) perf ftrace: - 'perf ftrace latency' command gets a couple of options to support linear buckets instead of exponential. Also it's possible to specify max and min latency for the linear buckets: # perf ftrace latency -abn -T switch_mm_irqs_off --bucket-range=100 \ --min-latency=200 --max-latency=800 -- sleep 1 # DURATION | COUNT | GRAPH | 0 - 200 ns | 186 | ### | 200 - 300 ns | 256 | ##### | 300 - 400 ns | 364 | ####### | 400 - 500 ns | 223 | #### | 500 - 600 ns | 111 | ## | 600 - 700 ns | 41 | | 700 - 800 ns | 141 | ## | 800 - ... ns | 169 | ### | # statistics (in nsec) total time: 2162212 avg time: 967 max time: 16817 min time: 132 count: 2236 - As you can see in the above example, it nows shows the statistics at the end so that users can see the avg/max/min latencies easily - 'perf ftrace profile' command has --graph-opts option like 'perf ftrace trace' so that it can control the tracing behaviors in the same way. For example, it can limit the function call depth or threshold perf script: - Improve physical memory resolution in 'mem-phys-addr' script by parsing /proc/iomem file # perf script mem-phys-addr -- find / ... Event: mem_inst_retired.all_loads:P Memory type count percentage ---------------------------------------- ---------- ---------- 100000000-85f7fffff : System RAM 8929 69.7 547600000-54785d23f : Kernel data 1240 9.7 546a00000-5474bdfff : Kernel rodata 490 3.8 5480ce000-5485fffff : Kernel bss 121 0.9 0-fff : Reserved 3860 30.1 100000-89c01fff : System RAM 18 0.1 8a22c000-8df6efff : System RAM 5 0.0 Others: - 'perf test' gets --runs-per-test option to run the test cases repeatedly. This would be helpful to see if it's flaky - Add 'parse_events' method to Python perf extension module, so that users can use the same event parsing logic in the python code. One more step towards implementing perf tools in Python. :) - Support opening tracepoint events without libtraceevent. This will be helpful if it won't use the tracing data like in 'perf stat' - Update ARM Neoverse N2/V2 JSON events and metrics" * tag 'perf-tools-for-v6.14-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (176 commits) perf test: Update event_groups test to use instructions perf bench: Fix undefined behavior in cmpworker() perf annotate: Prefer passing evsel to evsel->core.idx perf lock: Rename fields in lock_type_table perf lock: Add percpu-rwsem for type filter perf lock: Fix parse_lock_type which only retrieve one lock flag perf lock: Fix return code for functions in __cmd_contention perf hist: Fix width calculation in hpp__fmt() perf hist: Fix bogus profiles when filters are enabled perf hist: Deduplicate cmp/sort/collapse code perf test: Improve verbose documentation perf test: Add a runs-per-test flag perf test: Fix parallel/sequential option documentation perf test: Send list output to stdout rather than stderr perf test: Rename functions and variables for better clarity perf tools: Expose quiet/verbose variables in Makefile.perf perf config: Add a function to set one variable in .perfconfig perf test perftool_testsuite: Return correct value for skipping perf test perftool_testsuite: Add missing description perf test record+probe_libc_inet_pton: Make test resilient ...
| * Merge remote-tracking branch 'torvalds/master' into perf-tools-nextArnaldo Carvalho de Melo2024-12-131-2/+16
| |\ | | | | | | | | | | | | | | | To get the fixes that went thru perf-tools for v6.13. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
| * | tool api fs: Correctly encode errno for read/write open failuresIan Rogers2024-12-091-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch from returning -1 to -errno so that callers can determine types of failure. Reviewed-by: Namhyung Kim <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dominique Martinet <[email protected]> Cc: Ilkka Koskinen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steinar H. Gunderson <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Thomas Falcon <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Yang Li <[email protected]> Cc: Ze Gao <[email protected]> Cc: Zixian Cai <[email protected]> Cc: zhaimingbing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
| * | libperf cpumap: Grow array of read CPUs in smaller incrementsIan Rogers2024-12-091-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of growing the array by 2048, grow by the larger of the current range or 16. As ranges are typical for things like the online CPUs this will mean a single allocation happens. While uncore CPU maps will grow 16 at a time which is a value that is generous except say on large servers. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kyle Meyer <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
| * | libperf cpumap: Remove perf_cpu_map__read()Ian Rogers2024-12-094-60/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Function is no longer used and duplicates the parsing logic from perf_cpu_map__new(). Remove to allow simplification. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kyle Meyer <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Applied manually to cope with "libperf cpumap: Refactor perf_cpu_map__merge()" ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
| * | libperf cpumap: Remove use of perf_cpu_map__read()Ian Rogers2024-12-091-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove use of a FILE and switch to reading a string that is then passed to perf_cpu_map__new(). Being able to remove perf_cpu_map__read() avoids duplicated parsing logic. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kyle Meyer <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
| * | libperf cpumap: Be tolerant of newline at the end of a cpumaskIan Rogers2024-12-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | File cpumasks often have a newline that shouldn't trigger the invalid parsing case in perf_cpu_map__new(). Reviewed-by: Leo Yan <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kyle Meyer <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
| * | libperf cpumap: Hide/reduce scope of MAX_NR_CPUSIan Rogers2024-12-092-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid redefinition of MAX_NR_CPUS as a global constant, the original definition is tools/perf/perf.h. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kyle Meyer <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
| * | perf: Increase MAX_NR_CPUS to 4096Kyle Meyer2024-12-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Systems have surpassed 2048 CPUs. Increase MAX_NR_CPUS to 4096. Bitmaps declared with MAX_NR_CPUS bits will increase from 256B to 512B, cpus_runtime will increase from 81960B to 163880B, and max_entries will increase from 8192B to 16384B. Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: Leo Yan <[email protected]> Signed-off-by: Kyle Meyer <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
| * | libperf cpumap: Refactor perf_cpu_map__merge()Leo Yan2024-12-093-25/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The perf_cpu_map__merge() function has two arguments, 'orig' and 'other'. The function definition might cause confusion as it could give the impression that the CPU maps in the two arguments are copied into a new allocated structure, which is then returned as the result. The purpose of the function is to merge the CPU map 'other' into the CPU map 'orig'. This commit changes the 'orig' argument to a pointer to pointer, so the new result will be updated into 'orig'. The return value is changed to an int type, as an error number or 0 for success. Update callers and tests for the new function definition. Reviewed-by: Adrian Hunter <[email protected]> Signed-off-by: Leo Yan <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
* | | libbpf: Work around kernel inconsistently stripping '.llvm.' suffixAndrii Nakryiko2025-01-171-1/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some versions of kernel were stripping out '.llvm.<hash>' suffix from kerne symbols (produced by Clang LTO compilation) from function names reported in available_filter_functions, while kallsyms reported full original name. This confuses libbpf's multi-kprobe logic of finding all matching kernel functions for specified user glob pattern by joining available_filter_functions and kallsyms contents, because joining by full symbol name won't work for symbols containing '.llvm.<hash>' suffix. This was eventually fixed by [0] in the kernel, but we'd like to not regress multi-kprobe experience and add a work around for this bug on libbpf side, stripping kallsym's name if it matches user pattern and contains '.llvm.' suffix. [0] fb6a421fb615 ("kallsyms: Match symbols exactly with CONFIG_LTO_CLANG") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
* | | libbpf: Fix incorrect traversal end type ID when marking BTF_IS_EMBEDDEDPu Lehui2025-01-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When redirecting the split BTF to the vmlinux base BTF, we need to mark the distilled base struct/union members of split BTF structs/unions in id_map with BTF_IS_EMBEDDED. This indicates that these types must match both name and size later. Therefore, we need to traverse the entire split BTF, which involves traversing type IDs from nr_dist_base_types to nr_types. However, the current implementation uses an incorrect traversal end type ID, so let's correct it. Fixes: 19e00c897d50 ("libbpf: Split BTF relocation") Signed-off-by: Pu Lehui <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
* | | libbpf: Fix return zero when elf_begin failedPu Lehui2025-01-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The error number of elf_begin is omitted when encapsulating the btf_find_elf_sections function. Fixes: c86f180ffc99 ("libbpf: Make btf_parse_elf process .BTF.base transparently") Signed-off-by: Pu Lehui <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
* | | libbpf: Add unique_match option for multi kprobeYonghong Song2025-01-102-2/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jordan reported an issue in Meta production environment where func try_to_wake_up() is renamed to try_to_wake_up.llvm.<hash>() by clang compiler at lto mode. The original 'kprobe/try_to_wake_up' does not work any more since try_to_wake_up() does not match the actual func name in /proc/kallsyms. There are a couple of ways to resolve this issue. For example, in attach_kprobe(), we could do lookup in /proc/kallsyms so try_to_wake_up() can be replaced by try_to_wake_up.llvm.<hach>(). Or we can force users to use bpf_program__attach_kprobe() where they need to lookup /proc/kallsyms to find out try_to_wake_up.llvm.<hach>(). But these two approaches requires extra work by either libbpf or user. Luckily, suggested by Andrii, multi kprobe already supports wildcard ('*') for symbol matching. In the above example, 'try_to_wake_up*' can match to try_to_wake_up() or try_to_wake_up.llvm.<hash>() and this allows bpf prog works for different kernels as some kernels may have try_to_wake_up() and some others may have try_to_wake_up.llvm.<hash>(). The original intention is to kprobe try_to_wake_up() only, so an optional field unique_match is added to struct bpf_kprobe_multi_opts. If the field is set to true, the number of matched functions must be one. Otherwise, the attachment will fail. In the above case, multi kprobe with 'try_to_wake_up*' and unique_match preserves user functionality. Reported-by: Jordan Rome <[email protected]> Suggested-by: Andrii Nakryiko <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
* | | libbpf: Set MFD_NOEXEC_SEAL when creating memfdDaniel Xu2024-12-301-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Starting from 105ff5339f49 ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC") and until 1717449b4417 ("memfd: drop warning for missing exec-related flags"), the kernel would print a warning if neither MFD_NOEXEC_SEAL nor MFD_EXEC is set in memfd_create(). If libbpf runs on on a kernel between these two commits (eg. on an improperly backported system), it'll trigger this warning. To avoid this warning (and also be more secure), explicitly set MFD_NOEXEC_SEAL. But since libbpf can be run on potentially very old kernels, leave a fallback for kernels without MFD_NOEXEC_SEAL support. Signed-off-by: Daniel Xu <[email protected]> Link: https://lore.kernel.org/r/6e62c2421ad7eb1da49cbf16da95aaaa7f94d394.1735594195.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <[email protected]>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfAlexei Starovoitov2024-12-161-2/+16
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cross-merge bpf fixes after downstream PR. No conflicts. Adjacent changes in: Auto-merging include/linux/bpf.h Auto-merging include/linux/bpf_verifier.h Auto-merging kernel/bpf/btf.c Auto-merging kernel/bpf/verifier.c Auto-merging kernel/trace/bpf_trace.c Auto-merging tools/testing/selftests/bpf/progs/test_tp_btf_nullable.c Signed-off-by: Alexei Starovoitov <[email protected]>
| * | libperf: evlist: Fix --cpu argument on hybrid platformJames Clark2024-12-111-2/+16
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the linked fixes: commit, specifying a CPU on hybrid platforms results in an error because Perf tries to open an extended type event on "any" CPU which isn't valid. Extended type events can only be opened on CPUs that match the type. Before (working): $ perf record --cpu 1 -- true [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ] After (not working): $ perf record -C 1 -- true WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P' Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/). /bin/dmesg | grep -i perf may provide additional information. (Ignore the warning message, that's expected and not particularly relevant to this issue). This is because perf_cpu_map__intersect() of the user specified CPU (1) and one of the PMU's CPUs (16-27) correctly results in an empty (NULL) CPU map. However for the purposes of opening an event, libperf converts empty CPU maps into an any CPU (-1) which the kernel rejects. Fix it by deleting evsels with empty CPU maps in the specific case where user requested CPU maps are evaluated. Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events") Reviewed-by: Ian Rogers <[email protected]> Tested-by: Thomas Falcon <[email protected]> Signed-off-by: James Clark <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
* | libbpf: prog load: Allow to use fd_array_cntAnton Protopopov2024-12-132-2/+6
| | | | | | | | | | | | | | | | | | | | Add new fd_array_cnt field to bpf_prog_load_opts and pass it in bpf_attr, if set. Signed-off-by: Anton Protopopov <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
* | libbpf: Extend linker API to support in-memory ELF filesAlastair Robertson2024-12-123-21/+150
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new_fd and add_fd functions correspond to the original new and add_file functions, but accept an FD instead of a file name. This gives API consumers the option of using anonymous files/memfds to avoid writing ELFs to disk. This new API will be useful for performing linking as part of bpftrace's JIT compilation. The add_buf function is a convenience wrapper that does the work of creating a memfd for the caller. Signed-off-by: Alastair Robertson <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
* | libbpf: Pull file-opening logic up to top-level functionsAlastair Robertson2024-12-121-42/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | Move the filename arguments and file-descriptor handling from init_output_elf() and linker_load_obj_file() and instead handle them at the top-level in bpf_linker__new() and bpf_linker__add_file(). This will allow the inner functions to be shared with a new, non-filename-based, API in the next commit. Signed-off-by: Alastair Robertson <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
* | libbpf: Fix segfault due to libelf functions not setting errnoQuentin Monnet2024-12-051-14/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Libelf functions do not set errno on failure. Instead, it relies on its internal _elf_errno value, that can be retrieved via elf_errno (or the corresponding message via elf_errmsg()). From "man libelf": If a libelf function encounters an error it will set an internal error code that can be retrieved with elf_errno. Each thread maintains its own separate error code. The meaning of each error code can be determined with elf_errmsg, which returns a string describing the error. As a consequence, libbpf should not return -errno when a function from libelf fails, because an empty value will not be interpreted as an error and won't prevent the program to stop. This is visible in bpf_linker__add_file(), for example, where we call a succession of functions that rely on libelf: err = err ?: linker_load_obj_file(linker, filename, opts, &obj); err = err ?: linker_append_sec_data(linker, &obj); err = err ?: linker_append_elf_syms(linker, &obj); err = err ?: linker_append_elf_relos(linker, &obj); err = err ?: linker_append_btf(linker, &obj); err = err ?: linker_append_btf_ext(linker, &obj); If the object file that we try to process is not, in fact, a correct object file, linker_load_obj_file() may fail with errno not being set, and return 0. In this case we attempt to run linker_append_elf_sysms() and may segfault. This can happen (and was discovered) with bpftool: $ bpftool gen object output.o sample_ret0.bpf.c libbpf: failed to get ELF header for sample_ret0.bpf.c: invalid `Elf' handle zsh: segmentation fault (core dumped) bpftool gen object output.o sample_ret0.bpf.c Fix the issue by returning a non-null error code (-EINVAL) when libelf functions fail. Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs") Signed-off-by: Quentin Monnet <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
* | libbpf: Improve debug message when the base BTF cannot be foundBen Olson2024-12-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running `bpftool` on a kernel module installed in `/lib/modules...`, this error is encountered if the user does not specify `--base-btf` to point to a valid base BTF (e.g. usually in `/sys/kernel/btf/vmlinux`). However, looking at the debug output to determine the cause of the error simply says `Invalid BTF string section`, which does not point to the actual source of the error. This just improves that debug message to tell users what happened. Signed-off-by: Ben Olson <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/Z0YqzQ5lNz7obQG7@bolson-desk Signed-off-by: Alexei Starovoitov <[email protected]>
* | libbpf: don't adjust USDT semaphore address if .stapsdt.base addr is missingAndrii Nakryiko2024-12-021-1/+1
|/ | | | | | | | | | | | | | | | | | | USDT ELF note optionally can record an offset of .stapsdt.base, which is used to make adjustments to USDT target attach address. Currently, libbpf will do this address adjustment unconditionally if it finds .stapsdt.base ELF section in target binary. But there is a corner case where .stapsdt.base ELF section is present, but specific USDT note doesn't reference it. In such case, libbpf will basically just add base address and end up with absolutely incorrect USDT target address. This adjustment has to be done only if both .stapsdt.sema section is present and USDT note is recording a reference to it. Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic") Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
* Merge tag 'perf-tools-for-v6.13-2024-11-24' of ↵Linus Torvalds2024-11-266-4/+145
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Namhyung Kim: "perf record: - Enable leader sampling for inherited task events. It was supported only for system-wide events but the kernel started to support such a setup since v6.12. This is to reduce the number of PMU interrupts. The samples of the leader event will contain counts of other events and no samples will be generated for the other member events. $ perf record -e '{cycles,instructions}:S' ${MYPROG} perf report: - Fix --branch-history option to display more branch-related information like prediction, abort and cycles which is available on Intel machines. $ perf record -bg -- perf test -w brstack $ perf report --branch-history ... # # Overhead Source:Line Symbol Shared Object Predicted Abort Cycles IPC [IPC Coverage] # ........ ........................ .............. .................... ......... ..... ...... .................... # 8.17% copy_page_64.S:19 [k] copy_page [kernel.kallsyms] 50.0% 0 5 - - | ---xas_load xarray.h:171 | |--5.68%--xas_load xarray.c:245 (cycles:1) | xas_load xarray.c:242 | xas_load xarray.h:1260 (cycles:1) | xas_descend xarray.c:146 | xas_load xarray.c:244 (cycles:2) | xas_load xarray.c:245 | xas_descend xarray.c:218 (cycles:10) ... perf stat: - Add HWMON PMU support. The HWMON provides various system information like CPU/GPU temperature, fan speed and so on. Expose them as PMU events so that users can see the values using perf stat commands. $ perf stat -e temp_cpu,fan1 true Performance counter stats for 'true': 60.00 'C temp_cpu 0 rpm fan1 0.000745382 seconds time elapsed 0.000883000 seconds user 0.000000000 seconds sys - Display metric threshold in JSON output. Some metrics define thresholds to classify value ranges. It used to be in a different color but it won't work for JSON. Add "metric-threshold" field to the JSON that can be one of "good", "less good", "nearly bad" and "bad". # perf stat -a -M TopdownL1 -j true {"counter-value" : "18693525.000000", "unit" : "", "event" : "TOPDOWN.SLOTS", "event-runtime" : 5552708, "pcnt-running" : 100.00, "metric-value" : "43.226002", "metric-unit" : "% tma_backend_bound", "metric-threshold" : "bad"} {"metric-value" : "29.212267", "metric-unit" : "% tma_frontend_bound", "metric-threshold" : "bad"} {"metric-value" : "7.138972", "metric-unit" : "% tma_bad_speculation", "metric-threshold" : "good"} {"metric-value" : "20.422759", "metric-unit" : "% tma_retiring", "metric-threshold" : "good"} {"counter-value" : "3817732.000000", "unit" : "", "event" : "topdown-retiring", "event-runtime" : 5552708, "pcnt-running" : 100.00, } {"counter-value" : "5472824.000000", "unit" : "", "event" : "topdown-fe-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, } {"counter-value" : "7984780.000000", "unit" : "", "event" : "topdown-be-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, } {"counter-value" : "1418181.000000", "unit" : "", "event" : "topdown-bad-spec", "event-runtime" : 5552708, "pcnt-running" : 100.00, } ... perf sched: - Add -P/--pre-migrations option for 'timehist' sub-command to track time a task waited on a run-queue before migrating to a different CPU. $ perf sched timehist -P time cpu task name wait time sch delay run time pre-mig time [tid/pid] (msec) (msec) (msec) (msec) --------------- ------ ------------------------------ --------- --------- --------- --------- 585940.535527 [0000] perf[584885] 0.000 0.000 0.000 0.000 585940.535535 [0000] migration/0[20] 0.000 0.002 0.008 0.000 585940.535559 [0001] perf[584885] 0.000 0.000 0.000 0.000 585940.535563 [0001] migration/1[25] 0.000 0.001 0.004 0.000 585940.535678 [0002] perf[584885] 0.000 0.000 0.000 0.000 585940.535686 [0002] migration/2[31] 0.000 0.002 0.008 0.000 585940.535905 [0001] <idle> 0.000 0.000 0.342 0.000 585940.535938 [0003] perf[584885] 0.000 0.000 0.000 0.000 585940.537048 [0001] sleep[584886] 0.000 0.019 1.142 0.001 585940.537749 [0002] <idle> 0.000 0.000 2.062 0.000 ... Build: - Make libunwind opt-in (LIBUNWIND=1) rather than opt-out. The perf tools are generally built with libelf and libdw which has unwinder functionality. The libunwind support predates it and no need to have duplicate unwinders by default. - Rename NO_DWARF=1 build option to NO_LIBDW=1 in order to clarify it's using libdw for handling DWARF information. Internals: - Do not set exclude_guest bit in the perf_event_attr by default. This was causing a trouble in AMD IBS PMU as it doesn't support the bit. The bit will be set when it's needed later by the fallback logic. Also update the missing feature detection logic to make sure not clear supported bits unnecessarily. - Run perf test in parallel by default and mark flaky tests "exclusive" to run them serially at the end. Some test numbers are changed but the test can complete in less than half the time. JSON vendor events: - Add AMD Zen 5 events and metrics. - Add i.MX91 and i.MX95 DDR metrics - Fix HiSilicon HIP08 Topdown metric name. - Support compat events on PowerPC" * tag 'perf-tools-for-v6.13-2024-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (232 commits) perf tests: Fix hwmon parsing with PMU name test perf hwmon_pmu: Ensure hwmon key union is zeroed before use perf tests hwmon_pmu: Remove double evlist__delete() perf/test: fix perf ftrace test on s390 perf bpf-filter: Return -ENOMEM directly when pfi allocation fails perf test: Correct hwmon test PMU detection perf: Remove unused del_perf_probe_events() perf pmu: Move pmu_metrics_table__find and remove ARM override perf jevents: Add map_for_cpu() perf header: Pass a perf_cpu rather than a PMU to get_cpuid_str perf header: Avoid transitive PMU includes perf arm64 header: Use cpu argument in get_cpuid perf header: Refactor get_cpuid to take a CPU for ARM perf header: Move is_cpu_online to numa bench perf jevents: fix breakage when do perf stat on system metric perf test: Add missing __exit calls in tool/hwmon tests perf tests: Make leader sampling test work without branch event perf util: Remove kernel version deadcode perf test shell trace_exit_race: Use --no-comm to avoid cases where COMM isn't resolved perf test shell trace_exit_race: Show what went wrong in verbose mode ...
| * libsubcmd: Move va_end() before exitLuo Yifan2024-11-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes a minor adjustment by moving the va_end call before exit. Since the exit() function terminates the program, any code after exit(128) (i.e., va_end(params)) is unreachable and thus not executed. Placing va_end before exit ensures that the va_list is properly cleaned up. Signed-off-by: Luo Yifan <[email protected]> Cc: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
| * tools api io: Ensure line_len_out is always initializedIan Rogers2024-11-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure initialization to avoid compiler warnings about potential use of uninitialized variables. Signed-off-by: Ian Rogers <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Yoshihiro Furudera <[email protected]> Cc: Howard Chu <[email protected]> Cc: Ze Gao <[email protected]> Cc: Changbin Du <[email protected]> Cc: Junhao He <[email protected]> Cc: Weilin Wang <[email protected]> Cc: James Clark <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Athira Jajeev <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
| * tools subcmd: Add non-waitpid check_if_command_finished()Ian Rogers2024-10-281-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using waitpid can cause stdout/stderr of the child process to be lost. Use Linux's /prod/<pid>/status file to determine if the process has reached the zombie state. Use the 'status' file rather than 'stat' to avoid issues around skipping the process name. Tested-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Howard Chu <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Ilya Leoshkevich <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Dapeng Mi <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Veronika Molnarova <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
| * Merge tag 'v6.12-rc3' into perf-tools-nextNamhyung Kim2024-10-1416-75/+145
| |\ | | | | | | | | | | | | | | | To get the fixes in the current perf-tools tree. Signed-off-by: Namhyung Kim <[email protected]>
| * | tools/perf: Correctly calculate sample period for inherited SAMPLE_READ valuesBen Gainey2024-10-022-2/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sample period calculation in deliver_sample_value is updated to calculate the per-thread period delta for events that are inherit + PERF_SAMPLE_READ. When the sampling event has this configuration, the read_format.id is used with the tid from the sample to lookup the storage of the previously accumulated counter total before calculating the delta. All existing valid configurations where read_format.value represents some global value continue to use just the read_format.id to locate the storage of the previously accumulated total. perf_sample_id is modified to support tracking per-thread values, along with the existing global per-id values. In the per-thread case, values are stored in a hash by tid within the perf_sample_id, and are dynamically allocated as the number is not known ahead of time. Signed-off-by: Ben Gainey <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
| * | libperf: Explicitly specify install-html dependenciesAkihiko Odaki2024-09-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | install_doc of tools/lib/perf/Makefile invokes install-man, install-html, and install-examples of tools/lib/perf/Documentation/Makefile at once. This invocation succeeds when make runs in serial but can fail when make runs in parallel because while install-man of tools/lib/perf/Documentation/Makefile depends on all, install-html depends on nothing and can run ahead of all. Explicitly specify the dependencies of install-html to ensure that they are resolved before install-html. Signed-off-by: Akihiko Odaki <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
* | | Merge tag 'mm-nonmm-stable-2024-11-24-02-05' of ↵Linus Torvalds2024-11-261-2/+0
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - The series "resource: A couple of cleanups" from Andy Shevchenko performs some cleanups in the resource management code - The series "Improve the copy of task comm" from Yafang Shao addresses possible race-induced overflows in the management of task_struct.comm[] - The series "Remove unnecessary header includes from {tools/}lib/list_sort.c" from Kuan-Wei Chiu adds some cleanups and a small fix to the list_sort library code and to its selftest - The series "Enhance min heap API with non-inline functions and optimizations" also from Kuan-Wei Chiu optimizes and cleans up the min_heap library code - The series "nilfs2: Finish folio conversion" from Ryusuke Konishi finishes off nilfs2's folioification - The series "add detect count for hung tasks" from Lance Yang adds more userspace visibility into the hung-task detector's activity - Apart from that, singelton patches in many places - please see the individual changelogs for details * tag 'mm-nonmm-stable-2024-11-24-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits) gdb: lx-symbols: do not error out on monolithic build kernel/reboot: replace sprintf() with sysfs_emit() lib: util_macros_kunit: add kunit test for util_macros.h util_macros.h: fix/rework find_closest() macros Improve consistency of '#error' directive messages ocfs2: fix uninitialized value in ocfs2_file_read_iter() hung_task: add docs for hung_task_detect_count hung_task: add detect count for hung tasks dma-buf: use atomic64_inc_return() in dma_buf_getfile() fs/proc/kcore.c: fix coccinelle reported ERROR instances resource: avoid unnecessary resource tree walking in __region_intersects() ocfs2: remove unused errmsg function and table ocfs2: cluster: fix a typo lib/scatterlist: use sg_phys() helper checkpatch: always parse orig_commit in fixes tag nilfs2: convert metadata aops from writepage to writepages nilfs2: convert nilfs_recovery_copy_block() to take a folio nilfs2: convert nilfs_page_count_clean_buffers() to take a folio nilfs2: remove nilfs_writepage nilfs2: convert checkpoint file to be folio-based ...
| * | | tools/lib/list_sort: remove unnecessary header includesKuan-Wei Chiu2024-11-061-2/+0
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since lib/list_sort.c no longer requires ARRAY_SIZE() and memset(), the includes for kernel.h, bug.h, and string.h have been removed. Similarly, tools/lib/list_sort.c also does not need to include these headers, so they have been removed as well. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kuan-Wei Chiu <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Ching-Chun (Jim) Huang <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: "Liang, Kan" <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
* | | Merge tag 'bpf-next-6.13' of ↵Linus Torvalds2024-11-2126-474/+915
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: - Add BPF uprobe session support (Jiri Olsa) - Optimize uprobe performance (Andrii Nakryiko) - Add bpf_fastcall support to helpers and kfuncs (Eduard Zingerman) - Avoid calling free_htab_elem() under hash map bucket lock (Hou Tao) - Prevent tailcall infinite loop caused by freplace (Leon Hwang) - Mark raw_tracepoint arguments as nullable (Kumar Kartikeya Dwivedi) - Introduce uptr support in the task local storage map (Martin KaFai Lau) - Stringify errno log messages in libbpf (Mykyta Yatsenko) - Add kmem_cache BPF iterator for perf's lock profiling (Namhyung Kim) - Support BPF objects of either endianness in libbpf (Tony Ambardar) - Add ksym to struct_ops trampoline to fix stack trace (Xu Kuohai) - Introduce private stack for eligible BPF programs (Yonghong Song) - Migrate samples/bpf tests to selftests/bpf test_progs (Daniel T. Lee) - Migrate test_sock to selftests/bpf test_progs (Jordan Rife) * tag 'bpf-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (152 commits) libbpf: Change hash_combine parameters from long to unsigned long selftests/bpf: Fix build error with llvm 19 libbpf: Fix memory leak in bpf_program__attach_uprobe_multi bpf: use common instruction history across all states bpf: Add necessary migrate_disable to range_tree. bpf: Do not alloc arena on unsupported arches selftests/bpf: Set test path for token/obj_priv_implicit_token_envvar selftests/bpf: Add a test for arena range tree algorithm bpf: Introduce range_tree data structure and use it in bpf arena samples/bpf: Remove unused variable in xdp2skb_meta_kern.c samples/bpf: Remove unused variables in tc_l2_redirect_kern.c bpftool: Cast variable `var` to long long bpf, x86: Propagate tailcall info only for subprogs bpf: Add kernel symbol for struct_ops trampoline bpf: Use function pointers count as struct_ops links count bpf: Remove unused member rcu from bpf_struct_ops_map selftests/bpf: Add struct_ops prog private stack tests bpf: Support private stack for struct_ops progs selftests/bpf: Add tracing prog private stack tests bpf, x86: Support private stack in jit ...
| * | | libbpf: Change hash_combine parameters from long to unsigned longSidong Yang2024-11-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The hash_combine() could be trapped when compiled with sanitizer like "zig cc" or clang with signed-integer-overflow option. This patch parameters and return type to unsigned long to remove the potential overflow. Signed-off-by: Sidong Yang <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
| * | | libbpf: Fix memory leak in bpf_program__attach_uprobe_multiJiri Olsa2024-11-151-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andrii reported memory leak detected by Coverity on error path in bpf_program__attach_uprobe_multi. Fixing that by moving the check earlier before the offsets allocations. Reported-by: Andrii Nakryiko <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfAlexei Starovoitov2024-11-132-1/+5
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cross-merge bpf fixes after downstream PR. In particular to bring the fix in commit aa30eb3260b2 ("bpf: Force checkpoint when jmp history is too long"). The follow up verifier work depends on it. And the fix in commit 6801cf7890f2 ("selftests/bpf: Use -4095 as the bad address for bits iterator"). It's fixing instability of BPF CI on s390 arch. No conflicts. Adjacent changes in: Auto-merging arch/Kconfig Auto-merging kernel/bpf/helpers.c Auto-merging kernel/bpf/memalloc.c Auto-merging kernel/bpf/verifier.c Auto-merging mm/slab_common.c Signed-off-by: Alexei Starovoitov <[email protected]>
| * | | | libbpf: Stringify errno in log messages in the remaining codeMykyta Yatsenko2024-11-126-53/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert numeric error codes into the string representations in log messages in the rest of libbpf source files. Signed-off-by: Mykyta Yatsenko <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
| * | | | libbpf: Stringify errno in log messages in btf*.cMykyta Yatsenko2024-11-122-13/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert numeric error codes into the string representations in log messages in btf.c and btf_dump.c. Signed-off-by: Mykyta Yatsenko <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
| * | | | libbpf: Stringify errno in log messages in libbpf.cMykyta Yatsenko2024-11-121-200/+156
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert numeric error codes into the string representations in log messages in libbpf.c. Signed-off-by: Mykyta Yatsenko <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
| * | | | libbpf: Introduce errstr() for stringifying errnoMykyta Yatsenko2024-11-122-0/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add function errstr(int err) that allows converting numeric error codes into string representations. Signed-off-by: Mykyta Yatsenko <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
| * | | | libbpf: Add support for uprobe multi session attachJiri Olsa2024-11-113-3/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding support to attach program in uprobe session mode with bpf_program__attach_uprobe_multi function. Adding session bool to bpf_uprobe_multi_opts struct that allows to load and attach the bpf program via uprobe session. the attachment to create uprobe multi session. Also adding new program loader section that allows: SEC("uprobe.session/bpf_fentry_test*") and loads/attaches uprobe program as uprobe session. Adding sleepable hook (uprobe.session.s) as well. Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
| * | | | bpf: Add support for uprobe multi session attachJiri Olsa2024-11-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding support to attach BPF program for entry and return probe of the same function. This is common use case which at the moment requires to create two uprobe multi links. Adding new BPF_TRACE_UPROBE_SESSION attach type that instructs kernel to attach single link program to both entry and exit probe. It's possible to control execution of the BPF program on return probe simply by returning zero or non zero from the entry BPF program execution to execute or not the BPF program on return probe respectively. Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
| * | | | libbpf: start v1.6 development cycleAndrii Nakryiko2024-10-292-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With libbpf v1.5.0 release out, start v1.6 dev cycle. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
| * | | | libbpf: define __uptr.Kui-Feng Lee2024-10-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make __uptr available to BPF programs to enable them to define uptrs. Acked-by: Andrii Nakryiko <[email protected]> Signed-off-by: Kui-Feng Lee <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
| * | | | libbpf: move global data mmap()'ing into bpf_object__load()Andrii Nakryiko2024-10-241-43/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since BPF skeleton inception libbpf has been doing mmap()'ing of global data ARRAY maps in bpf_object__load_skeleton() API, which is used by code generated .skel.h files (i.e., by BPF skeletons only). This is wrong because if BPF object is loaded through generic bpf_object__load() API, global data maps won't be re-mmap()'ed after load step, and memory pointers returned from bpf_map__initial_value() would be wrong and won't reflect the actual memory shared between BPF program and user space. bpf_map__initial_value() return result is rarely used after load, so this went unnoticed for a really long time, until bpftrace project attempted to load BPF object through generic bpf_object__load() API and then used BPF subskeleton instantiated from such bpf_object. It turned out that .data/.rodata/.bss data updates through such subskeleton was "blackholed", all because libbpf wouldn't re-mmap() those maps during bpf_object__load() phase. Long story short, this step should be done by libbpf regardless of BPF skeleton usage, right after BPF map is created in the kernel. This patch moves this functionality into bpf_object__populate_internal_map() to achieve this. And bpf_object__load_skeleton() is now simple and almost trivial, only propagating these mmap()'ed pointers into user-supplied skeleton structs. We also do trivial adjustments to error reporting inside bpf_object__populate_internal_map() for consistency with the rest of libbpf's map-handling code. Reported-by: Alastair Robertson <[email protected]> Reported-by: Jonathan Wiepert <[email protected]> Fixes: d66562fba1ce ("libbpf: Add BPF object skeleton support") Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
| * | | | libsubcmd: Silence compiler warningEder Zulian2024-10-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initialize the pointer 'o' in options__order to NULL to prevent a compiler warning/error which is observed when compiling with the '-Og' option, but is not emitted by the compiler with the current default compilation options. For example, when compiling libsubcmd with $ make "EXTRA_CFLAGS=-Og" -C tools/lib/subcmd/ clean all Clang version 17.0.6 and GCC 13.3.1 fail to compile parse-options.c due to following error: parse-options.c: In function ‘options__order’: parse-options.c:832:9: error: ‘o’ may be used uninitialized [-Werror=maybe-uninitialized] 832 | memcpy(&ordered[nr_opts], o, sizeof(*o)); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ parse-options.c:810:30: note: ‘o’ was declared here 810 | const struct option *o, *p = opts; | ^ cc1: all warnings being treated as errors Signed-off-by: Eder Zulian <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
| * | | | libbpf: Prevent compiler warnings/errorsEder Zulian2024-10-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initialize 'new_off' and 'pad_bits' to 0 and 'pad_type' to NULL in btf_dump_emit_bit_padding to prevent compiler warnings/errors which are observed when compiling with 'EXTRA_CFLAGS=-g -Og' options, but do not happen when compiling with current default options. For example, when compiling libbpf with $ make "EXTRA_CFLAGS=-g -Og" -C tools/lib/bpf/ clean all Clang version 17.0.6 and GCC 13.3.1 fail to compile btf_dump.c due to following errors: btf_dump.c: In function ‘btf_dump_emit_bit_padding’: btf_dump.c:903:42: error: ‘new_off’ may be used uninitialized [-Werror=maybe-uninitialized] 903 | if (new_off > cur_off && new_off <= next_off) { | ~~~~~~~~^~~~~~~~~~~ btf_dump.c:870:13: note: ‘new_off’ was declared here 870 | int new_off, pad_bits, bits, i; | ^~~~~~~ btf_dump.c:917:25: error: ‘pad_type’ may be used uninitialized [-Werror=maybe-uninitialized] 917 | btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 918 | in_bitfield ? new_off - cur_off : 0); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ btf_dump.c:871:21: note: ‘pad_type’ was declared here 871 | const char *pad_type; | ^~~~~~~~ btf_dump.c:930:20: error: ‘pad_bits’ may be used uninitialized [-Werror=maybe-uninitialized] 930 | if (bits == pad_bits) { | ^ btf_dump.c:870:22: note: ‘pad_bits’ was declared here 870 | int new_off, pad_bits, bits, i; | ^~~~~~~~ cc1: all warnings being treated as errors Signed-off-by: Eder Zulian <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
| * | | | libbpf: Fix possible compiler warnings in hashmapNamhyung Kim2024-10-111-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The hashmap__for_each_entry[_safe] is accessing 'map' as a pointer. But it does without parentheses so passing a static hash map with an ampersand (like '&slab_hash') will cause compiler warnings due to unmatched types as '->' operator has a higher precedence. Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]