| Commit message (Collapse) | Author | Age | Files | Lines | |
|---|---|---|---|---|---|
| * | perf flamegraph: Fix minor pylint/type hint issues | Ian Rogers | 2025-07-16 | 1 | -28/+33 |
| | | | | | | | | | | | | Switch to assuming python3. Fix minor pylint issues on line length, repeated compares, not using f-strings and variable case. Add type hints and check with mypy. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf script: Add -e option to flamegraph script | Tianyou Li | 2025-06-24 | 1 | -1/+14 |
| | | | | | | | | | | | | | | | | | | | When processing the perf data file generated with multiple events, the flamegraph script will count all the events regardless of different event names. This patch tries to add a -e option to specify the event name that the flamegraph will be generated accordingly. If the -e option omitted, the behavior remains unchanged. Signed-off-by: Tianyou Li <[email protected]> Reviewed-by: Pan Deng <[email protected]> Reviewed-by: Zhiguo Zhou <[email protected]> Reviewed-by: Wangyang Guo <[email protected]> Reviewed-by: Tim Chen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf script: Handle -i option for perf script flamegraph | Tianyou Li | 2025-06-24 | 1 | -1/+7 |
| | | | | | | | | | | | | | | | | | If specify the perf data file with -i option, the script will try to read the header information regardless of the file name specified, instead it will try to access the perf.data. This simple patch use the file name from -i option for command perf report --header-only to read the header. Signed-off-by: Tianyou Li <[email protected]> Reviewed-by: Pan Deng <[email protected]> Reviewed-by: Zhiguo Zhou <[email protected]> Reviewed-by: Wangyang Guo <[email protected]> Reviewed-by: Tim Chen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf scripts python: exported-sql-viewer.py: Fix pattern matching with Python 3 | Adrian Hunter | 2025-05-12 | 1 | -1/+4 |
| | | | | | | | | | | | | | | | | | | | | | | | The script allows the user to enter patterns to find symbols. The pattern matching characters are converted for use in SQL. For PostgreSQL the conversion involves using the Python maketrans() method which is slightly different in Python 3 compared with Python 2. Fix to work in Python 3. Fixes: beda0e725e5f06ac ("perf script python: Add Python3 support to exported-sql-viewer.py") Signed-off-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Tony Jones <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf build: Add pylint build tests | Ian Rogers | 2025-03-24 | 1 | -0/+13 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If PYLINT=1 is passed to the build then run pylint over python code in perf. Unlike shellcheck this isn't default on as there are currently too many errors. An example of an error: ``` ************* Module setup util/setup.py:19:0: C0301: Line too long (127/100) (line-too-long) util/setup.py:20:0: C0301: Line too long (138/100) (line-too-long) util/setup.py:63:0: C0301: Line too long (106/100) (line-too-long) util/setup.py:1:0: C0114: Missing module docstring (missing-module-docstring) util/setup.py:24:4: W0622: Redefining built-in 'vars' (redefined-builtin) util/setup.py:11:4: C0103: Constant name "cc_options" doesn't conform to UPPER_CASE naming style (invalid-name) util/setup.py:13:4: C0103: Constant name "cc_options" doesn't conform to UPPER_CASE naming style (invalid-name) util/setup.py:15:34: R1732: Consider using 'with' for resource-allocating operations (consider-using-with) util/setup.py:18:0: C0116: Missing function or method docstring (missing-function-docstring) util/setup.py:19:16: R1732: Consider using 'with' for resource-allocating operations (consider-using-with) util/setup.py:44:0: C0413: Import "from setuptools import setup, Extension" should be placed at the top of the module (wrong-import-position) util/setup.py:46:0: C0413: Import "from setuptools.command.build_ext import build_ext as _build_ext" should be placed at the top of the module (wrong-import-position) util/setup.py:47:0: C0413: Import "from setuptools.command.install_lib import install_lib as _install_lib" should be placed at the top of the module (wrong-import-position) util/setup.py:49:0: C0115: Missing class docstring (missing-class-docstring) util/setup.py:49:0: C0103: Class name "build_ext" doesn't conform to PascalCase naming style (invalid-name) util/setup.py:52:8: W0201: Attribute 'build_lib' defined outside __init__ (attribute-defined-outside-init) util/setup.py:53:8: W0201: Attribute 'build_temp' defined outside __init__ (attribute-defined-outside-init) util/setup.py:55:0: C0115: Missing class docstring (missing-class-docstring) util/setup.py:55:0: C0103: Class name "install_lib" doesn't conform to PascalCase naming style (invalid-name) util/setup.py:58:8: W0201: Attribute 'build_dir' defined outside __init__ (attribute-defined-outside-init) *----------------------------------------------------------------- Your code has been rated at 6.67/10 (previous run: 6.51/10, +0.16) make[4]: *** [util/Build:442: util/setup.py.pylint_log] Error 1 ``` Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf build: Add mypy build tests | Ian Rogers | 2025-03-24 | 1 | -0/+13 |
| | | | | | | | | | | | | | | | | | | | | | | | | | If MYPY=1 is passed to the build then run mypy over python code in perf. Unlike shellcheck this isn't default on as there are currently too many errors. An example of an error: ``` util/setup.py:8: error: Item "None" of "str | None" has no attribute "split" [union-attr] util/setup.py:15: error: Item "None" of "IO[bytes] | None" has no attribute "readline" [union-attr] util/setup.py:15: error: List item 0 has incompatible type "str | None"; expected "str | bytes | PathLike[str] | PathLike[bytes]" [list-item] util/setup.py:16: error: Unsupported left operand type for + ("None") [operator] util/setup.py:16: note: Left operand is of type "str | None" util/setup.py:74: error: Unsupported left operand type for + ("None") [operator] util/setup.py:74: note: Left operand is of type "str | None" Found 5 errors in 1 file (checked 1 source file) make[4]: *** [util/Build:430: util/setup.py.mypy_log] Error 1 ``` Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf build: Remove Makefile.syscalls | Ian Rogers | 2025-03-21 | 2 | -137/+0 |
| | | | | | | | | | | | | | | | | | | | | | | | | Now a single beauty file is generated and used by all architectures, remove the per-architecture Makefiles, Kbuild files and previous generator script. Note: there was conversation with Charlie Jenkins <[email protected]> and they'd written an alternate approach to support multiple architectures: https://lore.kernel.org/all/20250114-perf_syscall_arch_runtime-v1-1-5b304e408e11@rivosinc.com/ It would have been better to have helped Charlie fix their series (my apologies) but they agreed that the approach taken here was likely best for longer term maintainability: https://lore.kernel.org/lkml/Z6Jk_UN9i69QGqUj@ghost/ Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Howard Chu <[email protected]> Reviewed-by: Charlie Jenkins <[email protected]> Reviewed-by: Namhyung Kim <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf syscalltbl: Remove struct syscalltbl | Ian Rogers | 2025-03-21 | 1 | -23/+13 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The syscalltbl held entries of system call name and number pairs, generated from a native syscalltbl at start up. As there are gaps in the system call number there is a notion of index into the table. Going forward we want the system call table to be identifiable by a machine type, for example, i386 vs x86-64. Change the interface to the syscalltbl so (1) a (currently unused machine type of EM_HOST) is passed (2) the index to syscall number and system call name mapping is computed at build time. Two tables are used for this, an array of system call number to name, an array of system call numbers sorted by the system call name. The sorted array doesn't store strings in part to save memory and relocations. The index notion is carried forward and is an index into the sorted array of system call numbers, the data structures are opaque (held only in syscalltbl.c), and so the number of indices for a machine type is exposed as a new API. The arrays are computed in the syscalltbl.sh script and so no start-up time computation and storage is necessary. Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Howard Chu <[email protected]> Reviewed-by: Charlie Jenkins <[email protected]> Reviewed-by: Namhyung Kim <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf script: Fix hangup in offline flamegraph report | Namhyung Kim | 2025-02-19 | 1 | -1/+1 |
| | | | | | | | | | | | | | | | | | | | | | A recent change in the flamegraph script fixed an issue with live mode but it created another for offline mode. It needs to pass "-" to -i option to read from stdin in the live mode. Actually there's a logic to pass the option in the perf script code, but the script was written with "-- $@" which prevented the option to go to the perf script. So the previous commit added the hard-coded "-i -" to the report command. But it's a problem for the offline mode which expects input from a file and now it's stuck on reading from stdin. Let's remove the "-i - --" part and let it pass the options properly to perf script. Closes: https://lore.kernel.org/linux-perf-users/[email protected] Fixes: 23e0a63c6dd3f69c ("perf script: force stdin for flamegraph in live mode") Reported-by: Thomas Richter <[email protected]> Tested-by: Thomas Richter <[email protected]> Cc: Anubhav Shelat <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf script: force stdin for flamegraph in live mode | Anubhav Shelat | 2025-02-04 | 1 | -1/+1 |
| | | | | | | | | | | | | | | | | | | | | | Currently, running "perf script flamegraph -a -F 99 sleep 1" should produce flamegraph.html containing the flamegraph. Howevever, it gives a segmentation fault. This is caused because the flamegraph.py script is supposed to take as input the output of "perf record", which should be in stdin. This would require passing "-i -" to flamegraph.py. However, the "flamegraph-report" script causes "perf script" command to take the "-i -" option instead of flamegraph.py, which causes no problem for "perf script", but causes a seg fault since flamegraph.py has no input file. To fix this I added the "-i -" option directly to the flamegraph-report script to ensure flamegraph.py gets input from stdin. Signed-off-by: Anubhav Shelat <[email protected]> Tested-by: Michael Petlan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf tools: Create generic syscall table support | Charlie Jenkins | 2025-01-09 | 2 | -0/+147 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently each architecture in perf independently generates syscall headers. Adapt the work that has gone into unifying syscall header implementations in the kernel to work with perf tools. Introduce this framework with riscv at first. riscv previously relied on libaudit, but with this change, perf tools for riscv no longer needs this external dependency. Signed-off-by: Charlie Jenkins <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Guo Ren <[email protected]> Cc: Günther Noack <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mickaël Salaün <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf script: Move script_fetch_insn to trace-event-scripting.c | Ian Rogers | 2024-12-18 | 1 | -1/+1 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add native_arch as a parameter to script_fetch_insn rather than relying on the builtin-script value that won't be initialized for the dlfilter and python Context use cases. Assume both of those cases are running natively. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dapeng Mi <[email protected]> Cc: Howard Chu <[email protected]> Cc: Ilya Leoshkevich <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Veronika Molnarova <[email protected]> Cc: Weilin Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf python: Remove python 2 scripting support | Ian Rogers | 2024-12-18 | 1 | -18/+0 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Python2 was deprecated 4 years ago, remove support and workarounds. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dapeng Mi <[email protected]> Cc: Howard Chu <[email protected]> Cc: Ilya Leoshkevich <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Veronika Molnarova <[email protected]> Cc: Weilin Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf script python: Improve physical mem type resolution | Ian Rogers | 2024-12-09 | 1 | -75/+102 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously system RAM and persistent memory were hard code matched, change so that the label of the memory region is just read from /proc/iomem. This avoids frequent N/A samples. Change the /proc/iomem reading, event processing and output so that nested entries appear and their counts count toward their parent. As labels may be repeated, include the memory ranges in the output to make it clear why, for example, "System RAM" appears twice. Before: Event: mem_inst_retired.all_loads:P Memory type count percentage ---------------------------------------- ---------- ---------- System RAM 9460 96.5% N/A 998 3.5% After: Event: mem_inst_retired.all_loads:P Memory type count percentage ---------------------------------------- ---------- ---------- 100000000-105f7fffff : System RAM 36741 96.5 841400000-8416599ff : Kernel data 89 0.2 840800000-8412a6fff : Kernel rodata 60 0.2 841ebe000-8423fffff : Kernel bss 34 0.1 0-fff : Reserved 1345 3.5 100000-89dd9fff : System RAM 2 0.0 Before: Event: mem_inst_retired.any:P Memory type count percentage ---------------------------------------- ----------- ----------- System RAM 9460 90.5% N/A 998 9.5% After: Event: mem_inst_retired.any:P Memory type count percentage ---------------------------------------- ---------- ---------- 100000000-105f7fffff : System RAM 9460 90.5 841400000-8416599ff : Kernel data 45 0.4 840800000-8412a6fff : Kernel rodata 19 0.2 841ebe000-8423fffff : Kernel bss 12 0.1 0-fff : Reserved 998 9.5 The code has been updated to python 3 with type hints and resolving issues reported by mypy and pylint. Tabs are swapped to spaces as preferred in PEP8, because most lines of code were modified (of this small file) and this makes pylint significantly less noisy. Committer testing: root@number:/tmp# grep -m1 "model name" /proc/cpuinfo model name : Intel(R) Core(TM) i7-14700K root@number:/tmp# root@number:/tmp# perf script mem-phys-addr -a find / /bin /lib /lib64 /sbin Warning: 744 out of order events recorded. Event: cpu_core/mem_inst_retired.all_loads/P Memory type count percentage ---------------------------------------- ---------- ---------- 100000000-8bfbfffff : System RAM 364561 76.5 621400000-6223a6fff : Kernel rodata 10474 2.2 622400000-62283d4bf : Kernel data 4828 1.0 623304000-6237fffff : Kernel bss 1063 0.2 620000000-6213fffff : Kernel code 98 0.0 0-fff : Reserved 111480 23.4 100000-2b0ca017 : System RAM 337 0.1 2fbad000-30d92fff : System RAM 44 0.0 2c79d000-2fbabfff : System RAM 30 0.0 30d94000-316d5fff : System RAM 16 0.0 2b131a58-2c71dfff : System RAM 7 0.0 root@number:/tmp# Signed-off-by: Ian Rogers <[email protected]> Acked-by: Kan Liang <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf script python: Adjust objdump start/end per map pgoff parameter | Steve Clevenger | 2024-11-09 | 1 | -5/+11 |
| | | | | | | | | | | | | | | | | | | | | | | | | | Extract map_pgoff parameter from the dictionary, and adjust start/end range passed to objdump based on the value. A zero start_addr is filtered to prevent output of dso address range check failures. This script repeatedly sees a zero value passed in for start_addr = cpu_data[str(cpu) + 'addr'] These zero values are not a new problem. The start_addr/stop_addr warning clutters the instruction trace output, hence this change. Signed-off-by: Steve Clevenger <[email protected]> Reviewed-by: Leo Yan <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/21ccdd22e664bdeccb878672d4b2c0518873c1e5.1731027120.git.scclevenger@os.amperecomputing.com Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf scripts python cs-etm: Add start and stop arguments | James Clark | 2024-09-24 | 1 | -2/+38 |
| | | | | | | | | | | | | | | | | | | | | | | | | | Make it possible to only disassemble a range of timestamps or sample indexes. This will be used by the test to limit the runtime, but it's also useful for users. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Tested-by: Ganapatrao Kulkarni <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ruidong Tian <[email protected]> Cc: Leo Yan <[email protected]> Cc: Benjamin Gray <[email protected]> Cc: [email protected] Cc: [email protected] Cc: John Garry <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf scripts python cs-etm: Improve arguments | James Clark | 2024-09-24 | 1 | -8/+55 |
| | | | | | | | | | | | | | | | | | | | | | | | | | Make vmlinux detection automatic and use Perf's default objdump when -d is specified. This will make it easier for a test to use the script without having to provide arguments. And similarly for users. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Tested-by: Ganapatrao Kulkarni <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ruidong Tian <[email protected]> Cc: Leo Yan <[email protected]> Cc: Benjamin Gray <[email protected]> Cc: [email protected] Cc: [email protected] Cc: John Garry <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf scripts python cs-etm: Update to use argparse | James Clark | 2024-09-24 | 1 | -18/+10 |
| | | | | | | | | | | | | | | | | | | | | | | | optparse is deprecated and less flexible than argparse so update it. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Tested-by: Ganapatrao Kulkarni <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ruidong Tian <[email protected]> Cc: Leo Yan <[email protected]> Cc: Benjamin Gray <[email protected]> Cc: [email protected] Cc: [email protected] Cc: John Garry <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf scripting python: Add function to get a config value | James Clark | 2024-09-24 | 1 | -0/+11 |
| | | | | | | | | | | | | | | | | | | | | | | | | This can be used to get config values like which objdump Perf uses for disassembly. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Tested-by: Ganapatrao Kulkarni <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ruidong Tian <[email protected]> Cc: Leo Yan <[email protected]> Cc: Benjamin Gray <[email protected]> Cc: [email protected] Cc: [email protected] Cc: John Garry <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> | ||||
| * | perf scripts python cs-etm: Restore first sample log in verbose mode | James Clark | 2024-07-31 | 1 | -5/+4 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The linked commit moved the early return on the first sample to before the verbose log, so move the log earlier too. Now the first sample is also logged and not skipped. Fixes: 2d98dbb4c9c5b09c ("perf scripts python arm-cs-trace-disasm.py: Do not ignore disam first sample") Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Benjamin Gray <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ruidong Tian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf util: Make util its own library | Ian Rogers | 2024-06-26 | 3 | -4/+4 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the util directory into its own library. This is done to avoid compiling code twice, once for the perf tool and once for the perf python module. For convenience: arch/common.c scripts/perl/Perf-Trace-Util/Context.c scripts/python/Perf-Trace-Util/Context.c are made part of this library. Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: James Clark <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Kees Cook <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Gary Guo <[email protected]> Cc: Alex Gaynor <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Wedson Almeida Filho <[email protected]> Cc: Ze Gao <[email protected]> Cc: Alice Ryhl <[email protected]> Cc: Andrei Vagin <[email protected]> Cc: Yicong Yang <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Guo Ren <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Leo Yan <[email protected]> Cc: Oliver Upton <[email protected]> Cc: John Garry <[email protected]> Cc: Benno Lossin <[email protected]> Cc: Björn Roy Baron <[email protected]> Cc: Andreas Hindborg <[email protected]> Cc: Paul Walmsley <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] | ||||
| * | tools/perf: Fix parallel-perf python script to replace new python syntax ↵ | Athira Rajeev | 2024-06-25 | 1 | -1/+2 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ":=" usage perf test "perf script tests" fails as below in systems with python 3.6 File "/home/athira/linux/tools/perf/tests/shell/../../scripts/python/parallel-perf.py", line 442 if line := p.stdout.readline(): ^ SyntaxError: invalid syntax --- Cleaning up --- ---- end(-1) ---- 92: perf script tests: FAILED! This happens because ":=" is a new syntax that assigns values to variables as part of a larger expression. This is introduced from python 3.8 and hence fails in setup with python 3.6 Address this by splitting the large expression and check the value in two steps: Previous line: if line := p.stdout.readline(): Current change: line = p.stdout.readline() if line: With patch ./perf test "perf script tests" 93: perf script tests: Ok Signed-off-by: Athira Rajeev <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] | ||||
| * | perf script: netdev-times: add location parameter to consume_skb | Lucas Stach | 2024-06-07 | 1 | -1/+2 |
| | | | | | | | | | | | | | dd1b527831a3 ("net: add location to trace_consume_skb()") added a new parameter to the consume_skb tracepoint. Adapt the script to match. Signed-off-by: Lucas Stach <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] | ||||
| * | perf scripts python: Add a script to run instances of 'perf script' in parallel | Adrian Hunter | 2024-04-27 | 1 | -0/+988 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a Python script to run a perf script command multiple times in parallel, using perf script options --cpu and --time so that each job processes a different chunk of the data. Extend perf script tests to test also the new script. The script supports the use of normal 'perf script' options like --dlfilter and --script, so that the benefit of running parallel jobs naturally extends to them also. In addition, a command can be provided (refer --pipe-to option) to pipe standard output to a custom command. Refer to the script's own help text at the end of the patch for more details. The script is useful for Intel PT traces, that can be efficiently decoded by 'perf script' when split by CPU and/or time ranges. Running jobs in parallel can decrease the overall decoding time. Committer testing: Ian reported that shellcheck found some issues, I installed it as there are no warnings about it not being available, but when available it fails the build with: TEST /tmp/build/perf-tools-next/tests/shell/script.sh.shellcheck_log CC /tmp/build/perf-tools-next/util/header.o In tests/shell/script.sh line 20: rm -rf "${temp_dir}/"* ^-------------^ SC2115 (warning): Use "${var:?}" to ensure this never expands to /* . In tests/shell/script.sh line 83: output1_dir="${temp_dir}/output1" ^---------^ SC2034 (warning): output1_dir appears unused. Verify use (or export if used externally). In tests/shell/script.sh line 84: output2_dir="${temp_dir}/output2" ^---------^ SC2034 (warning): output2_dir appears unused. Verify use (or export if used externally). In tests/shell/script.sh line 86: python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}" ^-----------^ SC2154 (warning): output_dir is referenced but not assigned (did you mean 'output1_dir'?). For more information: https://www.shellcheck.net/wiki/SC2034 -- output1_dir appears unused. Verif... https://www.shellcheck.net/wiki/SC2115 -- Use "${var:?}" to ensure this nev... https://www.shellcheck.net/wiki/SC2154 -- output_dir is referenced but not ... Did these fixes: - rm -rf "${temp_dir}/"* + rm -rf "${temp_dir:?}/"* And: @@ -83,8 +83,8 @@ test_parallel_perf() output1_dir="${temp_dir}/output1" output2_dir="${temp_dir}/output2" perf record -o "${perf_data}" --sample-cpu uname - python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}" - python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose --per-cpu -- perf script -i "${perf_data}" + python3 "${pp}" -o "${output1_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}" + python3 "${pp}" -o "${output2_dir}" --jobs 4 --verbose --per-cpu -- perf script -i "${perf_data}" After that: root@number:~# perf test -vv "perf script tests" 97: perf script tests: --- start --- test child forked, pid 4084139 DB test [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.032 MB /tmp/perf-test-script.T4MJDr0L6J/perf.data (7 samples) ] <SNIP> DB test [Success] parallel-perf test Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.034 MB /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data (7 samples) ] Starting: perf script --time=,91898.301878499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --time=91898.301878500,91898.301905999 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --time=91898.301906000,91898.301933499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --time=91898.301933500, -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --time=91898.301878500,91898.301905999 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --time=91898.301906000,91898.301933499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 4 jobs: 2 completed, 2 running Finished: perf script --time=,91898.301878499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --time=91898.301933500, -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 4 jobs: 4 completed, 0 running All jobs finished successfully parallel-perf.py done Starting: perf script --cpu=0 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=1 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=2 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=3 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=0 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=1 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=2 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=3 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 4 completed, 0 running Starting: perf script --cpu=4 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=5 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=6 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=7 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=4 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=5 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=6 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=7 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 8 completed, 0 running Starting: perf script --cpu=8 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=9 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=10 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=11 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=8 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=9 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=10 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=11 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 12 completed, 0 running Starting: perf script --cpu=12 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=13 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=14 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=15 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=12 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=13 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=14 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=15 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 16 completed, 0 running Starting: perf script --cpu=16 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=17 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=18 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=19 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=16 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=17 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=18 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=19 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 20 completed, 0 running Starting: perf script --cpu=20 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=21 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=22 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=23 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=20 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=21 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=22 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=23 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 24 completed, 0 running Starting: perf script --cpu=24 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=25 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=26 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=27 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=25 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=26 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=27 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 27 completed, 1 running Finished: perf script --cpu=24 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 28 completed, 0 running All jobs finished successfully parallel-perf.py done parallel-perf test [Success] --- Cleaning up --- ---- end(0) ---- 97: perf script tests : Ok root@number:~# Reviewed-by: Andi Kleen <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python arm-cs-trace-disasm.py: Do not ignore disam first sample | Ruidong Tian | 2023-12-20 | 1 | -10/+11 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arm-cs-trace-disasm ignore disam the first branch sample, For example as follow, the instructions beteween 0x0000ffffae878750 and 0x0000ffffae878754 is lose: ARM CoreSight Trace Data Assembler Dump Event type: branches:uH Sample = { cpu: 0000 addr: 0x0000ffffae878750 phys_addr: 0x0000000000000000 ip: 0x0000000000000000 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 } Event type: branches:uH Sample = { cpu: 0000 addr: 0x0000000000000000 phys_addr: 0x0000000000000000 ip: 0x0000ffffae878754 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 } Initialize cpu_data earlier to fix it: ARM CoreSight Trace Data Assembler Dump Event type: branches:uH Sample = { cpu: 0000 addr: 0x0000000000000000 phys_addr: 0x0000000000000000 ip: 0x0000ffffae878754 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 } 0000000000028740 <ioctl>: (base address is 0x0000ffffae850000) 28750: b13ffc1f cmn x0, #4095 28754: 54000042 b.hs 0x2875c <ioctl+0x1c> test 4003489/4003489 [0000] 26765.151766034 __GI___ioctl+0x14 /usr/lib64/libc-2.32.so Event type: branches:uH Sample = { cpu: 0000 addr: 0x0000ffffa67535ac phys_addr: 0x0000000000000000 ip: 0x0000000000000000 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 } Reviewed-by: James Clark <[email protected]> Signed-off-by: Ruidong Tian <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Al Grant <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Tor Jeremiassen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python arm-cs-trace-disasm.py: Set start vm addr of exectable ↵ | Ruidong Tian | 2023-12-20 | 1 | -2/+3 |
| | | | | | | | | | | | | | | | | | | | | | | file to 0 For exectable ELF file, which e_type is ET_EXEC, dso start address is a absolute address other than offset. Just set vm_start to zero when dso start is 0x400000, which means it is a exectable file. Reviewed-by: James Clark <[email protected]> Signed-off-by: Ruidong Tian <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Al Grant <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Tor Jeremiassen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf tools: Address python 3.6 DeprecationWarning for string scapes | Benjamin Gray | 2023-11-23 | 3 | -5/+5 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Python 3.6 introduced a DeprecationWarning for invalid escape sequences. This is upgraded to a SyntaxWarning in Python 3.12, and will eventually be a syntax error. Fix these now to get ahead of it before it's an error. Signed-off-by: Benjamin Gray <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Hartley Sweeten <[email protected]> Cc: Ian Abbott <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jan Kiszka <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Kieran Bingham <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mykola Lysenko <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Todd E Brandt <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python gecko: Launch the profiler UI on the default browser ↵ | Anup Sharma | 2023-08-24 | 1 | -7/+63 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with the appropriate URL All required libraries have been imported and make sure that none of them are external dependencies. To achieve this, created a virt env and verified. Modified usage information and added combined command. Modified the main() function to read the --save-only command-line option and set the output_file variable accordingly. Modified the trace_end() function to check for the output_file variable. If it is set, the profiler data is saved to a local file in Gecko Profile format, or the profiler.firefox.com is opened on the default browser. Included trace_begin() to initialize the Firefox Profiler and launch the default browser to display the profiler.firefox.com. Added a new function launchFirefox() to start a local server and launch the profiler UI on the default browser with the appropriate URL. Created the "CORSRequestHandler" class to enable Cross-Origin Resource Sharing. Summary: This integration now includes a exiting feature to conveniently host the Gecko Profile data on a local server and open it directly in the default web browser. This means that users can now effortlessly visualize and analyze the profiler results with just a single click. The addition of the --save-only command-line option allows users to save the profiler output to a local file in Gecko Profile format, but the real highlight lies in the capability to seamlessly launch a local server, making the data accessible to Firefox Profiler via a web browser. In addition, it's important to highlight that all data are hosted locally, eliminating any concerns about data privacy rules and regulations. Signed-off-by: Anup Sharma <[email protected]> Tested-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/ZNOS0vo58DnVLpD8@yoga Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python: Add support for input args in gecko script | Anup Sharma | 2023-08-24 | 1 | -1/+5 |
| | | | | | | | | | | | | | | | | | | | | | | Refines the argument handling mechanism in the "gecko-report" script to enable better compatibility and improved user experience. The script now differentiates between scenarios where arguments are provided for record and report cases where gecko.py arguments are passed. Signed-off-by: Anup Sharma <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/ZNf7W+EIrrCSHZN0@yoga Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python: Update audit-libs package name for python3 | Wei Li | 2023-08-15 | 1 | -3/+3 |
| | | | | | | | | | | | | | | | | | | | 'audit-libs-python' is the package for python2, update it for python3. On Ubuntu and Fedora, the new package is 'python3-audit'. Signed-off-by: Wei Li <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Bin <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python: Support syscall name parsing on arm64 | Wei Li | 2023-08-15 | 1 | -0/+1 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the result of "perf script syscall-counts" on arm64, the syscall events are not resolved currently. Add "aarch64" to audit uname list to support name parsing. * After the patch: [root@localhost ~]# perf script syscall-counts sleep 1 Press control+C to stop and show the summary syscall events: event count ---------------------------------------- ----------- mmap 6 close 5 mprotect 4 brk 3 newfstatat 3 openat 3 getrandom 1 prlimit64 1 munmap 1 clock_nanosleep 1 set_robust_list 1 set_tid_address 1 exit_group 1 read 1 faccessat 1 Signed-off-by: Wei Li <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Bin <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf script python: Cope with declarations after statements found in Python.h | Arnaldo Carvalho de Melo | 2023-08-03 | 1 | -1/+2 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With -Werror the build was failing on fedora rawhide: [perfbuilder@27cfe44d67ed perf-6.5.0-rc2]$ gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/13/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-redhat-linux Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,m2,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --enable-libstdcxx-backtrace --with-libstdcxx-zoneinfo=/usr/share/zoneinfo --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl=/builddir/build/BUILD/gcc-13.2.1-20230728/obj-x86_64-redhat-linux/isl-install --enable-offload-targets=nvptx-none --without-cuda-driver --enable-offload-defaulted --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux --with-build-config=bootstrap-lto --enable-link-serialization=1 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 13.2.1 20230728 (Red Hat 13.2.1-1) (GCC) [perfbuilder@27cfe44d67ed perf-6.5.0-rc2]$ In file included from /usr/include/python3.12/Python.h:44, from scripts/python/Perf-Trace-Util/Context.c:14: /usr/include/python3.12/object.h: In function 'Py_SIZE': /usr/include/python3.12/object.h:217:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 217 | PyVarObject *var_ob = _PyVarObject_CAST(ob); | ^~~~~~~~~~~ In file included from /usr/include/python3.12/Python.h:53: /usr/include/python3.12/cpython/longintrepr.h: In function '_PyLong_CompactValue': /usr/include/python3.12/cpython/longintrepr.h:121:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 121 | Py_ssize_t sign = 1 - (op->long_value.lv_tag & _PyLong_SIGN_MASK); | ^~~~~~~~~~ <SNIP> In file included from /usr/include/python3.12/Python.h:44, from util/scripting-engines/trace-event-python.c:22: /usr/include/python3.12/object.h: In function 'Py_SIZE': /usr/include/python3.12/object.h:217:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 217 | PyVarObject *var_ob = _PyVarObject_CAST(ob); | ^~~~~~~~~~~ CC /tmp/build/perf/util/units.o CC /tmp/build/perf/util/time-utils.o In file included from /usr/include/python3.12/Python.h:53: /usr/include/python3.12/cpython/longintrepr.h: In function '_PyLong_CompactValue': /usr/include/python3.12/cpython/longintrepr.h:121:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 121 | Py_ssize_t sign = 1 - (op->long_value.lv_tag & _PyLong_SIGN_MASK); | ^~~~~~~~~~ So add -Wno-declaration-after-statement to the python scripting CFLAGS. Reviewed-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/ZMpdKeO8gU%[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python: Add command execution for gecko script | Anup Sharma | 2023-07-28 | 2 | -0/+5 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will enable the execution of gecko.py script using record and report commands in 'perf script'. And this will be also reflected at "perf script -l" command. For Example: perf script record gecko perf script report gecko Committer notes: As discussed on the perf tools office hours, I made -F 99 the default for the record script and removed the double -- on the report script so that the existing 'perf script' protocol for the combined operation: # perf script gecko Works, i.e. the record script pipes its stdout into the stdin of the report script, basically: /bin/sh /usr/libexec/perf-core/scripts/python/bin/gecko-record -F 99 -g -a -q -o - | \ /bin/sh /usr/libexec/perf-core/scripts/python/bin/gecko-report -i - Testing it: The resulting JSON file needs to be uploaded to https://profiler.firefox.com, Anup already has code to start a local http server on the trace_begin handler of the gecko python script, start firefox and feed it the JSON. The example below only collects sample for the specified workload, so that we don't produce thousands of lines, to collect system wide samples, use instead: # perf script gecko -a sleep 0.5 # nohup perf script gecko sleep 0.5 { "meta": { "interval": 1, "processType": 0, "product": "x86_64 GNU/Linux", "stackwalk": 1, "debug": 0, "gcpoison": 0, "asyncstack": 1, "startTime": 274601692.636, "shutdownTime": null, "version": 24, "presymbolicated": true, "categories": [ { "name": "User", "color": "yellow", "subcategories": [ "Other" ] }, { "name": "Kernel", "color": "orange", "subcategories": [ "Other" ] } ], "markerSchema": [] }, "libs": [], "threads": [ { "tid": 3344498, "pid": 3344498, "name": "sleep", "markers": { "schema": { "name": 0, "startTime": 1, "endTime": 2, "phase": 3, "category": 4, "data": 5 }, "data": [] }, "samples": { "schema": { "stack": 0, "time": 1, "responsiveness": 2 }, "data": [ [ 21, 274601692.636, 0 ], [ 23, 274601692.641, 0 ], [ 29, 274601692.643, 0 ], [ 42, 274601692.648, 0 ] ] }, "frameTable": { "schema": { "location": 0, "relevantForJS": 1, "innerWindowID": 2, "implementation": 3, "optimizations": 4, "line": 5, "column": 6, "category": 7, "subcategory": 8 }, "data": [ [ 0, false, 0, null, null, null, null, 1, null ], [ 1, false, 0, null, null, null, null, 1, null ], [ 2, false, 0, null, null, null, null, 1, null ], [ 3, false, 0, null, null, null, null, 1, null ], [ 4, false, 0, null, null, null, null, 1, null ], [ 5, false, 0, null, null, null, null, 1, null ], [ 6, false, 0, null, null, null, null, 1, null ], [ 7, false, 0, null, null, null, null, 1, null ], [ 8, false, 0, null, null, null, null, 1, null ], [ 9, false, 0, null, null, null, null, 1, null ], [ 10, false, 0, null, null, null, null, 1, null ], [ 11, false, 0, null, null, null, null, 1, null ], [ 12, false, 0, null, null, null, null, 1, null ], [ 13, false, 0, null, null, null, null, 1, null ], [ 14, false, 0, null, null, null, null, 1, null ], [ 15, false, 0, null, null, null, null, 1, null ], [ 16, false, 0, null, null, null, null, 1, null ], [ 17, false, 0, null, null, null, null, 1, null ], [ 18, false, 0, null, null, null, null, 1, null ], [ 19, false, 0, null, null, null, null, 1, null ], [ 20, false, 0, null, null, null, null, 1, null ], [ 21, false, 0, null, null, null, null, 1, null ], [ 22, false, 0, null, null, null, null, 1, null ], [ 23, false, 0, null, null, null, null, 1, null ], [ 24, false, 0, null, null, null, null, 1, null ], [ 25, false, 0, null, null, null, null, 1, null ], [ 26, false, 0, null, null, null, null, 1, null ], [ 27, false, 0, null, null, null, null, 1, null ], [ 28, false, 0, null, null, null, null, 1, null ], [ 29, false, 0, null, null, null, null, 1, null ], [ 30, false, 0, null, null, null, null, 1, null ], [ 31, false, 0, null, null, null, null, 1, null ], [ 32, false, 0, null, null, null, null, 1, null ], [ 33, false, 0, null, null, null, null, 1, null ], [ 34, false, 0, null, null, null, null, 1, null ], [ 35, false, 0, null, null, null, null, 1, null ], [ 36, false, 0, null, null, null, null, 1, null ], [ 37, false, 0, null, null, null, null, 1, null ], [ 38, false, 0, null, null, null, null, 1, null ] ] }, "stackTable": { "schema": { "prefix": 0, "frame": 1 }, "data": [ [ null, 0 ], [ 0, 1 ], [ 1, 2 ], [ 2, 3 ], [ 3, 4 ], [ 4, 5 ], [ 5, 6 ], [ 6, 7 ], [ 7, 8 ], [ 8, 9 ], [ 9, 10 ], [ 10, 11 ], [ 11, 12 ], [ 12, 13 ], [ 13, 14 ], [ 14, 15 ], [ 15, 16 ], [ 16, 17 ], [ 17, 18 ], [ 18, 19 ], [ 19, 20 ], [ 20, 21 ], [ 20, 22 ], [ 22, 23 ], [ 11, 24 ], [ 24, 25 ], [ 25, 26 ], [ 26, 27 ], [ 27, 28 ], [ 28, 29 ], [ 9, 11 ], [ 30, 24 ], [ 31, 25 ], [ 32, 30 ], [ 33, 31 ], [ 34, 32 ], [ 35, 29 ], [ 36, 33 ], [ 37, 34 ], [ 38, 35 ], [ 39, 36 ], [ 40, 37 ], [ 41, 38 ] ] }, "stringTable": [ "__func__.0 (in [kernel.kallsyms].rodata)", "perf_trace_ext4_fc_track_inode (in [kernel.kallsyms])", "perf_trace_ext4_es_insert_delayed_block (in [kernel.kallsyms])", "ext4_es_show_pblock (in [kernel.kallsyms])", "perf_trace_ext4_ext_rm_leaf (in [kernel.kallsyms])", "devcgroup_access_write (in [kernel.kallsyms])", "devcgroup_update_access (in [kernel.kallsyms])", "propagate_exception (in [kernel.kallsyms])", "revalidate_active_exceptions (in [kernel.kallsyms])", "perf_trace_ext4_fc_commit_stop (in [kernel.kallsyms])", "perf_fetch_caller_regs (in [kernel.kallsyms])", "khugepaged (in [kernel.kallsyms])", "khugepaged_wait_work (in [kernel.kallsyms])", "freezable_schedule_timeout (in [kernel.kallsyms])", "freezer_count (in [kernel.kallsyms])", "try_to_freeze (in [kernel.kallsyms])", "try_to_freeze_unsafe (in [kernel.kallsyms])", "split_huge_pages_write (in [kernel.kallsyms])", "migrate_pages (in [kernel.kallsyms])", "unmap_and_move (in [kernel.kallsyms])", "__unmap_and_move (in [kernel.kallsyms])", "collect_events (in [kernel.kallsyms])", "uncore_down_prepare (in [kernel.kallsyms])", "perf_iommu_read (in [kernel.kallsyms])", "khugepaged_do_scan (in [kernel.kallsyms])", "khugepaged_scan_mm_slot (in [kernel.kallsyms])", "khugepaged_scan_file (in [kernel.kallsyms])", "need_resched (in [kernel.kallsyms])", "get_current (in [kernel.kallsyms])", "move_to_new_page (in [kernel.kallsyms])", "khugepaged_scan_pmd (in [kernel.kallsyms])", "trace_mm_khugepaged_scan_pmd (in [kernel.kallsyms])", "migrate_huge_page_move_mapping (in [kernel.kallsyms])", "do_huge_pmd_numa_page (in [kernel.kallsyms])", "pmd_pfn (in [kernel.kallsyms])", "protnone_mask (in [kernel.kallsyms])", "__pte_needs_invert (in [kernel.kallsyms])", "reclaim_high (in [kernel.kallsyms])", "memcg_memory_event (in [kernel.kallsyms])" ], "registerTime": 0, "unregisterTime": null, "processType": "default" } ], "processes": [], "pausedRanges": [] } # Signed-off-by: Anup Sharma <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/cbf03cda175ea3dd2c6cd87bd3f12d803446cb95.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python: Implement add sample function and thread processing | Anup Sharma | 2023-07-28 | 1 | -2/+50 |
| | | | | | | | | | | | | | | | | | | | | | | | The stack has been created for storing func and dso from the callchain. The sample has been added to a specific thread. It first checks if the thread exists in the Thread class. Then it call _add_sample function which is responsible for appending a new entry to the samples list. Also callchain parsing and storing part is implemented. Moreover removed the comment from thread. Signed-off-by: Anup Sharma <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/5a112be85ccdcdcd611e343f6a7a7482d01f6299.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python: Implement add sample function and thread processing | Anup Sharma | 2023-07-28 | 1 | -0/+54 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The intern_stack function is responsible for retrieving or creating a stack_id based on the provided frame_id and prefix_id. It first generates a key using the frame_id and prefix_id values. If the stack corresponding to the key is found in the stackMap, it is returned. Otherwise, a new stack is created by appending the prefix_id and frame_id to the stackTable. The key and the index of the newly created stack are added to the stackMap for future reference. The _intern_frame function is responsible for retrieving or creating a frame_id based on the provided frame string. If the frame_id corresponding to the frameString is found in the frameMap, it is returned. Otherwise, a new frame is created by appending relevant information to the frameTable and adding the frameString to the string_id through _intern_string. The _intern_string function will gets a matching string, or saves the new string and returns a String ID. Signed-off-by: Anup Sharma <[email protected]> Link: https://lore.kernel.org/r/4442f4b1ab4c7317cf940560a3a285fcdfbeeb08.1689961706.git.anupnewsmail@gmail.com Cc: Mark Rutland <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python: Add trace end processing and PRODUCT and CATEGORIES ↵ | Anup Sharma | 2023-07-28 | 1 | -1/+64 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | information The final output will now be presented in JSON format following the Gecko profile structure. Additionally, the inclusion of PRODUCT allows easy retrieval of header information for UI. Furthermore, CATEGORIES have been introduced to enable customization of kernel and user colors using input arguments. To facilitate this functionality, an argparse-based parser has been implemented. Note: The implementation of threads will be addressed in subsequent commits for now I have commented it out. Signed-off-by: Anup Sharma <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/fa6d027e4134c48e8a2ea45dd8f6b21e6a3418e4.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python: Add classes and conversion functions | Anup Sharma | 2023-07-28 | 1 | -1/+132 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | This commit introduces new classes and conversion functions to facilitate the representation of Gecko profile information. The new classes Frame, Stack, Sample, and Thread are added to handle specific components of the profile data, also link to the origin docs has been commented out. Additionally, Inside the Thread class _to_json_dict() method has been created that converts the current thread data into the corresponding format expected by the GeckoThread JSON schema, as per the Gecko profile format specification. Signed-off-by: Anup Sharma <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/ab7b40bd32df7101a6f8b4a3aa41570b63b831ac.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python: Extact necessary information from process event | Anup Sharma | 2023-07-28 | 1 | -1/+13 |
| | | | | | | | | | | | | | | | | | | | | The script takes in a sample event dictionary(param_dict) and retrieves relevant data such as time stamp, PID, TID, and comm for each event. Also start time is defined as a global variable as it need to be passed to trace_end for gecko meta information field creation. Signed-off-by: Anup Sharma <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/19910fefcfe4be03cd5c2aa3fec11d3f86c0381b.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python: Add initial script file with usage information | Anup Sharma | 2023-07-28 | 1 | -0/+31 |
| | | | | | | | | | | | | | | | | | | | | | Added necessary modules, including the Perf-Trace-Util library, and defines the required functions and variables for using perf script python. The perf_trace_context and Core modules for tracing and processing events has been also imported. Added usage information. Signed-off-by: Anup Sharma <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/f2f1a62f1cc69f44a5414da46a26a4cf124d2744.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf python scripting: Get rid of unused import in arm-cs-trace-disasm | Sourabh Jain | 2023-06-14 | 1 | -1/+0 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | The arm-cs-trace-disasm.py script doesn't use the sys library, so remove the import. Report by pylint: W0611: Unused import sys (unused-import) Signed-off-by: Sourabh Jain <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Disha Goel <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/linux-perf-users/[email protected] Signed-off-by: Athira Rajeev <[email protected]> Signed-off-by: Kajol Jain <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf thread: Add accessor functions for thread | Ian Rogers | 2023-06-12 | 1 | -2/+2 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using accessors will make it easier to add reference count checking in later patches. Committer notes: thread->nsinfo wasn't wrapped as it is used together with nsinfo__zput(), where does a trick to set the field with a refcount being dropped to NULL, and that doesn't work well with using thread__nsinfo(thread), that loses the &thread->nsinfo pointer. When refcount checking is added to 'struct thread', later in this series, nsinfo__zput(RC_CHK_ACCESS(thread)->nsinfo) will be used to check the thread pointer. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Brian Robbins <[email protected]> Cc: Changbin Du <[email protected]> Cc: Dmitrii Dolgov <[email protected]> Cc: Fangrui Song <[email protected]> Cc: German Gomez <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Babrou <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Steinar H. Gunderson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Wenyu Liu <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Ye Xingchen <[email protected]> Cc: Yuan Can <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf script: Add new parameter in kfree_skb tracepoint to the python scripts ↵ | Sriram Yagnaraman | 2023-05-02 | 2 | -5/+5 |
| | | | | | | | | | | | | | | | | | | | | using it Include reason parameter that was added in commit c504e5c2f9648a1e ("net: skb: introduce kfree_skb_reason()") Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sriram Yagnaraman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf script task-analyzer: Fix spelling mistake "miliseconds" -> "milliseconds" | Colin Ian King | 2023-04-18 | 1 | -1/+1 |
| | | | | | | | | | | | | | | | | | | | | There is a spelling mistake in the help for the --ms option. Fix it. Signed-off-by: Colin Ian King <[email protected]> Acked-by: Hagen Paul Pfeifer <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Petar Gligoric <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts python intel-pt-events: Delete unused 'event_attr variable | Alexander Pantyukhin | 2023-04-12 | 1 | -3/+3 |
| | | | | | | | | | | | | | | | | | | The 'event_attr' is never used later, the var is ok be deleted. Additional code simplification is to substitute string slice comparison with "substring" function. This case no need to know the length specific words. Signed-off-by: Alexander Pantyukhin <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf map: Add accessor for dso | Ian Rogers | 2023-04-04 | 1 | -2/+4 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Later changes will add reference count checking for struct map, with dso being the most frequently accessed variable. Add an accessor so that the reference count check is only necessary in one place. Additional changes: - add a dso variable to avoid repeated map__dso calls. - in builtin-mem.c dump_raw_samples, code only partially tested for dso == NULL. Make the possibility of NULL consistent. - in thread.c thread__memcpy fix use of spaces and use tabs. Committer notes: Did missing conversions on these files: tools/perf/arch/powerpc/util/skip-callchain-idx.c tools/perf/arch/powerpc/util/sym-handling.c tools/perf/ui/browsers/hists.c tools/perf/ui/gtk/annotate.c tools/perf/util/cs-etm.c tools/perf/util/thread.c tools/perf/util/unwind-libunwind-local.c tools/perf/util/unwind-libunwind.c Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Darren Hart <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: German Gomez <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Shunsuke Nakamura <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Stephen Brennan <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Yury Norov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf maps: Add functions to access maps | Ian Rogers | 2023-04-04 | 1 | -3/+4 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce functions to access struct maps. These functions reduce the number of places reference counting is necessary. While tidying APIs do some small const-ification, in particlar to unwind_libunwind_ops. Committer notes: Fixed up tools/perf/util/unwind-libunwind.c: - return ops->get_entries(cb, arg, thread, data, max_stack); + return ops->get_entries(cb, arg, thread, data, max_stack, best_effort); Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Darren Hart <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: German Gomez <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Shunsuke Nakamura <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Stephen Brennan <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Yury Norov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf script: Fix Python support when no libtraceevent | Adrian Hunter | 2023-03-15 | 3 | -2/+8 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Python scripting can be used without libtraceevent. In particular, scripting for Intel PT does not use tracepoints, and so does not need libtraceevent support. Alter the build and employ conditional compilation to allow Python scripting without libtraceevent. Example: Before: $ ldd `which perf` | grep -i python $ ldd `which perf` | grep -i libtraceevent $ perf record -e intel_pt//u uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.031 MB perf.data ] $ perf script intel-pt-events.py |& head -3 Error: Couldn't find script `intel-pt-events.py' See perf script -l for available scripts. After: $ ldd `which perf` | grep -i python libpython3.10.so.1.0 => /lib/x86_64-linux-gnu/libpython3.10.so.1.0 (0x00007f4bac400000) $ ldd `which perf` | grep -i libtraceevent $ perf script intel-pt-events.py | head Intel PT Branch Trace, Power Events, Event Trace and PTWRITE Switch In 8021/8021 [000] 11234.097713404 0/0 perf-exec 8021/8021 [000] 11234.098041726 psb offset: 0x0 0 [unknown] ([unknown]) perf-exec 8021/8021 [000] 11234.098041726 cbr 45 freq: 4505 MHz (161%) 0 [unknown] ([unknown]) uname 8021/8021 [000] 11234.098082170 branches:uH tr strt 0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) uname 8021/8021 [000] 11234.098082379 branches:uH tr end 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown]) uname 8021/8021 [000] 11234.098083629 branches:uH tr strt 0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) uname 8021/8021 [000] 11234.098083629 branches:uH call 7f3a8b9422b3 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 7f3a8b943050 _dl_start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) uname 8021/8021 [000] 11234.098083837 branches:uH tr end 7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown]) IPC: 0.01 (9/938) uname 8021/8021 [000] 11234.098084670 branches:uH tr strt 0 [unknown] ([unknown]) => 7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) Fixes: 378ef0f5d9d7f465 ("perf build: Use libtraceevent from the system") Signed-off-by: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf scripts intel-pt-events.py: Fix IPC output for Python 2 | Roman Lozko | 2023-03-14 | 1 | -1/+1 |
| | | | | | | | | | | | | | | Integers are not converted to floats during division in Python 2 which results in incorrect IPC values. Fix by switching to new division behavior. Fixes: a483e64c0b62e93a ("perf scripting python: intel-pt-events.py: Add --insn-trace and --src-trace") Signed-off-by: Roman Lozko <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Adrian Hunter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf script flamegraph: Avoid d3-flame-graph package dependency | Ian Rogers | 2023-01-19 | 1 | -22/+85 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently flame graph generation requires a d3-flame-graph template to be installed. Unfortunately this is hard to come by for things like Debian [1]. If the template isn't installed then ask if it should be downloaded from jsdelivr CDN. The downloaded HTML file is validated against an md5sum. If the download fails, generate a minimal flame graph with the javascript coming from links to jsdelivr CDN. v3. Adds a warning message and quits before download in live mode. v2. Change the warning to a prompt about downloading and add the --allow-download command line flag. Add an md5sum check for the downloaded HTML. [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=996839 Reviewed-by: Andreas Gerstmayr <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: [email protected] Cc: Alexander Shishkin <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Martin Spier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] # v3 discussion Link: https://lore.kernel.org/r/[email protected] # v2 discussion Link: https://lore.kernel.org/r/CAP-5=fXi_9zdhTAoYApiFQoLURAvpEatFzU3uL23o3zs=z25ZQ@mail.gmail.com # v1 discussion Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
| * | perf script: task-analyzer add csv support | Petar Gligoric | 2022-12-14 | 1 | -89/+185 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the possibility to write the trace and the summary as csv files to a user specified file. A format as such simplifies further data processing. This is achieved by having ";" as separators instead of spaces and solely one header per file. Additional parameters are being considered, like in the normal usage of the script. Colors are turned off in the case of a csv output, thus the highlight option is also being ignored. Usage: Write standard task to csv file: $ perf script report tasks-analyzer --csv <file> write limited output to csv file in nanoseconds: $ perf script report tasks-analyzer --csv <file> --ns --limit-to-tasks 1337 Write summary to a csv file: $ perf script report tasks-analyzer --csv-summary <file> Write summary to csv file with additional schedule information: $ perf script report tasks-analyzer --csv-summary <file> --summary-extended Write both summary and standard task to a csv file: $ perf script report tasks-analyzer --csv --csv-summary The following examples illustrate what is possible with the CSV output. The first command sequence will record all scheduler switch events for 10 seconds, the task-analyzer calculates task information like runtimes as CSV. A small python snippet using pandas and matplotlib will visualize the most frequent task (e.g. kworker/1:1) runtimes - each runtime as a bar in a bar chart: $ perf record -e sched:sched_switch -a -- sleep 10 $ perf script report tasks-analyzer --ns --csv tasks.csv $ cat << EOF > /tmp/freq-comm-runtimes-bar.py import pandas as pd import matplotlib.pyplot as plt df = pd.read_csv("tasks.csv", sep=';') most_freq_comm = df["COMM"].value_counts().idxmax() most_freq_runtimes = df[df["COMM"]==most_freq_comm]["Runtime"] plt.title(f"Runtimes for Task {most_freq_comm} in Nanoseconds") plt.bar(range(len(most_freq_runtimes)), most_freq_runtimes) plt.show() $ python3 /tmp/freq-comm-runtimes-bar.py As a seconds example, the subsequent script generates a pie chart of all accumulated tasks runtimes for 10 seconds of system recordings: $ perf record -e sched:sched_switch -a -- sleep 10 $ perf script report tasks-analyzer --csv-summary task-summary.csv $ cat << EOF > /tmp/accumulated-task-pie.py import pandas as pd from matplotlib.pyplot import pie, axis, show df = pd.read_csv("task-summary.csv", sep=';') sums = df.groupby(df["Comm"])["Accumulated"].sum() axis("equal") pie(sums, labels=sums.index); show() EOF $ python3 /tmp/accumulated-task-pie.py A variety of other visualizations are possible in matplotlib and other environments. Of course, pandas, numpy and co. also allow easy statistical analysis of the data! Signed-off-by: Petar Gligoric <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hagen Paul Pfeifer <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> | ||||
