linux/tools/include/uapi
Linus Torvalds df8f6181ab perf tools updates for 7.1
perf report:
 
  - Add 'comm_nodigit' sort key to combine threads whose comms differ
    only in their numbers.  In the following example, 'comm_nodigit'
    merges samples from all threads starting with "bpfrb/" into a
    single entry "bpfrb/<N>".
 
     $ perf report -s comm_nodigit,comm -H
     ...
     #
     #    Overhead  CommandNoDigit / Command
     # ...........  ........................
     #
         20.30%     swapper
            20.30%     swapper
         13.37%     chrome
            13.37%     chrome
         10.07%     bpfrb/<N>
             7.47%     bpfrb/0
             0.70%     bpfrb/1
             0.47%     bpfrb/3
             0.46%     bpfrb/2
             0.25%     bpfrb/4
             0.23%     bpfrb/5
             0.20%     bpfrb/6
             0.14%     bpfrb/10
             0.07%     bpfrb/7
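
The grouping shown above can be approximated by replacing every run of
digits in a comm with the literal "<N>".  The sketch below is only an
illustration of that idea: the function name comm_strip_digits() is
hypothetical, and the replace-each-digit-run rule is an assumption
inferred from the "bpfrb/10" -> "bpfrb/<N>" line in the output, not the
code perf actually uses.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustration only (hypothetical helper, not perf code): copy 'comm'
 * into 'out', replacing each run of decimal digits with "<N>".
 * Returns 0 on success, -1 if 'out' is too small.
 */
int comm_strip_digits(const char *comm, char *out, size_t outsz)
{
	static const char placeholder[] = "<N>";
	const size_t plen = sizeof(placeholder) - 1;
	size_t o = 0;

	if (outsz == 0)
		return -1;

	while (*comm) {
		if (isdigit((unsigned char)*comm)) {
			/* Keep one byte free for the terminating NUL. */
			if (o + plen >= outsz)
				return -1;
			memcpy(out + o, placeholder, plen);
			o += plen;
			/* Skip the rest of this digit run. */
			while (isdigit((unsigned char)*comm))
				comm++;
		} else {
			if (o + 1 >= outsz)
				return -1;
			out[o++] = *comm++;
		}
	}
	out[o] = '\0';
	return 0;
}
```

With this rule "bpfrb/0" and "bpfrb/10" both map to "bpfrb/<N>", while
a comm without digits such as "swapper" is copied unchanged.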
 
  - Support flat layout for symfs.  The --symfs option specifies the
    location of debugging symbol files.  The default 'hierarchy' layout
    searches for the symbol file at the original file's path under the
    symfs root.  The new 'flat' layout searches only in the root
    directory.
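
The difference between the two layouts comes down to which part of the
original path is appended to the symfs root.  The following is a
minimal sketch under stated assumptions: the enum and the function
symfs_build_path() are hypothetical names, and treating 'flat' as
"root plus file name" is an interpretation of the description above,
not the perf implementation.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical names, used only for this illustration. */
enum symfs_layout { SYMFS_HIERARCHY, SYMFS_FLAT };

/*
 * Build the path to try under the symfs root:
 *   hierarchy: <symfs>/<full original path>
 *   flat:      <symfs>/<file name only>
 * Returns 0 on success, -1 if 'out' is too small.
 */
int symfs_build_path(char *out, size_t outsz, const char *symfs,
		     const char *orig, enum symfs_layout layout)
{
	const char *rel = orig;
	int n;

	if (layout == SYMFS_FLAT) {
		const char *slash = strrchr(orig, '/');

		if (slash)
			rel = slash + 1;
	} else {
		/* Avoid a doubled '/' when joining an absolute path. */
		while (*rel == '/')
			rel++;
	}

	n = snprintf(out, outsz, "%s/%s", symfs, rel);
	return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

For a symfs root of /tmp/symfs and an original file /usr/lib/libfoo.so,
the hierarchy layout yields /tmp/symfs/usr/lib/libfoo.so and the flat
layout yields /tmp/symfs/libfoo.so.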
 
  - Update 'simd' sort key for ARM SIMD flags to cover ASE/SME and more
    predicate flags.
 
 perf stat:
 
  - Add --pmu-filter option to select specific PMUs.  This is useful
    when you measure metrics from multiple instances of uncore PMUs
    with similar names.
 
     # perf stat -M cpa_p0_avg_bw
      Performance counter stats for 'system wide':
 
         19,417,779,115      hisi_sicl0_cpa0/cpa_cycles/      #     0.00 cpa_p0_avg_bw
                      0      hisi_sicl0_cpa0/cpa_p0_wr_dat/
                      0      hisi_sicl0_cpa0/cpa_p0_rd_dat_64b/
                      0      hisi_sicl0_cpa0/cpa_p0_rd_dat_32b/
         19,417,751,103      hisi_sicl10_cpa0/cpa_cycles/     #     0.00 cpa_p0_avg_bw
                      0      hisi_sicl10_cpa0/cpa_p0_wr_dat/
                      0      hisi_sicl10_cpa0/cpa_p0_rd_dat_64b/
                      0      hisi_sicl10_cpa0/cpa_p0_rd_dat_32b/
         19,417,730,679      hisi_sicl2_cpa0/cpa_cycles/      #     0.31 cpa_p0_avg_bw
             75,635,749      hisi_sicl2_cpa0/cpa_p0_wr_dat/
             18,520,640      hisi_sicl2_cpa0/cpa_p0_rd_dat_64b/
                      0      hisi_sicl2_cpa0/cpa_p0_rd_dat_32b/
         19,417,674,227      hisi_sicl8_cpa0/cpa_cycles/      #     0.00 cpa_p0_avg_bw
                      0      hisi_sicl8_cpa0/cpa_p0_wr_dat/
                      0      hisi_sicl8_cpa0/cpa_p0_rd_dat_64b/
                      0      hisi_sicl8_cpa0/cpa_p0_rd_dat_32b/
 
           19.417734480 seconds time elapsed
 
    With --pmu-filter, users can select only the hisi_sicl2_cpa0 PMU.
 
     # perf stat --pmu-filter hisi_sicl2_cpa0 -M cpa_p0_avg_bw
      Performance counter stats for 'system wide':
 
          6,234,093,559      cpa_cycles                       #     0.60 cpa_p0_avg_bw
             50,548,465      cpa_p0_wr_dat
              7,552,182      cpa_p0_rd_dat_64b
                      0      cpa_p0_rd_dat_32b
 
            6.234139320 seconds time elapsed
 
 Data type profiling:
 
  - Improve quality by tracking register state more precisely.
  - Ensure array members get the type.
  - Handle more cases for global variables.
 
 Vendor event/metric updates:
 
  - Update various Intel events and metrics
  - Add NVIDIA Tegra 410 Olympus events
 
 Internal changes:
 
  - Verify the perf.data header to guard against maliciously crafted
    files.
  - Update perf tests to cover more usage and make them more robust.
  - Move a couple of copied kernel headers so they do not annoy the
    objtool build.
  - Fix a bug in sorting maps by name.
  - Remove some unused code.
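
For the perf.data header verification item above, one common form such
a check takes is to validate an entry count read from the file against
the size of the section that is supposed to hold the entries, before
allocating or looping.  This is a generic sketch, not the code from
these commits; count_fits_section() and its parameters are hypothetical.

```c
#include <stdint.h>

/*
 * Illustration only: return 1 if 'nr' entries of 'entry_size' bytes
 * can fit in a section of 'section_size' bytes, 0 otherwise.
 */
int count_fits_section(uint64_t nr, uint64_t entry_size,
		       uint64_t section_size)
{
	if (entry_size == 0)
		return 0;

	/* Dividing avoids the overflow that nr * entry_size could hit. */
	return nr <= section_size / entry_size;
}
```

A crafted file that claims an enormous count then fails the check
instead of driving a huge allocation or an out-of-bounds read.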
 
 Misc:
 
  - Fix module symbol resolution with non-zero text address.
  - Add -t/--threads option to `perf bench mem mmap`.
  - Track the duration of exit*() syscalls in `perf trace -s`.
  - Add core.addr2line-timeout and core.addr2line-disable-warn config
    items.
 
 Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Merge tag 'perf-tools-for-v7.1-2026-04-17' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

* tag 'perf-tools-for-v7.1-2026-04-17' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (131 commits)
  perf loongarch: Fix build failure with CONFIG_LIBDW_DWARF_UNWIND
  perf annotate: Use jump__delete when freeing LoongArch jumps
  perf test: Fixes for check branch stack sampling
  perf test: Fix inet_pton probe failure and unroll call graph
  perf build: fix "argument list too long" in second location
  perf header: Add sanity checks to HEADER_BPF_BTF processing
  perf header: Sanity check HEADER_BPF_PROG_INFO
  perf header: Sanity check HEADER_PMU_CAPS
  perf header: Sanity check HEADER_HYBRID_TOPOLOGY
  perf header: Sanity check HEADER_CACHE
  perf header: Sanity check HEADER_GROUP_DESC
  perf header: Sanity check HEADER_PMU_MAPPINGS
  perf header: Sanity check HEADER_MEM_TOPOLOGY
  perf header: Sanity check HEADER_NUMA_TOPOLOGY
  perf header: Sanity check HEADER_CPU_TOPOLOGY
  perf header: Sanity check HEADER_NRCPUS and HEADER_CPU_DOMAIN_INFO
  perf header: Bump up the max number of command line args allowed
  perf header: Validate nr_domains when reading HEADER_CPU_DOMAIN_INFO
  perf sample: Fix documentation typo
  perf arm_spe: Improve SIMD flags setting
  ...
2026-04-18 09:24:56 -07:00
..
asm
asm-generic
linux
README

README

Why do we want a copy of kernel headers in tools?
=================================================

There used to be no copies, with tools/ code using kernel headers
directly. From time to time tools/perf/ broke due to legitimate kernel
hacking. At some point Linus complained about such direct usage. Then we
adopted the current model.

The way these headers are used in perf is not restricted to just
including them to compile something.

They are sometimes used in scripts that convert defines into string
tables, etc., so some change may break one of these scripts, or new MSRs
may use some different #define pattern, etc.

E.g.:

  $ ls -1 tools/perf/trace/beauty/*.sh | head -5
  tools/perf/trace/beauty/arch_errno_names.sh
  tools/perf/trace/beauty/drm_ioctl.sh
  tools/perf/trace/beauty/fadvise.sh
  tools/perf/trace/beauty/fsconfig.sh
  tools/perf/trace/beauty/fsmount.sh
  $
  $ tools/perf/trace/beauty/fadvise.sh
  static const char *fadvise_advices[] = {
        [0] = "NORMAL",
        [1] = "RANDOM",
        [2] = "SEQUENTIAL",
        [3] = "WILLNEED",
        [4] = "DONTNEED",
        [5] = "NOREUSE",
  };
  $
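
A generated table like the one above is typically consumed by a small
lookup that turns a numeric argument into its name.  The sketch below
reuses the table from the fadvise.sh output; the helper
fadvise_advice_name() is a hypothetical name for illustration and is
not taken from the perf sources.

```c
#include <stddef.h>

/* Table as printed by tools/perf/trace/beauty/fadvise.sh above. */
static const char *fadvise_advices[] = {
	[0] = "NORMAL",
	[1] = "RANDOM",
	[2] = "SEQUENTIAL",
	[3] = "WILLNEED",
	[4] = "DONTNEED",
	[5] = "NOREUSE",
};

/*
 * Hypothetical helper: map an advice value to its name, or return
 * NULL when the value is outside the generated table.
 */
const char *fadvise_advice_name(int advice)
{
	size_t nr = sizeof(fadvise_advices) / sizeof(fadvise_advices[0]);

	if (advice < 0 || (size_t)advice >= nr)
		return NULL;
	return fadvise_advices[advice];
}
```

If a kernel header later adds a new define, the table (and therefore
this lookup) only learns about it once the copied header is updated and
the script is rerun.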

The tools/perf/check-headers.sh script, part of the tools/ build
process, points out changes in the original files.

So it's important not to touch the copies in tools/ when making changes
to the original kernel headers; that will be done later, when
check-headers.sh informs the perf tools hackers about the change.

Another explanation from Ingo Molnar:
It's better than all the alternatives we tried so far:

 - Symbolic links and direct #includes: this was the original approach but
   was pushed back on from the kernel side, when tooling modified the
   headers and broke them accidentally for kernel builds.

 - Duplicate self-defined ABI headers like glibc: double the maintenance
   burden, double the chance for mistakes, plus there's no tech-driven
   notification mechanism to look at new kernel side changes.

What we are doing now is a third option:

 - A software-enforced copy-on-write mechanism of kernel headers to
   tooling, driven by non-fatal warnings on the tooling side build when
   kernel headers get modified:

    Warning: Kernel ABI header differences:
      diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h
      diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h
      diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
      ...

   The tooling policy is to always pick up the kernel side headers as-is,
   and integrate them into the tooling build. The warnings above serve as
   a notification to tooling maintainers that there are changes on the
   kernel side.
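
The warn-but-do-not-fail idea described above can be sketched in a few
lines.  The real check is the tools/perf/check-headers.sh shell script;
the C functions below, headers_differ() and check_header(), are
hypothetical stand-ins that use a plain byte-for-byte comparison, which
is an assumption and may be stricter than what the script does.

```c
#include <stdio.h>

/*
 * Illustration only: return 1 if the two files differ, 0 if they are
 * identical, -1 if either file cannot be opened.
 */
int headers_differ(const char *copy_path, const char *orig_path)
{
	FILE *a = fopen(copy_path, "rb");
	FILE *b = fopen(orig_path, "rb");
	int ca, cb, ret = -1;

	if (a && b) {
		/* Read both files in lockstep until they diverge or end. */
		do {
			ca = fgetc(a);
			cb = fgetc(b);
		} while (ca == cb && ca != EOF);
		ret = (ca != cb);
	}
	if (a)
		fclose(a);
	if (b)
		fclose(b);
	return ret;
}

/* Print a non-fatal warning when the tools/ copy is out of date. */
int check_header(const char *copy_path, const char *orig_path)
{
	int diff = headers_differ(copy_path, orig_path);

	if (diff == 1)
		fprintf(stderr,
			"Warning: Kernel ABI header differences:\n"
			"  diff -u %s %s\n", copy_path, orig_path);
	return diff;
}
```

The caller is expected to keep building whatever check_header()
returns, which is what makes the warning a notification rather than a
build failure.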

We've been using this for many years now, and it might seem hacky, but
works surprisingly well.