mirror of https://github.com/torvalds/linux.git
12 Commits
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
aa497357c1 |
perf stat: Fix uncore aggregation number
Follow up:
lore.kernel.org/CAP-5=fVDF4-qYL1Lm7efgiHk7X=_nw_nEFMBZFMcsnOOJgX4Kg@mail.gmail.com/
The patch adds unit aggregation while merging the evsels of the aggregated
uncore counters. It also renames the column to `ctrs` (`counters` in
JSON mode).
Tested on a 2-socket machine with SNC3, uncore_imc_[0-11] and
cpumask="0,120"
Before:
perf stat -e clockticks -I 1000 --per-socket
# time socket cpus counts unit events
1.001085024 S0 1 9615386315 clockticks
1.001085024 S1 1 9614287448 clockticks
perf stat -e clockticks -I 1000 --per-node
# time node cpus counts unit events
1.001029867 N0 1 3205726984 clockticks
1.001029867 N1 1 3205444421 clockticks
1.001029867 N2 1 3205234018 clockticks
1.001029867 N3 1 3205224660 clockticks
1.001029867 N4 1 3205207213 clockticks
1.001029867 N5 1 3205528246 clockticks
After:
perf stat -e clockticks -I 1000 --per-socket
# time socket ctrs counts unit events
1.001026071 S0 12 9619677996 clockticks
1.001026071 S1 12 9618612614 clockticks
perf stat -e clockticks -I 1000 --per-node
# time node ctrs counts unit events
1.001027449 N0 4 3207251859 clockticks
1.001027449 N1 4 3207315930 clockticks
1.001027449 N2 4 3206981828 clockticks
1.001027449 N3 4 3206566126 clockticks
1.001027449 N4 4 3206032609 clockticks
1.001027449 N5 4 3205651355 clockticks
Tested with JSON output linter:
perf test "perf stat JSON output linter"
94: perf stat JSON output linter : Ok
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Link: https://lore.kernel.org/r/20250627201818.479421-1-ctshao@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
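The `ctrs` values in the "After" output are consistent with simple arithmetic on the test machine described in the message. A minimal sketch of that arithmetic follows; the variable names are illustrative, and the even split of counters across SNC nodes is an assumption inferred from the output rather than something the patch states.

```python
# Illustrative arithmetic only; all names here are hypothetical.
imc_pmus = 12          # uncore_imc_[0-11]
sockets = 2            # cpumask="0,120" gives one CPU per socket
nodes_per_socket = 3   # SNC3 splits each socket into 3 nodes

# Each PMU contributes one counter per CPU in its cpumask.
total_counters = imc_pmus * sockets
ctrs_per_socket = total_counters // sockets
ctrs_per_node = total_counters // (sockets * nodes_per_socket)

print(ctrs_per_socket, ctrs_per_node)
```

These match the 12 and 4 shown in the `ctrs` column for `--per-socket` and `--per-node`.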
|
|
|
|
45a86d017a |
perf test: Add --metric-only to perf stat output tests
Add a test case for --metric-only for std, csv, json output mode using
shadow IPC metric from instructions and cycles events. It should
produce 'insn per cycle' metric.
But currently JSON output has (none) 'GHz' as well. It looks like a bug
but I don't have enough time to debug it for now so I made it pass. :(
$ perf stat --metric-only -e instructions,cycles true
Performance counter stats for 'true':
0.56
0.002127319 seconds time elapsed
0.002077000 seconds user
0.000000000 seconds sys
$ perf stat -x, --metric-only -e instructions,cycles true
0.55,,
$ perf stat -j --metric-only -e instructions,cycles true
{"insn per cycle" : "0.53", "GHz" : "none"}
$ perf test output -v
5: Test data source output : Ok
31: Sort output of hist entries : Ok
88: perf stat CSV output linter : Ok
90: perf stat JSON output linter : Ok
92: perf stat STD output linter : Ok
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250304022837.1877845-2-namhyung@kernel.org
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
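A consumer of the `--metric-only` JSON line may want to skip the placeholder `"GHz" : "none"` entry described in the message. The following is a minimal sketch under that assumption; it is not part of the test, and the filtering rule is hypothetical.

```python
import json

# Sample line copied from the commit message.
line = '{"insn per cycle" : "0.53", "GHz" : "none"}'

metrics = json.loads(line)

# Keep only entries that carry a real value; "none" is treated as a
# placeholder here (an assumption, not documented perf behavior).
real_metrics = {name: float(value)
                for name, value in metrics.items()
                if value != "none"}

print(real_metrics)
```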
|
|
|
|
967364894e |
perf stat: Fix trailing comma when there is no metric unit
Now that printing metric-value and metric-unit is optional,
print_running_json() shouldn't add the comma in case it becomes
trailing.
Replace all manual JSON comma stuff with a json_out() function that uses
the existing os->first tracking and auto inserts a comma if it's needed.
Update the test to handle that two of the fields can be missing.
This fixes the following test failure on Cortex A57 where the branch
misses metric is missing a required event:
$ perf test -vvv "json output"
106: perf stat JSON output linter:
--- start ---
test child forked, pid 665682
Checking json output: no args Test failed for input:
{"counter-value" : "3112.000000", "unit" : "",
"event" : "armv8_pmuv3_1/branch-misses/",
"event-runtime" : 20699340, "pcnt-running" : 100.00, }
...
json.decoder.JSONDecodeError: Expecting property name enclosed in
double quotes: line 12 column 144 (char 2109)
---- end(-1) ----
106: perf stat JSON output linter : FAILED!
Fixes:
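The idea of tracking a "first field" flag and inserting the comma before each later field can be sketched in Python. This is only an analogue of the approach the message describes, not the C code from the patch; `JsonLineWriter` and `format_record` are hypothetical names.

```python
import json


class JsonLineWriter:
    """Hypothetical helper: emits a comma before every field but the first."""

    def __init__(self):
        self.first = True
        self.parts = []

    def out(self, key, value):
        # The separator goes in front of a field, so a missing optional
        # field at the end can never leave a trailing comma behind.
        if not self.first:
            self.parts.append(", ")
        self.first = False
        self.parts.append(f"{json.dumps(key)} : {json.dumps(value)}")

    def finish(self):
        return "{" + "".join(self.parts) + "}"


def format_record(counter_value, event, metric_value=None, metric_unit=None):
    writer = JsonLineWriter()
    writer.out("counter-value", counter_value)
    writer.out("event", event)
    # Optional fields are simply skipped when absent.
    if metric_value is not None:
        writer.out("metric-value", metric_value)
    if metric_unit is not None:
        writer.out("metric-unit", metric_unit)
    return writer.finish()


line = format_record("3112.000000", "armv8_pmuv3_1/branch-misses/")
print(line)
```

Because the comma is written before a field rather than after it, the output stays valid JSON whichever optional fields are present.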
|
|
|
|
f9825601aa |
perf stat: Add metric-threshold to json output
When the threshold isn't unknown add a value to the json like:
"metric-threshold" : "good"
A more complete example:
```
$ perf stat -a -j -I 1000
{"interval" : 1.001089747, "counter-value" : "16045.281449", "unit" : "msec", "event" : "cpu-clock", "event-runtime" : 16045355135, "pcnt-running" : 100.00, "metric-value" : "16.045281", "metric-unit" : "CPUs utilized"}
{"interval" : 1.001089747, "counter-value" : "10003.000000", "unit" : "", "event" : "context-switches", "event-runtime" : 16045314844, "pcnt-running" : 100.00, "metric-value" : "623.423156", "metric-unit" : "/sec"}
{"interval" : 1.001089747, "counter-value" : "328.000000", "unit" : "", "event" : "cpu-migrations", "event-runtime" : 16045321403, "pcnt-running" : 100.00, "metric-value" : "20.442147", "metric-unit" : "/sec"}
{"interval" : 1.001089747, "counter-value" : "20114.000000", "unit" : "", "event" : "page-faults", "event-runtime" : 16045355927, "pcnt-running" : 100.00, "metric-value" : "1.253577", "metric-unit" : "K/sec"}
{"interval" : 1.001089747, "counter-value" : "4066679471.000000", "unit" : "", "event" : "instructions", "event-runtime" : 16045369123, "pcnt-running" : 100.00, "metric-value" : "1.628330", "metric-unit" : "insn per cycle"}
{"interval" : 1.001089747, "counter-value" : "2497454658.000000", "unit" : "", "event" : "cycles", "event-runtime" : 16045374810, "pcnt-running" : 100.00, "metric-value" : "0.155650", "metric-unit" : "GHz"}
{"interval" : 1.001089747, "counter-value" : "914974294.000000", "unit" : "", "event" : "branches", "event-runtime" : 16045379877, "pcnt-running" : 100.00, "metric-value" : "57.024509", "metric-unit" : "M/sec"}
{"interval" : 1.001089747, "counter-value" : "9237201.000000", "unit" : "", "event" : "branch-misses", "event-runtime" : 16045375017, "pcnt-running" : 100.00, "metric-value" : "1.009559", "metric-unit" : "of all branches", "metric-threshold" : "good"}
{"interval" : 1.001089747, "event-runtime" : 16045397172, "pcnt-running" : 100.00, "metricgroup" : "TopdownL1"}
{"interval" : 1.001089747, "metric-value" : "22.036686", "metric-unit" : "% tma_backend_bound", "metric-threshold" : "bad"}
{"interval" : 1.001089747, "metric-value" : "7.610161", "metric-unit" : "% tma_bad_speculation", "metric-threshold" : "good"}
{"interval" : 1.001089747, "metric-value" : "36.729687", "metric-unit" : "% tma_frontend_bound", "metric-threshold" : "bad"}
{"interval" : 1.001089747, "metric-value" : "33.623465", "metric-unit" : "% tma_retiring"}
...
```
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20241017175356.783793-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
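A script consuming this output could use the new key to pick out metrics flagged as "bad". A minimal sketch, using two lines copied from the example above; the variable names are illustrative, and the key is read with `.get()` because it is absent when the threshold is unknown.

```python
import json

# Two lines copied from the example output in the commit message.
sample = """\
{"interval" : 1.001089747, "metric-value" : "22.036686", "metric-unit" : "% tma_backend_bound", "metric-threshold" : "bad"}
{"interval" : 1.001089747, "metric-value" : "33.623465", "metric-unit" : "% tma_retiring"}
"""

records = [json.loads(line) for line in sample.splitlines() if line.strip()]

# "metric-threshold" is optional, so do not index it directly.
bad_metrics = [rec["metric-unit"] for rec in records
               if rec.get("metric-threshold") == "bad"]

print(bad_metrics)
```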
|
|
|
|
cbc917a1b0 |
perf stat: Support per-cluster aggregation
Some platforms have a 'cluster' topology, and CPUs in a cluster
share resources like the L3 Cache Tag (for HiSilicon Kunpeng SoC) or the L2
cache (for Intel Jacobsville). Parsing and building the cluster
topology has been supported since [1].
perf stat already supports aggregation for other topologies like
die or socket, etc. It is useful to aggregate per-cluster to find
problems like L3T bandwidth contention.
This patch adds support for the "--per-cluster" option for per-cluster
aggregation. It also updates the docs and the related test. The output
looks like:
[root@localhost tmp]# perf stat -a -e LLC-load --per-cluster -- sleep 5
Performance counter stats for 'system wide':
S56-D0-CLS158 4 1,321,521,570 LLC-load
S56-D0-CLS594 4 794,211,453 LLC-load
S56-D0-CLS1030 4 41,623 LLC-load
S56-D0-CLS1466 4 41,646 LLC-load
S56-D0-CLS1902 4 16,863 LLC-load
S56-D0-CLS2338 4 15,721 LLC-load
S56-D0-CLS2774 4 22,671 LLC-load
[...]
On a legacy system without clusters or cluster support, the output will
look like:
[root@localhost perf]# perf stat -a -e cycles --per-cluster -- sleep 1
Performance counter stats for 'system wide':
S56-D0-CLS0 64 18,011,485 cycles
S7182-D0-CLS0 64 16,548,835 cycles
Note that this patch doesn't mix the cluster information in the outputs
of --per-core to avoid breaking any tools/scripts using it.
Note that perf recently gained "--per-cache" aggregation, but it is not
the same as cluster aggregation, although cluster CPUs may share some cache
resources. For example, on my machine all clusters within a die share the
same L3 cache:
$ cat /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list
0-31
$ cat /sys/devices/system/cpu/cpu0/topology/cluster_cpus_list
0-3
[1] commit
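The difference between the L3 sharing domain and the cluster can be checked by comparing the two sysfs CPU lists quoted in the message. A minimal sketch, assuming the usual comma-and-range list syntax; `parse_cpu_list` is a hypothetical helper, and the strings are the sample values from the message rather than values read from a live system.

```python
def parse_cpu_list(text):
    """Parse a CPU list such as "0-3,8" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


# Sample values from the commit message.
l3_cpus = parse_cpu_list("0-31")      # cache/index3/shared_cpu_list
cluster_cpus = parse_cpu_list("0-3")  # topology/cluster_cpus_list

# The cluster is a strict subset of the CPUs sharing the L3 cache.
print(cluster_cpus < l3_cpus, len(l3_cpus) // len(cluster_cpus))
```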
|
|
|
|
18b687d7ef |
pert tests: Update metric-value for perf stat JSON output
There may be multiplexing triggered, e.g., e-core of ADL.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahmad Yasin <ahmad.yasin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230615135315.3662428-7-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
e259555017 |
pert tests: Support metricgroup perf stat JSON output
A new field metricgroup has been added in the perf stat JSON output.
Support it in the test case.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahmad Yasin <ahmad.yasin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230607162700.3234712-8-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
540c910c65 |
perf test: Fix perf stat JSON output test
The recent --per-cache option test caused a problem. According to the
option name, I think it should check args.per_cache instead of
args.per_cache_instance.
$ sudo ./perf test -v 99
99: perf stat JSON output linter :
--- start ---
test child forked, pid 3086101
Checking json output: no args [Success]
Checking json output: system wide [Success]
Checking json output: interval [Success]
Checking json output: event [Success]
Checking json output: per thread [Success]
Checking json output: per node [Success]
Checking json output: system wide no aggregation [Success]
Checking json output: per core [Success]
Checking json output: per cache_instance Test failed for input:
...
Traceback (most recent call last):
File "linux/tools/perf/tests/shell/lib/perf_json_output_lint.py", line 88, in <module>
elif args.per_core or args.per_socket or args.per_node or args.per_die or args.per_cache_instance:
AttributeError: 'Namespace' object has no attribute 'per_cache_instance'
test child finished with -1
---- end ----
perf stat JSON output linter: FAILED!
Fixes:
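The `AttributeError` in the traceback follows from how `argparse` names attributes: an option `--per-cache` is stored as `per_cache`. A minimal sketch of that behavior; the parser below is a stand-in with one hypothetical option definition, not the linter's actual argument list.

```python
import argparse

# Stand-in parser; the real linter defines more options than this.
parser = argparse.ArgumentParser()
parser.add_argument("--per-cache", action="store_true")

args = parser.parse_args(["--per-cache"])

# argparse derives the attribute name from the option string,
# replacing "-" with "_": "--per-cache" becomes "per_cache".
print(args.per_cache)

# No option was ever registered under this name, so reading
# args.per_cache_instance would raise AttributeError.
print(hasattr(args, "per_cache_instance"))
```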
|
|
|
|
bfce728db3 |
pert tests: Add tests for new "perf stat --per-cache" aggregation option
Add tests for the new "--per-cache" option in 'perf stat' for CSV and
JSON generation as well as for the JSON linting.
Suggested-by: Gautham Shenoy <gautham.shenoy@amd.com>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wen Pu <puwen@hygon.cn>
Link: https://lore.kernel.org/r/20230517172745.5833-6-kprateek.nayak@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
760eafb2a3 |
perf test stat+json_output: Write JSON output to a file
Write the JSON output to a file, then sanity check this output. This
avoids problems with debug/warning/error output corrupting the file
format.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20230408054456.3001367-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
3de34f85bf |
perf test: Avoid counting commas in json linter
Commas may appear in events like:
cpu/INT_MISC.RECOVERY_CYCLES,cmask=1,edge/
which causes the count of commas to see more items than expected. Switch
to counting the entries in the dictionary, which is 1 more than the
number of commas.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Claire Jensen <cjense@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20230223071818.329671-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
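The mismatch between comma counting and dictionary size is easy to reproduce. A minimal sketch; the JSON line is a made-up sample built around the event name quoted in the message, not captured perf output.

```python
import json

# Hypothetical output line; only the event name comes from the message.
line = ('{"counter-value" : "1.000000", "unit" : "", '
        '"event" : "cpu/INT_MISC.RECOVERY_CYCLES,cmask=1,edge/"}')

# Counting commas also counts the two commas inside the event name.
comma_count = line.count(",")

# Counting parsed dictionary entries is not affected by the field contents.
field_count = len(json.loads(line))

print(comma_count, field_count)
```

Here the line has three fields but four commas, so a check based on the comma count would expect five fields.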
|
|
|
0c343af2a2 |
perf test: JSON format checking
Add field checking tests for perf stat JSON output.
Sanity checks the expected number of fields are present, that the
expected keys are present and they have the correct values.
Committer notes:
Had to fix this:
- $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib' \
+ $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \
Committer testing:
[root@quaco ~]# perf test json
90: perf stat JSON output linter : Ok
[root@quaco ~]# set -o vi
[root@quaco ~]# perf test -v json
90: perf stat JSON output linter :
--- start ---
test child forked, pid 560794
Checking json output: no args [Success]
Checking json output: system wide [Success]
Checking json output: system wide Checking json output: system wide no aggregation [Success]
Checking json output: interval [Success]
Checking json output: event [Success]
Checking json output: per core [Success]
Checking json output: per thread [Success]
Checking json output: per die [Success]
Checking json output: per node [Success]
Checking json output: per socket [Success]
test child finished with 0
---- end ----
perf stat JSON output linter: Ok
[root@quaco ~]#
Signed-off-by: Claire Jensen <cjense@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alyssa Ross <hi@alyssa.is>
Cc: Claire Jensen <clairej735@gmail.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220805200105.2020995-3-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
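The kind of field checking described here can be sketched as follows. This is not the linter added by the commit: `KNOWN_KEYS`, `NUMBER_KEYS`, and `check_line` are hypothetical, and the key names are collected from JSON samples elsewhere on this page.

```python
import json

# Hypothetical key set, gathered from the JSON samples on this page.
KNOWN_KEYS = {
    "interval", "counter-value", "unit", "event", "event-runtime",
    "pcnt-running", "metric-value", "metric-unit", "metricgroup",
    "metric-threshold",
}
# Keys that appear as bare JSON numbers in those samples.
NUMBER_KEYS = {"interval", "event-runtime", "pcnt-running"}


def check_line(line):
    """Parse one output line and reject unknown keys or wrong value types."""
    record = json.loads(line)
    unknown = set(record) - KNOWN_KEYS
    if unknown:
        raise ValueError(f"unexpected keys: {sorted(unknown)}")
    for key in NUMBER_KEYS & set(record):
        if not isinstance(record[key], (int, float)):
            raise ValueError(f"{key} should be a number")
    return record


good = ('{"counter-value" : "3112.000000", "unit" : "", '
        '"event" : "cycles", "event-runtime" : 20699340, '
        '"pcnt-running" : 100.00}')
print(len(check_line(good)))
```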