mirror of https://github.com/torvalds/linux.git
142 Commits
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
0197da7aff |
perf pmu: Lazily compute default config
The default config is computed during creation of the PMU and may do things like scanning sysfs, when the PMU may just be used as part of scanning. Change default_config to perf_event_attr_init_default, a callback that is used when a default config needs initializing. This avoids holding onto the memory for a perf_event_attr and copying. On a tigerlake laptop running the pmu-scan benchmark: Before: Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 28.780 usec (+- 0.503 usec) Average PMU scanning took: 283.480 usec (+- 18.471 usec) Number of openat syscalls: 30,227 After: Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 27.880 usec (+- 0.169 usec) Average PMU scanning took: 245.260 usec (+- 15.758 usec) Number of openat syscalls: 28,914 Over 3 runs it is a nearly 12% reduction in execution time and a 4.3% of openat calls. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
|
|
|
63883cb063 |
perf pmu: Const-ify perf_pmu__config_terms
Add const to related APIs, this is so they can be used to default initialize a perf_event_attr from a const pmu. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
|
|
|
3a42f4c796 |
perf pmu: Const-ify file APIs
File APIs don't alter the struct pmu so allow const ones to be passed. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
|
|
|
aa61360155 |
perf pmu: Rename perf_pmu__get_default_config to perf_pmu__arch_init
Assign default_config as part of the init. perf_pmu__get_default_config was doing more than just getting the default config and so this is intended to better align with the code. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
|
|
|
b1f05622fe |
perf pmus: Make PMU alias name loading lazy
PMU alias names were computed when the first perf_pmu is created, scanning all PMUs in event sources for a file called alias that generally doesn't exist. Switch to trying to load the file when all PMU related files are loaded in lookup. This would cause a PMU name lookup of an alias name to fail if no PMUs were loaded, so in that case all PMUs are loaded and the find repeated. The overhead is similar but in the (very) general case not all PMUs are scanned for the alias file. As the overhead occurs once per invocation it doesn't show in perf bench internals pmu-scan. On a tigerlake machine, the number of openat system calls for an event of cpu/cycles/ with perf stat reduces from 94 to 69 (ie 25 fewer openat calls). Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20230925062323.840799-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
|
|
|
2879ff36f5 |
perf pmu: "Compat" supports regular expression matching identifiers
The jevent "Compat" is used for uncore PMU alias or metric definitions. The same PMU driver has different PMU identifiers due to different hardware versions and types, but they may have some common PMU event. Since a Compat value can only match one identifier, when adding the same event alias to PMUs with different identifiers, each identifier needs to be defined once, which is not streamlined enough. So let "Compat" support using regular expression to match identifiers for uncore PMU alias. For example, if the "Compat" value is set to "43401|43c01", it would be able to match PMU identifiers such as "43401" or "43c01", which correspond to CMN600_r0p0 or CMN700_r0p0. Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com> Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Shuai Xue <xueshuai@linux.alibaba.com> Cc: Zhuo Song <zhuo.song@linux.alibaba.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-doc@vger.kernel.org Link: https://lore.kernel.org/r/1695794391-34817-2-git-send-email-renyu.zj@linux.alibaba.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
|
|
|
70360fad91 |
perf pmu: Remove unused function
pmu_events_table__find() is no longer used so remove it and its Arm specific version. Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230913153355.138331-4-james.clark@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
|
|
|
3d0f5f456a |
perf pmu: Move pmu__find_core_pmu() to pmus.c
pmu__find_core_pmu() more logically belongs in pmus.c because it iterates over all PMUs, so move it to pmus.c At the same time rename it to perf_pmus__find_core_pmu() to match the naming convention in this file. list_prepare_entry() can't be used in perf_pmus__scan_core() anymore now that it's called from the same compilation unit. This is with -O2 (specifically -O1 -ftree-vrp -finline-functions -finline-small-functions) which allow the bounds of the array access to be determined at compile time. list_prepare_entry() subtracts the offset of the 'list' member in struct perf_pmu from &core_pmus, which isn't a struct perf_pmu. The compiler sees that pmu results in &core_pmus - 8 and refuses to compile. At runtime this works because list_for_each_entry_continue() always adds the offset back again before dereferencing ->next, but it's technically undefined behavior. With -fsanitize=undefined an additional warning is generated. Using list_first_entry_or_null() to get the first entry here avoids doing &core_pmus - 8 but has the same result and fixes both the compile warning and the undefined behavior warning. There are other uses of list_prepare_entry() in pmus.c, but the compiler doesn't seem to be able to see that they can also be called with &core_pmus, so I won't change any at this time. Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230913153355.138331-2-james.clark@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
|
|
|
0d3f0e6f94 |
perf parse-events: Introduce 'struct parse_events_terms'
parse_events_terms() existed in function names but was passed a 'struct list_head'. As many parse_events functions take an evsel_config list as well as a parse_event_term list, and the naming head_terms and head_config is inconsistent, there's a potential to switch the lists and get errors. Introduce a 'struct parse_events_terms', that just wraps a list_head, to avoid this. Add the regular init/exit functions and transition the code to use them. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230901233949.2930562-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
cd4e1efbbc |
perf pmus: Skip duplicate PMUs and don't print list suffix by default
Add a PMUs scan that ignores duplicates. When there are multiple PMUs
that differ only by suffix, by default just list the first one and
skip all others. The scan routine checks that the PMU names match but
doesn't enforce that the numbers are consecutive as for some PMUs
there are gaps. If "-v" is passed to "perf list" then list all PMUs.
With the previous change duplicate PMUs are no longer printed but the
suffix of the first is printed. When duplicate PMUs are being skipped
avoid printing the suffix.
Before:
$ perf list
...
uncore_imc_free_running_0/data_read/ [Kernel PMU event]
uncore_imc_free_running_0/data_total/ [Kernel PMU event]
uncore_imc_free_running_0/data_write/ [Kernel PMU event]
uncore_imc_free_running_1/data_read/ [Kernel PMU event]
uncore_imc_free_running_1/data_total/ [Kernel PMU event]
uncore_imc_free_running_1/data_write/ [Kernel PMU event]
After:
$ perf list
...
uncore_imc_free_running/data_read/ [Kernel PMU event]
uncore_imc_free_running/data_total/ [Kernel PMU event]
uncore_imc_free_running/data_write/ [Kernel PMU event]
...
$ perf list -v
uncore_imc_free_running_0/data_read/ [Kernel PMU event]
uncore_imc_free_running_0/data_total/ [Kernel PMU event]
uncore_imc_free_running_0/data_write/ [Kernel PMU event]
uncore_imc_free_running_1/data_read/ [Kernel PMU event]
uncore_imc_free_running_1/data_total/ [Kernel PMU event]
uncore_imc_free_running_1/data_write/ [Kernel PMU event]
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230825135237.921058-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
b7823045ec |
perf pmu: Make id const and add missing free
The struct pmu id is initialized from pmu_id that is read into allocated memory from a file, as such it needs free-ing in pmu__delete(). Make the id value const so that we can remove casts in tests. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Wei Li <liwei391@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230825024002.801955-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
c091ee9089 |
perf pmu: Remove logic for PMU name being NULL
The PMU name could be NULL in the case of the fake_pmu. Initialize the name for the fake_pmu to "fake" so that all other logic can assume it is initialized. Add a const to the type of name so that a literal can be used to avoid additional initialization code. Propagate the cost through related routines and remove now unnecessary "(char *)" casts. Doing this located a bug in builtin-list for the pmu_glob that was missing a strdup. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20230825024002.801955-3-irogers@google.com Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Wei Li <liwei391@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Ming Wang <wangming01@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
8d4b6d37ea |
perf pmu: Lazily load sysfs aliases
Don't load sysfs aliases for a PMU when the PMU is first created, defer until an alias needs to be found. For the pmu-scan benchmark, average core PMU scanning is reduced by 30.8%, and average PMU scanning by 12.6%. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-17-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
e6ff1eed35 |
perf pmu: Lazily add JSON events
Rather than scanning all JSON events and adding them when a PMU is created, add the alias when the JSON event is needed. Average core PMU scanning run time reduced by 60.2%. Average PMU scanning run time reduced by 15%. Page faults with no events reduced by 74 page faults, 4% of total. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
7c52f10c0d |
perf pmu: Cache JSON events table
Cache the JSON events table so that finding it isn't done per event/alias. Change the events table find so that when the PMU is given, if the PMU has no JSON events return null. Update usage to always use the PMU variable. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
e3edd6cf63 |
perf pmu-events: Reduce processed events by passing PMU
Pass the PMU to pmu_events_table__for_each_event so that entries that don't match don't need to be processed by callback. If a NULL PMU is passed then all PMUs are processed. 'perf bench internals pmu-scan's "Average PMU scanning" performance is reduced by about 5% on an Intel tigerlake. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
9d31cb9395 |
perf parse-events: Improve error message for double setting
Double setting information for an event would produce an error message
associated with the PMU rather than the term that was double setting.
Improve the error message to be on the term.
Before:
$ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true
event syntax error: 'cpu/inst_retired.any,inst_retired.any/'
\___ Bad event or PMU
Unabled to find PMU or event on a PMU of 'cpu'
Run 'perf list' for a list of valid events
$
After:
$ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true
event syntax error: '..etired.any,inst_retired.any/'
\___ Bad event or PMU
Unabled to find PMU or event on a PMU of 'cpu'
Initial error:
event syntax error: '..etired.any,inst_retired.any/'
\___ Attempt to set event's scale twice
Run 'perf list' for a list of valid events
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
c3245d2093 |
perf pmu: Abstract alias/event struct
In order to be able to lazily compute aliases/events for a PMU, move the struct perf_pmu_alias into pmu.c. Add perf_pmu__find_event and perf_pmu__for_each_event that take a callback that is called for the found event or for each event. The layout of struct pmu and the event/alias list is unchanged but the API is altered so that aliases are no longer directly accessed, allowing for later changes. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
5040264121 |
perf pmu: Make the loading of formats lazy
The sysfs format files are loaded eagerly in a PMU. Add a flag so that
we create the format but only load the contents when necessary.
Reduce the size of the value in struct perf_pmu_format and avoid holes
so there is no additional space requirement.
For "perf stat -e cycles true" this reduces the number of openat calls
from 648 to 573 (about 12%). The benchmark pmu scan speed is improved
by roughly 5%.
Before:
$ perf bench internals pmu-scan
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 1061.100 usec (+- 9.965 usec)
Average PMU scanning took: 4725.300 usec (+- 260.599 usec)
After:
$ perf bench internals pmu-scan
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 989.170 usec (+- 6.873 usec)
Average PMU scanning took: 4520.960 usec (+- 251.272 usec)
Committer testing:
On a AMD Ryzen 5950x:
Before:
$ perf bench internals pmu-scan -i1000
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 563.466 usec (+- 1.008 usec)
Average PMU scanning took: 1619.174 usec (+- 23.627 usec)
$ perf stat -r5 perf bench internals pmu-scan -i1000
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 583.401 usec (+- 2.098 usec)
Average PMU scanning took: 1677.352 usec (+- 24.636 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 553.254 usec (+- 0.825 usec)
Average PMU scanning took: 1635.655 usec (+- 24.312 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 557.733 usec (+- 0.980 usec)
Average PMU scanning took: 1600.659 usec (+- 23.344 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 554.906 usec (+- 0.774 usec)
Average PMU scanning took: 1595.338 usec (+- 23.288 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 551.798 usec (+- 0.967 usec)
Average PMU scanning took: 1623.213 usec (+- 23.998 usec)
Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs):
3276.82 msec task-clock:u # 0.990 CPUs utilized ( +- 0.82% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
1008 page-faults:u # 307.615 /sec ( +- 0.04% )
12049614778 cycles:u # 3.677 GHz ( +- 0.07% ) (83.34%)
117507478 stalled-cycles-frontend:u # 0.98% frontend cycles idle ( +- 0.33% ) (83.32%)
27106761 stalled-cycles-backend:u # 0.22% backend cycles idle ( +- 9.55% ) (83.36%)
33294953848 instructions:u # 2.76 insn per cycle
# 0.00 stalled cycles per insn ( +- 0.03% ) (83.31%)
6849825049 branches:u # 2.090 G/sec ( +- 0.03% ) (83.37%)
71533903 branch-misses:u # 1.04% of all branches ( +- 0.20% ) (83.30%)
3.3088 +- 0.0302 seconds time elapsed ( +- 0.91% )
$
After:
$ perf stat -r5 perf bench internals pmu-scan -i1000
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 550.702 usec (+- 0.958 usec)
Average PMU scanning took: 1566.577 usec (+- 22.747 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 548.315 usec (+- 0.555 usec)
Average PMU scanning took: 1565.499 usec (+- 22.760 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 548.073 usec (+- 0.555 usec)
Average PMU scanning took: 1586.097 usec (+- 23.299 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 561.184 usec (+- 2.709 usec)
Average PMU scanning took: 1567.153 usec (+- 22.548 usec)
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 1000 times
Average core PMU scanning took: 546.987 usec (+- 0.553 usec)
Average PMU scanning took: 1562.814 usec (+- 22.729 usec)
Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs):
3170.86 msec task-clock:u # 0.992 CPUs utilized ( +- 0.22% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
1010 page-faults:u # 318.526 /sec ( +- 0.04% )
11890047674 cycles:u # 3.750 GHz ( +- 0.14% ) (83.27%)
119090499 stalled-cycles-frontend:u # 1.00% frontend cycles idle ( +- 0.46% ) (83.40%)
32502449 stalled-cycles-backend:u # 0.27% backend cycles idle ( +- 8.32% ) (83.30%)
33119141261 instructions:u # 2.79 insn per cycle
# 0.00 stalled cycles per insn ( +- 0.01% ) (83.37%)
6812816561 branches:u # 2.149 G/sec ( +- 0.01% ) (83.29%)
70157855 branch-misses:u # 1.03% of all branches ( +- 0.28% ) (83.38%)
3.19710 +- 0.00826 seconds time elapsed ( +- 0.26% )
$
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
838a8c5f40 |
perf pmu: Pass PMU rather than aliases and format
Pass the pmu so the aliases and format list can be better abstracted and later lazily loaded. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230823080828.1460376-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
da6a5afda5 |
perf pmu: Avoid passing format list to perf_pmu__format_bits()
Pass the PMU so the format list can be better abstracted and later lazily loaded. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230823080828.1460376-8-irogers@google.com [ Did missing conversions in tools/perf/arch/arm*/util/cs-etm.c ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
7eb5473314 |
perf pmu: Avoid passing format list to perf_pmu__format_type
Pass the pmu so the format list can be better abstracted and later lazily loaded. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230823080828.1460376-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
804fee5d0f |
perf pmu: Avoid passing format list to perf_pmu__config_terms()
Abstract the format list better, hiding it in the PMU, by changing perf_pmu__config_terms() the PMU rather than the format list in the PMU. Change the PMU test to pass a dummy PMU for this purpose. Changing the test allows perf_pmu__del_formats() to become static. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230823080828.1460376-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
6f2f6eafcd |
perf pmu: Reduce scope of perf_pmu_error()
Move declaration from header file to pmu.y and make static. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230823080828.1460376-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
cc5adb7347 |
perf pmu: Move perf_pmu__set_format to pmu.y
Avoid having the function in the C and header file, as it is only used locally by pmu.y. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230823080828.1460376-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
9d5da30e4a |
perf jevents: Add a new expression builtin strcmp_cpuid_str()
This will allow writing formulas that are conditional on a specific
CPU type or CPU version. It calls through to the existing
strcmp_cpuid_str() function in Perf which has a default weak version,
and an arch specific version for x86 and arm64.
The function takes an 'ID' type value, which is a string. But in this
case Arm CPU IDs are hex numbers prefixed with '0x'. metric.py
assumes strings are only used by event names, and that they can't start
with a number ('0'), so an additional change has to be made to the
regex to convert hex numbers back to 'ID' types.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Sohom Datta <sohomdatta1@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230816114841.1679234-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
07d2b820fd |
perf test parse-events: Test complex name has required event format
test__checkevent_complex_name will use an "event" format which if not
present, such as with a placeholder PMU, will cause test failures. Skip
the test in this case to avoid failures in restricted environments.
Add perf_pmu__has_format utility as a general PMU utility.
Fixes:
|
|
|
|
628eaa4e87 |
perf pmus: Add placeholder core PMU
If loading a core PMU fails, legacy hardware/cache events may segv due to there being no PMU. Create a placeholder empty PMU for this case. This was discussed in: https://lore.kernel.org/lkml/20230614151625.2077-1-yangjihong1@huawei.com/ Reported-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Yang Jihong <yangjihong1@huawei.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230627182834.117565-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
|
|
|
b9f010328c |
perf pmu: Warn about invalid config for all PMUs and configs
Don't just check the raw PMU type, the only core PMU on homogeneous x86, check raw and all dynamically added PMUs. Extend the perf_pmu__warn_invalid_config to check all 4 config values. Rather than process the format list once per event, store the computed masks for each config value. Don't ignore the mask being zero, which is likely for config2 and config3, add config_masks_present so config values can be ignored only when no format information is present. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230601023644.587584-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
68c2504341 |
perf pmu: Only warn about unsupported formats once
Avoid scanning format list for each event parsed. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230601023644.587584-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
251aa04024 |
perf parse-events: Wildcard most "numeric" events
Numeric events are either raw events or those with ABI defined numbers
matched by the lexer. PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE events
should wildcard match on hybrid systems. So "cycles" should match each
PMU type with an extended type, not just PERF_TYPE_HARDWARE.
Change wildcard matching to add the event even if wildcard PMU
scanning fails, there will be no extended type but this best matches
previous behavior.
Only set the extended type when the event type supports it and when
perf_pmus__supports_extended_type is true. This new function returns
true if >1 core PMU and avoids potential errors on older kernels.
Modify evsel__compute_group_pmu_name using a helper
perf_pmu__is_software to determine when grouping should occur. Try to
use PMUs, and evsel__find_pmu, as being more dependable than
evsel->pmu_name.
Set a parse events error if a hardware term's PMU lookup fails, to
provide extra diagnostics.
Fixes:
|
|
|
|
6b9da26070 |
perf pmu: Remove is_pmu_hybrid
Users have been removed or switched to using pmu->is_core with perf_pmus__num_core_pmus() > 1. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-35-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
9d6a1df9b2 |
perf pmus: Allow just core PMU scanning
Scanning all PMUs is expensive as all PMUs sysfs entries are loaded, benchmarking shows more than 4x the cost: ``` $ perf bench internals pmu-scan -i 1000 Computing performance of sysfs PMU event scan for 1000 times Average core PMU scanning took: 989.231 usec (+- 1.535 usec) Average PMU scanning took: 4309.425 usec (+- 74.322 usec) ``` Add new perf_pmus__scan_core routine that scans just core PMUs. Replace perf_pmus__scan calls with perf_pmus__scan_core when non-core PMUs are being ignored. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-30-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
1eaf496ed3 |
perf pmu: Separate pmu and pmus
Separate and hide the pmus list in pmus.[ch]. Move pmus functionality out of pmu.[ch] into pmus.[ch] renaming pmus functions which were prefixed perf_pmu__ to perf_pmus__. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-28-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
597a4276fb |
perf pmu: Remove perf_pmu__hybrid_pmus list
Rather than iterate hybrid PMUs, inhererently Intel specific, iterate all PMUs checking whether they are core. To only get hybrid cores, first call perf_pmu__has_hybrid. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-25-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
abe9544ea7 |
perf mem: Avoid hybrid PMU list
Add perf_pmu__num_mem_pmus that scans/counts the number of PMUs for mem events. Switch perf_pmu__for_each_hybrid_pmu to iterating all PMUs and only handling is_core ones. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-24-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
5ac7263448 |
perf tools: Warn if no user requested CPUs match PMU's CPUs
In commit
|
|
|
|
e20d1f2fa2 |
perf pmu: Add is_core to pmu
Cache is_pmu_core in the pmu to avoid recomputation. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
4bf7e81aad |
perf pmu: Detect ARM and hybrid PMUs with sysfs
is_arm_pmu_core detects a core PMU via the presence of a "cpus" file rather than a "cpumask" file. This pattern holds for hybrid PMUs so rename the function and remove redundant perf_pmu__is_hybrid tests. Add a new helper is_pmu_hybrid similar to is_pmu_core. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
52c7b4d3f9 |
perf parse-events: Don't auto merge hybrid wildcard events
Bring back the behavior of not auto-merging hybrid events by delegating to a test in pmu. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20230502223851.2234828-37-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
8bc75f699c |
perf parse-events: Support wildcards on raw events
Legacy raw events like r1a open as PERF_TYPE_RAW on non-hybrid systems and on each hybrid PMU on hybrid systems. Rather than iterate hybrid PMUs add a perf_pmu__supports_wildcard_numeric function that says when a numeric event should be opened upon it. If the parsed event specifies the type of the PMU then don't wildcard match PMUs, use the specified PMU type. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20230502223851.2234828-27-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
6fd1e51915 |
perf parse-events: Support PMUs for legacy cache events
Allow a legacy cache event to be both, for example, "L1-dcache-load-miss" and "cpu/L1-dcache-load-miss/" by introducing a new legacy cache term type. The term type is processed in config_term_pmu, setting both the type in perf_event_attr and the config. The code to determine the config is factored out of parse_events_add_cache and shared. If the PMU doesn't support legacy events, currently just core/hybrid PMUs do, then the term is treated like a PE_NAME term - as before. If only terms are being parsed, such as for perf_pmu__new_alias, then the PE_LEGACY_CACHE token is always parsed as PE_NAME. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahmad Yasin <ahmad.yasin@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20230502223851.2234828-24-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
240e6fd0a9 |
perf pmu: Improve name/comments, avoid a memory allocation
Improve documentation around perf_pmu_alias pmu_name and on functions. Reduce the scope of pmu_uncore_alias_match to just file. Rename perf_pmu__valid_suffix to the more revealing perf_pmu__match_ignoring_suffix. Add a short-cut to perf_pmu__match_ignoring_suffix for PMU names that don't also have a socket value, and can therefore avoid a memory allocation. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20230406235256.2768773-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
3d88aec0d4 |
perf pmu: Make parser reentrant
By default bison uses global state for compatibility with yacc. Make the parser reentrant so that it may be used in asynchronous and multithreaded situations. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20230406065224.2553640-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
3a69672e88 |
perf pmu: Add perf_pmu__{open,scan}_file_at()
These two helpers will also use openat() to reduce the overhead with
relative pathnames. Convert other functions in pmu_lookup() to use
the new helpers.
Committer testing:
Before:
⬢[acme@toolbox perf-tools-next]$ perf bench internals pmu-scan
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average PMU scanning took: 2729.040 usec (+- 7.117 usec)
⬢[acme@toolbox perf-tools-next]$
After:
⬢[acme@toolbox perf-tools-next]$ perf bench internals pmu-scan
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average PMU scanning took: 2419.870 usec (+- 9.057 usec)
⬢[acme@toolbox perf-tools-next]$
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
e293a5e816 |
perf pmu: Use relative path for sysfs scan
The PMU information is in the kernel sysfs so it needs to scan the
directory to get the whole information like event aliases, formats and
so on. During the traversal, it opens a lot of files and directories
like below:
dir = opendir("/sys/bus/event_source/devices");
while (dentry = readdir(dir)) {
char buf[PATH_MAX];
snprintf(buf, sizeof(buf), "%s/%s",
"/sys/bus/event_source/devices", dentry->d_name);
fd = open(buf, O_RDONLY);
...
}
But this is not good since it needs to copy the string to build the
absolute pathname, and it makes redundant pathname walk (from the /sys)
unnecessarily. We can use openat(2) to open the file in the given
directory. While it's not a problem ususally, it can be a problem when
the kernel has contentions on the sysfs.
Add a couple of new helper to return the file descriptor of PMU
directory so that it can use it with relative paths.
* perf_pmu__event_source_devices_fd()
- returns a fd for the PMU root ("/sys/bus/event_source/devices")
* perf_pmu__pathname_fd()
- returns a fd for "<pmu>/<file>" under the PMU root
Now the above code can be converted something like below:
dirfd = perf_pmu__event_source_devices_fd();
dir = fdopendir(dirfd);
while (dentry = readdir(dir)) {
fd = openat(dirfd, dentry->d_name, O_RDONLY);
...
}
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
eec1131091 |
perf pmu: Add perf_pmu__destroy() function
It seems there's no function to delete the perf pmu struct. Add the perf_pmu__destroy() to do the job. While at it, add some more helper functions to delete pmu aliases and caps. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
204e7c499f |
perf tools: Add support for perf_event_attr::config3
perf_event_attr has gained a new field, config3, so add support for it extending the existing configN support. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220914-arm-perf-tool-spe1-2-v2-v5-2-2cf5210b2f77@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
f8ea2c1524 |
perf pmu-events: Introduce pmu_metrics_table
Add a metrics table that is just a cast from pmu_events_table. This changes the APIs so that event and metric usage of the underlying table is different. For the no jevents case the tables are already separate, later changes will separate the tables for the jevents case. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230126233645.200509-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
|
|
d9dc8874d6 |
perf pmu-events: Remove now unused event and metric variables
Previous changes separated the uses of pmu_event and pmu_metric,
however, both structures contained all the variables of event and
metric. This change removes the event variables from metric and the
metric variables from event.
Note, this change removes the setting of evsel's metric_name/expr as
these fields are no longer part of struct pmu_event. The metric
remains but is no longer implicitly requested when the event is. This
impacts a few Intel uncore events, however, as the ScaleUnit is shared
by the event and the metric this utility is questionable. Also the
MetricNames look broken (contain spaces) in some cases and when trying
to use the functionality with '-e' the metrics fail but regular
metrics with '-M' work. For example, on SkylakeX '-M' works:
```
$ perf stat -M LLC_MISSES.PCIE_WRITE -a sleep 1
Performance counter stats for 'system wide':
0 UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 # 57896.0 Bytes LLC_MISSES.PCIE_WRITE (49.84%)
7,174 UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 (49.85%)
0 UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 (50.16%)
63 UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 (50.15%)
1.004576381 seconds time elapsed
```
whilst the event '-e' version is broken even with --group/-g (fwiw, we should also remove -g [1]):
```
$ perf stat -g -e LLC_MISSES.PCIE_WRITE -g -a sleep 1
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Performance counter stats for 'system wide':
27,316 Bytes LLC_MISSES.PCIE_WRITE
1.004505469 seconds time elapsed
```
The code also carries warnings where the user is supposed to select
events for metrics [2] but given the lack of use of such a feature,
let's clean the code and just remove.
[1] https://lore.kernel.org/lkml/20220707195610.303254-1-irogers@google.com/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/stat-shadow.c?id=01b8957b738f42f96a130079bc951b3cc78c5b8a#n425
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|