Merge tag 'trace-tools-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull rtla trace tooling updates from Steven Rostedt:
- Officially add Tomas Glozar as a maintainer to RTLA tool
- Add for_each_monitored_cpu() helper
In multiple places, RTLA tools iterate over the list of CPUs running
tracer threads.
Use single helper instead of repeating the for/if combination.
- Remove unused variable option_index in argument parsing
RTLA tools use getopt_long() for argument parsing. For its last
argument, an unused variable "option_index" is passed.
Remove the variable and pass NULL to getopt_long() to shorten the
naturally long parsing functions, and make them more readable.
- Fix unassigned nr_cpus after code consolidation
In recent code consolidation, timerlat tool cleanup, previously
implemented separately for each tool, was moved to a common function
timerlat_free().
The cleanup relies on nr_cpus being set. This was not done in the new
function, leaving the variable uninitialized.
Initialize the variable properly, and remove silencing of compiler
warning for uninitialized variables.
- Stop tracing on user latency in BPF mode
Despite the name, rtla-timerlat's -T/--thread option sets timerlat's
stop_tracing_total_us option, which also stops tracing on
return-from-user latency, not only on thread latency.
Implement the same behavior also in BPF sample collection stop
tracing handler to avoid a discrepancy and restore correspondence of
behavior with the equivalent option of cyclictest.
- Fix threshold actions always triggering
A bug in threshold action logic caused the action to execute even if
tracing did not stop because of threshold.
Fix the logic so that the action runs only when tracing stopped
because of the threshold.
- Fix a few minor issues in tests
Extend tests that were shown to need it to 5s, fix osnoise test
calling timerlat by mistake, and use new, more reliable output
checking in timerlat's "top stop at failed action" test.
- Do not print usage on argument parsing error
RTLA prints the entire usage message on encountering errors in
argument parsing, like a malformed CPU list.
The usage message has gotten too long. Instead of printing it, use
newly added fatal() helper function to simply exit with the error
message, excluding the usage.
- Fix unintuitive -C/--cgroup interface
"-C cgroup" and "--cgroup cgroup" are invalid syntax, despite that
being a common way to specify an option with argument. Moreover,
using them fails silently and no cgroup is set.
Create new helper function to unify the handling of all such options
and allow all of:
-Xsomething
-X=something
-X something
as well as the equivalent for the long option.
- Fix -a overriding -t argument filename
Fix a bug where -a following -t custom_file.txt overrides the custom
filename with the default timerlat_trace.txt.
- Stop tracing correctly on multiple events at once
In some race scenarios, RTLA BPF sample collection might send
multiple stop tracing events via the BPF ringbuffer at once.
Compare the number of events for != 0 instead of == 1 to cover for
this scenario and stop tracing properly.
* tag 'trace-tools-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rtla/timerlat: Exit top main loop on any non-zero wait_retval
rtla/tests: Don't rely on matching ^1ALL
rtla: Fix -a overriding -t argument
rtla: Fix -C/--cgroup interface
tools/rtla: Replace osnoise_hist_usage("...") with fatal("...")
tools/rtla: Replace osnoise_top_usage("...") with fatal("...")
tools/rtla: Replace timerlat_hist_usage("...") with fatal("...")
tools/rtla: Replace timerlat_top_usage("...") with fatal("...")
tools/rtla: Add fatal() and replace error handling pattern
rtla/tests: Fix osnoise test calling timerlat
rtla/tests: Extend action tests to 5s
tools/rtla: Fix --on-threshold always triggering
rtla/timerlat_bpf: Stop tracing on user latency
tools/rtla: Fix unassigned nr_cpus
tools/rtla: Remove unused optional option_index
tools/rtla: Add for_each_monitored_cpu() helper
MAINTAINERS: Add Tomas Glozar as a maintainer to RTLA tool
Comparing to exactly 1 will fail if more than one ring buffer
event was seen since the last call to timerlat_bpf_wait(), which
can happen in some race scenarios.
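The difference between the two comparisons can be sketched in plain C.
The function names and the parameter below are hypothetical; the sketch
only assumes that the wait call reports how many ring buffer events were
seen since the previous call.

```c
#include <stdbool.h>

/* Old check: misses the case where two or more events arrived at once. */
static bool should_stop_old(int wait_retval)
{
	return wait_retval == 1;
}

/* New check: any non-zero return value ends the main loop. */
static bool should_stop_new(int wait_retval)
{
	return wait_retval != 0;
}
```

With a return value of 2, only the second variant leaves the loop.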
Signed-off-by: Crystal Wood <crwood@redhat.com>
Link: https://lore.kernel.org/r/20251112152529.956778-5-crwood@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
The timerlat "top stop at failed action" test was relying on "ALL" being
printed immediately after the "1" from the threshold action. Besides being
fragile, this depends on stdbuf behavior, which is easy to miss when
recreating the test outside of the framework for debugging purposes.
Instead, use the expected/unexpected text mechanism from the
corresponding osnoise test.
Signed-off-by: Crystal Wood <crwood@redhat.com>
Link: https://lore.kernel.org/r/20251112152529.956778-2-crwood@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
When running rtla as
`rtla <timerlat|osnoise> <top|hist> -t custom_file.txt -a 100`
the -a option overrides the trace output filename specified by the -t option.
Running the command above will create <timerlat|osnoise>_trace.txt file
instead of custom_file.txt. Fix this by making sure that -a option does
not override trace output filename even if it's passed after trace
output filename is specified.
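One way to get this behavior is to let -a fill in the default filename
only when none has been chosen yet. The following is a minimal sketch of
that idea, not the actual patch; the struct and both handler names are
hypothetical.

```c
#include <stddef.h>

/* Hypothetical, simplified parameter struct. */
struct params_sketch {
	const char *trace_output;	/* NULL until a filename is chosen */
};

/* -t FILE: always takes the user-supplied filename. */
static void handle_trace_opt(struct params_sketch *p, const char *file)
{
	p->trace_output = file;
}

/* -a: fall back to the default only if no filename was set. */
static void handle_auto_opt(struct params_sketch *p, const char *default_file)
{
	if (!p->trace_output)
		p->trace_output = default_file;
}
```

With this ordering-independent check, "-t custom_file.txt -a 100" keeps
custom_file.txt, while "-a 100" alone still uses the default name.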
Fixes: 173a3b0148 ("rtla/timerlat: Add the automatic trace option")
Signed-off-by: Ivan Pravdin <ipravdin.official@gmail.com>
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/r/b6ae60424050b2c1c8709e18759adead6012b971.1762186418.git.ipravdin.official@gmail.com
[ use capital letter in subject, as required by tracing subsystem ]
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Currently, the user can only specify a cgroup for the tracer's threads
in the following ways:
`-C[cgroup]`
`-C[=cgroup]`
`--cgroup[=cgroup]`
If user tries to specify cgroup as `-C [cgroup]` or `--cgroup [cgroup]`,
the parser silently fails and rtla's cgroup is used for the tracer
threads.
To make the interface more user-friendly, also allow the user to specify
the cgroup as `-C [cgroup]` and `--cgroup [cgroup]`.
Refactor identical logic between -t/--trace and -C/--cgroup into a
common function.
Change documentation to reflect this user interface change.
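The common handling can be sketched as a small helper for options
declared with an optional argument. This is an illustration under
assumptions, not the function added by the patch: the name
optional_arg_sketch() and its signature are hypothetical, and the
getopt state (optarg, optind) is passed in explicitly so the logic can
be exercised in isolation.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the argument of an option declared with
 * optional_argument, accepting "-Xval", "-X=val" and "-X val".
 * optarg_in is getopt's optarg; *optind_io is getopt's optind.
 */
static const char *optional_arg_sketch(const char *optarg_in, int argc,
				       char **argv, int *optind_io)
{
	if (optarg_in) {
		/* "-Xval" or "-X=val": skip a leading '='. */
		if (optarg_in[0] == '=')
			return optarg_in + 1;
		return optarg_in;
	}

	/* "-X val": take the next word unless it looks like an option. */
	if (*optind_io < argc && argv[*optind_io][0] != '-')
		return argv[(*optind_io)++];

	return NULL;	/* option given without an argument */
}
```

Advancing optind when the next word is consumed keeps getopt from
treating that word as a positional argument afterwards.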
Fixes: a957cbc025 ("rtla: Add -C cgroup support")
Signed-off-by: Ivan Pravdin <ipravdin.official@gmail.com>
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/r/16132f1565cf5142b5fbd179975be370b529ced7.1762186418.git.ipravdin.official@gmail.com
[ use capital letter in subject, as required by tracing subsystem ]
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
A long time ago, when the usage help was short, it was a favor
to the user to show it on error. Now that the usage help has
become very long, it is too noisy to dump the complete help text
for each typo after the error message itself.
Replace osnoise_hist_usage("...") with fatal("...") on errors.
Remove the already unused 'usage' argument from osnoise_hist_usage().
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/r/20251011082738.173670-6-costa.shul@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
A long time ago, when the usage help was short, it was a favor
to the user to show it on error. Now that the usage help has
become very long, it is too noisy to dump the complete help text
for each typo after the error message itself.
Replace osnoise_top_usage("...") with fatal("...") on errors.
Remove the already unused 'usage' argument from osnoise_top_usage().
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/r/20251011082738.173670-5-costa.shul@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
A long time ago, when the usage help was short, it was a favor
to the user to show it on error. Now that the usage help has
become very long, it is too noisy to dump the complete help text
for each typo after the error message itself.
Replace timerlat_hist_usage("...\n") with fatal("...") on errors.
Remove the already unused 'usage' argument from timerlat_hist_usage().
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/r/20251011082738.173670-4-costa.shul@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
A long time ago, when the usage help was short, it was a favor
to the user to show it on error. Now that the usage help has
become very long, it is too noisy to dump the complete help text
for each typo after the error message itself.
Replace timerlat_top_usage("...\n") with fatal("...") on errors.
Remove the already unused 'usage' argument from timerlat_top_usage().
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/r/20251011082738.173670-3-costa.shul@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
The code contains some technical debt in error handling,
which complicates the consolidation of duplicated code.
Introduce a fatal() function to replace the common pattern of
err_msg() followed by exit(EXIT_FAILURE), reducing the length of an
already long function.
Further patches using fatal() follow.
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/r/20251011082738.173670-2-costa.shul@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
osnoise test "top stop at failed action" is calling timerlat instead of
osnoise by mistake.
Fix it so that it calls the correct RTLA subcommand.
Fixes: 05b7e10687 ("tools/rtla: Add remaining support for osnoise actions")
Reviewed-by: Wander Lairson Costa <wander@redhat.com>
Link: https://lore.kernel.org/r/20251007095341.186923-3-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
In non-BPF mode, it takes up to 1 second for RTLA to notice that tracing
has been stopped. That means that action tests cannot have a 1 second
duration, as the SIGALRM will be racing with the threshold overflow.
Previously, non-BPF mode actions were buggy and always executed
the action, even when stopping on duration or SIGINT, preventing
this issue from manifesting. Now that this has been fixed, the tests
have become flaky, and this has to be adjusted.
Fixes: 4e26f84abf ("rtla/tests: Add tests for actions")
Fixes: 05b7e10687 ("tools/rtla: Add remaining support for osnoise actions")
Reviewed-by: Wander Lairson Costa <wander@redhat.com>
Link: https://lore.kernel.org/r/20251007095341.186923-2-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Commit 8d933d5c89 ("rtla/timerlat: Add continue action") moved the
code performing on-threshold actions (enabled through --on-threshold
option) to inside the RTLA main loop.
The condition in the loop does not check whether the threshold was
actually exceeded or if stop tracing was requested by the user through
SIGINT or duration. This leads to a bug where on-threshold actions are
always performed, even when the threshold was not hit.
(BPF mode is not affected, since it uses a different condition in the
while loop.)
Add a condition that checks for !stop_tracing before executing the
actions. Also, fix incorrect brackets in hist_main_loop to match the
semantics of top_main_loop.
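The added condition can be sketched as a predicate. The function name
and the tracing_stopped parameter are hypothetical; stop_tracing stands
for the flag set on SIGINT or when the duration expires.

```c
#include <stdbool.h>

/*
 * Run the on-threshold actions only when tracing stopped on its own
 * (threshold hit) and the user did not request the stop.
 */
static bool should_run_threshold_actions(bool tracing_stopped,
					 bool stop_tracing)
{
	return tracing_stopped && !stop_tracing;
}
```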
Fixes: 8d933d5c89 ("rtla/timerlat: Add continue action")
Fixes: 2f3172f9dd ("tools/rtla: Consolidate code between osnoise/timerlat and hist/top")
Reviewed-by: Crystal Wood <crwood@redhat.com>
Reviewed-by: Wander Lairson Costa <wander@redhat.com>
Link: https://lore.kernel.org/r/20251007095341.186923-1-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
rtla-timerlat allows a *thread* latency threshold to be set via the
-T/--thread option. However, the timerlat tracer calls this *total*
latency (stop_tracing_total_us), and stops tracing also when the
return-to-user latency is over the threshold.
Change the behavior of the timerlat BPF program to reflect what the
timerlat tracer is doing, to avoid discrepancy between stopping
collecting data in the BPF program and stopping tracing in the timerlat
tracer.
Cc: stable@vger.kernel.org
Fixes: e34293ddce ("rtla/timerlat: Add BPF skeleton to collect samples")
Reviewed-by: Wander Lairson Costa <wander@redhat.com>
Link: https://lore.kernel.org/r/20251006143100.137255-1-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
In recently introduced timerlat_free(),
the variable 'nr_cpus' is not assigned.
Assign it with sysconf(_SC_NPROCESSORS_CONF) as done elsewhere.
Remove the culprit: -Wno-maybe-uninitialized. The rest of the
code is clean.
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Fixes: 2f3172f9dd ("tools/rtla: Consolidate code between osnoise/timerlat and hist/top")
Link: https://lore.kernel.org/r/20251002170846.437888-1-costa.shul@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
The longindex argument of getopt_long() is optional
and tied to the unused local variable option_index.
Remove it to shorten the four longest functions
and make the code neater.
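For illustration, a minimal parser that passes NULL as the longindex
argument is sketched below. The -d/--duration option and the function
name are made up for the example and are not taken from the rtla
sources.

```c
#include <getopt.h>
#include <stdlib.h>

/*
 * Parse a hypothetical -d/--duration option. The last argument of
 * getopt_long() (longindex) is NULL because the index is never used.
 */
static long parse_duration_sketch(int argc, char **argv)
{
	static const struct option long_options[] = {
		{ "duration", required_argument, NULL, 'd' },
		{ NULL, 0, NULL, 0 },
	};
	long duration = 0;
	int c;

	optind = 1;	/* restart scanning so the function can be called again */

	while ((c = getopt_long(argc, argv, "d:", long_options, NULL)) != -1) {
		if (c == 'd')
			duration = strtol(optarg, NULL, 10);
	}

	return duration;
}
```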
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/r/20251002123553.389467-2-costa.shul@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
The rtla tools have many instances of iterating over CPUs while
checking if they are monitored.
Add a for_each_monitored_cpu() helper macro to make the code
more readable and reduce code duplication.
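The for/if combination folded into one macro can be sketched as follows.
This is a simplified stand-in, not the macro from the patch: the struct,
the bool array used as a CPU mask, and the _sketch names are all
assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the tool parameters. */
struct cpu_params_sketch {
	const bool *monitored;	/* NULL means "all CPUs are monitored" */
};

/* Iterate over CPUs, skipping those that are not monitored. */
#define for_each_monitored_cpu_sketch(cpu, nr_cpus, p)		\
	for ((cpu) = 0; (cpu) < (nr_cpus); (cpu)++)		\
		if (!(p)->monitored || (p)->monitored[(cpu)])

/* Example user: count the monitored CPUs. */
static int count_monitored(const struct cpu_params_sketch *p, int nr_cpus)
{
	int cpu, count = 0;

	for_each_monitored_cpu_sketch(cpu, nr_cpus, p)
		count++;

	return count;
}
```

Because the macro ends in an if, a following else would bind to it;
callers of a macro in this style should wrap the body in braces when in
doubt.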
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/r/20251002123553.389467-1-costa.shul@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
The help message incorrectly listed '-t' as the short option for
--threads, but the actual getopt_long configuration uses '-e'.
This mismatch can confuse users and lead to incorrect command-line
usage. This patch updates the usage string to correctly show:
"-e, --threads NRTHR"
to match the implementation.
Note: checkpatch.pl reports a false-positive spelling warning on
'Run', which is intentional.
Link: https://patch.msgid.link/20251106031040.1869-1-zhangchujun@cmss.chinamobile.com
Signed-off-by: Zhang Chujun <zhangchujun@cmss.chinamobile.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Merge tag 'trace-tools-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing tools updates from Steven Rostedt:
- This is mostly just consolidating code between osnoise/timerlat and
top/hist for easier maintenance and less future divergence
* tag 'trace-tools-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tools/rtla: Add remaining support for osnoise actions
tools/rtla: Add test engine support for unexpected output
tools/rtla: Fix -A option name in test comment
tools/rtla: Consolidate code between osnoise/timerlat and hist/top
tools/rtla: Create common_apply_config()
tools/rtla: Move top/hist params into common struct
tools/rtla: Consolidate common parameters into shared structure
The condition to check if the actions buffer needs to be resized was
incorrect. The check `self->size >= self->len` would evaluate to
true on almost every call to `actions_new()`, causing the buffer to
be reallocated unnecessarily each time an action was added.
Fix the condition to `self->len >= self->size`, ensuring
that the buffer is only resized when it is actually full.
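The corrected check can be illustrated with a small growable array. The
struct below is a simplified stand-in that reuses the len and size names
from the text; the doubling growth policy, the initial capacity of 8,
and the reallocs counter are assumptions made for the example.

```c
#include <stdlib.h>

/* Hypothetical, simplified action list. */
struct actions_sketch {
	int *list;	/* stand-in for the array of actions */
	int len;	/* number of entries in use */
	int size;	/* number of entries allocated */
	int reallocs;	/* counts resizes, for illustration only */
};

/* Return a pointer to a fresh slot, growing the buffer only when full. */
static int *actions_new_sketch(struct actions_sketch *self)
{
	if (self->len >= self->size) {
		int new_size = self->size ? self->size * 2 : 8;
		int *tmp = realloc(self->list, new_size * sizeof(*tmp));

		if (!tmp)
			return NULL;
		self->list = tmp;
		self->size = new_size;
		self->reallocs++;
	}

	return &self->list[self->len++];
}
```

With the reversed comparison (size >= len), the branch would be taken on
nearly every call, since the capacity is almost always at least the
number of used entries.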
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Chang Yin <cyin@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Cc: Crystal Wood <crwood@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250915181101.52513-1-wander@redhat.com
Fixes: 6ea082b171 ("rtla/timerlat: Add action on threshold feature")
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The basic functionality came with the consolidation; now hook up the
command line options, and add documentation and tests.
Cc: John Kacur <jkacur@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://lore.kernel.org/20250907022325.243930-8-crwood@redhat.com
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Add a check() parameter to indicate which text must not appear in the
output.
Simplify the code so that we can print failures as they happen rather
than trying to figure out what went wrong after printing "not ok". This
also means that "not ok" gets printed after the info rather than before,
which seems more intuitive anyway.
Cc: John Kacur <jkacur@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://lore.kernel.org/20250907022325.243930-7-crwood@redhat.com
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
This was changed to --on-threshold when the patches were applied.
Cc: John Kacur <jkacur@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://lore.kernel.org/20250907022325.243930-6-crwood@redhat.com
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Currently a lot of code is duplicated between the different rtla tools,
making maintenance more difficult, and encouraging divergence such as
features that are only implemented for certain tools even though they
could be more broadly applicable.
Merge the various main() functions into a common run_tool() with an ops
struct for tool-specific details.
Implement enough support for actions on osnoise to not need to keep the
old params->trace_output path.
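The ops-struct pattern can be sketched as below. Only the idea of a
common driver with per-tool callbacks comes from the text; the callback
names, their signatures, and the demo tool are hypothetical.

```c
#include <stddef.h>

/* Hypothetical ops table: one entry per tool-specific step. */
struct tool_ops_sketch {
	int (*init)(void *ctx);
	int (*main_loop)(void *ctx);
	void (*cleanup)(void *ctx);
};

/* Common driver shared by all tools; returns 0 on success. */
static int run_tool_sketch(const struct tool_ops_sketch *ops, void *ctx)
{
	int ret;

	ret = ops->init(ctx);
	if (ret)
		return ret;

	ret = ops->main_loop(ctx);

	/* Cleanup runs whether or not the main loop succeeded. */
	if (ops->cleanup)
		ops->cleanup(ctx);

	return ret;
}

/* Example tool: each callback just records that it ran. */
static int demo_init(void *ctx) { *(int *)ctx |= 1; return 0; }
static int demo_loop(void *ctx) { *(int *)ctx |= 2; return 0; }
static void demo_cleanup(void *ctx) { *(int *)ctx |= 4; }

static const struct tool_ops_sketch demo_ops = {
	.init = demo_init,
	.main_loop = demo_loop,
	.cleanup = demo_cleanup,
};
```

Each tool then only supplies its own ops table, and a feature added to
the driver becomes available to every tool at once.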
Cc: John Kacur <jkacur@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://lore.kernel.org/20250907022325.243930-5-crwood@redhat.com
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Merge the common bits of osnoise_apply_config() and
timerlat_apply_config(). Put the result in a new common.c, and move
enough things to common.h so that common.c does not need to include
osnoise.h.
Cc: John Kacur <jkacur@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://lore.kernel.org/20250907022325.243930-4-crwood@redhat.com
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The hist members were very similar between timerlat and osnoise, so
just use one common hist struct.
output_divisor, quiet, and pretty printing are pretty generic
concepts that can go in the main struct even if not every
specific tool (currently) uses them.
Cc: John Kacur <jkacur@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://lore.kernel.org/20250907022325.243930-3-crwood@redhat.com
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
timerlat_params and osnoise_params structures contain 15 identical
fields.
Introduce a new header common.h and define a common_params structure to
consolidate shared fields, reduce code duplication, and enhance
maintainability.
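The layout can be sketched as a shared struct embedded in each
tool-specific struct. All field names below are illustrative and not
copied from common.h.

```c
#include <stddef.h>

/* Fields shared by every tool (names here are illustrative only). */
struct common_params_sketch {
	long long duration;
	int quiet;
};

/* Tool-specific structs embed the common part as their first member. */
struct timerlat_params_sketch {
	struct common_params_sketch common;
	long long timerlat_period_us;
};

struct osnoise_params_sketch {
	struct common_params_sketch common;
	unsigned long long runtime;
};

/* Shared code takes only the common part and works for both tools. */
static int is_quiet(const struct common_params_sketch *common)
{
	return common->quiet;
}
```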
Cc: John Kacur <jkacur@redhat.com>
Link: https://lore.kernel.org/20250907022325.243930-2-crwood@redhat.com
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Signed-off-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
pkg-config is used to check for libtraceevent and libtracefs. If
pkg-config itself is not installed, the build reports the libraries as
missing, even though they are installed.
Before:
libtraceevent is missing. Please install libtraceevent-dev/libtraceevent-devel
libtracefs is missing. Please install libtracefs-dev/libtracefs-devel
After:
Makefile.config:10: *** Error: pkg-config needed by libtraceevent/libtracefs is missing
on this system, please install it.
Link: https://lore.kernel.org/20250808040527.2036023-2-chen.dylane@linux.dev
Fixes: 01474dc706 ("tools/rtla: Use tools/build makefiles to build rtla")
Signed-off-by: Tao Chen <chen.dylane@linux.dev>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
pkg-config is used to check for libtraceevent and libtracefs. If
pkg-config itself is not installed, the build reports the libraries as
missing, even though they are installed.
Before:
libtraceevent is missing. Please install libtraceevent-dev/libtraceevent-devel
libtracefs is missing. Please install libtracefs-dev/libtracefs-devel
After:
Makefile.config:10: *** Error: pkg-config needed by libtraceevent/libtracefs is missing
on this system, please install it.
Link: https://lore.kernel.org/20250808040527.2036023-1-chen.dylane@linux.dev
Fixes: 9d56c88e52 ("tools/tracing: Use tools/build makefiles on latency-collector")
Signed-off-by: Tao Chen <chen.dylane@linux.dev>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The -P option is used to set priority of osnoise and timerlat threads.
Extend the test for -P with --on-threshold calling a script that looks
for running timerlat threads and checks if their priority is set
correctly.
As --on-threshold is only supported by timerlat at the moment, this is
only implemented there so far.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Chang Yin <cyin@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://lore.kernel.org/20250725133817.59237-3-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Checking for patterns in rtla output with grep was added to test rtla
actions. Add grep checks also for base tests where applicable.
Also fix trace event histogram trigger check to use the correct syntax
for the command-line option so that the test passes with the grep check.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Chang Yin <cyin@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://lore.kernel.org/20250725133817.59237-2-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Many of the original rtla tests included durations of 1 minute and 30
seconds. Experience has shown this is unnecessary, since 10 seconds is
enough waiting time for samples to appear.
Change duration of all rtla tests to at most 10 seconds. This speeds up
testing significantly.
Before:
$ make check
All tests successful.
Files=3, Tests=54, 536 wallclock secs
( 0.03 usr 0.00 sys + 20.31 cusr 22.02 csys = 42.36 CPU)
Result: PASS
After:
$ make check
...
All tests successful.
Files=3, Tests=54, 196 wallclock secs
( 0.03 usr 0.01 sys + 20.28 cusr 20.68 csys = 41.00 CPU)
Result: PASS
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Chang Yin <cyin@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Cc: Crystal Wood <crwood@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250626123405.1496931-9-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Add a bunch of tests covering most of both --on-threshold and --on-end.
Parts sensitive to implementation of hist/top are tested for both.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Chang Yin <cyin@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Cc: Crystal Wood <crwood@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250626123405.1496931-8-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Add argument to the check command in the test suite that takes a regular
expression that the output of rtla command is checked against. This
allows testing for specific information in rtla output in addition
to checking the return value.
Two minor improvements are included: running rtla with "eval" so that
arguments with spaces can be passed to it via shell quotations, and
the stdout of pushd and popd is suppressed to clean up the test output.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Chang Yin <cyin@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Cc: Crystal Wood <crwood@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250626123405.1496931-7-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Implement actions on end next to actions on threshold. A new option,
--on-end is added, parallel to --on-threshold. Instead of being
executed whenever a latency threshold is reached, it is executed at the
end of the measurement.
For example:
$ rtla timerlat hist -d 5s --on-end trace
will save the trace output at the end.
All actions supported by --on-threshold are also supported by --on-end,
except for continue, which does nothing with --on-end.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Chang Yin <cyin@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Cc: Crystal Wood <crwood@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250626123405.1496931-6-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Introduce option to resume tracing after a latency threshold overflow.
The option is implemented as an action named "continue".
Example:
$ rtla timerlat top -q -T 200 -d 1s --on-threshold \
shell,command="echo Threshold" --on-threshold continue
Threshold
Threshold
Threshold
Timer Latency
...
The feature is supported for both hist and top. After the continue
action is executed, processing of the list of actions is stopped and
tracing is resumed.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Chang Yin <cyin@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Cc: Crystal Wood <crwood@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250626123405.1496931-5-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Currently, rtla-timerlat BPF program uses a global variable stored in a
.bss section to store whether tracing has been stopped.
Move the information to a separate map, so that it is easily writable
from userspace, and add a function that clears the value, resuming
tracing after it has been stopped.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Chang Yin <cyin@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Cc: Crystal Wood <crwood@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250626123405.1496931-4-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Extend the functionality provided by the -t/--trace option, which
triggers saving the contents of a tracefs buffer after tracing is
stopped, to support implementing arbitrary actions.
A new option, --on-threshold, is added, taking an argument
that further specifies the action. Actions added in this patch are:
- trace[,file=<filename>]: Saves tracefs buffer, optionally taking a
filename.
- signal,num=<sig>,pid=<pid>: Sends signal to process. "parent" may be
given as the pid to send the signal to the parent process.
- shell,command=<command>: Execute shell command.
Multiple actions may be specified and will be executed in order,
including multiple actions of the same type. Trace output requested via
-t and -a now adds a trace action to the end of the list.
If an action fails, the following actions are not executed. For
example, this command:
  $ rtla timerlat -T 20 --on-threshold trace \
      --on-threshold shell,command="grep ipi_send timerlat_trace.txt" \
      --on-threshold signal,num=2,pid=parent
will send signal 2 (SIGINT) to the parent process, but only if the saved
trace contains the text "ipi_send".
This way, the feature can be used for flexible reactions to latency
spikes, and allows combining rtla with other tooling like perf.
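The ordering and stop-on-failure rule described above can be sketched as follows. This is a minimal illustration, not rtla's actual implementation: struct action, actions_perform() and the two example callbacks are hypothetical names introduced only for this sketch.

```c
#include <stddef.h>

/* Hypothetical types and names; not rtla's actual data structures. */
typedef int (*action_fn)(void *arg);	/* returns 0 on success */

struct action {
	action_fn fn;
	void *arg;
};

/*
 * Run the actions in the order they were given. Stop at the first
 * failure and return its non-zero status; return 0 if all succeeded.
 */
static int actions_perform(const struct action *list, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		int ret = list[i].fn(list[i].arg);

		if (ret != 0)
			return ret;
	}
	return 0;
}

/* Example actions: each one counts that it ran, then succeeds or fails. */
static int action_ok(void *arg)
{
	++*(int *)arg;
	return 0;
}

static int action_fail(void *arg)
{
	++*(int *)arg;
	return -1;
}
```

With a list of "ok, fail, ok", only the first two callbacks run, which mirrors the example where the signal is sent only if grep succeeds.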
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Chang Yin <cyin@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Cc: Crystal Wood <crwood@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250626123405.1496931-3-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
After the introduction of BPF-based sample collection, rtla-timerlat
effectively runs in one of three modes:
- Pure BPF mode, with tracefs only being used to set up the timerlat
tracer. Sample processing and stop on threshold are handled by BPF.
- tracefs mode. BPF is unsupported or the kernel is lacking the necessary
trace event (osnoise:timerlat_sample). Stop on threshold is handled by
the timerlat tracer stopping tracing in all instances.
- BPF/tracefs mixed mode - BPF is used for sample collection for top or
histogram, tracefs is used for trace output and/or auto-analysis. Stop
on threshold is handled both through BPF program, which stops sample
collection for top/histogram and wakes up rtla, and by timerlat
tracer, which stops tracing for trace output/auto-analysis instances.
Add enum timerlat_tracing_mode, with three values:
- TRACING_MODE_BPF
- TRACING_MODE_TRACEFS
- TRACING_MODE_MIXED
Those represent the modes described above. A field of this type is added
to struct timerlat_params, named "mode", replacing the no_bpf variable.
params->mode is set in timerlat_{top,hist}_parse_args to
TRACING_MODE_BPF or TRACING_MODE_MIXED based on whether trace output
and/or auto-analysis is requested. timerlat_{top,hist}_main then checks
whether BPF is unavailable or disabled; if so, it sets params->mode to
TRACING_MODE_TRACEFS.
A condition is added to timerlat_apply_config that skips setting
timerlat tracer thresholds if params->mode is TRACING_MODE_BPF (those
are unnecessary, since they only turn off tracing, which is already
turned off in that case, since BPF is used to collect samples).
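The mode selection described above can be sketched roughly as below. The enum values are the ones named in this message; select_mode() and needs_tracer_thresholds() are hypothetical helpers that compress the parse_args/main split into single functions for illustration.

```c
#include <stdbool.h>

/* Enum values as named in the commit message. */
enum timerlat_tracing_mode {
	TRACING_MODE_BPF,
	TRACING_MODE_TRACEFS,
	TRACING_MODE_MIXED,
};

/*
 * Hypothetical helper: choose the mode from what the user requested
 * (trace output and/or auto-analysis) and whether BPF can be used.
 */
static enum timerlat_tracing_mode
select_mode(bool wants_trace_or_aa, bool bpf_usable)
{
	if (!bpf_usable)
		return TRACING_MODE_TRACEFS;
	return wants_trace_or_aa ? TRACING_MODE_MIXED : TRACING_MODE_BPF;
}

/* Tracer thresholds matter only when the tracer itself stops tracing. */
static bool needs_tracer_thresholds(enum timerlat_tracing_mode mode)
{
	return mode != TRACING_MODE_BPF;
}
```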
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Chang Yin <cyin@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Cc: Crystal Wood <crwood@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250626123405.1496931-2-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Merge tag 'trace-tools-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing tools updates from Steven Rostedt:
- Set distinctive value for failed tests
When running "make check", which performs tests on rtla, failure is
checked by examining the output. Instead, have the tool return an
error status if it exceeds the threshold.
- Define __NR_sched_setattr for LoongArch
Define __NR_sched_setattr to allow this to build for LoongArch.
- Define _GNU_SOURCE for timerlat_bpf.c
When _GNU_SOURCE is not defined, timerlat_bpf.c uses the modified
definition of struct sched_attr from utils.h, which can cause errors in
timerlat_bpf_init() and breakage in BPF sample collection mode.
* tag 'trace-tools-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rtla: Define _GNU_SOURCE in timerlat_bpf.c
rtla: Define __NR_sched_setattr for LoongArch
rtla: Set distinctive exit value for failed tests
Newer versions of glibc include a definition of struct sched_attr in
bits/sched.h (included through sched.h which is included by rtla).
Commit 0eecee3406 ("tools/rtla: fix collision with glibc
sched_attr/sched_set_attr") has modified the definition of struct
sched_attr in utils.h, so that it is only applied with older versions of
glibc that do not define it, in order to prevent build failure.
The definition in bits/sched.h depends on _GNU_SOURCE.
timerlat_bpf.c does not define _GNU_SOURCE, making it fall back to the
definition in utils.h. The latter has two fewer fields, leading to
shifted offsets of the fields of struct timerlat_params in timerlat_bpf_init.
Because of the shift, timerlat_bpf_init incorrectly reads
params->entries as 0 for timerlat-hist and disables the creation of
histogram maps, causing breakage in BPF sample collection mode:
  $ rtla timerlat hist -d 1s
  Error pulling BPF data
Fix the issue by also defining _GNU_SOURCE in timerlat_bpf.c.
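The offset shift can be reproduced in isolation with two structure definitions that differ only in two trailing 32-bit fields. The sketch below is illustrative: attr_short and attr_long stand in for the two struct sched_attr definitions, the params_* wrappers are hypothetical and much smaller than the real struct timerlat_params, and the field names are assumed from the kernel's struct sched_attr.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the shorter fallback definition (two fields missing). */
struct attr_short {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
};

/* Stand-in for the longer definition with two extra 32-bit fields. */
struct attr_long {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
	uint32_t sched_util_min;
	uint32_t sched_util_max;
};

/* Hypothetical parameter structs embedding each definition. */
struct params_short { struct attr_short attr; int entries; };
struct params_long  { struct attr_long  attr; int entries; };

/* How far "entries" moves between the two views of the same struct. */
static size_t entries_offset_shift(void)
{
	return offsetof(struct params_long, entries) -
	       offsetof(struct params_short, entries);
}
```

A translation unit compiled against the short definition reads "entries" at a different offset than one compiled against the long definition wrote it, which is how a non-zero value can be read back as 0.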
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250430144651.621766-1-tglozar@redhat.com
Fixes: e34293ddce ("rtla/timerlat: Add BPF skeleton to collect samples")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Running "make -C tools/tracing/rtla" on LoongArch fails with the
following error:
src/utils.c:237:24: error: '__NR_sched_setattr' undeclared
Define __NR_sched_setattr for LoongArch if it is not already defined.
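A guarded fallback definition of this kind might look like the sketch below. It is not the patch itself: the __loongarch__ macro test and the value 274 (the number in the generic syscall table) are assumptions to be checked against the actual change, and sched_setattr_nr() is a hypothetical helper added only so the result can be inspected.

```c
#include <sys/syscall.h>	/* defines __NR_sched_setattr on most toolchains */

/*
 * Fallback for headers that lack the definition. The architecture macro
 * and the syscall number are assumptions made for this sketch.
 */
#ifndef __NR_sched_setattr
# if defined(__loongarch__)
#  define __NR_sched_setattr 274
# else
#  error "__NR_sched_setattr is not defined for this architecture"
# endif
#endif

/* Hypothetical helper exposing the number that ended up being used. */
static long sched_setattr_nr(void)
{
	return __NR_sched_setattr;
}
```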
Link: https://lore.kernel.org/20250422074917.25771-1-yangtiezhu@loongson.cn
Reported-by: Haiyong Sun <sunhaiyong@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
A test is considered failed when a sample trace exceeds the threshold.
Failed tests return the same exit code as passed tests, requiring test
frameworks to determine the result by searching for "hit stop tracing"
in the output.
Assign a distinct exit code for failed tests to enable the use of shell
expressions and seamless integration with testing frameworks without the
need to parse output.
Add enum type for return value.
Update `make check`.
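A return-value enum of the kind mentioned above could be sketched as follows. The names and numeric values are illustrative assumptions, as is the run_result() helper; the message itself only states that an enum type is added and that failed tests get a distinct exit code.

```c
/* Illustrative names and values; assumptions made for this sketch. */
enum result {
	PASSED = 0,	/* run completed, threshold not exceeded */
	ERROR  = 1,	/* the tool itself failed, e.g. bad arguments */
	FAILED = 2,	/* a sample exceeded the threshold */
};

/* Hypothetical helper mapping the outcome of a run to an exit code. */
static enum result run_result(int tool_error, int hit_stop_tracing)
{
	if (tool_error)
		return ERROR;
	return hit_stop_tracing ? FAILED : PASSED;
}
```

With distinct codes, a test script can branch on the exit status alone instead of searching the output for "hit stop tracing".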
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: John Kacur <jkacur@redhat.com>
Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Cc: Eder Zulian <ezulian@redhat.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Jan Stancek <jstancek@redhat.com>
Link: https://lore.kernel.org/20250417185757.2194541-1-costa.shul@redhat.com
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Add dependencies needed to build rtla with BPF sample collection support
to README, and document both ways of sample collection in the manpages.
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250311114936.148012-5-tglozar@redhat.com
Merge tag 'trace-tools-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing tooling updates from Steven Rostedt:
- Allow RTLA to collect data via BPF
The current implementation of rtla uses libtracefs and libtraceevent
to pull sample events generated by the timerlat tracer from the trace
buffer. rtla then processes the sample by updating the histogram and
summary (current, maximum, minimum, and sum values) as well as checks
if tracing has been stopped due to threshold overflow.
In use cases where a large number of samples is being generated, that
is, with measurements running on many CPUs and with a low interval,
this sample processing design causes a significant CPU load on the
rtla side. Furthermore, with >100 CPUs and 100us interval, rtla was
reported as not being able to keep up with the samples and dropping
most of them, leading to it being unusable.
Change the way the timerlat trace processes samples by attaching a
BPF program to the trace event using the BPF skeleton feature of
bpftool. Unlike the current implementation, the BPF implementation
does not check whether tracing is stopped (in BPF mode, tracing is
always off to improve performance), but waits for a write to a BPF
ringbuffer instead. This allows rtla to exit immediately when a
threshold is violated, without waiting for the next iteration of the
while loop.
If the requirements for the BPF implementation are not met, either at
build time or at run time, the current implementation is used as
fallback. Which implementation is being used can be seen when running
rtla timerlat with the "-D" option. rtla can be forced to run in non-BPF
mode by setting the RTLA_NO_BPF environment variable to 1, for debugging
purposes.
- Fix LDFLAGS being dropped in build
- Refactor code to remove duplication of save_trace_to_file
- Always set options and do not rely on default settings
Do not rely on the default kernel settings of the tracers when
starting. They could have been changed by the user which gives
inconsistent results. Always set the options that rtla expects.
- Add creation of ctags and TAGS for traversing code
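The fallback rule described above for BPF sample collection (use BPF only if it is available at build time and at run time, and not disabled through RTLA_NO_BPF) can be sketched as below. Both helpers are hypothetical, and the sketch assumes RTLA_NO_BPF is read from the environment and compared against the string "1".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical check on the value of RTLA_NO_BPF, e.g. the result of
 * getenv("RTLA_NO_BPF"). Only the exact string "1" disables BPF here.
 */
static bool no_bpf_requested(const char *val)
{
	return val != NULL && strcmp(val, "1") == 0;
}

/*
 * Hypothetical decision helper: BPF is used only if support was built
 * in, the program could be loaded, and the user did not disable it.
 */
static bool use_bpf(bool built_with_bpf, bool bpf_loaded_ok,
		    const char *no_bpf_val)
{
	return built_with_bpf && bpf_loaded_ok &&
	       !no_bpf_requested(no_bpf_val);
}
```

A caller would pass getenv("RTLA_NO_BPF") as the last argument; whenever use_bpf() returns false, the libtracefs-based implementation remains the path taken.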
* tag 'trace-tools-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rtla: Add the ability to create ctags and etags
rtla/tests: Test setting default options
rtla/tests: Reset osnoise options before check
rtla: Always set all tracer options
rtla/osnoise: Set OSNOISE_WORKLOAD to true
rtla: Unify apply_config between top and hist
rtla/osnoise: Unify params struct
rtla: Fix segfault in save_trace_to_file call
tools/build: Use SYSTEM_BPFTOOL for system bpftool
rtla: Refactor save_trace_to_file
tools/rv: Keep user LDFLAGS in build
rtla/timerlat: Test BPF mode
rtla/timerlat_top: Use BPF to collect samples
rtla/timerlat_top: Move divisor to update
rtla/timerlat_hist: Use BPF to collect samples
rtla/timerlat: Add BPF skeleton to collect samples
rtla: Add optional dependency on BPF tooling
tools/build: Add bpftool-skeletons feature test
rtla/timerlat: Unify params struct
- Add the ability to create and remove ctags and etags, using the following
  targets:
    make tags
    make TAGS
    make tags_clean
- Fix a comment in Makefile.rtla with the correct spelling and don't
  imply that the ability to create an rtla tarball will be removed
Cc: Tomas Glozar <tglozar@redhat.com>
Cc: "Luis Claudio R . Goncalves" <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250321175053.29048-1-jkacur@redhat.com
Signed-off-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Add function to test engine to test with pre-set osnoise options, and
use it to test whether osnoise period (as an example) is set correctly.
The test works by pre-setting a high period of 10 minutes and stop on
threshold. Thus, it is easy to check whether rtla is properly resetting
the period to default: if it is, the test will complete on time, since
the first sample will overflow the threshold. If not, it will time out.
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-7-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Remove any dangling tracing instances from previous improperly exited
runs of rtla, and reset osnoise options to default before running a test
case.
This ensures that the test results are deterministic. Specific test
cases checking that rtla behaves correctly even when the tracer state is
not clean will be added later.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-6-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
rtla currently only sets tracer options that are explicitly set by the
user, with the exception of OSNOISE_WORKLOAD.
This leads to improper behavior in case rtla is run with those options
not set to the default value. rtla does reset them to the original
value upon exiting, but that does not protect it from starting with
non-default values set either by an improperly exited rtla or by another
user of the tracers.
For example, after running this command:
  $ echo 1 > /sys/kernel/tracing/osnoise/stop_tracing_us
all runs of rtla will stop at the 1us threshold, even if not requested
by the user:
  $ rtla osnoise hist
  Index   CPU-000   CPU-001
  1             8         5
  2             5         9
  3             1         2
  4             6         1
  5             2         1
  6             0         1
  8             1         1
  12            0         1
  14            1         0
  15            1         0
  over:         0         0
  count:       25        21
  min:          1         1
  avg:       3.68      3.05
  max:         15        12
  rtla osnoise hit stop tracing
Fix the problem by setting the default value for all tracer options if
the user has not provided their own value.
For most of the options, it's enough to just drop the if clause checking
for the value being set. For cpus, "all" is used as the default value,
and for osnoise default period and runtime, default values of
the osnoise_data variable in trace_osnoise.c are used.
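The change from "write only what the user set" to "always write a value" can be sketched as below. The struct, the helper and the default constants are hypothetical and chosen for illustration; the real defaults should be taken from the tracer sources referenced above.

```c
#include <stdbool.h>

/* Hypothetical option holder: "set" records whether the user gave a value. */
struct opt_ll {
	bool set;
	long long value;
};

/* Assumed defaults, for illustration only. */
#define DEFAULT_STOP_TRACING_US	0LL
#define DEFAULT_PERIOD_US	1000000LL

/*
 * Always yields a value to write to the tracer: the user's value if one
 * was given, otherwise the default. Nothing is left as found.
 */
static long long opt_effective(struct opt_ll opt, long long def)
{
	return opt.set ? opt.value : def;
}
```

In the stop_tracing_us example above, the stale value 1 would be overwritten with the default at startup because the user did not request a threshold.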
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-5-tglozar@redhat.com
Fixes: 1eceb2fc2c ("rtla/osnoise: Add osnoise top mode")
Fixes: 829a6c0b56 ("rtla/osnoise: Add the hist mode")
Fixes: a828cd18bc ("rtla: Add timerlat tool and timelart top mode")
Fixes: 1eeb6328e8 ("rtla/timerlat: Add timerlat hist mode")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>