Commit Graph

617 Commits

Author SHA1 Message Date
Linus Torvalds 69c5079b49 tracing updates for v6.19:
- Merge branch shared with kprobes on extending trace options
 
   The trace options were defined by a 32 bit variable. This limits the
   tracing instances to have a total of 32 different options. As that limit
   has been hit, and more options are being added, increase the option mask
   to a 64 bit number, doubling the number of options available.
 
   As this is required for the kprobe topic branches as well as the tracing
   topic branch, a separate branch was created and merged into both.
 
 - Make trace_user_fault_read() available for the rest of tracing
 
   The function trace_user_fault_read() is used by trace_marker file read to
   allow reading user space to be done fast and without locking or
   allocations. Make this available so that the system call trace events can
   use it too.
 
 - Have system call trace events read user space values
 
   Now that the system call trace events callbacks are called in a faultable
   context, take advantage of this and read the user space buffers for
   various system calls. For example, show the path name of the openat system
   call instead of just showing the pointer to that path name in user space.
   Also show the contents of the buffer of the write system call. Several
   system call trace events are updated to make tracing into a light weight
   strace tool for all applications in the system.
 
 - Update perf system call tracing to do the same
 
 - And a config and syscall_user_buf_size file to control the size of the buffer
 
   Limit the amount of data that can be read from user space. The default
   size is 63 bytes but that can be expanded to 165 bytes.
 
 - Allow the persistent ring buffer to print system calls normally
 
   The persistent ring buffer prints trace events by their type and ignores
   the print_fmt. This is because the print_fmt may change from kernel to
   kernel. As the system call output is fixed by the system call ABI itself,
   there's no reason to limit that. This makes reading the system call events
   in the persistent ring buffer much nicer and easier to understand.
 
 - Add options to show text offset to function profiler
 
   The function profiler that counts the number of times a function is hit
   currently lists all functions by its name and offset. But this becomes
   ambiguous when there are several functions with the same name. Add a
   tracing option that changes the output to be that of _text+offset
   instead. Now a user space tool can use this information to map the
   _text+offset to the unique function it is counting.
 
 - Report bad dynamic event command
 
   If a bad command is passed to the dynamic_events file, report it properly
   in the error log.
 
 - Clean up tracer options
 
   Clean up the tracer option code a bit, by removing some useless code and
   also using switch statements instead of a series of if statements.
 
 - Have tracing options be instance specific
 
   Tracers can have their own options (function tracer, irqsoff tracer,
   function graph tracer, etc). But now that the same tracer can be enabled
   in multiple trace instances, their options are still global. The API is
   per instance, thus changing one affects other instances. This isn't even
   consistent, as the option take affect differently depending on when an
   tracer started in an instance.  Make the options for instances only affect
   the instance it is changed under.
 
 - Optimize pid_list lock contention
 
   Whenever the pid_list is read, it uses a spin lock. This happens at every
   sched switch. Taking the lock at sched switch can be removed by instead
   using a seqlock counter.
 
 - Clean up the trace trigger structures
 
   The trigger code uses two different structures to implement a single
   tigger. This was due to trying to reuse code for the two different types
   of triggers (always on trigger, and count limited trigger). But by adding
   a single field to one structure, the other structure could be absorbed
   into the first structure making he code easier to understand.
 
 - Create a bulk garbage collector for trace triggers
 
   If user space has triggers for several hundreds of events and then removes
   them, it can take several seconds to complete. This is because each
   removal calls the slow tracepoint_synchronize_unregister() that can take
   hundreds of milliseconds to complete. Instead, create a helper thread that
   will do the clean up. When a trigger is removed, it will create the
   kthread if it isn't already created, and then add the trigger to a llist.
   The kthread will take the items off the llist, call
   tracepoint_synchronize_unregister(), and then remove the items it took
   off. It will then check if there's more items to free before sleeping.
 
   This makes user space removing all these triggers to finish in less than a
   second.
 
 - Allow function tracing of some of the tracing infrastructure code
 
   Because the tracing code can cause recursion issues if it is traced by the
   function tracer the entire tracing directory disables function tracing.
   But not all of tracing causes issues if it is traced. Namely, the event
   tracing code. Add a config that enables some of the tracing code to be
   traced to help in debugging it. Note, when this is enabled, it does add
   noise to general function tracing, especially if events are enabled as
   well (which is a common case).
 
 - Add boot-time backup instance for persistent buffer
 
   The persistent ring buffer is used mostly for kernel crash analysis in the
   field. One issue is that if there's a crash, the data in the persistent
   ring buffer must be read before tracing can begin using it. This slows
   down the boot process. Once tracing starts in the persistent ring buffer,
   the old data must be freed and the addresses no longer match and old
   events can't be in the buffer with new events.
 
   Create a way to create a backup buffer that copies the persistent ring
   buffer at boot up. Then after a crash, the always on tracer can begin
   immediately as well as the normal boot process while the crash analysis
   tooling uses the backup buffer. After the backup buffer is finished being
   read, it can be removed.
 
 - Enable function graph args and return address options at the same time
 
   Currently the when reading of arguments in the function graph tracer is
   enabled, the option to record the parent function in the entry event can
   not be enabled. Update the code so that it can.
 
 - Add new struct_offset() helper macro
 
   Add a new macro that takes a pointer to a structure and a name of one of
   its members and it will return the offset of that member. This allows the
   ring buffer code to simplify the following:
 
   From:  size = struct_size(entry, buf, cnt - sizeof(entry->id));
     To:  size = struct_offset(entry, id) + cnt;
 
   There should be other simplifications that this macro can help out with as
   well.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaS9xqxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qj6tAQD4MR1lsE3XpH09asO4CDDfhbtRSQVD
 o8bVKVihWx/j5gD/XezjqE2Q2+DO6dhnsQY6pbtNdXoKgaMuQJGA+dvPsQc=
 =HilC
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Extend tracing option mask to 64 bits

   The trace options were defined by a 32 bit variable. This limits the
   tracing instances to have a total of 32 different options. As that
   limit has been hit, and more options are being added, increase the
   option mask to a 64 bit number, doubling the number of options
   available.

   As this is required for the kprobe topic branches as well as the
   tracing topic branch, a separate branch was created and merged into
   both.

 - Make trace_user_fault_read() available for the rest of tracing

   The function trace_user_fault_read() is used by trace_marker file
   read to allow reading user space to be done fast and without locking
   or allocations. Make this available so that the system call trace
   events can use it too.

 - Have system call trace events read user space values

   Now that the system call trace events callbacks are called in a
   faultable context, take advantage of this and read the user space
   buffers for various system calls. For example, show the path name of
   the openat system call instead of just showing the pointer to that
   path name in user space. Also show the contents of the buffer of the
   write system call. Several system call trace events are updated to
   make tracing into a light weight strace tool for all applications in
   the system.

 - Update perf system call tracing to do the same

 - And a config and syscall_user_buf_size file to control the size of
   the buffer

   Limit the amount of data that can be read from user space. The
   default size is 63 bytes but that can be expanded to 165 bytes.

 - Allow the persistent ring buffer to print system calls normally

   The persistent ring buffer prints trace events by their type and
   ignores the print_fmt. This is because the print_fmt may change from
   kernel to kernel. As the system call output is fixed by the system
   call ABI itself, there's no reason to limit that. This makes reading
   the system call events in the persistent ring buffer much nicer and
   easier to understand.

 - Add options to show text offset to function profiler

   The function profiler that counts the number of times a function is
   hit currently lists all functions by its name and offset. But this
   becomes ambiguous when there are several functions with the same
   name.

   Add a tracing option that changes the output to be that of
   '_text+offset' instead. Now a user space tool can use this
   information to map the '_text+offset' to the unique function it is
   counting.

 - Report bad dynamic event command

   If a bad command is passed to the dynamic_events file, report it
   properly in the error log.

 - Clean up tracer options

   Clean up the tracer option code a bit, by removing some useless code
   and also using switch statements instead of a series of if
   statements.

 - Have tracing options be instance specific

   Tracers can have their own options (function tracer, irqsoff tracer,
   function graph tracer, etc). But now that the same tracer can be
   enabled in multiple trace instances, their options are still global.
   The API is per instance, thus changing one affects other instances.
   This isn't even consistent, as the option take affect differently
   depending on when an tracer started in an instance. Make the options
   for instances only affect the instance it is changed under.

 - Optimize pid_list lock contention

   Whenever the pid_list is read, it uses a spin lock. This happens at
   every sched switch. Taking the lock at sched switch can be removed by
   instead using a seqlock counter.

 - Clean up the trace trigger structures

   The trigger code uses two different structures to implement a single
   tigger. This was due to trying to reuse code for the two different
   types of triggers (always on trigger, and count limited trigger). But
   by adding a single field to one structure, the other structure could
   be absorbed into the first structure making he code easier to
   understand.

 - Create a bulk garbage collector for trace triggers

   If user space has triggers for several hundreds of events and then
   removes them, it can take several seconds to complete. This is
   because each removal calls tracepoint_synchronize_unregister() that
   can take hundreds of milliseconds to complete.

   Instead, create a helper thread that will do the clean up. When a
   trigger is removed, it will create the kthread if it isn't already
   created, and then add the trigger to a llist. The kthread will take
   the items off the llist, call tracepoint_synchronize_unregister(),
   and then remove the items it took off. It will then check if there's
   more items to free before sleeping.

   This makes user space removing all these triggers to finish in less
   than a second.

 - Allow function tracing of some of the tracing infrastructure code

   Because the tracing code can cause recursion issues if it is traced
   by the function tracer the entire tracing directory disables function
   tracing. But not all of tracing causes issues if it is traced.
   Namely, the event tracing code. Add a config that enables some of the
   tracing code to be traced to help in debugging it. Note, when this is
   enabled, it does add noise to general function tracing, especially if
   events are enabled as well (which is a common case).

 - Add boot-time backup instance for persistent buffer

   The persistent ring buffer is used mostly for kernel crash analysis
   in the field. One issue is that if there's a crash, the data in the
   persistent ring buffer must be read before tracing can begin using
   it. This slows down the boot process. Once tracing starts in the
   persistent ring buffer, the old data must be freed and the addresses
   no longer match and old events can't be in the buffer with new
   events.

   Create a way to create a backup buffer that copies the persistent
   ring buffer at boot up. Then after a crash, the always on tracer can
   begin immediately as well as the normal boot process while the crash
   analysis tooling uses the backup buffer. After the backup buffer is
   finished being read, it can be removed.

 - Enable function graph args and return address options at the same
   time

   Currently the when reading of arguments in the function graph tracer
   is enabled, the option to record the parent function in the entry
   event can not be enabled. Update the code so that it can.

 - Add new struct_offset() helper macro

   Add a new macro that takes a pointer to a structure and a name of one
   of its members and it will return the offset of that member. This
   allows the ring buffer code to simplify the following:

   From:  size = struct_size(entry, buf, cnt - sizeof(entry->id));
     To:  size = struct_offset(entry, id) + cnt;

   There should be other simplifications that this macro can help out
   with as well

* tag 'trace-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (42 commits)
  overflow: Introduce struct_offset() to get offset of member
  function_graph: Enable funcgraph-args and funcgraph-retaddr to work simultaneously
  tracing: Add boot-time backup of persistent ring buffer
  ftrace: Allow tracing of some of the tracing code
  tracing: Use strim() in trigger_process_regex() instead of skip_spaces()
  tracing: Add bulk garbage collection of freeing event_trigger_data
  tracing: Remove unneeded event_mutex lock in event_trigger_regex_release()
  tracing: Merge struct event_trigger_ops into struct event_command
  tracing: Remove get_trigger_ops() and add count_func() from trigger ops
  tracing: Show the tracer options in boot-time created instance
  ftrace: Avoid redundant initialization in register_ftrace_direct
  tracing: Remove unused variable in tracing_trace_options_show()
  fgraph: Make fgraph_no_sleep_time signed
  tracing: Convert function graph set_flags() to use a switch() statement
  tracing: Have function graph tracer option sleep-time be per instance
  tracing: Move graph-time out of function graph options
  tracing: Have function graph tracer option funcgraph-irqs be per instance
  trace/pid_list: optimize pid_list->lock contention
  tracing: Have function graph tracer define options per instance
  tracing: Have function tracer define options per instance
  ...
2025-12-05 09:51:37 -08:00
Tomas Glozar b9f6a40dc3 Documentation/trace: Specify exact priority for timerlat
The timerlat tracer documentation mentions that threads are created with
real-time priority, but does not mention which priority and scheduling
class is used.

Add the information so that users do not have to look it up in
trace_osnoise.c.

Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20251010083338.478961-9-tglozar@redhat.com>
2025-11-05 11:19:20 -07:00
Steven Rostedt 299ea67e6a tracing: Add a config and syscall_user_buf_size file to limit amount written
When a system call that can copy user space addresses into the ring
buffer, it can copy up to 511 bytes of data. This can waste precious ring
buffer space if the user isn't interested in the output. Add a new file
"syscall_user_buf_size" that gets initialized to a new config
CONFIG_SYSCALL_BUF_SIZE_DEFAULT that defaults to 63.

The config also is used to limit how much perf can read from user space.

Also lower the max down to 165, as this isn't to record everything that a
system call may be passing through to the kernel. 165 is more than enough.

The reason for 165 is because adding one for the nul terminating byte, as
well as possibly needing to append the "..." string turns it into 170
bytes. As this needs to save up to 3 arguments and 3 * 170 is 510 which
fits nicely in 512 bytes (a power of 2).

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Takaya Saeki <takayas@google.com>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Douglas Raillard <douglas.raillard@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Link: https://lore.kernel.org/20251028231148.260068913@kernel.org
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-10-28 20:10:59 -04:00
Bagas Sanjaya b8874ea3d0 Documentation: trace: histogram: Convert ftrace docs cross-reference
In brief "Extended error information" section, details on error
condition is referred to ftrace docs, which is written in :file:
directive instead of a proper cross-reference. Convert it.

Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20250916054202.582074-7-bagasdotme@gmail.com>
2025-09-18 11:49:26 -06:00
Bagas Sanjaya fa06220f34 Documentation: trace: histogram-design: Wrap introductory note in note:: directive
Use Sphinx note:: directive for the introductory note at the beginning
of docs, instead of aligned-text paragraph that renders as definition
list.

Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20250916054202.582074-6-bagasdotme@gmail.com>
2025-09-18 11:49:26 -06:00
Bagas Sanjaya 8c716e87ea Documentation: trace: historgram-design: Separate sched_waking histogram section heading and the following diagram
Section heading for sched_waking histogram is shown as normal paragraph
instead due to codeblock marker for the following diagram being in the
same line as the section underline. Separate them.

Fixes: daceabf1b4 ("tracing/doc: Fix ascii-art in histogram-design.rst")
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20250916054202.582074-5-bagasdotme@gmail.com>
2025-09-18 11:49:26 -06:00
Bagas Sanjaya f867a298ac Documentation: trace: histogram-design: Trim trailing vertices in diagram explanation text
Diagram explanation text is supposed to be interleaved commentary
between diagram parts that are spread out, but it outputs ugly in
htmldocs due to trailing vertices as if both the explanation and the
diagram are in the same literal code block.

Trim trailing vertices.

Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20250916054202.582074-4-bagasdotme@gmail.com>
2025-09-18 11:49:26 -06:00
Bagas Sanjaya 348b7ca8d3 Documentation: trace: histogram: Fix histogram trigger subsection number order
Section numbering in subsections of "Histogram Trigger Command" sections
is inconsistent in order. In particular, "'hist' trigger examples" is
erroneously numbered as 6.2, which is a leftover from  b8df4a3634
("tracing: Move hist trigger Documentation to histogram.txt").

Fix the order.

Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20250916054202.582074-3-bagasdotme@gmail.com>
2025-09-18 11:49:26 -06:00
Ryan Chung b65988af71 tracing: fix grammar error in debugging.rst
Signed-off-by: Ryan Chung <seokwoo.chung130@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250831101736.11519-3-seokwoo.chung130@gmail.com
2025-09-03 15:31:48 -06:00
Ryan Chung d5958c8a09 tracing: rephrase for clearer documentation
Signed-off-by: Ryan Chung <seokwoo.chung130@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250831101736.11519-2-seokwoo.chung130@gmail.com
2025-09-03 15:31:30 -06:00
Mehdi Ben Hadj Khelifa 915fb5caad docs: Corrected typo in trace/events
-Changed 'Dyamically' to 'Dynamically' in trace/events.rst

under sections 7.1 and 7.3

Signed-off-by: Mehdi Ben Hadj Khelifa <mehdi.benhadjkhelifa@gmail.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250819085040.974388-1-mehdi.benhadjkhelifa@gmail.com
2025-08-21 11:59:11 -06:00
Jonathan Corbet 4e18a0b090 Merge branch 'bjorn' into docs-mw
A big set of typo fixes from Bjorn Helgaas
2025-08-18 10:40:16 -06:00
Bjorn Helgaas 29fe206065 Documentation: Fix trace typos
Fix typos.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250813200526.290420-9-helgaas@kernel.org
2025-08-18 10:31:20 -06:00
Gopi Krishna Menon 70d476b63a Documentation/rv: Fix minor typo in monitor_synthesis page
Specifically, fix spelling of "practice"

Signed-off-by: Gopi Krishna Menon <krishnagopi487@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250810111249.93181-1-krishnagopi487@gmail.com
2025-08-12 12:41:20 -06:00
Linus Torvalds d6f38c1239 tracing changes for 6.17
- Deprecate auto-mounting tracefs to /sys/kernel/debug/tracing
 
   When tracefs was first introduced back in 2014, the directory
   /sys/kernel/tracing was added and is the designated location to mount
   tracefs. To keep backward compatibility, tracefs was auto-mounted in
   /sys/kernel/debug/tracing as well.
 
   All distros now mount tracefs on /sys/kernel/tracing. Having it seen in two
   different locations has lead to various issues and inconsistencies.
 
   The VFS folks have to also maintain debugfs_create_automount() for this
   single user.
 
   It's been over 10 years. Tooling and scripts should start replacing the
   debugfs location with the tracefs one. The reason tracefs was created in the
   first place was to allow access to the tracing facilities without the need
   to configure debugfs into the kernel. Using tracefs should now be more
   robust.
 
   A new config is created: CONFIG_TRACEFS_AUTOMOUNT_DEPRECATED
   which is default y, so that the kernel is still built with the automount.
   This config allows those that want to remove the automount from debugfs to
   do so.
 
   When tracefs is accessed from /sys/kernel/debug/tracing, the following
   printk is triggerd:
 
    pr_warn("NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030\n");
 
   This gives users another 5 years to fix their scripts.
 
 - Use queue_rcu_work() instead of call_rcu() for freeing event filters
 
   The number of filters to be free can be many depending on the number of
   events within an event system. Freeing them from softirq context can
   potentially cause undesired latency. Use the RCU workqueue to free them
   instead.
 
 - Remove pointless memory barriers in latency code
 
   Memory barriers were added to some of the latency code a long time ago with
   the idea of "making them visible", but that's not what memory barriers are
   for. They are to synchronize access between different variables. There was
   no synchronization here making them pointless.
 
 - Remove "__attribute__()" from the type field of event format
 
   When LLVM is used to compile the kernel with CONFIG_DEBUG_INFO_BTF=y and
   PAHOLE_HAS_BTF_TAG=y, some of the format fields get expanded with the
   following:
 
     field:const char * filename;      offset:24;      size:8; signed:0;
 
   Turns into:
 
     field:const char __attribute__((btf_type_tag("user"))) * filename;      offset:24;      size:8; signed:0;
 
   This confuses parsers. Add code to strip these tags from the strings.
 
 - Add eprobe config option CONFIG_EPROBE_EVENTS
 
   Eprobes were added back in 5.15 but were only enabled when another probe was
   enabled (kprobe, fprobe, uprobe, etc). The eprobes had no config option
   of their own. Add one as they should be a separate entity.
 
   It's default y to keep with the old kernels but still has dependencies on
   TRACING and HAVE_REGS_AND_STACK_ACCESS_API.
 
 - Add eprobe documentation
 
   When eprobes were added back in 5.15 no documentation was added to describe
   them. This needs to be rectified.
 
 - Replace open coded cpumask_next_wrap() in move_to_next_cpu()
 
 - Have preemptirq_delay_run() use off-stack CPU mask
 
 - Remove obsolete comment about pelt_cfs event
 
   DECLARE_TRACE() appends "_tp" to trace events now, but the comment above
   pelt_cfs still mentioned appending it manually.
 
 - Remove EVENT_FILE_FL_SOFT_MODE flag
 
   The SOFT_MODE flag was required when the soft enabling and disabling of
   trace events was first introduced. But there was a bug with this approach
   as it only worked for a single instance. When multiple users required soft
   disabling and disabling the code was changed to have a ref count. The
   SOFT_MODE flag is now set iff the ref count is non zero. This is redundant
   and just reading the ref count is good enough.
 
 - Fix typo in comment
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaIt5ZRQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qvriAPsEbOEgMrPF1Tdj1mHLVajYTxI8ft5J
 aX5bfM2cDDRVcgEA57JHOXp4d05dj555/hgAUuCWuFp/E0Anp45EnFTedgQ=
 =wKZW
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Deprecate auto-mounting tracefs to /sys/kernel/debug/tracing

   When tracefs was first introduced back in 2014, the directory
   /sys/kernel/tracing was added and is the designated location to mount
   tracefs. To keep backward compatibility, tracefs was auto-mounted in
   /sys/kernel/debug/tracing as well.

   All distros now mount tracefs on /sys/kernel/tracing. Having it seen
   in two different locations has lead to various issues and
   inconsistencies.

   The VFS folks have to also maintain debugfs_create_automount() for
   this single user.

   It's been over 10 years. Tooling and scripts should start replacing
   the debugfs location with the tracefs one. The reason tracefs was
   created in the first place was to allow access to the tracing
   facilities without the need to configure debugfs into the kernel.
   Using tracefs should now be more robust.

   A new config is created: CONFIG_TRACEFS_AUTOMOUNT_DEPRECATED which is
   default y, so that the kernel is still built with the automount. This
   config allows those that want to remove the automount from debugfs to
   do so.

   When tracefs is accessed from /sys/kernel/debug/tracing, the
   following printk is triggerd:

     pr_warn("NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030\n");

   This gives users another 5 years to fix their scripts.

 - Use queue_rcu_work() instead of call_rcu() for freeing event filters

   The number of filters to be free can be many depending on the number
   of events within an event system. Freeing them from softirq context
   can potentially cause undesired latency. Use the RCU workqueue to
   free them instead.

 - Remove pointless memory barriers in latency code

   Memory barriers were added to some of the latency code a long time
   ago with the idea of "making them visible", but that's not what
   memory barriers are for. They are to synchronize access between
   different variables. There was no synchronization here making them
   pointless.

 - Remove "__attribute__()" from the type field of event format

   When LLVM is used to compile the kernel with CONFIG_DEBUG_INFO_BTF=y
   and PAHOLE_HAS_BTF_TAG=y, some of the format fields get expanded with
   the following:

     field:const char * filename;      offset:24;      size:8; signed:0;

   Turns into:

     field:const char __attribute__((btf_type_tag("user"))) * filename;      offset:24;      size:8; signed:0;

   This confuses parsers. Add code to strip these tags from the strings.

 - Add eprobe config option CONFIG_EPROBE_EVENTS

   Eprobes were added back in 5.15 but were only enabled when another
   probe was enabled (kprobe, fprobe, uprobe, etc). The eprobes had no
   config option of their own. Add one as they should be a separate
   entity.

   It's default y to keep with the old kernels but still has
   dependencies on TRACING and HAVE_REGS_AND_STACK_ACCESS_API.

 - Add eprobe documentation

   When eprobes were added back in 5.15 no documentation was added to
   describe them. This needs to be rectified.

 - Replace open coded cpumask_next_wrap() in move_to_next_cpu()

 - Have preemptirq_delay_run() use off-stack CPU mask

 - Remove obsolete comment about pelt_cfs event

   DECLARE_TRACE() appends "_tp" to trace events now, but the comment
   above pelt_cfs still mentioned appending it manually.

 - Remove EVENT_FILE_FL_SOFT_MODE flag

   The SOFT_MODE flag was required when the soft enabling and disabling
   of trace events was first introduced. But there was a bug with this
   approach as it only worked for a single instance. When multiple users
   required soft disabling and disabling the code was changed to have a
   ref count. The SOFT_MODE flag is now set iff the ref count is non
   zero. This is redundant and just reading the ref count is good
   enough.

 - Fix typo in comment

* tag 'trace-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  Documentation: tracing: Add documentation about eprobes
  tracing: Have eprobes have their own config option
  tracing: Remove "__attribute__()" from the type field of event format
  tracing: Deprecate auto-mounting tracefs in debugfs
  tracing: Fix comment in trace_module_remove_events()
  tracing: Remove EVENT_FILE_FL_SOFT_MODE flag
  tracing: Remove pointless memory barriers
  tracing/sched: Remove obsolete comment on suffixes
  kernel: trace: preemptirq_delay_test: use offstack cpu mask
  tracing: Use queue_rcu_work() to free filters
  tracing: Replace opencoded cpumask_next_wrap() in move_to_next_cpu()
2025-08-01 10:29:36 -07:00
Linus Torvalds b1cce98493 It has been a relatively busy cycle for docs, especially the build system:
- The Perl kernel-doc script was added to 2.3.52pre1 just after the turn of
   the millennium.  Over the following 25 years, it accumulated a vast
   amount of cruft, all in a language few people want to deal with anymore.
   Mauro's Python replacement in 6.16 faithfully reproduced all of the cruft
   in the hope of avoiding regressions.  Now that we have a more reasonable
   code base, though, we can work on cleaning it up; many of the changes
   this time around are toward that end.
 
 - A reorganization of the ext4 docs into the usual TOC format.
 
 - Various Chinese translations and updates.
 
 - A new script from Mauro to help with docs-build testing.
 
 - A new document for linked lists
 
 - A sweep through MAINTAINERS fixing broken GitHub git:// repository links.
 
 ...and lots of fixes and updates.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCgAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmiHe3oPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5Y+EIH/0dMribAlWSrfS1sdisfkVp+nHh+DB6EA+uX
 XqbJvQrukze6GvvOI2L6+3fDp+5CBtBRRSkzsNIXfFQo6p/jEbTmD/JILO0LcyDT
 9iFX+W30nRetu1SqkiTGjLXgu+tF0gUE6zVnI7Lx7H10PUnUPkFbmMuwmOcOV/lC
 7Lml+G1FTByGE6gDjTTyTJOqBf37uLJq33N2YnPK0SHm4DiSsWGvINxGbXrrpR5Z
 7ORA6SnaIxFuy60SxL9pEH92OLS/kHRw74P/DT1dkg9BSdy4TRLM30QkZFGoiG2B
 OOnnT/JJz80BzI1ctzpcwGRWfD+i8DDvujp8+aLxXYbl5N7WYw0=
 =sji4
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.17' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "It has been a relatively busy cycle for docs, especially the build
  system:

   - The Perl kernel-doc script was added to 2.3.52pre1 just after the
     turn of the millennium. Over the following 25 years, it accumulated
     a vast amount of cruft, all in a language few people want to deal
     with anymore. Mauro's Python replacement in 6.16 faithfully
     reproduced all of the cruft in the hope of avoiding regressions.

     Now that we have a more reasonable code base, though, we can work
     on cleaning it up; many of the changes this time around are toward
     that end.

   - A reorganization of the ext4 docs into the usual TOC format.

   - Various Chinese translations and updates.

   - A new script from Mauro to help with docs-build testing.

   - A new document for linked lists

   - A sweep through MAINTAINERS fixing broken GitHub git:// repository
     links.

  ...and lots of fixes and updates"

* tag 'docs-6.17' of git://git.lwn.net/linux: (147 commits)
  scripts: add origin commit identification based on specific patterns
  sphinx: kernel_abi: fix performance regression with O=<dir>
  Documentation: core-api: entry: Replace deprecated KVM entry/exit functions
  docs: fault-injection: drop reference to md-faulty
  docs: document linked lists
  scripts: kdoc: make it backward-compatible with Python 3.7
  docs: kernel-doc: emit warnings for ancient versions of Python
  Documentation/rtla: Describe exit status
  Documentation/rtla: Add include common_appendix.rst
  docs: kernel: Clarify printk_ratelimit_burst reset behavior
  Documentation: ioctl-number: Don't repeat macro names
  Documentation: ioctl-number: Shorten macros table
  Documentation: ioctl-number: Correct full path to papr-physical-attestation.h
  Documentation: ioctl-number: Extend "Include File" column width
  Documentation: ioctl-number: Fix linuxppc-dev mailto link
  overlayfs.rst: fix typos
  docs: kdoc: emit a warning for ancient versions of Python
  docs: kdoc: clean up check_sections()
  docs: kdoc: directly access the always-there KdocItem fields
  docs: kdoc: straighten up dump_declaration()
  ...
2025-07-31 08:36:51 -07:00
Linus Torvalds 4ff261e725 Runtime verification changes for 6.17
- Added Linear temporal logic monitors for RT application
 
   Real-time applications may have design flaws causing them to have
   unexpected latency. For example, the applications may raise page faults, or
   may be blocked trying to take a mutex without priority inheritance.
 
   However, while attempting to implement DA monitors for these real-time
   rules, deterministic automaton is found to be inappropriate as the
   specification language. The automaton is complicated, hard to understand,
   and error-prone.
 
   For these cases, linear temporal logic is found to be more suitable. The
   LTL is more concise and intuitive.
 
 - Make printk_deferred() public
 
   The new monitors needed access to printk_deferred(). Make them visible for
   the entire kernel.
 
 - Add a vpanic() to allow for va_list to be passed to panic.
 
 - Add rtapp container monitor.
 
   A collection of monitors that check for common problems with real-time
   applications that cause unexpected latency.
 
 - Add page fault tracepoints to risc-v
 
   These tracepoints are necessary to for the RV monitor to run on risc-v.
 
 - Fix the behaviour of the rv tool with -s and idle tasks.
 
 - Allow the rv tool to gracefully terminate with SIGTERM
 
 - Adjusts dot2c not to create lines over 100 columns
 
 - Properly order nested monitors in the RV Kconfig file
 
 - Return the registration error in all DA monitor instead of 0
 
 - Update and add new sched collection monitors
 
   Replace tss and sncid monitors with more complete sts:
   Not only prove that switches occur in scheduling context and scheduling
   needs interrupt disabled but also that each call to the scheduler
   disables interrupts to (optionally) switch.
 
   New monitor: nrp
    Preemption requires need resched which is cleared by any switch
    (includes a non optimal workaround for /nested/ preemptions)
 
   New monitor: sssw
    suspension requires setting the task to sleepable and, after the
    switch occurs, the task requires a wakeup to come back to runnable
 
   New monitor: opid
    waking and need-resched operations occur with interrupts and
    preemption disabled or in IRQ without explicitly disabling preemption
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaIk8cBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qi3DAQCFu6DM7uPSh94oggWlH2LukOYVGk2b
 CvGrqMFuefae7QD/aK9nCMfzaBehixMOMQHLHELEh527Hd+RwQCrlnLALQU=
 =r5HZ
 -----END PGP SIGNATURE-----

Merge tag 'trace-rv-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull runtime verification updates from Steven Rostedt:

 - Added Linear temporal logic monitors for RT application

   Real-time applications may have design flaws causing them to have
   unexpected latency. For example, the applications may raise page
   faults, or may be blocked trying to take a mutex without priority
   inheritance.

   However, while attempting to implement DA monitors for these
   real-time rules, deterministic automaton is found to be inappropriate
   as the specification language. The automaton is complicated, hard to
   understand, and error-prone.

   For these cases, linear temporal logic is found to be more suitable.
   The LTL is more concise and intuitive.

 - Make printk_deferred() public

   The new monitors needed access to printk_deferred(). Make them
   visible for the entire kernel.

 - Add a vpanic() to allow for va_list to be passed to panic.

 - Add rtapp container monitor.

   A collection of monitors that check for common problems with
   real-time applications that cause unexpected latency.

 - Add page fault tracepoints to risc-v

   These tracepoints are necessary to for the RV monitor to run on
   risc-v.

 - Fix the behaviour of the rv tool with -s and idle tasks.

 - Allow the rv tool to gracefully terminate with SIGTERM

 - Adjusts dot2c not to create lines over 100 columns

 - Properly order nested monitors in the RV Kconfig file

 - Return the registration error in all DA monitor instead of 0

 - Update and add new sched collection monitors

   Replace tss and sncid monitors with more complete sts:

   Not only prove that switches occur in scheduling context and scheduling
   needs interrupt disabled but also that each call to the scheduler
   disables interrupts to (optionally) switch.

   New monitor: nrp
     Preemption requires need resched which is cleared by any switch
     (includes a non optimal workaround for /nested/ preemptions)

   New monitor: sssw
     suspension requires setting the task to sleepable and, after the
     switch occurs, the task requires a wakeup to come back to runnable

   New monitor: opid
      waking and need-resched operations occur with interrupts and
      preemption disabled or in IRQ without explicitly disabling
      preemption"

* tag 'trace-rv-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (48 commits)
  rv: Add opid per-cpu monitor
  rv: Add nrp and sssw per-task monitors
  rv: Replace tss and sncid monitors with more complete sts
  sched: Adapt sched tracepoints for RV task model
  rv: Retry when da monitor detects race conditions
  rv: Adjust monitor dependencies
  rv: Use strings in da monitors tracepoints
  rv: Remove trailing whitespace from tracepoint string
  rv: Add da_handle_start_run_event_ to per-task monitors
  rv: Fix wrong type cast in reactors_show() and monitor_reactor_show()
  rv: Fix wrong type cast in monitors_show()
  rv: Remove struct rv_monitor::reacting
  rv: Remove rv_reactor's reference counter
  rv: Merge struct rv_reactor_def into struct rv_reactor
  rv: Merge struct rv_monitor_def into struct rv_monitor
  rv: Remove unused field in struct rv_monitor_def
  rv: Return init error when registering monitors
  verification/rvgen: Organise Kconfig entries for nested monitors
  tools/dot2c: Fix generated files going over 100 column limit
  tools/rv: Stop gracefully also on SIGTERM
  ...
2025-07-30 16:23:12 -07:00
Steven Rostedt 623526ba89 Documentation: tracing: Add documentation about eprobes
Eprobes was added back in 5.15, but was never documented. It became a
"secret" interface even though it has been a topic of several
presentations. For some reason, when eprobes was added, documenting it
never became a priority, until now.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/20250730140945.528135548@kernel.org
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-30 10:38:43 -04:00
Gabriele Monaco 614384533d rv: Add opid per-cpu monitor
Add a per-cpu monitor as part of the sched model:
* opid: operations with preemption and irq disabled
    Monitor to ensure wakeup and need_resched occur with irq and
    preemption disabled or in irq handlers.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tomas Glozar <tglozar@redhat.com>
Cc: Juri Lelli <jlelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Link: https://lore.kernel.org/20250728135022.255578-10-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Acked-by: Nam Cao <namcao@linutronix.de>
Tested-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-28 16:47:35 -04:00
Gabriele Monaco e8440a88e5 rv: Add nrp and sssw per-task monitors
Add 2 per-task monitors as part of the sched model:

* nrp: need-resched preempts
    Monitor to ensure preemption requires need resched.
* sssw: set state sleep and wakeup
    Monitor to ensure sched_set_state to sleepable leads to sleeping and
    sleeping tasks require wakeup.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Tomas Glozar <tglozar@redhat.com>
Cc: Juri Lelli <jlelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/20250728135022.255578-9-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Acked-by: Nam Cao <namcao@linutronix.de>
Tested-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-28 16:47:34 -04:00
Gabriele Monaco d0096c2f9c rv: Replace tss and sncid monitors with more complete sts
The tss monitor currently guarantees task switches can happen only while
scheduling, whereas the sncid monitor enforces scheduling occurs with
interrupt disabled.

Replace the monitors with a more comprehensive specification which
implies both but also ensures that:
* each scheduler call disable interrupts to switch
* each task switch happens with interrupts disabled

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Nam Cao <namcao@linutronix.de>
Cc: Tomas Glozar <tglozar@redhat.com>
Cc: Juri Lelli <jlelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/20250728135022.255578-8-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-28 16:47:34 -04:00
Nam Cao 8cfcf9b0e9 verification/rvgen: Support the 'next' operator
The 'next' operator is a unary operator. It is defined as: "next time, the
operand must be true".

Support this operator. For RV monitors, "next time" means the next
invocation of ltl_validate().

Cc: John Ogness <john.ogness@linutronix.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/9c32cec04dd18d2e956fddd84b0e0a2503daa75a.1752239482.git.namcao@linutronix.de
Signed-off-by: Nam Cao <namcao@linutronix.de>
Tested-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24 10:43:23 -04:00
Nam Cao e93648e862 Documentation/rv: Add documentation for linear temporal logic monitors
Add documents describing linear temporal logic runtime verification
monitors and how to generate them using rvgen.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/be13719e66fd8da147d7c69d5365aa23c52b743f.1751634289.git.namcao@linutronix.de
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24 10:42:47 -04:00
Nam Cao f40a7c0602 Documentation/rv: Prepare monitor synthesis document for LTL inclusion
Monitor synthesis from deterministic automaton and linear temporal logic
have a lot in common. Therefore a single document should describe both.

Change da_monitor_synthesis.rst to monitor_synthesis.rst. LTL monitor
synthesis will be added to this file by a follow-up commit.

This makes the diff far easier to read. If renaming and adding LTL info is
done in a single commit, git wouldn't recognize it as a rename, but a file
removal and a file addition.

While at it, correct the old dot2k commands to the new rvgen commands.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/d91c6e4600287f4732d68a014219e576a75ce6dc.1751634289.git.namcao@linutronix.de
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-24 10:42:46 -04:00
Steven Rostedt 4d6d0a6263 tracing: Remove redundant config HAVE_FTRACE_MCOUNT_RECORD
Ftrace is tightly coupled with architecture specific code because it
requires the use of trampolines written in assembly. This means that when
a new feature or optimization is made, it must be done for all
architectures. To simplify the approach, CONFIG_HAVE_FTRACE_* configs are
added to denote which architecture has the new enhancement so that other
architectures can still function until they too have been updated.

The CONFIG_HAVE_FTRACE_MCOUNT was added to help simplify the
DYNAMIC_FTRACE work, but now every architecture that implements
DYNAMIC_FTRACE also has HAVE_FTRACE_MCOUNT set too, making it redundant
with the HAVE_DYNAMIC_FTRACE.

Remove the HAVE_FTRACE_MCOUNT config and use DYNAMIC_FTRACE directly where
applicable.

Link: https://lore.kernel.org/all/20250703154916.48e3ada7@gandalf.local.home/

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/20250704104838.27a18690@gandalf.local.home
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-22 20:15:56 -04:00
Nam Cao 670ff946b9 rv: Add documentation for rtapp monitor
Add documentation describing the rtapp monitor.

Cc: John Ogness <john.ogness@linutronix.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/df0242d74c12511e82cc9d73c082def91a160c74.1752088709.git.namcao@linutronix.de
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-09 15:27:02 -04:00
Ahelenia Ziemiańska f55b3ca3cf tracing: doc: fix "for a while" typo
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/xygdnynf7m55p7d27ovzqtdjaa7pua3bxuk5c22cnmoovaji5e@tarta.nabijaczleweli.xyz
2025-07-08 08:14:28 -06:00
Runji Liu ea08e53d4d docs: trace: boottime-trace.rst: fix typo
Replace misspelled "eariler" with "earlier" and drop the stray period
after "example".

Signed-off-by: Runji Liu <runjiliu.tech@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250526134046.1042-1-runjiliu.tech@gmail.com
2025-06-09 14:35:07 -06:00
Linus Torvalds c26f4fbd58 Char/Misc/IIO pull request for 6.16-rc1
Here is the big char/misc/iio and other small driver subsystem pull
 request for 6.16-rc1.
 
 Overall, a lot of individual changes, but nothing major, just the normal
 constant forward progress of new device support and cleanups to existing
 subsystems.  Highlights in here are:
   - Large IIO driver updates and additions and device tree changes
   - Android binder bugfixes and logfile fixes
   - mhi driver updates
   - comedi driver updates
   - counter driver updates and additions
   - coresight driver updates and additions
   - echo driver removal as there are no in-kernel users of it
   - nvmem driver updates
   - spmi driver updates
   - new amd-sbi driver "subsystem" and drivers added
   - rust miscdriver binding documentation fix
   - other small driver fixes and updates (uio, w1, acrn, hpet, xillybus,
     cardreader drivers, fastrpc and others.)
 
 All of these have been in linux-next for quite a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCaEKg5Q8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykUyACgmAzrzKMoQUwwhQ6ed2l7tHdrlOcAoIORI1/x
 pNqQdrE1EbmAAyl47IN4
 =ts6J
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char / misc / iio driver updates from Greg KH:
 "Here is the big char/misc/iio and other small driver subsystem pull
  request for 6.16-rc1.

  Overall, a lot of individual changes, but nothing major, just the
  normal constant forward progress of new device support and cleanups to
  existing subsystems. Highlights in here are:

   - Large IIO driver updates and additions and device tree changes

   - Android binder bugfixes and logfile fixes

   - mhi driver updates

   - comedi driver updates

   - counter driver updates and additions

   - coresight driver updates and additions

   - echo driver removal as there are no in-kernel users of it

   - nvmem driver updates

   - spmi driver updates

   - new amd-sbi driver "subsystem" and drivers added

   - rust miscdriver binding documentation fix

   - other small driver fixes and updates (uio, w1, acrn, hpet,
     xillybus, cardreader drivers, fastrpc and others)

  All of these have been in linux-next for quite a while with no
  reported problems"

* tag 'char-misc-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (390 commits)
  binder: fix yet another UAF in binder_devices
  counter: microchip-tcb-capture: Add watch validation support
  dt-bindings: iio: adc: Add ROHM BD79100G
  iio: adc: add support for Nuvoton NCT7201
  dt-bindings: iio: adc: add NCT7201 ADCs
  iio: chemical: Add driver for SEN0322
  dt-bindings: trivial-devices: Document SEN0322
  iio: adc: ad7768-1: reorganize driver headers
  iio: bmp280: zero-init buffer
  iio: ssp_sensors: optimalize -> optimize
  HID: sensor-hub: Fix typo and improve documentation
  iio: admv1013: replace redundant ternary operator with just len
  iio: chemical: mhz19b: Fix error code in probe()
  iio: adc: at91-sama5d2: use IIO_DECLARE_BUFFER_WITH_TS
  iio: accel: sca3300: use IIO_DECLARE_BUFFER_WITH_TS
  iio: adc: ad7380: use IIO_DECLARE_DMA_BUFFER_WITH_TS
  iio: adc: ad4695: rename AD4695_MAX_VIN_CHANNELS
  iio: adc: ad4695: use IIO_DECLARE_DMA_BUFFER_WITH_TS
  iio: introduce IIO_DECLARE_BUFFER_WITH_TS macros
  iio: make IIO_DMA_MINALIGN minimum of 8 bytes
  ...
2025-06-06 11:50:47 -07:00
Linus Torvalds b78f1293f9 tracing updates for v6.16:
- Have module addresses get updated in the persistent ring buffer
 
   The addresses of the modules from the previous boot are saved in the
   persistent ring buffer. If the same modules are loaded and an address is
   in the old buffer points to an address that was both saved in the
   persistent ring buffer and is loaded in memory, shift the address to point
   to the address that is loaded in memory in the trace event.
 
 - Print function names for irqs off and preempt off callsites
 
   When ignoring the print fmt of a trace event and just printing the fields
   directly, have the fields for preempt off and irqs off events still show
   the function name (via kallsyms) instead of just showing the raw address.
 
 - Clean ups of the histogram code
 
   The histogram functions saved over 800 bytes on the stack to process
   events as they come in. Instead, create per-cpu buffers that can hold this
   information and have a separate location for each context level (thread,
   softirq, IRQ and NMI).
 
   Also add some more comments to the code.
 
 - Add "common_comm" field for histograms
 
   Add "common_comm" that uses the current->comm as a field in an event
   histogram and acts like any of the other fields of the event.
 
 - Show "subops" in the enabled_functions file
 
   When the function graph infrastructure is used, a subsystem has a "subops"
   that it attaches its callback function to. Instead of the
   enabled_functions just showing a function calling the function that calls
   the subops functions, also show the subops functions that will get called
   for that function too.
 
 - Add "copy_trace_marker" option to instances
 
   There are cases where an instance is created for tooling to write into,
   but the old tooling has the top level instance hardcoded into the
   application. New tools want to consume the data from an instance and not
   the top level buffer. By adding a copy_trace_marker option, whenever the
   top instance trace_marker is written into, a copy of it is also written
   into the instance with this option set. This allows new tools to read what
   old tools are writing into the top buffer.
 
   If this option is cleared by the top instance, then what is written into
   the trace_marker is not written into the top instance. This is a way to
   redirect the trace_marker writes into another instance.
 
 - Have tracepoints created by DECLARE_TRACE() use trace_<name>_tp()
 
   If a tracepoint is created by DECLARE_TRACE() instead of TRACE_EVENT(),
   then it will not be exposed via tracefs. Currently there's no way to
   differentiate in the kernel the tracepoint functions between those that
   are exposed via tracefs or not. A calling convention has been made
   manually to append a "_tp" prefix for events created by DECLARE_TRACE().
   Instead of doing this manually, force it so that all DECLARE_TRACE()
   events have this notation.
 
 - Use __string() for task->comm in some sched events
 
   Instead of hardcoding the comm to be TASK_COMM_LEN in some of the
   scheduler events use __string() which makes it dynamic. Note, if these
   events are parsed by user space it they may break, and the event may have
   to be converted back to the hardcoded size.
 
 - Have function graph "depth" be unsigned to the user
 
   Internally to the kernel, the "depth" field of the function graph event is
   signed due to -1 being used for end of boundary. What actually gets
   recorded in the event itself is zero or positive. Reflect this to user
   space by showing "depth" as unsigned int and be consistent across all
   events.
 
 - Allow an arbitrary long CPU string to osnoise_cpus_write()
 
   The filtering of which CPUs to write to can exceed 256 bytes. If a machine
   has 256 CPUs, and the filter is to filter every other CPU, the write would
   take a string larger than 256 bytes. Instead of using a fixed size buffer
   on the stack that is 256 bytes, allocate it to handle what is passed in.
 
 - Stop having ftrace check the per-cpu data "disabled" flag
 
   The "disabled" flag in the data structure passed to most ftrace functions
   is checked to know if tracing has been disabled or not. This flag was
   added back in 2008 before the ring buffer had its own way to disable
   tracing. The "disable" flag is now not always set when needed, and the
   ring buffer flag should be used in all locations where the disabled is
   needed. Since the "disable" flag is redundant and incorrect, stop using it.
   Fix up some locations that use the "disable" flag to use the ring buffer
   info.
 
 - Use a new tracer_tracing_disable/enable() instead of data->disable flag
 
   There's a few cases that set the data->disable flag to stop tracing, but
   this flag is not consistently used. It is also an on/off switch where if a
   function set it and calls another function that sets it, the called
   function may incorrectly enable it.
 
   Use a new trace_tracing_disable() and tracer_tracing_enable() that uses a
   counter and can be nested. These use the ring buffer flags which are
   always checked making the disabling more consistent.
 
 - Save the trace clock in the persistent ring buffer
 
   Save what clock was used for tracing in the persistent ring buffer and set
   it back to that clock after a reboot.
 
 - Remove unused reference to a per CPU data pointer in mmiotrace functions
 
 - Remove unused buffer_page field from trace_array_cpu structure
 
 - Remove more strncpy() instances
 
 - Other minor clean ups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaDhiqRQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qkheAQDpyRHoXF1AIoEqyahDax8f3vpZQeCH
 B/mn+YJmU1wuVgEA7AFALov5SHKv4IzoARz68GXtR0jGhP5D8uebUhUqDAQ=
 =WmFG
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Have module addresses get updated in the persistent ring buffer

   The addresses of the modules from the previous boot are saved in the
   persistent ring buffer. If the same modules are loaded and an address
   is in the old buffer points to an address that was both saved in the
   persistent ring buffer and is loaded in memory, shift the address to
   point to the address that is loaded in memory in the trace event.

 - Print function names for irqs off and preempt off callsites

   When ignoring the print fmt of a trace event and just printing the
   fields directly, have the fields for preempt off and irqs off events
   still show the function name (via kallsyms) instead of just showing
   the raw address.

 - Clean ups of the histogram code

   The histogram functions saved over 800 bytes on the stack to process
   events as they come in. Instead, create per-cpu buffers that can hold
   this information and have a separate location for each context level
   (thread, softirq, IRQ and NMI).

   Also add some more comments to the code.

 - Add "common_comm" field for histograms

   Add "common_comm" that uses the current->comm as a field in an event
   histogram and acts like any of the other fields of the event.

 - Show "subops" in the enabled_functions file

   When the function graph infrastructure is used, a subsystem has a
   "subops" that it attaches its callback function to. Instead of the
   enabled_functions just showing a function calling the function that
   calls the subops functions, also show the subops functions that will
   get called for that function too.

 - Add "copy_trace_marker" option to instances

   There are cases where an instance is created for tooling to write
   into, but the old tooling has the top level instance hardcoded into
   the application. New tools want to consume the data from an instance
   and not the top level buffer. By adding a copy_trace_marker option,
   whenever the top instance trace_marker is written into, a copy of it
   is also written into the instance with this option set. This allows
   new tools to read what old tools are writing into the top buffer.

   If this option is cleared by the top instance, then what is written
   into the trace_marker is not written into the top instance. This is a
   way to redirect the trace_marker writes into another instance.

 - Have tracepoints created by DECLARE_TRACE() use trace_<name>_tp()

   If a tracepoint is created by DECLARE_TRACE() instead of
   TRACE_EVENT(), then it will not be exposed via tracefs. Currently
   there's no way to differentiate in the kernel the tracepoint
   functions between those that are exposed via tracefs or not. A
   calling convention has been made manually to append a "_tp" prefix
   for events created by DECLARE_TRACE(). Instead of doing this
   manually, force it so that all DECLARE_TRACE() events have this
   notation.

 - Use __string() for task->comm in some sched events

   Instead of hardcoding the comm to be TASK_COMM_LEN in some of the
   scheduler events use __string() which makes it dynamic. Note, if
   these events are parsed by user space it they may break, and the
   event may have to be converted back to the hardcoded size.

 - Have function graph "depth" be unsigned to the user

   Internally to the kernel, the "depth" field of the function graph
   event is signed due to -1 being used for end of boundary. What
   actually gets recorded in the event itself is zero or positive.
   Reflect this to user space by showing "depth" as unsigned int and be
   consistent across all events.

 - Allow an arbitrary long CPU string to osnoise_cpus_write()

   The filtering of which CPUs to write to can exceed 256 bytes. If a
   machine has 256 CPUs, and the filter is to filter every other CPU,
   the write would take a string larger than 256 bytes. Instead of using
   a fixed size buffer on the stack that is 256 bytes, allocate it to
   handle what is passed in.

 - Stop having ftrace check the per-cpu data "disabled" flag

   The "disabled" flag in the data structure passed to most ftrace
   functions is checked to know if tracing has been disabled or not.
   This flag was added back in 2008 before the ring buffer had its own
   way to disable tracing. The "disable" flag is now not always set when
   needed, and the ring buffer flag should be used in all locations
   where the disabled is needed. Since the "disable" flag is redundant
   and incorrect, stop using it. Fix up some locations that use the
   "disable" flag to use the ring buffer info.

 - Use a new tracer_tracing_disable/enable() instead of data->disable
   flag

   There's a few cases that set the data->disable flag to stop tracing,
   but this flag is not consistently used. It is also an on/off switch
   where if a function set it and calls another function that sets it,
   the called function may incorrectly enable it.

   Use a new trace_tracing_disable() and tracer_tracing_enable() that
   uses a counter and can be nested. These use the ring buffer flags
   which are always checked making the disabling more consistent.

 - Save the trace clock in the persistent ring buffer

   Save what clock was used for tracing in the persistent ring buffer
   and set it back to that clock after a reboot.

 - Remove unused reference to a per CPU data pointer in mmiotrace
   functions

 - Remove unused buffer_page field from trace_array_cpu structure

 - Remove more strncpy() instances

 - Other minor clean ups and fixes

* tag 'trace-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (36 commits)
  tracing: Fix compilation warning on arm32
  tracing: Record trace_clock and recover when reboot
  tracing/sched: Use __string() instead of fixed lengths for task->comm
  tracepoint: Have tracepoints created with DECLARE_TRACE() have _tp suffix
  tracing: Cleanup upper_empty() in pid_list
  tracing: Allow the top level trace_marker to write into another instances
  tracing: Add a helper function to handle the dereference arg in verifier
  tracing: Remove unnecessary "goto out" that simply returns ret is trigger code
  tracing: Fix error handling in event_trigger_parse()
  tracing: Rename event_trigger_alloc() to trigger_data_alloc()
  tracing: Replace deprecated strncpy() with strscpy() for stack_trace_filter_buf
  tracing: Remove unused buffer_page field from trace_array_cpu structure
  tracing: Use atomic_inc_return() for updating "disabled" counter in irqsoff tracer
  tracing: Convert the per CPU "disabled" counter to local from atomic
  tracing: branch: Use trace_tracing_is_on_cpu() instead of "disabled" field
  ring-buffer: Add ring_buffer_record_is_on_cpu()
  tracing: Do not use per CPU array_buffer.data->disabled for cpumask
  ftrace: Do not disabled function graph based on "disabled" field
  tracing: kdb: Use tracer_tracing_on/off() instead of setting per CPU disabled
  tracing: Use tracer_tracing_disable() instead of "disabled" field for ftrace_dump_one()
  ...
2025-05-29 21:04:36 -07:00
Hendrik Hamerlinck 52092c1d50 docs: fix "incase" typo in coresight/panic.rst
Corrects a spelling mistake in Documentation/trace/coresight/panic.rst
where "incase" was used instead of "in case".

Signed-off-by: Hendrik Hamerlinck <hendrik.hamerlinck@hammernet.be>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20250513110931.15072-1-hendrik.hamerlinck@hammernet.be>
2025-05-19 08:02:14 -06:00
Steven Rostedt ac01fa73f5 tracepoint: Have tracepoints created with DECLARE_TRACE() have _tp suffix
Most tracepoints in the kernel are created with TRACE_EVENT(). The
TRACE_EVENT() macro (and DECLARE_EVENT_CLASS() and DEFINE_EVENT() where in
reality, TRACE_EVENT() is just a helper macro that calls those other two
macros), will create not only a tracepoint (the function trace_<event>()
used in the kernel), it also exposes the tracepoint to user space along
with defining what fields will be saved by that tracepoint.

There are a few places that tracepoints are created in the kernel that are
not exposed to userspace via tracefs. They can only be accessed from code
within the kernel. These tracepoints are created with DEFINE_TRACE()

Most of these tracepoints end with "_tp". This is useful as when the
developer sees that, they know that the tracepoint is for in-kernel only
(meaning it can only be accessed inside the kernel, either directly by the
kernel or indirectly via modules and BPF programs) and is not exposed to
user space.

Instead of making this only a process to add "_tp", enforce it by making
the DECLARE_TRACE() append the "_tp" suffix to the tracepoint. This
requires adding DECLARE_TRACE_EVENT() macros for the TRACE_EVENT() macro
to use that keeps the original name.

Link: https://lore.kernel.org/all/20250418083351.20a60e64@gandalf.local.home/

Cc: netdev <netdev@vger.kernel.org>
Cc: Jiri Olsa <olsajiri@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Breno Leitao <leitao@debian.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/20250510163730.092fad5b@gandalf.local.home
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-05-14 11:19:32 -04:00
Leo Yan 5161890f13 Documentation: coresight: Document AUX pause and resume
This adds description for AUX pause and resume.  It gives introduction
for what's AUX pause and resume and records usage examples.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20250401180708.385396-8-leo.yan@arm.com
2025-05-14 11:56:17 +01:00
Steven Rostedt 7b382efd5e tracing: Allow the top level trace_marker to write into another instances
There are applications that have it hard coded to write into the top level
trace_marker instance (/sys/kernel/tracing/trace_marker). This can be
annoying if a profiler is using that instance for other work, or if it
needs all writes to go into a new instance.

A new option is created called "copy_trace_marker". By default, the top
level has this set, as that is the default buffer that writing into the
top level trace_marker file will go to. But now if an instance is created
and sets this option, all writes into the top level trace_marker will also
be written into that instance buffer just as if an application were to
write into the instance's trace_marker file.

If the top level instance disables this option, then writes to its own
trace_marker and trace_marker_raw files will not go into its buffer.

If no instance has this option set, then the write will return an error
and errno will contain ENODEV.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20250508095639.39f84eda@gandalf.local.home
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-05-09 15:19:11 -04:00
Purva Yeshi f0ba72e655 Documentation: trace: Refactor toctree
Refactor table of contents of kernel tracing subsystem docs to improve
clarity, structure, and organization:

- Reformat sections and add appropriate headings
- Improve section grouping and refine descriptions for each group
- Add docs intro paragraph

Signed-off-by: Purva Yeshi <purvayeshi550@gmail.com>
Link: https://lore.kernel.org/r/20250318113230.24950-2-purvayeshi550@gmail.com
[Bagas: massage commit message and address reviews]
Co-developed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2025-04-21 10:37:13 -06:00
Purva Yeshi 33583537dd Documentation: trace: Reduce toctree depth
Reduce toctree depth from 2 to 1 so that only docs titles are listed
in the toctree.

Signed-off-by: Purva Yeshi <purvayeshi550@gmail.com>
Link: https://lore.kernel.org/r/20250318113230.24950-1-purvayeshi550@gmail.com
[Bagas: massage commit message]
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2025-04-21 10:37:01 -06:00
Nam Cao 244132c4e5 tracing/timers: Rename the hrtimer_init event to hrtimer_setup
The function hrtimer_init() doesn't exist anymore. It was replaced by
hrtimer_setup().

Thus, rename the hrtimer_init trace event to hrtimer_setup to keep it
consistent.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/all/cba84c3d853c5258aa3a262363a6eac08e2c7afc.1738746927.git.namcao@linutronix.de
2025-04-05 10:30:17 +02:00
Linus Torvalds 6cb0bd94c0 Persistent buffer cleanups and simplifications for v6.15:
It was mistaken that the physical memory returned from "reserve_mem" had to
 be vmap()'d to get to it from a virtual address. But reserve_mem already
 maps the memory to the virtual address of the kernel so a simple
 phys_to_virt() can be used to get to the virtual address from the physical
 memory returned by "reserve_mem". With this new found knowledge, the
 code can be cleaned up and simplified.
 
 - Enforce that the persistent memory is page aligned
 
   As the buffers using the persistent memory are all going to be
   mapped via pages, make sure that the memory given to the tracing
   infrastructure is page aligned. If it is not, it will print a warning
   and fail to map the buffer.
 
 - Use phys_to_virt() to get the virtual address from reserve_mem
 
   Instead of calling vmap() on the physical memory returned from
   "reserve_mem", use phys_to_virt() instead.
 
   As the memory returned by "memmap" or any other means where a physical
   address is given to the tracing infrastructure, it still needs to
   be vmap(). Since this memory can never be returned back to the buddy
   allocator nor should it ever be memmory mapped to user space, flag
   this buffer and up the ref count. The ref count will keep it from
   ever being freed, and the flag will prevent it from ever being memory
   mapped to user space.
 
 - Use vmap_page_range() for memmap virtual address mapping
 
   For the memmap buffer, instead of allocating an array of struct pages,
   assigning them to the contiguous phsycial memory and then passing that to
   vmap(), use vmap_page_range() instead
 
 - Replace flush_dcache_folio() with flush_kernel_vmap_range()
 
   Instead of calling virt_to_folio() and passing that to
   flush_dcache_folio(), just call flush_kernel_vmap_range() directly.
   This also fixes a bug where if a subbuffer was bigger than PAGE_SIZE
   only the PAGE_SIZE portion would be flushed.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ+6oZRQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qhq6AP481KHAgaowQCg7zrKPkMlbYBIigYoU
 7aqoAg2rSLBRSQEAl8fViHZgZ9Q+O7xdozQWiIR7/KQW8VIaTcP/V7cHkAU=
 =+5JB
 -----END PGP SIGNATURE-----

Merge tag 'trace-ringbuffer-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ring-buffer updates from Steven Rostedt:
 "Persistent buffer cleanups and simplifications.

  It was mistaken that the physical memory returned from "reserve_mem"
  had to be vmap()'d to get to it from a virtual address. But
  reserve_mem already maps the memory to the virtual address of the
  kernel so a simple phys_to_virt() can be used to get to the virtual
  address from the physical memory returned by "reserve_mem". With this
  new found knowledge, the code can be cleaned up and simplified.

   - Enforce that the persistent memory is page aligned

     As the buffers using the persistent memory are all going to be
     mapped via pages, make sure that the memory given to the tracing
     infrastructure is page aligned. If it is not, it will print a
     warning and fail to map the buffer.

   - Use phys_to_virt() to get the virtual address from reserve_mem

     Instead of calling vmap() on the physical memory returned from
     "reserve_mem", use phys_to_virt() instead.

     As the memory returned by "memmap" or any other means where a
     physical address is given to the tracing infrastructure, it still
     needs to be vmap(). Since this memory can never be returned back to
     the buddy allocator nor should it ever be memmory mapped to user
     space, flag this buffer and up the ref count. The ref count will
     keep it from ever being freed, and the flag will prevent it from
     ever being memory mapped to user space.

   - Use vmap_page_range() for memmap virtual address mapping

     For the memmap buffer, instead of allocating an array of struct
     pages, assigning them to the contiguous phsycial memory and then
     passing that to vmap(), use vmap_page_range() instead

   - Replace flush_dcache_folio() with flush_kernel_vmap_range()

     Instead of calling virt_to_folio() and passing that to
     flush_dcache_folio(), just call flush_kernel_vmap_range() directly.
     This also fixes a bug where if a subbuffer was bigger than
     PAGE_SIZE only the PAGE_SIZE portion would be flushed"

* tag 'trace-ringbuffer-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  ring-buffer: Use flush_kernel_vmap_range() over flush_dcache_folio()
  tracing: Use vmap_page_range() to map memmap ring buffer
  tracing: Have reserve_mem use phys_to_virt() and separate from memmap buffer
  tracing: Enforce the persistent ring buffer to be page aligned
2025-04-03 16:09:29 -07:00
Steven Rostedt c44a14f216 tracing: Enforce the persistent ring buffer to be page aligned
Enforce that the address and the size of the memory used by the persistent
ring buffer is page aligned. Also update the documentation to reflect this
requirement.

Link: https://lore.kernel.org/all/CAHk-=whUOfVucfJRt7E0AH+GV41ELmS4wJqxHDnui6Giddfkzw@mail.gmail.com/

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vincent Donnefort <vdonnefort@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/20250402144953.412882844@goodmis.org
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-02 11:02:26 -04:00
Linus Torvalds 25601e8544 Char/Misc/IIO driver updates for 6.15-rc1
Here is the big set of char, misc, iio, and other smaller driver
 subsystems for 6.15-rc1.  Lots of stuff in here, including:
   - loads of IIO changes and driver updates
   - counter driver updates
   - w1 driver updates
   - faux conversions for some drivers that were abusing the platform bus
     interface
   - coresight driver updates
   - rust miscdevice binding updates based on real-world-use
   - other minor driver updates
 
 All of these have been in linux-next with no reported issues for quite a
 while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZ+mNdQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylktACfYJix41jCCDbiFjnu7Hz4OIdcrUsAnRyF164M
 1n5MhEhsEmvQj7WBwQLE
 =AmmW
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char / misc / IIO driver updates from Greg KH:
 "Here is the big set of char, misc, iio, and other smaller driver
  subsystems for 6.15-rc1. Lots of stuff in here, including:

   - loads of IIO changes and driver updates

   - counter driver updates

   - w1 driver updates

   - faux conversions for some drivers that were abusing the platform
     bus interface

   - coresight driver updates

   - rust miscdevice binding updates based on real-world-use

   - other minor driver updates

  All of these have been in linux-next with no reported issues for quite
  a while"

* tag 'char-misc-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (292 commits)
  samples: rust_misc_device: fix markup in top-level docs
  Coresight: Fix a NULL vs IS_ERR() bug in probe
  misc: lis3lv02d: convert to use faux_device
  tlclk: convert to use faux_device
  regulator: dummy: convert to use the faux device interface
  bus: mhi: host: Fix race between unprepare and queue_buf
  coresight: configfs: Constify struct config_item_type
  doc: iio: ad7380: describe offload support
  iio: ad7380: add support for SPI offload
  iio: light: Add check for array bounds in veml6075_read_int_time_ms
  iio: adc: ti-ads7924 Drop unnecessary function parameters
  staging: iio: ad9834: Use devm_regulator_get_enable()
  staging: iio: ad9832: Use devm_regulator_get_enable()
  iio: gyro: bmg160_spi: add of_match_table
  dt-bindings: iio: adc: Add i.MX94 and i.MX95 support
  iio: adc: ad7768-1: remove unnecessary locking
  Documentation: ABI: add wideband filter type to sysfs-bus-iio
  iio: adc: ad7768-1: set MOSI idle state to prevent accidental reset
  iio: adc: ad7768-1: Fix conversion result sign
  iio: adc: ad7124: Benefit of dev = indio_dev->dev.parent in ad7124_parse_channel_config()
  ...
2025-04-01 11:26:08 -07:00
Linus Torvalds 609706855d Documentation fix for runtime verifier
The runtime verifier documents that were created were not referenced
 in the indices, which caused warning when building the documentation
 tree. Those documents are now added to the rv indices.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ+cRdhQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qhc3AQCm6C+HJZl28AT7jlFLBwUzgzHJdkQp
 hAVp0o6heMjRCwD/QAKUDKaTOeqmmvmP/u2CF/e/IFypBDauwsA3bhiKKA8=
 =2HEf
 -----END PGP SIGNATURE-----

Merge tag 'trace-latency-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing documentation fix from Steven Rostedt:
 "Documentation fix for runtime verifier

  The runtime verifier documents that were created were not referenced
  in the indices, which caused warning when building the documentation
  tree. Those documents are now added to the rv indices"

* tag 'trace-latency-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  Documentation/rv: Add sched pages to the indices
2025-03-31 09:56:08 -07:00
Linus Torvalds 88221ac0d5 Latency tracing changes for v6.15:
- Add some trace events to osnoise and timerlat sample generation
 
   This adds more information to the osnoise and timerlat tracers as well as
   allows BPF programs to be attached to these locations to extract even more
   data.
 
 - Fix to DECLARE_TRACE_CONDITION() macro
 
   It wasn't used but now will be and it happened to be broken causing the
   build to fail.
 
 - Add scheduler specification monitors to runtime verifier (RV)
 
   This is a continuation of Daniel Bristot's work.
 
   RV allows monitors to run and react concurrently. Running the cumulative
   model is equivalent to running single components using the same
   reactors, with the advantage that it's easier to point out which
   specification failed in case of error.
 
   This update introduces nested monitors to RV, in short, the sysfs
   monitor folder will contain a monitor named sched, which is nothing but
   an empty container for other monitors. Controlling the sched monitor
   (enable, disable, set reactors) controls all nested monitors.
 
   The following scheduling monitors are added:
 
   * sco: scheduling context operations
       Monitor to ensure sched_set_state happens only in thread context
   * tss: task switch while scheduling
       Monitor to ensure sched_switch happens only in scheduling context
   * snroc: set non runnable on its own context
       Monitor to ensure set_state happens only in the respective task's context
   * scpd: schedule called with preemption disabled
       Monitor to ensure schedule is called with preemption disabled
   * snep: schedule does not enable preempt
       Monitor to ensure schedule does not enable preempt
   * sncid: schedule not called with interrupt disabled
       Monitor to ensure schedule is not called with interrupt disabled
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ+QhuxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qg62AP9bkeNDbiCuAqjZGddV09Hw26wC3yum
 kQoNSebD8G52rQEA9GDjK37xGzYwW/fJokhJVTV39qfub6inAJE5dS6WeQY=
 =8Ikv
 -----END PGP SIGNATURE-----

Merge tag 'trace-latency-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull latency tracing updates from Steven Rostedt:

 - Add some trace events to osnoise and timerlat sample generation

   This adds more information to the osnoise and timerlat tracers as
   well as allows BPF programs to be attached to these locations to
   extract even more data.

 - Fix to DECLARE_TRACE_CONDITION() macro

   It wasn't used but now will be and it happened to be broken causing
   the build to fail.

 - Add scheduler specification monitors to runtime verifier (RV)

   This is a continuation of Daniel Bristot's work.

   RV allows monitors to run and react concurrently. Running the
   cumulative model is equivalent to running single components using the
   same reactors, with the advantage that it's easier to point out which
   specification failed in case of error.

   This update introduces nested monitors to RV, in short, the sysfs
   monitor folder will contain a monitor named sched, which is nothing
   but an empty container for other monitors. Controlling the sched
   monitor (enable, disable, set reactors) controls all nested monitors.

   The following scheduling monitors are added:

     - sco: scheduling context operations
       Monitor to ensure sched_set_state happens only in thread context

     - tss: task switch while scheduling
       Monitor to ensure sched_switch happens only in scheduling context

     - snroc: set non runnable on its own context
       Monitor to ensure set_state happens only in the respective task's context

     - scpd: schedule called with preemption disabled
       Monitor to ensure schedule is called with preemption disabled

     - snep: schedule does not enable preempt
       Monitor to ensure schedule does not enable preempt

     - sncid: schedule not called with interrupt disabled
       Monitor to ensure schedule is not called with interrupt disabled

* tag 'trace-latency-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tools/rv: Allow rv list to filter for container
  Documentation/rv: Add docs for the sched monitors
  verification/dot2k: Add support for nested monitors
  tools/rv: Add support for nested monitors
  rv: Add scpd, snep and sncid per-cpu monitors
  rv: Add snroc per-task monitor
  rv: Add sco and tss per-cpu monitors
  rv: Add option for nested monitors and include sched
  sched: Add sched tracepoints for RV task model
  rv: Add license identifiers to monitor files
  tracing: Fix DECLARE_TRACE_CONDITION
  trace/osnoise: Add trace events for samples
2025-03-27 16:03:52 -07:00
Gabriele Monaco 4bb5d82b66 Documentation/rv: Add sched pages to the indices
The pages Documentation/tools/rv/rv-mon-sched.rst and
Documentation/trace/rv/monitor_sched.rst were introduced but not
included in any index.

Add them to the respective indices.

Cc: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/20250327081240.46422-1-gmonaco@redhat.com
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 03abeaa63c ("Documentation/rv: Add docs for the sched monitors")
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-27 12:02:38 -04:00
Gabriele Monaco 03abeaa63c Documentation/rv: Add docs for the sched monitors
Add man page and kernel documentation for the sched monitors, as sched
is a container of other monitors, document all in the same page.
sched is the first nested monitor, also explain what is a nested monitor
and how enabling containers or children monitors work.

To: Ingo Molnar <mingo@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: John Kacur <jkacur@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Link: https://lore.kernel.org/20250305140406.350227-9-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-24 17:27:40 -04:00
James Clark 4a29fa2626 coresight: docs: Remove target sink from examples
Previously the sink had to be specified, but now it auto selects one by
default. Including a sink in the examples causes issues when copy
pasting the command because it might not work if that sink isn't
present. Remove the sink from all the basic examples and create a new
section specifically about overriding the default one.

Make the text a but more concise now that it's in the advanced section,
and similarly for removing the old kernel advice.

Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20241210144933.295798-1-james.clark@linaro.org
2025-03-11 14:55:57 +00:00
Linu Cherian b47d1fcd0d Documentation: coresight: Panic support
Add documentation on using coresight during panic
and watchdog.

Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20250212114918.548431-9-lcherian@marvell.com
2025-02-21 16:17:40 +00:00
Mauro Carvalho Chehab 6a0c4b61e1 docs: trace: decode_msr.py: make it compatible with python 3
This script uses print <foo> instead of print(foo), which is
incompatible with Python 3.

Fix it.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/88bb0d47100feaa3cda215e68bf6500dc67da7b3.1739257245.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2025-02-13 09:45:22 -07:00
Linus Torvalds e8744fbc83 tracing updates for v6.14:
- Cleanup with guard() and free() helpers
 
   There were several places in the code that had a lot of "goto out" in the
   error paths to either unlock a lock or free some memory that was
   allocated. But this is error prone. Convert the code over to use the
   guard() and free() helpers that let the compiler unlock locks or free
   memory when the function exits.
 
 - Update the Rust tracepoint code to use the C code too
 
   There was some duplication of the tracepoint code for Rust that did the
   same logic as the C code. Add a helper that makes it possible for both
   algorithms to use the same logic in one place.
 
 - Add poll to trace event hist files
 
   It is useful to know when an event is triggered, or even with some
   filtering. Since hist files of events get updated when active and the
   event is triggered, allow applications to poll the hist file and wake up
   when an event is triggered. This will let the application know that the
   event it is waiting for happened.
 
 - Add :mod: command to enable events for current or future modules
 
   The function tracer already has a way to enable functions to be traced in
   modules by writing ":mod:<module>" into set_ftrace_filter. That will
   enable either all the functions for the module if it is loaded, or if it
   is not, it will cache that command, and when the module is loaded that
   matches <module>, its functions will be enabled. This also allows init
   functions to be traced. But currently events do not have that feature.
 
   Add the command where if ':mod:<module>' is written into set_event, then
   either all the modules events are enabled if it is loaded, or cache it so
   that the module's events are enabled when it is loaded. This also works
   from the kernel command line, where "trace_event=:mod:<module>", when the
   module is loaded at boot up, its events will be enabled then.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ5EbMxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qkZsAP9Amgx9frSbR1pn1t0I3wVnQx7khgOu
 s/b8Ro+vjTx1/QD/RN2AA7f+HK4F27w3Aqfrs0nKXAPtXWsJ9Epp8raG5w8=
 =Pg+4
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Cleanup with guard() and free() helpers

   There were several places in the code that had a lot of "goto out" in
   the error paths to either unlock a lock or free some memory that was
   allocated. But this is error prone. Convert the code over to use the
   guard() and free() helpers that let the compiler unlock locks or free
   memory when the function exits.

 - Update the Rust tracepoint code to use the C code too

   There was some duplication of the tracepoint code for Rust that did
   the same logic as the C code. Add a helper that makes it possible for
   both algorithms to use the same logic in one place.

 - Add poll to trace event hist files

   It is useful to know when an event is triggered, or even with some
   filtering. Since hist files of events get updated when active and the
   event is triggered, allow applications to poll the hist file and wake
   up when an event is triggered. This will let the application know
   that the event it is waiting for happened.

 - Add :mod: command to enable events for current or future modules

   The function tracer already has a way to enable functions to be
   traced in modules by writing ":mod:<module>" into set_ftrace_filter.
   That will enable either all the functions for the module if it is
   loaded, or if it is not, it will cache that command, and when the
   module is loaded that matches <module>, its functions will be
   enabled. This also allows init functions to be traced. But currently
   events do not have that feature.

   Add the command where if ':mod:<module>' is written into set_event,
   then either all the modules events are enabled if it is loaded, or
   cache it so that the module's events are enabled when it is loaded.
   This also works from the kernel command line, where
   "trace_event=:mod:<module>", when the module is loaded at boot up,
   its events will be enabled then.

* tag 'trace-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits)
  tracing: Fix output of set_event for some cached module events
  tracing: Fix allocation of printing set_event file content
  tracing: Rename update_cache() to update_mod_cache()
  tracing: Fix #if CONFIG_MODULES to #ifdef CONFIG_MODULES
  selftests/ftrace: Add test that tests event :mod: commands
  tracing: Cache ":mod:" events for modules not loaded yet
  tracing: Add :mod: command to enabled module events
  selftests/tracing: Add hist poll() support test
  tracing/hist: Support POLLPRI event for poll on histogram
  tracing/hist: Add poll(POLLIN) support on hist file
  tracing: Fix using ret variable in tracing_set_tracer()
  tracepoint: Reduce duplication of __DO_TRACE_CALL
  tracing/string: Create and use __free(argv_free) in trace_dynevent.c
  tracing: Switch trace_stat.c code over to use guard()
  tracing: Switch trace_stack.c code over to use guard()
  tracing: Switch trace_osnoise.c code over to use guard() and __free()
  tracing: Switch trace_events_synth.c code over to use guard()
  tracing: Switch trace_events_filter.c code over to use guard()
  tracing: Switch trace_events_trigger.c code over to use guard()
  tracing: Switch trace_events_hist.c code over to use guard()
  ...
2025-01-23 17:51:16 -08:00
Linus Torvalds d0f93ac2c3 Documentation changes this time around include:
- Quite a bit of Chinese and Spanish translation work.
 
 - Clarifying that Git commit IDs >12chars are OK
 
 - A new nvme-multipath document
 
 - A reorganization of the admin-guide top-level page to make it readable
 
 - Clarification of the role of Acked-by and maintainer discretion on their
   acceptance.
 
 - Some reorganization of debugging-oriented docs.
 
 ...and typo fixes, documentation updates, etc. as usual.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmeOp8EPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YipUH/iffvlVYuqoVdPUFWdmsiNjwOCRE2MIfp8qO
 tPTRRHJAny+NlFT0IWlGUbLNoNXtvpN47YlkaeAjdrsjASerfpwzje7t4Z1B+jWT
 0YwGBCvDIGasfRCx7D14+w5aqkEEynfsy+QurwcuDxcHMQGwt7ZCuTNOVO6BULEr
 L++BMwqapUr5IemP4ItQqDVVF9sp6bWEhaOnTTJCLU6oG23uUSSA/59sJmwDJUk7
 6J3VGO1An4Jte9WX7qkVrSBNO5cOOhaFiFXIeNxfOioOPctBwxKiHDJnzVud8ipz
 R+tnUI/8hEvyJ7GZFezyZxmMnFs0P2DEYAkaN+hBs/nUjx0dKUg=
 =YxaS
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.14' of git://git.lwn.net/linux

Pull Documentation updates from Jonathan Corbet:

 - Quite a bit of Chinese and Spanish translation work

 - Clarifying that Git commit IDs >12chars are OK

 - A new nvme-multipath document

 - A reorganization of the admin-guide top-level page to make it
   readable

 - Clarification of the role of Acked-by and maintainer discretion on
   their acceptance

 - Some reorganization of debugging-oriented docs

... and typo fixes, documentation updates, etc as usual

* tag 'docs-6.14' of git://git.lwn.net/linux: (50 commits)
  Documentation: Fix x86_64 UEFI outdated references to elilo
  Documentation/sysctl: Add timer_migration to kernel.rst
  docs/mm: Physical memory: Remove zone_t
  docs: submitting-patches: clarify that signers may use their discretion on tags
  docs: submitting-patches: clarify difference between Acked-by and Reviewed-by
  docs: submitting-patches: clarify Acked-by and introduce "# Suffix"
  Documentation: bug-hunting.rst: remove odd contact information
  docs/zh_CN: Add sak index Chinese translation
  doc: module: DEFAULT_SYMBOL_NAMESPACE must be defined before #includes
  doc: module: Fix documented type of namespace
  Documentation/kernel-parameters: Fix a reference to vga-softcursor.rst
  docs/zh_CN: Add landlock index Chinese translation
  Documentation: Fix typo localmodonfig -> localmodconfig
  overlayfs.rst: Fix and improve grammar
  docs/zh_CN: Add siphash index Chinese translation
  docs/zh_CN: Add security IMA-templates Chinese translation
  docs/zh_CN: Add security digsig Chinese translation
  Align git commit ID abbreviation guidelines and checks
  docs: process: submitting-patches: split canonical patch format section
  docs/zh_CN: Add security lsm Chinese translation
  ...
2025-01-21 18:00:00 -08:00
Linus Torvalds 2e04247f7c ftrace updates for v6.14:
- Have fprobes built on top of function graph infrastructure
 
   The fprobe logic is an optimized kprobe that uses ftrace to attach to
   functions when a probe is needed at the start or end of the function. The
   fprobe and kretprobe logic implements a similar method as the function
   graph tracer to trace the end of the function. That is to hijack the
   return address and jump to a trampoline to do the trace when the function
   exits. To do this, a shadow stack needs to be created to store the
   original return address.  Fprobes and function graph do this slightly
   differently. Fprobes (and kretprobes) has slots per callsite that are
   reserved to save the return address. This is fine when just a few points
   are traced. But users of fprobes, such as BPF programs, are starting to add
   many more locations, and this method does not scale.
 
   The function graph tracer was created to trace all functions in the
   kernel. In order to do this, when function graph tracing is started, every
   task gets its own shadow stack to hold the return address that is going to
   be traced. The function graph tracer has been updated to allow multiple
   users to use its infrastructure. Now have fprobes be one of those users.
   This will also allow for the fprobe and kretprobe methods to trace the
   return address to become obsolete. With new technologies like CFI that
   need to know about these methods of hijacking the return address, going
   toward a solution that has only one method of doing this will make the
   kernel less complex.
 
 - Cleanup with guard() and free() helpers
 
   There were several places in the code that had a lot of "goto out" in the
   error paths to either unlock a lock or free some memory that was
   allocated. But this is error prone. Convert the code over to use the
   guard() and free() helpers that let the compiler unlock locks or free
   memory when the function exits.
 
 - Remove disabling of interrupts in the function graph tracer
 
   When function graph tracer was first introduced, it could race with
   interrupts and NMIs. To prevent that race, it would disable interrupts and
   not trace NMIs. But the code has changed to allow NMIs and also
   interrupts. This change was done a long time ago, but the disabling of
   interrupts was never removed. Remove the disabling of interrupts in the
   function graph tracer is it is not needed. This greatly improves its
   performance.
 
 - Allow the :mod: command to enable tracing module functions on the kernel
   command line.
 
   The function tracer already has a way to enable functions to be traced in
   modules by writing ":mod:<module>" into set_ftrace_filter. That will
   enable either all the functions for the module if it is loaded, or if it
   is not, it will cache that command, and when the module is loaded that
   matches <module>, its functions will be enabled. This also allows init
   functions to be traced. But currently events do not have that feature.
 
   Because enabling function tracing can be done very early at boot up
   (before scheduling is enabled), the commands that can be done when
   function tracing is started is limited. Having the ":mod:" command to
   trace module functions as they are loaded is very useful. Update the
   kernel command line function filtering to allow it.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ42E2RQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qqXSAPwOMxuhye8tb1GYG62QD9+w7e6nOmlC
 2GCPj4detnEM2QD/ciivkhespVKhHpZHRewAuSnJgHPSM45NQ3EVESzjWQ4=
 =snbx
 -----END PGP SIGNATURE-----

Merge tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ftrace updates from Steven Rostedt:

 - Have fprobes built on top of function graph infrastructure

   The fprobe logic is an optimized kprobe that uses ftrace to attach to
   functions when a probe is needed at the start or end of the function.
   The fprobe and kretprobe logic implements a similar method as the
   function graph tracer to trace the end of the function. That is to
   hijack the return address and jump to a trampoline to do the trace
   when the function exits. To do this, a shadow stack needs to be
   created to store the original return address. Fprobes and function
   graph do this slightly differently. Fprobes (and kretprobes) has
   slots per callsite that are reserved to save the return address. This
   is fine when just a few points are traced. But users of fprobes, such
   as BPF programs, are starting to add many more locations, and this
   method does not scale.

   The function graph tracer was created to trace all functions in the
   kernel. In order to do this, when function graph tracing is started,
   every task gets its own shadow stack to hold the return address that
   is going to be traced. The function graph tracer has been updated to
   allow multiple users to use its infrastructure. Now have fprobes be
   one of those users. This will also allow for the fprobe and kretprobe
   methods to trace the return address to become obsolete. With new
   technologies like CFI that need to know about these methods of
   hijacking the return address, going toward a solution that has only
   one method of doing this will make the kernel less complex.

 - Cleanup with guard() and free() helpers

   There were several places in the code that had a lot of "goto out" in
   the error paths to either unlock a lock or free some memory that was
   allocated. But this is error prone. Convert the code over to use the
   guard() and free() helpers that let the compiler unlock locks or free
   memory when the function exits.

 - Remove disabling of interrupts in the function graph tracer

   When function graph tracer was first introduced, it could race with
   interrupts and NMIs. To prevent that race, it would disable
   interrupts and not trace NMIs. But the code has changed to allow NMIs
   and also interrupts. This change was done a long time ago, but the
   disabling of interrupts was never removed. Remove the disabling of
   interrupts in the function graph tracer is it is not needed. This
   greatly improves its performance.

 - Allow the :mod: command to enable tracing module functions on the
   kernel command line.

   The function tracer already has a way to enable functions to be
   traced in modules by writing ":mod:<module>" into set_ftrace_filter.
   That will enable either all the functions for the module if it is
   loaded, or if it is not, it will cache that command, and when the
   module is loaded that matches <module>, its functions will be
   enabled. This also allows init functions to be traced. But currently
   events do not have that feature.

   Because enabling function tracing can be done very early at boot up
   (before scheduling is enabled), the commands that can be done when
   function tracing is started is limited. Having the ":mod:" command to
   trace module functions as they are loaded is very useful. Update the
   kernel command line function filtering to allow it.

* tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits)
  ftrace: Implement :mod: cache filtering on kernel command line
  tracing: Adopt __free() and guard() for trace_fprobe.c
  bpf: Use ftrace_get_symaddr() for kprobe_multi probes
  ftrace: Add ftrace_get_symaddr to convert fentry_ip to symaddr
  Documentation: probes: Update fprobe on function-graph tracer
  selftests/ftrace: Add a test case for repeating register/unregister fprobe
  selftests: ftrace: Remove obsolate maxactive syntax check
  tracing/fprobe: Remove nr_maxactive from fprobe
  fprobe: Add fprobe_header encoding feature
  fprobe: Rewrite fprobe on function-graph tracer
  s390/tracing: Enable HAVE_FTRACE_GRAPH_FUNC
  ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC
  bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled
  tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
  tracing: Add ftrace_fill_perf_regs() for perf event
  tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  fprobe: Use ftrace_regs in fprobe exit handler
  fprobe: Use ftrace_regs in fprobe entry handler
  fgraph: Pass ftrace_regs to retfunc
  fgraph: Replace fgraph_ret_regs with ftrace_regs
  ...
2025-01-21 15:15:28 -08:00