linux/kernel/trace
Linus Torvalds 21fbefc588 tracing updates for v6.18:
- Use READ_ONCE() and WRITE_ONCE() instead of RCU for syscall tracepoints
 
   Individual system call trace events are pseudo events attached to the
   raw_syscall trace events that just trace the entry and exit of all system
   calls. When any of these individual system call trace events get enabled,
   an element in an array indexed by the system call number is assigned to
   the trace file that defines how to trace it. When the trace event
   triggers, it reads this array and if the array has an element, it uses that
   trace file to know what to write (the trace file defines the output
   format of the corresponding system call).
 
   The issue is that it uses rcu_dereference_ptr() and marks the elements of
   the array as using RCU. This is incorrect. There is no RCU synchronization
   here. The event file that is pointed to has a completely different way to
   make sure it's freed properly. The reading of the array during the system
   call trace event is only to know if there is a value or not. If not, it
   does nothing (it means this system call isn't being traced). If it does,
   it uses the information to store the system call data.
 
   The RCU usage here can simply be replaced by READ_ONCE() and WRITE_ONCE()
   macros.
 
 - Have the system call trace events use "0x" for hex values
 
   Some system call trace events display hex values but do not have "0x" in
   front of them. A reader seeing "count: 44" may assume it is 44 decimal
   when it is actually 44 hex (68 decimal). Display "0x44" instead.
 
 - Use vmalloc_array() in tracing_map_sort_entries()
 
   The function tracing_map_sort_entries() used array_size() and vmalloc()
   when it could have simply used vmalloc_array().
 
 - Use for_each_online_cpu() in trace_osnoise.c
 
   Instead of open coding for_each_cpu(cpu, cpu_online_mask), use
   for_each_online_cpu().
 
 - Move the buffer field in struct trace_seq to the end
 
   The size of the buffer field in struct trace_seq is architecture
   dependent, and it caused padding before the fields after it. Moving the
   buffer to the end of the structure makes trace_seq more compact.
 
 - Remove redundant zeroing of cmdline_idx field in saved_cmdlines_buffer()
 
   The structure that contains cmdline_idx is zeroed by memset(), so there
   is no need to explicitly zero any of its fields after that.
 
 - Use system_percpu_wq instead of system_wq in user_event_mm_remove()
 
   As system_wq is being deprecated, use the new wq.
 
 - Add cond_resched() in ftrace_module_enable()
 
   Some modules have a lot of functions (thousands of them), and enabling
   those functions can take some time. On non-preemptible kernels, it was
   triggering a watchdog timeout. Add a cond_resched() to prevent that.
 
 - Add a BUILD_BUG_ON() to make sure PID_MAX_DEFAULT is always a power of 2
 
   There's code that depends on PID_MAX_DEFAULT being a power of 2 and will
   break otherwise. If that ever changes, fail the build so that the
   dependent code gets fixed.
 
 - Grab mutex_lock() before ever exiting s_start()
 
   The s_start() function is a seq_file start routine. s_stop() is always
   called even if s_start() fails, and it expects the event_mutex to be held
   because it always releases it. The mutex must therefore always be taken
   in s_start(), even if that function fails.
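
The locking rule in the last item can be sketched in userspace C. This is only a model, not the kernel code: the depth counter stands in for event_mutex, and every demo_* name is hypothetical. The point it illustrates is that the start routine takes the lock before any path that can fail, so the unconditional unlock in the stop routine always balances.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for event_mutex: a depth counter makes balance checkable. */
static int demo_lock_depth;

static void demo_lock(void)   { demo_lock_depth++; }
static void demo_unlock(void) { demo_lock_depth--; }

/*
 * Start routine: take the lock first, then do the work that may fail.
 * On failure it returns NULL with the lock still held.
 */
static void *demo_start(bool fail_alloc)
{
    demo_lock();
    return fail_alloc ? NULL : malloc(16);
}

/* Stop routine: always called after start, and always drops the lock. */
static void demo_stop(void *iter)
{
    assert(demo_lock_depth > 0);   /* stop expects the lock to be held */
    free(iter);                    /* free(NULL) is a no-op */
    demo_unlock();
}

/* Run one start/stop cycle and report the lock depth left behind. */
static int demo_cycle(bool fail_alloc)
{
    void *iter = demo_start(fail_alloc);

    demo_stop(iter);
    return demo_lock_depth;
}
```

If demo_start() returned before calling demo_lock() on the failure path, demo_stop() would release a lock that was never taken, which is the imbalance the item describes.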

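The trace_seq item in the list above comes down to struct padding. A minimal sketch with made-up structs (the field names and the 12-byte buffer are assumptions chosen for illustration, not the real trace_seq layout) shows how a char array whose size is not a multiple of the alignment of the next field forces padding when it comes first, and how moving it to the end removes that padding.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BUF_SIZE 12   /* hypothetical size, not a multiple of 8 */

/* Buffer first: the following size_t may need padding to reach its alignment. */
struct demo_seq_before {
    char   buffer[DEMO_BUF_SIZE];
    size_t readpos;
    int    full;
};

/* Buffer last: the aligned fields pack together and the bytes follow them. */
struct demo_seq_after {
    size_t readpos;
    int    full;
    char   buffer[DEMO_BUF_SIZE];
};

/* Same fields in both structs, so reordering can only help or break even. */
static_assert(sizeof(struct demo_seq_after) <= sizeof(struct demo_seq_before),
              "moving the buffer to the end must not grow the struct");

/* Bytes saved by the reordering on the target this is compiled for. */
static size_t demo_bytes_saved(void)
{
    return sizeof(struct demo_seq_before) - sizeof(struct demo_seq_after);
}
```

On a typical LP64 target the first layout should occupy 32 bytes and the second 24. On targets where size_t is 4 bytes the two are the same size, which matches the statement that the effect is architecture dependent.
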
Merge tag 'trace-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

* tag 'trace-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Fix lock imbalance in s_start() memory allocation failure path
  tracing: Ensure optimized hashing works
  ftrace: Fix softlockup in ftrace_module_enable
  tracing: replace use of system_wq with system_percpu_wq
  tracing: Remove redundant 0 value initialization
  tracing: Move buffer in trace_seq to end of struct
  tracing/osnoise: Use for_each_online_cpu() instead of for_each_cpu()
  tracing: Use vmalloc_array() to improve code
  tracing: Have syscall trace events show "0x" for values greater than 10
  tracing: Replace syscall RCU pointer assignment with READ/WRITE_ONCE()
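
The pattern behind the READ/WRITE_ONCE() change can be sketched in userspace C. The macros below are simplified stand-ins for the kernel ones (they rely on the GNU C __typeof__ extension), and the table size, struct, and demo_* functions are hypothetical. The sketch shows an array indexed by system call number where a NULL slot means "not traced" and a non-NULL slot supplies the output format.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel macros: one volatile access each. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define DEMO_NR_SYSCALLS 8   /* hypothetical table size */

/* Hypothetical stand-in for the trace file that defines the output format. */
struct demo_event_file {
    const char *fmt;
};

/* One slot per system call number; NULL means the call is not traced. */
static struct demo_event_file *demo_enabled[DEMO_NR_SYSCALLS];

static void demo_enable(int nr, struct demo_event_file *file)
{
    assert(nr >= 0 && nr < DEMO_NR_SYSCALLS);
    WRITE_ONCE(demo_enabled[nr], file);
}

static void demo_disable(int nr)
{
    assert(nr >= 0 && nr < DEMO_NR_SYSCALLS);
    WRITE_ONCE(demo_enabled[nr], NULL);
}

/* Returns the format to use for this call, or NULL if it is not traced. */
static const char *demo_trace(int nr)
{
    struct demo_event_file *file;

    if (nr < 0 || nr >= DEMO_NR_SYSCALLS)
        return NULL;

    /* Read the slot exactly once so the check and the use see the same value. */
    file = READ_ONCE(demo_enabled[nr]);
    if (!file)
        return NULL;
    return file->fmt;
}
```

The sketch deliberately leaves out the lifetime of the pointed-to file. As the pull message says, that is handled by a separate mechanism, which is why plain once-only loads and stores are enough for the slot itself.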
2025-10-05 09:43:36 -07:00
..
rv rv: Fix missing mutex unlock in rv_register_monitor() 2025-09-15 08:36:35 +02:00
Kconfig tracing changes for 6.17 2025-08-01 10:29:36 -07:00
Makefile tracing: Have eprobes have their own config option 2025-07-30 10:38:43 -04:00
blktrace.c Significant patch series in this pull request: 2025-08-03 16:23:09 -07:00
bpf_trace.c file->f_path constification 2025-10-03 16:32:36 -07:00
bpf_trace.h
error_report-traces.c
fgraph.c tracing: fgraph: Protect return handler from recursion loop 2025-09-27 09:04:05 -04:00
fprobe.c tracing: fprobe: Fix to remove recorded module addresses from filter 2025-09-24 23:18:26 +09:00
ftrace.c ftrace: Fix softlockup in ftrace_module_enable 2025-09-30 17:27:58 -04:00
ftrace_internal.h
kprobe_event_gen_test.c
pid_list.c
pid_list.h
power-traces.c
preemptirq_delay_test.c
rethook.c
ring_buffer.c ring-buffer: Remove redundant semicolons 2025-08-20 09:20:30 -04:00
ring_buffer_benchmark.c
rpm-traces.c
synth_event_gen_test.c
trace.c vfs_parse_fs_string() stuff 2025-10-03 10:51:44 -07:00
trace.h tracing: Replace syscall RCU pointer assignment with READ/WRITE_ONCE() 2025-09-23 09:29:29 -04:00
trace_benchmark.c
trace_benchmark.h
trace_boot.c
trace_branch.c
trace_btf.c
trace_btf.h
trace_clock.c
trace_dynevent.c tracing: dynevent: Add a missing lockdown check on dynevent 2025-09-25 00:22:46 +09:00
trace_dynevent.h
trace_entries.h
trace_eprobe.c tracing: Have eprobes handle arrays 2025-07-24 22:57:32 +09:00
trace_event_perf.c
trace_events.c tracing: Fix lock imbalance in s_start() memory allocation failure path 2025-10-03 12:13:12 -04:00
trace_events_filter.c tracing changes for 6.17 2025-08-01 10:29:36 -07:00
trace_events_filter_test.h
trace_events_hist.c
trace_events_inject.c
trace_events_synth.c tracing: Add guard(ring_buffer_nest) 2025-08-01 16:49:15 -04:00
trace_events_trigger.c
trace_events_user.c tracing updates for v6.18: 2025-10-05 09:43:36 -07:00
trace_export.c
trace_fprobe.c tracing: Fix race condition in kprobe initialization causing NULL pointer dereference 2025-10-02 08:05:01 +09:00
trace_functions.c
trace_functions_graph.c fgraph: Copy args in intermediate storage with entry 2025-08-22 17:32:35 -04:00
trace_hwlat.c
trace_irqsoff.c
trace_kdb.c
trace_kprobe.c tracing: Fix race condition in kprobe initialization causing NULL pointer dereference 2025-10-02 08:05:01 +09:00
trace_kprobe_selftest.c
trace_kprobe_selftest.h
trace_mmiotrace.c
trace_nop.c
trace_osnoise.c tracing updates for v6.18: 2025-10-05 09:43:36 -07:00
trace_output.c tracing: Have unsigned int function args displayed as hexadecimal 2025-08-01 19:14:51 -04:00
trace_output.h
trace_preemptirq.c
trace_printk.c
trace_probe.c Probes updates for v6.17: 2025-07-30 15:38:01 -07:00
trace_probe.h tracing: Fix race condition in kprobe initialization causing NULL pointer dereference 2025-10-02 08:05:01 +09:00
trace_probe_kernel.h
trace_probe_tmpl.h
trace_recursion_record.c
trace_sched_switch.c tracing: Ensure optimized hashing works 2025-09-30 17:27:58 -04:00
trace_sched_wakeup.c
trace_selftest.c
trace_selftest_dynamic.c
trace_seq.c
trace_stack.c
trace_stat.c
trace_stat.h
trace_synth.h
trace_syscalls.c tracing: Have syscall trace events show "0x" for values greater than 10 2025-09-23 09:29:29 -04:00
trace_uprobe.c tracing: Fix race condition in kprobe initialization causing NULL pointer dereference 2025-10-02 08:05:01 +09:00
tracing_map.c tracing: Use vmalloc_array() to improve code 2025-09-23 09:31:58 -04:00
tracing_map.h