mirror of https://github.com/torvalds/linux.git
4524 Commits
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
f91c031e65 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says: ==================== pull-request: bpf-next 2020-07-04 The following pull-request contains BPF updates for your *net-next* tree. We've added 73 non-merge commits during the last 17 day(s) which contain a total of 106 files changed, 5233 insertions(+), 1283 deletions(-). The main changes are: 1) bpftool ability to show PIDs of processes having open file descriptors for BPF map/program/link/BTF objects, relying on BPF iterator progs to extract this info efficiently, from Andrii Nakryiko. 2) Addition of BPF iterator progs for dumping TCP and UDP sockets to seq_files, from Yonghong Song. 3) Support access to BPF map fields in struct bpf_map from programs through BTF struct access, from Andrey Ignatov. 4) Add a bpf_get_task_stack() helper to be able to dump /proc/*/stack via seq_file from BPF iterator progs, from Song Liu. 5) Make SO_KEEPALIVE and related options available to bpf_setsockopt() helper, from Dmitry Yakunin. 6) Optimize BPF sk_storage selection of its caching index, from Martin KaFai Lau. 7) Removal of redundant synchronize_rcu()s from BPF map destruction which has been a historic leftover, from Alexei Starovoitov. 8) Several improvements to test_progs to make it easier to create a shell loop that invokes each test individually which is useful for some CIs, from Jesper Dangaard Brouer. 9) Fix bpftool prog dump segfault when compiled without skeleton code on older clang versions, from John Fastabend. 10) Bunch of cleanups and minor improvements, from various others. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
2df6bb5493 |
bpf: Allow %pB in bpf_seq_printf() and bpf_trace_printk()
This makes it easy to dump stack trace in text. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200630062846.664389-4-songliubraving@fb.com |
|
|
|
fa28dcb82a |
bpf: Introduce helper bpf_get_task_stack()
Introduce helper bpf_get_task_stack(), which dumps stack trace of given task. This is different to bpf_get_stack(), which gets stack track of current task. One potential use case of bpf_get_task_stack() is to call it from bpf_iter__task and dump all /proc/<pid>/stack to a seq_file. bpf_get_task_stack() uses stack_trace_save_tsk() instead of get_perf_callchain() for kernel stack. The benefit of this choice is that stack_trace_save_tsk() doesn't require changes in arch/. The downside of using stack_trace_save_tsk() is that stack_trace_save_tsk() dumps the stack trace to unsigned long array. For 32-bit systems, we need to translate it to u64 array. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200630062846.664389-3-songliubraving@fb.com |
|
|
|
4a21185cda |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
1) Don't insert ESP trailer twice in IPSEC code, from Huy Nguyen.
2) The default crypto algorithm selection in Kconfig for IPSEC is out
of touch with modern reality, fix this up. From Eric Biggers.
3) bpftool is missing an entry for BPF_MAP_TYPE_RINGBUF, from Andrii
Nakryiko.
4) Missing init of ->frame_sz in xdp_convert_zc_to_xdp_frame(), from
Hangbin Liu.
5) Adjust packet alignment handling in ax88179_178a driver to match
what the hardware actually does. From Jeremy Kerr.
6) register_netdevice can leak in the case one of the notifiers fail,
from Yang Yingliang.
7) Use after free in ip_tunnel_lookup(), from Taehee Yoo.
8) VLAN checks in sja1105 DSA driver need adjustments, from Vladimir
Oltean.
9) tg3 driver can sleep forever when we get enough EEH errors, fix from
David Christensen.
10) Missing {READ,WRITE}_ONCE() annotations in various Intel ethernet
drivers, from Ciara Loftus.
11) Fix scanning loop break condition in of_mdiobus_register(), from
Florian Fainelli.
12) MTU limit is incorrect in ibmveth driver, from Thomas Falcon.
13) Endianness fix in mlxsw, from Ido Schimmel.
14) Use after free in smsc95xx usbnet driver, from Tuomas Tynkkynen.
15) Missing bridge mrp configuration validation, from Horatiu Vultur.
16) Fix circular netns references in wireguard, from Jason A. Donenfeld.
17) PTP initialization on recovery is not done properly in qed driver,
from Alexander Lobakin.
18) Endian conversion of L4 ports in filters of cxgb4 driver is wrong,
from Rahul Lakkireddy.
19) Don't clear bound device TX queue of socket prematurely otherwise we
get problems with ktls hw offloading, from Tariq Toukan.
20) ipset can do atomics on unaligned memory, fix from Russell King.
21) Align ethernet addresses properly in bridging code, from Thomas
Martitz.
22) Don't advertise ipv4 addresses on SCTP sockets having ipv6only set,
from Marcelo Ricardo Leitner.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (149 commits)
rds: transport module should be auto loaded when transport is set
sch_cake: fix a few style nits
sch_cake: don't call diffserv parsing code when it is not needed
sch_cake: don't try to reallocate or unshare skb unconditionally
ethtool: fix error handling in linkstate_prepare_data()
wil6210: account for napi_gro_receive never returning GRO_DROP
hns: do not cast return value of napi_gro_receive to null
socionext: account for napi_gro_receive never returning GRO_DROP
wireguard: receive: account for napi_gro_receive never returning GRO_DROP
vxlan: fix last fdb index during dump of fdb with nhid
sctp: Don't advertise IPv4 addresses if ipv6only is set on the socket
tc-testing: avoid action cookies with odd length.
bpf: tcp: bpf_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
net: dsa: sja1105: fix tc-gate schedule with single element
net: dsa: sja1105: recalculate gating subschedule after deleting tc-gate rules
net: dsa: sja1105: unconditionally free old gating config
net: dsa: sja1105: move sja1105_compose_gating_subschedule at the top
net: macb: free resources on failure path of at91ether_open()
net: macb: call pm_runtime_put_sync on failure path
...
|
|
|
|
0d4fad3e57 |
bpf: Add bpf_skc_to_udp6_sock() helper
The helper is used in tracing programs to cast a socket pointer to a udp6_sock pointer. The return value could be NULL if the casting is illegal. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Cc: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/bpf/20200623230815.3988481-1-yhs@fb.com |
|
|
|
478cfbdf5f |
bpf: Add bpf_skc_to_{tcp, tcp_timewait, tcp_request}_sock() helpers
Three more helpers are added to cast a sock_common pointer to an tcp_sock, tcp_timewait_sock or a tcp_request_sock for tracing programs. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230811.3988277-1-yhs@fb.com |
|
|
|
af7ec13833 |
bpf: Add bpf_skc_to_tcp6_sock() helper
The helper is used in tracing programs to cast a socket pointer to a tcp6_sock pointer. The return value could be NULL if the casting is illegal. A new helper return type RET_PTR_TO_BTF_ID_OR_NULL is added so the verifier is able to deduce proper return types for the helper. Different from the previous BTF_ID based helpers, the bpf_skc_to_tcp6_sock() argument can be several possible btf_ids. More specifically, all possible socket data structures with sock_common appearing in the first in the memory layout. This patch only added socket types related to tcp and udp. All possible argument btf_id and return value btf_id for helper bpf_skc_to_tcp6_sock() are pre-calculcated and cached. In the future, it is even possible to precompute these btf_id's at kernel build time. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230809.3988195-1-yhs@fb.com |
|
|
|
72e2b2b66f |
bpf: Allow tracing programs to use bpf_jiffies64() helper
/proc/net/tcp{4,6} uses jiffies for various computations.
Let us add bpf_jiffies64() helper to tracing program
so bpf_iter and other programs can use it.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230808.3988073-1-yhs@fb.com
|
|
|
|
c06b022957 |
bpf: Support 'X' in bpf_seq_printf() helper
'X' tells kernel to print hex with upper case letters.
/proc/net/tcp{4,6} seq_file show() used this, and
supports it in bpf_seq_printf() helper too.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230807.3988014-1-yhs@fb.com
|
|
|
|
20dc3847cc |
tracing/boottime: Fix kprobe multiple events
Fix boottime kprobe events to report and abort after each failure when
adding probes.
As an example, when we try to set multiprobe kprobe events in
bootconfig like this:
ftrace.event.kprobes.vfsevents {
probes = "vfs_read $arg1 $arg2,,
!error! not reported;?", // leads to error
"vfs_write $arg1 $arg2"
}
This will not work as expected. After
commit
|
|
|
|
6784beada6 |
tracing: Fix event trigger to accept redundant spaces
Fix the event trigger to accept redundant spaces in
the trigger input.
For example, these return -EINVAL
echo " traceon" > events/ftrace/print/trigger
echo "traceon if common_pid == 0" > events/ftrace/print/trigger
echo "disable_event:kmem:kmalloc " > events/ftrace/print/trigger
But these are hard to find what is wrong.
To fix this issue, use skip_spaces() to remove spaces
in front of actual tokens, and set NULL if there is no
token.
Link: http://lkml.kernel.org/r/159262476352.185015.5261566783045364186.stgit@devnote2
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Fixes:
|
|
|
|
6c95503c29 |
tracing/boot: Fix config dependency for synthedic event
Since commit |
|
|
|
4e608675e7 | Merge up to bpf_probe_read_kernel_str() fix into bpf-next | |
|
|
097350d1c6 |
ring-buffer: Zero out time extend if it is nested and not absolute
Currently the ring buffer makes events that happen in interrupts that preempt
another event have a delta of zero. (Hopefully we can change this soon). But
this is to deal with the races of updating a global counter with lockless
and nesting functions updating deltas.
With the addition of absolute time stamps, the time extend didn't follow
this rule. A time extend can happen if two events happen longer than 2^27
nanoseconds appart, as the delta time field in each event is only 27 bits.
If that happens, then a time extend is injected with 2^59 bits of
nanoseconds to use (18 years). But if the 2^27 nanoseconds happen between
two events, and as it is writing the event, an interrupt triggers, it will
see the 2^27 difference as well and inject a time extend of its own. But a
recent change made the time extend logic not take into account the nesting,
and this can cause two time extend deltas to happen moving the time stamp
much further ahead than the current time. This gets all reset when the ring
buffer moves to the next page, but that can cause time to appear to go
backwards.
This was observed in a trace-cmd recording, and since the data is saved in a
file, with trace-cmd report --debug, it was possible to see that this indeed
did happen!
bash-52501 110d... 81778.908247: sched_switch: bash:52501 [120] S ==> swapper/110:0 [120] [12770284:0x2e8:64]
<idle>-0 110d... 81778.908757: sched_switch: swapper/110:0 [120] R ==> bash:52501 [120] [509947:0x32c:64]
TIME EXTEND: delta:306454770 length:0
bash-52501 110.... 81779.215212: sched_swap_numa: src_pid=52501 src_tgid=52388 src_ngid=52501 src_cpu=110 src_nid=2 dst_pid=52509 dst_tgid=52388 dst_ngid=52501 dst_cpu=49 dst_nid=1 [0:0x378:48]
TIME EXTEND: delta:306458165 length:0
bash-52501 110dNh. 81779.521670: sched_wakeup: migration/110:565 [0] success=1 CPU:110 [0:0x3b4:40]
and at the next page, caused the time to go backwards:
bash-52504 110d... 81779.685411: sched_switch: bash:52504 [120] S ==> swapper/110:0 [120] [8347057:0xfb4:64]
CPU:110 [SUBBUFFER START] [81779379165886:0x1320000]
<idle>-0 110dN.. 81779.379166: sched_wakeup: bash:52504 [120] success=1 CPU:110 [0:0x10:40]
<idle>-0 110d... 81779.379167: sched_switch: swapper/110:0 [120] R ==> bash:52504 [120] [1168:0x3c:64]
Link: https://lkml.kernel.org/r/20200622151815.345d1bf5@oasis.local.home
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Fixes:
|
|
|
|
8b6ddd10d6 |
A few fixes and small cleanups for tracing:
- Have recordmcount work with > 64K sections (to support LTO) - kprobe RCU fixes - Correct a kprobe critical section with missing mutex - Remove redundant arch_disarm_kprobe() call - Fix lockup when kretprobe triggers within kprobe_flush_task() - Fix memory leak in fetch_op_data operations - Fix sleep in atomic in ftrace trace array sample code - Free up memory on failure in sample trace array code - Fix incorrect reporting of function_graph fields in format file - Fix quote within quote parsing in bootconfig - Fix return value of bootconfig tool - Add testcases for bootconfig tool - Fix maybe uninitialized warning in ftrace pid file code - Remove unused variable in tracing_iter_reset() - Fix some typos -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXu1jrRQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qoCMAP91nOccE3X+Nvc3zET3isDWnl1tWJxk icsBgN/JwBRuTAD/dnWTHIWM2/5lTiagvyVsmINdJHP6JLr8T7dpN9tlxAQ= =Cuo7 -----END PGP SIGNATURE----- Merge tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Have recordmcount work with > 64K sections (to support LTO) - kprobe RCU fixes - Correct a kprobe critical section with missing mutex - Remove redundant arch_disarm_kprobe() call - Fix lockup when kretprobe triggers within kprobe_flush_task() - Fix memory leak in fetch_op_data operations - Fix sleep in atomic in ftrace trace array sample code - Free up memory on failure in sample trace array code - Fix incorrect reporting of function_graph fields in format file - Fix quote within quote parsing in bootconfig - Fix return value of bootconfig tool - Add testcases for bootconfig tool - Fix maybe uninitialized warning in ftrace pid file code - Remove unused variable in tracing_iter_reset() - Fix some typos * tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Fix maybe-uninitialized compiler warning tools/bootconfig: Add testcase for show-command and quotes test tools/bootconfig: Fix to return 0 if succeeded to show the bootconfig tools/bootconfig: Fix to use correct quotes for value proc/bootconfig: Fix to use correct quotes for value tracing: Remove unused event variable in tracing_iter_reset tracing/probe: Fix memleak in fetch_op_data operations trace: Fix typo in allocate_ftrace_ops()'s comment tracing: Make ftrace packed events have align of 1 sample-trace-array: Remove trace_array 'sample-instance' sample-trace-array: Fix sleeping function called from invalid context kretprobe: Prevent triggering kretprobe from within kprobe_flush_task kprobes: Remove redundant arch_disarm_kprobe() call kprobes: Fix to protect kick_kprobe_optimizer() by kprobe_mutex kprobes: Use non RCU traversal APIs on kprobe_tables if possible kprobes: Suppress the suspicious RCU warning on kprobes recordmcount: support >64k sections |
|
|
|
d2b1c81f5f |
block-5.8-2020-06-19
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl7s0SAQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpp+YEACVqFvsfzxKCqa61IzyuOaPfnj9awyP+MY2 7V6y9sDDHL8sp6aPDbHvqFnqz0O7E+7nHVZD2rf2qc6tKKMvJYNO/BFZSXPvWTZV KQ4cBChf/LDwqAKOnI4ZhmF5UcSyyob1yMy4uJ+U0gQiXXrRMbwJ3N1K24a9dr4c epkzGavR0Q+PJ9BbUgjACjbRdT+vrP4bOu0cuyCGkIpD9eCerKJ6mFaUAj0FDthD bg4BJj+c8Ij6LO0V++Wga6OxccmL43KeP0ky8B3x07PfAl+tDWqsbHSlU2YPtdcq 5nKgMMTW16mVnZeO2/W0JB7tn89VubsmyvIFcm2KNeeRqSnEZyW9HI8n4kq994Ju xMH24lgbsU4trNeYkgOmzPoJJZ+LShkn+rnldyI1U/fhpEYub7DqfVySuT7ti9in uFpQdeRUmPsdw92F3+o6h8OYAflpcQQ7CblkzxPEeV4OyzOZasb+S9tMNPe59KBh 0MtHv9IfzgtDihR6HuXifitXaP+GtH4x3D2z0dzEdooHKHC/+P3WycS5daG+3WKQ xV5lJruvpTuxhXKLFAH0wRrxnVlB0VUvhQ21T3WgHrwF0btbdmQMHFc83XOxBIB4 jHWJMHGc4xp1ZdpWFBC8Cj79OmJh1w/ao8+/cf8SUoTB0LzFce1B8LvwnxgpcpUk VjIOrl7zhQ== =LeLd -----END PGP SIGNATURE----- Merge tag 'block-5.8-2020-06-19' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: - Use import_uuid() where appropriate (Andy) - bcache fixes (Coly, Mauricio, Zhiqiang) - blktrace sparse warnings fix (Jan) - blktrace concurrent setup fix (Luis) - blkdev_get use-after-free fix (Jason) - Ensure all blk-mq maps are updated (Weiping) - Loop invalidate bdev fix (Zheng) * tag 'block-5.8-2020-06-19' of git://git.kernel.dk/linux-block: block: make function 'kill_bdev' static loop: replace kill_bdev with invalidate_bdev partitions/ldm: Replace uuid_copy() with import_uuid() where it makes sense block: update hctx map when use multiple maps blktrace: Avoid sparse warnings when assigning q->blk_trace blktrace: break out of blktrace setup on concurrent calls block: Fix use-after-free in blkdev_get() trace/events/block.h: drop kernel-doc for dropped function parameter blk-mq: Remove redundant 'return' statement bcache: pr_info() format clean up in bcache_device_init() bcache: use delayed kworker fo asynchronous devices registration bcache: check and adjust logical block size for backing devices bcache: fix potential deadlock problem in btree_gc_coalesce |
|
|
|
5e857ce6ea |
Merge branch 'hch' (maccess patches from Christoph Hellwig)
Merge non-faulting memory access cleanups from Christoph Hellwig:
"Andrew and I decided to drop the patches implementing your suggested
rename of the probe_kernel_* and probe_user_* helpers from -mm as
there were way to many conflicts.
After -rc1 might be a good time for this as all the conflicts are
resolved now"
This also adds a type safety checking patch on top of the renaming
series to make the subtle behavioral difference between 'get_user()' and
'get_kernel_nofault()' less potentially dangerous and surprising.
* emailed patches from Christoph Hellwig <hch@lst.de>:
maccess: make get_kernel_nofault() check for minimal type compatibility
maccess: rename probe_kernel_address to get_kernel_nofault
maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault
maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault
|
|
|
|
026bb845b0 |
ftrace: Fix maybe-uninitialized compiler warning
During build compiler reports some 'false positive' warnings about
variables {'seq_ops', 'filtered_pids', 'other_pids'} may be used
uninitialized. This patch silences these warnings.
Also delete some useless spaces
Link: https://lkml.kernel.org/r/20200529141214.37648-1-pilgrimtao@gmail.com
Signed-off-by: Kaitao Cheng <pilgrimtao@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
|
|
b9d37bbb55 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says: ==================== pull-request: bpf 2020-06-17 The following pull-request contains BPF updates for your *net* tree. We've added 10 non-merge commits during the last 2 day(s) which contain a total of 14 files changed, 158 insertions(+), 59 deletions(-). The main changes are: 1) Important fix for bpf_probe_read_kernel_str() return value, from Andrii. 2) [gs]etsockopt fix for large optlen, from Stanislav. 3) devmap allocation fix, from Toke. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
c0ee37e85e |
maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault
Better describe what these functions do. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
fe557319aa |
maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault
Better describe what these functions do. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
02553b91da |
bpf: bpf_probe_read_kernel_str() has to return amount of data read on success
During recent refactorings, bpf_probe_read_kernel_str() started returning 0 on
success, instead of amount of data successfully read. This majorly breaks
applications relying on bpf_probe_read_kernel_str() and bpf_probe_read_str()
and their results. Fix this by returning actual number of bytes read.
Fixes:
|
|
|
|
c3dbe541ef |
blktrace: Avoid sparse warnings when assigning q->blk_trace
Mostly for historical reasons, q->blk_trace is assigned through xchg()
and cmpxchg() atomic operations. Although this is correct, sparse
complains about this because it violates rcu annotations since commit
|
|
|
|
1b0b283648 |
blktrace: break out of blktrace setup on concurrent calls
We use one blktrace per request_queue, that means one per the entire disk. So we cannot run one blktrace on say /dev/vda and then /dev/vda1, or just two calls on /dev/vda. We check for concurrent setup only at the very end of the blktrace setup though. If we try to run two concurrent blktraces on the same block device the second one will fail, and the first one seems to go on. However when one tries to kill the first one one will see things like this: The kernel will show these: ``` debugfs: File 'dropped' in directory 'nvme1n1' already present! debugfs: File 'msg' in directory 'nvme1n1' already present! debugfs: File 'trace0' in directory 'nvme1n1' already present! `` And userspace just sees this error message for the second call: ``` blktrace /dev/nvme1n1 BLKTRACESETUP(2) /dev/nvme1n1 failed: 5/Input/output error ``` The first userspace process #1 will also claim that the files were taken underneath their nose as well. The files are taken away form the first process given that when the second blktrace fails, it will follow up with a BLKTRACESTOP and BLKTRACETEARDOWN. This means that even if go-happy process #1 is waiting for blktrace data, we *have* been asked to take teardown the blktrace. This can easily be reproduced with break-blktrace [0] run_0005.sh test. Just break out early if we know we're already going to fail, this will prevent trying to create the files all over again, which we know still exist. [0] https://github.com/mcgrof/break-blktrace Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
|
|
|
69243720c0 |
tracing: Remove unused event variable in tracing_iter_reset
We do not use the event variable, just remove it. Signed-off-by: YangHui <yanghui.def@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
|
|
|
3aa8fdc37d |
tracing/probe: Fix memleak in fetch_op_data operations
kmemleak report:
[<57dcc2ca>] __kmalloc_track_caller+0x139/0x2b0
[<f1c45d0f>] kstrndup+0x37/0x80
[<f9761eb0>] parse_probe_arg.isra.7+0x3cc/0x630
[<055bf2ba>] traceprobe_parse_probe_arg+0x2f5/0x810
[<655a7766>] trace_kprobe_create+0x2ca/0x950
[<4fc6a02a>] create_or_delete_trace_kprobe+0xf/0x30
[<6d1c8a52>] trace_run_command+0x67/0x80
[<be812cc0>] trace_parse_run_command+0xa7/0x140
[<aecfe401>] probes_write+0x10/0x20
[<2027641c>] __vfs_write+0x30/0x1e0
[<6a4aeee1>] vfs_write+0x96/0x1b0
[<3517fb7d>] ksys_write+0x53/0xc0
[<dad91db7>] __ia32_sys_write+0x15/0x20
[<da347f64>] do_syscall_32_irqs_on+0x3d/0x260
[<fd0b7e7d>] do_fast_syscall_32+0x39/0xb0
[<ea5ae810>] entry_SYSENTER_32+0xaf/0x102
Post parse_probe_arg(), the FETCH_OP_DATA operation type is overwritten
to FETCH_OP_ST_STRING, as a result memory is never freed since
traceprobe_free_probe_arg() iterates only over SYMBOL and DATA op types
Setup fetch string operation correctly after fetch_op_data operation.
Link: https://lkml.kernel.org/r/20200615143034.GA1734@cosmos
Cc: stable@vger.kernel.org
Fixes:
|
|
|
|
48a42f5d13 |
trace: Fix typo in allocate_ftrace_ops()'s comment
No functional change, just correct the word. Link: https://lkml.kernel.org/r/20200610033251.31713-1-richard.weiyang@linux.alibaba.com Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
|
|
|
4649079b9d |
tracing: Make ftrace packed events have align of 1
When using trace-cmd on 5.6-rt for the function graph tracer, the output was
corrupted. It gave output like this:
funcgraph_entry: func=0xffffffff depth=38982
funcgraph_entry: func=0x1ffffffff depth=16044
funcgraph_exit: func=0xffffffff overrun=0x92539aaf00000000 calltime=0x92539c9900000072 rettime=0x100000072 depth=11084
funcgraph_exit: func=0xffffffff overrun=0x9253946e00000000 calltime=0x92539e2100000072 rettime=0x72 depth=26033702
funcgraph_entry: func=0xffffffff depth=85798
funcgraph_entry: func=0x1ffffffff depth=12044
The reason was because the tracefs/events/ftrace/funcgraph_entry/exit format
file was incorrect. The -rt kernel adds more common fields to the trace
events. Namely, common_migrate_disable and common_preempt_lazy_count. Each
is one byte in size. This changes the alignment of the normal payload. Most
events are aligned normally, but the function and function graph events are
defined with a "PACKED" macro, that packs their payload. As the offsets
displayed in the format files are now calculated by an aligned field, the
aligned field for function and function graph events should be 1, not their
normal alignment.
With aligning of the funcgraph_entry event, the format file has:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:unsigned char common_migrate_disable; offset:8; size:1; signed:0;
field:unsigned char common_preempt_lazy_count; offset:9; size:1; signed:0;
field:unsigned long func; offset:16; size:8; signed:0;
field:int depth; offset:24; size:4; signed:1;
But the actual alignment is:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:unsigned char common_migrate_disable; offset:8; size:1; signed:0;
field:unsigned char common_preempt_lazy_count; offset:9; size:1; signed:0;
field:unsigned long func; offset:12; size:8; signed:0;
field:int depth; offset:20; size:4; signed:1;
Link: https://lkml.kernel.org/r/20200609220041.2a3b527f@oasis.local.home
Cc: stable@vger.kernel.org
Fixes:
|
|
|
|
7fac96f2be |
tracing/probe: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> |
|
|
|
96144c58ab |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
1) Fix cfg80211 deadlock, from Johannes Berg.
2) RXRPC fails to send norigications, from David Howells.
3) MPTCP RM_ADDR parsing has an off by one pointer error, fix from
Geliang Tang.
4) Fix crash when using MSG_PEEK with sockmap, from Anny Hu.
5) The ucc_geth driver needs __netdev_watchdog_up exported, from
Valentin Longchamp.
6) Fix hashtable memory leak in dccp, from Wang Hai.
7) Fix how nexthops are marked as FDB nexthops, from David Ahern.
8) Fix mptcp races between shutdown and recvmsg, from Paolo Abeni.
9) Fix crashes in tipc_disc_rcv(), from Tuong Lien.
10) Fix link speed reporting in iavf driver, from Brett Creeley.
11) When a channel is used for XSK and then reused again later for XSK,
we forget to clear out the relevant data structures in mlx5 which
causes all kinds of problems. Fix from Maxim Mikityanskiy.
12) Fix memory leak in genetlink, from Cong Wang.
13) Disallow sockmap attachments to UDP sockets, it simply won't work.
From Lorenz Bauer.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
net: ethernet: ti: ale: fix allmulti for nu type ale
net: ethernet: ti: am65-cpsw-nuss: fix ale parameters init
net: atm: Remove the error message according to the atomic context
bpf: Undo internal BPF_PROBE_MEM in BPF insns dump
libbpf: Support pre-initializing .bss global variables
tools/bpftool: Fix skeleton codegen
bpf: Fix memlock accounting for sock_hash
bpf: sockmap: Don't attach programs to UDP sockets
bpf: tcp: Recv() should return 0 when the peer socket is closed
ibmvnic: Flush existing work items before device removal
genetlink: clean up family attributes allocations
net: ipa: header pad field only valid for AP->modem endpoint
net: ipa: program upper nibbles of sequencer type
net: ipa: fix modem LAN RX endpoint id
net: ipa: program metadata mask differently
ionic: add pcie_print_link_status
rxrpc: Fix race between incoming ACK parser and retransmitter
net/mlx5: E-Switch, Fix some error pointer dereferences
net/mlx5: Don't fail driver on failure to create debugfs
net/mlx5e: CT: Fix ipv6 nat header rewrite actions
...
|
|
|
|
076f14be7f |
The X86 entry, exception and interrupt code rework
This all started about 6 month ago with the attempt to move the Posix CPU timer heavy lifting out of the timer interrupt code and just have lockless quick checks in that code path. Trivial 5 patches. This unearthed an inconsistency in the KVM handling of task work and the review requested to move all of this into generic code so other architectures can share. Valid request and solved with another 25 patches but those unearthed inconsistencies vs. RCU and instrumentation. Digging into this made it obvious that there are quite some inconsistencies vs. instrumentation in general. The int3 text poke handling in particular was completely unprotected and with the batched update of trace events even more likely to expose to endless int3 recursion. In parallel the RCU implications of instrumenting fragile entry code came up in several discussions. The conclusion of the X86 maintainer team was to go all the way and make the protection against any form of instrumentation of fragile and dangerous code pathes enforcable and verifiable by tooling. A first batch of preparatory work hit mainline with commit |
|
|
|
b791d1bdf9 |
The Kernel Concurrency Sanitizer (KCSAN)
KCSAN is a dynamic race detector, which relies on compile-time instrumentation, and uses a watchpoint-based sampling approach to detect races. The feature was under development for quite some time and has already found legitimate bugs. Unfortunately it comes with a limitation, which was only understood late in the development cycle: It requires an up to date CLANG-11 compiler CLANG-11 is not yet released (scheduled for June), but it's the only compiler today which handles the kernel requirements and especially the annotations of functions to exclude them from KCSAN instrumentation correctly. These annotations really need to work so that low level entry code and especially int3 text poke handling can be completely isolated. A detailed discussion of the requirements and compiler issues can be found here: https://lore.kernel.org/lkml/CANpmjNMTsY_8241bS7=XAfqvZHFLrVEkv_uM4aDUWE_kh3Rvbw@mail.gmail.com/ We came to the conclusion that trying to work around compiler limitations and bugs again would end up in a major trainwreck, so requiring a working compiler seemed to be the best choice. For Continous Integration purposes the compiler restriction is manageable and that's where most xxSAN reports come from. For a change this limitation might make GCC people actually look at their bugs. Some issues with CSAN in GCC are 7 years old and one has been 'fixed' 3 years ago with a half baken solution which 'solved' the reported issue but not the underlying problem. The KCSAN developers also ponder to use a GCC plugin to become independent, but that's not something which will show up in a few days. Blocking KCSAN until wide spread compiler support is available is not a really good alternative because the continuous growth of lockless optimizations in the kernel demands proper tooling support. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl7im98THHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoQ3xD/9+q87OmwnyoRTs6O3GDDbWZYoJGolh rctDOAYW8RSS73Fiw23z8hKlLl9tJCya6/X8Q9qoonB1YeIEPPRVj5HJWAMUNEIs YgjlZJFmh+mnbP/KQFctm3AWpoX8kqt3ncqj6zG72oQ9qKui691BY/2NmGVSLxUV DqtUYSKmi51XEQtZuXRuHEf3zBxoyeD43DaSCdJAXd6f5O2X7tmrWDuazHVeKzHV lhijvkyBvGMWvPg0IBrXkkLmeOvS0++MTGm3o+L72XF6nWpzTkcV7N0E9GEDFg45 zwcidRVKD5d/1DoU5Tos96rCJpBEGh/wimlu0z14mcZpNiJgRQH5rzVEO9Y14UcP KL9FgRrb5dFw7yfX2zRQ070OFJ4AEDBMK0o5Lbu/QO5KLkvFkqnuWlQfmmtZJWCW DTRw/FgUgU7lvyPjRrao6HBvwy+yTb0u9K5seCOTRkuepR9nPJs0710pFiBsNCfV RY3cyggNBipAzgBOgLxixnq9+rHt70ton6S8Gijxpvt0dGGfO8k0wuEhFtA4zKrQ 6HGK+pidxnoVdEgyQZhS+qzMMkyiUL0FXdaGJ2IX+/DC+Ij1UrUPjZBn7v25M0hQ ESkvxWKCn7snH4/NJsNxqCV1zyEc3zAW/WvLJUc9I7H8zPwtVvKWPrKEMzrJJ5bA aneySilbRxBFUg== =iplm -----END PGP SIGNATURE----- Merge tag 'locking-kcsan-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull the Kernel Concurrency Sanitizer from Thomas Gleixner: "The Kernel Concurrency Sanitizer (KCSAN) is a dynamic race detector, which relies on compile-time instrumentation, and uses a watchpoint-based sampling approach to detect races. The feature was under development for quite some time and has already found legitimate bugs. Unfortunately it comes with a limitation, which was only understood late in the development cycle: It requires an up to date CLANG-11 compiler CLANG-11 is not yet released (scheduled for June), but it's the only compiler today which handles the kernel requirements and especially the annotations of functions to exclude them from KCSAN instrumentation correctly. These annotations really need to work so that low level entry code and especially int3 text poke handling can be completely isolated. A detailed discussion of the requirements and compiler issues can be found here: https://lore.kernel.org/lkml/CANpmjNMTsY_8241bS7=XAfqvZHFLrVEkv_uM4aDUWE_kh3Rvbw@mail.gmail.com/ We came to the conclusion that trying to work around compiler limitations and bugs again would end up in a major trainwreck, so requiring a working compiler seemed to be the best choice. For Continous Integration purposes the compiler restriction is manageable and that's where most xxSAN reports come from. For a change this limitation might make GCC people actually look at their bugs. Some issues with CSAN in GCC are 7 years old and one has been 'fixed' 3 years ago with a half baken solution which 'solved' the reported issue but not the underlying problem. The KCSAN developers also ponder to use a GCC plugin to become independent, but that's not something which will show up in a few days. Blocking KCSAN until wide spread compiler support is available is not a really good alternative because the continuous growth of lockless optimizations in the kernel demands proper tooling support" * tag 'locking-kcsan-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (76 commits) compiler_types.h, kasan: Use __SANITIZE_ADDRESS__ instead of CONFIG_KASAN to decide inlining compiler.h: Move function attributes to compiler_types.h compiler.h: Avoid nested statement expression in data_race() compiler.h: Remove data_race() and unnecessary checks from {READ,WRITE}_ONCE() kcsan: Update Documentation to change supported compilers kcsan: Remove 'noinline' from __no_kcsan_or_inline kcsan: Pass option tsan-instrument-read-before-write to Clang kcsan: Support distinguishing volatile accesses kcsan: Restrict supported compilers kcsan: Avoid inserting __tsan_func_entry/exit if possible ubsan, kcsan: Don't combine sanitizer with kcov on clang objtool, kcsan: Add kcsan_disable_current() and kcsan_enable_current_nowarn() kcsan: Add __kcsan_{enable,disable}_current() variants checkpatch: Warn about data_race() without comment kcsan: Use GFP_ATOMIC under spin lock Improve KCSAN documentation a bit kcsan: Make reporting aware of KCSAN tests kcsan: Fix function matching in report kcsan: Change data_race() to no longer require marking racing accesses kcsan: Move kcsan_{disable,enable}_current() to kcsan-checks.h ... |
|
|
|
a58dfea297 |
block-5.8-2020-06-11
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl7ioawQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpvbJD/wNLN/H4yIQ7tU5XDdvxvpx/u9FC1t2Pep0
w/olj6wnrsHw/WsgJIlw7efTq9QATfszG/dJKJiBGdiJoCKE1TW/CM6RNfDJb4Z3
TUa9ghYYzcfI2NRdV94Ol9qRThjB6OG6Cdw4k3oKbx44EJOzgatBI6xIA3nU+f/L
XO+xl2z3+t28guMvcgUkdJsR8GvSrwcXCvw3X/3uqbtAv5hhMbR7jyqxcHDLX72t
I+y3/dWfKaienujEmcLKeW+f2RFyjYIvDbQ5b/JDqLah7Fn1A2wYf+mx7iZuQZSi
5nwGcPuj++8GXS6G8JegAl+s5L3AyBNdz5nrxdAlRjDTMgIUstFgueLnCaW64QNF
93kWK5gDwhq+26AFl3mGJ3m+qhh1AhGWaVniBiFA3OUeWcOgVGlRf6jtmWazQaEI
v15WTiAXTsQujnV+t5KYKQnm9vJLIcc/njiSss1JXnqrxR6fH+QCHQ96ckTCqx66
0GbN5RkuC2J/RHYEyYnYIJlNZGDsCVoBC3QR10WNlng82cxMyrahS011xUTn9VN+
0Gnz1ilNFc+bx1jUO+pl6EdIsEBbFkKioyoZsgba5mvM+Nn3nGbvqQPJc+18fSV2
BW1x2yuoc6yjwuol9NMV+cy13Z9u+uA4c0mFIetjuyjE3rZb77iuIiIKVWMRh6Av
Ip6GuPEA2A==
=TOc1
-----END PGP SIGNATURE-----
Merge tag 'block-5.8-2020-06-11' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Some followup fixes for this merge window. In particular:
- Seqcount write missing preemption disable for stats (Ahmed)
- blktrace fixes (Chaitanya)
- Redundant initializations (Colin)
- Various small NVMe fixes (Chaitanya, Christoph, Daniel, Max,
Niklas, Rikard)
- loop flag bug regression fix (Martijn)
- blk-mq tagging fixes (Christoph, Ming)"
* tag 'block-5.8-2020-06-11' of git://git.kernel.dk/linux-block:
umem: remove redundant initialization of variable ret
pktcdvd: remove redundant initialization of variable ret
nvmet: fail outstanding host posted AEN req
nvme-pci: use simple suspend when a HMB is enabled
nvme-fc: don't call nvme_cleanup_cmd() for AENs
nvmet-tcp: constify nvmet_tcp_ops
nvme-tcp: constify nvme_tcp_mq_ops and nvme_tcp_admin_mq_ops
nvme: do not call del_gendisk() on a disk that was never added
blk-mq: fix blk_mq_all_tag_iter
blk-mq: split out a __blk_mq_get_driver_tag helper
blktrace: fix endianness for blk_log_remap()
blktrace: fix endianness in get_pdu_int()
blktrace: use errno instead of bi_status
block: nr_sects_write(): Disable preemption on seqcount write
block: remove the error argument to the block_bio_complete tracepoint
loop: Fix wrong masking of status flags
block/bio-integrity: don't free 'buf' if bio_integrity_add_page() failed
|
|
|
|
37d1a04b13 |
Rebase locking/kcsan to locking/urgent
Merge the state of the locking kcsan branch before the read/write_once() and the atomics modifications got merged. Squash the fallout of the rebase on top of the read/write once and atomic fallback work into the merge. The history of the original branch is preserved in tag locking-kcsan-2020-06-02. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
|
|
|
bf2b300844 |
x86/entry: Rename trace_hardirqs_off_prepare()
The typical pattern for trace_hardirqs_off_prepare() is:
ENTRY
lockdep_hardirqs_off(); // because hardware
... do entry magic
instrumentation_begin();
trace_hardirqs_off_prepare();
... do actual work
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare();
instrumentation_end();
... do exit magic
lockdep_hardirqs_on();
which shows that it's named wrong, rename it to
trace_hardirqs_off_finish(), as it concludes the hardirq_off transition.
Also, given that the above is the only correct order, make the traditional
all-in-one trace_hardirqs_off() follow suit.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200529213321.415774872@infradead.org
|
|
|
|
1c38372662 |
Merge branch 'work.sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull sysctl fixes from Al Viro: "Fixups to regressions in sysctl series" * 'work.sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: sysctl: reject gigantic reads/write to sysctl files cdrom: fix an incorrect __user annotation on cdrom_sysctl_info trace: fix an incorrect __user annotation on stack_trace_sysctl random: fix an incorrect __user annotation on proc_do_entropy net/sysctl: remove leftover __user annotations on neigh_proc_dointvec* net/sysctl: use cpumask_parse in flow_limit_cpu_sysctl |
|
|
|
22d5bd6867 |
tracing/probe: Fix bpf_task_fd_query() for kprobes and uprobes
Commit |
|
|
|
d1e521adad |
Tracing updates for 5.8:
No new features this release. Mostly clean ups, restructuring and
documentation.
- Have ftrace_bug() show ftrace errors before the WARN, as the WARN will
reboot the box before the error messages are printed if panic_on_warn
is set.
- Have traceoff_on_warn disable tracing sooner (before prints)
- Write a message to the trace buffer that its being disabled when
disable_trace_on_warning() is set.
- Separate out synthetic events from histogram code to let it be used by
other parts of the kernel.
- More documentation on histogram design.
- Other small fixes and clean ups.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXt+LEhQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qj2zAP9sD/W4jafYayucj+MvRP7sy+Q0iAH7
WMn8fkk958cgfQD8D1QFtkkx+3O3TRT6ApGf11w5+JgSWUE2gSbW9H4fPQk=
=X5t4
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"No new features this release. Mostly clean ups, restructuring and
documentation.
- Have ftrace_bug() show ftrace errors before the WARN, as the WARN
will reboot the box before the error messages are printed if
panic_on_warn is set.
- Have traceoff_on_warn disable tracing sooner (before prints)
- Write a message to the trace buffer that its being disabled when
disable_trace_on_warning() is set.
- Separate out synthetic events from histogram code to let it be used
by other parts of the kernel.
- More documentation on histogram design.
- Other small fixes and clean ups"
* tag 'trace-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Remove obsolete PREEMPTIRQ_EVENTS kconfig option
tracing/doc: Fix ascii-art in histogram-design.rst
tracing: Add a trace print when traceoff_on_warning is triggered
ftrace,bug: Improve traceoff_on_warn
selftests/ftrace: Distinguish between hist and synthetic event checks
tracing: Move synthetic events to a separate file
tracing: Fix events.rst section numbering
tracing/doc: Fix typos in histogram-design.rst
tracing: Add hist_debug trace event files for histogram debugging
tracing: Add histogram-design document
tracing: Check state.disabled in synth event trace functions
tracing/probe: reverse arguments to list_add
tools/bootconfig: Add a summary of test cases and return error
ftrace: show debugging information when panic_on_warn set
|
|
|
|
98a23609b1 |
maccess: always use strict semantics for probe_kernel_read
Except for historical confusion in the kprobes/uprobes and bpf tracers, which has been fixed now, there is no good reason to ever allow user memory accesses from probe_kernel_read. Switch probe_kernel_read to only read from kernel memory. [akpm@linux-foundation.org: update it for "mm, dump_page(): do not crash with invalid mapping pointer"] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-17-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
9de1fec50b |
tracing/kprobes: handle mixed kernel/userspace probes better
Instead of using the dangerous probe_kernel_read and strncpy_from_unsafe helpers, rework probes to try a user probe based on the address if the architecture has a common address space for kernel and userspace. [svens@linux.ibm.com:use strncpy_from_kernel_nofault() in fetch_store_string()] Link: http://lkml.kernel.org/r/20200606181903.49384-1-svens@linux.ibm.com Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-15-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
8d92db5c04 |
bpf: rework the compat kernel probe handling
Instead of using the dangerous probe_kernel_read and strncpy_from_unsafe helpers, rework the compat probes to check if an address is a kernel or userspace one, and then use the low-level kernel or user probe helper shared by the proper kernel and user probe helpers. This slightly changes behavior as the compat probe on a user address doesn't check the lockdown flags, just as the pure user probes do. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-14-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
19c8d8ac63 |
bpf:bpf_seq_printf(): handle potentially unsafe format string better
User the proper helper for kernel or userspace addresses based on TASK_SIZE instead of the dangerous strncpy_from_unsafe function. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
aec6ce5913 |
bpf: handle the compat string in bpf_trace_copy_string better
User the proper helper for kernel or userspace addresses based on TASK_SIZE instead of the dangerous strncpy_from_unsafe function. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-13-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
d7b2977b81 |
bpf: factor out a bpf_trace_copy_string helper
Split out a helper to do the fault free access to the string pointer to get it out of a crazy indentation level. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-12-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
02dddb160e |
maccess: rename strnlen_unsafe_user to strnlen_user_nofault
This matches the naming of strnlen_user, and also makes it more clear what the function is supposed to do. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-9-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
c4cb164426 |
maccess: rename strncpy_from_unsafe_strict to strncpy_from_kernel_nofault
This matches the naming of strncpy_from_user_nofault, and also makes it more clear what the function is supposed to do. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-8-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
bd88bb5d40 |
maccess: rename strncpy_from_unsafe_user to strncpy_from_user_nofault
This matches the naming of strncpy_from_user, and also makes it more clear what the function is supposed to do. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-7-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
d8ed45c5dc |
mmap locking API: use coccinelle to convert mmap_sem rwsem call sites
This change converts the existing mmap_sem rwsem calls to use the new mmap locking API instead. The change is generated using coccinelle with the following rule: // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir . @@ expression mm; @@ ( -init_rwsem +mmap_init_lock | -down_write +mmap_write_lock | -down_write_killable +mmap_write_lock_killable | -down_write_trylock +mmap_write_trylock | -up_write +mmap_write_unlock | -downgrade_write +mmap_write_downgrade | -down_read +mmap_read_lock | -down_read_killable +mmap_read_lock_killable | -down_read_trylock +mmap_read_trylock | -up_read +mmap_read_unlock ) -(&mm->mmap_sem) +(mm) Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
2062a4e8ae |
kallsyms/printk: add loglvl to print_ip_sym()
Patch series "Add log level to show_stack()", v3. Add log level argument to show_stack(). Done in three stages: 1. Introducing show_stack_loglvl() for every architecture 2. Migrating old users with an explicit log level 3. Renaming show_stack_loglvl() into show_stack() Justification: - It's a design mistake to move a business-logic decision into platform realization detail. - I have currently two patches sets that would benefit from this work: Removing console_loglevel jumps in sysrq driver [1] Hung task warning before panic [2] - suggested by Tetsuo (but he probably didn't realise what it would involve). - While doing (1), (2) the backtraces were adjusted to headers and other messages for each situation - so there won't be a situation when the backtrace is printed, but the headers are missing because they have lesser log level (or the reverse). - As the result in (2) plays with console_loglevel for kdb are removed. The least important for upstream, but maybe still worth to note that every company I've worked in so far had an off-list patch to print backtrace with the needed log level (but only for the architecture they cared about). If you have other ideas how you will benefit from show_stack() with a log level - please, reply to this cover letter. See also discussion on v1: https://lore.kernel.org/linux-riscv/20191106083538.z5nlpuf64cigxigh@pathway.suse.cz/ This patch (of 50): print_ip_sym() needs to have a log level parameter to comply with other parts being printed. Otherwise, half of the expected backtrace would be printed and other may be missing with some logging level. The following callee(s) are using now the adjusted log level: - microblaze/unwind: the same level as headers & userspace unwind. Note that pr_debug()'s there are for debugging the unwinder itself. - nds32/traps: symbol addresses are printed with the same log level as backtrace headers. - lockdep: ip for locking issues is printed with the same log level as other part of the warning. - sched: ip where preemption was disabled is printed as error like the rest part of the message. - ftrace: bug reports are now consistent in the log level being used. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Ben Segall <bsegall@google.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Hogan <jhogan@kernel.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Burton <paulburton@kernel.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Dmitry Safonov <dima@arista.com> Cc: Jiri Slaby <jslaby@suse.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com> Cc: Mark Salter <msalter@redhat.com> Cc: Guo Ren <guoren@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Brian Cain <bcain@codeaurora.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ley Foon Tan <lftan@altera.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: Douglas Anderson <dianders@chromium.org> Cc: Jason Wessel <jason.wessel@windriver.com> Link: http://lkml.kernel.org/r/20200418201944.482088-2-dima@arista.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
7ff0d4490e |
trace: fix an incorrect __user annotation on stack_trace_sysctl
No user pointers for sysctls anymore.
Fixes:
|