mirror of https://github.com/torvalds/linux.git
36271 Commits
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
9a76c0ee3a |
seccomp fixes for v5.13-rc4
- Fix addfd notification race condition (Sargun Dhillon) -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmCyhNIACgkQiXL039xt wCbExBAAoniF2+pW8sN32KK6a4uLGJCPCcbwZqWGw2zINqn6+I6KGAld37lGPu3E ASuu28O45NXcP9SpHLxNT1jRhAet57G6OjSV78jEzVII2EogUIBOyRji7yTk8xCt kCp21/9RaQ3DitYe2vh9R2neNIZh/PodmY8V5tkP2HacgaEuf5+yRhB/1QbTm7HG +mMZsejw1eEryJ49cw7XkYpWNjyz5vxwvXWJt6nfgm7wTnNopUQUKJGwnp2bX9cZ LUgstLq0SpHW7uxwEq4NYux3qsD9kaj5SgZxb/6KkHNmg5q6WUXxm0FljipEIhq1 RBTLdH+6Ct+DcDryno2VDoRNP/Q3pim9jxTpfQQ5V6f4dVqNv6pVuR2uNfK/iEX2 mk7Rc99IifaXeOLITKGusZrm16msVg+o7wAu0B1iT0vyacPcwRXJtIWy829Z+gCP r5OsBguxPPTkxfoRWYX4WDNcZmuBC5hkyqzN8toiQjOGghdm9nXdH4jFl8kcqZps I7i0Me3JBWVskx1d8AKlkJv3ctbdUX7QV/HaPdsMLlXTLyqBR76D/uqeUFgmWpUq 2ib3bkJzRNYgm2nron1fmDOLTiJGVfEha5hmbThPrVziYv7+jwamHzPf8jPvB+tg nOpw/HEfoVQtuq/e+Ocdv6TLnZAZWnvxYC/RB3aTBq5xz+74nYA= =c9Hd -----END PGP SIGNATURE----- Merge tag 'seccomp-fixes-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp fixes from Kees Cook: "This fixes a hard-to-hit race condition in the addfd user_notif feature of seccomp, visible since v5.9. And a small documentation fix" * tag 'seccomp-fixes-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: Refactor notification handler to prepare for new semantics Documentation: seccomp: Fix user notification documentation |
|
|
|
ddc4739169 |
seccomp: Refactor notification handler to prepare for new semantics
This refactors the user notification code to have a do / while loop around
the completion condition. This has a small change in semantic, in that
previously we ignored addfd calls upon wakeup if the notification had been
responded to, but instead with the new change we check for an outstanding
addfd calls prior to returning to userspace.
Rodrigo Campos also identified a bug that can result in addfd causing
an early return, when the supervisor didn't actually handle the
syscall [1].
[1]: https://lore.kernel.org/lkml/20210413160151.3301-1-rodrigo@kinvolk.io/
Fixes:
|
|
|
|
d7c5303fbc |
Networking fixes for 5.13-rc4, including fixes from bpf, netfilter,
can and wireless trees. Notably including fixes for the recently
announced "FragAttacks" WiFi vulnerabilities. Rather large batch,
touching some core parts of the stack, too, but nothing hair-raising.
Current release - regressions:
- tipc: make node link identity publish thread safe
- dsa: felix: re-enable TAS guard band mode
- stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid()
- stmmac: fix system hang if change mac address after interface ifdown
Current release - new code bugs:
- mptcp: avoid OOB access in setsockopt()
- bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers
- ethtool: stats: fix a copy-paste error - init correct array size
Previous releases - regressions:
- sched: fix packet stuck problem for lockless qdisc
- net: really orphan skbs tied to closing sk
- mlx4: fix EEPROM dump support
- bpf: fix alu32 const subreg bound tracking on bitwise operations
- bpf: fix mask direction swap upon off reg sign change
- bpf, offload: reorder offload callback 'prepare' in verifier
- stmmac: Fix MAC WoL not working if PHY does not support WoL
- packetmmap: fix only tx timestamp on request
- tipc: skb_linearize the head skb when reassembling msgs
Previous releases - always broken:
- mac80211: address recent "FragAttacks" vulnerabilities
- mac80211: do not accept/forward invalid EAPOL frames
- mptcp: avoid potential error message floods
- bpf, ringbuf: deny reserve of buffers larger than ringbuf to prevent
out of buffer writes
- bpf: forbid trampoline attach for functions with variable arguments
- bpf: add deny list of functions to prevent inf recursion of tracing
programs
- tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT
- can: isotp: prevent race between isotp_bind() and isotp_setsockopt()
- netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check,
fallback to non-AVX2 version
Misc:
- bpf: add kconfig knob for disabling unpriv bpf by default
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmCuy2gACgkQMUZtbf5S
IruE5BAAhihia5EaiV71Bz/Cqr/d+osv5u283riKT8kBft0bWFVFFnT3iweWyR0/
5X+bB6zmr80Cuqh45ZeYyq+zJtiAAlsbD5hqBIGdMriSWLxciNKjVJRzuEjuqnek
USMW/LqGyf4NhmLogmQKpx8XcKSG7VYuK7vPrsH8us1dL5vIssceIXn8R9Dzj9NN
P77K5Z+Oka8XQJgetNLxR3tDAM/92RwIshotkhJbRwgiUvzb+wbnrnSOAZCIPgku
ydJyOxOklln1Sx07SejgzEl33ri0CkioDPThBWpOn7Mu0JrYKukXPKludoZcRYuJ
2jNLYfbH0ZS5EkOfk89h7j7MDoAJMUK72M+S1w5DEYz6eH2EjhAq9noZ6E1iQH+U
9vfoIvQjPh6Zhyk5QeM4dpt0cvR7rSElXkLVxo/x0dSBAi2rIng1bKeCUtv2J689
CsoD0oghtEzvUTYVxY6iNr15OFGl6KsZv4tVQ709gGA36sDlK8ozGbJH5WReobBl
f8H2WJlj2tVW5V75yUoio8TumDw34yk/5xlJFzm9GOwkqBrUcqOraHtHdUIsa4qr
KbELQQ9QVt4zYdLAiWy5BL/QLycp0ibmA1IB8W1bxEVSK1JXzREHzPxv85KOfZkn
8+vzNHmk2PEZYYsExiEykc5jXKOCPs8L0rJ6p4OverlbpDZcwIg=
=peMK
-----END PGP SIGNATURE-----
Merge tag 'net-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.13-rc4, including fixes from bpf, netfilter,
can and wireless trees. Notably including fixes for the recently
announced "FragAttacks" WiFi vulnerabilities. Rather large batch,
touching some core parts of the stack, too, but nothing hair-raising.
Current release - regressions:
- tipc: make node link identity publish thread safe
- dsa: felix: re-enable TAS guard band mode
- stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid()
- stmmac: fix system hang if change mac address after interface
ifdown
Current release - new code bugs:
- mptcp: avoid OOB access in setsockopt()
- bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers
- ethtool: stats: fix a copy-paste error - init correct array size
Previous releases - regressions:
- sched: fix packet stuck problem for lockless qdisc
- net: really orphan skbs tied to closing sk
- mlx4: fix EEPROM dump support
- bpf: fix alu32 const subreg bound tracking on bitwise operations
- bpf: fix mask direction swap upon off reg sign change
- bpf, offload: reorder offload callback 'prepare' in verifier
- stmmac: Fix MAC WoL not working if PHY does not support WoL
- packetmmap: fix only tx timestamp on request
- tipc: skb_linearize the head skb when reassembling msgs
Previous releases - always broken:
- mac80211: address recent "FragAttacks" vulnerabilities
- mac80211: do not accept/forward invalid EAPOL frames
- mptcp: avoid potential error message floods
- bpf, ringbuf: deny reserve of buffers larger than ringbuf to
prevent out of buffer writes
- bpf: forbid trampoline attach for functions with variable arguments
- bpf: add deny list of functions to prevent inf recursion of tracing
programs
- tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT
- can: isotp: prevent race between isotp_bind() and
isotp_setsockopt()
- netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check,
fallback to non-AVX2 version
Misc:
- bpf: add kconfig knob for disabling unpriv bpf by default"
* tag 'net-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (172 commits)
net: phy: Document phydev::dev_flags bits allocation
mptcp: validate 'id' when stopping the ADD_ADDR retransmit timer
mptcp: avoid error message on infinite mapping
mptcp: drop unconditional pr_warn on bad opt
mptcp: avoid OOB access in setsockopt()
nfp: update maintainer and mailing list addresses
net: mvpp2: add buffer header handling in RX
bnx2x: Fix missing error code in bnx2x_iov_init_one()
net: zero-initialize tc skb extension on allocation
net: hns: Fix kernel-doc
sctp: fix the proc_handler for sysctl encap_port
sctp: add the missing setting for asoc encap_port
bpf, selftests: Adjust few selftest result_unpriv outcomes
bpf: No need to simulate speculative domain for immediates
bpf: Fix mask direction swap upon off reg sign change
bpf: Wrap aux data inside bpf_sanitize_info container
bpf: Fix BPF_LSM kconfig symbol dependency
selftests/bpf: Add test for l3 use of bpf_redirect_peer
bpftool: Add sock_release help info for cgroup attach/prog load command
net: dsa: microchip: enable phy errata workaround on 9567
...
|
|
|
|
a703619127 |
bpf: No need to simulate speculative domain for immediates
In
|
|
|
|
bb01a1bba5 |
bpf: Fix mask direction swap upon off reg sign change
Masking direction as indicated via mask_to_left is considered to be
calculated once and then used to derive pointer limits. Thus, this
needs to be placed into bpf_sanitize_info instead so we can pass it
to sanitize_ptr_alu() call after the pointer move. Piotr noticed a
corner case where the off reg causes masking direction change which
then results in an incorrect final aux->alu_limit.
Fixes:
|
|
|
|
3d0220f686 |
bpf: Wrap aux data inside bpf_sanitize_info container
Add a container structure struct bpf_sanitize_info which holds the current aux info, and update call-sites to sanitize_ptr_alu() to pass it in. This is needed for passing in additional state later on. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Piotr Krysiuk <piotras@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> |
|
|
|
5c9d706f61 |
bpf: Fix BPF_LSM kconfig symbol dependency
Similarly as |
|
|
|
1434a31278 |
Merge branch 'for-5.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo: - "cgroup_disable=" boot param was being applied too late confusing some subsystems. Fix it by moving application to __setup() time. - Comment spelling fixes. Included here to lower the chance of trivial future merge conflicts. * 'for-5.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: fix spelling mistakes cgroup: disable controllers at parse time |
|
|
|
5df7ae7bed |
Merge branch 'for-5.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo: "One commit to fix spurious workqueue stall warnings across VM suspensions" * 'for-5.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: wq: handle VM suspension in stall detection |
|
|
|
08b2b6fdf6 |
cgroup: fix spelling mistakes
Fix some spelling mistakes in comments: hierarhcy ==> hierarchy automtically ==> automatically overriden ==> overridden In absense of .. or ==> In absence of .. and assocaited ==> associated taget ==> target initate ==> initiate succeded ==> succeeded curremt ==> current udpated ==> updated Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
|
|
|
0898678c74 |
Two locking fixes:
- Invoke the lockdep tracepoints in the correct place so the ordering
is correct again.
- Don't leave the mutex WAITER bit stale when the last waiter is dropping
out early due to a signal as that forces all subsequent lock operations
needlessly into the slowpath until it's cleaned up again.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmCqVmQTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoUqXEACR3LsrJ+VfkktVlZK8DWygRcjLFvvo
mV620OEdJcCOwy/Qs3qKkyIoMiba7ASrbIWoZa28+tbZZ/iovXotkRH5rh1MhDoU
3S6QlpPeg7shyN5iDG0JlvDTVPQs6g4oC+8bAQmJIuUeQ7hPh72O49vIDSEF6mzG
6j8C0l5tYvmojgJKY6PJYWSZ6MNVv/gCUWWwRdmShSYmdNR3W/GaN6jTFI6qVitS
a3NE5ksVr1LC5Ro5QraVdmif/XlUxZ8UaEN6VyaXjBuOBO2UxUevm61khv0X5fpS
IHpcDjZukgSwccXSzd9bttWJ5EKqLDC+nfFeOdJg2GFXfRZd+uGwVV3IN2U8r7fj
pP9Wcy5dDJrFF7dVYnDU7y7IP2ZOwDoh98mQkVt90SV4zp2HcZnl3x5iqvxQrND3
r3c88myDOZBCCroRIMxxlNpYWOozlVYtHi/mmFj3x97YoPQYwpuMunz+/i8b5j6B
UvtM2VsevyiGZd9pzSZ/dl3Tf19VXrtY60Sc8qG6LdTukOldLBq6J9fOcUI2fHCZ
kXiS+utT1nIWyvwRgoMcFOTOTgfzdDKRYkPu7pMVcNoRB91KgTmozGVCT4uIN3dF
kHpm+FyGLgKDdL8AB7VTWSSTFgb2quZBeLGSr4OnVVSTJlQ3xfpKD5vtKBysYhf7
6My7E5pCZhgr9Q==
=7vqW
-----END PGP SIGNATURE-----
Merge tag 'locking-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"Two locking fixes:
- Invoke the lockdep tracepoints in the correct place so the ordering
is correct again
- Don't leave the mutex WAITER bit stale when the last waiter is
dropping out early due to a signal as that forces all subsequent
lock operations needlessly into the slowpath until it's cleaned up
again"
* tag 'locking-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/mutex: clear MUTEX_FLAGS if wait_list is empty due to signal
locking/lockdep: Correct calling tracepoints
|
|
|
|
0f90b88dbc |
watchdog: reliable handling of timestamps
Commit |
|
|
|
a0e31f3a38 |
Merge branch 'for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull siginfo fix from Eric Biederman:
"During the merge window an issue with si_perf and the siginfo ABI came
up. The alpha and sparc siginfo structure layout had changed with the
addition of SIGTRAP TRAP_PERF and the new field si_perf.
The reason only alpha and sparc were affected is that they are the
only architectures that use si_trapno.
Looking deeper it was discovered that si_trapno is used for only a few
select signals on alpha and sparc, and that none of the other
_sigfault fields past si_addr are used at all. Which means technically
no regression on alpha and sparc.
While the alignment concerns might be dismissed the abuse of si_errno
by SIGTRAP TRAP_PERF does have the potential to cause regressions in
existing userspace.
While we still have time before userspace starts using and depending
on the new definition siginfo for SIGTRAP TRAP_PERF this set of
changes cleans up siginfo_t.
- The si_trapno field is demoted from magic alpha and sparc status
and made an ordinary union member of the _sigfault member of
siginfo_t. Without moving it of course.
- si_perf is replaced with si_perf_data and si_perf_type ending the
abuse of si_errno.
- Unnecessary additions to signalfd_siginfo are removed"
* 'for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signalfd: Remove SIL_PERF_EVENT fields from signalfd_siginfo
signal: Deliver all of the siginfo perf data in _perf
signal: Factor force_sig_perf out of perf_sigtrap
signal: Implement SIL_FAULT_TRAPNO
siginfo: Move si_trapno inside the union inside _si_fault
|
|
|
|
c1f47ebc9b |
Modules fixes for v5.13-rc3
- When CONFIG_MODULE_UNLOAD=n, module exit sections get sorted into the init region of the module in order to satisfy the requirements of jump_labels and static_calls. Previously, the exit section check was done in module_init_section(), but the solution there is not completely arch-indepedent as ARM is a special case and supplies its own module_init_section() function. Instead of pushing this logic further to the arch-specific code, switch to an arch-independent solution to check for module exit sections in the core module loader code in layout_sections() instead. Signed-off-by: Jessica Yu <jeyu@kernel.org> -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEVrp26glSWYuDNrCUwEV+OM47wXIFAmCndboQHGpleXVAa2Vy bmVsLm9yZwAKCRDARX44zjvBcqzYD/9+W1lTEMXwEpayn71iMNs3ECMBY4ZaKbEd bxH1oLDADnIunS00tHBn4LO/t1tK18en/du9NtXfH1rmmH7jp9qDqJPxZbAOg6+i g8UxDAj1D+o6X5WfaVx8ygJ5JFTo927yk8rzQz4nqy8D7ZT87x4BvRaZF199jRMk MBQWDh9AfOC5DehauMKu4CjeEEWebPjG9QUQlg9ngQMrsGtGdOHv1Ex8zH66Oi4X xxOVqmRQu3yLMGfv03znHKvRSVXAponCZVT1VOiHBK9T1CaEgdP3eBE4mlTTAcLh X913OV69dQeNzoDFsECZfE4fmypym5CnvloCEg8Kx4zi5GN6TBO3RSU3EyRQChva 7RgNFZsS43Q9d3Q3ZfL5HX9Db/kd4oex3tA3mvuAh4CkA9400x2H4FeHsfMrOfJB avxvgQhUUnfphQ0chIDHVtWfSAIWcLlNkl6VEx6MB5A/m4qJUz6VyZafoC2khE52 98NzXNdmmRuuI+hrUqVsUsDH3ZybbAf362OqiImiRFjlVfbnZzUnSsVS3j4+Ckj8 VWBy4fMivpBu5CT2P4CVhJAA2VfgzVJpZ7HrjSP4uSI1xcJxxsXb8hWbQ10NSynq tr3Q6gvWQN+/I4RLAv4XlXkvXPfIDN10DavhmY6ZBTuUTuRcTyP6Sd1TYvwUzMJi iGMTFFrXcg== =uBiQ -----END PGP SIGNATURE----- Merge tag 'modules-for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull module fix from Jessica Yu: "When CONFIG_MODULE_UNLOAD=n, module exit sections get sorted into the init region of the module in order to satisfy the requirements of jump_labels and static_calls. Previously, the exit section check was done in module_init_section(), but the solution there is not completely arch-indepedent as ARM is a special case and supplies its own module_init_section() function. Instead of pushing this logic further to the arch-specific code, switch to an arch-independent solution to check for module exit sections in the core module loader code in layout_sections() instead" * tag 'modules-for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: module: check for exit sections in layout_sections() instead of module_init_section() |
|
|
|
921dd23597 |
Merge branch 'urgent.2021.05.20a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull kcsan fix from Paul McKenney:
"Fix for a regression introduced in this merge window by commit
|
|
|
|
ceb11679d9 |
bpf, offload: Reorder offload callback 'prepare' in verifier
Commit |
|
|
|
0af02eb2a7 |
bpf: Avoid using ARRAY_SIZE on an uninitialized pointer
The cppcheck static code analysis reported the following error:
if (WARN_ON_ONCE(nest_level > ARRAY_SIZE(bufs->tmp_bufs))) {
^
ARRAY_SIZE is a macro that expands to sizeofs, so bufs is not actually
dereferenced at runtime, and the code is actually safe. But to keep
things tidy, this patch removes the need for a call to ARRAY_SIZE by
extracting the size of the array into a macro. Cppcheck should no longer
be confused and the code ends up being a bit cleaner.
Fixes:
|
|
|
|
8afcc19fbf |
bpf: Clarify a bpf_bprintf_prepare macro
The per-cpu buffers contain bprintf data rather than printf arguments. The macro name and comment were a bit confusing, this rewords them in a clearer way. Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20210517092830.1026418-1-revest@chromium.org |
|
|
|
6bdacdb48e |
bpf: Fix BPF_JIT kconfig symbol dependency
Randy reported a randconfig build error recently on i386: ld: arch/x86/net/bpf_jit_comp32.o: in function `do_jit': bpf_jit_comp32.c:(.text+0x28c9): undefined reference to `__bpf_call_base' ld: arch/x86/net/bpf_jit_comp32.o: in function `bpf_int_jit_compile': bpf_jit_comp32.c:(.text+0x3694): undefined reference to `bpf_jit_blind_constants' ld: bpf_jit_comp32.c:(.text+0x3719): undefined reference to `bpf_jit_binary_free' ld: bpf_jit_comp32.c:(.text+0x3745): undefined reference to `bpf_jit_binary_alloc' ld: bpf_jit_comp32.c:(.text+0x37d3): undefined reference to `bpf_jit_prog_release_other' [...] The cause was that |
|
|
|
940d71c646 |
wq: handle VM suspension in stall detection
If VCPU is suspended (VM suspend) in wq_watchdog_timer_fn() then
once this VCPU resumes it will see the new jiffies value, while it
may take a while before IRQ detects PVCLOCK_GUEST_STOPPED on this
VCPU and updates all the watchdogs via pvclock_touch_watchdogs().
There is a small chance of misreported WQ stalls in the meantime,
because new jiffies is time_after() old 'ts + thresh'.
wq_watchdog_timer_fn()
{
for_each_pool(pool, pi) {
if (time_after(jiffies, ts + thresh)) {
pr_emerg("BUG: workqueue lockup - pool");
}
}
}
Save jiffies at the beginning of this function and use that value
for stall detection. If VM gets suspended then we continue using
"old" jiffies value and old WQ touch timestamps. If IRQ at some
point restarts the stall detection cycle (pvclock_touch_watchdogs())
then old jiffies will always be before new 'ts + thresh'.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
|
|
45e1ba4083 |
cgroup: disable controllers at parse time
This patch effectively reverts the commit |
|
|
|
0683b53197 |
signal: Deliver all of the siginfo perf data in _perf
Don't abuse si_errno and deliver all of the perf data in _perf member of siginfo_t. Note: The data field in the perf data structures in a u64 to allow a pointer to be encoded without needed to implement a 32bit and 64bit version of the same structure. There already exists a 32bit and 64bit versions siginfo_t, and the 32bit version can not include a 64bit member as it only has 32bit alignment. So unsigned long is used in siginfo_t instead of a u64 as unsigned long can encode a pointer on all architectures linux supports. v1: https://lkml.kernel.org/r/m11rarqqx2.fsf_-_@fess.ebiederm.org v2: https://lkml.kernel.org/r/20210503203814.25487-10-ebiederm@xmission.com v3: https://lkml.kernel.org/r/20210505141101.11519-11-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20210517195748.8880-4-ebiederm@xmission.com Reviewed-by: Marco Elver <elver@google.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> |
|
|
|
af5eeab7e8 |
signal: Factor force_sig_perf out of perf_sigtrap
Separate filling in siginfo for TRAP_PERF from deciding that siginal needs to be sent. There are enough little details that need to be correct when properly filling in siginfo_t that it is easy to make mistakes if filling in the siginfo_t is in the same function with other logic. So factor out force_sig_perf to reduce the cognative load of on reviewers, maintainers and implementors. v1: https://lkml.kernel.org/r/m17dkjqqxz.fsf_-_@fess.ebiederm.org v2: https://lkml.kernel.org/r/20210505141101.11519-10-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20210517195748.8880-3-ebiederm@xmission.com Reviewed-by: Marco Elver <elver@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> |
|
|
|
9abcabe311 |
signal: Implement SIL_FAULT_TRAPNO
Now that si_trapno is part of the union in _si_fault and available on all architectures, add SIL_FAULT_TRAPNO and update siginfo_layout to return SIL_FAULT_TRAPNO when the code assumes si_trapno is valid. There is room for future changes to reduce when si_trapno is valid but this is all that is needed to make si_trapno and the other members of the the union in _sigfault mutually exclusive. Update the code that uses siginfo_layout to deal with SIL_FAULT_TRAPNO and have the same code ignore si_trapno in in all other cases. v1: https://lkml.kernel.org/r/m1o8dvs7s7.fsf_-_@fess.ebiederm.org v2: https://lkml.kernel.org/r/20210505141101.11519-6-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20210517195748.8880-2-ebiederm@xmission.com Reviewed-by: Marco Elver <elver@google.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> |
|
|
|
add0b32ef9 |
siginfo: Move si_trapno inside the union inside _si_fault
It turns out that linux uses si_trapno very sparingly, and as such it can be considered extra information for a very narrow selection of signals, rather than information that is present with every fault reported in siginfo. As such move si_trapno inside the union inside of _si_fault. This results in no change in placement, and makes it eaiser to extend _si_fault in the future as this reduces the number of special cases. In particular with si_trapno included in the union it is no longer a concern that the union must be pointer aligned on most architectures because the union follows immediately after si_addr which is a pointer. This change results in a difference in siginfo field placement on sparc and alpha for the fields si_addr_lsb, si_lower, si_upper, si_pkey, and si_perf. These architectures do not implement the signals that would use si_addr_lsb, si_lower, si_upper, si_pkey, and si_perf. Further these architecture have not yet implemented the userspace that would use si_perf. The point of this change is in fact to correct these placement issues before sparc or alpha grow userspace that cares. This change was discussed[1] and the agreement is that this change is currently safe. [1]: https://lkml.kernel.org/r/CAK8P3a0+uKYwL1NhY6Hvtieghba2hKYGD6hcKx5n8=4Gtt+pHA@mail.gmail.com Acked-by: Marco Elver <elver@google.com> v1: https://lkml.kernel.org/r/m1tunns7yf.fsf_-_@fess.ebiederm.org v2: https://lkml.kernel.org/r/20210505141101.11519-5-ebiederm@xmission.com Link: https://lkml.kernel.org/r/20210517195748.8880-1-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> |
|
|
|
976aac5f88 |
kcsan: Fix debugfs initcall return type
clang with CONFIG_LTO_CLANG points out that an initcall function should return an 'int' due to the changes made to the initcall macros in commit |
|
|
|
3a010c4932 |
locking/mutex: clear MUTEX_FLAGS if wait_list is empty due to signal
When a interruptible mutex locker is interrupted by a signal
without acquiring this lock and removed from the wait queue.
if the mutex isn't contended enough to have a waiter
put into the wait queue again, the setting of the WAITER
bit will force mutex locker to go into the slowpath to
acquire the lock every time, so if the wait queue is empty,
the WAITER bit need to be clear.
Fixes:
|
|
|
|
89e70d5c58 |
locking/lockdep: Correct calling tracepoints
The commit |
|
|
|
055f23b74b |
module: check for exit sections in layout_sections() instead of module_init_section()
Previously, when CONFIG_MODULE_UNLOAD=n, the module loader just does not attempt to load exit sections since it never expects that any code in those sections will ever execute. However, dynamic code patching (alternatives, jump_label and static_call) can have sites in __exit code, even if __exit is never executed. Therefore __exit must be present at runtime, at least for as long as __init code is. Commit |
|
|
|
8ce3648158 |
Two fixes for timers:
- Use the ALARM feature check in the alarmtimer core code insted of
the old method of checking for the set_alarm() callback. Drivers
can have that callback set but the feature bit cleared. If such
a RTC device is selected then alarms wont work.
- Use a proper define to let the preprocessor check whether Hyper-V VDSO
clocksource should be active. The code used a constant in an enum with
#ifdef, which evaluates to always false and disabled the clocksource
for VDSO.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmChLI8THHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoUJMD/wOQ/R7jXe/EWti3+w11TATvkP+ZzDv
LcAfZ/ZP8wgrUTbjLqTTyeOFoI9q39emnq3FvCoRsF+rdHRbnZNAB3kWQmh/i1tL
j8BuGogzvVLkBmriQIzVxYgEroCZVySWkO27B7ToBq64IeI4IBVB4jQiJis614m7
5wTHKgN0MkAtWUmwDqkqycFDuWyZNPkR3Ht26zk46Lvk0dmIPh14zbVzezfFEtq4
9DBeGuLDLVtzaBNLWUvnpXL7wxuFB+E8euO5otbmgRNz7CXaE6e6zy6zspK2ahmp
FRq+nrG6yK6ucoFhGFABfKZCGorhh1ghhniPUXQKP9B29z146pN6TLFAVAutBk4z
RoRdyGb9npoO1pB0f2tl0U65TBBlMCnLnDB3hcQ/eyMG7AC8ABHalBIFUjzEPB4b
3eDa+ZxfkW8/oiSLTssQiJ6TJW1EQNaVja1TuHvtPi5RdasbS4LEkQnDaePQ3/nl
tDLekfsDF4KxetZehIlRDqyN9cqIHVphs3pTysyWR7+aOTduWWF58ZtgR7SvTCVu
7Zu+PhP06A1MtEugnwcAcpG5XYCsAXdZXinuQhPndXqazN4wMJkanXNk03z//JmQ
wG//lFAC+9EfA8i9RDr2DeE6JISD2g+jj2Di9bjjxelp5Mi0bNZ0zdIiww6EJjRg
v4F0vCp3By8SQg==
=TruV
-----END PGP SIGNATURE-----
Merge tag 'timers-urgent-2021-05-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"Two fixes for timers:
- Use the ALARM feature check in the alarmtimer core code insted of
the old method of checking for the set_alarm() callback.
Drivers can have that callback set but the feature bit cleared. If
such a RTC device is selected then alarms wont work.
- Use a proper define to let the preprocessor check whether Hyper-V
VDSO clocksource should be active.
The code used a constant in an enum with #ifdef, which evaluates to
always false and disabled the clocksource for VDSO"
* tag 'timers-urgent-2021-05-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/hyper-v: Re-enable VDSO_CLOCKMODE_HVCLOCK on X86
alarmtimer: Check RTC features instead of ops
|
|
|
|
c12a29ed90 |
Fix an idle CPU selection bug, and an AMD Ryzen maximum frequency enumeration bug.
Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCffOARHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1iHBRAAm7p68+/sec2neJ2SxrOdl3kWU5yUXgM/ X2WUQiU8ERAI1IfaKcJBbJCDlIr7Pufwec31IvLpyM5my+pfNkuB9EcLxwQuUZ8y 2IZXF3HlaxWUEfwVqAQ/Dm1J1jExz20vSVzom/2TeE8H1kibdjs6EfouW17FZbwc CXtZC5MWArU/Wt5cjm84Cn5JAx0Udw3RKv8O5o3w/gz0RMjTGCzxlS54QwF+j1fG r1kRL+64yS1LPofnsEDSqfw52J/agSpVOgOiRtn7RUYPoTlmkYZ7l1JeZe/bukDi YsF6uE8nfoRrjhdWVwOpvjEeTzP1hnNBT64piOY+G0wdoBJHmU+jzu5mJIyjxAeY BnJqA7cH16F9cIKCPilmsifbptJtli+Y301036sxMBj8IlcbPKdHlW/qG9ibUCeN r6IPZnONd5JaDeEUCQl91fhGxDn8JrSew5Bh6Yp8B2KsJ9cXirUoPORjqu7Fccfe YRHNPfK8JpSPGv5SSXRrrr6bSdPBhEueqUemfItTGsPpZY/mD0iTIlecol6o0Wfc A11rk6Hb1BMVveNSCTrH7VFJ9nsql1XI5C7rp0D4+9uEDEYRHsq9rInZSevbytsI ocF03ineypbGmiiLT5cYiwR2+ucheX8WaS+BpGXlxjTwvAV+s0QdeTe9UyW9mySl R1ly0Jwpd3Q= =Ggm4 -----END PGP SIGNATURE----- Merge tag 'sched-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Fix an idle CPU selection bug, and an AMD Ryzen maximum frequency enumeration bug" * tag 'sched-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, sched: Fix the AMD CPPC maximum performance value on certain AMD Ryzen generations sched/fair: Fix clearing of has_idle_cores flag in select_idle_cpu() |
|
|
|
a4147415bd |
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton: "13 patches. Subsystems affected by this patch series: resource, squashfs, hfsplus, modprobe, and mm (hugetlb, slub, userfaultfd, ksm, pagealloc, kasan, pagemap, and ioremap)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/ioremap: fix iomap_max_page_shift docs: admin-guide: update description for kernel.modprobe sysctl hfsplus: prevent corruption in shrinking truncate mm/filemap: fix readahead return types kasan: fix unit tests with CONFIG_UBSAN_LOCAL_BOUNDS enabled mm: fix struct page layout on 32-bit systems ksm: revert "use GET_KSM_PAGE_NOLOCK to get ksm page in remove_rmap_item_from_tree()" userfaultfd: release page in error path to avoid BUG_ON squashfs: fix divide error in calculate_skip() kernel/resource: fix return code check in __request_free_mem_region mm, slub: move slub_debug static key enabling outside slab_mutex mm/hugetlb: fix cow where page writtable in child mm/hugetlb: fix F_SEAL_FUTURE_WRITE |
|
|
|
eb1f065f90 |
kernel/resource: fix return code check in __request_free_mem_region
Splitting an earlier version of a patch that allowed calling __request_region() while holding the resource lock into a series of patches required changing the return code for the newly introduced __request_region_locked(). Unfortunately this change was not carried through to a subsequent commit |
|
|
|
25a1298726 |
tracing: Fix trace_check_vprintf() for %.*s
The sanity check of all strings being read from the ring buffer to make sure they are in safe memory space did not account for the %.*s notation having another parameter to process (the length). Add that to the check. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYJ7e5xQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qm/IAPwJfjbQb6quaF1PMTY/pOEby5wIvv4c TZxFGN03FgzYRgD8CSUvB/L0gDs56oL5X6gw0Fs/9CJ2cVUo1bCPHEj4LgY= =3v5m -----END PGP SIGNATURE----- Merge tag 'trace-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "Fix trace_check_vprintf() for %.*s The sanity check of all strings being read from the ring buffer to make sure they are in safe memory space did not account for the %.*s notation having another parameter to process (the length). Add that to the check" * tag 'trace-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Handle %.*s in trace_check_vprintf() |
|
|
|
eb01f5353b |
tracing: Handle %.*s in trace_check_vprintf()
If a trace event uses the %*.s notation, the trace_check_vprintf() will
fail and will warn about a bad processing of strings, because it does not
take into account the length field when processing the star (*) part.
Have it handle this case as well.
Link: https://lore.kernel.org/linux-nfs/238C0E2D-C2A4-4578-ADD2-C565B3B99842@oracle.com/
Reported-by: Chuck Lever III <chuck.lever@oracle.com>
Fixes:
|
|
|
|
dbb5afad10 |
ptrace: make ptrace() fail if the tracee changed its pid unexpectedly
Suppose we have 2 threads, the group-leader L and a sub-theread T,
both parked in ptrace_stop(). Debugger tries to resume both threads
and does
ptrace(PTRACE_CONT, T);
ptrace(PTRACE_CONT, L);
If the sub-thread T execs in between, the 2nd PTRACE_CONT doesn not
resume the old leader L, it resumes the post-exec thread T which was
actually now stopped in PTHREAD_EVENT_EXEC. In this case the
PTHREAD_EVENT_EXEC event is lost, and the tracer can't know that the
tracee changed its pid.
This patch makes ptrace() fail in this case until debugger does wait()
and consumes PTHREAD_EVENT_EXEC which reports old_pid. This affects all
ptrace requests except the "asynchronous" PTRACE_INTERRUPT/KILL.
The patch doesn't add the new PTRACE_ option to not complicate the API,
and I _hope_ this won't cause any noticeable regression:
- If debugger uses PTRACE_O_TRACEEXEC and the thread did an exec
and the tracer does a ptrace request without having consumed
the exec event, it's 100% sure that the thread the ptracer
thinks it is targeting does not exist anymore, or isn't the
same as the one it thinks it is targeting.
- To some degree this patch adds nothing new. In the scenario
above ptrace(L) can fail with -ESRCH if it is called after the
execing sub-thread wakes the leader up and before it "steals"
the leader's pid.
Test-case:
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <errno.h>
#include <pthread.h>
#include <assert.h>
void *tf(void *arg)
{
execve("/usr/bin/true", NULL, NULL);
assert(0);
return NULL;
}
int main(void)
{
int leader = fork();
if (!leader) {
kill(getpid(), SIGSTOP);
pthread_t th;
pthread_create(&th, NULL, tf, NULL);
for (;;)
pause();
return 0;
}
waitpid(leader, NULL, WSTOPPED);
ptrace(PTRACE_SEIZE, leader, 0,
PTRACE_O_TRACECLONE | PTRACE_O_TRACEEXEC);
waitpid(leader, NULL, 0);
ptrace(PTRACE_CONT, leader, 0,0);
waitpid(leader, NULL, 0);
int status, thread = waitpid(-1, &status, 0);
assert(thread > 0 && thread != leader);
assert(status == 0x80137f);
ptrace(PTRACE_CONT, thread, 0,0);
/*
* waitid() because waitpid(leader, &status, WNOWAIT) does not
* report status. Why ????
*
* Why WEXITED? because we have another kernel problem connected
* to mt-exec.
*/
siginfo_t info;
assert(waitid(P_PID, leader, &info, WSTOPPED|WEXITED|WNOWAIT) == 0);
assert(info.si_pid == leader && info.si_status == 0x0405);
/* OK, it sleeps in ptrace(PTRACE_EVENT_EXEC == 0x04) */
assert(ptrace(PTRACE_CONT, leader, 0,0) == -1);
assert(errno == ESRCH);
assert(leader == waitpid(leader, &status, WNOHANG));
assert(status == 0x04057f);
assert(ptrace(PTRACE_CONT, leader, 0,0) == 0);
return 0;
}
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Simon Marchi <simon.marchi@efficios.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Pedro Alves <palves@redhat.com>
Acked-by: Simon Marchi <simon.marchi@efficios.com>
Acked-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
02dbb7246c |
sched/fair: Fix clearing of has_idle_cores flag in select_idle_cpu()
In commit: |
|
|
|
df6f823703 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2021-05-11
The following pull-request contains BPF updates for your *net* tree.
We've added 13 non-merge commits during the last 8 day(s) which contain
a total of 21 files changed, 817 insertions(+), 382 deletions(-).
The main changes are:
1) Fix multiple ringbuf bugs in particular to prevent writable mmap of
read-only pages, from Andrii Nakryiko & Thadeu Lima de Souza Cascardo.
2) Fix verifier alu32 known-const subregister bound tracking for bitwise
operations and/or/xor, from Daniel Borkmann.
3) Reject trampoline attachment for functions with variable arguments,
and also add a deny list of other forbidden functions, from Jiri Olsa.
4) Fix nested bpf_bprintf_prepare() calls used by various helpers by
switching to per-CPU buffers, from Florent Revest.
5) Fix kernel compilation with BTF debug info on ppc64 due to pahole
missing TCP-CC functions like cubictcp_init, from Martin KaFai Lau.
6) Add a kconfig entry to provide an option to disallow unprivileged
BPF by default, from Daniel Borkmann.
7) Fix libbpf compilation for older libelf when GELF_ST_VISIBILITY()
macro is not available, from Arnaldo Carvalho de Melo.
8) Migrate test_tc_redirect to test_progs framework as prep work
for upcoming skb_change_head() fix & selftest, from Jussi Maki.
9) Fix a libbpf segfault in add_dummy_ksym_var() if BTF is not
present, from Ian Rogers.
10) Fix tx_only micro-benchmark in xdpsock BPF sample with proper frame
size, from Magnus Karlsson.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
e2d5b2bb76 |
bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers
The bpf_seq_printf, bpf_trace_printk and bpf_snprintf helpers share one
per-cpu buffer that they use to store temporary data (arguments to
bprintf). They "get" that buffer with try_get_fmt_tmp_buf and "put" it
by the end of their scope with bpf_bprintf_cleanup.
If one of these helpers gets called within the scope of one of these
helpers, for example: a first bpf program gets called, uses
bpf_trace_printk which calls raw_spin_lock_irqsave which is traced by
another bpf program that calls bpf_snprintf, then the second "get"
fails. Essentially, these helpers are not re-entrant. They would return
-EBUSY and print a warning message once.
This patch triples the number of bprintf buffers to allow three levels
of nesting. This is very similar to what was done for tracepoints in
"9594dc3c7e7 bpf: fix nested bpf tracepoints with per-cpu data"
Fixes:
|
|
|
|
35e3815fa8 |
bpf: Add deny list of btf ids check for tracing programs
The recursion check in __bpf_prog_enter and __bpf_prog_exit
leaves some (not inlined) functions unprotected:
In __bpf_prog_enter:
- migrate_disable is called before prog->active is checked
In __bpf_prog_exit:
- migrate_enable,rcu_read_unlock_strict are called after
prog->active is decreased
When attaching trampoline to them we get panic like:
traps: PANIC: double fault, error_code: 0x0
double fault: 0000 [#1] SMP PTI
RIP: 0010:__bpf_prog_enter+0x4/0x50
...
Call Trace:
<IRQ>
bpf_trampoline_6442466513_0+0x18/0x1000
migrate_disable+0x5/0x50
__bpf_prog_enter+0x9/0x50
bpf_trampoline_6442466513_0+0x18/0x1000
migrate_disable+0x5/0x50
__bpf_prog_enter+0x9/0x50
bpf_trampoline_6442466513_0+0x18/0x1000
migrate_disable+0x5/0x50
__bpf_prog_enter+0x9/0x50
bpf_trampoline_6442466513_0+0x18/0x1000
migrate_disable+0x5/0x50
...
Fixing this by adding deny list of btf ids for tracing
programs and checking btf id during program verification.
Adding above functions to this list.
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210429114712.43783-1-jolsa@kernel.org
|
|
|
|
08389d8882 |
bpf: Add kconfig knob for disabling unpriv bpf by default
Add a kconfig knob which allows for unprivileged bpf to be disabled by default.
If set, the knob sets /proc/sys/kernel/unprivileged_bpf_disabled to value of 2.
This still allows a transition of 2 -> {0,1} through an admin. Similarly,
this also still keeps 1 -> {1} behavior intact, so that once set to permanently
disabled, it cannot be undone aside from a reboot.
We've also added extra2 with max of 2 for the procfs handler, so that an admin
still has a chance to toggle between 0 <-> 2.
Either way, as an additional alternative, applications can make use of CAP_BPF
that we added a while ago.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/74ec548079189e4e4dffaeb42b8987bb3c852eee.1620765074.git.daniel@iogearbox.net
|
|
|
|
b24abcff91 |
bpf, kconfig: Add consolidated menu entry for bpf with core options
Right now, all core BPF related options are scattered in different Kconfig
locations mainly due to historic reasons. Moving forward, lets add a proper
subsystem entry under ...
General setup --->
BPF subsystem --->
... in order to have all knobs in a single location and thus ease BPF related
configuration. Networking related bits such as sockmap are out of scope for
the general setup and therefore better suited to remain in net/Kconfig.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/f23f58765a4d59244ebd8037da7b6a6b2fb58446.1620765074.git.daniel@iogearbox.net
|
|
|
|
e09784a8a7 |
alarmtimer: Check RTC features instead of ops
RTC drivers used to leave .set_alarm() NULL in order to signal the RTC
device doesn't support alarms. The drivers are now clearing the
RTC_FEATURE_ALARM bit for that purpose in order to keep the rtc_class_ops
structure const. So now, .set_alarm() is set unconditionally and this
possibly causes the alarmtimer code to select an RTC device that doesn't
support alarms.
Test RTC_FEATURE_ALARM instead of relying on ops->set_alarm to determine
whether alarms are available.
Fixes:
|
|
|
|
04ea3086c4 |
bpf: Prevent writable memory-mapping of read-only ringbuf pages
Only the very first page of BPF ringbuf that contains consumer position
counter is supposed to be mapped as writeable by user-space. Producer
position is read-only and can be modified only by the kernel code. BPF ringbuf
data pages are read-only as well and are not meant to be modified by
user-code to maintain integrity of per-record headers.
This patch allows to map only consumer position page as writeable and
everything else is restricted to be read-only. remap_vmalloc_range()
internally adds VM_DONTEXPAND, so all the established memory mappings can't be
extended, which prevents any future violations through mremap()'ing.
Fixes:
|
|
|
|
4b81ccebae |
bpf, ringbuf: Deny reserve of buffers larger than ringbuf
A BPF program might try to reserve a buffer larger than the ringbuf size.
If the consumer pointer is way ahead of the producer, that would be
successfully reserved, allowing the BPF program to read or write out of
the ringbuf allocated area.
Reported-by: Ryota Shiga (Flatt Security)
Fixes:
|
|
|
|
049c4e1371 |
bpf: Fix alu32 const subreg bound tracking on bitwise operations
Fix a bug in the verifier's scalar32_min_max_*() functions which leads to
incorrect tracking of 32 bit bounds for the simulation of and/or/xor bitops.
When both the src & dst subreg is a known constant, then the assumption is
that scalar_min_max_*() will take care to update bounds correctly. However,
this is not the case, for example, consider a register R2 which has a tnum
of 0xffffffff00000000, meaning, lower 32 bits are known constant and in this
case of value 0x00000001. R2 is then and'ed with a register R3 which is a
64 bit known constant, here, 0x100000002.
What can be seen in line '10:' is that 32 bit bounds reach an invalid state
where {u,s}32_min_value > {u,s}32_max_value. The reason is scalar32_min_max_*()
delegates 32 bit bounds updates to scalar_min_max_*(), however, that really
only takes place when both the 64 bit src & dst register is a known constant.
Given scalar32_min_max_*() is intended to be designed as closely as possible
to scalar_min_max_*(), update the 32 bit bounds in this situation through
__mark_reg32_known() which will set all {u,s}32_{min,max}_value to the correct
constant, which is 0x00000000 after the fix (given 0x00000001 & 0x00000002 in
32 bit space). This is possible given var32_off already holds the final value
as dst_reg->var_off is updated before calling scalar32_min_max_*().
Before fix, invalid tracking of R2:
[...]
9: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=-9223372036854775807 (0x8000000000000001),smax_value=9223372032559808513 (0x7fffffff00000001),umin_value=1,umax_value=0xffffffff00000001,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_min_value=1,u32_max_value=1) R3_w=inv4294967298 R10=fp0
9: (5f) r2 &= r3
10: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=0,smax_value=4294967296 (0x100000000),umin_value=0,umax_value=0x100000000,var_off=(0x0; 0x100000000),s32_min_value=1,s32_max_value=0,u32_min_value=1,u32_max_value=0) R3_w=inv4294967298 R10=fp0
[...]
After fix, correct tracking of R2:
[...]
9: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=-9223372036854775807 (0x8000000000000001),smax_value=9223372032559808513 (0x7fffffff00000001),umin_value=1,umax_value=0xffffffff00000001,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_min_value=1,u32_max_value=1) R3_w=inv4294967298 R10=fp0
9: (5f) r2 &= r3
10: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=0,smax_value=4294967296 (0x100000000),umin_value=0,umax_value=0x100000000,var_off=(0x0; 0x100000000),s32_min_value=0,s32_max_value=0,u32_min_value=0,u32_max_value=0) R3_w=inv4294967298 R10=fp0
[...]
Fixes:
|
|
|
|
9819f682e4 |
A set of scheduler updates:
- Prevent PSI state corruption when schedule() races with cgroup move. A
recent commit combined two PSI callbacks to reduce the number of cgroup
tree updates, but missed that schedule() can drop rq::lock for load
balancing, which opens the race window for cgroup_move_task() which then
observes half updated state. The fix is to solely use task::ps_flags
instead of looking at the potentially mismatching scheduler state
- Prevent an out-of-bounds access in uclamp caused bu a rounding division
which can lead to an off-by-one error exceeding the buckets array size.
- Prevent unfairness caused by missing load decay when a task is attached
to a cfs runqueue. The old load of the task is attached to the runqueue
and never removed. Fix it by enforcing the load update through the
hierarchy for unthrottled run queue instances.
- A documentation fix fot the 'sched_verbose' command line option
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmCX6VATHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoXYxEADDmfbQt0QdUHJB95QdunDseL5U787L
qPZ8wUpl+a2IBgqiD1at2IvWJNeivW3GNoRl4lMYOzX/3/Eh5AApfpxeDBtckht2
sUfcduPDk9rrocCP/dLtQK3vVIoZWZladRsYT8K53l68roT6T2+Qkrwd5OtyhfPc
apwIvknVbQ3exUq/OmXtyc0oLLJJ1lyeteJ0ZdIcTuMbeM9IhG8Tm2v3Rh3G0Ic2
eBHOGNIjQWtvP55TyiwtWj35MaXCvy8c7my6YpffjjgLX1X0Tro3/7Jnzo16rsyt
6yR8G4gBtrW+pPH+LNUk45Wpp51B4p1EjpMAMApA94Z9yIxsIip8PoKV6EN2+8sh
K3cfSQlQubXilWNSRQSx/gQLkSXr8Y/wexajcOzycTXw+ifh6biseFCYPTgwkxDB
FKJAdePvh6dntk/2DB5gvRaZY1HI5L+Iv8neiQfHttUPcXYRgSOs9V7k80j+qyqE
QV/vlImZRTW0fiqtWS9ZAFRNGzq/QB/UKp+znDQoUVBE4zxB9nekVDqsCTz4H5n9
oBIIj/xwMfqVojKSH72leK64O1/+ucX9l4/Qxcs4E6LZjYRQL9tmCoRBLZ1uyQ9S
Ee9wpz6TIX9J5Dgr1gYs1WNaheC1Xonu5JtU4ysWUX3jLdBSnJD5vY9OD13rKUV7
eGJKjI979fVhiA==
=L3ar
-----END PGP SIGNATURE-----
Merge tag 'sched-urgent-2021-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
"A set of scheduler updates:
- Prevent PSI state corruption when schedule() races with cgroup
move.
A recent commit combined two PSI callbacks to reduce the number of
cgroup tree updates, but missed that schedule() can drop rq::lock
for load balancing, which opens the race window for
cgroup_move_task() which then observes half updated state.
The fix is to solely use task::ps_flags instead of looking at the
potentially mismatching scheduler state
- Prevent an out-of-bounds access in uclamp caused bu a rounding
division which can lead to an off-by-one error exceeding the
buckets array size.
- Prevent unfairness caused by missing load decay when a task is
attached to a cfs runqueue.
The old load of the task was attached to the runqueue and never
removed. Fix it by enforcing the load update through the hierarchy
for unthrottled run queue instances.
- A documentation fix fot the 'sched_verbose' command line option"
* tag 'sched-urgent-2021-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix unfairness caused by missing load decay
sched: Fix out-of-bound access in uclamp
psi: Fix psi state corruption when schedule() races with cgroup move
sched,doc: sched_debug_verbose cmdline should be sched_verbose
|
|
|
|
732a27a089 |
A set of locking related fixes and updates:
- Two fixes for the futex syscall related to the timeout handling.
FUTEX_LOCK_PI does not support the FUTEX_CLOCK_REALTIME bit and because
it's not set the time namespace adjustment for clock MONOTONIC is
applied wrongly.
FUTEX_WAIT cannot support the FUTEX_CLOCK_REALTIME bit because its
always a relative timeout.
- Cleanups in the futex syscall entry points which became obvious when
the two timeout handling bugs were fixed.
- Cleanup of queued_write_lock_slowpath() as suggested by Linus
- Fixup of the smp_call_function_single_async() prototype
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmCX5X4THHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoSpIEACRWDooQYxDnA81yib5a/R41xpp74uV
OFtEpJoJp0oEGeKpr2lXw2wC6fKpxouLJiPzxBs43IyMm8f6613aJSTrKExWPzqV
UYv6XYjcPZVf/sY0aLh0oO+Dte3BKupr8Nk+DVaanQ7NxBQwhu+P2ACrYSiu1AIi
P0dGvMLJTUTlz81uSCu9csUd67Zr4ZRSa4dOHOJaR2OrRK91QPs9QWHdseHp7vAm
N5X49kMRbBffs1Uk4b6TTBUpnPUzMXH4Wv7tp5tISVCVClbPKtLZ7IQ4TGRvnlFO
y67YLEEkHc11wCVbjRxZTqKyiXGqo000zEDxskICSBF5GAqA4JN49/TP00NOvbHS
zOL69jisqB3i2vUuggcDP1pzumGwHgkziOK1cljhKMBhfuErA0yrl2vWI1B5aKtH
BdFOSKDoWnkAIw8n3Q/8QTfwLxA4DZ5NrEJjYKTVGkYpwtUDVcKdlzfd0u2rqQFU
b0Qno4iDC3ZYibNfKcTCrkvu1iIKvr1FwWkVr1sMGH4/zFq4uRarsPMcba6Qt826
LAESoB9bd8yFGgY3wgnTSCCJdxVggrudxHM4p6jKEr/CnWWtgmzyMvUvd+XGZqfr
lshgI3SUQg95mu/MpRulNMxn8RvH1Tka26xoWGtCzBvl/DNGTbcFv6EsZJaN3xVe
+VYp6ht9gKf5nA==
=bo8R
-----END PGP SIGNATURE-----
Merge tag 'locking-urgent-2021-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"A set of locking related fixes and updates:
- Two fixes for the futex syscall related to the timeout handling.
FUTEX_LOCK_PI does not support the FUTEX_CLOCK_REALTIME bit and
because it's not set the time namespace adjustment for clock
MONOTONIC is applied wrongly.
FUTEX_WAIT cannot support the FUTEX_CLOCK_REALTIME bit because its
always a relative timeout.
- Cleanups in the futex syscall entry points which became obvious
when the two timeout handling bugs were fixed.
- Cleanup of queued_write_lock_slowpath() as suggested by Linus
- Fixup of the smp_call_function_single_async() prototype"
* tag 'locking-urgent-2021-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Make syscall entry points less convoluted
futex: Get rid of the val2 conditional dance
futex: Do not apply time namespace adjustment on FUTEX_LOCK_PI
Revert
|
|
|
|
0f979d815c |
Kbuild updates for v5.13 (2nd)
- Convert sh and sparc to use generic shell scripts to generate the
syscall headers
- refactor .gitignore files
- Update kernel/config_data.gz only when the content of the .config is
really changed, which avoids the unneeded re-link of vmlinux
- move "remove stale files" workarounds to scripts/remove-stale-files
- suppress unused-but-set-variable warnings by default for Clang as well
- fix locale setting LANG=C to LC_ALL=C
- improve 'make distclean'
- always keep intermediate objects from scripts/link-vmlinux.sh
- move IF_ENABLED out of <linux/kconfig.h> to make it self-contained
- misc cleanups
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmCWrucVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGRLkQAJ8t7PfMJLSh/VcgDXp3Z7fZ/V2M
RUGbOeRYErR1gylejuip/R19mS5MiBNecU60VrugZyDOMf98+mx61mI/ykpPeX92
sE3VU5MPXEwmv758QUr4gH014TZshMtHHo+tXA+NVUbqFp7RTnkZMDjOXGthYDHG
NhDou4LZ2P0CUKm8vb58SJPqB7ZdYOT9eEQEdHevm18Gx0KProCxRziup7loldy7
ET770okQ23if90ufCSVmnM6Ee6opoKYvXS5lv8V/a4xV/VbicbUclpzIZsHF7L2i
mIfr6dy480ncOaQlfWnX9ACgIeeqiFPOeZbAu7HAtwXzP5vCahgQ9FKVC7KPt+BP
Lf3LgdBrfSP5A7f7FrtkkPmP7pl1j6/Bq3+PhCur9XimtRIsvTOx7m7nuvsY4yHC
/wmBXFZgqE5DGyzpHXz1az8JHWw2AesP9L2f536BhfvRtdXaoOxPtZ/rmO1lfcMV
fWMa9f1em8lXwCiD1dR8UkBrIxItty+qqPffu2S/DlEepbiZrCg1gD827Fy7Mm3n
5rvrzYMOY2YK0yW1jtm+w3NlPlmG91BDUTP8tEcDxrTOIXezwqJf7fw8qIgGIy7W
3WzuBfgSvpT977ByMsB0YYugo2Xie+R1jpOWt7tv6KHM4varNBu0WpVhQhrKQr5o
agJiuvzsf3b+64oP
=935P
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull more Kbuild updates from Masahiro Yamada:
- Convert sh and sparc to use generic shell scripts to generate the
syscall headers
- refactor .gitignore files
- Update kernel/config_data.gz only when the content of the .config
is really changed, which avoids the unneeded re-link of vmlinux
- move "remove stale files" workarounds to scripts/remove-stale-files
- suppress unused-but-set-variable warnings by default for Clang
as well
- fix locale setting LANG=C to LC_ALL=C
- improve 'make distclean'
- always keep intermediate objects from scripts/link-vmlinux.sh
- move IF_ENABLED out of <linux/kconfig.h> to make it self-contained
- misc cleanups
* tag 'kbuild-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (25 commits)
linux/kconfig.h: replace IF_ENABLED() with PTR_IF() in <linux/kernel.h>
kbuild: Don't remove link-vmlinux temporary files on exit/signal
kbuild: remove the unneeded comments for external module builds
kbuild: make distclean remove tag files in sub-directories
kbuild: make distclean work against $(objtree) instead of $(srctree)
kbuild: refactor modname-multi by using suffix-search
kbuild: refactor fdtoverlay rule
kbuild: parameterize the .o part of suffix-search
arch: use cross_compiling to check whether it is a cross build or not
kbuild: remove ARCH=sh64 support from top Makefile
.gitignore: prefix local generated files with a slash
kbuild: replace LANG=C with LC_ALL=C
Makefile: Move -Wno-unused-but-set-variable out of GCC only block
kbuild: add a script to remove stale generated files
kbuild: update config_data.gz only when the content of .config is changed
.gitignore: ignore only top-level modules.builtin
.gitignore: move tags and TAGS close to other tag files
kernel/.gitgnore: remove stale timeconst.h and hz.bc
usr/include: refactor .gitignore
genksyms: fix stale comment
...
|
|
|
|
fc858a5231 |
Networking fixes for 5.13-rc1, including fixes from bpf, can
and netfilter trees. Self-contained fixes, nothing risky.
Current release - new code bugs:
- dsa: ksz: fix a few bugs found by static-checker in the new driver
- stmmac: fix frame preemption handshake not triggering after
interface restart
Previous releases - regressions:
- make nla_strcmp handle more then one trailing null character
- fix stack OOB reads while fragmenting IPv4 packets in openvswitch
and net/sched
- sctp: do asoc update earlier in sctp_sf_do_dupcook_a
- sctp: delay auto_asconf init until binding the first addr
- stmmac: clear receive all(RA) bit when promiscuous mode is off
- can: mcp251x: fix resume from sleep before interface was brought up
Previous releases - always broken:
- bpf: fix leakage of uninitialized bpf stack under speculation
- bpf: fix masking negation logic upon negative dst register
- netfilter: don't assume that skb_header_pointer() will never fail
- only allow init netns to set default tcp cong to a restricted algo
- xsk: fix xp_aligned_validate_desc() when len == chunk_size to
avoid false positive errors
- ethtool: fix missing NLM_F_MULTI flag when dumping
- can: m_can: m_can_tx_work_queue(): fix tx_skb race condition
- sctp: fix a SCTP_MIB_CURRESTAB leak in sctp_sf_do_dupcook_b
- bridge: fix NULL-deref caused by a races between assigning
rx_handler_data and setting the IFF_BRIDGE_PORT bit
Latecomer:
- seg6: add counters support for SRv6 Behaviors
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmCV3YoACgkQMUZtbf5S
IrsQ2w//Q8/qbl6wGTKUfu6DZHYUU5j5sTwiHR823PKKSgXI+okWMN0KUlZszOsz
qnPkH6GuojRooOE1s8PFLSlt9axKhQ0y7uzMTrWYafQ+JZTtgg9/MiPxQ8fdiE5i
uOG1ngttZ+1jlE5tMPL4GAOSegg3rWVDclzqnJTdsPPOco3MWj6SL9xN0LDPxCEL
BDysRqL/UiOIoh4v6IXQRx2UWjsNGu4biM1po+Jfumnd9T0zKoEpzu6UN6yPShbx
284LihZSQtughCbhGqkErBOxfjZcvpFOQrqmjEvI+Z/eYg4InfWZemt8Sa92/alE
yAFjK76MUTaUxaAO/gk8XauhvkYOzJJwKpqhbOmlaM7oj55QdzT5/8JxMxVoA6hV
pscHOixk15GVse49PdPV8v47cyTLc/Xi69i+/uUdNVVfuORL1wft1w1xbd0S6Pbe
7Gqax21S7zxcDsrUli7cFheYiqtbQAL0anlIUz8tUOZFz0VQ/zPuFd4rUYZ/o38V
Mrevdk3t6CXNxS4CRXyUW4UejYB1O6Qw12sUue31e3h73d6LiN3NAiN5Qp7SEk1/
fvk+jfOf8vvmtimYvcUK2i0D+vqj4Ec/qRIE/XXuUDBcp22tPL9uWMfWavwTdAj1
Se4SzksTWF+NM0lO0ItonMyPh3ZXcSLhIv/gHrZwEKuWkXCGO4M=
=JmWS
-----END PGP SIGNATURE-----
Merge tag 'net-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.13-rc1, including fixes from bpf, can and
netfilter trees. Self-contained fixes, nothing risky.
Current release - new code bugs:
- dsa: ksz: fix a few bugs found by static-checker in the new driver
- stmmac: fix frame preemption handshake not triggering after
interface restart
Previous releases - regressions:
- make nla_strcmp handle more then one trailing null character
- fix stack OOB reads while fragmenting IPv4 packets in openvswitch
and net/sched
- sctp: do asoc update earlier in sctp_sf_do_dupcook_a
- sctp: delay auto_asconf init until binding the first addr
- stmmac: clear receive all(RA) bit when promiscuous mode is off
- can: mcp251x: fix resume from sleep before interface was brought up
Previous releases - always broken:
- bpf: fix leakage of uninitialized bpf stack under speculation
- bpf: fix masking negation logic upon negative dst register
- netfilter: don't assume that skb_header_pointer() will never fail
- only allow init netns to set default tcp cong to a restricted algo
- xsk: fix xp_aligned_validate_desc() when len == chunk_size to avoid
false positive errors
- ethtool: fix missing NLM_F_MULTI flag when dumping
- can: m_can: m_can_tx_work_queue(): fix tx_skb race condition
- sctp: fix a SCTP_MIB_CURRESTAB leak in sctp_sf_do_dupcook_b
- bridge: fix NULL-deref caused by a races between assigning
rx_handler_data and setting the IFF_BRIDGE_PORT bit
Latecomer:
- seg6: add counters support for SRv6 Behaviors"
* tag 'net-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (73 commits)
atm: firestream: Use fallthrough pseudo-keyword
net: stmmac: Do not enable RX FIFO overflow interrupts
mptcp: fix splat when closing unaccepted socket
i40e: Remove LLDP frame filters
i40e: Fix PHY type identifiers for 2.5G and 5G adapters
i40e: fix the restart auto-negotiation after FEC modified
i40e: Fix use-after-free in i40e_client_subtask()
i40e: fix broken XDP support
netfilter: nftables: avoid potential overflows on 32bit arches
netfilter: nftables: avoid overflows in nft_hash_buckets()
tcp: Specify cmsgbuf is user pointer for receive zerocopy.
mlxsw: spectrum_mr: Update egress RIF list before route's action
net: ipa: fix inter-EE IRQ register definitions
can: m_can: m_can_tx_work_queue(): fix tx_skb race condition
can: mcp251x: fix resume from sleep before interface was brought up
can: mcp251xfd: mcp251xfd_probe(): add missing can_rx_offload_del() in error path
can: mcp251xfd: mcp251xfd_probe(): fix an error pointer dereference in probe
netfilter: nftables: Fix a memleak from userdata error path in new objects
netfilter: remove BUG_ON() after skb_header_pointer()
netfilter: nfnetlink_osf: Fix a missing skb_header_pointer() NULL check
...
|