mirror of https://github.com/torvalds/linux.git
57043 Commits
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
933a90bf4f |
Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs mount updates from Al Viro: "The first part of mount updates. Convert filesystems to use the new mount API" * 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) mnt_init(): call shmem_init() unconditionally constify ksys_mount() string arguments don't bother with registering rootfs init_rootfs(): don't bother with init_ramfs_fs() vfs: Convert smackfs to use the new mount API vfs: Convert selinuxfs to use the new mount API vfs: Convert securityfs to use the new mount API vfs: Convert apparmorfs to use the new mount API vfs: Convert openpromfs to use the new mount API vfs: Convert xenfs to use the new mount API vfs: Convert gadgetfs to use the new mount API vfs: Convert oprofilefs to use the new mount API vfs: Convert ibmasmfs to use the new mount API vfs: Convert qib_fs/ipathfs to use the new mount API vfs: Convert efivarfs to use the new mount API vfs: Convert configfs to use the new mount API vfs: Convert binfmt_misc to use the new mount API convenience helper: get_tree_single() convenience helper get_tree_nodev() vfs: Kill sget_userns() ... |
|
|
|
5f4fc6d440 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix AF_XDP cq entry leak, from Ilya Maximets.
2) Fix handling of PHY power-down on RTL8411B, from Heiner Kallweit.
3) Add some new PCI IDs to iwlwifi, from Ihab Zhaika.
4) Fix handling of neigh timers wrt. entries added by userspace, from
Lorenzo Bianconi.
5) Various cases of missing of_node_put(), from Nishka Dasgupta.
6) The new NET_ACT_CT needs to depend upon NF_NAT, from Yue Haibing.
7) Various RDS layer fixes, from Gerd Rausch.
8) Fix some more fallout from TCQ_F_CAN_BYPASS generalization, from
Cong Wang.
9) Fix FIB source validation checks over loopback, also from Cong Wang.
10) Use promisc for unsupported number of filters, from Justin Chen.
11) Missing sibling route unlink on failure in ipv6, from Ido Schimmel.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits)
tcp: fix tcp_set_congestion_control() use from bpf hook
ag71xx: fix return value check in ag71xx_probe()
ag71xx: fix error return code in ag71xx_probe()
usb: qmi_wwan: add D-Link DWM-222 A2 device ID
bnxt_en: Fix VNIC accounting when enabling aRFS on 57500 chips.
net: dsa: sja1105: Fix missing unlock on error in sk_buff()
gve: replace kfree with kvfree
selftests/bpf: fix test_xdp_noinline on s390
selftests/bpf: fix "valid read map access into a read-only array 1" on s390
net/mlx5: Replace kfree with kvfree
MAINTAINERS: update netsec driver
ipv6: Unlink sibling route in case of failure
liquidio: Replace vmalloc + memset with vzalloc
udp: Fix typo in net/ipv4/udp.c
net: bcmgenet: use promisc for unsupported filters
ipv6: rt6_check should return NULL if 'from' is NULL
tipc: initialize 'validated' field of received packets
selftests: add a test case for rp_filter
fib: relax source validation check for loopback packets
mlxsw: spectrum: Do not process learned records with a dummy FID
...
|
|
|
|
249be8511b |
Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:
"The rest of MM and a kernel-wide procfs cleanup.
Summary of the more significant patches:
- Patch series "mm/memory_hotplug: Factor out memory block
devicehandling", v3. David Hildenbrand.
Some spring-cleaning of the memory hotplug code, notably in
drivers/base/memory.c
- "mm: thp: fix false negative of shmem vma's THP eligibility". Yang
Shi.
Fix /proc/pid/smaps output for THP pages used in shmem.
- "resource: fix locking in find_next_iomem_res()" + 1. Nadav Amit.
Bugfix and speedup for kernel/resource.c
- Patch series "mm: Further memory block device cleanups", David
Hildenbrand.
More spring-cleaning of the memory hotplug code.
- Patch series "mm: Sub-section memory hotplug support". Dan
Williams.
Generalise the memory hotplug code so that pmem can use it more
completely. Then remove the hacks from the libnvdimm code which
were there to work around the memory-hotplug code's constraints.
- "proc/sysctl: add shared variables for range check", Matteo Croce.
We have about 250 instances of
int zero;
...
.extra1 = &zero,
in the tree. This is a tree-wide sweep to make all those private
"zero"s and "one"s use global variables.
Alas, it isn't practical to make those two global integers const"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (38 commits)
proc/sysctl: add shared variables for range check
mm: migrate: remove unused mode argument
mm/sparsemem: cleanup 'section number' data types
libnvdimm/pfn: stop padding pmem namespaces to section alignment
libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields
mm/devm_memremap_pages: enable sub-section remap
mm: document ZONE_DEVICE memory-model implications
mm/sparsemem: support sub-section hotplug
mm/sparsemem: prepare for sub-section ranges
mm: kill is_dev_zone() helper
mm/hotplug: kill is_dev_zone() usage in __remove_pages()
mm/sparsemem: convert kmalloc_section_memmap() to populate_section_memmap()
mm/hotplug: prepare shrink_{zone, pgdat}_span for sub-section removal
mm/sparsemem: add helpers track active portions of a section at boot
mm/sparsemem: introduce a SECTION_IS_EARLY flag
mm/sparsemem: introduce struct mem_section_usage
drivers/base/memory.c: get rid of find_memory_block_hinted()
mm/memory_hotplug: move and simplify walk_memory_blocks()
mm/memory_hotplug: rename walk_memory_range() and pass start+size instead of pfns
mm: make register_mem_sect_under_node() static
...
|
|
|
|
8d650cdeda |
tcp: fix tcp_set_congestion_control() use from bpf hook
Neal reported incorrect use of ns_capable() from bpf hook.
bpf_setsockopt(...TCP_CONGESTION...)
-> tcp_set_congestion_control()
-> ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)
-> ns_capable_common()
-> current_cred()
-> rcu_dereference_protected(current->cred, 1)
Accessing 'current' in bpf context makes no sense, since packets
are processed from softirq context.
As Neal stated : The capability check in tcp_set_congestion_control()
was written assuming a system call context, and then was reused from
a BPF call site.
The fix is to add a new parameter to tcp_set_congestion_control(),
so that the ns_capable() call is only performed under the right
context.
Fixes:
|
|
|
|
eec4844fae |
proc/sysctl: add shared variables for range check
In the sysctl code the proc_dointvec_minmax() function is often used to
validate the user supplied value between an allowed range. This
function uses the extra1 and extra2 members from struct ctl_table as
minimum and maximum allowed value.
On sysctl handler declaration, in every source file there are some
readonly variables containing just an integer which address is assigned
to the extra1 and extra2 members, so the sysctl range is enforced.
The special values 0, 1 and INT_MAX are very often used as range
boundary, leading duplication of variables like zero=0, one=1,
int_max=INT_MAX in different source files:
$ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l
248
Add a const int array containing the most commonly used values, some
macros to refer more easily to the correct array member, and use them
instead of creating a local one for every object file.
This is the bloat-o-meter output comparing the old and new binary
compiled with the default Fedora config:
# scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o
add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164)
Data old new delta
sysctl_vals - 12 +12
__kstrtab_sysctl_vals - 12 +12
max 14 10 -4
int_max 16 - -16
one 68 - -68
zero 128 28 -100
Total: Before=20583249, After=20583085, chg -0.00%
[mcroce@redhat.com: tipc: remove two unused variables]
Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com
[akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c]
[arnd@arndb.de: proc/sysctl: make firmware loader table conditional]
Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de
[akpm@linux-foundation.org: fix fs/eventpoll.c]
Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.com
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
008cfbaa3f |
net: dsa: sja1105: Fix missing unlock on error in sk_buff()
Add the missing unlock before return from function sk_buff()
in the error handling case.
Fixes:
|
|
|
|
6860c981b9 |
NFS client updates for Linux 5.3
Highlights include:
Stable fixes:
- SUNRPC: Ensure bvecs are re-synced when we re-encode the RPC request
- Fix an Oops in ff_layout_track_ds_error due to a PTR_ERR() dereference
- Revert buggy NFS readdirplus optimisation
- NFSv4: Handle the special Linux file open access mode
- pnfs: Fix a problem where we gratuitously start doing I/O through the MDS
Features:
- Allow NFS client to set up multiple TCP connections to the server using
a new 'nconnect=X' mount option. Queue length is used to balance load.
- Enhance statistics reporting to report on all transports when using
multiple connections.
- Speed up SUNRPC by removing bh-safe spinlocks
- Add a mechanism to allow NFSv4 to request that containers set a
unique per-host identifier for when the hostname is not set.
- Ensure NFSv4 updates the lease_time after a clientid update
Bugfixes and cleanup:
- Fix use-after-free in rpcrdma_post_recvs
- Fix a memory leak when nfs_match_client() is interrupted
- Fix buggy file access checking in NFSv4 open for execute
- disable unsupported client side deduplication
- Fix spurious client disconnections
- Fix occasional RDMA transport deadlock
- Various RDMA cleanups
- Various tracepoint fixes
- Fix the TCP callback channel to guarantee the server can actually send
the number of callback requests that was negotiated at mount time.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEESQctxSBg8JpV8KqEZwvnipYKAPIFAl0wz/EACgkQZwvnipYK
APJUrQ/9FpbWTaI/H0BkBf+/iIDAwuGozldDDRkHpHvykvcbNSFPNk5CA4fV/lCq
rI0tD9ijj0KJhAjS1RWLb9oXMUm/MDIIcZNvsbOiIfGftzGnxAz022KS2JyV+Xk6
ul7sCQ8QmbQJzfZk+aFZxl5slS9i5nM+Sa//k5SK3JghOlLuYkW50OlmE6ME8rKP
sLWD2quLh1J5mjG+B9ml4YWozNi0elFN6ZqdzO/SjIhIG7rn2Hkr0glWcEprdiS7
qHoF6dIn80oR37YNrcbBtEGqBnMnWkpGAGq2Hgf5I2YtO+xfEgMqqs9FNIcahtQG
SGa1bnpcys9+fG4fnwzgOgYkvGzKdi/bIYc8lhBowl+76z+6VgjwlwT3wRtnWfO9
288LZDMMfOSYoDkbA9iGtoQRviIsoGMoeH/r5tZe7U1W35h60NGPVhH9sP9/Gc2s
Va1xBYK6PaIg+UxeghJmCnNK3zANyEI7AtDVvn0wsJuDcgXO5/PAANenc6/Fl6nf
U9XB4wmgaM2jN5uqLYT3bJAaAPAXTOayVnnd9oDYb87/31sYM6iwzSPkmm5+qIsi
0Ftks4PSjpjTTrmUvs3CL+ll3Y9bWUd58qlooGwi6u1DnXDQDzbIoRQBezNFJ+Lc
MahUVY7w9ATyJFJRvbscO7mRS8qTmr4qP91bAOsYQ0P2Yo02L7k=
=eR7d
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-5.3-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Stable fixes:
- SUNRPC: Ensure bvecs are re-synced when we re-encode the RPC
request
- Fix an Oops in ff_layout_track_ds_error due to a PTR_ERR()
dereference
- Revert buggy NFS readdirplus optimisation
- NFSv4: Handle the special Linux file open access mode
- pnfs: Fix a problem where we gratuitously start doing I/O through
the MDS
Features:
- Allow NFS client to set up multiple TCP connections to the server
using a new 'nconnect=X' mount option. Queue length is used to
balance load.
- Enhance statistics reporting to report on all transports when using
multiple connections.
- Speed up SUNRPC by removing bh-safe spinlocks
- Add a mechanism to allow NFSv4 to request that containers set a
unique per-host identifier for when the hostname is not set.
- Ensure NFSv4 updates the lease_time after a clientid update
Bugfixes and cleanup:
- Fix use-after-free in rpcrdma_post_recvs
- Fix a memory leak when nfs_match_client() is interrupted
- Fix buggy file access checking in NFSv4 open for execute
- disable unsupported client side deduplication
- Fix spurious client disconnections
- Fix occasional RDMA transport deadlock
- Various RDMA cleanups
- Various tracepoint fixes
- Fix the TCP callback channel to guarantee the server can actually
send the number of callback requests that was negotiated at mount
time"
* tag 'nfs-for-5.3-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (68 commits)
pnfs/flexfiles: Add tracepoints for detecting pnfs fallback to MDS
pnfs: Fix a problem where we gratuitously start doing I/O through the MDS
SUNRPC: Optimise transport balancing code
SUNRPC: Ensure the bvecs are reset when we re-encode the RPC request
pnfs/flexfiles: Fix PTR_ERR() dereferences in ff_layout_track_ds_error
NFSv4: Don't use the zero stateid with layoutget
SUNRPC: Fix up backchannel slot table accounting
SUNRPC: Fix initialisation of struct rpc_xprt_switch
SUNRPC: Skip zero-refcount transports
SUNRPC: Replace division by multiplication in calculation of queue length
NFSv4: Validate the stateid before applying it to state recovery
nfs4.0: Refetch lease_time after clientid update
nfs4: Rename nfs41_setup_state_renewal
nfs4: Make nfs4_proc_get_lease_time available for nfs4.0
nfs: Fix copy-and-paste error in debug message
NFS: Replace 16 seq_printf() calls by seq_puts()
NFS: Use seq_putc() in nfs_show_stats()
Revert "NFS: readdirplus optimization by cache mechanism" (memleak)
SUNRPC: Fix transport accounting when caller specifies an rpc_xprt
NFS: Record task, client ID, and XID in xdr_status trace points
...
|
|
|
|
bb74523167 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says: ==================== pull-request: bpf 2019-07-18 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) verifier precision propagation fix, from Andrii. 2) BTF size fix for typedefs, from Andrii. 3) a bunch of big endian fixes, from Ilya. 4) wide load from bpf_sock_addr fixes, from Stanislav. 5) a bunch of misc fixes from a number of developers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
54851aa90c |
ipv6: Unlink sibling route in case of failure
When a route needs to be appended to an existing multipath route,
fib6_add_rt2node() first appends it to the siblings list and increments
the number of sibling routes on each sibling.
Later, the function notifies the route via call_fib6_entry_notifiers().
In case the notification is vetoed, the route is not unlinked from the
siblings list, which can result in a use-after-free.
Fix this by unlinking the route from the siblings list before returning
an error.
Audited the rest of the call sites from which the FIB notification chain
is called and could not find more problems.
Fixes:
|
|
|
|
666a3d6e1e |
udp: Fix typo in net/ipv4/udp.c
Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
f554af280a |
SUNRPC: Optimise transport balancing code
Moves the balancing code to avoid doing cursor changes on every search iteration. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> |
|
|
|
7536908982 |
SUNRPC: Ensure the bvecs are reset when we re-encode the RPC request
The bvec tracks the list of pages, so if the number of pages changes
due to a re-encode, we need to reset the bvec as well.
Fixes:
|
|
|
|
d9b9c89304 |
Lots of exciting things this time!
- support for rbd object-map and fast-diff features (myself). This
will speed up reads, discards and things like snap diffs on sparse
images.
- ceph.snap.btime vxattr to expose snapshot creation time (David
Disseldorp). This will be used to integrate with "Restore Previous
Versions" feature added in Windows 7 for folks who reexport ceph
through SMB.
- security xattrs for ceph (Zheng Yan). Only selinux is supported
for now due to the limitations of ->dentry_init_security().
- support for MSG_ADDR2, FS_BTIME and FS_CHANGE_ATTR features (Jeff
Layton). This is actually a single feature bit which was missing
because of the filesystem pieces. With this in, the kernel client
will finally be reported as "luminous" by "ceph features" -- it is
still being reported as "jewel" even though all required Luminous
features were implemented in 4.13.
- stop NULL-terminating ceph vxattrs (Jeff Layton). The convention
with xattrs is to not terminate and this was causing inconsistencies
with ceph-fuse.
- change filesystem time granularity from 1 us to 1 ns, again fixing
an inconsistency with ceph-fuse (Luis Henriques).
On top of this there are some additional dentry name handling and cap
flushing fixes from Zheng. Finally, Jeff is formally taking over for
Zheng as the filesystem maintainer.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAl0u+X8THGlkcnlvbW92
QGdtYWlsLmNvbQAKCRBKf944AhHzi9byB/9fIxzoxtDMvixtJabuGSJRtlijDlWF
GlO6yIWCXl/8v8easR2PCF75U/xv0+QFQmze8PVi8u4Xz589P247NnEuyEZ9n84i
aCavARho6QLZPEL+B04NaqoHBl+ORKQTA6eKGhyKwRp/rn83z5Ubuw2tN7krHT3b
kCY61FuTQGxNY2o/WKv/iLwINYr7H23hCf0WwyyKH1bp7OegiQ14Ebn1NtfS3sMx
hS6h8Ya826vmUW0bCSS/9kzKYBCjksTig0HphUOHq6BoZJs++0b7GukIulRyuLfD
J9Gr9HGPoDCVzdmFfpn2FSlxdmqfO9amUSagd0ftLQfFlPlrpoULi0GW
=Bgxr
-----END PGP SIGNATURE-----
Merge tag 'ceph-for-5.3-rc1' of git://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"Lots of exciting things this time!
- support for rbd object-map and fast-diff features (myself). This
will speed up reads, discards and things like snap diffs on sparse
images.
- ceph.snap.btime vxattr to expose snapshot creation time (David
Disseldorp). This will be used to integrate with "Restore Previous
Versions" feature added in Windows 7 for folks who reexport ceph
through SMB.
- security xattrs for ceph (Zheng Yan). Only selinux is supported for
now due to the limitations of ->dentry_init_security().
- support for MSG_ADDR2, FS_BTIME and FS_CHANGE_ATTR features (Jeff
Layton). This is actually a single feature bit which was missing
because of the filesystem pieces. With this in, the kernel client
will finally be reported as "luminous" by "ceph features" -- it is
still being reported as "jewel" even though all required Luminous
features were implemented in 4.13.
- stop NULL-terminating ceph vxattrs (Jeff Layton). The convention
with xattrs is to not terminate and this was causing
inconsistencies with ceph-fuse.
- change filesystem time granularity from 1 us to 1 ns, again fixing
an inconsistency with ceph-fuse (Luis Henriques).
On top of this there are some additional dentry name handling and cap
flushing fixes from Zheng. Finally, Jeff is formally taking over for
Zheng as the filesystem maintainer"
* tag 'ceph-for-5.3-rc1' of git://github.com/ceph/ceph-client: (71 commits)
ceph: fix end offset in truncate_inode_pages_range call
ceph: use generic_delete_inode() for ->drop_inode
ceph: use ceph_evict_inode to cleanup inode's resource
ceph: initialize superblock s_time_gran to 1
MAINTAINERS: take over for Zheng as CephFS kernel client maintainer
rbd: setallochint only if object doesn't exist
rbd: support for object-map and fast-diff
rbd: call rbd_dev_mapping_set() from rbd_dev_image_probe()
libceph: export osd_req_op_data() macro
libceph: change ceph_osdc_call() to take page vector for response
libceph: bump CEPH_MSG_MAX_DATA_LEN (again)
rbd: new exclusive lock wait/wake code
rbd: quiescing lock should wait for image requests
rbd: lock should be quiesced on reacquire
rbd: introduce copyup state machine
rbd: rename rbd_obj_setup_*() to rbd_obj_init_*()
rbd: move OSD request allocation into object request state machines
rbd: factor out __rbd_osd_setup_discard_ops()
rbd: factor out rbd_osd_setup_copyup()
rbd: introduce obj_req->osd_reqs list
...
|
|
|
|
7402a4fedc |
SUNRPC: Fix up backchannel slot table accounting
Add a per-transport maximum limit in the socket case, and add helpers to allow the NFSv4 code to discover that limit. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> |
|
|
|
9f98effc19 |
SUNRPC: Fix initialisation of struct rpc_xprt_switch
Ensure that we do initialise the fields xps_nactive, xps_queuelen and xps_net. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> |
|
|
|
49d05fe2c9 |
ipv6: rt6_check should return NULL if 'from' is NULL
Paul reported that l2tp sessions were broken after the commit referenced
in the Fixes tag. Prior to this commit rt6_check returned NULL if the
rt6_info 'from' was NULL - ie., the dst_entry was disconnected from a FIB
entry. Restore that behavior.
Fixes:
|
|
|
|
866e5fd8a7 |
tipc: initialize 'validated' field of received packets
The tipc_msg_validate() function leaves a boolean flag 'validated' in the validated buffer's control block, to avoid performing this action more than once. However, at reception of new packets, the position of this field may already have been set by lower layer protocols, so that the packet is erroneously perceived as already validated by TIPC. We fix this by initializing the said field to 'false' before performing the initial validation. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
66f8209547 |
fib: relax source validation check for loopback packets
In a rare case where we redirect local packets from veth to lo, these packets fail to pass the source validation when rp_filter is turned on, as the tracing shows: <...>-311708 [040] ..s1 7951180.957825: fib_table_lookup: table 254 oif 0 iif 1 src 10.53.180.130 dst 10.53.180.130 tos 0 scope 0 flags 0 <...>-311708 [040] ..s1 7951180.957826: fib_table_lookup_nh: nexthop dev eth0 oif 4 src 10.53.180.130 So, the fib table lookup returns eth0 as the nexthop even though the packets are local and should be routed to loopback nonetheless, but they can't pass the dev match check in fib_info_nh_uses_dev() without this patch. It should be safe to relax this check for this special case, as normally packets coming out of loopback device still have skb_dst so they won't even hit this slow path. Cc: Julian Anastasov <ja@ssi.bg> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
3f05e6886a |
net_sched: unset TCQ_F_CAN_BYPASS when adding filters
For qdisc's that support TC filters and set TCQ_F_CAN_BYPASS,
notably fq_codel, it makes no sense to let packets bypass the TC
filters we setup in any scenario, otherwise our packets steering
policy could not be enforced.
This can be reproduced easily with the following script:
ip li add dev dummy0 type dummy
ifconfig dummy0 up
tc qd add dev dummy0 root fq_codel
tc filter add dev dummy0 parent 8001: protocol arp basic action mirred egress redirect dev lo
tc filter add dev dummy0 parent 8001: protocol ip basic action mirred egress redirect dev lo
ping -I dummy0 192.168.112.1
Without this patch, packets are sent directly to dummy0 without
hitting any of the filters. With this patch, packets are redirected
to loopback as expected.
This fix is not perfect, it only unsets the flag but does not set it back
because we have to save the information somewhere in the qdisc if we
really want that. Note, both fq_codel and sfq clear this flag in their
->bind_tcf() but this is clearly not sufficient when we don't use any
class ID.
Fixes:
|
|
|
|
aa4948937b |
net/rds: Initialize ic->i_fastreg_wrs upon allocation
Otherwise, if an IB connection is torn down before "rds_ib_setup_qp" is called, the value of "ic->i_fastreg_wrs" is still at zero (as it wasn't initialized by "rds_ib_setup_qp"). Consequently "rds_ib_conn_path_shutdown" will spin forever, waiting for it to go back to "RDS_IB_DEFAULT_FR_WR", which of course will never happen as there are no outstanding work requests. Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
3a2886cca7 |
net/rds: Keep track of and wait for FRWR segments in use upon shutdown
Since "rds_ib_free_frmr" and "rds_ib_free_frmr_list" simply put the FRMR memory segments on the "drop_list" or "free_list", and it is the job of "rds_ib_flush_mr_pool" to reap those entries by ultimately issuing a "IB_WR_LOCAL_INV" work-request, we need to trigger and then wait for all those memory segments attached to a particular connection to be fully released before we can move on to release the QP, CQ, etc. So we make "rds_ib_conn_path_shutdown" wait for one more atomic_t called "i_fastreg_inuse_count" that keeps track of how many FRWR memory segments are out there marked "FRMR_IS_INUSE" (and also wake_up rds_ib_ring_empty_wait, as they go away). Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
9547dff108 |
net/rds: Set fr_state only to FRMR_IS_FREE if IB_WR_LOCAL_INV had been successful
Fix a bug where fr_state first goes to FRMR_IS_STALE, because of a failure of operation IB_WR_LOCAL_INV, but then gets set back to "FRMR_IS_FREE" uncoditionally, even though the operation failed. Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
aea01a2234 |
net/rds: Fix NULL/ERR_PTR inconsistency
Make function "rds_ib_try_reuse_ibmr" return NULL in case memory region could not be allocated, since callers simply check if the return value is not NULL. Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
5f33141d2f |
net/rds: Wait for the FRMR_IS_FREE (or FRMR_IS_STALE) transition after posting IB_WR_LOCAL_INV
In order to:
1) avoid a silly bouncing between "clean_list" and "drop_list"
triggered by function "rds_ib_reg_frmr" as it is releases frmr
regions whose state is not "FRMR_IS_FREE" right away.
2) prevent an invalid access error in a race from a pending
"IB_WR_LOCAL_INV" operation with a teardown ("dma_unmap_sg", "put_page")
and de-registration ("ib_dereg_mr") of the corresponding
memory region.
Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
c9467447fc |
net/rds: Get rid of "wait_clean_list_grace" and add locking
Waiting for activity on the "clean_list" to quiesce is no substitute for proper locking. We can have multiple threads competing for "llist_del_first" via "rds_ib_reuse_mr", and a single thread competing for "llist_del_all" and "llist_del_first" via "rds_ib_flush_mr_pool". Since "llist_del_first" depends on "list->first->next" not to change in the midst of the operation, simply waiting for all current calls to "rds_ib_reuse_mr" to quiesce across all CPUs is woefully inadequate: By the time "wait_clean_list_grace" is done iterating over all CPUs to see that there is no concurrent caller to "rds_ib_reuse_mr", a new caller may have just shown up on the first CPU. Furthermore, <linux/llist.h> explicitly calls out the need for locking: * Cases where locking is needed: * If we have multiple consumers with llist_del_first used in one consumer, * and llist_del_first or llist_del_all used in other consumers, * then a lock is needed. Also, while at it, drop the unused "pool" parameter from "list_to_llist_nodes". Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
2c7da8e6b0 |
net/rds: Give fr_state a chance to transition to FRMR_IS_FREE
In the context of FRMR (ib_frmr.c): Memory regions make it onto the "clean_list" via "rds_ib_flush_mr_pool", after the memory region has been posted for invalidation via "rds_ib_post_inv". At that point in time, "fr_state" may still be in state "FRMR_IS_INUSE", since the only place where "fr_state" transitions to "FRMR_IS_FREE" is in "rds_ib_mr_cqe_handler", which is triggered by a tasklet. So in case we notice that "fr_state != FRMR_IS_FREE" (see below), we wait for "fr_inv_done" to trigger with a maximum of 10msec. Then we check again, and only put the memory region onto the drop_list (via "rds_ib_free_frmr") in case the situation remains unchanged. This avoids the problem of memory-regions bouncing between "clean_list" and "drop_list" before they even have a chance to be properly invalidated. Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
f11fe1dae1 |
net/sched: Make NET_ACT_CT depends on NF_NAT
If NF_NAT is m and NET_ACT_CT is y, build fails:
net/sched/act_ct.o: In function `tcf_ct_act':
act_ct.c:(.text+0x21ac): undefined reference to `nf_ct_nat_ext_add'
act_ct.c:(.text+0x229a): undefined reference to `nf_nat_icmp_reply_translation'
act_ct.c:(.text+0x233a): undefined reference to `nf_nat_setup_info'
act_ct.c:(.text+0x234a): undefined reference to `nf_nat_alloc_null_binding'
act_ct.c:(.text+0x237c): undefined reference to `nf_nat_packet'
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes:
|
|
|
|
86fda90ab5 |
net: sctp: fix warning "NULL check before some freeing functions is not needed"
This patch removes NULL checks before calling kfree. fixes below issues reported by coccicheck net/sctp/sm_make_chunk.c:2586:3-8: WARNING: NULL check before some freeing functions is not needed. net/sctp/sm_make_chunk.c:2652:3-8: WARNING: NULL check before some freeing functions is not needed. net/sctp/sm_make_chunk.c:2667:3-8: WARNING: NULL check before some freeing functions is not needed. net/sctp/sm_make_chunk.c:2684:3-8: WARNING: NULL check before some freeing functions is not needed. Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
a5b647007e |
fix: taprio: Change type of txtime-delay parameter to u32
During the review of the iproute2 patches for txtime-assist mode, it was
pointed out that it does not make sense for the txtime-delay parameter to
be negative. So, change the type of the parameter from s32 to u32.
Fixes:
|
|
|
|
db8051f30f |
skbuff: fix compilation warnings in skb_dump()
The commit |
|
|
|
163f88211c |
SUNRPC: Skip zero-refcount transports
When looking for the next transport to use for an RPC call, skip those that are in the process of being destroyed and that have a zero refcount. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> |
|
|
|
3cf7292280 |
SUNRPC: Replace division by multiplication in calculation of queue length
When checking whether or not a particular xprt queue length is shorter than the average queue length for all xprts, prefer to use multiplication rather than division for performance reasons. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> |
|
|
|
fb4da215ed |
pci-v5.3-changes
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl0siFoUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vzi9A//S4jRyyZrgUr88Az0GbgMhE4b3yqc
uL7om/Sf+443gG6C+aKkZSM/IE9hrbyIKuYq7GGxDkzZ/HkucZo2yIuAHkPgG4ik
QQYJ8fJsmMq1bUht87c1ZZwGP0++Deq/Ns2+VNy/WBYqKLulnV0DvEEaJgPs9C5D
ppwccGdo6UghiujBTpE4ddUBjFjjURWqT6wSnMRDQ4EGwfUhG0MWwwHKI4hbBuaL
N6refuggdYyUUX5FeUOHa6VF6uTnSSAQ75k+40n4nljdayqoumHLskst77o9q5ZI
oXjdpwgmuEqYhfp03HEA4Xo/bBxiRj76NuTiEMKvPokxjpanwbLrdV0GhF0OIlM0
rp1NOI1w+vppFrU+rc2gtq+7hYXFmvdhjS29hFLeD91PP36N5d29jW5NVFpm7GCm
n4TMGAOsu8RB+bNua6ZbZVcDk2EnPgQeIcM0ZPoBtPK19Fg/rScdEU4u/aFE1Y0Q
C+Ks7D1qCvFpHzl/xAg0oo9v/jFsWef3qnQWOzot964Zz4W4NSVvB9Ox6Vbfj6C4
v331LJmlPxG8fxBNA3q28FrTxcG1NW6sgo3WY9VoSp/vc0aqaPKhm7sbraTt5IrI
TwqA/WhnAHv90MQCGFcofANyYTkjPkKk2QBFK6b0suoAmVdwVWWELi1WaZ+HdvgQ
JP7YpmC2cXcQBPk=
=ZGxL
-----END PGP SIGNATURE-----
Merge tag 'pci-v5.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Enumeration changes:
- Evaluate PCI Boot Configuration _DSM to learn if firmware wants us
to preserve its resource assignments (Benjamin Herrenschmidt)
- Simplify resource distribution (Nicholas Johnson)
- Decode 32 GT/s link speed (Gustavo Pimentel)
Virtualization:
- Fix incorrect caching of VF config space size (Alex Williamson)
- Fix VF driver probing sysfs knobs (Alex Williamson)
Peer-to-peer DMA:
- Fix dma_virt_ops check (Logan Gunthorpe)
Altera host bridge driver:
- Allow building as module (Ley Foon Tan)
Armada 8K host bridge driver:
- add PHYs support (Miquel Raynal)
DesignWare host bridge driver:
- Export APIs to support removable loadable module (Vidya Sagar)
- Enable Relaxed Ordering erratum workaround only on Tegra20 &
Tegra30 (Vidya Sagar)
Hyper-V host bridge driver:
- Fix use-after-free in eject (Dexuan Cui)
Mobiveil host bridge driver:
- Clean up and fix many issues, including non-identify mapped
windows, 64-bit windows, multi-MSI, class code, INTx clearing (Hou
Zhiqiang)
Qualcomm host bridge driver:
- Use clk bulk API for 2.4.0 controllers (Bjorn Andersson)
- Add QCS404 support (Bjorn Andersson)
- Assert PERST for at least 100ms (Niklas Cassel)
R-Car host bridge driver:
- Add r8a774a1 DT support (Biju Das)
Tegra host bridge driver:
- Add support for Gen2, opportunistic UpdateFC and ACK (PCIe protocol
details) AER, GPIO-based PERST# (Manikanta Maddireddy)
- Fix many issues, including power-on failure cases, interrupt
masking in suspend, UPHY settings, AFI dynamic clock gating,
pending DLL transactions (Manikanta Maddireddy)
Xilinx host bridge driver:
- Fix NWL Multi-MSI programming (Bharat Kumar Gogada)
Endpoint support:
- Fix 64bit BAR support (Alan Mikhak)
- Fix pcitest build issues (Alan Mikhak, Andy Shevchenko)
Bug fixes:
- Fix NVIDIA GPU multi-function power dependencies (Abhishek Sahu)
- Fix NVIDIA GPU HDA enablement issue (Lukas Wunner)
- Ignore lockdep for sysfs "remove" (Marek Vasut)
Misc:
- Convert docs to reST (Changbin Du, Mauro Carvalho Chehab)"
* tag 'pci-v5.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (107 commits)
PCI: Enable NVIDIA HDA controllers
tools: PCI: Fix installation when `make tools/pci_install`
PCI: dwc: pci-dra7xx: Fix compilation when !CONFIG_GPIOLIB
PCI: Fix typos and whitespace errors
PCI: mobiveil: Fix INTx interrupt clearing in mobiveil_pcie_isr()
PCI: mobiveil: Fix infinite-loop in the INTx handling function
PCI: mobiveil: Move PCIe PIO enablement out of inbound window routine
PCI: mobiveil: Add upper 32-bit PCI base address setup in inbound window
PCI: mobiveil: Add upper 32-bit CPU base address setup in outbound window
PCI: mobiveil: Mask out hardcoded bits in inbound/outbound windows setup
PCI: mobiveil: Clear the control fields before updating it
PCI: mobiveil: Add configured inbound windows counter
PCI: mobiveil: Fix the valid check for inbound and outbound windows
PCI: mobiveil: Clean-up program_{ib/ob}_windows()
PCI: mobiveil: Remove an unnecessary return value check
PCI: mobiveil: Fix error return values
PCI: mobiveil: Refactor the MEM/IO outbound window initialization
PCI: mobiveil: Make some register updates more readable
PCI: mobiveil: Reformat the code for readability
dt-bindings: PCI: mobiveil: Change gpio_slave and apb_csr to optional
...
|
|
|
|
2a3c389a0f |
5.3 Merge window RDMA pull request
A smaller cycle this time. Notably we see another new driver, 'Soft
iWarp', and the deletion of an ancient unused driver for nes.
- Revise and simplify the signature offload RDMA MR APIs
- More progress on hoisting object allocation boiler plate code out of the
drivers
- Driver bug fixes and revisions for hns, hfi1, efa, cxgb4, qib, i40iw
- Tree wide cleanups: struct_size, put_user_page, xarray, rst doc conversion
- Removal of obsolete ib_ucm chardev and nes driver
- netlink based discovery of chardevs and autoloading of the modules
providing them
- Move more of the rdamvt/hfi1 uapi to include/uapi/rdma
- New driver 'siw' for software based iWarp running on top of netdev,
much like rxe's software RoCE.
- mlx5 feature to report events in their raw devx format to userspace
- Expose per-object counters through rdma tool
- Adaptive interrupt moderation for RDMA (DIM), sharing the DIM core
from netdev
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl0ozSwACgkQOG33FX4g
mxqncg//Qe2zSnlbd6r3hofsc1WiHSx/CiXtT52BUGipO+cWQUwO7hGFuUHIFCuZ
JBg7mc998xkyLIH85a/txd+RwAIApKgHVdd+VlrmybZeYCiERAMFpWg8cHpzrbnw
l3Ln9fTtJf/NAhO0ZCGV9DCd01fs9yVQgAv21UnLJMUhp9Pzk/iMhu7C7IiSLKvz
t7iFhEqPXNJdoqZ+wtWyc/463YxKUd9XNg9Z1neQdaeZrX4UjgDbY9x/ub3zOvQV
jc/IL4GysJ3z8mfx5mAd6sE/jAjhcnJuaGYYATqkxiLZEP+muYwU50CNs951XhJC
b/EfRQIcLg9kq/u6CP+CuWlMrRWy3U7yj3/mrbbGhlGq88Yt6FGqUf0aFy6TYMaO
RzTG5ZR+0AmsOrR1QU+DbH9CKX5PGZko6E7UCdjROqUlAUOjNwRr99O5mYrZoM9E
PdN2vtdWY9COR3Q+7APdhWIA/MdN2vjr3LDsR3H94tru1yi6dB/BPDRcJieozaxn
2T+YrZbV+9/YgrccpPQCilaQdanXKpkmbYkbEzVLPcOEV/lT9odFDt3eK+6duVDL
ufu8fs1xapMDHKkcwo5jeNZcoSJymAvHmGfZlo2PPOmh802Ul60bvYKwfheVkhHF
Eee5/ovCMs1NLqFiq7Zq5mXO0fR0BHyg9VVjJBZm2JtazyuhoHQ=
=iWcG
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"A smaller cycle this time. Notably we see another new driver, 'Soft
iWarp', and the deletion of an ancient unused driver for nes.
- Revise and simplify the signature offload RDMA MR APIs
- More progress on hoisting object allocation boiler plate code out
of the drivers
- Driver bug fixes and revisions for hns, hfi1, efa, cxgb4, qib,
i40iw
- Tree wide cleanups: struct_size, put_user_page, xarray, rst doc
conversion
- Removal of obsolete ib_ucm chardev and nes driver
- netlink based discovery of chardevs and autoloading of the modules
providing them
- Move more of the rdamvt/hfi1 uapi to include/uapi/rdma
- New driver 'siw' for software based iWarp running on top of netdev,
much like rxe's software RoCE.
- mlx5 feature to report events in their raw devx format to userspace
- Expose per-object counters through rdma tool
- Adaptive interrupt moderation for RDMA (DIM), sharing the DIM core
from netdev"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (194 commits)
RMDA/siw: Require a 64 bit arch
RDMA/siw: Mark expected switch fall-throughs
RDMA/core: Fix -Wunused-const-variable warnings
rdma/siw: Remove set but not used variable 's'
rdma/siw: Add missing dependencies on LIBCRC32C and DMA_VIRT_OPS
RDMA/siw: Add missing rtnl_lock around access to ifa
rdma/siw: Use proper enumerated type in map_cqe_status
RDMA/siw: Remove unnecessary kthread create/destroy printouts
IB/rdmavt: Fix variable shadowing issue in rvt_create_cq
RDMA/core: Fix race when resolving IP address
RDMA/core: Make rdma_counter.h compile stand alone
IB/core: Work on the caller socket net namespace in nldev_newlink()
RDMA/rxe: Fill in wc byte_len with IB_WC_RECV_RDMA_WITH_IMM
RDMA/mlx5: Set RDMA DIM to be enabled by default
RDMA/nldev: Added configuration of RDMA dynamic interrupt moderation to netlink
RDMA/core: Provide RDMA DIM support for ULPs
linux/dim: Implement RDMA adaptive moderation (DIM)
IB/mlx5: Report correctly tag matching rendezvous capability
docs: infiniband: add it to the driver-api bookset
IB/mlx5: Implement VHCA tunnel mechanism in DEVX
...
|
|
|
|
d4ecfeb154 |
bpf: allow wide aligned loads for bpf_sock_addr user_ip6 and msg_src_ip6
Add explicit check for u64 loads of user_ip6 and msg_src_ip6 and update the comment. Cc: Yonghong Song <yhs@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
|
|
|
b43995469e |
bpf: rename bpf_ctx_wide_store_ok to bpf_ctx_wide_access_ok
Rename bpf_ctx_wide_store_ok to bpf_ctx_wide_access_ok to indicate that it can be used for both loads and stores. Cc: Yonghong Song <yhs@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> |
|
|
|
071c37983d |
net: neigh: fix multiple neigh timer scheduling
Neigh timer can be scheduled multiple times from userspace adding
multiple neigh entries and forcing the neigh timer scheduling passing
NTF_USE in the netlink requests.
This will result in a refcount leak and in the following dump stack:
[ 32.465295] NEIGH: BUG, double timer add, state is 8
[ 32.465308] CPU: 0 PID: 416 Comm: double_timer_ad Not tainted 5.2.0+ #65
[ 32.465311] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-2.fc30 04/01/2014
[ 32.465313] Call Trace:
[ 32.465318] dump_stack+0x7c/0xc0
[ 32.465323] __neigh_event_send+0x20c/0x880
[ 32.465326] ? ___neigh_create+0x846/0xfb0
[ 32.465329] ? neigh_lookup+0x2a9/0x410
[ 32.465332] ? neightbl_fill_info.constprop.0+0x800/0x800
[ 32.465334] neigh_add+0x4f8/0x5e0
[ 32.465337] ? neigh_xmit+0x620/0x620
[ 32.465341] ? find_held_lock+0x85/0xa0
[ 32.465345] rtnetlink_rcv_msg+0x204/0x570
[ 32.465348] ? rtnl_dellink+0x450/0x450
[ 32.465351] ? mark_held_locks+0x90/0x90
[ 32.465354] ? match_held_lock+0x1b/0x230
[ 32.465357] netlink_rcv_skb+0xc4/0x1d0
[ 32.465360] ? rtnl_dellink+0x450/0x450
[ 32.465363] ? netlink_ack+0x420/0x420
[ 32.465366] ? netlink_deliver_tap+0x115/0x560
[ 32.465369] ? __alloc_skb+0xc9/0x2f0
[ 32.465372] netlink_unicast+0x270/0x330
[ 32.465375] ? netlink_attachskb+0x2f0/0x2f0
[ 32.465378] netlink_sendmsg+0x34f/0x5a0
[ 32.465381] ? netlink_unicast+0x330/0x330
[ 32.465385] ? move_addr_to_kernel.part.0+0x20/0x20
[ 32.465388] ? netlink_unicast+0x330/0x330
[ 32.465391] sock_sendmsg+0x91/0xa0
[ 32.465394] ___sys_sendmsg+0x407/0x480
[ 32.465397] ? copy_msghdr_from_user+0x200/0x200
[ 32.465401] ? _raw_spin_unlock_irqrestore+0x37/0x40
[ 32.465404] ? lockdep_hardirqs_on+0x17d/0x250
[ 32.465407] ? __wake_up_common_lock+0xcb/0x110
[ 32.465410] ? __wake_up_common+0x230/0x230
[ 32.465413] ? netlink_bind+0x3e1/0x490
[ 32.465416] ? netlink_setsockopt+0x540/0x540
[ 32.465420] ? __fget_light+0x9c/0xf0
[ 32.465423] ? sockfd_lookup_light+0x8c/0xb0
[ 32.465426] __sys_sendmsg+0xa5/0x110
[ 32.465429] ? __ia32_sys_shutdown+0x30/0x30
[ 32.465432] ? __fd_install+0xe1/0x2c0
[ 32.465435] ? lockdep_hardirqs_off+0xb5/0x100
[ 32.465438] ? mark_held_locks+0x24/0x90
[ 32.465441] ? do_syscall_64+0xf/0x270
[ 32.465444] do_syscall_64+0x63/0x270
[ 32.465448] entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fix the issue unscheduling neigh_timer if selected entry is in 'IN_TIMER'
receiving a netlink request with NTF_USE flag set
Reported-by: Marek Majkowski <marek@cloudflare.com>
Fixes:
|
|
|
|
6e3d1bbbba |
sit: use dst_cache in ipip6_tunnel_xmit
Same as other ip tunnel, use dst_cache in xmit action to avoid unnecessary fib lookups. Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
d12109291c |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix excessive stack usage in cxgb4, from Arnd Bergmann.
2) Missing skb queue lock init in tipc, from Chris Packham.
3) Fix some regressions in ipv6 flow label handling, from Eric Dumazet.
4) Elide flow dissection of local packets in FIB rules, from Petar
Penkov.
5) Fix TLS support build failure in mlx5, from Tariq Toukab.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
ppp: mppe: Revert "ppp: mppe: Add softdep to arc4"
net: dsa: qca8k: replace legacy gpio include
net: hisilicon: Use devm_platform_ioremap_resource
cxgb4: reduce kernel stack usage in cudbg_collect_mem_region()
tipc: ensure head->lock is initialised
tc-tests: updated skbedit tests
nfp: flower: ensure ip protocol is specified for L4 matches
nfp: flower: fix ethernet check on match fields
net/mlx5e: Provide cb_list pointer when setting up tc block on rep
net: phy: make exported variables non-static
net: sched: Fix NULL-pointer dereference in tc_indr_block_ing_cmd()
davinci_cpdma: don't cast dma_addr_t to pointer
net: openvswitch: do not update max_headroom if new headroom is equal to old headroom
net/mlx5e: Convert single case statement switch statements into if statements
net/mlx5: E-Switch, Reduce ingress acl modify metadata stack usage
net/mlx5e: Fix unused variable warning when CONFIG_MLX5_ESWITCH is off
net/mlx5e: Fix compilation error in TLS code
ipv6: fix static key imbalance in fl_create()
ipv6: fix potential crash in ip6_datagram_dst_update()
ipv6: tcp: fix flowlabels reflection for RST packets
...
|
|
|
|
a2d79c7174 |
for-5.3/io_uring-20190711
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl0nUl4QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgplCaEACa7c7ybRgV6CZpS9PXXVqBJqYILRLBXsnn
MXomrSdjKMs0Y928RAQHFNWh3HaHHtdvSmFvwOpfF5lyrYXVfc9MQ7brqarDp1t2
f4jvkL63BVG2Zs/VL8QVAz+CwtCF39hduzUR/Y9j/4+rJNhSBMNJLY0nlB5weCTy
MJetnLoQ9ETA2+xu49vAM/PFJgBynNUAyUer918y8QysJRj90/VnhieQmrVb4tpG
Q4yZFKq4YPDs0tLEX4Nj6eJERcyW/4MC2oZ0aPXU4g2Dc3SVWaSNOo5WpkP+crGt
0dbyLmhomteE6+Kaco1hAWIkG/RuvgiMzDizryi0enXP51edV3Vnwyg3MQSUhcnf
Pn7vrDkajKBE9rFGlLy8V4gkKdS8XJQy2xA1MWm3aWgGl4v0j64EXIe0IhIK30vU
25A9jLDcdgr74+Lw+vWLLd+oeGD0iFf6wiEp+3jzEdtfVNE/lD6yilTzbdz2V0UK
8T1sRLMEkaG7CbxOVc1UAfcvObjuqQihEI0fQvl4yxV178h8mtWB87YmV2S2EhzP
v6FSxiC1yZ7J+rwb/Mff7+1GoOgzrpS/zESk2WMTgcwVdiwFfv5eIC26ZNWObJ/x
IY+4xRgTf2dEsjBeumOuBzxTfzrZb+pTO4GCa4O+t0UDQRIwl0y20pTXKtxU3y/U
gKPXEjgXrQ==
=jDiB
-----END PGP SIGNATURE-----
Merge tag 'for-5.3/io_uring-20190711' of git://git.kernel.dk/linux-block
Pull io_uring updates from Jens Axboe:
"This contains:
- Support for recvmsg/sendmsg as first class opcodes.
I don't envision going much further down this path, as there are
plans in progress to support potentially any system call in an
async fashion through io_uring. But I think it does make sense to
have certain core ops available directly, especially those that can
support a "try this non-blocking" flag/mode. (me)
- Handle generic short reads automatically.
This can happen fairly easily if parts of the buffered read is
cached. Since the application needs to issue another request for
the remainder, just do this internally and save kernel/user
roundtrip while providing a nicer more robust API. (me)
- Support for linked SQEs.
This allows SQEs to depend on each other, enabling an application
to eg queue a read-from-this-file,write-to-that-file pair. (me)
- Fix race in stopping SQ thread (Jackie)"
* tag 'for-5.3/io_uring-20190711' of git://git.kernel.dk/linux-block:
io_uring: fix io_sq_thread_stop running in front of io_sq_thread
io_uring: add support for recvmsg()
io_uring: add support for sendmsg()
io_uring: add support for sqe links
io_uring: punt short reads to async context
uio: make import_iovec()/compat_import_iovec() return bytes on success
|
|
|
|
23bbbf5c1f |
9p pull request for inclusion in 5.13
Two small fixes to properly cleanup the 9p transports list if virtio/xen module initialization fail. 9p might otherwise try to access memory from a module that failed to register got freed. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAl0oKGcACgkQq06b7GqY 5nAfgw//TQNozxgNjwTndpDtVP8R992DcHeTf/S6X7jIu2HArD8tJe516fmkR3T/ ZNMMEiRQhhOjy1h3I2Z/tG0uo90Y6Q4CjWFQKC3pWVILBHb2rWpkgmDtS6jI6KZM ogScIkLwgLucfYmS/ST5JB1zS1c/DCbCWKTpwaDDt3WFdIVd2gfITo7BB0mC16an OjZLpQahSKpyTz2IFh3FWWTO7PjPcuXUQMmlOfcxT8udNwMiVWs39hu186V2kAhx mohAooN0lfHGfQnfu4Kqpy0WSbsMjjLdk0W9jMWIEQ4svv88hbVn8+q5cnW/MWcP w9C3rXsutj3E+9ShEnKiq42MFJ0g9Tis8wcqa90soVHteIK/VD5vJrrIA9RmyAer OPnED8qJzJtyyD4BDgQnlH6nepHDPD0ar2287eRrLb8CHFrKhzh4D1Lt2WP4+mHM o7QZjDsgY24SJeYy7UZTFRN4k36av1PyGKtSBpTWnMabRiWP9CKRIRAM5P+Wu++h T70p5CS8XcBEisrj3w4ejR/fF1geAvXZvBM8+1dBWrmdSk2q6wvALPIaBKBE+Gxn rM2kJHVN5QTDoSXsUPoGb2E+Ab8rkT6oHm3+2z91YKeVT8xFGJneH9+6miv6rp1/ 5Fv/cjxK91vPwxFWxNqE42LI+xOqoRkl1/AXWFY0wsRevXNe/kk= =yAcL -----END PGP SIGNATURE----- Merge tag '9p-for-5.3' of git://github.com/martinetd/linux Pull 9p updates from Dominique Martinet: "Two small fixes to properly cleanup the 9p transports list if virtio/xen module initialization fail. 9p might otherwise try to access memory from a module that failed to register got freed" * tag '9p-for-5.3' of git://github.com/martinetd/linux: 9p/xen: Add cleanup path in p9_trans_xen_init 9p/virtio: Add cleanup path in p9_virtio_init |
|
|
|
39ceda5ce1 |
Kbuild updates for v5.3
- remove headers_{install,check}_all targets
- remove unreasonable 'depends on !UML' from CONFIG_SAMPLES
- re-implement 'make headers_install' more cleanly
- add new header-test-y syntax to compile-test headers
- compile-test exported headers to ensure they are compilable in
user-space
- compile-test headers under include/ to ensure they are self-contained
- remove -Waggregate-return, -Wno-uninitialized, -Wno-unused-value flags
- add -Werror=unknown-warning-option for Clang
- add 128-bit built-in types support to genksyms
- fix missed rebuild of modules.builtin
- propagate 'No space left on device' error in fixdep to Make
- allow Clang to use its integrated assembler
- improve some coccinelle scripts
- add a new flag KBUILD_ABS_SRCTREE to request Kbuild to use absolute
path for $(srctree).
- do not ignore errors when compression utility is missing
- misc cleanups
-----BEGIN PGP SIGNATURE-----
iQJSBAABCgA8FiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl0oxNkeHHlhbWFkYS5t
YXNhaGlyb0Bzb2Npb25leHQuY29tAAoJED2LAQed4NsGnhcP/AuM8s+3SYFiLitJ
ISbznLFP2Xatq0SPXp5+moez/AMTK6Mm1biPcdo20d+TjVEh4+9F2nq12Ii9U8/D
tds9A6G8+Bb28r9GMIVQPdFohijW6ijtDziS31iQnIWyPsP/yx6PKfLAD9F4ca1x
7/4btmu+BOMjtN0NrMWSNz5MM47xUzoWIALL40SV4PzGVXLCQZ2PBNPeSRIk22Jt
ynDNPuNsmDWcFfwAE+sLSDrhCHZlwM8rg8rf6jmYdc4LcN4cj0oho5+K1TRyC9mn
fO3PT25juFejthxQulxEfyGggnyLM6BNTgPDGcCHSP4nD7mlXA9GcpZICtJOgGGu
SlDadMZ0GRMK5zcZ0MF0GQboeyViwsbXgrRcYuXt6cUFWX4P/1SeAQ5Mf4u1EKqf
hEbwFXV/g81ht0lFS8gyWkvdpoNPtxGHNPusLjp65C4rc0/48/s+7EE/u8JTPl1g
dQTeIOds6XUOkJgqhEfuq+8gfngbjKc9bYhs+ACbkCzBltQdnb6m5aLgk0ODxe8I
WbGn0+cQcS9VVwre7E5DnFSVWVOHAG5taiUwj0KDcHB0Jxw9Gvorq9WU1ppHHYH2
XQIFBx7XHdn28d+plS8R23vAPgDgrGdvE5RYK5tNQLhTJ6BbjlZ1n/Tmxzu62scK
deG3aCOB13Om7OTzTUh9+C3TC9ZQ
=E2Rz
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- remove headers_{install,check}_all targets
- remove unreasonable 'depends on !UML' from CONFIG_SAMPLES
- re-implement 'make headers_install' more cleanly
- add new header-test-y syntax to compile-test headers
- compile-test exported headers to ensure they are compilable in
user-space
- compile-test headers under include/ to ensure they are self-contained
- remove -Waggregate-return, -Wno-uninitialized, -Wno-unused-value
flags
- add -Werror=unknown-warning-option for Clang
- add 128-bit built-in types support to genksyms
- fix missed rebuild of modules.builtin
- propagate 'No space left on device' error in fixdep to Make
- allow Clang to use its integrated assembler
- improve some coccinelle scripts
- add a new flag KBUILD_ABS_SRCTREE to request Kbuild to use absolute
path for $(srctree).
- do not ignore errors when compression utility is missing
- misc cleanups
* tag 'kbuild-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (49 commits)
kbuild: use -- separater intead of $(filter-out ...) for cc-cross-prefix
kbuild: Inform user to pass ARCH= for make mrproper
kbuild: fix compression errors getting ignored
kbuild: add a flag to force absolute path for srctree
kbuild: replace KBUILD_SRCTREE with boolean building_out_of_srctree
kbuild: remove src and obj from the top Makefile
scripts/tags.sh: remove unused environment variables from comments
scripts/tags.sh: drop SUBARCH support for ARM
kbuild: compile-test kernel headers to ensure they are self-contained
kheaders: include only headers into kheaders_data.tar.xz
kheaders: remove meaningless -R option of 'ls'
kbuild: support header-test-pattern-y
kbuild: do not create wrappers for header-test-y
kbuild: compile-test exported headers to ensure they are self-contained
init/Kconfig: add CONFIG_CC_CAN_LINK
kallsyms: exclude kasan local symbols on s390
kbuild: add more hints about SUBDIRS replacement
coccinelle: api/stream_open: treat all wait_.*() calls as blocking
coccinelle: put_device: Add a cast to an expression for an assignment
coccinelle: put_device: Adjust a message construction
...
|
|
|
|
d12cffe932 |
tipc: ensure head->lock is initialised
tipc_named_node_up() creates a skb list. It passes the list to tipc_node_xmit() which has some code paths that can call skb_queue_purge() which relies on the list->lock being initialised. The spin_lock is only needed if the messages end up on the receive path but when the list is created in tipc_named_node_up() we don't necessarily know if it is going to end up there. Once all the skb list users are updated in tipc it will then be possible to update them to use the unlocked variants of the skb list functions and initialise the lock when we know the message will follow the receive path. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
c1a970d06f |
net: sched: Fix NULL-pointer dereference in tc_indr_block_ing_cmd()
After recent refactoring of block offlads infrastructure, indr_dev->block
pointer is dereferenced before it is verified to be non-NULL. Example stack
trace where this behavior leads to NULL-pointer dereference error when
creating vxlan dev on system with mlx5 NIC with offloads enabled:
[ 1157.852938] ==================================================================
[ 1157.866877] BUG: KASAN: null-ptr-deref in tc_indr_block_ing_cmd.isra.41+0x9c/0x160
[ 1157.880877] Read of size 4 at addr 0000000000000090 by task ip/3829
[ 1157.901637] CPU: 22 PID: 3829 Comm: ip Not tainted 5.2.0-rc6+ #488
[ 1157.914438] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 1157.929031] Call Trace:
[ 1157.938318] dump_stack+0x9a/0xeb
[ 1157.948362] ? tc_indr_block_ing_cmd.isra.41+0x9c/0x160
[ 1157.960262] ? tc_indr_block_ing_cmd.isra.41+0x9c/0x160
[ 1157.972082] __kasan_report+0x176/0x192
[ 1157.982513] ? tc_indr_block_ing_cmd.isra.41+0x9c/0x160
[ 1157.994348] kasan_report+0xe/0x20
[ 1158.004324] tc_indr_block_ing_cmd.isra.41+0x9c/0x160
[ 1158.015950] ? tcf_block_setup+0x430/0x430
[ 1158.026558] ? kasan_unpoison_shadow+0x30/0x40
[ 1158.037464] __tc_indr_block_cb_register+0x5f5/0xf20
[ 1158.049288] ? mlx5e_rep_indr_tc_block_unbind+0xa0/0xa0 [mlx5_core]
[ 1158.062344] ? tc_indr_block_dev_put.part.47+0x5c0/0x5c0
[ 1158.074498] ? rdma_roce_rescan_device+0x20/0x20 [ib_core]
[ 1158.086580] ? br_device_event+0x98/0x480 [bridge]
[ 1158.097870] ? strcmp+0x30/0x50
[ 1158.107578] mlx5e_nic_rep_netdevice_event+0xdd/0x180 [mlx5_core]
[ 1158.120212] notifier_call_chain+0x6d/0xa0
[ 1158.130753] register_netdevice+0x6fc/0x7e0
[ 1158.141322] ? netdev_change_features+0xa0/0xa0
[ 1158.152218] ? vxlan_config_apply+0x210/0x310 [vxlan]
[ 1158.163593] __vxlan_dev_create+0x2ad/0x520 [vxlan]
[ 1158.174770] ? vxlan_changelink+0x490/0x490 [vxlan]
[ 1158.185870] ? rcu_read_unlock+0x60/0x60 [vxlan]
[ 1158.196798] vxlan_newlink+0x99/0xf0 [vxlan]
[ 1158.207303] ? __vxlan_dev_create+0x520/0x520 [vxlan]
[ 1158.218601] ? rtnl_create_link+0x3d0/0x450
[ 1158.228900] __rtnl_newlink+0x8a7/0xb00
[ 1158.238701] ? stack_access_ok+0x35/0x80
[ 1158.248450] ? rtnl_link_unregister+0x1a0/0x1a0
[ 1158.258735] ? find_held_lock+0x6d/0xd0
[ 1158.268379] ? is_bpf_text_address+0x67/0xf0
[ 1158.278330] ? lock_acquire+0xc1/0x1f0
[ 1158.287686] ? is_bpf_text_address+0x5/0xf0
[ 1158.297449] ? is_bpf_text_address+0x86/0xf0
[ 1158.307310] ? kernel_text_address+0xec/0x100
[ 1158.317155] ? arch_stack_walk+0x92/0xe0
[ 1158.326497] ? __kernel_text_address+0xe/0x30
[ 1158.336213] ? unwind_get_return_address+0x2f/0x50
[ 1158.346267] ? create_prof_cpu_mask+0x20/0x20
[ 1158.355936] ? arch_stack_walk+0x92/0xe0
[ 1158.365117] ? stack_trace_save+0x8a/0xb0
[ 1158.374272] ? stack_trace_consume_entry+0x80/0x80
[ 1158.384226] ? match_held_lock+0x33/0x210
[ 1158.393216] ? kasan_unpoison_shadow+0x30/0x40
[ 1158.402593] rtnl_newlink+0x53/0x80
[ 1158.410925] rtnetlink_rcv_msg+0x3a5/0x600
[ 1158.419777] ? validate_linkmsg+0x400/0x400
[ 1158.428620] ? find_held_lock+0x6d/0xd0
[ 1158.437117] ? match_held_lock+0x1b/0x210
[ 1158.445760] ? validate_linkmsg+0x400/0x400
[ 1158.454642] netlink_rcv_skb+0xc7/0x1f0
[ 1158.463150] ? netlink_ack+0x470/0x470
[ 1158.471538] ? netlink_deliver_tap+0x1f3/0x5a0
[ 1158.480607] netlink_unicast+0x2ae/0x350
[ 1158.489099] ? netlink_attachskb+0x340/0x340
[ 1158.497935] ? _copy_from_iter_full+0xde/0x3b0
[ 1158.506945] ? __virt_addr_valid+0xb6/0xf0
[ 1158.515578] ? __check_object_size+0x159/0x240
[ 1158.524515] netlink_sendmsg+0x4d3/0x630
[ 1158.532879] ? netlink_unicast+0x350/0x350
[ 1158.541400] ? netlink_unicast+0x350/0x350
[ 1158.549805] sock_sendmsg+0x94/0xa0
[ 1158.557561] ___sys_sendmsg+0x49d/0x570
[ 1158.565625] ? copy_msghdr_from_user+0x210/0x210
[ 1158.574457] ? __fput+0x1e2/0x330
[ 1158.581948] ? __kasan_slab_free+0x130/0x180
[ 1158.590407] ? kmem_cache_free+0xb6/0x2d0
[ 1158.598574] ? mark_lock+0xc7/0x790
[ 1158.606177] ? task_work_run+0xcf/0x100
[ 1158.614165] ? exit_to_usermode_loop+0x102/0x110
[ 1158.622954] ? __lock_acquire+0x963/0x1ee0
[ 1158.631199] ? lockdep_hardirqs_on+0x260/0x260
[ 1158.639777] ? match_held_lock+0x1b/0x210
[ 1158.647918] ? lockdep_hardirqs_on+0x260/0x260
[ 1158.656501] ? match_held_lock+0x1b/0x210
[ 1158.664643] ? __fget_light+0xa6/0xe0
[ 1158.672423] ? __sys_sendmsg+0xd2/0x150
[ 1158.680334] __sys_sendmsg+0xd2/0x150
[ 1158.688063] ? __ia32_sys_shutdown+0x30/0x30
[ 1158.696435] ? lock_downgrade+0x2e0/0x2e0
[ 1158.704541] ? mark_held_locks+0x1a/0x90
[ 1158.712611] ? mark_held_locks+0x1a/0x90
[ 1158.720619] ? do_syscall_64+0x1e/0x2c0
[ 1158.728530] do_syscall_64+0x78/0x2c0
[ 1158.736254] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1158.745414] RIP: 0033:0x7f62d505cb87
[ 1158.753070] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00 00 00 00 8b 05 6a 2b 2c 00 48 63 d2 48 63 ff 85 c0 75 18 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 f3 c3 0f 1f 80 00 00[87/1817]
48 89 f3 48
[ 1158.780924] RSP: 002b:00007fffd9832268 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 1158.793204] RAX: ffffffffffffffda RBX: 000000005d26048f RCX: 00007f62d505cb87
[ 1158.805111] RDX: 0000000000000000 RSI: 00007fffd98322d0 RDI: 0000000000000003
[ 1158.817055] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000006
[ 1158.828987] R10: 00007f62d50ce260 R11: 0000000000000246 R12: 0000000000000001
[ 1158.840909] R13: 000000000067e540 R14: 0000000000000000 R15: 000000000067ed20
[ 1158.852873] ==================================================================
Introduce new function tcf_block_non_null_shared() that verifies block
pointer before dereferencing it to obtain index. Use the function in
tc_indr_block_ing_cmd() to prevent NULL pointer dereference.
Fixes:
|
|
|
|
6b660c4177 |
net: openvswitch: do not update max_headroom if new headroom is equal to old headroom
When a vport is deleted, the maximum headroom size would be changed. If the vport which has the largest headroom is deleted, the new max_headroom would be set. But, if the new headroom size is equal to the old headroom size, updating routine is unnecessary. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Tested-by: Greg Rose <gvrose8192@gmail.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
f632a8170a |
Driver Core and debugfs changes for 5.3-rc1
Here is the "big" driver core and debugfs changes for 5.3-rc1
It's a lot of different patches, all across the tree due to some api
changes and lots of debugfs cleanups. Because of this, there is going
to be some merge issues with your tree at the moment, I'll follow up
with the expected resolutions to make it easier for you.
Other than the debugfs cleanups, in this set of changes we have:
- bus iteration function cleanups (will cause build warnings
with s390 and coresight drivers in your tree)
- scripts/get_abi.pl tool to display and parse Documentation/ABI
entries in a simple way
- cleanups to Documenatation/ABI/ entries to make them parse
easier due to typos and other minor things
- default_attrs use for some ktype users
- driver model documentation file conversions to .rst
- compressed firmware file loading
- deferred probe fixes
All of these have been in linux-next for a while, with a bunch of merge
issues that Stephen has been patient with me for. Other than the merge
issues, functionality is working properly in linux-next :)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXSgpnQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykcwgCfS30OR4JmwZydWGJ7zK/cHqk+KjsAnjOxjC1K
LpRyb3zX29oChFaZkc5a
=XrEZ
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core and debugfs updates from Greg KH:
"Here is the "big" driver core and debugfs changes for 5.3-rc1
It's a lot of different patches, all across the tree due to some api
changes and lots of debugfs cleanups.
Other than the debugfs cleanups, in this set of changes we have:
- bus iteration function cleanups
- scripts/get_abi.pl tool to display and parse Documentation/ABI
entries in a simple way
- cleanups to Documenatation/ABI/ entries to make them parse easier
due to typos and other minor things
- default_attrs use for some ktype users
- driver model documentation file conversions to .rst
- compressed firmware file loading
- deferred probe fixes
All of these have been in linux-next for a while, with a bunch of
merge issues that Stephen has been patient with me for"
* tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
debugfs: make error message a bit more verbose
orangefs: fix build warning from debugfs cleanup patch
ubifs: fix build warning after debugfs cleanup patch
driver: core: Allow subsystems to continue deferring probe
drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
arch_topology: Remove error messages on out-of-memory conditions
lib: notifier-error-inject: no need to check return value of debugfs_create functions
swiotlb: no need to check return value of debugfs_create functions
ceph: no need to check return value of debugfs_create functions
sunrpc: no need to check return value of debugfs_create functions
ubifs: no need to check return value of debugfs_create functions
orangefs: no need to check return value of debugfs_create functions
nfsd: no need to check return value of debugfs_create functions
lib: 842: no need to check return value of debugfs_create functions
debugfs: provide pr_fmt() macro
debugfs: log errors when something goes wrong
drivers: s390/cio: Fix compilation warning about const qualifiers
drivers: Add generic helper to match by of_node
driver_find_device: Unify the match function with class_find_device()
bus_find_device: Unify the match callback with class_find_device
...
|
|
|
|
ef8f3d48af |
Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:
"Am experimenting with splitting MM up into identifiable subsystems
perhaps with a view to gitifying it in complex ways. Also with more
verbose "incoming" emails.
Most of MM is here and a few other trees.
Subsystems affected by this patch series:
- hotfixes
- iommu
- scripts
- arch/sh
- ocfs2
- mm:slab-generic
- mm:slub
- mm:kmemleak
- mm:kasan
- mm:cleanups
- mm:debug
- mm:pagecache
- mm:swap
- mm:memcg
- mm:gup
- mm:pagemap
- mm:infrastructure
- mm:vmalloc
- mm:initialization
- mm:pagealloc
- mm:vmscan
- mm:tools
- mm:proc
- mm:ras
- mm:oom-kill
hotfixes:
mm: vmscan: scan anonymous pages on file refaults
mm/nvdimm: add is_ioremap_addr and use that to check ioremap address
mm/memcontrol: fix wrong statistics in memory.stat
mm/z3fold.c: lock z3fold page before __SetPageMovable()
nilfs2: do not use unexported cpu_to_le32()/le32_to_cpu() in uapi header
MAINTAINERS: nilfs2: update email address
iommu:
include/linux/dmar.h: replace single-char identifiers in macros
scripts:
scripts/decode_stacktrace: match basepath using shell prefix operator, not regex
scripts/decode_stacktrace: look for modules with .ko.debug extension
scripts/spelling.txt: drop "sepc" from the misspelling list
scripts/spelling.txt: add spelling fix for prohibited
scripts/decode_stacktrace: Accept dash/underscore in modules
scripts/spelling.txt: add more spellings to spelling.txt
arch/sh:
arch/sh/configs/sdk7786_defconfig: remove CONFIG_LOGFS
sh: config: remove left-over BACKLIGHT_LCD_SUPPORT
sh: prevent warnings when using iounmap
ocfs2:
fs: ocfs: fix spelling mistake "hearbeating" -> "heartbeat"
ocfs2/dlm: use struct_size() helper
ocfs2: add last unlock times in locking_state
ocfs2: add locking filter debugfs file
ocfs2: add first lock wait time in locking_state
ocfs: no need to check return value of debugfs_create functions
fs/ocfs2/dlmglue.c: unneeded variable: "status"
ocfs2: use kmemdup rather than duplicating its implementation
mm:slab-generic:
Patch series "mm/slab: Improved sanity checking":
mm/slab: validate cache membership under freelist hardening
mm/slab: sanity-check page type when looking up cache
lkdtm/heap: add tests for freelist hardening
mm:slub:
mm/slub.c: avoid double string traverse in kmem_cache_flags()
slub: don't panic for memcg kmem cache creation failure
mm:kmemleak:
mm/kmemleak.c: fix check for softirq context
mm/kmemleak.c: change error at _write when kmemleak is disabled
docs: kmemleak: add more documentation details
mm:kasan:
mm/kasan: print frame description for stack bugs
Patch series "Bitops instrumentation for KASAN", v5:
lib/test_kasan: add bitops tests
x86: use static_cpu_has in uaccess region to avoid instrumentation
asm-generic, x86: add bitops instrumentation for KASAN
Patch series "mm/kasan: Add object validation in ksize()", v3:
mm/kasan: introduce __kasan_check_{read,write}
mm/kasan: change kasan_check_{read,write} to return boolean
lib/test_kasan: Add test for double-kzfree detection
mm/slab: refactor common ksize KASAN logic into slab_common.c
mm/kasan: add object validation in ksize()
mm:cleanups:
include/linux/pfn_t.h: remove pfn_t_to_virt()
Patch series "remove ARCH_SELECT_MEMORY_MODEL where it has no effect":
arm: remove ARCH_SELECT_MEMORY_MODEL
s390: remove ARCH_SELECT_MEMORY_MODEL
sparc: remove ARCH_SELECT_MEMORY_MODEL
mm/gup.c: make follow_page_mask() static
mm/memory.c: trivial clean up in insert_page()
mm: make !CONFIG_HUGE_PAGE wrappers into static inlines
include/linux/mm_types.h: ifdef struct vm_area_struct::swap_readahead_info
mm: remove the account_page_dirtied export
mm/page_isolation.c: change the prototype of undo_isolate_page_range()
include/linux/vmpressure.h: use spinlock_t instead of struct spinlock
mm: remove the exporting of totalram_pages
include/linux/pagemap.h: document trylock_page() return value
mm:debug:
mm/failslab.c: by default, do not fail allocations with direct reclaim only
Patch series "debug_pagealloc improvements":
mm, debug_pagelloc: use static keys to enable debugging
mm, page_alloc: more extensive free page checking with debug_pagealloc
mm, debug_pagealloc: use a page type instead of page_ext flag
mm:pagecache:
Patch series "fix filler_t callback type mismatches", v2:
mm/filemap.c: fix an overly long line in read_cache_page
mm/filemap: don't cast ->readpage to filler_t for do_read_cache_page
jffs2: pass the correct prototype to read_cache_page
9p: pass the correct prototype to read_cache_page
mm/filemap.c: correct the comment about VM_FAULT_RETRY
mm:swap:
mm, swap: fix race between swapoff and some swap operations
mm/swap_state.c: simplify total_swapcache_pages() with get_swap_device()
mm, swap: use rbtree for swap_extent
mm/mincore.c: fix race between swapoff and mincore
mm:memcg:
memcg, oom: no oom-kill for __GFP_RETRY_MAYFAIL
memcg, fsnotify: no oom-kill for remote memcg charging
mm, memcg: introduce memory.events.local
mm: memcontrol: dump memory.stat during cgroup OOM
Patch series "mm: reparent slab memory on cgroup removal", v7:
mm: memcg/slab: postpone kmem_cache memcg pointer initialization to memcg_link_cache()
mm: memcg/slab: rename slab delayed deactivation functions and fields
mm: memcg/slab: generalize postponed non-root kmem_cache deactivation
mm: memcg/slab: introduce __memcg_kmem_uncharge_memcg()
mm: memcg/slab: unify SLAB and SLUB page accounting
mm: memcg/slab: don't check the dying flag on kmem_cache creation
mm: memcg/slab: synchronize access to kmem_cache dying flag using a spinlock
mm: memcg/slab: rework non-root kmem_cache lifecycle management
mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages
mm: memcg/slab: reparent memcg kmem_caches on cgroup removal
mm, memcg: add a memcg_slabinfo debugfs file
mm:gup:
Patch series "switch the remaining architectures to use generic GUP", v4:
mm: use untagged_addr() for get_user_pages_fast addresses
mm: simplify gup_fast_permitted
mm: lift the x86_32 PAE version of gup_get_pte to common code
MIPS: use the generic get_user_pages_fast code
sh: add the missing pud_page definition
sh: use the generic get_user_pages_fast code
sparc64: add the missing pgd_page definition
sparc64: define untagged_addr()
sparc64: use the generic get_user_pages_fast code
mm: rename CONFIG_HAVE_GENERIC_GUP to CONFIG_HAVE_FAST_GUP
mm: reorder code blocks in gup.c
mm: consolidate the get_user_pages* implementations
mm: validate get_user_pages_fast flags
mm: move the powerpc hugepd code to mm/gup.c
mm: switch gup_hugepte to use try_get_compound_head
mm: mark the page referenced in gup_hugepte
mm/gup: speed up check_and_migrate_cma_pages() on huge page
mm/gup.c: remove some BUG_ONs from get_gate_page()
mm/gup.c: mark undo_dev_pagemap as __maybe_unused
mm:pagemap:
asm-generic, x86: introduce generic pte_{alloc,free}_one[_kernel]
alpha: switch to generic version of pte allocation
arm: switch to generic version of pte allocation
arm64: switch to generic version of pte allocation
csky: switch to generic version of pte allocation
m68k: sun3: switch to generic version of pte allocation
mips: switch to generic version of pte allocation
nds32: switch to generic version of pte allocation
nios2: switch to generic version of pte allocation
parisc: switch to generic version of pte allocation
riscv: switch to generic version of pte allocation
um: switch to generic version of pte allocation
unicore32: switch to generic version of pte allocation
mm/pgtable: drop pgtable_t variable from pte_fn_t functions
mm/memory.c: fail when offset == num in first check of __vm_map_pages()
mm:infrastructure:
mm/mmu_notifier: use hlist_add_head_rcu()
mm:vmalloc:
Patch series "Some cleanups for the KVA/vmalloc", v5:
mm/vmalloc.c: remove "node" argument
mm/vmalloc.c: preload a CPU with one object for split purpose
mm/vmalloc.c: get rid of one single unlink_va() when merge
mm/vmalloc.c: switch to WARN_ON() and move it under unlink_va()
mm/vmalloc.c: spelling> s/informaion/information/
mm:initialization:
mm/large system hash: use vmalloc for size > MAX_ORDER when !hashdist
mm/large system hash: clear hashdist when only one node with memory is booted
mm:pagealloc:
arm64: move jump_label_init() before parse_early_param()
Patch series "add init_on_alloc/init_on_free boot options", v10:
mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options
mm: init: report memory auto-initialization features at boot time
mm:vmscan:
mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned
mm: vmscan: correct some vmscan counters for THP swapout
mm:tools:
tools/vm/slabinfo: order command line options
tools/vm/slabinfo: add partial slab listing to -X
tools/vm/slabinfo: add option to sort by partial slabs
tools/vm/slabinfo: add sorting info to help menu
mm:proc:
proc: use down_read_killable mmap_sem for /proc/pid/maps
proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup
proc: use down_read_killable mmap_sem for /proc/pid/pagemap
proc: use down_read_killable mmap_sem for /proc/pid/clear_refs
proc: use down_read_killable mmap_sem for /proc/pid/map_files
mm: use down_read_killable for locking mmap_sem in access_remote_vm
mm: smaps: split PSS into components
mm: vmalloc: show number of vmalloc pages in /proc/meminfo
mm:ras:
mm/memory-failure.c: clarify error message
mm:oom-kill:
mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks()
mm, oom: refactor dump_tasks for memcg OOMs
mm, oom: remove redundant task_in_mem_cgroup() check
oom: decouple mems_allowed from oom_unkillable_task
mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process()"
* akpm: (147 commits)
mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process()
oom: decouple mems_allowed from oom_unkillable_task
mm, oom: remove redundant task_in_mem_cgroup() check
mm, oom: refactor dump_tasks for memcg OOMs
mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks()
mm/memory-failure.c: clarify error message
mm: vmalloc: show number of vmalloc pages in /proc/meminfo
mm: smaps: split PSS into components
mm: use down_read_killable for locking mmap_sem in access_remote_vm
proc: use down_read_killable mmap_sem for /proc/pid/map_files
proc: use down_read_killable mmap_sem for /proc/pid/clear_refs
proc: use down_read_killable mmap_sem for /proc/pid/pagemap
proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup
proc: use down_read_killable mmap_sem for /proc/pid/maps
tools/vm/slabinfo: add sorting info to help menu
tools/vm/slabinfo: add option to sort by partial slabs
tools/vm/slabinfo: add partial slab listing to -X
tools/vm/slabinfo: order command line options
mm: vmscan: correct some vmscan counters for THP swapout
mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned
...
|
|
|
|
6471384af2 |
mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options
Patch series "add init_on_alloc/init_on_free boot options", v10. Provide init_on_alloc and init_on_free boot options. These are aimed at preventing possible information leaks and making the control-flow bugs that depend on uninitialized values more deterministic. Enabling either of the options guarantees that the memory returned by the page allocator and SL[AU]B is initialized with zeroes. SLOB allocator isn't supported at the moment, as its emulation of kmem caches complicates handling of SLAB_TYPESAFE_BY_RCU caches correctly. Enabling init_on_free also guarantees that pages and heap objects are initialized right after they're freed, so it won't be possible to access stale data by using a dangling pointer. As suggested by Michal Hocko, right now we don't let the heap users to disable initialization for certain allocations. There's not enough evidence that doing so can speed up real-life cases, and introducing ways to opt-out may result in things going out of control. This patch (of 2): The new options are needed to prevent possible information leaks and make control-flow bugs that depend on uninitialized values more deterministic. This is expected to be on-by-default on Android and Chrome OS. And it gives the opportunity for anyone else to use it under distros too via the boot args. (The init_on_free feature is regularly requested by folks where memory forensics is included in their threat models.) init_on_alloc=1 makes the kernel initialize newly allocated pages and heap objects with zeroes. Initialization is done at allocation time at the places where checks for __GFP_ZERO are performed. init_on_free=1 makes the kernel initialize freed pages and heap objects with zeroes upon their deletion. This helps to ensure sensitive data doesn't leak via use-after-free accesses. Both init_on_alloc=1 and init_on_free=1 guarantee that the allocator returns zeroed memory. The two exceptions are slab caches with constructors and SLAB_TYPESAFE_BY_RCU flag. Those are never zero-initialized to preserve their semantics. Both init_on_alloc and init_on_free default to zero, but those defaults can be overridden with CONFIG_INIT_ON_ALLOC_DEFAULT_ON and CONFIG_INIT_ON_FREE_DEFAULT_ON. If either SLUB poisoning or page poisoning is enabled, those options take precedence over init_on_alloc and init_on_free: initialization is only applied to unpoisoned allocations. Slowdown for the new features compared to init_on_free=0, init_on_alloc=0: hackbench, init_on_free=1: +7.62% sys time (st.err 0.74%) hackbench, init_on_alloc=1: +7.75% sys time (st.err 2.14%) Linux build with -j12, init_on_free=1: +8.38% wall time (st.err 0.39%) Linux build with -j12, init_on_free=1: +24.42% sys time (st.err 0.52%) Linux build with -j12, init_on_alloc=1: -0.13% wall time (st.err 0.42%) Linux build with -j12, init_on_alloc=1: +0.57% sys time (st.err 0.40%) The slowdown for init_on_free=0, init_on_alloc=0 compared to the baseline is within the standard error. The new features are also going to pave the way for hardware memory tagging (e.g. arm64's MTE), which will require both on_alloc and on_free hooks to set the tags for heap objects. With MTE, tagging will have the same cost as memory initialization. Although init_on_free is rather costly, there are paranoid use-cases where in-memory data lifetime is desired to be minimized. There are various arguments for/against the realism of the associated threat models, but given that we'll need the infrastructure for MTE anyway, and there are people who want wipe-on-free behavior no matter what the performance cost, it seems reasonable to include it in this series. [glider@google.com: v8] Link: http://lkml.kernel.org/r/20190626121943.131390-2-glider@google.com [glider@google.com: v9] Link: http://lkml.kernel.org/r/20190627130316.254309-2-glider@google.com [glider@google.com: v10] Link: http://lkml.kernel.org/r/20190628093131.199499-2-glider@google.com Link: http://lkml.kernel.org/r/20190617151050.92663-2-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.cz> [page and dmapool parts Acked-by: James Morris <jamorris@linux.microsoft.com>] Cc: Christoph Lameter <cl@linux.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Sandeep Patil <sspatil@android.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Jann Horn <jannh@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
|
|
a101b043c4 |
SUNRPC: Fix transport accounting when caller specifies an rpc_xprt
Ensure that we do the required accounting for the round robin queue when the caller to rpc_init_task() has passed in a transport to be used. Reported-by: Olga Kornievskaia <aglo@umich.edu> Reported-by: Neil Brown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> |
|
|
|
347543e640 |
Merge tag 'nfs-rdma-for-5.3-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
NFSoRDMA client updates for 5.3 New features: - Add a way to place MRs back on the free list - Reduce context switching - Add new trace events Bugfixes and cleanups: - Fix a BUG when tracing is enabled with NFSv4.1 - Fix a use-after-free in rpcrdma_post_recvs - Replace use of xdr_stream_pos in rpcrdma_marshal_req - Fix occasional transport deadlock - Fix show_nfs_errors macros, other tracing improvements - Remove RPCRDMA_REQ_F_PENDING and fr_state - Various simplifications and refactors |