mirror of https://github.com/torvalds/linux.git
1469 Commits
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
2ab002c755 |
Driver core and debugfs updates
Here is the big set of driver core and debugfs updates for 6.14-rc1.
It's coming late in the merge cycle as there are a number of merge
conflicts with your tree now, and I wanted to make sure they were
working properly. To resolve them, look in linux-next, and I will send
the "fixup" patch as a response to the pull request.
Included in here is a bunch of driver core, PCI, OF, and platform rust
bindings (all acked by the different subsystem maintainers), hence the
merge conflict with the rust tree, and some driver core api updates to
mark things as const, which will also require some fixups due to new
stuff coming in through other trees in this merge window.
There are also a bunch of debugfs updates from Al, and there is at least
one user that does have a regression with these, but Al is working on
tracking down the fix for it. In my use (and everyone else's linux-next
use), it does not seem like a big issue at the moment.
Here's a short list of the things in here:
- driver core bindings for PCI, platform, OF, and some i/o functions.
We are almost at the "write a real driver in rust" stage now,
depending on what you want to do.
- misc device rust bindings and a sample driver to show how to use
them
- debugfs cleanups in the fs as well as the users of the fs api for
places where drivers got it wrong or were unnecessarily doing things
in complex ways.
- driver core const work, making more of the api take const * for
different parameters to make the rust bindings easier overall.
- other small fixes and updates
All of these have been in linux-next with all of the aforementioned
merge conflicts, and the one debugfs issue, which looks to be resolved
"soon".
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZ5koPA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymFHACfT5acDKf2Bov2Lc/5u3vBW/R6ChsAnj+LmgVI
hcDSPodj4szR40RRnzBd
=u5Ey
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core and debugfs updates from Greg KH:
"Here is the big set of driver core and debugfs updates for 6.14-rc1.
Included in here is a bunch of driver core, PCI, OF, and platform rust
bindings (all acked by the different subsystem maintainers), hence the
merge conflict with the rust tree, and some driver core api updates to
mark things as const, which will also require some fixups due to new
stuff coming in through other trees in this merge window.
There are also a bunch of debugfs updates from Al, and there is at
least one user that does have a regression with these, but Al is
working on tracking down the fix for it. In my use (and everyone
else's linux-next use), it does not seem like a big issue at the
moment.
Here's a short list of the things in here:
- driver core rust bindings for PCI, platform, OF, and some i/o
functions.
We are almost at the "write a real driver in rust" stage now,
depending on what you want to do.
- misc device rust bindings and a sample driver to show how to use
them
- debugfs cleanups in the fs as well as the users of the fs api for
places where drivers got it wrong or were unnecessarily doing
things in complex ways.
- driver core const work, making more of the api take const * for
different parameters to make the rust bindings easier overall.
- other small fixes and updates
All of these have been in linux-next with all of the aforementioned
merge conflicts, and the one debugfs issue, which looks to be resolved
"soon""
* tag 'driver-core-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (95 commits)
rust: device: Use as_char_ptr() to avoid explicit cast
rust: device: Replace CString with CStr in property_present()
devcoredump: Constify 'struct bin_attribute'
devcoredump: Define 'struct bin_attribute' through macro
rust: device: Add property_present()
saner replacement for debugfs_rename()
orangefs-debugfs: don't mess with ->d_name
octeontx2: don't mess with ->d_parent or ->d_parent->d_name
arm_scmi: don't mess with ->d_parent->d_name
slub: don't mess with ->d_name
sof-client-ipc-flood-test: don't mess with ->d_name
qat: don't mess with ->d_name
xhci: don't mess with ->d_iname
mtu3: don't mess wiht ->d_iname
greybus/camera - stop messing with ->d_iname
mediatek: stop messing with ->d_iname
netdevsim: don't embed file_operations into your structs
b43legacy: make use of debugfs_get_aux()
b43: stop embedding struct file_operations into their objects
carl9170: stop embedding file_operations into their objects
...
|
|
|
|
0ad9617c78 |
Networking changes for 6.14.
Core
----
- More core refactoring to reduce the RTNL lock contention,
including preparatory work for the per-network namespace RTNL lock,
replacing RTNL lock with a per device-one to protect NAPI-related
net device data and moving synchronize_net() calls outside such
lock.
- Extend drop reasons usage, adding net scheduler, AF_UNIX, bridge and
more specific TCP coverage.
- Reduce network namespace tear-down time by removing per-subsystems
synchronize_net() in tipc and sched.
- Add flow label selector support for fib rules, allowing traffic
redirection based on such header field.
Netfilter
---------
- Do not remove netdev basechain when last device is gone, allowing
netdev basechains without devices.
- Revisit the flowtable teardown strategy, dealing better with fin,
reset and re-open events.
- Scale-up IP-vs connection dumping by avoiding linear search on
each restart.
Protocols
---------
- A significant XDP socket refactor, consolidating and optimizing
several helpers into the core
- Better scaling of ICMP rate-limiting, by removing false-sharing in
inet peers handling.
- Introduces netlink notifications for multicast IPv4 and IPv6
address changes.
- Add ipsec support for IP-TFS/AggFrag encapsulation, allowing
aggregation and fragmentation of the inner IP.
- Add sysctl to configure TIME-WAIT reuse delay for TCP sockets,
to avoid local port exhaustion issues when the average connection
lifetime is very short.
- Support updating keys (re-keying) for connections using kernel
TLS (for TLS 1.3 only).
- Support ipv4-mapped ipv6 address clients in smc-r v2.
- Add support for jumbo data packet transmission in RxRPC sockets,
gluing multiple data packets in a single UDP packet.
- Support RxRPC RACK-TLP to manage packet loss and retransmission in
conjunction with the congestion control algorithm.
Driver API
----------
- Introduce a unified and structured interface for reporting PHY
statistics, exposing consistent data across different H/W via
ethtool.
- Make timestamping selectable, allow the user to select the desired
hwtstamp provider (PHY or MAC) administratively.
- Add support for configuring a header-data-split threshold (HDS)
value via ethtool, to deal with partial or buggy H/W implementation.
- Consolidate DSA drivers Energy Efficiency Ethernet support.
- Add EEE management to phylink, making use of the phylib
implementation.
- Add phylib support for in-band capabilities negotiation.
- Simplify how phylib-enabled mac drivers expose the supported
interfaces.
Tests and tooling
-----------------
- Make the YNL tool package-friendly to make it easier to deploy it
separately from the kernel.
- Increase TCP selftest coverage importing several packetdrill
test-cases.
- Regenerate the ethtool uapi header from the YNL spec,
to ease maintenance and future development.
- Add YNL support for decoding the link types used in net
self-tests, allowing a single build to run both net and
drivers/net.
Drivers
-------
- Ethernet high-speed NICs:
- nVidia/Mellanox (mlx5):
- add cross E-Switch QoS support
- add SW Steering support for ConnectX-8
- implement support for HW-Managed Flow Steering, improving the
rule deletion/insertion rate
- support for multi-host LAG
- Intel (ixgbe, ice, igb):
- ice: add support for devlink health events
- ixgbe: add initial support for E610 chipset variant
- igb: add support for AF_XDP zero-copy
- Meta:
- add support for basic RSS config
- allow changing the number of channels
- add hardware monitoring support
- Broadcom (bnxt):
- implement TCP data split and HDS threshold ethtool support,
enabling Device Memory TCP.
- Marvell Octeon:
- implement egress ipsec offload support for the cn10k family
- Hisilicon (HIBMC):
- implement unicast MAC filtering
- Ethernet NICs embedded and virtual:
- Convert UDP tunnel drivers to NETDEV_PCPU_STAT_DSTATS, avoiding
contented atomic operations for drop counters
- Freescale:
- quicc: phylink conversion
- enetc: support Tx and Rx checksum offload and improve TSO
performances
- MediaTek:
- airoha: introduce support for ETS and HTB Qdisc offload
- Microchip:
- lan78XX USB: preparation work for phylink conversion
- Synopsys (stmmac):
- support DWMAC IP on NXP Automotive SoCs S32G2xx/S32G3xx/S32R45
- refactor EEE support to leverage the new driver API
- optimize DMA and cache access to increase raw RX performances
by 40%
- TI:
- icssg-prueth: add multicast filtering support for VLAN
interface
- netkit:
- add ability to configure head/tailroom
- VXLAN:
- accepts packets with user-defined reserved bit
- Ethernet switches:
- Microchip:
- lan969x: add RGMII support
- lan969x: improve TX and RX performance using the FDMA engine
- nVidia/Mellanox:
- move Tx header handling to PCI driver, to ease XDP support
- Ethernet PHYs:
- Texas Instruments DP83822:
- add support for GPIO2 clock output
- Realtek:
- 8169: add support for RTL8125D rev.b
- rtl822x: add hwmon support for the temperature sensor
- Microchip:
- add support for RDS PTP hardware
- consolidate periodic output signal generation
- CAN:
- several DT-bindings to DT schema conversions
- tcan4x5x:
- add HW standby support
- support nWKRQ voltage selection
- kvaser:
- allowing Bus Error Reporting runtime configuration
- WiFi:
- the on-going Multi-Link Operation (MLO) effort continues, affecting
both the stack and in drivers
- mac80211/cfg80211:
- Emergency Preparedness Communication Services (EPCS) station mode
support
- support for adding and removing station links for MLO
- add support for WiFi 7/EHT mesh over 320 MHz channels
- report Tx power info for each link
- RealTek (rtw88):
- enable USB Rx aggregation and USB 3 to improve performance
- LED support
- RealTek (rtw89):
- refactor power save to support Multi-Link Operations
- add support for RTL8922AE-VS variant
- MediaTek (mt76):
- single wiphy multiband support (preparation for MLO)
- p2p device support
- add TP-Link TXE50UH USB adapter support
- Qualcomm (ath10k):
- support for the QCA6698AQ IP core
- Qualcomm (ath12k):
- enable MLO for QCN9274
- Bluetooth:
- Allow sysfs to trigger hdev reset, to allow recovering devices
not responsive from user-space
- MediaTek: add support for MT7922, MT7925, MT7921e devices
- Realtek: add support for RTL8851BE devices
- Qualcomm: add support for WCN785x devices
- ISO: allow BIG re-sync
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmePf5YSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkUcMQALblhkGTxurnfT+yK+Bsuhn2LoHl2RPN
4u2Kjkzm+2FYgcw6lS17cFXsnfAPlRIpmhnmKk1EBgsBdkuL29c+jtqnljA2bboD
tIMhMgWiaLS3xgEMrLeKnseIo0G9mviQRphGeZPFTaLb4Ww/bd5LAp4ZGc5oij76
tURatC3b6MuO4Lt5U+jWKnRwviXku8udHkVHXlvPdirawHCVinmx3tvce/BI/MaD
eUOp6ZeJCPCOLtk7b8WEyxxvdY0f6D9ed82qfPDHjb94SJv+Vxb38RZtNuApIjn9
S0KdlNih/4flDy17LDxGYSyFps78lUFRbpqmsUlnZkyLXpsph7/WTvAmMAFcrX0K
UgQ/F/q5GAvcP5WZcCj5+tZaRmfKQraQirXMtYU/Uj50qCnSU7ssyACASt23GLZ8
OF8tCLlm9lLOU1B6Ofkul1Dbo5f0Xpaghga4dFb0kzSfbm78fTUnqBNsJ7jIkWfi
fD6dO+fg+p2ZMD0CACGo3CNxQuJmaQWg6BIDeno6God8kZ6qBMxY/sFr4qozrvFH
x/FgQq8dgc8WLmaPejKiNIPkdQepXrIiv3T9jgMVyEjJnWB/LBfyWKSQOdTfnLs+
rgr4YMV6XW4bx0fYqTI8B9jZ+FCWbG6sn4UtRTHITKcd3FSvd8Y+PHa5YyCUWvJM
l8pePMGF0XVF
=hrsp
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"This is slightly smaller than usual, with the most interesting work
being still around RTNL scope reduction.
Core:
- More core refactoring to reduce the RTNL lock contention, including
preparatory work for the per-network namespace RTNL lock, replacing
RTNL lock with a per device-one to protect NAPI-related net device
data and moving synchronize_net() calls outside such lock.
- Extend drop reasons usage, adding net scheduler, AF_UNIX, bridge
and more specific TCP coverage.
- Reduce network namespace tear-down time by removing per-subsystems
synchronize_net() in tipc and sched.
- Add flow label selector support for fib rules, allowing traffic
redirection based on such header field.
Netfilter:
- Do not remove netdev basechain when last device is gone, allowing
netdev basechains without devices.
- Revisit the flowtable teardown strategy, dealing better with fin,
reset and re-open events.
- Scale-up IP-vs connection dumping by avoiding linear search on each
restart.
Protocols:
- A significant XDP socket refactor, consolidating and optimizing
several helpers into the core
- Better scaling of ICMP rate-limiting, by removing false-sharing in
inet peers handling.
- Introduces netlink notifications for multicast IPv4 and IPv6
address changes.
- Add ipsec support for IP-TFS/AggFrag encapsulation, allowing
aggregation and fragmentation of the inner IP.
- Add sysctl to configure TIME-WAIT reuse delay for TCP sockets, to
avoid local port exhaustion issues when the average connection
lifetime is very short.
- Support updating keys (re-keying) for connections using kernel TLS
(for TLS 1.3 only).
- Support ipv4-mapped ipv6 address clients in smc-r v2.
- Add support for jumbo data packet transmission in RxRPC sockets,
gluing multiple data packets in a single UDP packet.
- Support RxRPC RACK-TLP to manage packet loss and retransmission in
conjunction with the congestion control algorithm.
Driver API:
- Introduce a unified and structured interface for reporting PHY
statistics, exposing consistent data across different H/W via
ethtool.
- Make timestamping selectable, allow the user to select the desired
hwtstamp provider (PHY or MAC) administratively.
- Add support for configuring a header-data-split threshold (HDS)
value via ethtool, to deal with partial or buggy H/W
implementation.
- Consolidate DSA drivers Energy Efficiency Ethernet support.
- Add EEE management to phylink, making use of the phylib
implementation.
- Add phylib support for in-band capabilities negotiation.
- Simplify how phylib-enabled mac drivers expose the supported
interfaces.
Tests and tooling:
- Make the YNL tool package-friendly to make it easier to deploy it
separately from the kernel.
- Increase TCP selftest coverage importing several packetdrill
test-cases.
- Regenerate the ethtool uapi header from the YNL spec, to ease
maintenance and future development.
- Add YNL support for decoding the link types used in net self-tests,
allowing a single build to run both net and drivers/net.
Drivers:
- Ethernet high-speed NICs:
- nVidia/Mellanox (mlx5):
- add cross E-Switch QoS support
- add SW Steering support for ConnectX-8
- implement support for HW-Managed Flow Steering, improving the
rule deletion/insertion rate
- support for multi-host LAG
- Intel (ixgbe, ice, igb):
- ice: add support for devlink health events
- ixgbe: add initial support for E610 chipset variant
- igb: add support for AF_XDP zero-copy
- Meta:
- add support for basic RSS config
- allow changing the number of channels
- add hardware monitoring support
- Broadcom (bnxt):
- implement TCP data split and HDS threshold ethtool support,
enabling Device Memory TCP.
- Marvell Octeon:
- implement egress ipsec offload support for the cn10k family
- Hisilicon (HIBMC):
- implement unicast MAC filtering
- Ethernet NICs embedded and virtual:
- Convert UDP tunnel drivers to NETDEV_PCPU_STAT_DSTATS, avoiding
contented atomic operations for drop counters
- Freescale:
- quicc: phylink conversion
- enetc: support Tx and Rx checksum offload and improve TSO
performances
- MediaTek:
- airoha: introduce support for ETS and HTB Qdisc offload
- Microchip:
- lan78XX USB: preparation work for phylink conversion
- Synopsys (stmmac):
- support DWMAC IP on NXP Automotive SoCs S32G2xx/S32G3xx/S32R45
- refactor EEE support to leverage the new driver API
- optimize DMA and cache access to increase raw RX performances
by 40%
- TI:
- icssg-prueth: add multicast filtering support for VLAN
interface
- netkit:
- add ability to configure head/tailroom
- VXLAN:
- accepts packets with user-defined reserved bit
- Ethernet switches:
- Microchip:
- lan969x: add RGMII support
- lan969x: improve TX and RX performance using the FDMA engine
- nVidia/Mellanox:
- move Tx header handling to PCI driver, to ease XDP support
- Ethernet PHYs:
- Texas Instruments DP83822:
- add support for GPIO2 clock output
- Realtek:
- 8169: add support for RTL8125D rev.b
- rtl822x: add hwmon support for the temperature sensor
- Microchip:
- add support for RDS PTP hardware
- consolidate periodic output signal generation
- CAN:
- several DT-bindings to DT schema conversions
- tcan4x5x:
- add HW standby support
- support nWKRQ voltage selection
- kvaser:
- allowing Bus Error Reporting runtime configuration
- WiFi:
- the on-going Multi-Link Operation (MLO) effort continues,
affecting both the stack and in drivers
- mac80211/cfg80211:
- Emergency Preparedness Communication Services (EPCS) station
mode support
- support for adding and removing station links for MLO
- add support for WiFi 7/EHT mesh over 320 MHz channels
- report Tx power info for each link
- RealTek (rtw88):
- enable USB Rx aggregation and USB 3 to improve performance
- LED support
- RealTek (rtw89):
- refactor power save to support Multi-Link Operations
- add support for RTL8922AE-VS variant
- MediaTek (mt76):
- single wiphy multiband support (preparation for MLO)
- p2p device support
- add TP-Link TXE50UH USB adapter support
- Qualcomm (ath10k):
- support for the QCA6698AQ IP core
- Qualcomm (ath12k):
- enable MLO for QCN9274
- Bluetooth:
- Allow sysfs to trigger hdev reset, to allow recovering devices
not responsive from user-space
- MediaTek: add support for MT7922, MT7925, MT7921e devices
- Realtek: add support for RTL8851BE devices
- Qualcomm: add support for WCN785x devices
- ISO: allow BIG re-sync"
* tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1386 commits)
net/rose: prevent integer overflows in rose_setsockopt()
net: phylink: fix regression when binding a PHY
net: ethernet: ti: am65-cpsw: streamline TX queue creation and cleanup
net: ethernet: ti: am65-cpsw: streamline RX queue creation and cleanup
net: ethernet: ti: am65-cpsw: ensure proper channel cleanup in error path
ipv6: Convert inet6_rtm_deladdr() to per-netns RTNL.
ipv6: Convert inet6_rtm_newaddr() to per-netns RTNL.
ipv6: Move lifetime validation to inet6_rtm_newaddr().
ipv6: Set cfg.ifa_flags before device lookup in inet6_rtm_newaddr().
ipv6: Pass dev to inet6_addr_add().
ipv6: Convert inet6_ioctl() to per-netns RTNL.
ipv6: Hold rtnl_net_lock() in addrconf_init() and addrconf_cleanup().
ipv6: Hold rtnl_net_lock() in addrconf_dad_work().
ipv6: Hold rtnl_net_lock() in addrconf_verify_work().
ipv6: Convert net.ipv6.conf.${DEV}.XXX sysctl to per-netns RTNL.
ipv6: Add __in6_dev_get_rtnl_net().
net: stmmac: Drop redundant skb_mark_for_recycle() for SKB frags
net: mii: Fix the Speed display when the network cable is not connected
sysctl net: Remove macro checks for CONFIG_SYSCTL
eth: bnxt: update header sizing defaults
...
|
|
|
|
1d6d399223 |
Kthreads affinity follow either of 4 existing different patterns:
1) Per-CPU kthreads must stay affine to a single CPU and never execute
relevant code on any other CPU. This is currently handled by smpboot
code which takes care of CPU-hotplug operations. Affinity here is
a correctness constraint.
2) Some kthreads _have_ to be affine to a specific set of CPUs and can't
run anywhere else. The affinity is set through kthread_bind_mask()
and the subsystem takes care by itself to handle CPU-hotplug
operations. Affinity here is assumed to be a correctness constraint.
3) Per-node kthreads _prefer_ to be affine to a specific NUMA node. This
is not a correctness constraint but merely a preference in terms of
memory locality. kswapd and kcompactd both fall into this category.
The affinity is set manually like for any other task and CPU-hotplug
is supposed to be handled by the relevant subsystem so that the task
is properly reaffined whenever a given CPU from the node comes up.
Also care should be taken so that the node affinity doesn't cross
isolated (nohz_full) cpumask boundaries.
4) Similar to the previous point except kthreads have a _preferred_
affinity different than a node. Both RCU boost kthreads and RCU
exp kworkers fall into this category as they refer to "RCU nodes"
from a distinctly distributed tree.
Currently the preferred affinity patterns (3 and 4) have at least 4
identified users, with more or less success when it comes to handle
CPU-hotplug operations and CPU isolation. Each of which do it in its own
ad-hoc way.
This is an infrastructure proposal to handle this with the following API
changes:
_ kthread_create_on_node() automatically affines the created kthread to
its target node unless it has been set as per-cpu or bound with
kthread_bind[_mask]() before the first wake-up.
- kthread_affine_preferred() is a new function that can be called right
after kthread_create_on_node() to specify a preferred affinity
different than the specified node.
When the preferred affinity can't be applied because the possible
targets are offline or isolated (nohz_full), the kthread is affine
to the housekeeping CPUs (which means to all online CPUs most of the
time or only the non-nohz_full CPUs when nohz_full= is set).
kswapd, kcompactd, RCU boost kthreads and RCU exp kworkers have been
converted, along with a few old drivers.
Summary of the changes:
* Consolidate a bunch of ad-hoc implementations of kthread_run_on_cpu()
* Introduce task_cpu_fallback_mask() that defines the default last
resort affinity of a task to become nohz_full aware
* Add some correctness check to ensure kthread_bind() is always called
before the first kthread wake up.
* Default affine kthread to its preferred node.
* Convert kswapd / kcompactd and remove their halfway working ad-hoc
affinity implementation
* Implement kthreads preferred affinity
* Unify kthread worker and kthread API's style
* Convert RCU kthreads to the new API and remove the ad-hoc affinity
implementation.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEd76+gtGM8MbftQlOhSRUR1COjHcFAmeNf8gACgkQhSRUR1CO
jHedQQ/+IxTjjqQiItzrq41TES2S0desHDq8lNJFb7rsR/DtKFyLx3s67cOYV+cM
Yx54QHg2m/Fz4nXMQ7Po5ygOtJGCKBc5C5QQy7y0lVKeTQK+daDfEtBSa3oG7j3C
u+E3tTY6qxkbCzymUyaKkHN4/ay2vLvjFS50luV7KMyI3x47Aji+t7VdCX4LCPP2
eAwOALWD0+7qLJ/VF6gsmQLKA4Qx7PQAzBa3KSBmUN9UcN8Gk1bQHCTIQKDHP9LQ
v8BXrNZtYX1o2+snNYpX2z6/ECjxkdwriOgqqZY5306hd9RAQ1u46Dx3byrIqjGn
ULG/XQ2istPyhTqb/h+RbrobdOcwEUIeqk8hRRbBXE8bPpqUz9EMuaCMxWDbQjgH
NTuKG4ifKJ/IqstkkuDkdOiByE/ysMmwqrTXgSnu2ITNL9yY3BEgFbvA95hgo42s
f7QCxEfZb1MHcNEMENSMwM3xw5lLMGMpxVZcMQ3gLwyotMBRrhFZm1qZJG7TITYW
IDIeCbH4JOMdQwLs3CcWTXio0N5/85NhRNFV+IDn96OrgxObgnMtV8QwNgjXBAJ5
wGeJWt8s34W1Zo3qS9gEuVzEhW4XaxISQQMkHe8faKkK6iHmIB/VjSQikDwwUNQ/
AspYj82RyWBCDZsqhiYh71kpxjvS6Xp0bj39Ce1sNsOnuksxKkQ=
=g8In
-----END PGP SIGNATURE-----
Merge tag 'kthread-for-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks
Pull kthread updates from Frederic Weisbecker:
"Kthreads affinity follow either of 4 existing different patterns:
1) Per-CPU kthreads must stay affine to a single CPU and never
execute relevant code on any other CPU. This is currently handled
by smpboot code which takes care of CPU-hotplug operations.
Affinity here is a correctness constraint.
2) Some kthreads _have_ to be affine to a specific set of CPUs and
can't run anywhere else. The affinity is set through
kthread_bind_mask() and the subsystem takes care by itself to
handle CPU-hotplug operations. Affinity here is assumed to be a
correctness constraint.
3) Per-node kthreads _prefer_ to be affine to a specific NUMA node.
This is not a correctness constraint but merely a preference in
terms of memory locality. kswapd and kcompactd both fall into this
category. The affinity is set manually like for any other task and
CPU-hotplug is supposed to be handled by the relevant subsystem so
that the task is properly reaffined whenever a given CPU from the
node comes up. Also care should be taken so that the node affinity
doesn't cross isolated (nohz_full) cpumask boundaries.
4) Similar to the previous point except kthreads have a _preferred_
affinity different than a node. Both RCU boost kthreads and RCU
exp kworkers fall into this category as they refer to "RCU nodes"
from a distinctly distributed tree.
Currently the preferred affinity patterns (3 and 4) have at least 4
identified users, with more or less success when it comes to handle
CPU-hotplug operations and CPU isolation. Each of which do it in its
own ad-hoc way.
This is an infrastructure proposal to handle this with the following
API changes:
- kthread_create_on_node() automatically affines the created kthread
to its target node unless it has been set as per-cpu or bound with
kthread_bind[_mask]() before the first wake-up.
- kthread_affine_preferred() is a new function that can be called
right after kthread_create_on_node() to specify a preferred
affinity different than the specified node.
When the preferred affinity can't be applied because the possible
targets are offline or isolated (nohz_full), the kthread is affine to
the housekeeping CPUs (which means to all online CPUs most of the time
or only the non-nohz_full CPUs when nohz_full= is set).
kswapd, kcompactd, RCU boost kthreads and RCU exp kworkers have been
converted, along with a few old drivers.
Summary of the changes:
- Consolidate a bunch of ad-hoc implementations of
kthread_run_on_cpu()
- Introduce task_cpu_fallback_mask() that defines the default last
resort affinity of a task to become nohz_full aware
- Add some correctness check to ensure kthread_bind() is always
called before the first kthread wake up.
- Default affine kthread to its preferred node.
- Convert kswapd / kcompactd and remove their halfway working ad-hoc
affinity implementation
- Implement kthreads preferred affinity
- Unify kthread worker and kthread API's style
- Convert RCU kthreads to the new API and remove the ad-hoc affinity
implementation"
* tag 'kthread-for-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks:
kthread: modify kernel-doc function name to match code
rcu: Use kthread preferred affinity for RCU exp kworkers
treewide: Introduce kthread_run_worker[_on_cpu]()
kthread: Unify kthread_create_on_cpu() and kthread_create_worker_on_cpu() automatic format
rcu: Use kthread preferred affinity for RCU boost
kthread: Implement preferred affinity
mm: Create/affine kswapd to its preferred node
mm: Create/affine kcompactd to its preferred node
kthread: Default affine kthread to its preferred NUMA node
kthread: Make sure kthread hasn't started while binding it
sched,arm64: Handle CPU isolation on last resort fallback rq selection
arm64: Exclude nohz_full CPUs from 32bits el0 support
lib: test_objpool: Use kthread_run_on_cpu()
kallsyms: Use kthread_run_on_cpu()
soc/qman: test: Use kthread_run_on_cpu()
arm/bL_switcher: Use kthread_run_on_cpu()
|
|
|
|
4b0a3ffa79 |
net: dsa: implement get_ts_stats ethtool operation for user ports
Integrate with the standard infrastructure for reporting hardware packet timestamping statistics. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250116104628.123555-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
dd19f4116e |
Merge 6.13-rc7 into driver-core-next
We need the debugfs / driver-core fixes in here as well for testing and to build on top of. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
|
|
|
b04e317b52 |
treewide: Introduce kthread_run_worker[_on_cpu]()
kthread_create() creates a kthread without running it yet. kthread_run() creates a kthread and runs it. On the other hand, kthread_create_worker() creates a kthread worker and runs it. This difference in behaviours is confusing. Also there is no way to create a kthread worker and affine it using kthread_bind_mask() or kthread_affine_preferred() before starting it. Consolidate the behaviours and introduce kthread_run_worker[_on_cpu]() that behaves just like kthread_run(). kthread_create_worker[_on_cpu]() will now only create a kthread worker without starting it. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> |
|
|
|
60c6e3a592 |
net: dsa: no longer call ds->ops->get_mac_eee()
All implementations of get_mac_eee() now just return zero without doing anything useful. Remove the call to this method in preparation to removing the method from each DSA driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tUlkz-007Uyl-UA@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
f1e8bf5632 |
driver core: Constify API device_find_child() and adapt for various usages
Constify the following API:
struct device *device_find_child(struct device *dev, void *data,
int (*match)(struct device *dev, void *data));
To :
struct device *device_find_child(struct device *dev, const void *data,
device_match_t match);
typedef int (*device_match_t)(struct device *dev, const void *data);
with the following reasons:
- Protect caller's match data @*data which is for comparison and lookup
and the API does not actually need to modify @*data.
- Make the API's parameters (@match)() and @data have the same type as
all of other device finding APIs (bus|class|driver)_find_device().
- All kinds of existing device match functions can be directly taken
as the API's argument, they were exported by driver core.
Constify the API and adapt for various existing usages.
BTW, various subsystem changes are squashed into this commit to meet
'git bisect' requirement, and this commit has the minimal and simplest
changes to complement squashing shortcoming, and that may bring extra
code improvement.
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Acked-by: Uwe Kleine-König <ukleinek@kernel.org> # for drivers/pwm
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-4-6623037414d4@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
07e5c4eb94 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.13-rc4). No conflicts. Adjacent changes: drivers/net/ethernet/renesas/rswitch.h |
|
|
|
16f027cd40 |
net: dsa: restore dsa_software_vlan_untag() ability to operate on VLAN-untagged traffic
Robert Hodaszi reports that locally terminated traffic towards VLAN-unaware bridge ports is broken with ocelot-8021q. He is describing the same symptoms as for commit |
|
|
|
5098462fba |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.13-rc3). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
36ff681d22 |
net: dsa: tag_ocelot_8021q: fix broken reception
The blamed commit changed the dsa_8021q_rcv() calling convention to
accept pre-populated source_port and switch_id arguments. If those are
not available, as in the case of tag_ocelot_8021q, the arguments must be
pre-initialized with -1.
Due to the bug of passing uninitialized arguments in tag_ocelot_8021q,
dsa_8021q_rcv() does not detect that it needs to populate the
source_port and switch_id, and this makes dsa_conduit_find_user() fail,
which leads to packet loss on reception.
Fixes:
|
|
|
|
88325a291a |
net: dsa: require .support_eee() method to be implemented
Now that we have updated all drivers, switch DSA to require an implementation of the .support_eee() method for EEE to be usable, rather than defaulting to being permissive when not implemented. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/E1tL14e-006cZy-AT@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
99379f5872 |
net: dsa: provide implementation of .support_eee()
Provide a trivial implementation for the .support_eee() method which switch drivers can use to simply indicate that they support EEE on all their user ports. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/E1tL149-006cZJ-JJ@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
9723a77318 |
net: dsa: add hook to determine whether EEE is supported
Add a hook to determine whether the switch supports EEE. This will return false if the switch does not, or true if it does. If the method is not implemented, we assume (currently) that the switch supports EEE. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/E1tL144-006cZD-El@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
66c366392e |
net: dsa: remove check for dp->pl in EEE methods
When user ports are initialised, a phylink instance is always created, and so dp->pl will always be non-NULL. The EEE methods are only used for user ports, so checking for dp->pl to be NULL makes no sense. No other phylink-calling method implements similar checks in DSA. Remove this unnecessary check. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/E1tL13z-006cZ7-BZ@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
be325f08c4 |
rtnetlink: add ndo_fdb_dump_context
rtnl_fdb_dump() and various ndo_fdb_dump() helpers share a hidden layout of cb->ctx. Before switching rtnl_fdb_dump() to for_each_netdev_dump() in the following patch, make this more explicit. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241209100747.2269613-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
f12b363887 |
net: dsa: use ethtool string helpers
These are the preferred way to copy ethtool strings. Avoids incrementing pointers all over the place. Signed-off-by: Rosen Penev <rosenp@gmail.com> (for hellcreek driver) Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Link: https://patch.msgid.link/20241028044828.1639668-1-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
3535d70df9 |
net: dsa: allow matchall mirroring rules towards the CPU
If the CPU bandwidth capacity permits, it may be useful to mirror the entire ingress of a user port to software. This is in fact possible to express even if there is no net_device representation for the CPU port. In fact, that approach was already exhausted and that representation wouldn't have even helped [1]. The idea behind implementing this is that currently, we refuse to offload any mirroring towards a non-DSA target net_device. But if we acknowledge the fact that to reach any foreign net_device, the switch must send the packet to the CPU anyway, then we can simply offload just that part, and let the software do the rest. There is only one condition we need to uphold: the filter needs to be present in the software data path as well (no skip_sw). There are 2 actions to consider: FLOW_ACTION_MIRRED (redirect to egress of target interface) and FLOW_ACTION_MIRRED_INGRESS (redirect to ingress of target interface). We don't have the ability/API to offload FLOW_ACTION_MIRRED_INGRESS when the target port is also a DSA user port, but we could also permit that through mirred to the CPU + software. Example: $ ip link add dummy0 type dummy; ip link set dummy0 up $ tc qdisc add dev swp0 clsact $ tc filter add dev swp0 ingress matchall action mirred ingress mirror dev dummy0 Any DSA driver with a ds->ops->port_mirror_add() implementation can now make use of this with no additional change. [1] https://lore.kernel.org/netdev/20191002233750.13566-1-olteanv@gmail.com/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241023135251.1752488-6-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
4cc4394a89 |
net: dsa: add more extack messages in dsa_user_add_cls_matchall_mirred()
Do not leave -EOPNOTSUPP errors without an explanation. It is confusing for the user to figure out what is wrong otherwise. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241023135251.1752488-5-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
c11ace14d9 |
net: dsa: use "extack" as argument to flow_action_basic_hw_stats_check()
We already have an "extack" stack variable in dsa_user_add_cls_matchall_police() and dsa_user_add_cls_matchall_mirred(), there is no need to retrieve it again from cls->common.extack. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241023135251.1752488-4-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
a0af7162cc |
net: dsa: clean up dsa_user_add_cls_matchall()
The body is a bit hard to read, hard to extend, and has duplicated conditions. Clean up the "if (many conditions) else if (many conditions, some of them repeated)" pattern by: - Moving the repeated conditions out - Replacing the repeated tests for the same variable with a switch/case - Moving the protocol check inside the dsa_user_add_cls_matchall_mirred() function call. This is pure refactoring, no logic has been changed, though some tests were reordered. The order does not matter - they are independent things to be tested for. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241023135251.1752488-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
d5020cb41e |
net: dsa: replace devlink resource registration calls by devl_ variants
Replace devlink_resource_register(), devlink_resource_occ_get_register(), and devlink_resource_occ_get_unregister() calls by respective devl_* variants. Mentioned functions have no direct users in any drivers, and are going to be removed in subsequent patches. Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/20241023131248.27192-6-przemyslaw.kitszel@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
e44ef3f66c |
netpoll: remove ndo_netpoll_setup() second argument
npinfo is not used in any of the ndo_netpoll_setup() methods. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241018052108.2610827-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
|
|
|
ecb595ebba |
net: dsa: remove dsa_port_phylink_mac_select_pcs()
There is no longer any reason to implement the mac_select_pcs() callback in DSA. Returning ERR_PTR(-EOPNOTSUPP) is functionally equivalent to not providing the function. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Andrew Lunn <andrew@lunn.ch> |
|
|
|
9c0fc36ec4 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc3). No conflicts and no adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
8c924369cb |
net: dsa: refuse cross-chip mirroring operations
In case of a tc mirred action from one switch to another, the behavior
is not correct. We simply tell the source switch driver to program a
mirroring entry towards mirror->to_local_port = to_dp->index, but it is
not even guaranteed that the to_dp belongs to the same switch as dp.
For proper cross-chip support, we would need to go through the
cross-chip notifier layer in switch.c, program the entry on cascade
ports, and introduce new, explicit API for cross-chip mirroring, given
that intermediary switches should have introspection into the DSA tags
passed through the cascade port (and not just program a port mirror on
the entire cascade port). None of that exists today.
Reject what is not implemented so that user space is not misled into
thinking it works.
Fixes:
|
|
|
|
5397706165 |
net: dsa: remove obsolete phylink dsa_switch operations
No driver now uses the DSA switch phylink members, so we can now remove the method pointers, but we need to leave empty shim functions to allow those drivers that do not provide phylink MAC operations structure to continue functioning. Signed-off-by: Russell King (oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # sja1105, felix, dsa_loop Link: https://patch.msgid.link/E1swKNV-0060oN-1b@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
6c24a03a61 |
net: dsa: improve shutdown sequence
Alexander Sverdlin presents 2 problems during shutdown with the
lan9303 driver. One is specific to lan9303 and the other just happens
to reproduce there.
The first problem is that lan9303 is unique among DSA drivers in that it
calls dev_get_drvdata() at "arbitrary runtime" (not probe, not shutdown,
not remove):
phy_state_machine()
-> ...
-> dsa_user_phy_read()
-> ds->ops->phy_read()
-> lan9303_phy_read()
-> chip->ops->phy_read()
-> lan9303_mdio_phy_read()
-> dev_get_drvdata()
But we never stop the phy_state_machine(), so it may continue to run
after dsa_switch_shutdown(). Our common pattern in all DSA drivers is
to set drvdata to NULL to suppress the remove() method that may come
afterwards. But in this case it will result in an NPD.
The second problem is that the way in which we set
dp->conduit->dsa_ptr = NULL; is concurrent with receive packet
processing. dsa_switch_rcv() checks once whether dev->dsa_ptr is NULL,
but afterwards, rather than continuing to use that non-NULL value,
dev->dsa_ptr is dereferenced again and again without NULL checks:
dsa_conduit_find_user() and many other places. In between dereferences,
there is no locking to ensure that what was valid once continues to be
valid.
Both problems have the common aspect that closing the conduit interface
solves them.
In the first case, dev_close(conduit) triggers the NETDEV_GOING_DOWN
event in dsa_user_netdevice_event() which closes user ports as well.
dsa_port_disable_rt() calls phylink_stop(), which synchronously stops
the phylink state machine, and ds->ops->phy_read() will thus no longer
call into the driver after this point.
In the second case, dev_close(conduit) should do this, as per
Documentation/networking/driver.rst:
| Quiescence
| ----------
|
| After the ndo_stop routine has been called, the hardware must
| not receive or transmit any data. All in flight packets must
| be aborted. If necessary, poll or wait for completion of
| any reset commands.
So it should be sufficient to ensure that later, when we zeroize
conduit->dsa_ptr, there will be no concurrent dsa_switch_rcv() call
on this conduit.
The addition of the netif_device_detach() function is to ensure that
ioctls, rtnetlinks and ethtool requests on the user ports no longer
propagate down to the driver - we're no longer prepared to handle them.
The race condition actually did not exist when commit
|
|
|
|
3f464b193d |
net: dsa: microchip: update tag_ksz masks for KSZ9477 family
Remove magic number 7 by introducing a GENMASK macro instead. Remove magic number 0x80 by using the BIT macro instead. Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240909134301.75448-1-vtpieter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
00d066a4d4 |
netdev_features: convert NETIF_F_LLTX to dev->lltx
NETIF_F_LLTX can't be changed via Ethtool and is not a feature, rather an attribute, very similar to IFF_NO_QUEUE (and hot). Free one netdev_features_t bit and make it a "hot" private flag. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
|
|
|
761d527d5d |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt.h |
|
|
|
6f2b72c04d |
net: dsa: microchip: fix tag_ksz egress mask for KSZ8795 family
Fix the tag_ksz egress mask for DSA_TAG_PROTO_KSZ8795, the port is encoded in the two and not three LSB. This fix is for completeness, for example the bug doesn't manifest itself on the KSZ8794 because bit 2 seems to be always zero. Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Link: https://patch.msgid.link/20240813142750.772781-7-vtpieter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
93e4649efa |
net: dsa: provide a software untagging function on RX for VLAN-aware bridges
Through code analysis, I realized that the ds->untag_bridge_pvid logic
is contradictory - see the newly added FIXME above the kernel-doc for
dsa_software_untag_vlan_unaware_bridge().
Moreover, for the Felix driver, I need something very similar, but which
is actually _not_ contradictory: untag the bridge PVID on RX, but for
VLAN-aware bridges. The existing logic does it for VLAN-unaware bridges.
Since I don't want to change the functionality of drivers which were
supposedly properly tested with the ds->untag_bridge_pvid flag, I have
introduced a new one: ds->untag_vlan_aware_bridge_pvid, and I have
refactored the DSA reception code into a common path for both flags.
TODO: both flags should be unified under a single ds->software_vlan_untag,
which users of both current flags should set. This is not something that
can be carried out right away. It needs very careful examination of all
drivers which make use of this functionality, since some of them
actually get this wrong in the first place.
For example, commit
|
|
|
|
67c3ca2c5c |
net: mscc: ocelot: use ocelot_xmit_get_vlan_info() also for FDMA and register injection
Problem description ------------------- On an NXP LS1028A (felix DSA driver) with the following configuration: - ocelot-8021q tagging protocol - VLAN-aware bridge (with STP) spanning at least swp0 and swp1 - 8021q VLAN upper interfaces on swp0 and swp1: swp0.700, swp1.700 - ptp4l on swp0.700 and swp1.700 we see that the ptp4l instances do not see each other's traffic, and they all go to the grand master state due to the ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES condition. Jumping to the conclusion for the impatient ------------------------------------------- There is a zero-day bug in the ocelot switchdev driver in the way it handles VLAN-tagged packet injection. The correct logic already exists in the source code, in function ocelot_xmit_get_vlan_info() added by commit |
|
|
|
30b3560050 |
Merge branch 'net-make-timestamping-selectable'
First part of "net: Make timestamping selectable" from Kory Maincent. Change the driver-facing type already to lower rebasing pain. Link: https://lore.kernel.org/20240709-feature_ptp_netnext-v17-0-b5317f50df2a@bootlin.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
2111375b85 |
net: Add struct kernel_ethtool_ts_info
In prevision to add new UAPI for hwtstamp we will be limited to the struct ethtool_ts_info that is currently passed in fixed binary format through the ETHTOOL_GET_TS_INFO ethtool ioctl. It would be good if new kernel code already started operating on an extensible kernel variant of that structure, similar in concept to struct kernel_hwtstamp_config vs struct hwtstamp_config. Since struct ethtool_ts_info is in include/uapi/linux/ethtool.h, here we introduce the kernel-only structure in include/linux/ethtool.h. The manual copy is then made in the function called by ETHTOOL_GET_TS_INFO. Acked-by: Shannon Nelson <shannon.nelson@amd.com> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20240709-feature_ptp_netnext-v17-6-b5317f50df2a@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
85aabd1fe9 |
net: dsa: prepare 'dsa_tag_8021q_bridge_join' for standalone use
The 'dsa_tag_8021q_bridge_join' could be used as a generic implementation of the 'ds->ops->port_bridge_join()' function. However, it is necessary to synchronize their arguments. This patch also moves the 'tx_fwd_offload' flag configuration line into 'dsa_tag_8021q_bridge_join' body. Currently, every (sja1105) driver sets it, and the future vsc73xx implementation will also need it for simplification. Suggested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/20240713211620.1125910-11-paweldembicki@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
6c87e1a479 |
net: dsa: vsc73xx: introduce tag 8021q for vsc73xx
This commit introduces a new tagger based on 802.1q tagging. It's designed for the vsc73xx driver. The VSC73xx family doesn't have any tag support for the RGMII port, but it could be based on VLANs. Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://patch.msgid.link/20240713211620.1125910-8-paweldembicki@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
d124cf54df |
net: dsa: tag_sja1105: refactor skb->dev assignment to dsa_tag_8021q_find_user()
A new tagging protocol implementation based on tag_8021q is on the horizon, and it appears that it also has to open-code the complicated logic of finding a source port based on a VLAN header. Create a single dsa_tag_8021q_find_user() and make sja1105 call it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240713211620.1125910-7-paweldembicki@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
823e5cc141 |
net: dsa: tag_sja1105: prefer precise source port info on SJA1110 too
Now that dsa_8021q_rcv() handles better the case where we don't overwrite the precise source information if it comes from an external (non-tag_8021q) source, we can now unify the call sequence between sja1105_rcv() and sja1110_rcv(). This is a preparatory change for creating a higher-level wrapper for the entire sequence which will live in tag_8021q. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/20240713211620.1125910-6-paweldembicki@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
0064b863ab |
net: dsa: tag_sja1105: absorb entire sja1105_vlan_rcv() into dsa_8021q_rcv()
tag_sja1105 has a wrapper over dsa_8021q_rcv(): sja1105_vlan_rcv(), which determines whether the packet came from a bridge with vlan_filtering=1 (the case resolved via dsa_find_designated_bridge_port_by_vid()), or if it contains a tag_8021q header. Looking at a new tagger implementation for vsc73xx, based also on tag_8021q, it is becoming clear that the logic is needed there as well. So instead of forcing each tagger to wrap around dsa_8021q_rcv(), let's merge the logic into the core. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Tested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Link: https://patch.msgid.link/20240713211620.1125910-5-paweldembicki@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
dcfe767378 |
net: dsa: tag_sja1105: absorb logic for not overwriting precise info into dsa_8021q_rcv()
In both sja1105_rcv() and sja1110_rcv(), we may have precise source port information coming from parallel hardware mechanisms, in addition to the tag_8021q header. Only sja1105_rcv() has extra logic to not overwrite that precise info with what's present in the VLAN tag. This is because sja1110_rcv() gets by, by having a reversed set of checks when assigning skb->dev. When the source port is imprecise (vbid >=1), source_port and switch_id will be set to zeroes by dsa_8021q_rcv(), which might be problematic. But by checking for vbid >= 1 first, sja1110_rcv() fends that off. We would like to make more code common between sja1105_rcv() and sja1110_rcv(), and for that, we need to make sure that sja1110_rcv() also goes through the precise source port preservation logic. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Tested-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/20240713211620.1125910-4-paweldembicki@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
983e44f0ee |
net: dsa: Fix typo in NET_DSA_TAG_RTL4_A Kconfig
Fix a minor typo in the help text for the NET_DSA_TAG_RTL4_A config option. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> Link: https://lore.kernel.org/r/20240607020843.1380735-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
eef8e906ae |
net: dsa: update the unicast MAC address when changing conduit
When changing DSA user interface conduit while the user interface is up, DSA exhibits different behavior in comparison to when the interface is down. This different behavior concerns the primary unicast MAC address stored in the port standalone FDB and in the conduit device UC database. If we put a switch port down while changing the conduit with ip link set sw0p0 down ip link set sw0p0 type dsa conduit conduit1 ip link set sw0p0 up we delete the address in dsa_user_close() and install the (possibly different) address in dsa_user_open(). But when changing the conduit on the fly, the old address is not deleted and the new one is not installed. Since we explicitly want to support live-changing the conduit, uninstall the old address before calling dsa_port_assign_conduit() and install the (possibly different) new address after the call. Because conduit change might also trigger address change (the user interface is supposed to inherit the conduit interface MAC address if no address is defined in hardware (dp->mac is a zero address)), move the eth_hw_addr_inherit() call from dsa_user_change_conduit() to dsa_port_change_conduit(), just before installing the new address. Although this is in theory a flaw in DSA core, it needs not be backported, since there is currently no DSA driver that can be affected by this. The only DSA driver that supports changing conduit is felix, and, as explained by Vladimir Oltean [1]: There are 2 reasons why with felix the bug does not manifest itself. First is because both the 'ocelot' and the alternate 'ocelot-8021q' tagging protocols have the 'promisc_on_conduit = true' flag. So the unicast address doesn't have to be in the conduit's RX filter - neither the old or the new conduit. Second, dsa_user_host_uc_install() theoretically leaves behind host FDB entries installed towards the wrong (old) CPU port. But in felix_fdb_add(), we treat any FDB entry requested towards any CPU port as if it was a multicast FDB entry programmed towards _all_ CPU ports. For that reason, it is installed towards the port mask of the PGID_CPU port group ID: if (dsa_port_is_cpu(dp)) port = PGID_CPU; Therefore no Fixes tag for this change. [1] https://lore.kernel.org/netdev/20240507201827.47suw4fwcjrbungy@skbuf/ Signed-off-by: Marek Behún <kabel@kernel.org> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Tested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
77f7541248 |
net: dsa: deduplicate code adding / deleting the port address to fdb
The sequence
if (dsa_switch_supports_uc_filtering(ds))
dsa_port_standalone_host_fdb_add(dp, addr, 0);
if (!ether_addr_equal(addr, conduit->dev_addr))
dev_uc_add(conduit, addr);
is executed both in dsa_user_open() and dsa_user_set_mac_addr().
Its reverse is executed both in dsa_user_close() and
dsa_user_set_mac_addr().
Refactor these sequences into new functions dsa_user_host_uc_install()
and dsa_user_host_uc_uninstall().
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
bbb31b7ae1 |
net: dsa: remove mac_prepare()/mac_finish() shims
No DSA driver makes use of the mac_prepare()/mac_finish() shimmed operations anymore, so we can remove these. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/E1sByNx-00ELW1-Vp@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
|
|
|
2c92ca849f |
tracing/treewide: Remove second parameter of __assign_str()
With the rework of how the __string() handles dynamic strings where it
saves off the source string in field in the helper structure[1], the
assignment of that value to the trace event field is stored in the helper
value and does not need to be passed in again.
This means that with:
__string(field, mystring)
Which use to be assigned with __assign_str(field, mystring), no longer
needs the second parameter and it is unused. With this, __assign_str()
will now only get a single parameter.
There's over 700 users of __assign_str() and because coccinelle does not
handle the TRACE_EVENT() macro I ended up using the following sed script:
git grep -l __assign_str | while read a ; do
sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file;
mv /tmp/test-file $a;
done
I then searched for __assign_str() that did not end with ';' as those
were multi line assignments that the sed script above would fail to catch.
Note, the same updates will need to be done for:
__assign_str_len()
__assign_rel_str()
__assign_rel_str_len()
I tested this with both an allmodconfig and an allyesconfig (build only for both).
[1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/
Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts.
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for
Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Darrick J. Wong <djwong@kernel.org> # xfs
Tested-by: Guenter Roeck <linux@roeck-us.net>
|
|
|
|
5f5109af47 |
net: dsa: add support switches global DSCP priority mapping
Some switches like Microchip KSZ variants do not support per port DSCP priority configuration. Instead there is a global DSCP mapping table. To handle it, we will accept set/del request to any of user ports to make global configuration and update dcb app entries for all other ports. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
|
|
96c6f33795 |
net: dsa: add support for DCB get/set apptrust configuration
Add DCB support to get/set trust configuration for different packet priority information sources. Some switch allow to chose different source of packet priority classification. For example on KSZ switches it is possible to configure VLAN PCP and/or DSCP sources. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> |