linux

Commit Graph

Author	SHA1	Message	Date
Marek Behún	e8f0dc024c	net: dsa: realtek: Fix LED group port bit for non-zero LED group The rtl8366rb_led_group_port_mask() function always returns LED port bit in LED group 0; the switch statement returns the same thing in all non-default cases. This means that the driver does not currently support configuring LEDs in non-zero LED groups. Fix this. Fixes: `32d6170054` ("net: dsa: realtek: add LED drivers for rtl8366rb") Signed-off-by: Marek Behún <kabel@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260311111237.29002-1-kabel@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-11 19:03:21 -07:00
Mehul Rao	6c5a9baa15	tipc: fix divide-by-zero in tipc_sk_filter_connect() A user can set conn_timeout to any value via setsockopt(TIPC_CONN_TIMEOUT), including values less than 4. When a SYN is rejected with TIPC_ERR_OVERLOAD and the retry path in tipc_sk_filter_connect() executes: delay %= (tsk->conn_timeout / 4); If conn_timeout is in the range [0, 3], the integer division yields 0, and the modulo operation triggers a divide-by-zero exception, causing a kernel oops/panic. Fix this by clamping conn_timeout to a minimum of 4 at the point of use in tipc_sk_filter_connect(). Oops: divide error: 0000 [#1] SMP KASAN NOPTI CPU: 0 UID: 0 PID: 119 Comm: poc-F144 Not tainted 7.0.0-rc2+ RIP: 0010:tipc_sk_filter_rcv (net/tipc/socket.c:2236 net/tipc/socket.c:2362) Call Trace: tipc_sk_backlog_rcv (include/linux/instrumented.h:82 include/linux/atomic/atomic-instrumented.h:32 include/net/sock.h:2357 net/tipc/socket.c:2406) __release_sock (include/net/sock.h:1185 net/core/sock.c:3213) release_sock (net/core/sock.c:3797) tipc_connect (net/tipc/socket.c:2570) __sys_connect (include/linux/file.h:62 include/linux/file.h:83 net/socket.c:2098) Fixes: `6787927475` ("tipc: buffer overflow handling in listener socket") Cc: stable@vger.kernel.org Signed-off-by: Mehul Rao <mehulrao@gmail.com> Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech> Link: https://patch.msgid.link/20260310170730.28841-1-mehulrao@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-11 18:56:28 -07:00
Bastien Curutchet (Schneider Electric)	99c8c16a4a	net: dsa: microchip: Fix error path in PTP IRQ setup If request_threaded_irq() fails during the PTP message IRQ setup, the newly created IRQ mapping is never disposed. Indeed, the ksz_ptp_irq_setup()'s error path only frees the mappings that were successfully set up. Dispose the newly created mapping if the associated request_threaded_irq() fails at setup. Cc: stable@vger.kernel.org Fixes: `d0b8fec8ae` ("net: dsa: microchip: Fix symetry in ksz_ptp_msg_irq_{setup/free}()") Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/20260309-ksz-ptp-irq-fix-v1-1-757b3b985955@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-11 18:06:22 -07:00
Jakub Kicinski	20c1be4cc8	Merge branch 'net-bpf-nd_tbl-fixes-for-when-ipv6-disable-1' Ricardo B. Marlière says: ==================== {net,bpf}: nd_tbl fixes for when ipv6.disable=1 Please consider merging these four patches to fix three crashes that were found after this report: https://lore.kernel.org/all/CAHXs0ORzd62QOG-Fttqa2Cx_A_VFp=utE2H2VTX5nqfgs7LDxQ@mail.gmail.com The first patch from Jakub Kicinski is a preparation in order to enable the use ipv6_mod_enabled() even when CONFIG_IPV6=n. ==================== Link: https://patch.msgid.link/20260307-net-nd_tbl_fixes-v4-0-e2677e85628c@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-11 17:53:41 -07:00
Ricardo B. Marlière	d56b5d1634	bpf: bpf_out_neigh_v6: Fix nd_tbl NULL dereference when IPv6 is disabled When booting with the 'ipv6.disable=1' parameter, the nd_tbl is never initialized because inet6_init() exits before ndisc_init() is called which initializes it. If bpf_redirect_neigh() is called with explicit AF_INET6 nexthop parameters, __bpf_redirect_neigh_v6() can skip the IPv6 FIB lookup and call bpf_out_neigh_v6() directly. bpf_out_neigh_v6() then calls ip_neigh_gw6(), which uses ipv6_stub->nd_tbl. BUG: kernel NULL pointer dereference, address: 0000000000000248 Oops: Oops: 0000 [#1] SMP NOPTI RIP: 0010:skb_do_redirect+0x44f/0xf40 Call Trace: <TASK> ? srso_alias_return_thunk+0x5/0xfbef5 ? __tcf_classify.constprop.0+0x83/0x160 ? srso_alias_return_thunk+0x5/0xfbef5 ? tcf_classify+0x2b/0x50 ? srso_alias_return_thunk+0x5/0xfbef5 ? tc_run+0xb8/0x120 ? srso_alias_return_thunk+0x5/0xfbef5 __dev_queue_xmit+0x6fa/0x1000 ? srso_alias_return_thunk+0x5/0xfbef5 packet_sendmsg+0x10da/0x1700 ? srso_alias_return_thunk+0x5/0xfbef5 __sys_sendto+0x1f3/0x220 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x101/0xf80 ? exc_page_fault+0x6e/0x170 ? srso_alias_return_thunk+0x5/0xfbef5 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> Fix this by adding an early check in bpf_out_neigh_v6(). If IPv6 is disabled, drop the packet before neighbor lookup. Suggested-by: Fernando Fernandez Mancera <fmancera@suse.de> Fixes: `ba452c9e99` ("bpf: Fix bpf_redirect_neigh helper api to support supplying nexthop") Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/20260307-net-nd_tbl_fixes-v4-4-e2677e85628c@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-11 17:53:38 -07:00
Ricardo B. Marlière	dcb4e22314	bpf: bpf_out_neigh_v4: Fix nd_tbl NULL dereference when IPv6 is disabled When booting with the 'ipv6.disable=1' parameter, the nd_tbl is never initialized because inet6_init() exits before ndisc_init() is called which initializes it. If bpf_redirect_neigh() is called from tc with an explicit nexthop of nh_family == AF_INET6, bpf_out_neigh_v4() takes the AF_INET6 branch and calls ip_neigh_gw6(), which relies on ipv6_stub->nd_tbl. BUG: kernel NULL pointer dereference, address: 0000000000000248 Oops: Oops: 0000 [#1] SMP NOPTI RIP: 0010:skb_do_redirect+0xb93/0xf00 Call Trace: <TASK> ? srso_alias_return_thunk+0x5/0xfbef5 ? __tcf_classify.constprop.0+0x83/0x160 ? srso_alias_return_thunk+0x5/0xfbef5 ? tcf_classify+0x2b/0x50 ? srso_alias_return_thunk+0x5/0xfbef5 ? tc_run+0xb8/0x120 ? srso_alias_return_thunk+0x5/0xfbef5 __dev_queue_xmit+0x6fa/0x1000 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? alloc_skb_with_frags+0x58/0x200 packet_sendmsg+0x10da/0x1700 ? srso_alias_return_thunk+0x5/0xfbef5 __sys_sendto+0x1f3/0x220 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x101/0xf80 ? exc_page_fault+0x6e/0x170 ? srso_alias_return_thunk+0x5/0xfbef5 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> Fix this by adding an early check in the AF_INET6 branch of bpf_out_neigh_v4(). If IPv6 is disabled, unlock RCU and drop the packet. Suggested-by: Fernando Fernandez Mancera <fmancera@suse.de> Fixes: `ba452c9e99` ("bpf: Fix bpf_redirect_neigh helper api to support supplying nexthop") Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/20260307-net-nd_tbl_fixes-v4-3-e2677e85628c@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-11 17:53:38 -07:00
Ricardo B. Marlière	30021e969d	net: bonding: Fix nd_tbl NULL dereference when IPv6 is disabled When booting with the 'ipv6.disable=1' parameter, the nd_tbl is never initialized because inet6_init() exits before ndisc_init() is called which initializes it. If bonding ARP/NS validation is enabled, an IPv6 NS/NA packet received on a slave can reach bond_validate_na(), which calls bond_has_this_ip6(). That path calls ipv6_chk_addr() and can crash in __ipv6_chk_addr_and_flags(). BUG: kernel NULL pointer dereference, address: 00000000000005d8 Oops: Oops: 0000 [#1] SMP NOPTI RIP: 0010:__ipv6_chk_addr_and_flags+0x69/0x170 Call Trace: <IRQ> ipv6_chk_addr+0x1f/0x30 bond_validate_na+0x12e/0x1d0 [bonding] ? __pfx_bond_handle_frame+0x10/0x10 [bonding] bond_rcv_validate+0x1a0/0x450 [bonding] bond_handle_frame+0x5e/0x290 [bonding] ? srso_alias_return_thunk+0x5/0xfbef5 __netif_receive_skb_core.constprop.0+0x3e8/0xe50 ? srso_alias_return_thunk+0x5/0xfbef5 ? update_cfs_rq_load_avg+0x1a/0x240 ? srso_alias_return_thunk+0x5/0xfbef5 ? __enqueue_entity+0x5e/0x240 __netif_receive_skb_one_core+0x39/0xa0 process_backlog+0x9c/0x150 __napi_poll+0x30/0x200 ? srso_alias_return_thunk+0x5/0xfbef5 net_rx_action+0x338/0x3b0 handle_softirqs+0xc9/0x2a0 do_softirq+0x42/0x60 </IRQ> <TASK> __local_bh_enable_ip+0x62/0x70 __dev_queue_xmit+0x2d3/0x1000 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? packet_parse_headers+0x10a/0x1a0 packet_sendmsg+0x10da/0x1700 ? kick_pool+0x5f/0x140 ? srso_alias_return_thunk+0x5/0xfbef5 ? __queue_work+0x12d/0x4f0 __sys_sendto+0x1f3/0x220 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x101/0xf80 ? exc_page_fault+0x6e/0x170 ? srso_alias_return_thunk+0x5/0xfbef5 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> Fix this by checking ipv6_mod_enabled() before dispatching IPv6 packets to bond_na_rcv(). If IPv6 is disabled, return early from bond_rcv_validate() and avoid the path to ipv6_chk_addr(). Suggested-by: Fernando Fernandez Mancera <fmancera@suse.de> Fixes: `4e24be018e` ("bonding: add new parameter ns_targets") Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20260307-net-nd_tbl_fixes-v4-2-e2677e85628c@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-11 17:53:37 -07:00
Jakub Kicinski	94a4b1f959	ipv6: move the disable_ipv6_mod knob to core code From: Jakub Kicinski <kuba@kernel.org> Make sure disable_ipv6_mod itself is not part of the IPv6 module, in case core code wants to refer to it. We will remove support for IPv6=m soon, this change helps make fixes we commit before that less messy. Link: https://patch.msgid.link/20260307-net-nd_tbl_fixes-v4-1-e2677e85628c@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-11 17:53:37 -07:00
Nicolai Buchwitz	908c344d5c	net: bcmgenet: fix broken EEE by converting to phylib-managed state The bcmgenet EEE implementation is broken in several ways. phy_support_eee() is never called, so the PHY never advertises EEE and phylib never sets phydev->enable_tx_lpi. bcmgenet_mac_config() checks priv->eee.eee_enabled to decide whether to enable the MAC LPI logic, but that field is never initialised to true, so the MAC never enters Low Power Idle even when EEE is negotiated - wasting the power savings EEE is designed to provide. The only way to get EEE working at all is a manual 'ethtool --set-eee eth0 eee on' after every link-up, and even then bcmgenet_get_eee() immediately clobbers the reported state because phy_ethtool_get_eee() overwrites eee_enabled and tx_lpi_enabled with the uninitialised PHY eee_cfg values. Finally, bcmgenet_mac_config() is only called on link-up, so EEE is never disabled in hardware on link-down. Fix all of this by removing the MAC-side EEE state tracking (priv->eee) and aligning with the pattern used by other non-phylink MAC drivers such as FEC. Call phy_support_eee() in bcmgenet_mii_probe() so the PHY advertises EEE link modes and phylib tracks negotiation state. Move the EEE hardware control to bcmgenet_mii_setup(), which is called on every link event, and drive it directly from phydev->enable_tx_lpi - the flag phylib sets when EEE is negotiated and the user has not disabled it. This enables EEE automatically once the link partner agrees and disables it cleanly on link-down. Make bcmgenet_get_eee() and bcmgenet_set_eee() pure passthroughs to phy_ethtool_get_eee() and phy_ethtool_set_eee(), with the MAC hardware register read/written for tx_lpi_timer. Drop struct ethtool_keee eee from struct bcmgenet_priv. Fixes: `fe0d4fd928` ("net: phy: Keep track of EEE configuration") Link: https://lore.kernel.org/netdev/d352039f-4cbb-41e6-9aeb-0b4f3941b54c@lunn.ch/ Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20260310054935.1238594-1-nb@tipi-net.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-10 19:41:13 -07:00
Paul Moses	57885276cc	net-shapers: don't free reply skb after genlmsg_reply() genlmsg_reply() hands the reply skb to netlink, and netlink_unicast() consumes it on all return paths, whether the skb is queued successfully or freed on an error path. net_shaper_nl_get_doit() and net_shaper_nl_cap_get_doit() currently jump to free_msg after genlmsg_reply() fails and call nlmsg_free(msg), which can hit the same skb twice. Return the genlmsg_reply() error directly and keep free_msg only for pre-reply failures. Fixes: `4b623f9f0f` ("net-shapers: implement NL get operation") Fixes: `553ea9f1ef` ("net: shaper: implement introspection support") Cc: stable@vger.kernel.org Signed-off-by: Paul Moses <p@1g4.org> Link: https://patch.msgid.link/20260309173450.538026-2-p@1g4.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-10 19:29:09 -07:00
Daniel Golle	f441b489cc	net: dsa: mxl862xx: don't set user_mii_bus The PHY addresses in the MII bus are not equal to the port addresses, so the bus cannot be assigned as user_mii_bus. Falling back on the user_mii_bus in case a PHY isn't declared in device tree will result in using the wrong (in this case: off-by-+1) PHY. Remove the wrong assignment. Fixes: `23794bec1c` ("net: dsa: add basic initial driver for MxL862xx switches") Suggested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/0f0df310fd8cab57e0e5e3d0831dd057fd05bcd5.1773103271.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-10 19:18:13 -07:00
Fan Wu	2503d08f8a	net: ethernet: arc: emac: quiesce interrupts before requesting IRQ Normal RX/TX interrupts are enabled later, in arc_emac_open(), so probe should not see interrupt delivery in the usual case. However, hardware may still present stale or latched interrupt status left by firmware or the bootloader. If probe later unwinds after devm_request_irq() has installed the handler, such a stale interrupt can still reach arc_emac_intr() during teardown and race with release of the associated net_device. Avoid that window by putting the device into a known quiescent state before requesting the IRQ: disable all EMAC interrupt sources and clear any pending EMAC interrupt status bits. This keeps the change hardware-focused and minimal, while preventing spurious IRQ delivery from leftover state. Fixes: `e4f2379db6` ("ethernet/arc/arc_emac - Add new driver") Cc: stable@vger.kernel.org Signed-off-by: Fan Wu <fanwu01@zju.edu.cn> Link: https://patch.msgid.link/20260309132409.584966-1-fanwu01@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-10 19:05:12 -07:00
Jakub Kicinski	28b225282d	page_pool: store detach_time as ktime_t to avoid false-negatives While testing other changes in vng I noticed that nl_netdev.page_pool_check flakes. This never happens in real CI. Turns out vng may boot and get to that test in less than a second. page_pool_detached() records the detach time in seconds, so if vng is fast enough detach time is set to 0. Other code treats 0 as "not detached". detach_time is only used to report the state to the user, so it's not a huge deal in practice but let's fix it. Store the raw ktime_t (nanoseconds) instead. A nanosecond value of 0 is practically impossible. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Fixes: `69cb4952b6` ("net: page_pool: report when page pool was destroyed") Link: https://patch.msgid.link/20260310003907.3540019-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-10 19:03:34 -07:00
Kevin Hao	881a0263d5	net: macb: Shuffle the tx ring before enabling tx Quanyang observed that when using an NFS rootfs on an AMD ZynqMp board, the rootfs may take an extended time to recover after a suspend. Upon investigation, it was determined that the issue originates from a problem in the macb driver. According to the Zynq UltraScale TRM [1], when transmit is disabled, the transmit buffer queue pointer resets to point to the address specified by the transmit buffer queue base address register. In the current implementation, the code merely resets `queue->tx_head` and `queue->tx_tail` to '0'. This approach presents several issues: - Packets already queued in the tx ring are silently lost, leading to memory leaks since the associated skbs cannot be released. - Concurrent write access to `queue->tx_head` and `queue->tx_tail` may occur from `macb_tx_poll()` or `macb_start_xmit()` when these values are reset to '0'. - The transmission may become stuck on a packet that has already been sent out, with its 'TX_USED' bit set, but has not yet been processed. However, due to the manipulation of 'queue->tx_head' and 'queue->tx_tail', `macb_tx_poll()` incorrectly assumes there are no packets to handle because `queue->tx_head == queue->tx_tail`. This issue is only resolved when a new packet is placed at this position. This is the root cause of the prolonged recovery time observed for the NFS root filesystem. To resolve this issue, shuffle the tx ring and tx skb array so that the first unsent packet is positioned at the start of the tx ring. Additionally, ensure that updates to `queue->tx_head` and `queue->tx_tail` are properly protected with the appropriate lock. [1] https://docs.amd.com/v/u/en-US/ug1085-zynq-ultrascale-trm Fixes: `bf9cf80cab` ("net: macb: Fix tx/rx malfunction after phy link down and up") Reported-by: Quanyang Wang <quanyang.wang@windriver.com> Signed-off-by: Kevin Hao <haokexin@gmail.com> Cc: stable@vger.kernel.org Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260307-zynqmp-v2-1-6ef98a70e1d0@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-10 17:36:44 -07:00
Paolo Abeni	73aefba4e2	linux-can-fixes-for-7.0-20260310 -----BEGIN PGP SIGNATURE----- iIkEABYKADEWIQSl+MghEFFAdY3pYJLMOmT6rpmt0gUCaa/u1RMcbWtsQHBlbmd1 dHJvbml4LmRlAAoJEMw6ZPquma3SmsABAJR3wGHrBuQ5QmhLPSmDKsNPMM2so9v9 AW4bGswHf+ReAP99CKH9aNvEIm43/Ap55lTLMhuBEKyYMS1e6SXFN572Dw== =nBZ9 -----END PGP SIGNATURE----- Merge tag 'linux-can-fixes-for-7.0-20260310' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2026-03-10 this is a pull request of 2 patches for net/main. Haibo Chen's patch fixes the maximum allowed bit rate error, which was broken in v6.19. Wenyuan Li contributes a patch for the hi311x driver that adds missing error checking in the caller of the hi3110_power_enable() function, hi3110_open(). linux-can-fixes-for-7.0-20260310 * tag 'linux-can-fixes-for-7.0-20260310' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: hi311x: hi3110_open(): add check for hi3110_power_enable() return value can: dev: keep the max bitrate error at 5% ==================== Link: https://patch.msgid.link/20260310103547.2299403-1-mkl@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-10 15:13:55 +01:00
Weiming Shi	6f1a9140ec	net: add xmit recursion limit to tunnel xmit functions Tunnel xmit functions (iptunnel_xmit, ip6tunnel_xmit) lack their own recursion limit. When a bond device in broadcast mode has GRE tap interfaces as slaves, and those GRE tunnels route back through the bond, multicast/broadcast traffic triggers infinite recursion between bond_xmit_broadcast() and ip_tunnel_xmit()/ip6_tnl_xmit(), causing kernel stack overflow. The existing XMIT_RECURSION_LIMIT (8) in the no-qdisc path is not sufficient because tunnel recursion involves route lookups and full IP output, consuming much more stack per level. Use a lower limit of 4 (IP_TUNNEL_RECURSION_LIMIT) to prevent overflow. Add recursion detection using dev_xmit_recursion helpers directly in iptunnel_xmit() and ip6tunnel_xmit() to cover all IPv4/IPv6 tunnel paths including UDP encapsulated tunnels (VXLAN, Geneve, etc.). Move dev_xmit_recursion helpers from net/core/dev.h to public header include/linux/netdevice.h so they can be used by tunnel code. BUG: KASAN: stack-out-of-bounds in blake2s.constprop.0+0xe7/0x160 Write of size 32 at addr ffff88810033fed0 by task kworker/0:1/11 Workqueue: mld mld_ifc_work Call Trace: <TASK> __build_flow_key.constprop.0 (net/ipv4/route.c:515) ip_rt_update_pmtu (net/ipv4/route.c:1073) iptunnel_xmit (net/ipv4/ip_tunnel_core.c:84) ip_tunnel_xmit (net/ipv4/ip_tunnel.c:847) gre_tap_xmit (net/ipv4/ip_gre.c:779) dev_hard_start_xmit (net/core/dev.c:3887) sch_direct_xmit (net/sched/sch_generic.c:347) __dev_queue_xmit (net/core/dev.c:4802) bond_dev_queue_xmit (drivers/net/bonding/bond_main.c:312) bond_xmit_broadcast (drivers/net/bonding/bond_main.c:5279) bond_start_xmit (drivers/net/bonding/bond_main.c:5530) dev_hard_start_xmit (net/core/dev.c:3887) __dev_queue_xmit (net/core/dev.c:4841) ip_finish_output2 (net/ipv4/ip_output.c:237) ip_output (net/ipv4/ip_output.c:438) iptunnel_xmit (net/ipv4/ip_tunnel_core.c:86) gre_tap_xmit (net/ipv4/ip_gre.c:779) dev_hard_start_xmit (net/core/dev.c:3887) sch_direct_xmit (net/sched/sch_generic.c:347) __dev_queue_xmit (net/core/dev.c:4802) bond_dev_queue_xmit (drivers/net/bonding/bond_main.c:312) bond_xmit_broadcast (drivers/net/bonding/bond_main.c:5279) bond_start_xmit (drivers/net/bonding/bond_main.c:5530) dev_hard_start_xmit (net/core/dev.c:3887) __dev_queue_xmit (net/core/dev.c:4841) ip_finish_output2 (net/ipv4/ip_output.c:237) ip_output (net/ipv4/ip_output.c:438) iptunnel_xmit (net/ipv4/ip_tunnel_core.c:86) ip_tunnel_xmit (net/ipv4/ip_tunnel.c:847) gre_tap_xmit (net/ipv4/ip_gre.c:779) dev_hard_start_xmit (net/core/dev.c:3887) sch_direct_xmit (net/sched/sch_generic.c:347) __dev_queue_xmit (net/core/dev.c:4802) bond_dev_queue_xmit (drivers/net/bonding/bond_main.c:312) bond_xmit_broadcast (drivers/net/bonding/bond_main.c:5279) bond_start_xmit (drivers/net/bonding/bond_main.c:5530) dev_hard_start_xmit (net/core/dev.c:3887) __dev_queue_xmit (net/core/dev.c:4841) mld_sendpack mld_ifc_work process_one_work worker_thread </TASK> Fixes: `745e20f1b6` ("net: add a recursion limit in xmit path") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Link: https://patch.msgid.link/20260306160133.3852900-2-bestswngs@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-10 13:30:30 +01:00
Paolo Abeni	3228835877	Merge branch 'amd-xgbe-rx-adaptation-and-phy-handling-fixes' Raju Rangoju says: ==================== amd-xgbe: RX adaptation and PHY handling fixes This series fixes several issues in the amd-xgbe driver related to RX adaptation and PHY handling in 10GBASE-KR mode, particularly when auto-negotiation is disabled. Patch 1 fixes link status handling during RX adaptation by correctly reading the latched link status bit so transient link drops are detected without losing the current state. Patch 2 prevents CRC errors that can occur when performing RX adaptation with auto-negotiation turned off. The driver now stops TX/RX before re-triggering RX adaptation and only re-enables traffic once adaptation completes and the link is confirmed up, ensuring packets are not corrupted during the adaptation window. Patch 3 restores the intended ordering of PHY reset relative to phy_start(), making sure PHY settings are reset before the PHY is started instead of afterwards. ==================== Link: https://patch.msgid.link/20260306111629.1515676-1-Raju.Rangoju@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-10 12:07:09 +01:00
Raju Rangoju	a8ba129af4	amd-xgbe: reset PHY settings before starting PHY commit `f93505f357` ("amd-xgbe: let the MAC manage PHY PM") moved xgbe_phy_reset() from xgbe_open() to xgbe_start(), placing it after phy_start(). As a result, the PHY settings were being reset after the PHY had already started. Reorder the calls so that the PHY settings are reset before phy_start() is invoked. Fixes: `f93505f357` ("amd-xgbe: let the MAC manage PHY PM") Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Link: https://patch.msgid.link/20260306111629.1515676-4-Raju.Rangoju@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-10 12:07:07 +01:00
Raju Rangoju	27a4dd0c70	amd-xgbe: prevent CRC errors during RX adaptation with AN disabled When operating in 10GBASE-KR mode with auto-negotiation disabled and RX adaptation enabled, CRC errors can occur during the RX adaptation process. This happens because the driver continues transmitting and receiving packets while adaptation is in progress. Fix this by stopping TX/RX immediately when the link goes down and RX adaptation needs to be re-triggered, and only re-enabling TX/RX after adaptation completes and the link is confirmed up. Introduce a flag to track whether TX/RX was disabled for adaptation so it can be restored correctly. This prevents packets from being transmitted or received during the RX adaptation window and avoids CRC errors from corrupted frames. The flag tracking the data path state is synchronized with hardware state in xgbe_start() to prevent stale state after device restarts. This ensures that after a restart cycle (where xgbe_stop disables TX/RX and xgbe_start re-enables them), the flag correctly reflects that the data path is active. Fixes: `4f3b20bfbb` ("amd-xgbe: add support for rx-adaptation") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Link: https://patch.msgid.link/20260306111629.1515676-3-Raju.Rangoju@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-10 12:07:07 +01:00
Raju Rangoju	6485cb96be	amd-xgbe: fix link status handling in xgbe_rx_adaptation The link status bit is latched low to allow detection of momentary link drops. If the status indicates that the link is already down, read it again to obtain the current state. Fixes: `4f3b20bfbb` ("amd-xgbe: add support for rx-adaptation") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Link: https://patch.msgid.link/20260306111629.1515676-2-Raju.Rangoju@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-10 12:07:06 +01:00
Chengfeng Ye	7d86aa41c0	mctp: route: hold key->lock in mctp_flow_prepare_output() mctp_flow_prepare_output() checks key->dev and may call mctp_dev_set_key(), but it does not hold key->lock while doing so. mctp_dev_set_key() and mctp_dev_release_key() are annotated with __must_hold(&key->lock), so key->dev access is intended to be serialized by key->lock. The mctp_sendmsg() transmit path reaches mctp_flow_prepare_output() via mctp_local_output() -> mctp_dst_output() without holding key->lock, so the check-and-set sequence is racy. Example interleaving: CPU0 CPU1 ---- ---- mctp_flow_prepare_output(key, devA) if (!key->dev) // sees NULL mctp_flow_prepare_output( key, devB) if (!key->dev) // still NULL mctp_dev_set_key(devB, key) mctp_dev_hold(devB) key->dev = devB mctp_dev_set_key(devA, key) mctp_dev_hold(devA) key->dev = devA // overwrites devB Now both devA and devB references were acquired, but only the final key->dev value is tracked for release. One reference can be lost, causing a resource leak as mctp_dev_release_key() would only decrease the reference on one dev. Fix by taking key->lock around the key->dev check and mctp_dev_set_key() call. Fixes: `67737c4572` ("mctp: Pass flow data & flow release events to drivers") Signed-off-by: Chengfeng Ye <dg573847474@gmail.com> Link: https://patch.msgid.link/20260306031402.857224-1-dg573847474@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-10 11:38:36 +01:00
Jiayuan Chen	950803f725	bonding: fix type confusion in bond_setup_by_slave() kernel BUG at net/core/skbuff.c:2306! Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI RIP: 0010:pskb_expand_head+0xa08/0xfe0 net/core/skbuff.c:2306 RSP: 0018:ffffc90004aff760 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff88807e3c8780 RCX: ffffffff89593e0e RDX: ffff88807b7c4900 RSI: ffffffff89594747 RDI: ffff88807b7c4900 RBP: 0000000000000820 R08: 0000000000000005 R09: 0000000000000000 R10: 00000000961a63e0 R11: 0000000000000000 R12: ffff88807e3c8780 R13: 00000000961a6560 R14: dffffc0000000000 R15: 00000000961a63e0 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe1a0ed8df0 CR3: 000000002d816000 CR4: 00000000003526f0 Call Trace: <TASK> ipgre_header+0xdd/0x540 net/ipv4/ip_gre.c:900 dev_hard_header include/linux/netdevice.h:3439 [inline] packet_snd net/packet/af_packet.c:3028 [inline] packet_sendmsg+0x3ae5/0x53c0 net/packet/af_packet.c:3108 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg net/socket.c:742 [inline] ____sys_sendmsg+0xa54/0xc30 net/socket.c:2592 ___sys_sendmsg+0x190/0x1e0 net/socket.c:2646 __sys_sendmsg+0x170/0x220 net/socket.c:2678 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fe1a0e6c1a9 When a non-Ethernet device (e.g. GRE tunnel) is enslaved to a bond, bond_setup_by_slave() directly copies the slave's header_ops to the bond device: bond_dev->header_ops = slave_dev->header_ops; This causes a type confusion when dev_hard_header() is later called on the bond device. Functions like ipgre_header(), ip6gre_header(),all use netdev_priv(dev) to access their device-specific private data. When called with the bond device, netdev_priv() returns the bond's private data (struct bonding) instead of the expected type (e.g. struct ip_tunnel), leading to garbage values being read and kernel crashes. Fix this by introducing bond_header_ops with wrapper functions that delegate to the active slave's header_ops using the slave's own device. This ensures netdev_priv() in the slave's header functions always receives the correct device. The fix is placed in the bonding driver rather than individual device drivers, as the root cause is bond blindly inheriting header_ops from the slave without considering that these callbacks expect a specific netdev_priv() layout. The type confusion can be observed by adding a printk in ipgre_header() and running the following commands: ip link add dummy0 type dummy ip addr add 10.0.0.1/24 dev dummy0 ip link set dummy0 up ip link add gre1 type gre local 10.0.0.1 ip link add bond1 type bond mode active-backup ip link set gre1 master bond1 ip link set gre1 up ip link set bond1 up ip addr add fe80::1/64 dev bond1 Fixes: `1284cd3a2b` ("bonding: two small fixes for IPoIB support") Suggested-by: Jay Vosburgh <jv@jvosburgh.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com> Link: https://patch.msgid.link/20260306021508.222062-1-jiayuan.chen@linux.dev Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-10 11:33:58 +01:00
Wenyuan Li	47bba09b14	can: hi311x: hi3110_open(): add check for hi3110_power_enable() return value In hi3110_open(), the return value of hi3110_power_enable() is not checked. If power enable fails, the device may not function correctly, while the driver still returns success. Add a check for the return value and propagate the error accordingly. Signed-off-by: Wenyuan Li <2063309626@qq.com> Link: https://patch.msgid.link/tencent_B5E2E7528BB28AA8A2A56E16C49BD58B8B07@qq.com Fixes: `57e83fb9b7` ("can: hi311x: Add Holt HI-311x CAN driver") [mkl: adjust subject, commit message and jump label] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2026-03-10 11:12:52 +01:00
Haibo Chen	1eea46908c	can: dev: keep the max bitrate error at 5% Commit `b360a13d44` ("can: dev: print bitrate error with two decimal digits") changed calculation of the bit rate error from on-tenth of a percent to on-hundredth of a percent, but forgot to adjust the scale of the CAN_CALC_MAX_ERROR constant. Keeping the existing logic unchanged: Only when the bitrate error exceeds 5% should an error be returned. Otherwise, simply output a warning log. Fixes: `b360a13d44` ("can: dev: print bitrate error with two decimal digits") Signed-off-by: Haibo Chen <haibo.chen@nxp.com> Link: https://patch.msgid.link/20260306-can-fix-v1-1-ac526cec6777@nxp.com Cc: stable@kernel.org [mkl: improve commit message] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2026-03-10 11:12:52 +01:00
Haiyue Wang	e3f5e0f22c	mctp: i2c: fix skb memory leak in receive path When 'midev->allow_rx' is false, the newly allocated skb isn't consumed by netif_rx(), it needs to free the skb directly. Fixes: `f5b8abf9fc` ("mctp i2c: MCTP I2C binding driver") Signed-off-by: Haiyue Wang <haiyuewa@163.com> Link: https://patch.msgid.link/20260305143240.97592-1-haiyuewa@163.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-10 11:10:55 +01:00
Paolo Abeni	fdfd103aec	Merge branch 'net-enetc-fix-fallback-phy-address-handling-and-do-not-skip-setting-for-addr-0' Wei Fang says: ==================== net: enetc: fix fallback PHY address handling and do not skip setting for addr 0 There are two potential issues when PHY address 0 is used on the board, see the commit messages of the patches for more details. v1: https://lore.kernel.org/imx/20260303103047.228005-1-wei.fang@nxp.com/ ==================== Link: https://patch.msgid.link/20260305031211.904812-1-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-10 10:36:48 +01:00
Wei Fang	dbe17e7783	net: enetc: do not skip setting LaBCR[MDIO_PHYAD_PRTAD] for addr 0 Given that some platforms may use PHY address 0 (I suppose the PHY may not treat address 0 as a broadcast address or default response address). It is possible for some boards to connect multiple PHYs to the same ENETC MAC, for example: - a PHY with a non-zero address connects to ENETC MAC through SGMII interface (selected via DTS_A) - a PHY with address 0 connects to ENETC MAC through RGMII interface (selected via DTS_B) For the case where the ENETC port MDIO is used to manage the PHY, when switching from DTS_A to DTS_B via soft reboot, LaBCR[MDIO_PHYAD_PRTAD] must be updated to 0 because the NETCMIX block is not reset during soft reboot. However, the current driver explicitly skips configuring address 0, causing LaBCR[MDIO_PHYAD_PRTAD] to retain its old value. Therefore, remove the special-case skip of PHY address 0 so that valid configurations using address 0 are properly supported. Fixes: `6633df05f3` ("net: enetc: set the external PHY address in IERB for port MDIO usage") Fixes: `50bfd9c06f` ("net: enetc: set external PHY address in IERB for i.MX94 ENETC") Reviewed-by: Clark Wang <xiaoning.wang@nxp.com> Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260305031211.904812-3-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-10 10:36:46 +01:00
Wei Fang	246953f33e	net: enetc: fix incorrect fallback PHY address handling The current netc_get_phy_addr() implementation falls back to PHY address 0 when the "mdio" node or the PHY child node is missing. On i.MX95, this causes failures when a real PHY is actually assigned address 0 and is managed through the EMDIO interface. Because the bit 0 of phy_mask will be set, leading imx95_enetc_mdio_phyaddr_config() to return an error, and the netc_blk_ctrl driver probe subsequently fails. Fix this by returning -ENODEV when neither an "mdio" node nor any PHY node is present, it means that ENETC port MDIO is not used to manage the PHY, so there is no need to configure LaBCR[MDIO_PHYAD_PRTAD]. Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com> Closes: https://lore.kernel.org/all/7825188.GXAFRqVoOG@steina-w Fixes: `6633df05f3` ("net: enetc: set the external PHY address in IERB for port MDIO usage") Reviewed-by: Clark Wang <xiaoning.wang@nxp.com> Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260305031211.904812-2-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-10 10:36:46 +01:00
Pavan Chebbi	0d9a60a061	bnxt_en: Fix RSS table size check when changing ethtool channels When changing channels, the current check in bnxt_set_channels() is not checking for non-default RSS contexts when the RSS table size changes. The current check for IFF_RXFH_CONFIGURED is only sufficient for the default RSS context. Expand the check to include the presence of any non-default RSS contexts. Allowing such change will result in incorrect configuration of the context's RSS table when the table size changes. Fixes: `b3d0083caf` ("bnxt_en: Support RSS contexts in ethtool .{get\|set}_rxfh()") Reported-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/netdev/20260303181535.2671734-1-bjorn@kernel.org/ Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20260306225854.3575672-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-09 19:49:47 -07:00
Jakub Kicinski	183f682591	Merge branch 'net-usb-lan78xx-accumulated-bug-fixes' Oleksij Rempel says: ==================== net: usb: lan78xx: accumulated bug fixes This series contains a collection of standalone bug fixes for the Microchip LAN78xx driver, addressing packet handling, TX statistics, invalid register accesses, and a kernel warning during disconnect. ==================== Link: https://patch.msgid.link/20260305143429.530909-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-09 19:48:39 -07:00
Oleksij Rempel	312c816c6b	net: usb: lan78xx: fix WARN in __netif_napi_del_locked on disconnect Remove redundant netif_napi_del() call from disconnect path. A WARN may be triggered in __netif_napi_del_locked() during USB device disconnect: WARNING: CPU: 0 PID: 11 at net/core/dev.c:7417 __netif_napi_del_locked+0x2b4/0x350 This happens because netif_napi_del() is called in the disconnect path while NAPI is still enabled. However, it is not necessary to call netif_napi_del() explicitly, since unregister_netdev() will handle NAPI teardown automatically and safely. Removing the redundant call avoids triggering the warning. Full trace: lan78xx 1-1:1.0 enu1: Failed to read register index 0x000000c4. ret = -ENODEV lan78xx 1-1:1.0 enu1: Failed to set MAC down with error -ENODEV lan78xx 1-1:1.0 enu1: Link is Down lan78xx 1-1:1.0 enu1: Failed to read register index 0x00000120. ret = -ENODEV ------------[ cut here ]------------ WARNING: CPU: 0 PID: 11 at net/core/dev.c:7417 __netif_napi_del_locked+0x2b4/0x350 Modules linked in: flexcan can_dev fuse CPU: 0 UID: 0 PID: 11 Comm: kworker/0:1 Not tainted 6.16.0-rc2-00624-ge926949dab03 #9 PREEMPT Hardware name: SKOV IMX8MP CPU revC - bd500 (DT) Workqueue: usb_hub_wq hub_event pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __netif_napi_del_locked+0x2b4/0x350 lr : __netif_napi_del_locked+0x7c/0x350 sp : ffffffc085b673c0 x29: ffffffc085b673c0 x28: ffffff800b7f2000 x27: ffffff800b7f20d8 x26: ffffff80110bcf58 x25: ffffff80110bd978 x24: 1ffffff0022179eb x23: ffffff80110bc000 x22: ffffff800b7f5000 x21: ffffff80110bc000 x20: ffffff80110bcf38 x19: ffffff80110bcf28 x18: dfffffc000000000 x17: ffffffc081578940 x16: ffffffc08284cee0 x15: 0000000000000028 x14: 0000000000000006 x13: 0000000000040000 x12: ffffffb0022179e8 x11: 1ffffff0022179e7 x10: ffffffb0022179e7 x9 : dfffffc000000000 x8 : 0000004ffdde8619 x7 : ffffff80110bcf3f x6 : 0000000000000001 x5 : ffffff80110bcf38 x4 : ffffff80110bcf38 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 1ffffff0022179e7 x0 : 0000000000000000 Call trace: __netif_napi_del_locked+0x2b4/0x350 (P) lan78xx_disconnect+0xf4/0x360 usb_unbind_interface+0x158/0x718 device_remove+0x100/0x150 device_release_driver_internal+0x308/0x478 device_release_driver+0x1c/0x30 bus_remove_device+0x1a8/0x368 device_del+0x2e0/0x7b0 usb_disable_device+0x244/0x540 usb_disconnect+0x220/0x758 hub_event+0x105c/0x35e0 process_one_work+0x760/0x17b0 worker_thread+0x768/0xce8 kthread+0x3bc/0x690 ret_from_fork+0x10/0x20 irq event stamp: 211604 hardirqs last enabled at (211603): [<ffffffc0828cc9ec>] _raw_spin_unlock_irqrestore+0x84/0x98 hardirqs last disabled at (211604): [<ffffffc0828a9a84>] el1_dbg+0x24/0x80 softirqs last enabled at (211296): [<ffffffc080095f10>] handle_softirqs+0x820/0xbc8 softirqs last disabled at (210993): [<ffffffc080010288>] __do_softirq+0x18/0x20 ---[ end trace 0000000000000000 ]--- lan78xx 1-1:1.0 enu1: failed to kill vid 0081/0 Fixes: `e110bc8258` ("net: usb: lan78xx: Convert to PHYLINK for improved PHY and MAC management") Cc: stable@vger.kernel.org Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20260305143429.530909-5-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-09 19:48:33 -07:00
Oleksij Rempel	d9cc0e440f	net: usb: lan78xx: skip LTM configuration for LAN7850 Do not configure Latency Tolerance Messaging (LTM) on USB 2.0 hardware. The LAN7850 is a High-Speed (USB 2.0) only device and does not support SuperSpeed features like LTM. Currently, the driver unconditionally attempts to configure LTM registers during initialization. On the LAN7850, these registers do not exist, resulting in writes to invalid or undocumented memory space. This issue was identified during a port to the regmap API with strict register validation enabled. While no functional issues or crashes have been observed from these invalid writes, bypassing LTM initialization on the LAN7850 ensures the driver strictly adheres to the hardware's valid register map. Fixes: `55d7de9de6` ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Cc: stable@vger.kernel.org Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20260305143429.530909-4-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-09 19:48:33 -07:00
Oleksij Rempel	50988747c3	net: usb: lan78xx: fix TX byte statistics for small packets Account for hardware auto-padding in TX byte counters to reflect actual wire traffic. The LAN7850 hardware automatically pads undersized frames to the minimum Ethernet frame length (ETH_ZLEN, 60 bytes). However, the driver tracks the network statistics based on the unpadded socket buffer length. This results in the tx_bytes counter under-reporting the actual physical bytes placed on the Ethernet wire for small packets (like short ARP or ICMP requests). Use max_t() to ensure the transmission statistics accurately account for the hardware-generated padding. Fixes: `d383216a7e` ("lan78xx: Introduce Tx URB processing improvements") Cc: stable@vger.kernel.org Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20260305143429.530909-3-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-09 19:48:33 -07:00
Oleksij Rempel	e4f774a0cc	net: usb: lan78xx: fix silent drop of packets with checksum errors Do not drop packets with checksum errors at the USB driver level; pass them to the network stack. Previously, the driver dropped all packets where the 'Receive Error Detected' (RED) bit was set, regardless of the specific error type. This caused packets with only IP or TCP/UDP checksum errors to be dropped before reaching the kernel, preventing the network stack from accounting for them or performing software fallback. Add a mask for hard hardware errors to safely drop genuinely corrupt frames, while allowing checksum-errored frames to pass with their ip_summed field explicitly set to CHECKSUM_NONE. Fixes: `55d7de9de6` ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Cc: stable@vger.kernel.org Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20260305143429.530909-2-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-09 19:48:33 -07:00
Eric Dumazet	7a85d370bb	MAINTAINERS: include/net/tc_wrapper.h belongs to TC subsystem include/net/tc_wrapper.h changes should be reviewed by TC maintainers. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260307120607.3504191-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-09 18:59:28 -07:00
Mehul Rao	b2662e7593	net: nexthop: fix percpu use-after-free in remove_nh_grp_entry When removing a nexthop from a group, remove_nh_grp_entry() publishes the new group via rcu_assign_pointer() then immediately frees the removed entry's percpu stats with free_percpu(). However, the synchronize_net() grace period in the caller remove_nexthop_from_groups() runs after the free. RCU readers that entered before the publish still see the old group and can dereference the freed stats via nh_grp_entry_stats_inc() -> get_cpu_ptr(nhge->stats), causing a use-after-free on percpu memory. Fix by deferring the free_percpu() until after synchronize_net() in the caller. Removed entries are chained via nh_list onto a local deferred free list. After the grace period completes and all RCU readers have finished, the percpu stats are safely freed. Fixes: `f4676ea74b` ("net: nexthop: Add nexthop group entry stats") Cc: stable@vger.kernel.org Signed-off-by: Mehul Rao <mehulrao@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260306233821.196789-1-mehulrao@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-09 18:48:26 -07:00
Shuangpeng Bai	288598d80a	serial: caif: hold tty->link reference in ldisc_open and ser_release A reproducer triggers a KASAN slab-use-after-free in pty_write_room() when caif_serial's TX path calls tty_write_room(). The faulting access is on tty->link->port. Hold an extra kref on tty->link for the lifetime of the caif_serial line discipline: get it in ldisc_open() and drop it in ser_release(), and also drop it on the ldisc_open() error path. With this change applied, the reproducer no longer triggers the UAF in my testing. Link: https://gist.github.com/shuangpengbai/c898debad6bdf170a84be7e6b3d8707f Link: https://lore.kernel.org/netdev/20260301220525.1546355-1-shuangpeng.kernel@gmail.com Fixes: `e31d5a0594` ("caif: tty's are kref objects so take a reference") Signed-off-by: Shuangpeng Bai <shuangpeng.kernel@gmail.com> Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://patch.msgid.link/20260306034006.3395740-1-shuangpeng.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-09 18:47:55 -07:00
Álvaro Fernández Rojas	87d1268521	net: sfp: improve Huawei MA5671a fixup With the current sfp_fixup_ignore_tx_fault() fixup we ignore the TX_FAULT signal, but we also need to apply sfp_fixup_ignore_los() in order to be able to communicate with the module even if the fiber isn't connected for configuration purposes. This is needed for all the MA5671a firmwares, excluding the FS modded firmware. Fixes: `2069624dac` ("net: sfp: Add tx-fault workaround for Huawei MA5671A SFP ONT") Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260306125139.213637-1-noltari@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-09 18:46:46 -07:00
Jakub Kicinski	c113d5e326	Merge branch 'net-spacemit-a-few-error-handling-fixes' Vivian Wang says: ==================== net: spacemit: A few error handling fixes Recently a user reported a supposed UAF/double-free in this driver. It turned out to be a false positive (ugh) from a bug with riscv's kfence_protect_page() [1], but it did also prompt me to review the driver code yet again. These are some fixes for error handling problems that I've found. [1]: https://lore.kernel.org/r/20260303-handle-kfence-protect-spurious-fault-v2-0-f80d8354d79d@iscas.ac.cn/ ==================== Link: https://patch.msgid.link/20260305-k1-ethernet-more-fixes-v2-0-e4e434d65055@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-06 18:58:36 -08:00
Vivian Wang	86292155be	net: spacemit: Fix error handling in emac_tx_mem_map() The DMA mappings were leaked on mapping error. Free them with the existing emac_free_tx_buf() function. Fixes: `bfec6d7f20` ("net: spacemit: Add K1 Ethernet MAC") Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Link: https://patch.msgid.link/20260305-k1-ethernet-more-fixes-v2-2-e4e434d65055@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-06 18:58:34 -08:00
Vivian Wang	3aa1417803	net: spacemit: Fix error handling in emac_alloc_rx_desc_buffers() Even if we get a dma_mapping_error() while mapping an RX buffer, we should still update rx_ring->head to ensure that the buffers we were able to allocate and map are used. Fix this by breaking out to the existing code after the loop, analogous to the existing handling for skb allocation failure. Fixes: `bfec6d7f20` ("net: spacemit: Add K1 Ethernet MAC") Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Link: https://patch.msgid.link/20260305-k1-ethernet-more-fixes-v2-1-e4e434d65055@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-06 18:58:34 -08:00
Miaoqian Lin	4245a79003	rxrpc, afs: Fix missing error pointer check after rxrpc_kernel_lookup_peer() rxrpc_kernel_lookup_peer() can also return error pointers in addition to NULL, so just checking for NULL is not sufficient. Fix this by: (1) Changing rxrpc_kernel_lookup_peer() to return -ENOMEM rather than NULL on allocation failure. (2) Making the callers in afs use IS_ERR() and PTR_ERR() to pass on the error code returned. Fixes: `72904d7b9b` ("rxrpc, afs: Allow afs to pin rxrpc_peer objects") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Co-developed-by: David Howells <dhowells@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/368272.1772713861@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-06 17:49:52 -08:00
Jakub Kicinski	7a4d74676c	Merge branch 'further-sja1105-phylink-link-replay-fixups' Vladimir Oltean says: ==================== Further SJA1105 phylink link replay fixups While I was playing around with the subsystem knowledge in Chris Mason's review-prompts to see what LLMs would have needed to catch the bug behind commit `bfd264fbbb` ("net: dsa: sja1105: protect link replay helpers against NULL phylink instance"), it flagged another issue instead, which IMO is valid. This is being fixed in patch 2/2. Patch 1/2 is preparatory reordering for that. I haven't noticed any physical issues, it only has to do with the soundness of the new call path introduced in January in commit `0b2edc531e` ("net: dsa: sja1105: let phylink help with the replay of link callbacks"). ==================== Link: https://patch.msgid.link/20260304220900.3865120-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-06 17:48:03 -08:00
Vladimir Oltean	ce2da643f0	net: dsa: sja1105: ensure phylink_replay_link_end() will not be missed Most errors that can occur in sja1105_static_config_reload() are fatal (example: fail to communicate with hardware), but not all are. For example, sja1105_static_config_upload() -> kcalloc() may fail, and if that happens, we have called phylink_replay_link_begin() but never phylink_replay_link_end(). Under that circumstance, all port phylink instances are left in a state where the resolver is stopped with the PHYLINK_DISABLE_REPLAY bit set. We have effectively disabled link management with no way to recover from this condition. Avoid that situation by ensuring phylink_replay_link_begin() is always paired with phylink_replay_link_end(), regardless of whether we faced any errors during switch reset, configuration reload and general state reload. Fixes: `0b2edc531e` ("net: dsa: sja1105: let phylink help with the replay of link callbacks") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260304220900.3865120-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-06 17:48:01 -08:00
Vladimir Oltean	976703cae7	net: dsa: sja1105: reorder sja1105_reload_cbs() and phylink_replay_link_end() Move phylink_replay_link_end() as the last locked operation under sja1105_static_config_reload(). The purpose is to be able to goto this step from the error path of intermediate steps (we must call phylink_replay_link_end()). sja1105_reload_cbs() notably does not depend on port states or link speeds. See commit `954ad9bf13` ("net: dsa: sja1105: fix bandwidth discrepancy between tc-cbs software and offload") which has discussed this issue specifically. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260304220900.3865120-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-06 17:48:01 -08:00
Weiming Shi	0cc0c2e661	net/sched: teql: fix NULL pointer dereference in iptunnel_xmit on TEQL slave xmit teql_master_xmit() calls netdev_start_xmit(skb, slave) to transmit through slave devices, but does not update skb->dev to the slave device beforehand. When a gretap tunnel is a TEQL slave, the transmit path reaches iptunnel_xmit() which saves dev = skb->dev (still pointing to teql0 master) and later calls iptunnel_xmit_stats(dev, pkt_len). This function does: get_cpu_ptr(dev->tstats) Since teql_master_setup() does not set dev->pcpu_stat_type to NETDEV_PCPU_STAT_TSTATS, the core network stack never allocates tstats for teql0, so dev->tstats is NULL. get_cpu_ptr(NULL) computes NULL + __per_cpu_offset[cpu], resulting in a page fault. BUG: unable to handle page fault for address: ffff8880e6659018 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 68bc067 P4D 68bc067 PUD 0 Oops: Oops: 0002 [#1] SMP KASAN PTI RIP: 0010:iptunnel_xmit (./include/net/ip_tunnels.h:664 net/ipv4/ip_tunnel_core.c:89) Call Trace: <TASK> ip_tunnel_xmit (net/ipv4/ip_tunnel.c:847) __gre_xmit (net/ipv4/ip_gre.c:478) gre_tap_xmit (net/ipv4/ip_gre.c:779) teql_master_xmit (net/sched/sch_teql.c:319) dev_hard_start_xmit (net/core/dev.c:3887) sch_direct_xmit (net/sched/sch_generic.c:347) __dev_queue_xmit (net/core/dev.c:4802) neigh_direct_output (net/core/neighbour.c:1660) ip_finish_output2 (net/ipv4/ip_output.c:237) __ip_finish_output.part.0 (net/ipv4/ip_output.c:315) ip_mc_output (net/ipv4/ip_output.c:369) ip_send_skb (net/ipv4/ip_output.c:1508) udp_send_skb (net/ipv4/udp.c:1195) udp_sendmsg (net/ipv4/udp.c:1485) inet_sendmsg (net/ipv4/af_inet.c:859) __sys_sendto (net/socket.c:2206) Fix this by setting skb->dev = slave before calling netdev_start_xmit(), so that tunnel xmit functions see the correct slave device with properly allocated tstats. Fixes: `039f50629b` ("ip_tunnel: Move stats update to iptunnel_xmit()") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Link: https://patch.msgid.link/20260304044216.3517851-3-bestswngs@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-06 17:45:37 -08:00
Jian Zhang	5c3398a542	net: ncsi: fix skb leak in error paths Early return paths in NCSI RX and AEN handlers fail to release the received skb, resulting in a memory leak. Specifically, ncsi_aen_handler() returns on invalid AEN packets without consuming the skb. Similarly, ncsi_rcv_rsp() exits early when failing to resolve the NCSI device, response handler, or request, leaving the skb unfreed. CC: stable@vger.kernel.org Fixes: `7a82ecf4cf` ("net/ncsi: NCSI AEN packet handler") Fixes: `138635cc27` ("net/ncsi: NCSI response packet handler") Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com> Link: https://patch.msgid.link/20260305060656.3357250-1-zhangjian.3032@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-06 17:34:48 -08:00
Jakub Kicinski	63f428cb94	Merge branch 'mlx5-misc-fixes-2026-03-05' Tariq Toukan says: ==================== mlx5 misc fixes 2026-03-05 This patchset provides misc bug fixes from the team to the mlx5 core and Eth drivers. ==================== Link: https://patch.msgid.link/20260305142634.1813208-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-06 17:26:04 -08:00
Dragos Tatulea	a6413e6f6c	net/mlx5e: RX, Fix XDP multi-buf frag counting for legacy RQ XDP multi-buf programs can modify the layout of the XDP buffer when the program calls bpf_xdp_pull_data() or bpf_xdp_adjust_tail(). The referenced commit in the fixes tag corrected the assumption in the mlx5 driver that the XDP buffer layout doesn't change during a program execution. However, this fix introduced another issue: the dropped fragments still need to be counted on the driver side to avoid page fragment reference counting issues. Such issue can be observed with the test_xdp_native_adjst_tail_shrnk_data selftest when using a payload of 3600 and shrinking by 256 bytes (an upcoming selftest patch): the last fragment gets released by the XDP code but doesn't get tracked by the driver. This results in a negative pp_ref_count during page release and the following splat: WARNING: include/net/page_pool/helpers.h:297 at mlx5e_page_release_fragmented.isra.0+0x4a/0x50 [mlx5_core], CPU#12: ip/3137 Modules linked in: [...] CPU: 12 UID: 0 PID: 3137 Comm: ip Not tainted 6.19.0-rc3+ #12 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 RIP: 0010:mlx5e_page_release_fragmented.isra.0+0x4a/0x50 [mlx5_core] [...] Call Trace: <TASK> mlx5e_dealloc_rx_wqe+0xcb/0x1a0 [mlx5_core] mlx5e_free_rx_descs+0x7f/0x110 [mlx5_core] mlx5e_close_rq+0x50/0x60 [mlx5_core] mlx5e_close_queues+0x36/0x2c0 [mlx5_core] mlx5e_close_channel+0x1c/0x50 [mlx5_core] mlx5e_close_channels+0x45/0x80 [mlx5_core] mlx5e_safe_switch_params+0x1a5/0x230 [mlx5_core] mlx5e_change_mtu+0xf3/0x2f0 [mlx5_core] netif_set_mtu_ext+0xf1/0x230 do_setlink.isra.0+0x219/0x1180 rtnl_newlink+0x79f/0xb60 rtnetlink_rcv_msg+0x213/0x3a0 netlink_rcv_skb+0x48/0xf0 netlink_unicast+0x24a/0x350 netlink_sendmsg+0x1ee/0x410 __sock_sendmsg+0x38/0x60 ____sys_sendmsg+0x232/0x280 ___sys_sendmsg+0x78/0xb0 __sys_sendmsg+0x5f/0xb0 [...] do_syscall_64+0x57/0xc50 This patch fixes the issue by doing page frag counting on all the original XDP buffer fragments for all relevant XDP actions (XDP_TX , XDP_REDIRECT and XDP_PASS). This is basically reverting to the original counting before the commit in the fixes tag. As frag_page is still pointing to the original tail, the nr_frags parameter to xdp_update_skb_frags_info() needs to be calculated in a different way to reflect the new nr_frags. Fixes: `afd5ba577c` ("net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy RQ") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Amery Hung <ameryhung@gmail.com> Link: https://patch.msgid.link/20260305142634.1813208-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-06 17:25:39 -08:00
Dragos Tatulea	db25c42c2e	net/mlx5e: RX, Fix XDP multi-buf frag counting for striding RQ XDP multi-buf programs can modify the layout of the XDP buffer when the program calls bpf_xdp_pull_data() or bpf_xdp_adjust_tail(). The referenced commit in the fixes tag corrected the assumption in the mlx5 driver that the XDP buffer layout doesn't change during a program execution. However, this fix introduced another issue: the dropped fragments still need to be counted on the driver side to avoid page fragment reference counting issues. The issue was discovered by the drivers/net/xdp.py selftest, more specifically the test_xdp_native_tx_mb: - The mlx5 driver allocates a page_pool page and initializes it with a frag counter of 64 (pp_ref_count=64) and the internal frag counter to 0. - The test sends one packet with no payload. - On RX (mlx5e_skb_from_cqe_mpwrq_nonlinear()), mlx5 configures the XDP buffer with the packet data starting in the first fragment which is the page mentioned above. - The XDP program runs and calls bpf_xdp_pull_data() which moves the header into the linear part of the XDP buffer. As the packet doesn't contain more data, the program drops the tail fragment since it no longer contains any payload (pp_ref_count=63). - mlx5 device skips counting this fragment. Internal frag counter remains 0. - mlx5 releases all 64 fragments of the page but page pp_ref_count is 63 => negative reference counting error. Resulting splat during the test: WARNING: CPU: 0 PID: 188225 at ./include/net/page_pool/helpers.h:297 mlx5e_page_release_fragmented.isra.0+0xbd/0xe0 [mlx5_core] Modules linked in: [...] CPU: 0 UID: 0 PID: 188225 Comm: ip Not tainted 6.18.0-rc7_for_upstream_min_debug_2025_12_08_11_44 #1 NONE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:mlx5e_page_release_fragmented.isra.0+0xbd/0xe0 [mlx5_core] [...] Call Trace: <TASK> mlx5e_free_rx_mpwqe+0x20a/0x250 [mlx5_core] mlx5e_dealloc_rx_mpwqe+0x37/0xb0 [mlx5_core] mlx5e_free_rx_descs+0x11a/0x170 [mlx5_core] mlx5e_close_rq+0x78/0xa0 [mlx5_core] mlx5e_close_queues+0x46/0x2a0 [mlx5_core] mlx5e_close_channel+0x24/0x90 [mlx5_core] mlx5e_close_channels+0x5d/0xf0 [mlx5_core] mlx5e_safe_switch_params+0x2ec/0x380 [mlx5_core] mlx5e_change_mtu+0x11d/0x490 [mlx5_core] mlx5e_change_nic_mtu+0x19/0x30 [mlx5_core] netif_set_mtu_ext+0xfc/0x240 do_setlink.isra.0+0x226/0x1100 rtnl_newlink+0x7a9/0xba0 rtnetlink_rcv_msg+0x220/0x3c0 netlink_rcv_skb+0x4b/0xf0 netlink_unicast+0x255/0x380 netlink_sendmsg+0x1f3/0x420 __sock_sendmsg+0x38/0x60 ____sys_sendmsg+0x1e8/0x240 ___sys_sendmsg+0x7c/0xb0 [...] __sys_sendmsg+0x5f/0xb0 do_syscall_64+0x55/0xc70 The problem applies for XDP_PASS as well which is handled in a different code path in the driver. This patch fixes the issue by doing page frag counting on all the original XDP buffer fragments for all relevant XDP actions (XDP_TX , XDP_REDIRECT and XDP_PASS). This is basically reverting to the original counting before the commit in the fixes tag. As frag_page is still pointing to the original tail, the nr_frags parameter to xdp_update_skb_frags_info() needs to be calculated in a different way to reflect the new nr_frags. Fixes: `87bcef158a` ("net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for striding RQ") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Cc: Amery Hung <ameryhung@gmail.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260305142634.1813208-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-06 17:25:39 -08:00

1 2 3 4 5 ...

1427648 Commits All Branches Search

1427648 Commits

All Branches