Merge tag 'net-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from IPsec and wireless.

  Previous releases - regressions:

   - prevent a NULL deref in generic_hwtstamp_ioctl_lower(); newer APIs
     don't populate all the pointers in the request

   - phylink: add missing supported link modes for the fixed-link

   - mptcp: fix false positive warning in mptcp_pm_nl_rm_addr

  Previous releases - always broken:

   - openvswitch: remove never-working support for setting NSH fields

   - xfrm: a number of fixes for the error paths of xfrm_state creation/
     modification/deletion

   - xfrm: fixes for offload
      - fix the determination of the protocol of the inner packet
      - don't push locally generated packets directly to L2 tunnel
        mode offloading, they still need processing from the standard
        xfrm path

   - mptcp: fix a couple of corner cases in fallback and fastclose
     handling

   - wifi: rtw89: hw_scan: prevent connections from getting stuck; work
     around an apparent bug in the FW by tweaking the messages we send

   - af_unix: fix duplicate data if PEEK w/ peek_offset needs to wait

   - veth: more robust handling of a race to avoid the txq getting stuck

   - eth: ps3_gelic_net: handle skb allocation failures"

* tag 'net-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
  vsock: Ignore signal/timeout on connect() if already established
  be2net: pass wrb_params in case of OS2BMC
  l2tp: reset skb control buffer on xmit
  net: dsa: microchip: lan937x: Fix RGMII delay tuning
  selftests: mptcp: add a check for 'add_addr_accepted'
  mptcp: fix address removal logic in mptcp_pm_nl_rm_addr
  selftests: mptcp: join: userspace: longer timeout
  selftests: mptcp: join: endpoints: longer timeout
  selftests: mptcp: join: fastclose: remove flaky marks
  mptcp: fix duplicate reset on fastclose
  mptcp: decouple mptcp fastclose from tcp close
  mptcp: do not fallback when OoO is present
  mptcp: fix premature close in case of fallback
  mptcp: avoid unneeded subflow-level drops
  mptcp: fix ack generation for fallback msk
  wifi: rtw89: hw_scan: Don't let the operating channel be last
  net: phylink: add missing supported link modes for the fixed-link
  selftest: af_unix: Add test for SO_PEEK_OFF.
  af_unix: Read sk_peek_offset() again after sleeping in unix_stream_read_generic().
  net/mlx5: Clean up only new IRQ glue on request_irq() failure
  ...
commit 8e621c9a33
Committed by Linus Torvalds, 2025-11-20 08:52:07 -08:00
43 changed files with 521 additions and 252 deletions

@@ -9266,7 +9266,6 @@ M:	Ido Schimmel <idosch@nvidia.com>
 L:	bridge@lists.linux.dev
 L:	netdev@vger.kernel.org
 S:	Maintained
-W:	http://www.linuxfoundation.org/en/Net:Bridge
 F:	include/linux/if_bridge.h
 F:	include/uapi/linux/if_bridge.h
 F:	include/linux/netfilter_bridge/

@@ -376,8 +376,18 @@ static int hellcreek_led_setup(struct hellcreek *hellcreek)
 	hellcreek_set_brightness(hellcreek, STATUS_OUT_IS_GM, 1);
 
 	/* Register both leds */
-	led_classdev_register(hellcreek->dev, &hellcreek->led_sync_good);
-	led_classdev_register(hellcreek->dev, &hellcreek->led_is_gm);
+	ret = led_classdev_register(hellcreek->dev, &hellcreek->led_sync_good);
+	if (ret) {
+		dev_err(hellcreek->dev, "Failed to register sync_good LED\n");
+		goto out;
+	}
+
+	ret = led_classdev_register(hellcreek->dev, &hellcreek->led_is_gm);
+	if (ret) {
+		dev_err(hellcreek->dev, "Failed to register is_gm LED\n");
+		led_classdev_unregister(&hellcreek->led_sync_good);
+		goto out;
+	}
 
 	ret = 0;
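
Illustration (not part of the patch): the hellcreek change follows the
standard unwind ladder, releasing each earlier success when a later
registration fails. A minimal userspace sketch of the same shape, with
hypothetical stand-ins rather than the LED class API:

	#include <stdio.h>

	/* Hypothetical stand-ins for led_classdev_register()/unregister(). */
	static int register_led(const char *name) { printf("register %s\n", name); return 0; }
	static void unregister_led(const char *name) { printf("unregister %s\n", name); }

	static int setup_leds(void)
	{
		int ret;

		ret = register_led("sync_good");
		if (ret)
			goto out;	/* nothing registered yet, nothing to unwind */

		ret = register_led("is_gm");
		if (ret) {
			unregister_led("sync_good");	/* unwind the earlier success */
			goto out;
		}

		ret = 0;
	out:
		return ret;
	}

	int main(void) { return setup_leds(); }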

@@ -540,6 +540,7 @@ static void lan937x_set_tune_adj(struct ksz_device *dev, int port,
 	ksz_pread16(dev, port, reg, &data16);
 
 	/* Update tune Adjust */
+	data16 &= ~PORT_TUNE_ADJ;
 	data16 |= FIELD_PREP(PORT_TUNE_ADJ, val);
 	ksz_pwrite16(dev, port, reg, data16);

@@ -282,7 +282,7 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
 	if (!airoha_is_valid_gdm_port(eth, port))
 		return -EINVAL;
 
-	if (dsa_port >= 0)
+	if (dsa_port >= 0 || eth->ports[1])
 		pse_port = port->id == 4 ? FE_PSE_PORT_GDM4
 					 : port->id;
 	else

@@ -1296,7 +1296,8 @@ static void be_xmit_flush(struct be_adapter *adapter, struct be_tx_obj *txo)
 	(adapter->bmc_filt_mask & BMC_FILT_MULTICAST)
 
 static bool be_send_pkt_to_bmc(struct be_adapter *adapter,
-			       struct sk_buff **skb)
+			       struct sk_buff **skb,
+			       struct be_wrb_params *wrb_params)
 {
 	struct ethhdr *eh = (struct ethhdr *)(*skb)->data;
 	bool os2bmc = false;
@@ -1360,7 +1361,7 @@ static bool be_send_pkt_to_bmc(struct be_adapter *adapter,
 	 * to BMC, asic expects the vlan to be inline in the packet.
 	 */
 	if (os2bmc)
-		*skb = be_insert_vlan_in_pkt(adapter, *skb, NULL);
+		*skb = be_insert_vlan_in_pkt(adapter, *skb, wrb_params);
 
 	return os2bmc;
 }
@@ -1387,7 +1388,7 @@ static netdev_tx_t be_xmit(struct sk_buff *skb, struct net_device *netdev)
 	/* if os2bmc is enabled and if the pkt is destined to bmc,
 	 * enqueue the pkt a 2nd time with mgmt bit set.
 	 */
-	if (be_send_pkt_to_bmc(adapter, &skb)) {
+	if (be_send_pkt_to_bmc(adapter, &skb, &wrb_params)) {
 		BE_WRB_F_SET(wrb_params.features, OS2BMC, 1);
 		wrb_cnt = be_xmit_enqueue(adapter, txo, skb, &wrb_params);
 		if (unlikely(!wrb_cnt))

@@ -3246,7 +3246,7 @@ void ice_ptp_init(struct ice_pf *pf)
 
 	err = ice_ptp_init_port(pf, &ptp->port);
 	if (err)
-		goto err_exit;
+		goto err_clean_pf;
 
 	/* Start the PHY timestamping block */
 	ice_ptp_reset_phy_timestamping(pf);
@@ -3263,13 +3263,19 @@ void ice_ptp_init(struct ice_pf *pf)
 	dev_info(ice_pf_to_dev(pf), "PTP init successful\n");
 	return;
 
+err_clean_pf:
+	mutex_destroy(&ptp->port.ps_lock);
+	ice_ptp_cleanup_pf(pf);
 err_exit:
 	/* If we registered a PTP clock, release it */
 	if (pf->ptp.clock) {
 		ptp_clock_unregister(ptp->clock);
 		pf->ptp.clock = NULL;
 	}
-	ptp->state = ICE_PTP_ERROR;
+	/* Keep ICE_PTP_UNINIT state to avoid ambiguity at driver unload
+	 * and to avoid duplicated resources release.
+	 */
+	ptp->state = ICE_PTP_UNINIT;
 	dev_err(ice_pf_to_dev(pf), "PTP failed %d\n", err);
 }
@@ -3282,9 +3288,19 @@ void ice_ptp_init(struct ice_pf *pf)
  */
 void ice_ptp_release(struct ice_pf *pf)
 {
-	if (pf->ptp.state != ICE_PTP_READY)
+	if (pf->ptp.state == ICE_PTP_UNINIT)
 		return;
 
+	if (pf->ptp.state != ICE_PTP_READY) {
+		mutex_destroy(&pf->ptp.port.ps_lock);
+		ice_ptp_cleanup_pf(pf);
+
+		if (pf->ptp.clock) {
+			ptp_clock_unregister(pf->ptp.clock);
+			pf->ptp.clock = NULL;
+		}
+		return;
+	}
+
 	pf->ptp.state = ICE_PTP_UNINIT;
 
 	/* Disable timestamping for both Tx and Rx */

@@ -63,6 +63,8 @@ static void idpf_remove(struct pci_dev *pdev)
 	destroy_workqueue(adapter->vc_event_wq);
 
 	for (i = 0; i < adapter->max_vports; i++) {
+		if (!adapter->vport_config[i])
+			continue;
 		kfree(adapter->vport_config[i]->user_config.q_coalesce);
 		kfree(adapter->vport_config[i]);
 		adapter->vport_config[i] = NULL;

@@ -324,10 +324,8 @@ struct mlx5_irq *mlx5_irq_alloc(struct mlx5_irq_pool *pool, int i,
 	free_irq(irq->map.virq, &irq->nh);
 err_req_irq:
 #ifdef CONFIG_RFS_ACCEL
-	if (i && rmap && *rmap) {
-		free_irq_cpu_rmap(*rmap);
-		*rmap = NULL;
-	}
+	if (i && rmap && *rmap)
+		irq_cpu_rmap_remove(*rmap, irq->map.virq);
 err_irq_rmap:
 #endif
 	if (i && pci_msix_can_alloc_dyn(dev->pdev))

@@ -601,6 +601,8 @@ int mlxsw_linecard_devlink_info_get(struct mlxsw_linecard *linecard,
 	err = devlink_info_version_fixed_put(req,
 					     DEVLINK_INFO_VERSION_GENERIC_FW_PSID,
 					     info->psid);
+	if (err)
+		goto unlock;
 
 	sprintf(buf, "%u.%u.%u", info->fw_major, info->fw_minor,
 		info->fw_sub_minor);

@@ -830,8 +830,10 @@ int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
 		return -EINVAL;
 
 	rule = mlxsw_sp_acl_rule_lookup(mlxsw_sp, ruleset, f->cookie);
-	if (!rule)
-		return -EINVAL;
+	if (!rule) {
+		err = -EINVAL;
+		goto err_rule_get_stats;
+	}
 
 	err = mlxsw_sp_acl_rule_get_stats(mlxsw_sp, rule, &packets, &bytes,
 					  &drops, &lastuse, &used_hw_stats);

@@ -4,6 +4,7 @@
  * Copyright (c) 2019-2020 Marvell International Ltd.
  */
 
+#include <linux/array_size.h>
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
 #include <linux/skbuff.h>
@@ -960,7 +961,7 @@ static inline void qede_tpa_cont(struct qede_dev *edev,
 {
 	int i;
 
-	for (i = 0; cqe->len_list[i]; i++)
+	for (i = 0; cqe->len_list[i] && i < ARRAY_SIZE(cqe->len_list); i++)
 		qede_fill_frag_skb(edev, rxq, cqe->tpa_agg_index,
 				   le16_to_cpu(cqe->len_list[i]));
@@ -985,7 +986,7 @@ static int qede_tpa_end(struct qede_dev *edev,
 	dma_unmap_page(rxq->dev, tpa_info->buffer.mapping,
 		       PAGE_SIZE, rxq->data_direction);
 
-	for (i = 0; cqe->len_list[i]; i++)
+	for (i = 0; cqe->len_list[i] && i < ARRAY_SIZE(cqe->len_list); i++)
 		qede_fill_frag_skb(edev, rxq, cqe->tpa_agg_index,
 				   le16_to_cpu(cqe->len_list[i]));
 
 	if (unlikely(i > 1))
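
Illustration (not part of the patch): the qede loops used to trust the
device to zero-terminate len_list; the fix also bounds them by the array
size. A standalone sketch of the same guard, with ARRAY_SIZE spelled out
since the kernel macro isn't available in userspace:

	#include <stdio.h>

	#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

	int main(void)
	{
		/* Models a completion entry whose len_list has no 0 terminator. */
		unsigned short len_list[4] = { 100, 200, 300, 400 };
		size_t i;

		/* The explicit bound stops the walk even without a terminator. */
		for (i = 0; i < ARRAY_SIZE(len_list) && len_list[i]; i++)
			printf("frag %zu: %u bytes\n", i, (unsigned int)len_list[i]);

		return 0;
	}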

@@ -260,6 +260,7 @@ void gelic_card_down(struct gelic_card *card)
 	if (atomic_dec_if_positive(&card->users) == 0) {
 		pr_debug("%s: real do\n", __func__);
 		napi_disable(&card->napi);
+		timer_delete_sync(&card->rx_oom_timer);
 		/*
 		 * Disable irq. Wireless interrupts will
 		 * be disabled later if any
@@ -970,7 +971,8 @@ static void gelic_net_pass_skb_up(struct gelic_descr *descr,
  * gelic_card_decode_one_descr - processes an rx descriptor
  * @card: card structure
  *
- * returns 1 if a packet has been sent to the stack, otherwise 0
+ * returns 1 if a packet has been sent to the stack, -ENOMEM on skb alloc
+ * failure, otherwise 0
  *
  * processes an rx descriptor by iommu-unmapping the data buffer and passing
  * the packet up to the stack
@@ -981,16 +983,18 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
 	struct gelic_descr_chain *chain = &card->rx_chain;
 	struct gelic_descr *descr = chain->head;
 	struct net_device *netdev = NULL;
-	int dmac_chain_ended;
+	int dmac_chain_ended = 0;
+	int prepare_rx_ret;
 
 	status = gelic_descr_get_status(descr);
 
 	if (status == GELIC_DESCR_DMA_CARDOWNED)
 		return 0;
 
-	if (status == GELIC_DESCR_DMA_NOT_IN_USE) {
+	if (status == GELIC_DESCR_DMA_NOT_IN_USE || !descr->skb) {
 		dev_dbg(ctodev(card), "dormant descr? %p\n", descr);
-		return 0;
+		dmac_chain_ended = 1;
+		goto refill;
 	}
 
 	/* netdevice select */
@@ -1048,9 +1052,10 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
 
 refill:
 	/* is the current descriptor terminated with next_descr == NULL? */
-	dmac_chain_ended =
-		be32_to_cpu(descr->hw_regs.dmac_cmd_status) &
-		GELIC_DESCR_RX_DMA_CHAIN_END;
+	if (!dmac_chain_ended)
+		dmac_chain_ended =
+			be32_to_cpu(descr->hw_regs.dmac_cmd_status) &
+			GELIC_DESCR_RX_DMA_CHAIN_END;
 	/*
 	 * So that always DMAC can see the end
 	 * of the descriptor chain to avoid
@@ -1062,10 +1067,11 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
 	gelic_descr_set_status(descr, GELIC_DESCR_DMA_NOT_IN_USE);
 
 	/*
-	 * this call can fail, but for now, just leave this
-	 * descriptor without skb
+	 * this call can fail, propagate the error
 	 */
-	gelic_descr_prepare_rx(card, descr);
+	prepare_rx_ret = gelic_descr_prepare_rx(card, descr);
+	if (prepare_rx_ret)
+		return prepare_rx_ret;
 
 	chain->tail = descr;
 	chain->head = descr->next;
@@ -1087,6 +1093,13 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
 	return 1;
 }
 
+static void gelic_rx_oom_timer(struct timer_list *t)
+{
+	struct gelic_card *card = timer_container_of(card, t, rx_oom_timer);
+
+	napi_schedule(&card->napi);
+}
+
 /**
  * gelic_net_poll - NAPI poll function called by the stack to return packets
  * @napi: napi structure
@@ -1099,14 +1112,22 @@ static int gelic_net_poll(struct napi_struct *napi, int budget)
 {
 	struct gelic_card *card = container_of(napi, struct gelic_card, napi);
 	int packets_done = 0;
+	int work_result = 0;
 
 	while (packets_done < budget) {
-		if (!gelic_card_decode_one_descr(card))
+		work_result = gelic_card_decode_one_descr(card);
+		if (work_result != 1)
 			break;
 
 		packets_done++;
 	}
 
+	if (work_result == -ENOMEM) {
+		napi_complete_done(napi, packets_done);
+		mod_timer(&card->rx_oom_timer, jiffies + 1);
+		return packets_done;
+	}
+
 	if (packets_done < budget) {
 		napi_complete_done(napi, packets_done);
 		gelic_card_rx_irq_on(card);
@@ -1576,6 +1597,8 @@ static struct gelic_card *gelic_alloc_card_net(struct net_device **netdev)
 	mutex_init(&card->updown_lock);
 	atomic_set(&card->users, 0);
 
+	timer_setup(&card->rx_oom_timer, gelic_rx_oom_timer, 0);
+
 	return card;
 }
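
Illustration (not part of the patch): the new rx_oom_timer converts an
allocation failure into a deferred retry; the poll loop completes NAPI
and a one-tick timer re-schedules it instead of leaving the ring without
buffers. A rough userspace model of that retry loop, where the tick
stands in for mod_timer(..., jiffies + 1):

	#include <stdio.h>
	#include <time.h>

	/* Fail a few times, then succeed, like a transient OOM. */
	static int refill(int attempt)
	{
		return attempt < 3 ? -1 : 0;
	}

	int main(void)
	{
		struct timespec tick = { .tv_sec = 0, .tv_nsec = 1000 * 1000 };
		int attempt = 0;

		while (refill(attempt) != 0) {
			attempt++;
			printf("refill failed, retrying after a tick (attempt %d)\n", attempt);
			nanosleep(&tick, NULL);	/* stands in for the 1-jiffy timer */
		}
		printf("refill succeeded\n");
		return 0;
	}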

@@ -268,6 +268,7 @@ struct gelic_vlan_id {
 struct gelic_card {
 	struct napi_struct napi;
 	struct net_device *netdev[GELIC_PORT_MAX];
+	struct timer_list rx_oom_timer;
 	/*
 	 * hypervisor requires irq_status should be
 	 * 8 bytes aligned, but u64 member is

@@ -637,6 +637,9 @@ static int phylink_validate(struct phylink *pl, unsigned long *supported,
 
 static void phylink_fill_fixedlink_supported(unsigned long *supported)
 {
+	linkmode_set_bit(ETHTOOL_LINK_MODE_Pause_BIT, supported);
+	linkmode_set_bit(ETHTOOL_LINK_MODE_Asym_Pause_BIT, supported);
+	linkmode_set_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, supported);
 	linkmode_set_bit(ETHTOOL_LINK_MODE_10baseT_Half_BIT, supported);
 	linkmode_set_bit(ETHTOOL_LINK_MODE_10baseT_Full_BIT, supported);
 	linkmode_set_bit(ETHTOOL_LINK_MODE_100baseT_Half_BIT, supported);

@@ -392,14 +392,12 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
 		}
 		/* Restore Eth hdr pulled by dev_forward_skb/eth_type_trans */
 		__skb_push(skb, ETH_HLEN);
-		/* Depend on prior success packets started NAPI consumer via
-		 * __veth_xdp_flush(). Cancel TXQ stop if consumer stopped,
-		 * paired with empty check in veth_poll().
-		 */
 		netif_tx_stop_queue(txq);
-		smp_mb__after_atomic();
-		if (unlikely(__ptr_ring_empty(&rq->xdp_ring)))
-			netif_tx_wake_queue(txq);
+		/* Makes sure NAPI peer consumer runs. Consumer is responsible
+		 * for starting txq again, until then ndo_start_xmit (this
+		 * function) will not be invoked by the netstack again.
+		 */
+		__veth_xdp_flush(rq);
 		break;
 	case NET_RX_DROP: /* same as NET_XMIT_DROP */
 drop:
@@ -900,17 +898,9 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
 			struct veth_xdp_tx_bq *bq,
 			struct veth_stats *stats)
 {
-	struct veth_priv *priv = netdev_priv(rq->dev);
-	int queue_idx = rq->xdp_rxq.queue_index;
-	struct netdev_queue *peer_txq;
-	struct net_device *peer_dev;
 	int i, done = 0, n_xdpf = 0;
 	void *xdpf[VETH_XDP_BATCH];
 
-	/* NAPI functions as RCU section */
-	peer_dev = rcu_dereference_check(priv->peer, rcu_read_lock_bh_held());
-	peer_txq = peer_dev ? netdev_get_tx_queue(peer_dev, queue_idx) : NULL;
-
 	for (i = 0; i < budget; i++) {
 		void *ptr = __ptr_ring_consume(&rq->xdp_ring);
@@ -959,9 +949,6 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
 	rq->stats.vs.xdp_packets += done;
 	u64_stats_update_end(&rq->stats.syncp);
 
-	if (peer_txq && unlikely(netif_tx_queue_stopped(peer_txq)))
-		netif_tx_wake_queue(peer_txq);
-
 	return done;
 }
@@ -969,12 +956,20 @@ static int veth_poll(struct napi_struct *napi, int budget)
 {
 	struct veth_rq *rq =
 		container_of(napi, struct veth_rq, xdp_napi);
+	struct veth_priv *priv = netdev_priv(rq->dev);
+	int queue_idx = rq->xdp_rxq.queue_index;
+	struct netdev_queue *peer_txq;
 	struct veth_stats stats = {};
+	struct net_device *peer_dev;
 	struct veth_xdp_tx_bq bq;
 	int done;
 
 	bq.count = 0;
 
+	/* NAPI functions as RCU section */
+	peer_dev = rcu_dereference_check(priv->peer, rcu_read_lock_bh_held());
+	peer_txq = peer_dev ? netdev_get_tx_queue(peer_dev, queue_idx) : NULL;
+
 	xdp_set_return_frame_no_direct();
 	done = veth_xdp_rcv(rq, budget, &bq, &stats);
@@ -996,6 +991,13 @@ static int veth_poll(struct napi_struct *napi, int budget)
 	veth_xdp_flush(rq, &bq);
 	xdp_clear_return_frame_no_direct();
 
+	/* Release backpressure per NAPI poll */
+	smp_rmb(); /* Paired with netif_tx_stop_queue set_bit */
+	if (peer_txq && netif_tx_queue_stopped(peer_txq)) {
+		txq_trans_cond_update(peer_txq);
+		netif_tx_wake_queue(peer_txq);
+	}
+
 	return done;
 }
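
Illustration (not part of the patch): the veth rework moves the wake-up
of the peer's stopped txq from the producer into the NAPI consumer; the
producer now always stops the queue and kicks the consumer, which
restarts the queue once it has drained the ring. A single-threaded toy
model of that ordering (it deliberately ignores the memory-barrier
details the real code needs):

	#include <stdatomic.h>
	#include <stdio.h>

	static atomic_int queue_stopped;
	static atomic_int ring_entries;

	static void producer_xmit(void)
	{
		atomic_fetch_add(&ring_entries, 1);	/* enqueue packet */
		atomic_store(&queue_stopped, 1);	/* netif_tx_stop_queue() */
		/* __veth_xdp_flush(): guarantee the consumer runs and, after
		 * draining, restarts the queue.
		 */
	}

	static void consumer_poll(void)
	{
		while (atomic_load(&ring_entries) > 0)
			atomic_fetch_sub(&ring_entries, 1);	/* process packet */

		if (atomic_load(&queue_stopped)) {
			atomic_store(&queue_stopped, 0);	/* netif_tx_wake_queue() */
			printf("queue restarted\n");
		}
	}

	int main(void)
	{
		producer_xmit();
		consumer_poll();
		return 0;
	}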

@@ -7694,6 +7694,13 @@ int rtw89_hw_scan_add_chan_list_ax(struct rtw89_dev *rtwdev,
 	INIT_LIST_HEAD(&list);
 
 	list_for_each_entry_safe(ch_info, tmp, &scan_info->chan_list, list) {
+		/* The operating channel (tx_null == true) should
+		 * not be last in the list, to avoid breaking
+		 * RTL8851BU and RTL8832BU.
+		 */
+		if (list_len + 1 == RTW89_SCAN_LIST_LIMIT_AX && ch_info->tx_null)
+			break;
+
 		list_move_tail(&ch_info->list, &list);
 		list_len++;

@@ -701,7 +701,6 @@ static void mpc_rcvd_sweep_req(struct mpcg_info *mpcginfo)
 		grp->sweep_req_pend_num--;
 		ctcmpc_send_sweep_resp(ch);
-		kfree(mpcginfo);
 		return;
 	}

@@ -536,7 +536,8 @@ static inline int xfrm_af2proto(unsigned int family)
 
 static inline const struct xfrm_mode *xfrm_ip2inner_mode(struct xfrm_state *x, int ipproto)
 {
-	if ((ipproto == IPPROTO_IPIP && x->props.family == AF_INET) ||
+	if ((x->sel.family != AF_UNSPEC) ||
+	    (ipproto == IPPROTO_IPIP && x->props.family == AF_INET) ||
 	    (ipproto == IPPROTO_IPV6 && x->props.family == AF_INET6))
 		return &x->inner_mode;
 	else
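
Illustration (not part of the patch): the added clause makes any state
whose selector pins a family use its own inner mode, instead of only the
two explicit IP-in-IP cases. A pure-function model of the resulting
decision, using local stand-in constants rather than the kernel's:

	#include <stdio.h>

	enum { MY_AF_UNSPEC, MY_AF_INET, MY_AF_INET6 };
	enum { MY_IPPROTO_IPIP = 4, MY_IPPROTO_IPV6 = 41 };

	static int use_own_inner_mode(int sel_family, int props_family, int ipproto)
	{
		return sel_family != MY_AF_UNSPEC ||
		       (ipproto == MY_IPPROTO_IPIP && props_family == MY_AF_INET) ||
		       (ipproto == MY_IPPROTO_IPV6 && props_family == MY_AF_INET6);
	}

	int main(void)
	{
		/* Selector pins a family: 1 even before the IP-in-IP checks. */
		printf("%d\n", use_own_inner_mode(MY_AF_INET, MY_AF_INET, MY_IPPROTO_IPIP));
		/* No selector, family mismatch: 0, fall back to the af-specific mode. */
		printf("%d\n", use_own_inner_mode(MY_AF_UNSPEC, MY_AF_INET6, MY_IPPROTO_IPIP));
		return 0;
	}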

@@ -443,6 +443,9 @@ static int generic_hwtstamp_ioctl_lower(struct net_device *dev, int cmd,
 	struct ifreq ifrr;
 	int err;
 
+	if (!kernel_cfg->ifr)
+		return -EINVAL;
+
 	strscpy_pad(ifrr.ifr_name, dev->name, IFNAMSIZ);
 	ifrr.ifr_ifru = kernel_cfg->ifr->ifr_ifru;

@@ -828,13 +828,15 @@ void devl_rate_nodes_destroy(struct devlink *devlink)
 		if (!devlink_rate->parent)
 			continue;
 
-		refcount_dec(&devlink_rate->parent->refcnt);
 		if (devlink_rate_is_leaf(devlink_rate))
 			ops->rate_leaf_parent_set(devlink_rate, NULL, devlink_rate->priv,
 						  NULL, NULL);
 		else if (devlink_rate_is_node(devlink_rate))
 			ops->rate_node_parent_set(devlink_rate, NULL, devlink_rate->priv,
 						  NULL, NULL);
+
+		refcount_dec(&devlink_rate->parent->refcnt);
+		devlink_rate->parent = NULL;
 	}
 	list_for_each_entry_safe(devlink_rate, tmp, &devlink->rate_list, list) {
 		if (devlink_rate_is_node(devlink_rate)) {

@@ -122,8 +122,10 @@ static struct sk_buff *xfrm4_tunnel_gso_segment(struct xfrm_state *x,
 						struct sk_buff *skb,
 						netdev_features_t features)
 {
-	__be16 type = x->inner_mode.family == AF_INET6 ? htons(ETH_P_IPV6)
-						       : htons(ETH_P_IP);
+	const struct xfrm_mode *inner_mode = xfrm_ip2inner_mode(x,
+					XFRM_MODE_SKB_CB(skb)->protocol);
+	__be16 type = inner_mode->family == AF_INET6 ? htons(ETH_P_IPV6)
+						     : htons(ETH_P_IP);
 
 	return skb_eth_gso_segment(skb, features, type);
 }

@@ -158,8 +158,10 @@ static struct sk_buff *xfrm6_tunnel_gso_segment(struct xfrm_state *x,
 						struct sk_buff *skb,
 						netdev_features_t features)
 {
-	__be16 type = x->inner_mode.family == AF_INET ? htons(ETH_P_IP)
-						      : htons(ETH_P_IPV6);
+	const struct xfrm_mode *inner_mode = xfrm_ip2inner_mode(x,
+					XFRM_MODE_SKB_CB(skb)->protocol);
+	__be16 type = inner_mode->family == AF_INET ? htons(ETH_P_IP)
+						    : htons(ETH_P_IPV6);
 
 	return skb_eth_gso_segment(skb, features, type);
 }

@@ -1246,9 +1246,9 @@ static int l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb, uns
 	else
 		l2tp_build_l2tpv3_header(session, __skb_push(skb, session->hdr_len));
 
-	/* Reset skb netfilter state */
-	memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
-	IPCB(skb)->flags &= ~(IPSKB_XFRM_TUNNEL_SIZE | IPSKB_XFRM_TRANSFORMED | IPSKB_REROUTED);
+	/* Reset control buffer */
+	memset(skb->cb, 0, sizeof(skb->cb));
+
 	nf_reset_ct(skb);
 
 	/* L2TP uses its own lockdep subclass to avoid lockdep splats caused by

@@ -838,8 +838,11 @@ bool mptcp_established_options(struct sock *sk, struct sk_buff *skb,
 
 	opts->suboptions = 0;
 
+	/* Force later mptcp_write_options(), but do not use any actual
+	 * option space.
+	 */
 	if (unlikely(__mptcp_check_fallback(msk) && !mptcp_check_infinite_map(skb)))
-		return false;
+		return true;
 
 	if (unlikely(skb && TCP_SKB_CB(skb)->tcp_flags & TCPHDR_RST)) {
 		if (mptcp_established_options_fastclose(sk, &opt_size, remaining, opts) ||
@@ -1041,6 +1044,31 @@ static void __mptcp_snd_una_update(struct mptcp_sock *msk, u64 new_snd_una)
 	WRITE_ONCE(msk->snd_una, new_snd_una);
 }
 
+static void rwin_update(struct mptcp_sock *msk, struct sock *ssk,
+			struct sk_buff *skb)
+{
+	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
+	struct tcp_sock *tp = tcp_sk(ssk);
+	u64 mptcp_rcv_wnd;
+
+	/* Avoid touching extra cachelines if TCP is going to accept this
+	 * skb without filling the TCP-level window even with a possibly
+	 * outdated mptcp-level rwin.
+	 */
+	if (!skb->len || skb->len < tcp_receive_window(tp))
+		return;
+
+	mptcp_rcv_wnd = atomic64_read(&msk->rcv_wnd_sent);
+	if (!after64(mptcp_rcv_wnd, subflow->rcv_wnd_sent))
+		return;
+
+	/* Some other subflow grew the mptcp-level rwin since rcv_wup,
+	 * resync.
+	 */
+	tp->rcv_wnd += mptcp_rcv_wnd - subflow->rcv_wnd_sent;
+	subflow->rcv_wnd_sent = mptcp_rcv_wnd;
+}
+
 static void ack_update_msk(struct mptcp_sock *msk,
 			   struct sock *ssk,
 			   struct mptcp_options_received *mp_opt)
@@ -1208,6 +1236,7 @@ bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
 	 */
 	if (mp_opt.use_ack)
 		ack_update_msk(msk, sk, &mp_opt);
+	rwin_update(msk, sk, skb);
 
 	/* Zero-data-length packets are dropped by the caller and not
 	 * propagated to the MPTCP layer, so the skb extension does not
@@ -1294,6 +1323,10 @@ static void mptcp_set_rwin(struct tcp_sock *tp, struct tcphdr *th)
 
 	if (rcv_wnd_new != rcv_wnd_old) {
 raise_win:
+		/* The msk-level rcv wnd is after the tcp level one,
+		 * sync the latter.
+		 */
+		rcv_wnd_new = rcv_wnd_old;
 		win = rcv_wnd_old - ack_seq;
 		tp->rcv_wnd = min_t(u64, win, U32_MAX);
 		new_win = tp->rcv_wnd;
@@ -1317,6 +1350,21 @@ static void mptcp_set_rwin(struct tcp_sock *tp, struct tcphdr *th)
 
 update_wspace:
 	WRITE_ONCE(msk->old_wspace, tp->rcv_wnd);
+	subflow->rcv_wnd_sent = rcv_wnd_new;
+}
+
+static void mptcp_track_rwin(struct tcp_sock *tp)
+{
+	const struct sock *ssk = (const struct sock *)tp;
+	struct mptcp_subflow_context *subflow;
+	struct mptcp_sock *msk;
+
+	if (!ssk)
+		return;
+
+	subflow = mptcp_subflow_ctx(ssk);
+	msk = mptcp_sk(subflow->conn);
+	WRITE_ONCE(msk->old_wspace, tp->rcv_wnd);
 }
@@ -1611,6 +1659,10 @@ void mptcp_write_options(struct tcphdr *th, __be32 *ptr, struct tcp_sock *tp,
 					  opts->reset_transient,
 					  opts->reset_reason);
 		return;
+	} else if (unlikely(!opts->suboptions)) {
+		/* Fallback to TCP */
+		mptcp_track_rwin(tp);
+		return;
 	}
 
 	if (OPTION_MPTCP_PRIO & opts->suboptions) {
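
Illustration (not part of the patch): rwin_update() compares window
edges with after64(), which stays correct across 64-bit wraparound where
a plain ">" would not. A self-contained demonstration, with after64
reimplemented the way the MPTCP helpers define it:

	#include <stdint.h>
	#include <stdio.h>

	/* True if seq1 is "after" seq2 in 64-bit sequence space. */
	static int after64(uint64_t seq1, uint64_t seq2)
	{
		return (int64_t)(seq1 - seq2) > 0;
	}

	int main(void)
	{
		uint64_t rcv_wnd_sent = UINT64_MAX - 5;	/* just before wraparound */
		uint64_t mptcp_rcv_wnd = 10;		/* already wrapped */

		if (after64(mptcp_rcv_wnd, rcv_wnd_sent))
			printf("msk-level window advanced; resync the subflow window\n");
		return 0;
	}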

@@ -18,6 +18,7 @@ struct mptcp_pm_add_entry {
 	u8			retrans_times;
 	struct timer_list	add_timer;
 	struct mptcp_sock	*sock;
+	struct rcu_head		rcu;
 };
 
 static DEFINE_SPINLOCK(mptcp_pm_list_lock);
@@ -155,7 +156,7 @@ bool mptcp_remove_anno_list_by_saddr(struct mptcp_sock *msk,
 
 	entry = mptcp_pm_del_add_timer(msk, addr, false);
 	ret = entry;
-	kfree(entry);
+	kfree_rcu(entry, rcu);
 
 	return ret;
 }
@@ -345,22 +346,27 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk,
 {
 	struct mptcp_pm_add_entry *entry;
 	struct sock *sk = (struct sock *)msk;
-	struct timer_list *add_timer = NULL;
+	bool stop_timer = false;
+
+	rcu_read_lock();
 
 	spin_lock_bh(&msk->pm.lock);
 	entry = mptcp_lookup_anno_list_by_saddr(msk, addr);
 	if (entry && (!check_id || entry->addr.id == addr->id)) {
 		entry->retrans_times = ADD_ADDR_RETRANS_MAX;
-		add_timer = &entry->add_timer;
+		stop_timer = true;
 	}
 	if (!check_id && entry)
 		list_del(&entry->list);
 	spin_unlock_bh(&msk->pm.lock);
 
-	/* no lock, because sk_stop_timer_sync() is calling timer_delete_sync() */
-	if (add_timer)
-		sk_stop_timer_sync(sk, add_timer);
+	/* Note: entry might have been removed by another thread.
+	 * We hold rcu_read_lock() to ensure it is not freed under us.
+	 */
+	if (stop_timer)
+		sk_stop_timer_sync(sk, &entry->add_timer);
+
+	rcu_read_unlock();
 
 	return entry;
 }
@@ -415,7 +421,7 @@ static void mptcp_pm_free_anno_list(struct mptcp_sock *msk)
 
 	list_for_each_entry_safe(entry, tmp, &free_list, list) {
 		sk_stop_timer_sync(sk, &entry->add_timer);
-		kfree(entry);
+		kfree_rcu(entry, rcu);
 	}
 }

@@ -672,7 +672,7 @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
 
 void mptcp_pm_nl_rm_addr(struct mptcp_sock *msk, u8 rm_id)
 {
-	if (rm_id && WARN_ON_ONCE(msk->pm.add_addr_accepted == 0)) {
+	if (rm_id && !WARN_ON_ONCE(msk->pm.add_addr_accepted == 0)) {
 		u8 limit_add_addr_accepted =
 			mptcp_pm_get_limit_add_addr_accepted(msk);

@@ -78,6 +78,13 @@ bool __mptcp_try_fallback(struct mptcp_sock *msk, int fb_mib)
 	if (__mptcp_check_fallback(msk))
 		return true;
 
+	/* The caller possibly is not holding the msk socket lock, but
+	 * in the fallback case only the current subflow is touching
+	 * the OoO queue.
+	 */
+	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue))
+		return false;
+
 	spin_lock_bh(&msk->fallback_lock);
 	if (!msk->allow_infinite_fallback) {
 		spin_unlock_bh(&msk->fallback_lock);
@@ -937,14 +944,19 @@ static void mptcp_reset_rtx_timer(struct sock *sk)
 
 bool mptcp_schedule_work(struct sock *sk)
 {
-	if (inet_sk_state_load(sk) != TCP_CLOSE &&
-	    schedule_work(&mptcp_sk(sk)->work)) {
-		/* each subflow already holds a reference to the sk, and the
-		 * workqueue is invoked by a subflow, so sk can't go away here.
-		 */
-		sock_hold(sk);
+	if (inet_sk_state_load(sk) == TCP_CLOSE)
+		return false;
+
+	/* Get a reference on this socket, mptcp_worker() will release it.
+	 * As mptcp_worker() might complete before us, we can not avoid
+	 * a sock_hold()/sock_put() if schedule_work() returns false.
+	 */
+	sock_hold(sk);
+	if (schedule_work(&mptcp_sk(sk)->work))
 		return true;
-	}
+
+	sock_put(sk);
 	return false;
 }
@@ -2399,7 +2411,6 @@ bool __mptcp_retransmit_pending_data(struct sock *sk)
 
 /* flags for __mptcp_close_ssk() */
 #define MPTCP_CF_PUSH		BIT(1)
-#define MPTCP_CF_FASTCLOSE	BIT(2)
 
 /* be sure to send a reset only if the caller asked for it, also
  * clean completely the subflow status when the subflow reaches
@@ -2410,7 +2421,7 @@ static void __mptcp_subflow_disconnect(struct sock *ssk,
 				       unsigned int flags)
 {
 	if (((1 << ssk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) ||
-	    (flags & MPTCP_CF_FASTCLOSE)) {
+	    subflow->send_fastclose) {
 		/* The MPTCP code never wait on the subflow sockets, TCP-level
 		 * disconnect should never fail
 		 */
@@ -2457,14 +2468,8 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
 
 	lock_sock_nested(ssk, SINGLE_DEPTH_NESTING);
 
-	if ((flags & MPTCP_CF_FASTCLOSE) && !__mptcp_check_fallback(msk)) {
-		/* be sure to force the tcp_close path
-		 * to generate the egress reset
-		 */
-		ssk->sk_lingertime = 0;
-		sock_set_flag(ssk, SOCK_LINGER);
-		subflow->send_fastclose = 1;
-	}
+	if (subflow->send_fastclose && ssk->sk_state != TCP_CLOSE)
+		tcp_set_state(ssk, TCP_CLOSE);
 
 	need_push = (flags & MPTCP_CF_PUSH) && __mptcp_retransmit_pending_data(sk);
 	if (!dispose_it) {
@@ -2560,7 +2565,8 @@ static void __mptcp_close_subflow(struct sock *sk)
 
 		if (ssk_state != TCP_CLOSE &&
 		    (ssk_state != TCP_CLOSE_WAIT ||
-		     inet_sk_state_load(sk) != TCP_ESTABLISHED))
+		     inet_sk_state_load(sk) != TCP_ESTABLISHED ||
+		     __mptcp_check_fallback(msk)))
 			continue;
 
 		/* 'subflow_data_ready' will re-sched once rx queue is empty */
@@ -2768,9 +2774,26 @@ static void mptcp_do_fastclose(struct sock *sk)
 	struct mptcp_sock *msk = mptcp_sk(sk);
 
 	mptcp_set_state(sk, TCP_CLOSE);
-	mptcp_for_each_subflow_safe(msk, subflow, tmp)
-		__mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow),
-				  subflow, MPTCP_CF_FASTCLOSE);
+
+	/* Explicitly send the fastclose reset as need */
+	if (__mptcp_check_fallback(msk))
+		return;
+
+	mptcp_for_each_subflow_safe(msk, subflow, tmp) {
+		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+		lock_sock(ssk);
+
+		/* Some subflow socket states don't allow/need a reset.*/
+		if ((1 << ssk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE))
+			goto unlock;
+
+		subflow->send_fastclose = 1;
+		tcp_send_active_reset(ssk, ssk->sk_allocation,
+				      SK_RST_REASON_TCP_ABORT_ON_CLOSE);
+unlock:
+		release_sock(ssk);
+	}
 }
@@ -2797,7 +2820,11 @@ static void mptcp_worker(struct work_struct *work)
 		__mptcp_close_subflow(sk);
 
 	if (mptcp_close_tout_expired(sk)) {
+		struct mptcp_subflow_context *subflow, *tmp;
+
 		mptcp_do_fastclose(sk);
+		mptcp_for_each_subflow_safe(msk, subflow, tmp)
+			__mptcp_close_ssk(sk, subflow->tcp_sock, subflow, 0);
 		mptcp_close_wake_up(sk);
 	}
@@ -3222,7 +3249,8 @@ static int mptcp_disconnect(struct sock *sk, int flags)
 	/* msk->subflow is still intact, the following will not free the first
 	 * subflow
 	 */
-	mptcp_destroy_common(msk, MPTCP_CF_FASTCLOSE);
+	mptcp_do_fastclose(sk);
+	mptcp_destroy_common(msk);
 
 	/* The first subflow is already in TCP_CLOSE status, the following
 	 * can't overlap with a fallback anymore
@@ -3401,7 +3429,7 @@ void mptcp_rcv_space_init(struct mptcp_sock *msk, const struct sock *ssk)
 	msk->rcvq_space.space = TCP_INIT_CWND * TCP_MSS_DEFAULT;
 }
 
-void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags)
+void mptcp_destroy_common(struct mptcp_sock *msk)
 {
 	struct mptcp_subflow_context *subflow, *tmp;
 	struct sock *sk = (struct sock *)msk;
@@ -3410,7 +3438,7 @@ void mptcp_destroy_common(struct mptcp_sock *msk)
 
 	/* join list will be eventually flushed (with rst) at sock lock release time */
 	mptcp_for_each_subflow_safe(msk, subflow, tmp)
-		__mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow), subflow, flags);
+		__mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow), subflow, 0);
 
 	__skb_queue_purge(&sk->sk_receive_queue);
 	skb_rbtree_purge(&msk->out_of_order_queue);
@@ -3428,7 +3456,7 @@ static void mptcp_destroy(struct sock *sk)
 
 	/* allow the following to close even the initial subflow */
 	msk->free_first = 1;
-	mptcp_destroy_common(msk, 0);
+	mptcp_destroy_common(msk);
 	sk_sockets_allocated_dec(sk);
 }
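
Illustration (not part of the patch): the mptcp_schedule_work() rewrite
takes the socket reference before schedule_work() and drops it
explicitly when nothing was queued; taking it afterwards races with a
worker that may already have run and released the socket. A toy
single-threaded model of the invariant:

	#include <stdio.h>

	struct sock_model { int refcnt; };

	static int schedule_work_model(void)
	{
		return 0;	/* pretend the work was already queued */
	}

	static int schedule(struct sock_model *sk)
	{
		sk->refcnt++;			/* sock_hold() before scheduling */
		if (schedule_work_model())
			return 1;		/* the worker now owns the reference */
		sk->refcnt--;			/* sock_put(): nothing was queued */
		return 0;
	}

	int main(void)
	{
		struct sock_model sk = { .refcnt = 1 };

		printf("scheduled=%d refcnt=%d\n", schedule(&sk), sk.refcnt);
		return 0;
	}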

@@ -509,6 +509,7 @@ struct mptcp_subflow_context {
 	u64	remote_key;
 	u64	idsn;
 	u64	map_seq;
+	u64	rcv_wnd_sent;
 	u32	snd_isn;
 	u32	token;
 	u32	rel_write_seq;
@@ -976,7 +977,7 @@ static inline void mptcp_propagate_sndbuf(struct sock *sk, struct sock *ssk)
 	local_bh_enable();
 }
 
-void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags);
+void mptcp_destroy_common(struct mptcp_sock *msk);
 
 #define MPTCP_TOKEN_MAX_RETRIES	4

@@ -572,69 +572,6 @@ static int set_ipv6(struct sk_buff *skb, struct sw_flow_key *flow_key,
 	return 0;
 }
 
-static int set_nsh(struct sk_buff *skb, struct sw_flow_key *flow_key,
-		   const struct nlattr *a)
-{
-	struct nshhdr *nh;
-	size_t length;
-	int err;
-	u8 flags;
-	u8 ttl;
-	int i;
-	struct ovs_key_nsh key;
-	struct ovs_key_nsh mask;
-
-	err = nsh_key_from_nlattr(a, &key, &mask);
-	if (err)
-		return err;
-
-	/* Make sure the NSH base header is there */
-	if (!pskb_may_pull(skb, skb_network_offset(skb) + NSH_BASE_HDR_LEN))
-		return -ENOMEM;
-
-	nh = nsh_hdr(skb);
-	length = nsh_hdr_len(nh);
-
-	/* Make sure the whole NSH header is there */
-	err = skb_ensure_writable(skb, skb_network_offset(skb) +
-				       length);
-	if (unlikely(err))
-		return err;
-
-	nh = nsh_hdr(skb);
-	skb_postpull_rcsum(skb, nh, length);
-	flags = nsh_get_flags(nh);
-	flags = OVS_MASKED(flags, key.base.flags, mask.base.flags);
-	flow_key->nsh.base.flags = flags;
-	ttl = nsh_get_ttl(nh);
-	ttl = OVS_MASKED(ttl, key.base.ttl, mask.base.ttl);
-	flow_key->nsh.base.ttl = ttl;
-	nsh_set_flags_and_ttl(nh, flags, ttl);
-	nh->path_hdr = OVS_MASKED(nh->path_hdr, key.base.path_hdr,
-				  mask.base.path_hdr);
-	flow_key->nsh.base.path_hdr = nh->path_hdr;
-	switch (nh->mdtype) {
-	case NSH_M_TYPE1:
-		for (i = 0; i < NSH_MD1_CONTEXT_SIZE; i++) {
-			nh->md1.context[i] =
-				OVS_MASKED(nh->md1.context[i], key.context[i],
-					   mask.context[i]);
-		}
-		memcpy(flow_key->nsh.context, nh->md1.context,
-		       sizeof(nh->md1.context));
-		break;
-	case NSH_M_TYPE2:
-		memset(flow_key->nsh.context, 0,
-		       sizeof(flow_key->nsh.context));
-		break;
-	default:
-		return -EINVAL;
-	}
-	skb_postpush_rcsum(skb, nh, length);
-	return 0;
-}
-
 /* Must follow skb_ensure_writable() since that can move the skb data. */
 static void set_tp_port(struct sk_buff *skb, __be16 *port,
 			__be16 new_port, __sum16 *check)
@@ -1130,10 +1067,6 @@ static int execute_masked_set_action(struct sk_buff *skb,
 				    get_mask(a, struct ovs_key_ethernet *));
 		break;
 
-	case OVS_KEY_ATTR_NSH:
-		err = set_nsh(skb, flow_key, a);
-		break;
-
 	case OVS_KEY_ATTR_IPV4:
 		err = set_ipv4(skb, flow_key, nla_data(a),
 			       get_mask(a, struct ovs_key_ipv4 *));
@@ -1170,6 +1103,7 @@ static int execute_masked_set_action(struct sk_buff *skb,
 	case OVS_KEY_ATTR_CT_LABELS:
 	case OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4:
 	case OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6:
+	case OVS_KEY_ATTR_NSH:
 		err = -EINVAL;
 		break;
 	}

@@ -1305,6 +1305,11 @@ static int metadata_from_nlattrs(struct net *net, struct sw_flow_match *match,
 	return 0;
 }
 
+/*
+ * Constructs NSH header 'nh' from attributes of OVS_ACTION_ATTR_PUSH_NSH,
+ * where 'nh' points to a memory block of 'size' bytes. It's assumed that
+ * attributes were previously validated with validate_push_nsh().
+ */
 int nsh_hdr_from_nlattr(const struct nlattr *attr,
 			struct nshhdr *nh, size_t size)
 {
@@ -1314,8 +1319,6 @@ int nsh_hdr_from_nlattr(const struct nlattr *attr,
 	u8 ttl = 0;
 	int mdlen = 0;
 
-	/* validate_nsh has check this, so we needn't do duplicate check here
-	 */
 	if (size < NSH_BASE_HDR_LEN)
 		return -ENOBUFS;
 
@@ -1359,46 +1362,6 @@ int nsh_hdr_from_nlattr(const struct nlattr *attr,
 	return 0;
 }
 
-int nsh_key_from_nlattr(const struct nlattr *attr,
-			struct ovs_key_nsh *nsh, struct ovs_key_nsh *nsh_mask)
-{
-	struct nlattr *a;
-	int rem;
-
-	/* validate_nsh has check this, so we needn't do duplicate check here
-	 */
-	nla_for_each_nested(a, attr, rem) {
-		int type = nla_type(a);
-
-		switch (type) {
-		case OVS_NSH_KEY_ATTR_BASE: {
-			const struct ovs_nsh_key_base *base = nla_data(a);
-			const struct ovs_nsh_key_base *base_mask = base + 1;
-
-			nsh->base = *base;
-			nsh_mask->base = *base_mask;
-			break;
-		}
-		case OVS_NSH_KEY_ATTR_MD1: {
-			const struct ovs_nsh_key_md1 *md1 = nla_data(a);
-			const struct ovs_nsh_key_md1 *md1_mask = md1 + 1;
-
-			memcpy(nsh->context, md1->context, sizeof(*md1));
-			memcpy(nsh_mask->context, md1_mask->context,
-			       sizeof(*md1_mask));
-			break;
-		}
-		case OVS_NSH_KEY_ATTR_MD2:
-			/* Not supported yet */
-			return -ENOTSUPP;
-		default:
-			return -EINVAL;
-		}
-	}
-
-	return 0;
-}
-
 static int nsh_key_put_from_nlattr(const struct nlattr *attr,
 				   struct sw_flow_match *match, bool is_mask,
 				   bool is_push_nsh, bool log)
@@ -2839,17 +2802,13 @@ static int validate_and_copy_set_tun(const struct nlattr *attr,
 	return err;
 }
 
-static bool validate_nsh(const struct nlattr *attr, bool is_mask,
-			 bool is_push_nsh, bool log)
+static bool validate_push_nsh(const struct nlattr *attr, bool log)
 {
 	struct sw_flow_match match;
 	struct sw_flow_key key;
-	int ret = 0;
 
 	ovs_match_init(&match, &key, true, NULL);
-	ret = nsh_key_put_from_nlattr(attr, &match, is_mask,
-				      is_push_nsh, log);
-	return !ret;
+	return !nsh_key_put_from_nlattr(attr, &match, false, true, log);
 }
@@ -2997,13 +2956,6 @@ static int validate_set(const struct nlattr *a,
 
 		break;
 
-	case OVS_KEY_ATTR_NSH:
-		if (eth_type != htons(ETH_P_NSH))
-			return -EINVAL;
-		if (!validate_nsh(nla_data(a), masked, false, log))
-			return -EINVAL;
-		break;
-
 	default:
 		return -EINVAL;
 	}
@@ -3437,7 +3389,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 			return -EINVAL;
 		}
 		mac_proto = MAC_PROTO_NONE;
-		if (!validate_nsh(nla_data(a), false, true, true))
+		if (!validate_push_nsh(nla_data(a), log))
 			return -EINVAL;
 		break;

@@ -65,8 +65,6 @@ int ovs_nla_put_actions(const struct nlattr *attr,
 void ovs_nla_free_flow_actions(struct sw_flow_actions *);
 void ovs_nla_free_flow_actions_rcu(struct sw_flow_actions *);
 
-int nsh_key_from_nlattr(const struct nlattr *attr, struct ovs_key_nsh *nsh,
-			struct ovs_key_nsh *nsh_mask);
 int nsh_hdr_from_nlattr(const struct nlattr *attr, struct nshhdr *nh,
 			size_t size);

@@ -2954,6 +2954,7 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state,
 
 	u = unix_sk(sk);
 
+redo:
 	/* Lock the socket to prevent queue disordering
 	 * while sleeps in memcpy_tomsg
 	 */
@@ -2965,7 +2966,6 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state,
 		struct sk_buff *skb, *last;
 		int chunk;
 
-redo:
 		unix_state_lock(sk);
 		if (sock_flag(sk, SOCK_DEAD)) {
 			err = -ECONNRESET;
@@ -3015,7 +3015,6 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state,
 				goto out;
 			}
 
-			mutex_lock(&u->iolock);
 			goto redo;
 unlock:
 			unix_state_unlock(sk);
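
Illustration (not part of the patch): moving the redo label above the
iolock acquisition makes the reader re-read the peek offset after
sleeping for more data; with SO_PEEK_OFF, consecutive MSG_PEEK calls are
expected to advance through the data rather than repeat it. A minimal
standalone demonstration of those semantics (the full regression test is
added below):

	#include <stdio.h>
	#include <string.h>
	#include <sys/socket.h>

	int main(void)
	{
		char buf[16] = {};
		int fd[2];

		if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd))
			return 1;
		/* Enable peek-offset tracking on the receiver. */
		setsockopt(fd[1], SOL_SOCKET, SO_PEEK_OFF, &(int){0}, sizeof(int));

		send(fd[0], "aaaabbbb", 8, 0);

		recv(fd[1], buf, 4, MSG_PEEK);	/* "aaaa", offset advances to 4 */
		printf("first peek:  %s\n", buf);

		memset(buf, 0, sizeof(buf));
		recv(fd[1], buf, 4, MSG_PEEK);	/* "bbbb", not "aaaa" again */
		printf("second peek: %s\n", buf);
		return 0;
	}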

@@ -1661,18 +1661,40 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
 		timeout = schedule_timeout(timeout);
 		lock_sock(sk);
 
-		if (signal_pending(current)) {
-			err = sock_intr_errno(timeout);
-			sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE;
-			sock->state = SS_UNCONNECTED;
-			vsock_transport_cancel_pkt(vsk);
-			vsock_remove_connected(vsk);
-			goto out_wait;
-		} else if ((sk->sk_state != TCP_ESTABLISHED) && (timeout == 0)) {
-			err = -ETIMEDOUT;
+		/* Connection established. Whatever happens to socket once we
+		 * release it, that's not connect()'s concern. No need to go
+		 * into signal and timeout handling. Call it a day.
+		 *
+		 * Note that allowing to "reset" an already established socket
+		 * here is racy and insecure.
+		 */
+		if (sk->sk_state == TCP_ESTABLISHED)
+			break;
+
+		/* If connection was _not_ established and a signal/timeout came
+		 * to be, we want the socket's state reset. User space may want
+		 * to retry.
+		 *
+		 * sk_state != TCP_ESTABLISHED implies that socket is not on
+		 * vsock_connected_table. We keep the binding and the transport
+		 * assigned.
+		 */
+		if (signal_pending(current) || timeout == 0) {
+			err = timeout == 0 ? -ETIMEDOUT : sock_intr_errno(timeout);
+
+			/* Listener might have already responded with
+			 * VIRTIO_VSOCK_OP_RESPONSE. Its handling expects our
+			 * sk_state == TCP_SYN_SENT, which hereby we break.
+			 * In such case VIRTIO_VSOCK_OP_RST will follow.
+			 */
 			sk->sk_state = TCP_CLOSE;
 			sock->state = SS_UNCONNECTED;
+
+			/* Try to cancel VIRTIO_VSOCK_OP_REQUEST skb sent out by
+			 * transport->connect().
+			 */
 			vsock_transport_cancel_pkt(vsk);
 			goto out_wait;
 		}

@@ -438,7 +438,7 @@ bool xfrm_dev_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
 	check_tunnel_size = x->xso.type == XFRM_DEV_OFFLOAD_PACKET &&
 			    x->props.mode == XFRM_MODE_TUNNEL;
 
-	switch (x->inner_mode.family) {
+	switch (skb_dst(skb)->ops->family) {
 	case AF_INET:
 		/* Check for IPv4 options */
 		if (ip_hdr(skb)->ihl != 5)

@@ -698,7 +698,7 @@ static void xfrm_get_inner_ipproto(struct sk_buff *skb, struct xfrm_state *x)
 		return;
 
 	if (x->outer_mode.encap == XFRM_MODE_TUNNEL) {
-		switch (x->outer_mode.family) {
+		switch (skb_dst(skb)->ops->family) {
 		case AF_INET:
 			xo->inner_ipproto = ip_hdr(skb)->protocol;
 			break;
@@ -772,8 +772,12 @@ int xfrm_output(struct sock *sk, struct sk_buff *skb)
 		/* Exclusive direct xmit for tunnel mode, as
 		 * some filtering or matching rules may apply
 		 * in transport mode.
+		 * Locally generated packets also require
+		 * the normal XFRM path for L2 header setup,
+		 * as the hardware needs the L2 header to match
+		 * for encryption, so skip direct output as well.
 		 */
-		if (x->props.mode == XFRM_MODE_TUNNEL)
+		if (x->props.mode == XFRM_MODE_TUNNEL && !skb->sk)
 			return xfrm_dev_direct_output(sk, x, skb);
 
 		return xfrm_output_resume(sk, skb, 0);

@@ -592,6 +592,7 @@ void xfrm_state_free(struct xfrm_state *x)
 }
 EXPORT_SYMBOL(xfrm_state_free);
 
+static void xfrm_state_delete_tunnel(struct xfrm_state *x);
 static void xfrm_state_gc_destroy(struct xfrm_state *x)
 {
 	if (x->mode_cbs && x->mode_cbs->destroy_state)
@@ -607,6 +608,7 @@ static void xfrm_state_gc_destroy(struct xfrm_state *x)
 	kfree(x->replay_esn);
 	kfree(x->preplay_esn);
 	xfrm_unset_type_offload(x);
+	xfrm_state_delete_tunnel(x);
 	if (x->type) {
 		x->type->destructor(x);
 		xfrm_put_type(x->type);
@@ -806,7 +808,6 @@ void __xfrm_state_destroy(struct xfrm_state *x)
 }
 EXPORT_SYMBOL(__xfrm_state_destroy);
 
-static void xfrm_state_delete_tunnel(struct xfrm_state *x);
 int __xfrm_state_delete(struct xfrm_state *x)
 {
 	struct net *net = xs_net(x);
@@ -2073,6 +2074,7 @@ static struct xfrm_state *xfrm_state_clone_and_setup(struct xfrm_state *orig,
 	return x;
 
 error:
+	x->km.state = XFRM_STATE_DEAD;
 	xfrm_state_put(x);
 out:
 	return NULL;
@@ -2157,11 +2159,15 @@ struct xfrm_state *xfrm_state_migrate(struct xfrm_state *x,
 		xfrm_state_insert(xc);
 	} else {
 		if (xfrm_state_add(xc) < 0)
-			goto error;
+			goto error_add;
 	}
 
 	return xc;
+error_add:
+	if (xuo)
+		xfrm_dev_state_delete(xc);
 error:
+	xc->km.state = XFRM_STATE_DEAD;
 	xfrm_state_put(xc);
 	return NULL;
 }
@@ -2191,14 +2197,18 @@ int xfrm_state_update(struct xfrm_state *x)
 	}
 
 	if (x1->km.state == XFRM_STATE_ACQ) {
-		if (x->dir && x1->dir != x->dir)
+		if (x->dir && x1->dir != x->dir) {
+			to_put = x1;
 			goto out;
+		}
 
 		__xfrm_state_insert(x);
 		x = NULL;
 	} else {
-		if (x1->dir != x->dir)
+		if (x1->dir != x->dir) {
+			to_put = x1;
 			goto out;
+		}
 	}
 	err = 0;
@@ -3298,6 +3308,7 @@ int __net_init xfrm_state_init(struct net *net)
 void xfrm_state_fini(struct net *net)
 {
 	unsigned int sz;
+	int i;
 
 	flush_work(&net->xfrm.state_hash_work);
 	xfrm_state_flush(net, 0, false);
@@ -3305,14 +3316,17 @@ void xfrm_state_fini(struct net *net)
 
 	WARN_ON(!list_empty(&net->xfrm.state_all));
 
+	for (i = 0; i <= net->xfrm.state_hmask; i++) {
+		WARN_ON(!hlist_empty(net->xfrm.state_byseq + i));
+		WARN_ON(!hlist_empty(net->xfrm.state_byspi + i));
+		WARN_ON(!hlist_empty(net->xfrm.state_bysrc + i));
+		WARN_ON(!hlist_empty(net->xfrm.state_bydst + i));
+	}
+
 	sz = (net->xfrm.state_hmask + 1) * sizeof(struct hlist_head);
-	WARN_ON(!hlist_empty(net->xfrm.state_byseq));
 	xfrm_hash_free(net->xfrm.state_byseq, sz);
-	WARN_ON(!hlist_empty(net->xfrm.state_byspi));
 	xfrm_hash_free(net->xfrm.state_byspi, sz);
-	WARN_ON(!hlist_empty(net->xfrm.state_bysrc));
 	xfrm_hash_free(net->xfrm.state_bysrc, sz);
-	WARN_ON(!hlist_empty(net->xfrm.state_bydst));
 	xfrm_hash_free(net->xfrm.state_bydst, sz);
 	free_percpu(net->xfrm.state_cache_input);
 }

@@ -947,8 +947,11 @@ static struct xfrm_state *xfrm_state_construct(struct net *net,
 
 	if (attrs[XFRMA_SA_PCPU]) {
 		x->pcpu_num = nla_get_u32(attrs[XFRMA_SA_PCPU]);
-		if (x->pcpu_num >= num_possible_cpus())
+		if (x->pcpu_num >= num_possible_cpus()) {
+			err = -ERANGE;
+			NL_SET_ERR_MSG(extack, "pCPU number too big");
 			goto error;
+		}
 	}
 
 	err = __xfrm_init_state(x, extack);
@@ -3035,6 +3038,9 @@ static int xfrm_add_acquire(struct sk_buff *skb, struct nlmsghdr *nlh,
 	}
 
 	xfrm_state_free(x);
+	xfrm_dev_policy_delete(xp);
+	xfrm_dev_policy_free(xp);
+	security_xfrm_policy_free(xp->security);
 	kfree(xp);
 
 	return 0;

@@ -45,6 +45,7 @@ skf_net_off
 socket
 so_incoming_cpu
 so_netns_cookie
+so_peek_off
 so_txtime
 so_rcv_listener
 stress_reuseport_listen

@@ -6,6 +6,7 @@ TEST_GEN_PROGS := \
 	scm_inq \
 	scm_pidfd \
 	scm_rights \
+	so_peek_off \
 	unix_connect \
 
 # end of TEST_GEN_PROGS

@@ -0,0 +1,162 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright 2025 Google LLC */
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <sys/socket.h>
+
+#include "../../kselftest_harness.h"
+
+FIXTURE(so_peek_off)
+{
+	int fd[2]; /* 0: sender, 1: receiver */
+};
+
+FIXTURE_VARIANT(so_peek_off)
+{
+	int type;
+};
+
+FIXTURE_VARIANT_ADD(so_peek_off, stream)
+{
+	.type = SOCK_STREAM,
+};
+
+FIXTURE_VARIANT_ADD(so_peek_off, dgram)
+{
+	.type = SOCK_DGRAM,
+};
+
+FIXTURE_VARIANT_ADD(so_peek_off, seqpacket)
+{
+	.type = SOCK_SEQPACKET,
+};
+
+FIXTURE_SETUP(so_peek_off)
+{
+	struct timeval timeout = {
+		.tv_sec = 0,
+		.tv_usec = 3000,
+	};
+	int ret;
+
+	ret = socketpair(AF_UNIX, variant->type, 0, self->fd);
+	ASSERT_EQ(0, ret);
+
+	ret = setsockopt(self->fd[1], SOL_SOCKET, SO_RCVTIMEO_NEW,
+			 &timeout, sizeof(timeout));
+	ASSERT_EQ(0, ret);
+
+	ret = setsockopt(self->fd[1], SOL_SOCKET, SO_PEEK_OFF,
+			 &(int){0}, sizeof(int));
+	ASSERT_EQ(0, ret);
+}
+
+FIXTURE_TEARDOWN(so_peek_off)
+{
+	close_range(self->fd[0], self->fd[1], 0);
+}
+
+#define sendeq(fd, str, flags)					\
+	do {							\
+		int bytes, len = strlen(str);			\
+								\
+		bytes = send(fd, str, len, flags);		\
+		ASSERT_EQ(len, bytes);				\
+	} while (0)
+
+#define recveq(fd, str, buflen, flags)				\
+	do {							\
+		char buf[(buflen) + 1] = {};			\
+		int bytes;					\
+								\
+		bytes = recv(fd, buf, buflen, flags);		\
+		ASSERT_NE(-1, bytes);				\
+		ASSERT_STREQ(str, buf);				\
+	} while (0)
+
+#define async							\
+	for (pid_t pid = (pid = fork(),				\
+			  pid < 0 ?				\
+			  __TH_LOG("Failed to start async {}"),	\
+			  _metadata->exit_code = KSFT_FAIL,	\
+			  __bail(1, _metadata),			\
+			  0xdead :				\
+			  pid);					\
+	     !pid; exit(0))
+
+TEST_F(so_peek_off, single_chunk)
+{
+	sendeq(self->fd[0], "aaaabbbb", 0);
+
+	recveq(self->fd[1], "aaaa", 4, MSG_PEEK);
+	recveq(self->fd[1], "bbbb", 100, MSG_PEEK);
+}
+
+TEST_F(so_peek_off, two_chunks)
+{
+	sendeq(self->fd[0], "aaaa", 0);
+	sendeq(self->fd[0], "bbbb", 0);
+
+	recveq(self->fd[1], "aaaa", 4, MSG_PEEK);
+	recveq(self->fd[1], "bbbb", 100, MSG_PEEK);
+}
+
+TEST_F(so_peek_off, two_chunks_blocking)
+{
+	async {
+		usleep(1000);
+		sendeq(self->fd[0], "aaaa", 0);
+	}
+
+	recveq(self->fd[1], "aaaa", 4, MSG_PEEK);
+
+	async {
+		usleep(1000);
+		sendeq(self->fd[0], "bbbb", 0);
+	}
+
+	/* goto again; -> goto redo; in unix_stream_read_generic(). */
+	recveq(self->fd[1], "bbbb", 100, MSG_PEEK);
+}
+
+TEST_F(so_peek_off, two_chunks_overlap)
+{
+	sendeq(self->fd[0], "aaaa", 0);
+	recveq(self->fd[1], "aa", 2, MSG_PEEK);
+
+	sendeq(self->fd[0], "bbbb", 0);
+
+	if (variant->type == SOCK_STREAM) {
+		/* SOCK_STREAM tries to fill the buffer. */
+		recveq(self->fd[1], "aabb", 4, MSG_PEEK);
+		recveq(self->fd[1], "bb", 100, MSG_PEEK);
+	} else {
+		/* SOCK_DGRAM and SOCK_SEQPACKET returns at the skb boundary. */
+		recveq(self->fd[1], "aa", 100, MSG_PEEK);
+		recveq(self->fd[1], "bbbb", 100, MSG_PEEK);
+	}
+}
+
+TEST_F(so_peek_off, two_chunks_overlap_blocking)
+{
+	async {
+		usleep(1000);
+		sendeq(self->fd[0], "aaaa", 0);
+	}
+
+	recveq(self->fd[1], "aa", 2, MSG_PEEK);
+
+	async {
+		usleep(1000);
+		sendeq(self->fd[0], "bbbb", 0);
+	}
+
+	/* Even SOCK_STREAM does not wait if at least one byte is read. */
+	recveq(self->fd[1], "aa", 100, MSG_PEEK);
+	recveq(self->fd[1], "bbbb", 100, MSG_PEEK);
+}
+
+TEST_HARNESS_MAIN
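
For readers new to the option under test: once SO_PEEK_OFF is enabled by setting it to a non-negative offset, MSG_PEEK reads advance that per-socket offset instead of re-reading from the queue head; the fix being exercised is that the offset stays consistent even when a peek has to block for more data. A standalone demo of the behavior the single_chunk case checks, using assert() for brevity:

    #include <assert.h>
    #include <string.h>
    #include <sys/socket.h>

    int main(void)
    {
            char buf[5] = {};
            int fd[2];

            assert(socketpair(AF_UNIX, SOCK_STREAM, 0, fd) == 0);

            /* Offset 0 enables peek-with-offset on the receiver. */
            assert(setsockopt(fd[1], SOL_SOCKET, SO_PEEK_OFF,
                              &(int){0}, sizeof(int)) == 0);

            assert(send(fd[0], "aaaabbbb", 8, 0) == 8);

            /* First peek returns "aaaa" and advances the offset to 4... */
            assert(recv(fd[1], buf, 4, MSG_PEEK) == 4);
            assert(memcmp(buf, "aaaa", 4) == 0);

            /* ...so the next peek sees "bbbb" without consuming anything. */
            assert(recv(fd[1], buf, 4, MSG_PEEK) == 4);
            assert(memcmp(buf, "bbbb", 4) == 0);

            return 0;
    }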

@@ -30,6 +30,11 @@ tfail()
 	do_test "tfail" false
 }
 
+tfail2()
+{
+	do_test "tfail2" false
+}
+
 txfail()
 {
 	FAIL_TO_XFAIL=yes do_test "txfail" false
@@ -132,6 +137,8 @@ test_ret()
 	ret_subtest $ksft_fail "tfail" txfail tfail
 	ret_subtest $ksft_xfail "txfail" txfail txfail
+
+	ret_subtest $ksft_fail "tfail2" tfail2 tfail
 }
 
 exit_status_tests_run()

@@ -43,7 +43,7 @@ __ksft_status_merge()
 		weights[$i]=$((weight++))
 	done
 
-	if [[ ${weights[$a]} > ${weights[$b]} ]]; then
+	if [[ ${weights[$a]} -ge ${weights[$b]} ]]; then
 		echo "$a"
 		return 0
 	else
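
The strict > had two problems. Inside [[ ]], > is a lexicographic string comparison, not arithmetic, so it silently compared the weights as text; -ge forces a numeric comparison. It was also strict: on a tie the helper reported that the incoming status had won, which, as used by callers that record the failure message, let a second failing subtest overwrite the message from the first. The tfail2 case added above pins the corrected behavior: two fails in a row must still report "tfail2", the first failure. The tie-keeps-current rule in C terms:

    #include <assert.h>
    #include <string.h>

    struct result {
            int weight;       /* higher = worse, e.g. fail > xfail > pass */
            const char *msg;
    };

    /* Replace the current result only when the new one strictly
     * outranks it; ties keep the first (earliest) report. */
    static void merge(struct result *cur, const struct result *new)
    {
            if (new->weight > cur->weight)
                    *cur = *new;
    }

    int main(void)
    {
            struct result cur = { .weight = 3, .msg = "tfail2" };
            const struct result again = { .weight = 3, .msg = "tfail" };

            merge(&cur, &again);
            assert(strcmp(cur.msg, "tfail2") == 0);  /* first failure wins */
            return 0;
    }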

@@ -3500,7 +3500,6 @@ fullmesh_tests()
 fastclose_tests()
 {
 	if reset_check_counter "fastclose test" "MPTcpExtMPFastcloseTx"; then
-		MPTCP_LIB_SUBTEST_FLAKY=1
 		test_linkfail=1024 fastclose=client \
 			run_tests $ns1 $ns2 10.0.1.1
 		chk_join_nr 0 0 0
@@ -3509,7 +3508,6 @@ fastclose_tests()
 	fi
 
 	if reset_check_counter "fastclose server test" "MPTcpExtMPFastcloseRx"; then
-		MPTCP_LIB_SUBTEST_FLAKY=1
 		test_linkfail=1024 fastclose=server \
 			run_tests $ns1 $ns2 10.0.1.1
 		join_rst_nr=1 \
@@ -3806,7 +3804,7 @@ userspace_tests()
 	   continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then
 		set_userspace_pm $ns1
 		pm_nl_set_limits $ns2 2 2
-		{ test_linkfail=128 speed=5 \
+		{ timeout_test=120 test_linkfail=128 speed=5 \
 			run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
 		local tests_pid=$!
 		wait_mpj $ns1
@@ -3839,7 +3837,7 @@ userspace_tests()
 	   continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then
 		set_userspace_pm $ns2
 		pm_nl_set_limits $ns1 0 1
-		{ test_linkfail=128 speed=5 \
+		{ timeout_test=120 test_linkfail=128 speed=5 \
 			run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
 		local tests_pid=$!
 		wait_mpj $ns2
@@ -3867,7 +3865,7 @@ userspace_tests()
 	   continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then
 		set_userspace_pm $ns2
 		pm_nl_set_limits $ns1 0 1
-		{ test_linkfail=128 speed=5 \
+		{ timeout_test=120 test_linkfail=128 speed=5 \
 			run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
 		local tests_pid=$!
 		wait_mpj $ns2
@@ -3888,7 +3886,7 @@ userspace_tests()
 	   continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then
 		set_userspace_pm $ns2
 		pm_nl_set_limits $ns1 0 1
-		{ test_linkfail=128 speed=5 \
+		{ timeout_test=120 test_linkfail=128 speed=5 \
 			run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
 		local tests_pid=$!
 		wait_mpj $ns2
@@ -3912,7 +3910,7 @@ userspace_tests()
 	   continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then
 		set_userspace_pm $ns1
 		pm_nl_set_limits $ns2 1 1
-		{ test_linkfail=128 speed=5 \
+		{ timeout_test=120 test_linkfail=128 speed=5 \
 			run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
 		local tests_pid=$!
 		wait_mpj $ns1
@@ -3943,7 +3941,7 @@ endpoint_tests()
 		pm_nl_set_limits $ns1 2 2
 		pm_nl_set_limits $ns2 2 2
 		pm_nl_add_endpoint $ns1 10.0.2.1 flags signal
-		{ test_linkfail=128 speed=slow \
+		{ timeout_test=120 test_linkfail=128 speed=slow \
 			run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
 		local tests_pid=$!
 
@@ -3970,7 +3968,7 @@ endpoint_tests()
 		pm_nl_set_limits $ns2 0 3
 		pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow
 		pm_nl_add_endpoint $ns2 10.0.2.2 id 2 dev ns2eth2 flags subflow
-		{ test_linkfail=128 speed=5 \
+		{ timeout_test=120 test_linkfail=128 speed=5 \
			run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
 		local tests_pid=$!
 
@@ -4048,7 +4046,7 @@ endpoint_tests()
 		# broadcast IP: no packet for this address will be received on ns1
 		pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal
 		pm_nl_add_endpoint $ns1 10.0.1.1 id 42 flags signal
-		{ test_linkfail=128 speed=5 \
+		{ timeout_test=120 test_linkfail=128 speed=5 \
 			run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
 		local tests_pid=$!
 
@@ -4057,38 +4055,45 @@ endpoint_tests()
 			$ns1 10.0.2.1 id 1 flags signal
 		chk_subflow_nr "before delete" 2
 		chk_mptcp_info subflows 1 subflows 1
+		chk_mptcp_info add_addr_signal 2 add_addr_accepted 1
 
 		pm_nl_del_endpoint $ns1 1 10.0.2.1
 		pm_nl_del_endpoint $ns1 2 224.0.0.1
 		sleep 0.5
 		chk_subflow_nr "after delete" 1
 		chk_mptcp_info subflows 0 subflows 0
+		chk_mptcp_info add_addr_signal 0 add_addr_accepted 0
 
 		pm_nl_add_endpoint $ns1 10.0.2.1 id 1 flags signal
 		pm_nl_add_endpoint $ns1 10.0.3.1 id 2 flags signal
 		wait_mpj $ns2
 		chk_subflow_nr "after re-add" 3
 		chk_mptcp_info subflows 2 subflows 2
+		chk_mptcp_info add_addr_signal 2 add_addr_accepted 2
 
 		pm_nl_del_endpoint $ns1 42 10.0.1.1
 		sleep 0.5
 		chk_subflow_nr "after delete ID 0" 2
 		chk_mptcp_info subflows 2 subflows 2
+		chk_mptcp_info add_addr_signal 2 add_addr_accepted 2
 
 		pm_nl_add_endpoint $ns1 10.0.1.1 id 99 flags signal
 		wait_mpj $ns2
 		chk_subflow_nr "after re-add ID 0" 3
 		chk_mptcp_info subflows 3 subflows 3
+		chk_mptcp_info add_addr_signal 3 add_addr_accepted 2
 
 		pm_nl_del_endpoint $ns1 99 10.0.1.1
 		sleep 0.5
 		chk_subflow_nr "after re-delete ID 0" 2
 		chk_mptcp_info subflows 2 subflows 2
+		chk_mptcp_info add_addr_signal 2 add_addr_accepted 2
 
 		pm_nl_add_endpoint $ns1 10.0.1.1 id 88 flags signal
 		wait_mpj $ns2
 		chk_subflow_nr "after re-re-add ID 0" 3
 		chk_mptcp_info subflows 3 subflows 3
+		chk_mptcp_info add_addr_signal 3 add_addr_accepted 2
 
 		mptcp_lib_kill_group_wait $tests_pid
 		kill_events_pids
@@ -4121,7 +4126,7 @@ endpoint_tests()
 		# broadcast IP: no packet for this address will be received on ns1
 		pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal
 		pm_nl_add_endpoint $ns2 10.0.3.2 id 3 flags subflow
-		{ test_linkfail=128 speed=20 \
+		{ timeout_test=120 test_linkfail=128 speed=20 \
 			run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
 		local tests_pid=$!