linux/drivers/net/ethernet/intel/ice
Jacob Keller 84bf1ac85a ice: fix Rx page leak on multi-buffer frames
The ice_put_rx_mbuf() function handles calling ice_put_rx_buf() for each
buffer in the current frame. This function was introduced as part of
handling multi-buffer XDP support in the ice driver.

It works by iterating over the buffers from first_desc up to 1 plus the
total number of fragments in the frame, cached from before the XDP program
was executed.

If the hardware posts a descriptor with a size of 0, the logic used in
ice_put_rx_mbuf() breaks. Such descriptors get skipped and don't get added
as fragments in ice_add_xdp_frag(). Since the buffer isn't counted as a
fragment, we do not iterate over it in ice_put_rx_mbuf(), and thus we don't
call ice_put_rx_buf().

Because we don't call ice_put_rx_buf(), we don't attempt to re-use the
page or free it. This leaves a stale page in the ring, as we don't
increment next_to_alloc.

The ice_reuse_rx_page() function assumes that next_to_alloc has been
incremented properly, and that it always points to a buffer with a NULL
page. Since this function doesn't check, it will happily recycle a page
over the top of the next_to_alloc buffer, losing track of the old page.
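
The overwrite described above can be illustrated with a toy model. This is
only a sketch under stated assumptions: struct demo_page_buf, struct
demo_page_ring and demo_reuse_page() are hypothetical names invented for
illustration and are not the driver's types or code. The model only shows
how an unchecked store into the next_to_alloc slot loses a page that was
never put.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's ring structures. */
struct demo_page_buf {
	void *page;			/* NULL means the slot is free */
};

struct demo_page_ring {
	struct demo_page_buf buf[4];
	unsigned int next_to_alloc;	/* slot the next recycled page goes to */
	unsigned int count;		/* number of slots in use */
};

/*
 * Recycle old->page into the next_to_alloc slot without checking whether
 * that slot still holds a page, mirroring the assumption described above.
 * Returns true when a live page was overwritten, i.e. leaked.
 */
static bool demo_reuse_page(struct demo_page_ring *r, struct demo_page_buf *old)
{
	struct demo_page_buf *slot = &r->buf[r->next_to_alloc];
	bool leaked = slot->page != NULL;

	slot->page = old->page;		/* any previous pointer is lost here */
	old->page = NULL;
	r->next_to_alloc = (r->next_to_alloc + 1) % r->count;

	return leaked;
}
```

In this model, calling demo_reuse_page() while the next_to_alloc slot still
holds the page of a skipped zero-size descriptor returns true: nothing
references that page afterwards, so it can never be freed.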

Note that this leak only occurs for multi-buffer frames. The
ice_put_rx_mbuf() function always handles at least one buffer, so a
single-buffer frame will always get handled correctly. It is not clear
precisely why the hardware hands us descriptors with a size of 0 sometimes,
but it happens somewhat regularly with "jumbo frames" used by 9K MTU.

To fix ice_put_rx_mbuf(), we need to make sure to call ice_put_rx_buf() on
all buffers between first_desc and next_to_clean. Borrow the logic of a
similar function in i40e used for this same purpose. Use the same logic
also in ice_get_pgcnts().

Instead of iterating over just the number of fragments, use a loop which
iterates until the current index reaches the next_to_clean element just
past the current frame. Unlike i40e, the ice_put_rx_mbuf() function does
call ice_put_rx_buf() on the last buffer of the frame indicating the end of
packet.
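
The difference between the two iteration schemes can be sketched outside
the driver. The names below (DEMO_RING_LEN, struct demo_rx_buf,
demo_put_by_frag_count(), demo_put_until_ntc()) are hypothetical and exist
only for this illustration; the real functions operate on the driver's
ring structures and do more work per buffer.

```c
#include <stdbool.h>

#define DEMO_RING_LEN 8		/* hypothetical ring size for the sketch */

struct demo_rx_buf {
	bool put;		/* set once the buffer has been "put" */
};

/* Old scheme: visit 1 + nr_frags buffers starting at first_desc. */
static unsigned int demo_put_by_frag_count(struct demo_rx_buf *ring,
					   unsigned int first_desc,
					   unsigned int nr_frags)
{
	unsigned int idx = first_desc, visited = 0;

	for (unsigned int i = 0; i < 1 + nr_frags; i++) {
		ring[idx].put = true;
		visited++;
		if (++idx == DEMO_RING_LEN)	/* wrap at the end of the ring */
			idx = 0;
	}
	return visited;
}

/* New scheme: visit every buffer from first_desc up to, but excluding, ntc. */
static unsigned int demo_put_until_ntc(struct demo_rx_buf *ring,
				       unsigned int first_desc,
				       unsigned int ntc)
{
	unsigned int idx = first_desc, visited = 0;

	while (idx != ntc) {
		ring[idx].put = true;
		visited++;
		if (++idx == DEMO_RING_LEN)	/* wrap at the end of the ring */
			idx = 0;
	}
	return visited;
}
```

Take a frame that occupies slots 6, 7, 0 and 1, where one descriptor had a
size of 0 and so only two fragments were counted. The count-based walk
visits three buffers and never reaches slot 1, while the index-based walk
with ntc = 2 visits all four.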

For non-linear (multi-buffer) frames, we need to take care when adjusting
the pagecnt_bias. An XDP program might release fragments from the tail of
the frame, in which case that fragment page is already released. Only
update the pagecnt_bias for the first descriptor and fragments still
remaining post-XDP program. Take care to only access the shared info for
fragmented buffers, as this avoids a significant cache miss.
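
As a rough sketch of the "which buffers get adjusted" part only: the walk
below bumps a bias counter for the first buffer and for as many fragments
as remain after the XDP program. DEMO_BIAS_RING_LEN, struct demo_bias_buf,
demo_adjust_bias() and the frags_left parameter are hypothetical; when and
in which direction the real driver adjusts pagecnt_bias depends on the XDP
verdict and is not modelled here.

```c
#define DEMO_BIAS_RING_LEN 8	/* hypothetical ring size for the sketch */

struct demo_bias_buf {
	int pagecnt_bias;
};

/*
 * Walk the frame from first_desc up to, but excluding, ntc. Only the first
 * buffer (position 0) and the frags_left fragments that survived the XDP
 * program (positions 1..frags_left) have their bias adjusted. Returns the
 * number of buffers adjusted.
 */
static unsigned int demo_adjust_bias(struct demo_bias_buf *ring,
				     unsigned int first_desc,
				     unsigned int ntc,
				     unsigned int frags_left)
{
	unsigned int idx = first_desc, pos = 0, adjusted = 0;

	while (idx != ntc) {
		if (pos <= frags_left) {
			ring[idx].pagecnt_bias++;
			adjusted++;
		}
		pos++;
		if (++idx == DEMO_BIAS_RING_LEN)	/* wrap around the ring */
			idx = 0;
	}
	return adjusted;
}
```

For a frame in slots 2 through 5 where the XDP program trimmed the last two
fragments (frags_left = 1), only slots 2 and 3 are adjusted. Slots 4 and 5
are still walked, so they can still be put, but their bias is left alone.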

The xdp_xmit value only needs to be updated if an XDP program is run, and
only once per packet. Drop the xdp_xmit pointer argument from
ice_put_rx_mbuf(). Instead, set xdp_xmit in the ice_clean_rx_irq() function
directly. This avoids needing to pass the argument and avoids an extra
bit-wise OR for each buffer in the frame.

Move the increment of the ntc local variable to ensure it is updated
*before* all calls to ice_get_pgcnts() or ice_put_rx_mbuf(), as the loop
logic requires the index of the element just after the current frame.

Now that we use an index pointer in the ring to identify the packet, we no
longer need to track or cache the number of fragments in the rx_ring.

Cc: Christoph Petrausch <christoph.petrausch@deepl.com>
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
Closes: https://lore.kernel.org/netdev/CAK8fFZ4hY6GUJNENz3wY9jaYLZXGfpr7dnZxzGMYoE44caRbgw@mail.gmail.com/
Fixes: 743bbd93cf ("ice: put Rx buffers after being done with current frame")
Tested-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Tested-by: Priya Singh <priyax.singh@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-16 14:01:46 -07:00
devlink Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue 2025-07-25 16:01:18 -07:00
Makefile ice: move TSPLL functions to a separate file 2025-06-18 08:59:22 -07:00
ice.h ice: fix NULL pointer dereference in ice_unplug_aux_dev() on reset 2025-08-25 09:12:39 -07:00
ice_adapter.c ice: use fixed adapter index for E825C embedded devices 2025-08-25 09:45:23 -07:00
ice_adapter.h ice: use fixed adapter index for E825C embedded devices 2025-08-25 09:45:23 -07:00
ice_adminq_cmd.h ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_arfs.c net: ice: Perform accurate aRFS flow match 2025-06-17 10:09:18 -07:00
ice_arfs.h ice: use napi's irq affinity and rmap IRQ notifiers 2025-02-26 19:51:37 -08:00
ice_base.c net: Fix typos 2025-07-25 10:29:07 -07:00
ice_base.h
ice_common.c ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_common.h ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_controlq.c ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_controlq.h ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_dcb.c ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_dcb.h
ice_dcb_lib.c ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_dcb_lib.h iidc/ice/irdma: Update IDC to support multiple consumers 2025-05-09 11:35:43 -07:00
ice_dcb_nl.c ice: Replace ice specific DSCP mapping num with a kernel define 2025-04-30 13:09:08 -07:00
ice_dcb_nl.h
ice_ddp.c ice: don't leave device non-functional if Tx scheduler config fails 2025-08-25 09:44:43 -07:00
ice_ddp.h
ice_debugfs.c ice: check correct pointer in fwlog debugfs 2025-07-15 13:01:16 -07:00
ice_devids.h ice: add E835 device IDs 2025-07-18 09:02:28 -07:00
ice_dpll.c ice: use libie_aq_str 2025-07-24 09:40:49 -07:00
ice_dpll.h ice: add ref-sync dpll pins 2025-06-27 16:38:02 -07:00
ice_eswitch.c ice: fix eswitch code memory leak in reset scenario 2025-06-17 10:09:24 -07:00
ice_eswitch.h
ice_eswitch_br.c
ice_eswitch_br.h
ice_ethtool.c ice: use libie_aq_str 2025-07-24 09:40:49 -07:00
ice_ethtool.h ice: remove invalid parameter of equalizer 2025-01-24 10:49:42 -08:00
ice_ethtool_fdir.c ice: make const read-only array dflt_rules static 2025-04-11 11:58:57 -07:00
ice_fdir.c
ice_fdir.h
ice_flex_pipe.c ice: convert ice_add_prof() to bitmap 2025-07-18 09:02:28 -07:00
ice_flex_pipe.h ice: convert ice_add_prof() to bitmap 2025-07-18 09:02:28 -07:00
ice_flex_type.h
ice_flow.c ice: convert ice_add_prof() to bitmap 2025-07-18 09:02:28 -07:00
ice_flow.h net: intel: move RSS packet classifier types to libie 2025-06-09 09:56:18 -07:00
ice_fltr.c
ice_fltr.h
ice_fw_update.c ice: use libie_aq_str 2025-07-24 09:40:49 -07:00
ice_fw_update.h
ice_fwlog.c ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_fwlog.h
ice_gnss.c ice: Don't check device type when checking GNSS presence 2025-02-10 08:52:04 -08:00
ice_gnss.h ice: Don't check device type when checking GNSS presence 2025-02-10 08:52:04 -08:00
ice_hw_autogen.h ice: add functions to get and set Tx queue context 2025-07-10 14:33:33 -07:00
ice_hwmon.c
ice_hwmon.h
ice_idc.c ice: fix NULL pointer dereference in ice_unplug_aux_dev() on reset 2025-08-25 09:12:39 -07:00
ice_idc_int.h iidc/ice/irdma: Break iidc.h into two headers 2025-04-30 13:09:08 -07:00
ice_irq.c ice: Fix signedness bug in ice_init_interrupt_scheme() 2025-02-14 17:18:00 -08:00
ice_irq.h ice: simplify VF MSI-X managing 2025-02-05 09:04:57 -08:00
ice_lag.c ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_lag.h ice: breakout common LAG code into helpers 2025-07-18 09:02:28 -07:00
ice_lan_tx_rx.h ice: Add E830 checksum offload support 2025-03-18 10:15:49 +01:00
ice_lib.c Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue 2025-07-25 16:01:18 -07:00
ice_lib.h ice: move ice_vsi_update_l2tsel to ice_lib.c 2025-07-10 14:36:58 -07:00
ice_main.c ice: fix NULL access of tx->in_use in ice_ll_ts_intr 2025-09-02 11:05:50 -07:00
ice_nvm.c ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_nvm.h
ice_osdep.h
ice_parser.c
ice_parser.h ice: fix ice_parser_rt::bst_key array size 2025-01-24 10:49:30 -08:00
ice_parser_rt.c ice: fix ice_parser_rt::bst_key array size 2025-01-24 10:49:30 -08:00
ice_pf_vsi_vlan_ops.c
ice_pf_vsi_vlan_ops.h
ice_protocol_type.h
ice_ptp.c ice: fix NULL access of tx->in_use in ice_ptp_ts_irq 2025-09-02 11:05:50 -07:00
ice_ptp.h ice: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set() 2025-07-03 09:37:49 -07:00
ice_ptp_consts.h ice: rename TSPLL and CGU functions and definitions 2025-06-18 08:59:22 -07:00
ice_ptp_hw.c Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue 2025-07-25 16:01:18 -07:00
ice_ptp_hw.h ice: rename TSPLL and CGU functions and definitions 2025-06-18 08:59:22 -07:00
ice_repr.c ice: enable LLDP TX for VFs through tc 2025-04-11 10:45:52 -07:00
ice_repr.h
ice_sbq_cmd.h ice: refactor ice_sbq_msg_dev enum 2025-04-11 10:46:37 -07:00
ice_sched.c ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_sched.h
ice_sf_eth.c
ice_sf_eth.h
ice_sf_vsi_vlan_ops.c
ice_sf_vsi_vlan_ops.h
ice_sriov.c ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_sriov.h ice: expose VF functions used by live migration 2025-07-10 14:36:58 -07:00
ice_switch.c ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_switch.h
ice_tc_lib.c ice: improve error message for insufficient filter space 2025-04-11 11:58:57 -07:00
ice_tc_lib.h ice: enable LLDP TX for VFs through tc 2025-04-11 10:45:52 -07:00
ice_trace.h
ice_tspll.c ice: move TSPLL init calls to ice_ptp.c 2025-06-26 08:37:00 -07:00
ice_tspll.h ice: use designated initializers for TSPLL consts 2025-06-18 08:59:23 -07:00
ice_txrx.c ice: fix Rx page leak on multi-buffer frames 2025-09-16 14:01:46 -07:00
ice_txrx.h ice: fix Rx page leak on multi-buffer frames 2025-09-16 14:01:46 -07:00
ice_txrx_lib.c ice: Add E830 checksum offload support 2025-03-18 10:15:49 +01:00
ice_txrx_lib.h ice: stop storing XDP verdict within ice_rx_buf 2025-01-31 10:07:46 -08:00
ice_type.h ice: rename TSPLL and CGU functions and definitions 2025-06-18 08:59:22 -07:00
ice_vf_lib.c ice: breakout common LAG code into helpers 2025-07-18 09:02:28 -07:00
ice_vf_lib.h ice: introduce ice_get_vf_by_dev() wrapper 2025-07-10 14:37:39 -07:00
ice_vf_lib_private.h ice: Fix deinitializing VF in error path 2025-02-25 19:09:36 -08:00
ice_vf_mbx.c ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_vf_mbx.h
ice_vf_vsi_vlan_ops.c
ice_vf_vsi_vlan_ops.h
ice_virtchnl.c ice: use libie_aq_str 2025-07-24 09:40:49 -07:00
ice_virtchnl.h ice: expose VF functions used by live migration 2025-07-10 14:36:58 -07:00
ice_virtchnl_allowlist.c net: intel: rename 'hena' to 'hashcfg' for clarity 2025-06-09 09:56:18 -07:00
ice_virtchnl_allowlist.h
ice_virtchnl_fdir.c treewide, timers: Rename from_timer() to timer_container_of() 2025-06-08 09:07:37 +02:00
ice_virtchnl_fdir.h
ice_vlan.h
ice_vlan_mode.c ice, libie: move generic adminq descriptors to lib 2025-07-24 09:22:26 -07:00
ice_vlan_mode.h
ice_vsi_vlan_lib.c ice: use libie_aq_str 2025-07-24 09:40:49 -07:00
ice_vsi_vlan_lib.h
ice_vsi_vlan_ops.c
ice_vsi_vlan_ops.h
ice_xsk.c ice: use generic unrolled_count() macro 2025-02-10 17:54:43 -08:00
ice_xsk.h ice: use generic unrolled_count() macro 2025-02-10 17:54:43 -08:00