Commit Graph

224 Commits

Author SHA1 Message Date
Julia Lawall a13de901e8 gve: use vmalloc_array and vcalloc
Use vmalloc_array and vcalloc to protect against
multiplication overflows.

The changes were done using the following Coccinelle
semantic patch:

// <smpl>
@initialize:ocaml@
@@

let rename alloc =
  match alloc with
    "vmalloc" -> "vmalloc_array"
  | "vzalloc" -> "vcalloc"
  | _ -> failwith "unknown"

@@
    size_t e1,e2;
    constant C1, C2;
    expression E1, E2, COUNT, x1, x2, x3;
    typedef u8;
    typedef __u8;
    type t = {u8,__u8,char,unsigned char};
    identifier alloc = {vmalloc,vzalloc};
    fresh identifier realloc = script:ocaml(alloc) { rename alloc };
@@

(
      alloc(x1*x2*x3)
|
      alloc(C1 * C2)
|
      alloc((sizeof(t)) * (COUNT), ...)
|
-     alloc((e1) * (e2))
+     realloc(e1, e2)
|
-     alloc((e1) * (COUNT))
+     realloc(COUNT, e1)
|
-     alloc((E1) * (E2))
+     realloc(E1, E2)
)
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Link: https://lore.kernel.org/r/20230627144339.144478-5-Julia.Lawall@inria.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-27 09:30:23 -07:00
Coco Li a695641c8e gve: Support IPv6 Big TCP on DQ
Add support for using IPv6 Big TCP on DQ which can handle large TSO/GRO
packets. See https://lwn.net/Articles/895398/. This can improve the
throughput and CPU usage.

Perf test result:
ip -d link show $DEV
gso_max_size 185000 gso_max_segs 65535 tso_max_size 262143 tso_max_segs 65535 gro_max_size 185000

For performance, tested with neper using 9k MTU on hardware that supports 200Gb/s line rate.

In single streams when line rate is not saturated, we expect throughput improvements.
When the networking is performing at line rate, we expect cpu usage improvements.

Tcp_stream (unidirectional stream test, T=thread, F=flow):
skb=180kb, T=1, F=1, no zerocopy: throughput average=64576.88 Mb/s, sender stime=8.3, receiver stime=10.68
skb=64kb,  T=1, F=1, no zerocopy: throughput average=64862.54 Mb/s, sender stime=9.96, receiver stime=12.67
skb=180kb, T=1, F=1, yes zerocopy:  throughput average=146604.97 Mb/s, sender stime=10.61, receiver stime=5.52
skb=64kb,  T=1, F=1, yes zerocopy:  throughput average=131357.78 Mb/s, sender stime=12.11, receiver stime=12.25

skb=180kb, T=20, F=100, no zerocopy:  throughput average=182411.37 Mb/s, sender stime=41.62, receiver stime=79.4
skb=64kb,  T=20, F=100, no zerocopy:  throughput average=182892.02 Mb/s, sender stime=57.39, receiver stime=72.69
skb=180kb, T=20, F=100, yes zerocopy: throughput average=182337.65 Mb/s, sender stime=27.94, receiver stime=39.7
skb=64kb,  T=20, F=100, yes zerocopy: throughput average=182144.20 Mb/s, sender stime=47.06, receiver stime=39.01

Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
Signed-off-by: Coco Li <lixiaoyan@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230522201552.3585421-1-ziweixiao@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23 21:11:35 -07:00
Ziwei Xiao f4c2e67c17 gve: Remove the code of clearing PBA bit
Clearing the PBA bit from the driver is race prone and it may lead to
dropped interrupt events. This could potentially lead to the traffic
being completely halted.

Fixes: 5e8c5adf95 ("gve: DQO: Add core netdev features")
Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 10:30:46 +01:00
Shailend Chand 4de00f0acc gve: Unify duplicate GQ min pkt desc size constants
The two constants accomplish the same thing.

Signed-off-by: Shailend Chand <shailend@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230407184830.309398-1-shailend@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-04-11 15:47:14 +02:00
Jakub Kicinski d9c960675a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

drivers/net/ethernet/google/gve/gve.h
  3ce9345580 ("gve: Secure enough bytes in the first TX desc for all TCP pkts")
  75eaae158b ("gve: Add XDP DROP and TX support for GQI-QPL format")
https://lore.kernel.org/all/20230406104927.45d176f5@canb.auug.org.au/
https://lore.kernel.org/all/c5872985-1a95-0bc8-9dcc-b6f23b439e9d@tessares.net/

Adjacent changes:

net/can/isotp.c
  051737439e ("can: isotp: fix race between isotp_sendsmg() and isotp_release()")
  96d1c81e6a ("can: isotp: add module parameter for maximum pdu size")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-06 12:01:20 -07:00
Shailend Chand 3ce9345580 gve: Secure enough bytes in the first TX desc for all TCP pkts
Non-GSO TCP packets whose SKBs' linear portion did not include the
entire TCP header were not populating the first Tx descriptor with
as many bytes as the vNIC expected. This change ensures that all
TCP packets populate the first descriptor with the correct number of
bytes.

Fixes: 893ce44df5 ("gve: Add basic driver framework for Compute Engine Virtual NIC")
Signed-off-by: Shailend Chand <shailend@google.com>
Link: https://lore.kernel.org/r/20230403172809.2939306-1-shailend@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-04 18:58:12 -07:00
Jakub Kicinski dc0a7b5200 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
  6e9d51b1a5 ("net/mlx5e: Initialize link speed to zero")
  1bffcea429 ("net/mlx5e: Add devlink hairpin queues parameters")
https://lore.kernel.org/all/20230324120623.4ebbc66f@canb.auug.org.au/
https://lore.kernel.org/all/20230321211135.47711-1-saeed@kernel.org/

Adjacent changes:

drivers/net/phy/phy.c
  323fe43cf9 ("net: phy: Improved PHY error reporting in state machine")
  4203d84032 ("net: phy: Ensure state transitions are processed from phy_stop()")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-24 10:10:20 -07:00
Joshua Washington 68c3e4fc86 gve: Cache link_speed value from device
The link speed is never changed for the uptime of a VM, and the current
implementation sends an admin queue command for each call. Admin queue
command invocations have nontrivial overhead (e.g., VM exits), which can
be disruptive to users if triggered frequently. Our telemetry data shows
that there are VMs that make frequent calls to this admin queue command.
Caching the result of the original admin queue command would eliminate
the need to send multiple admin queue commands on subsequent calls to
retrieve link speed.

Fixes: 7e074d5a76 ("gve: Enable Link Speed Reporting in the driver.")
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230321172332.91678-1-joshwash@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22 22:03:21 -07:00
Praveen Kaligineedi fd8e40321a gve: Add AF_XDP zero-copy support for GQI-QPL format
Adding AF_XDP zero-copy support.

Note: Although these changes support AF_XDP socket in zero-copy
mode, there is still a copy happening within the driver between
XSK buffer pool and QPL bounce buffers in GQI-QPL format.
In GQI-QPL queue format, the driver needs to allocate a fixed size
memory, the size specified by vNIC device, for RX/TX and register this
memory as a bounce buffer with the vNIC device when a queue is
created. The number of pages in the bounce buffer is limited and the
pages need to be made available to the vNIC by copying the RX data out
to prevent head-of-line blocking. Therefore, we cannot pass the XSK
buffer pool to the vNIC.

The number of copies on RX path from the bounce buffer to XSK buffer is 2
for AF_XDP copy mode (bounce buffer -> allocated page frag -> XSK buffer)
and 1 for AF_XDP zero-copy mode (bounce buffer -> XSK buffer).

This patch contains the following changes:
1) Enable and disable XSK buffer pool
2) Copy XDP packets from QPL bounce buffers to XSK buffer on rx
3) Copy XDP packets from XSK buffer to QPL bounce buffers and
   ring the doorbell as part of XDP TX napi poll
4) ndo_xsk_wakeup callback support

Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17 08:29:21 +00:00
Praveen Kaligineedi 39a7f4aa3e gve: Add XDP REDIRECT support for GQI-QPL format
This patch contains the following changes:
1) Support for XDP REDIRECT action on rx
2) ndo_xdp_xmit callback support

In GQI-QPL queue format, the driver needs to allocate a fixed size
memory, the size specified by vNIC device, for RX/TX and register this
memory as a bounce buffer with the vNIC device when a queue is created.
The number of pages in the bounce buffer is limited and the pages need to
be made available to the vNIC by copying the RX data out to prevent
head-of-line blocking. The XDP_REDIRECT packets are therefore immediately
copied to a newly allocated page.

Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17 08:29:21 +00:00
Praveen Kaligineedi 75eaae158b gve: Add XDP DROP and TX support for GQI-QPL format
Add support for XDP PASS, DROP and TX actions.

This patch contains the following changes:
1) Support installing/uninstalling XDP program
2) Add dedicated XDP TX queues
3) Add support for XDP DROP action
4) Add support for XDP TX action

Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17 08:29:20 +00:00
Praveen Kaligineedi 7fc2bf78a4 gve: Changes to add new TX queues
Changes to enable adding and removing TX queues without calling
gve_close() and gve_open().

Made the following changes:
1) priv->tx, priv->rx and priv->qpls arrays are allocated based on
   max tx queues and max rx queues
2) Changed gve_adminq_create_tx_queues(), gve_adminq_destroy_tx_queues(),
gve_tx_alloc_rings() and gve_tx_free_rings() functions to add/remove a
subset of TX queues rather than all the TX queues.

Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17 08:29:20 +00:00
Praveen Kaligineedi 2e80aeae9f gve: XDP support GQI-QPL: helper function changes
This patch adds/modifies helper functions needed to add XDP
support.

Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17 08:29:20 +00:00
Praveen Kaligineedi 8437114593 gve: Fix gve interrupt names
IRQs are currently requested before the netdevice is registered
and a proper name is assigned to the device. Changing interrupt
name to avoid using the format string in the name.

Interrupt name before change: eth%d-ntfy-block.<blk_id>
Interrupt name after change: gve-ntfy-blk<blk_id>@pci:<pci_name>

Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Jeroen de Borst <jeroendb@google.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 10:03:26 +00:00
Jeroen de Borst a5affbd8a7 gve: Handle alternate miss completions
The virtual NIC has 2 ways of indicating a miss-path
completion. This handles the alternate.

Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 10:52:14 +00:00
Jeroen de Borst c2a0c3ed5b gve: Adding a new AdminQ command to verify driver
Check whether the driver is compatible with the device
presented.

Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 10:52:14 +00:00
Yang Yingliang 64c426dfbb gve: Fix error return code in gve_prefill_rx_pages()
If alloc_page() fails in gve_prefill_rx_pages(), it should return
an error code in the error path.

Fixes: 82fd151d38 ("gve: Reduce alloc and copy costs in the GQ rx path")
Cc: Jeroen de Borst <jeroendb@google.com>
Cc: Catherine Sullivan <csully@google.com>
Cc: Shailend Chand <shailend@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07 11:32:28 +00:00
Shailend Chand 82fd151d38 gve: Reduce alloc and copy costs in the GQ rx path
Previously, even if just one of the many fragments of a 9k packet
required a copy, we'd copy the whole packet into a freshly-allocated
9k-sized linear SKB, and this led to performance issues.

By having a pool of pages to copy into, each fragment can be
independently handled, leading to a reduced incidence of
allocation and copy.

Signed-off-by: Shailend Chand <shailend@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-02 11:52:51 +00:00
Thomas Gleixner 068c38ad88 net: Remove the obsolte u64_stats_fetch_*_irq() users (drivers).
Now that the 32bit UP oddity is gone and 32bit uses always a sequence
count, there is no need for the fetch_irq() variants anymore.

Convert to the regular interface.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28 20:13:54 -07:00
Jakub Kicinski b48b89f9c1 net: drop the weight argument from netif_napi_add
We tell driver developers to always pass NAPI_POLL_WEIGHT
as the weight to netif_napi_add(). This may be confusing
to newcomers, drop the weight argument, those who really
need to tweak the weight can use netif_napi_add_weight().

Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN
Link: https://lore.kernel.org/r/20220927132753.750069-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:57:14 -07:00
Shailend Chand 8ccac4edc8 gve: Fix GFP flags when allocing pages
Use GFP_ATOMIC when allocating pages out of the hotpath,
continue to use GFP_KERNEL when allocating pages during setup.

GFP_KERNEL will allow blocking which allows it to succeed
more often in a low memory enviornment but in the hotpath we do
not want to allow the allocation to block.

Fixes: 9b8dd5e5ea ("gve: DQO: Add RX path")
Signed-off-by: Shailend Chand <shailend@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Link: https://lore.kernel.org/r/20220913000901.959546-1-jeroendb@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-19 18:31:06 -07:00
Sebastian Andrzej Siewior 278d3ba615 net: Use u64_stats_fetch_begin_irq() for stats fetch.
On 32bit-UP u64_stats_fetch_begin() disables only preemption. If the
reader is in preemptible context and the writer side
(u64_stats_update_begin*()) runs in an interrupt context (IRQ or
softirq) then the writer can update the stats during the read operation.
This update remains undetected.

Use u64_stats_fetch_begin_irq() to ensure the stats fetch on 32bit-UP
are not interrupted by a writer. 32bit-SMP remains unaffected by this
change.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Catherine Sullivan <csully@google.com>
Cc: David Awogbemila <awogbemila@google.com>
Cc: Dimitris Michailidis <dmichail@fungible.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jeroen de Borst <jeroendb@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Simon Horman <simon.horman@corigine.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: oss-drivers@corigine.com
Cc: stable@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-29 13:02:27 +01:00
Eric Dumazet 504148fedb net: add skb_[inner_]tcp_all_headers helpers
Most drivers use "skb_transport_offset(skb) + tcp_hdrlen(skb)"
to compute headers length for a TCP packet, but others
use more convoluted (but equivalent) ways.

Add skb_tcp_all_headers() and skb_inner_tcp_all_headers()
helpers to harmonize this a bit.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-02 16:22:25 +01:00
Jilin Yuan 577d7685d5 google/gve:fix repeated words in comments
Delete the redundant word 'a'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220629130013.3273-1-yuanjilin@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-30 20:04:10 -07:00
Colin Ian King 2fc559c8cb gve: Fix spelling mistake "droping" -> "dropping"
There is a spelling mistake in a netdev_warn warning. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20220315222615.2960504-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-16 19:29:00 -07:00
Haiyue Wang b0471c2610 gve: enhance no queue page list detection
The commit
a5886ef4f4 ("gve: Introduce per netdev `enum gve_queue_format`")
introduces three queue format type, only GVE_GQI_QPL_FORMAT queue has
page list. So it should use the queue page list number to detect the
zero size queue page list. Correct the design logic.

Using the 'queue_format == GVE_GQI_RDA_FORMAT' may lead to request zero
sized memory allocation, like if the queue format is GVE_DQO_RDA_FORMAT.

The kernel memory subsystem will return ZERO_SIZE_PTR, which is not NULL
address, so the driver can run successfully. Also the code still checks
the queue page list number firstly, then accesses the allocated memory,
so zero number queue page list allocation will not lead to access fault.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Bailey Forrest <bcf@google.com>
Link: https://lore.kernel.org/r/20220215051751.260866-1-haiyue.wang@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-15 18:01:06 -08:00
Tao Liu 084cbb2ec3 gve: Recording rx queue before sending to napi
This caused a significant performance degredation when using generic XDP
with multiple queues.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Tao Liu <xliutaox@google.com>
Link: https://lore.kernel.org/r/20220207175901.2486596-1-jeroendb@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-08 16:52:31 -08:00
Haiyue Wang 1f84a9450d gve: fix the wrong AdminQ buffer queue index check
The 'tail' and 'head' are 'unsigned int' type free-running count, when
'head' is overflow, the 'int i (= tail) < u32 head' will be false:

Only '- loop 0: idx = 63' result is shown, so it needs to use 'int' type
to compare, it can handle the overflow correctly.

typedef uint32_t u32;

int main()
{
        u32 tail, head;
        int stail, shead;
        int i, loop;

        tail = 0xffffffff;
        head = 0x00000000;

        for (i = tail, loop = 0; i < head; i++) {
                unsigned int idx = i & 63;

                printf("+ loop %d: idx = %u\n", loop++, idx);
        }

        stail = tail;
        shead = head;
        for (i = stail, loop = 0; i < shead; i++) {
                unsigned int idx = i & 63;

                printf("- loop %d: idx = %u\n", loop++, idx);
        }

        return 0;
}

Fixes: 5cdad90de6 ("gve: Batch AQ commands for creating and destroying queues.")
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-28 15:07:56 +00:00
Catherine Sullivan a92f7a6fee gve: Fix GFP flags when allocing pages
Use GFP_ATOMIC when allocating pages out of the hotpath,
continue to use GFP_KERNEL when allocating pages during setup.

GFP_KERNEL will allow blocking which allows it to succeed
more often in a low memory enviornment but in the hotpath we do
not want to allow the allocation to block.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Link: https://lore.kernel.org/r/20220126003843.3584521-1-awogbemila@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-26 18:45:01 -08:00
Tao Liu 6081ac2013 gve: Add tx|rx-coalesce-usec for DQO
Adding ethtool support for changing rx-coalesce-usec and tx-coalesce-usec
when using the DQO queue format.

Signed-off-by: Tao Liu <xliutaox@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:54 +00:00
Jordan Kim 2c9198356d gve: Add consumed counts to ethtool stats
Being able to see how many descriptors are in-use is helpful
when diagnosing certain issues.

Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: Jordan Kim <jrkim@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:54 +00:00
Catherine Sullivan 974365e518 gve: Implement suspend/resume/shutdown
Add support for suspend, resume and shutdown.

Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:54 +00:00
Willem de Bruijn 497dbb2b97 gve: Add optional metadata descriptor type GVE_TXD_MTD
Allow drivers to pass metadata along with packet data to the device.
Introduce a new metadata descriptor type

* GVE_TXD_MTD

This descriptor is optional. If present it immediate follows the
packet descriptor and precedes the segment descriptor.

This descriptor may be repeated. Multiple metadata descriptors may
follow. There are no immediate uses for this, this is for future
proofing. At present devices allow only 1 MTD descriptor.

The lower four bits of the type_flags field encode GVE_TXD_MTD.
The upper four bits of the type_flags field encodes a *sub*type.

Introduce one such metadata descriptor subtype

* GVE_MTD_SUBTYPE_PATH

This shares path information with the device for network failure
discovery and robust response:

Linux derives ipv6 flowlabel and ECMP multipath from sk->sk_txhash,
and updates this field on error with sk_rethink_txhash. Allow the host
stack to do the same. Pass the tx_hash value if set. Also communicate
whether the path hash is set, or more exactly, what its type is. Define
two common types

  GVE_MTD_PATH_HASH_NONE
  GVE_MTD_PATH_HASH_L4

Concrete examples of error conditions that are resolved are
mentioned in the commits that add sk_rethink_txhash calls. Such as
commit 7788174e87 ("tcp: change IPv6 flow-label upon receiving
spurious retransmission").

Experimental results mirror what the theory suggests: where IPv6
FlowLabel is included in path selection (e.g., LAG/ECMP), flowlabel
rotation on TCP timeout avoids the vast majority of TCP disconnects
that would otherwise have occurred during link failures in long-haul
backbones, when an alternative path is available.

Rotation can be applied to various bad connection signals, such as
timeouts and spurious retransmissions. In aggregate, such flow level
signals can help locate network issues. Define initial common states:

  GVE_MTD_PATH_STATE_DEFAULT
  GVE_MTD_PATH_STATE_TIMEOUT
  GVE_MTD_PATH_STATE_CONGESTION
  GVE_MTD_PATH_STATE_RETRANSMIT

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:54 +00:00
Catherine Sullivan 5fd07df47a gve: remove memory barrier around seqno
No longer needed after we introduced the barrier in gve_napi_poll.

Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:54 +00:00
Catherine Sullivan 13e7939c95 gve: Update gve_free_queue_page_list signature
The id field should be a u32 not a signed int.

Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:53 +00:00
Catherine Sullivan d30baacc04 gve: Move the irq db indexes out of the ntfy block struct
Giving the device access to other kernel structs is not ideal.
Move the indexes into their own array and just keep pointers to
them in the ntfy block struct.

Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:53 +00:00
Jeroen de Borst a10834a36c gve: Correct order of processing device options
The legacy raw addressing device option was processed before the
new RDA queue format option.  This caused the supported features mask,
which is provided only on the RDA queue format option, not to be set.

This disabled jumbo-frame support when using raw adressing.

Fixes: 255489f5b3 ("gve: Add a jumbo-frame device option")
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:41:53 +00:00
Jakub Kicinski 3150a73366 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 13:23:02 -08:00
Ameer Hamza e6f60c51f0 gve: fix for null pointer dereference.
Avoid passing NULL skb to __skb_put() function call if
napi_alloc_skb() returns NULL.

Fixes: 37149e9374 ("gve: Implement packet continuation for RX.")
Signed-off-by: Ameer Hamza <amhamza.mgc@gmail.com>
Link: https://lore.kernel.org/r/20211205183810.8299-1-amhamza.mgc@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 20:57:17 -08:00
Hao Chen 7462494408 ethtool: extend ringparam setting/getting API with rx_buf_len
Add two new parameters kernel_ringparam and extack for
.get_ringparam and .set_ringparam to extend more ring params
through netlink.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 12:31:49 +00:00
Dan Carpenter 721111b1b2 gve: fix unmatched u64_stats_update_end()
The u64_stats_update_end() call is supposed to be inside the curly
braces so it pairs with the u64_stats_update_begin().

Fixes: 37149e9374 ("gve: Implement packet continuation for RX.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-10 14:42:25 +00:00
Dan Carpenter 1c360cc1cc gve: Fix off by one in gve_tx_timeout()
The priv->ntfy_blocks[] has "priv->num_ntfy_blks" elements so this >
needs to be >= to prevent an off by one bug.  The priv->ntfy_blocks[]
array is allocated in gve_alloc_notify_blocks().

Fixes: 87a7f321bb ("gve: Recover from queue stall due to missed IRQ")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-09 13:58:46 +00:00
Shailend Chand 255489f5b3 gve: Add a jumbo-frame device option.
A widely deployed driver has a bug that will cause the driver not
to load when a max_mtu > 2048 is present in the device descriptor.

To avoid this bug while still enabling jumbo frames, we present a lower
max_mtu in the device descriptor and pass the actual max_mtu in
a separate device option.

The driver supports 2 different queue formats. To enable features
on one queue format, but not the other, a supported_features mask
was added to the device options in the device descriptor.

Signed-off-by: Shailend Chand <shailend@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:13:12 +01:00
David Awogbemila 37149e9374 gve: Implement packet continuation for RX.
This enables the driver to receive RX packets spread across multiple
buffers:

For a given multi-fragment packet the "packet continuation" bit is set
on all descriptors except the last one. These descriptors' payloads are
combined into a single SKB before the SKB is handed to the
networking stack.

This change adds a "packet buffer size" notion for RX queues. The
CreateRxQueue AdminQueue command sent to the device now includes the
packet_buffer_size.

We opt for a packet_buffer_size of PAGE_SIZE / 2 to give the
driver the opportunity to flip pages where we can instead of copying.

Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:13:12 +01:00
David Awogbemila 1344e751e9 gve: Add RX context.
This refactor moves the skb_head and skb_tail fields into a new
gve_rx_ctx struct. This new struct will contain information about the
current packet being processed. This is in preparation for
multi-descriptor RX packets.

Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:13:12 +01:00
Catherine Sullivan 1b4d1c9bab gve: Track RX buffer allocation failures
The rx_buf_alloc_fail counter wasn't getting updated.

Fixes: 433e274b8f ("gve: Add stats for gve.")
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-11 23:25:36 +01:00
Jordan Kim ea5d3455ad gve: Allow pageflips on larger pages
Half pages are just used for small enough packets. This change allows
this to also apply for systems with pages larger than 4 KB.

Fixes: 02b0e0c18b ("gve: Rx Buffer Recycling")
Signed-off-by: Jordan Kim <jrkim@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-11 23:25:36 +01:00
Catherine Sullivan 4edf8249bc gve: Add netif_set_xps_queue call
Configure XPS when adding tx queues to the notification blocks.

Fixes: dbdaa67540 ("gve: Move some static functions to a common file")
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-11 23:25:36 +01:00
John Fraker 87a7f321bb gve: Recover from queue stall due to missed IRQ
Don't always reset the driver on a TX timeout. Attempt to
recover by kicking the queue in case an IRQ was missed.

Fixes: 9e5f7d26a4 ("gve: Add workqueue and reset support")
Signed-off-by: John Fraker <jfraker@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-11 23:25:36 +01:00
Tao Liu 61d72c7e48 gve: Do lazy cleanup in TX path
When TX queue is full, attemt to process enough TX completions
to avoid stalling the queue.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Tao Liu <xliutaox@google.com>
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-11 23:25:35 +01:00
Catherine Sullivan 58401b2a46 gve: Add rx buffer pagecnt bias
Add a pagecnt bias field to rx buffer info struct to eliminate
needing to increment the atomic page ref count on every pass in the
rx hotpath.

Also prefetch two packet pages ahead.

Fixes: ede3fcf5ec ("gve: Add support for raw addressing to the rx path")
Signed-off-by: Yanchun Fu <yangchun@google.com>
Signed-off-by: Nathan Lewis <npl@google.com>
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-11 23:25:35 +01:00
Yangchun Fu 2cb67ab153 gve: Switch to use napi_complete_done
Use napi_complete_done to allow for the use of gro_flush_timeout.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Yangchun Fu <yangchun@google.com>
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-11 23:25:35 +01:00
Jakub Kicinski 9fe1155233 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-07 15:24:06 -07:00
Eric Dumazet 17c37d748f gve: report 64bit tx_bytes counter from gve_handle_report_stats()
Each tx queue maintains a 64bit counter for bytes, there is
no reason to truncate this to 32bit (or this has not been
documented)

Fixes: 24aeb56f2d ("gve: Add Gvnic stats AQ command and ethtool show/set-priv-flags.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yangchun Fu <yangchun@google.com>
Cc: Kuo Zhao <kuozhao@google.com>
Cc: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-06 15:11:51 +01:00
Eric Dumazet 2f57d4975f gve: fix gve_get_stats()
gve_get_stats() can report wrong numbers if/when u64_stats_fetch_retry()
returns true.

What is needed here is to sample values in temporary variables,
and only use them after each loop is ended.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Catherine Sullivan <csully@google.com>
Cc: Sagi Shahar <sagis@google.com>
Cc: Jon Olson <jonolson@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Luigi Rizzo <lrizzo@google.com>
Cc: Jeroen de Borst <jeroendb@google.com>
Cc: Tao Liu <xliutaox@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-06 15:11:51 +01:00
Catherine Sullivan d4b111fda6 gve: Properly handle errors in gve_assign_qpl
Ignored errors would result in crash.

Fixes: ede3fcf5ec ("gve: Add support for raw addressing to the rx path")
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-06 15:04:26 +01:00
Tao Liu 922aa9bcac gve: Avoid freeing NULL pointer
Prevent possible crashes when cleaning up after unsuccessful
initializations.

Fixes: 893ce44df5 ("gve: Add basic driver framework for Compute Engine Virtual NIC")
Signed-off-by: Tao Liu <xliutaox@google.com>
Signed-off-by: Catherine Sully <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-06 15:04:26 +01:00
Catherine Sullivan d03477ee10 gve: Correct available tx qpl check
The qpl_map_size is rounded up to a multiple of sizeof(long), but the
number of qpls doesn't have to be.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-06 15:04:26 +01:00
Jakub Kicinski f3956ebb3b ethernet: use eth_hw_addr_set() instead of ether_addr_copy()
Convert Ethernet from ether_addr_copy() to eth_hw_addr_set():

  @@
  expression dev, np;
  @@
  - ether_addr_copy(dev->dev_addr, np)
  + eth_hw_addr_set(dev, np)

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-02 14:18:25 +01:00
Gustavo A. R. Silva 7fec4d3919 gve: Use kvcalloc() instead of kvzalloc()
Use 2-factor argument form kvcalloc() instead of kvzalloc().

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-29 11:40:51 +01:00
Arnd Bergmann 1e0083bd07 gve: DQO: avoid unused variable warnings
The use of dma_unmap_addr()/dma_unmap_len() in the driver causes
multiple warnings when these macros are defined as empty, e.g.
in an ARCH=i386 allmodconfig build:

drivers/net/ethernet/google/gve/gve_tx_dqo.c: In function 'gve_tx_add_skb_no_copy_dqo':
drivers/net/ethernet/google/gve/gve_tx_dqo.c:494:40: error: unused variable 'buf' [-Werror=unused-variable]
  494 |                 struct gve_tx_dma_buf *buf =

This is not how the NEED_DMA_MAP_STATE macros are meant to work,
as they rely on never using local variables or a temporary structure
like gve_tx_dma_buf.

Remote the gve_tx_dma_buf definition and open-code the contents
in all places to avoid the warning. This causes some rather long
lines but otherwise ends up making the driver slightly smaller.

Fixes: a57e5de476 ("gve: DQO: Add TX path")
Link: https://lore.kernel.org/netdev/20210723231957.1113800-1-bcf@google.com/
Link: https://lore.kernel.org/netdev/20210721151100.2042139-1-arnd@kernel.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 15:24:36 +01:00
Haiyue Wang 63a9192b8f gve: fix the wrong AdminQ buffer overflow check
The 'tail' pointer is also free-running count, so it needs to be masked
as 'adminq_prod_cnt' does, to become an index value of AdminQ buffer.

Fixes: 5cdad90de6 ("gve: Batch AQ commands for creating and destroying queues.")
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-16 10:41:40 -07:00
Bailey Forrest 1bfa4d0cb5 gve: DQO: Remove incorrect prefetch
The prefetch is incorrectly using the dma address instead of the virtual
address.

It's supposed to be:
prefetch((char *)buf_state->page_info.page_address +
	 buf_state->page_info.page_offset)

However, after correcting this mistake, there is no evidence of
performance improvement.

Fixes: 9b8dd5e5ea ("gve: DQO: Add RX path")
Signed-off-by: Bailey Forrest <bcf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-02 12:06:17 -07:00
Christophe JAILLET bde3c8ffdd gve: Simplify code and axe the use of a deprecated API
The wrappers in include/linux/pci-dma-compat.h should go away.

Replace 'pci_set_dma_mask/pci_set_consistent_dma_mask' by an equivalent
and less verbose 'dma_set_mask_and_coherent()' call.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-02 11:56:15 -07:00
Christophe JAILLET 6dce38b4b7 gve: Propagate error codes to caller
If 'gve_probe()' fails, we should propagate the error code, instead of
hard coding a -ENXIO value.
Make sure that all error handling paths set a correct value for 'err'.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01 15:45:01 -07:00
Christophe JAILLET 2342ae10d1 gve: Fix an error handling path in 'gve_probe()'
If the 'register_netdev() call fails, we must release the resources
allocated by the previous 'gve_init_priv()' call, as already done in the
remove function.

Add a new label and the missing 'gve_teardown_priv_resources()' in the
error handling path.

Fixes: 893ce44df5 ("gve: Add basic driver framework for Compute Engine Virtual NIC")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01 15:45:01 -07:00
Jakub Kicinski b6df00789e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Trivial conflict in net/netfilter/nf_tables_api.c.

Duplicate fix in tools/testing/selftests/net/devlink_port_split.py
- take the net-next version.

skmsg, and L4 bpf - keep the bpf code but remove the flags
and err params.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-29 15:45:27 -07:00
Dan Carpenter ecd89c02da gve: DQO: Fix off by one in gve_rx_dqo()
The rx->dqo.buf_states[] array is allocated in gve_rx_alloc_ring_dqo()
and it has rx->dqo.num_buf_states so this > needs to >= to prevent an
out of bounds access.

Fixes: 9b8dd5e5ea ("gve: DQO: Add RX path")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29 11:49:44 -07:00
Bailey Forrest 1db1a862a0 gve: Fix swapped vars when fetching max queues
Fixes: 893ce44df5 ("gve: Add basic driver framework for Compute Engine Virtual NIC")
Signed-off-by: Bailey Forrest <bcf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-25 11:22:04 -07:00
Bailey Forrest e8192476de gve: Fix warnings reported for DQO patchset
https://patchwork.kernel.org/project/netdevbpf/list/?series=506637&state=*

- Remove unused variable
- Use correct integer type for string formatting.
- Remove `inline` in C files

Fixes: 9c1a59a2f4 ("gve: DQO: Add ring allocation and initialization")
Fixes: a57e5de476 ("gve: DQO: Add TX path")
Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 15:38:29 -07:00
Bailey Forrest 9b8dd5e5ea gve: DQO: Add RX path
The RX queue has an array of `gve_rx_buf_state_dqo` objects. All
allocated pages have an associated buf_state object. When a buffer is
posted on the RX buffer queue, the buffer ID will be the buf_state's
index into the RX queue's array.

On packet reception, the RX queue will have one descriptor for each
buffer associated with a received packet. Each RX descriptor will have
a buffer_id that was posted on the buffer queue.

Notable mentions:

- We use a default buffer size of 2048 bytes. Based on page size, we
  may post separate sections of a single page as separate buffers.

- The driver holds an extra reference on pages passed up the receive
  path with an skb and keeps these pages on a list. When posting new
  buffers to the NIC, we check if any of these pages has only our
  reference, or another buffer sized segment of the page has no
  references. If so, it is free to reuse. This page recycling approach
  is a common netdev optimization that reduces page alloc/free calls.

- Pages in the free list have a page_count bias in order to avoid an
  atomic increment of pagecount every time we attempt to reuse a page.
  # references = page_count() - bias

- In order to track when a page is safe to reuse, we keep track of the
  last offset which had a single SKB reference. When this occurs, it
  implies that every single other offset is reusable. Otherwise, we
  don't know if offsets can be safely reused.

- We maintain two free lists of pages. List #1 (recycled_buf_states)
  contains pages we know can be reused right away. List #2
  (used_buf_states) contains pages which cannot be used right away. We
  only attempt to get pages from list #2 when list #1 is empty. We only
  attempt to use a small fixed number pages from list #2 before giving
  up and allocating a new page. Both lists are FIFOs in hope that by the
  time we attempt to reuse a page, the references were dropped.

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:38 -07:00
Bailey Forrest a57e5de476 gve: DQO: Add TX path
TX SKBs will have their buffers DMA mapped with the device. Each buffer
will have at least one TX descriptor associated. Each SKB will also have
a metadata descriptor.

Each TX queue maintains an array of `gve_tx_pending_packet_dqo` objects.
Every TX SKB will have an associated pending_packet object. A TX SKB's
descriptors will use its pending_packet's index as the completion tag,
which will be returned on the TX completion queue.

The device implements a "flow-miss model". Most packets will simply
receive a packet completion. The flow-miss system may choose to process
a packet based on its contents. A TX packet which experiences a flow
miss would receive a miss completion followed by a later reinjection
completion. The miss-completion is received when the packet starts to be
processed by the flow-miss system and the reinjection completion is
received when the flow-miss system completes processing the packet and
sends it on the wire.

Notable mentions:

- Buffers may be freed after receiving the miss-completion, but in order
  to avoid packet reordering, we do not complete the SKB until receiving
  the reinjection completion.

- The driver must robustly handle the unlikely scenario where a miss
  completion does not have an associated reinjection completion. This is
  accomplished by maintaining a list of packets which have a pending
  reinjection completion. After a short timeout (5 seconds), the
  SKB and buffers are released and the pending_packet is moved to a
  second list which has a longer timeout (60 seconds), where the
  pending_packet will not be reused. When the longer timeout elapses,
  the driver may assume the reinjection completion would never be
  received and the pending_packet may be reused.

- Completion handling is triggered by an interrupt and is done in the
  NAPI poll function. Because the TX path and completion exist in
  different threading contexts they maintain their own lists for free
  pending_packet objects. The TX path uses a lock-free approach to steal
  the list from the completion path.

- Both the TSO context and general context descriptors have metadata
  bytes. The device requires that if multiple descriptors contain the
  same field, each descriptor must have the same value set for that
  field.

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:38 -07:00
Bailey Forrest 0dcc144a79 gve: DQO: Configure interrupts on device up
When interrupts are first enabled, we also set the ratelimits, which
will be static for the entire usage of the device.

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:38 -07:00
Bailey Forrest 9c1a59a2f4 gve: DQO: Add ring allocation and initialization
Allocate the buffer and completion ring structures. Do not populate the
rings yet. That will happen in the respective rx and tx datapath
follow-on patches

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:38 -07:00
Bailey Forrest 5e8c5adf95 gve: DQO: Add core netdev features
Add napi netdev device registration, interrupt handling and initial tx
and rx polling stubs. The stubs will be filled in follow-on patches.

Also:
- LRO feature advertisement and handling
- Also update ethtool logic

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:38 -07:00
Bailey Forrest 1f6228e459 gve: Update adminq commands to support DQO queues
DQO queue creation requires additional parameters:
- TX completion/RX buffer queue size
- TX completion/RX buffer queue address
- TX/RX queue size
- RX buffer size

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:38 -07:00
Bailey Forrest a4aa1f1e69 gve: Add DQO fields for core data structures
- Add new DQO datapath structures:
  - `gve_rx_buf_queue_dqo`
  - `gve_rx_compl_queue_dqo`
  - `gve_rx_buf_state_dqo`
  - `gve_tx_desc_dqo`
  - `gve_tx_pending_packet_dqo`

- Incorporate these into the existing ring data structures:
  - `gve_rx_ring`
  - `gve_tx_ring`

Noteworthy mentions:

- `gve_rx_buf_state` represents an RX buffer which was posted to HW.
  Each RX queue has an array of these objects and the index into the
  array is used as the buffer_id when posted to HW.

- `gve_tx_pending_packet_dqo` is treated similarly for TX queues. The
  completion_tag is the index into the array.

- These two structures have links for linked lists which are represented
  by 16b indexes into a contiguous array of these structures.
  This reduces memory footprint compared to 64b pointers.

- We use unions for the writeable datapath structures to reduce cache
  footprint. GQI specific members will renamed like DQO members in a
  future patch.

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:38 -07:00
Bailey Forrest 223198183f gve: Add dqo descriptors
General description of rings and descriptors:

TX ring is used for sending TX packet buffers to the NIC. It has the
following descriptors:
- `gve_tx_pkt_desc_dqo` - Data buffer descriptor
- `gve_tx_tso_context_desc_dqo` - TSO context descriptor
- `gve_tx_general_context_desc_dqo` - Generic metadata descriptor

Metadata is a collection of 12 bytes. We define `gve_tx_metadata_dqo`
which represents the logical interpetation of the metadata bytes. It's
helpful to define this structure because the metadata bytes exist in
multiple descriptor types (including `gve_tx_tso_context_desc_dqo`),
and the device requires same field has the same value in all
descriptors.

The TX completion ring is used to receive completions from the NIC.
Having a separate ring allows for completions to be out of order. The
completion descriptor `gve_tx_compl_desc` has several different types,
most important are packet and descriptor completions. Descriptor
completions are used to notify the driver when descriptors sent on the
TX ring are done being consumed. The descriptor completion is only used
to signal that space is cleared in the TX ring. A packet completion will
be received when a packet transmitted on the TX queue is done being
transmitted.

In addition there are "miss" and "reinjection" completions. The device
implements a "flow-miss model". Most packets will simply receive a
packet completion. The flow-miss system may choose to process a packet
based on its contents. A TX packet which experiences a flow miss would
receive a miss completion followed by a later reinjection completion.
The miss-completion is received when the packet starts to be processed
by the flow-miss system and the reinjection completion is received when
the flow-miss system completes processing the packet and sends it on the
wire.

The RX buffer ring is used to send buffers to HW via the
`gve_rx_desc_dqo` descriptor.

Received packets are put into the RX queue by the device, which
populates the `gve_rx_compl_desc_dqo` descriptor. The RX descriptors
refer to buffers posted by the buffer queue. Received buffers may be
returned out of order, such as when HW LRO is enabled.

Important concepts:
- "TX" and "RX buffer" queues, which send descriptors to the device, use
  MMIO doorbells to notify the device of new descriptors.

- "RX" and "TX completion" queues, which receive descriptors from the
  device, use a "generation bit" to know when a descriptor was populated
  by the device. The driver initializes all bits with the "current
  generation". The device will populate received descriptors with the
  "next generation" which is inverted from the current generation. When
  the ring wraps, the current/next generation are swapped.

- It's the driver's responsibility to ensure that the RX and TX
  completion queues are not overrun. This can be accomplished by
  limiting the number of descriptors posted to HW.

- TX packets have a 16 bit completion_tag and RX buffers have a 16 bit
  buffer_id. These will be returned on the TX completion and RX queues
  respectively to let the driver know which packet/buffer was completed.

Bitfields are used to describe descriptor fields. This notation is more
concise and readable than shift-and-mask. It is possible because the
driver is restricted to little endian platforms.

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:37 -07:00
Bailey Forrest c4b87ac876 gve: Add support for DQO RX PTYPE map
Unlike GQI, DQO RX descriptors do not contain the L3 and L4 type of the
packet. L3 and L4 types are necessary in order to set the hash and csum
on RX SKBs correctly.

DQO RX descriptors instead contain a 10 bit PTYPE index. The PTYPE map
enables the device to tell the driver how to map from PTYPE index to
L3/L4 type.

The device doesn't provide any guarantees about the range of possible
PTYPEs, so we just use a 1024 entry array to implement a fast mapping
structure.

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:37 -07:00
Bailey Forrest 5ca2265eef gve: adminq: DQO specific device descriptor logic
- In addition to TX and RX queues, DQO has TX completion and RX buffer
  queues.
  - TX completions are received when the device has completed sending a
    packet on the wire.
  - RX buffers are posted on a separate queue form the RX completions.
- DQO descriptor rings are allowed to be smaller than PAGE_SIZE.

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:37 -07:00
Bailey Forrest a5886ef4f4 gve: Introduce per netdev `enum gve_queue_format`
The currently supported queue formats are:
- GQI_RDA - GQI with raw DMA addressing
- GQI_QPL - GQI with queue page list
- DQO_RDA - DQO with raw DMA addressing

The old `gve_priv.raw_addressing` value is only used for GQI_RDA, so we
remove it in favor of just checking against GQI_RDA

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:37 -07:00
Bailey Forrest 8a39d3e0da gve: Introduce a new model for device options
The current model uses an integer ID and a fixed size struct for the
parameters of each device option.

The new model allows the device option structs to grow in size over
time. A driver may assume that changes to device option structs will
always be appended.

New device options will also generally have a
`supported_features_mask` so that the driver knows which fields within a
particular device option are enabled.

`gve_device_option.feat_mask` is changed to `required_features_mask`,
and it is a bitmask which must match the value expected by the driver.
This gives the device the ability to break backwards compatibility with
old drivers for certain features by blocking the old drivers from trying
to use the feature.

We maintain ABI compatibility with the old model for
GVE_DEV_OPT_ID_RAW_ADDRESSING in case a driver is using a device which
does not support the new model.

This patch introduces some new terminology:

RDA - Raw DMA Addressing - Buffers associated with SKBs are directly DMA
      mapped and read/updated by the device.

QPL - Queue Page Lists - Driver uses bounce buffers which are DMA mapped
      with the device for read/write and data is copied from/to SKBs.

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:37 -07:00
Bailey Forrest 920fb45193 gve: Make gve_rx_slot_page_info.page_offset an absolute offset
Using `page_offset` like a boolean means a page may only be split into
two sections. With page sizes larger than 4k, this can be very wasteful.
Future commits in this patchset use `struct gve_rx_slot_page_info` in a
way which supports a fixed buffer size and a variable page size.

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:37 -07:00
Bailey Forrest 35f9b2f43f gve: gve_rx_copy: Move padding to an argument
Future use cases will have a different padding value.

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:37 -07:00
Bailey Forrest dbdaa67540 gve: Move some static functions to a common file
These functions will be shared by the GQI and DQO variants of the GVNIC
driver as of follow-up patches in this series.

Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 12:47:37 -07:00
David Awogbemila fbd4a28b4f gve: Correct SKB queue index validation.
SKBs with skb_get_queue_mapping(skb) == tx_cfg.num_queues should also be
considered invalid.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: David Awogbemila <awogbemila@google.com>
Acked-by: Willem de Brujin <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-17 15:38:40 -07:00
Catherine Sullivan f81781835f gve: Upgrade memory barrier in poll routine
As currently written, if the driver checks for more work (via
gve_tx_poll or gve_rx_poll) before the device posts work and the
irq doorbell is not unmasked
(via iowrite32be(GVE_IRQ_ACK | GVE_IRQ_EVENT, ...)) before the device
attempts to raise an interrupt, an interrupt is lost and this could
potentially lead to the traffic being completely halted. For
example, if a tx queue has already been stopped, the driver won't get
the chance to complete work and egress will be halted.

We need a full memory barrier in the poll
routine to ensure that the irq doorbell is unmasked before the driver
checks for more work.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Acked-by: Willem de Brujin <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-17 15:38:40 -07:00
David Awogbemila 5218e919c8 gve: Add NULL pointer checks when freeing irqs.
When freeing notification blocks, we index priv->msix_vectors.
If we failed to allocate priv->msix_vectors (see abort_with_msix_vectors)
this could lead to a NULL pointer dereference if the driver is unloaded.

Fixes: 893ce44df5 ("gve: Add basic driver framework for Compute Engine Virtual NIC")
Signed-off-by: David Awogbemila <awogbemila@google.com>
Acked-by: Willem de Brujin <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-17 15:38:40 -07:00
David Awogbemila e96b491a0f gve: Update mgmt_msix_idx if num_ntfy changes
If we do not get the expected number of vectors from
pci_enable_msix_range, we update priv->num_ntfy_blks but not
priv->mgmt_msix_idx. This patch fixes this so that priv->mgmt_msix_idx
is updated accordingly.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: David Awogbemila <awogbemila@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-17 15:38:40 -07:00
Catherine Sullivan 5aec55b46c gve: Check TX QPL was actually assigned
Correctly check the TX QPL was assigned and unassigned if
other steps in the allocation fail.

Fixes: f5cedc84a3 (gve: Add transmit and receive support)
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-17 15:38:40 -07:00
Daode Huang f67435b555 net: gve: remove duplicated allowed
fix the WARNING of Possible repeated word: 'allowed'

Signed-off-by: Daode Huang <huangdaode@huawei.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25 17:07:58 -07:00
Daode Huang c32773c961 net: gve: convert strlcpy to strscpy
Usage of strlcpy in linux kernel has been recently deprecated[1], so
convert gve driver to strscpy

[1] https://lore.kernel.org/lkml/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL
=V6A6G1oUZcprmknw@mail.gmail.com/

Signed-off-by: Daode Huang <huangdaode@huawei.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25 17:07:58 -07:00
Catherine Sullivan 6f007c6486 gve: Add support for raw addressing in the tx path
During TX, skbs' data addresses are dma_map'ed and passed to the NIC.
This means that the device can perform DMA directly from these addresses
and the driver does not have to copy the buffer content into
pre-allocated buffers/qpls (as in qpl mode).

Reviewed-by: Yangchun Fu <yangchun@google.com>
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-08 16:06:28 -08:00
David Awogbemila 02b0e0c18b gve: Rx Buffer Recycling
This patch lets the driver reuse buffers that have been freed by the
networking stack.

In the raw addressing case, this allows the driver avoid allocating new
buffers.
In the qpl case, the driver can avoid copies.

This patch separates the page refcount tracking mechanism
into a function gve_rx_can_recycle_buffer which uses get_page - this will
be changed in a future patch to entirely eliminate the use of get_page in
tracking page refcounts.

Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-08 16:06:28 -08:00
Catherine Sullivan ede3fcf5ec gve: Add support for raw addressing to the rx path
Add support to use raw dma addresses in the rx path. Due to this new
support we can alloc a new buffer instead of making a copy.

RX buffers are handed to the networking stack and are
re-allocated as needed, avoiding the need to use
skb_copy_to_linear_data() as in "qpl" mode.

Reviewed-by: Yangchun Fu <yangchun@google.com>
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-08 16:06:28 -08:00
Catherine Sullivan 4944db80ac gve: Add support for raw addressing device option
Add support to describe device for parsing device options. As
the first device option, add raw addressing.

"Raw Addressing" mode (as opposed to the current "qpl" mode) is an
operational mode which allows the driver avoid bounce buffer copies
which it currently performs using pre-allocated qpls (queue_page_lists)
when sending and receiving packets.
For egress packets, the provided skb data addresses will be dma_map'ed and
passed to the device, allowing the NIC can perform DMA directly - the
driver will not have to copy the buffer content into pre-allocated
buffers/qpls (as in qpl mode).
For ingress packets, copies are also eliminated as buffers are handed to
the networking stack and then recycled or re-allocated as
necessary, avoiding the use of skb_copy_to_linear_data().

This patch only introduces the option to the driver.
Subsequent patches will add the ingress and egress functionality.

Reviewed-by: Yangchun Fu <yangchun@google.com>
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-08 16:06:28 -08:00
Jakub Kicinski cc69837fca net: don't include ethtool.h from netdevice.h
linux/netdevice.h is included in very many places, touching any
of its dependecies causes large incremental builds.

Drop the linux/ethtool.h include, linux/netdevice.h just needs
a forward declaration of struct ethtool_ops.

Fix all the places which made use of this implicit include.

Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20201120225052.1427503-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 17:27:04 -08:00
Gustavo A. R. Silva 691f4077d5 gve: Replace zero-length array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure.  Kernel code
should always use “flexible array members”[1] for these cases.  The
older style of one-element or zero-length arrays should no longer be
used[2].

Refactor the code according to the use of a flexible-array member in
struct gve_stats_report, instead of a zero-length array, and use the
struct_size() helper to calculate the size for the resource allocation.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-10-30 16:57:41 -05:00
David Awogbemila 7e074d5a76 gve: Enable Link Speed Reporting in the driver.
This change allows the driver to report the device link speed
when the ethtool command:
	ethtool <nic name>
is run.
Getting the link speed is done via a new admin queue command:
ReportLinkSpeed.

Reviewed-by: Yangchun Fu <yangchun@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11 14:31:54 -07:00
Patricio Noyola 3b7cc73628 gve: Use link status register to report link status
This makes the driver better aware of the connectivity status of the
device. Based on the device's status register, the driver can call
netif_carrier_{on,off}.

Reviewed-by: Yangchun Fu <yangchun@google.com>
Signed-off-by: Patricio Noyola <patricion@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11 14:31:54 -07:00
Sagi Shahar 5cdad90de6 gve: Batch AQ commands for creating and destroying queues.
Adds support for batching AQ commands and uses it for creating and
destroying queues.

Reviewed-by: Yangchun Fu <yangchun@google.com>
Signed-off-by: Sagi Shahar <sagis@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11 14:31:54 -07:00
David Awogbemila 2f523dc34a gve: NIC stats for report-stats and for ethtool
This adds per queue NIC stats to ethtool stats and to report-stats.
These stats are always exposed to guest whether or not the
report-stats flag is turned on.

Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11 14:31:54 -07:00
Kuo Zhao 24aeb56f2d gve: Add Gvnic stats AQ command and ethtool show/set-priv-flags.
This adds functionality to report driver stats to Hypervisor.
(Users may want to turn this feature off as a matter of privacy
so a "report-stats" flag is added as an ethtool priv option.
It is also disabled by default.)
The hypervisor would trigger a stats report in case "too many"
packets dropped; the stats would be useful in debugging stuck
queues.
A "stats_report_trigger_cnt" stat is added to count the number of times
the hypervisor attempts to trigger stats report.

A timer is also added so that when report-stats is enabled, stat are
updated once every 20 seconds.

Reviewed-by: Yangchun Fu <yangchun@google.com>
Signed-off-by: Kuo Zhao <kuozhao@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11 14:31:54 -07:00
Catherine Sullivan 0d5775d34d gve: Use dev_info/err instead of netif_info/err.
Update the driver to use dev_info/err instead of netif_info/err.

Reviewed-by: Yangchun Fu <yangchun@google.com>
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11 14:31:54 -07:00
Kuo Zhao 433e274b8f gve: Add stats for gve.
Sample output of "ethtool -S <interface-name>" with 1 RX queue and 1 TX
queue:
NIC statistics:
     rx_packets: 1039
     tx_packets: 37
     rx_bytes: 822071
     tx_bytes: 4100
     rx_dropped: 0
     tx_dropped: 0
     tx_timeouts: 0
     rx_skb_alloc_fail: 0
     rx_buf_alloc_fail: 0
     rx_desc_err_dropped_pkt: 0
     interface_up_cnt: 1
     interface_down_cnt: 0
     reset_cnt: 0
     page_alloc_fail: 0
     dma_mapping_error: 0
     rx_posted_desc[0]: 1365
     rx_completed_desc[0]: 341
     rx_bytes[0]: 215094
     rx_dropped_pkt[0]: 0
     rx_copybreak_pkt[0]: 3
     rx_copied_pkt[0]: 3
     tx_posted_desc[0]: 6
     tx_completed_desc[0]: 6
     tx_bytes[0]: 420
     tx_wake[0]: 0
     tx_stop[0]: 0
     tx_event_counter[0]: 6
     adminq_prod_cnt: 34
     adminq_cmd_fail: 0
     adminq_timeouts: 0
     adminq_describe_device_cnt: 1
     adminq_cfg_device_resources_cnt: 1
     adminq_register_page_list_cnt: 16
     adminq_unregister_page_list_cnt: 0
     adminq_create_tx_queue_cnt: 8
     adminq_create_rx_queue_cnt: 8
     adminq_destroy_tx_queue_cnt: 0
     adminq_destroy_rx_queue_cnt: 0
     adminq_dcfg_device_resources_cnt: 0
     adminq_set_driver_parameter_cnt: 0

Reviewed-by: Yangchun Fu <yangchun@google.com>
Signed-off-by: Kuo Zhao <kuozhao@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11 14:31:53 -07:00
Kuo Zhao d5f7543c86 gve: Get and set Rx copybreak via ethtool
This adds support for getting and setting the RX copybreak
value via ethtool.

Reviewed-by: Yangchun Fu <yangchun@google.com>
Signed-off-by: Kuo Zhao <kuozhao@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11 14:31:53 -07:00
David S. Miller a2d6d7ae59 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The ungrafting from PRIO bug fixes in net, when merged into net-next,
merge cleanly but create a build failure.  The resolution used here is
from Petr Machata.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09 12:13:43 -08:00
Liran Alon b54ef37b1c net: Google gve: Remove dma_wmb() before ringing doorbell
Current code use dma_wmb() to ensure Rx/Tx descriptors are visible
to device before writing to doorbell.

However, these dma_wmb() are wrong and unnecessary. Therefore,
they should be removed.

iowrite32be() called from gve_rx_write_doorbell()/gve_tx_put_doorbell()
should guaratee that all previous writes to WB/UC memory is visible to
device before the write done by iowrite32be().

E.g. On ARM64, iowrite32be() calls __iowmb() which expands to dma_wmb()
and only then calls __raw_writel().

Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-03 12:40:53 -08:00
Michael S. Tsirkin 0290bd291c netdev: pass the stuck queue to the timeout handler
This allows incrementing the correct timeout statistic without any mess.
Down the road, devices can learn to reset just the specific queue.

The patch was generated with the following script:

use strict;
use warnings;

our $^I = '.bak';

my @work = (
["arch/m68k/emu/nfeth.c", "nfeth_tx_timeout"],
["arch/um/drivers/net_kern.c", "uml_net_tx_timeout"],
["arch/um/drivers/vector_kern.c", "vector_net_tx_timeout"],
["arch/xtensa/platforms/iss/network.c", "iss_net_tx_timeout"],
["drivers/char/pcmcia/synclink_cs.c", "hdlcdev_tx_timeout"],
["drivers/infiniband/ulp/ipoib/ipoib_main.c", "ipoib_timeout"],
["drivers/infiniband/ulp/ipoib/ipoib_main.c", "ipoib_timeout"],
["drivers/message/fusion/mptlan.c", "mpt_lan_tx_timeout"],
["drivers/misc/sgi-xp/xpnet.c", "xpnet_dev_tx_timeout"],
["drivers/net/appletalk/cops.c", "cops_timeout"],
["drivers/net/arcnet/arcdevice.h", "arcnet_timeout"],
["drivers/net/arcnet/arcnet.c", "arcnet_timeout"],
["drivers/net/arcnet/com20020.c", "arcnet_timeout"],
["drivers/net/ethernet/3com/3c509.c", "el3_tx_timeout"],
["drivers/net/ethernet/3com/3c515.c", "corkscrew_timeout"],
["drivers/net/ethernet/3com/3c574_cs.c", "el3_tx_timeout"],
["drivers/net/ethernet/3com/3c589_cs.c", "el3_tx_timeout"],
["drivers/net/ethernet/3com/3c59x.c", "vortex_tx_timeout"],
["drivers/net/ethernet/3com/3c59x.c", "vortex_tx_timeout"],
["drivers/net/ethernet/3com/typhoon.c", "typhoon_tx_timeout"],
["drivers/net/ethernet/8390/8390.h", "ei_tx_timeout"],
["drivers/net/ethernet/8390/8390.h", "eip_tx_timeout"],
["drivers/net/ethernet/8390/8390.c", "ei_tx_timeout"],
["drivers/net/ethernet/8390/8390p.c", "eip_tx_timeout"],
["drivers/net/ethernet/8390/ax88796.c", "ax_ei_tx_timeout"],
["drivers/net/ethernet/8390/axnet_cs.c", "axnet_tx_timeout"],
["drivers/net/ethernet/8390/etherh.c", "__ei_tx_timeout"],
["drivers/net/ethernet/8390/hydra.c", "__ei_tx_timeout"],
["drivers/net/ethernet/8390/mac8390.c", "__ei_tx_timeout"],
["drivers/net/ethernet/8390/mcf8390.c", "__ei_tx_timeout"],
["drivers/net/ethernet/8390/lib8390.c", "__ei_tx_timeout"],
["drivers/net/ethernet/8390/ne2k-pci.c", "ei_tx_timeout"],
["drivers/net/ethernet/8390/pcnet_cs.c", "ei_tx_timeout"],
["drivers/net/ethernet/8390/smc-ultra.c", "ei_tx_timeout"],
["drivers/net/ethernet/8390/wd.c", "ei_tx_timeout"],
["drivers/net/ethernet/8390/zorro8390.c", "__ei_tx_timeout"],
["drivers/net/ethernet/adaptec/starfire.c", "tx_timeout"],
["drivers/net/ethernet/agere/et131x.c", "et131x_tx_timeout"],
["drivers/net/ethernet/allwinner/sun4i-emac.c", "emac_timeout"],
["drivers/net/ethernet/alteon/acenic.c", "ace_watchdog"],
["drivers/net/ethernet/amazon/ena/ena_netdev.c", "ena_tx_timeout"],
["drivers/net/ethernet/amd/7990.h", "lance_tx_timeout"],
["drivers/net/ethernet/amd/7990.c", "lance_tx_timeout"],
["drivers/net/ethernet/amd/a2065.c", "lance_tx_timeout"],
["drivers/net/ethernet/amd/am79c961a.c", "am79c961_timeout"],
["drivers/net/ethernet/amd/amd8111e.c", "amd8111e_tx_timeout"],
["drivers/net/ethernet/amd/ariadne.c", "ariadne_tx_timeout"],
["drivers/net/ethernet/amd/atarilance.c", "lance_tx_timeout"],
["drivers/net/ethernet/amd/au1000_eth.c", "au1000_tx_timeout"],
["drivers/net/ethernet/amd/declance.c", "lance_tx_timeout"],
["drivers/net/ethernet/amd/lance.c", "lance_tx_timeout"],
["drivers/net/ethernet/amd/mvme147.c", "lance_tx_timeout"],
["drivers/net/ethernet/amd/ni65.c", "ni65_timeout"],
["drivers/net/ethernet/amd/nmclan_cs.c", "mace_tx_timeout"],
["drivers/net/ethernet/amd/pcnet32.c", "pcnet32_tx_timeout"],
["drivers/net/ethernet/amd/sunlance.c", "lance_tx_timeout"],
["drivers/net/ethernet/amd/xgbe/xgbe-drv.c", "xgbe_tx_timeout"],
["drivers/net/ethernet/apm/xgene-v2/main.c", "xge_timeout"],
["drivers/net/ethernet/apm/xgene/xgene_enet_main.c", "xgene_enet_timeout"],
["drivers/net/ethernet/apple/macmace.c", "mace_tx_timeout"],
["drivers/net/ethernet/atheros/ag71xx.c", "ag71xx_tx_timeout"],
["drivers/net/ethernet/atheros/alx/main.c", "alx_tx_timeout"],
["drivers/net/ethernet/atheros/atl1c/atl1c_main.c", "atl1c_tx_timeout"],
["drivers/net/ethernet/atheros/atl1e/atl1e_main.c", "atl1e_tx_timeout"],
["drivers/net/ethernet/atheros/atlx/atl.c", "atlx_tx_timeout"],
["drivers/net/ethernet/atheros/atlx/atl1.c", "atlx_tx_timeout"],
["drivers/net/ethernet/atheros/atlx/atl2.c", "atl2_tx_timeout"],
["drivers/net/ethernet/broadcom/b44.c", "b44_tx_timeout"],
["drivers/net/ethernet/broadcom/bcmsysport.c", "bcm_sysport_tx_timeout"],
["drivers/net/ethernet/broadcom/bnx2.c", "bnx2_tx_timeout"],
["drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h", "bnx2x_tx_timeout"],
["drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c", "bnx2x_tx_timeout"],
["drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c", "bnx2x_tx_timeout"],
["drivers/net/ethernet/broadcom/bnxt/bnxt.c", "bnxt_tx_timeout"],
["drivers/net/ethernet/broadcom/genet/bcmgenet.c", "bcmgenet_timeout"],
["drivers/net/ethernet/broadcom/sb1250-mac.c", "sbmac_tx_timeout"],
["drivers/net/ethernet/broadcom/tg3.c", "tg3_tx_timeout"],
["drivers/net/ethernet/calxeda/xgmac.c", "xgmac_tx_timeout"],
["drivers/net/ethernet/cavium/liquidio/lio_main.c", "liquidio_tx_timeout"],
["drivers/net/ethernet/cavium/liquidio/lio_vf_main.c", "liquidio_tx_timeout"],
["drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c", "lio_vf_rep_tx_timeout"],
["drivers/net/ethernet/cavium/thunder/nicvf_main.c", "nicvf_tx_timeout"],
["drivers/net/ethernet/cirrus/cs89x0.c", "net_timeout"],
["drivers/net/ethernet/cisco/enic/enic_main.c", "enic_tx_timeout"],
["drivers/net/ethernet/cisco/enic/enic_main.c", "enic_tx_timeout"],
["drivers/net/ethernet/cortina/gemini.c", "gmac_tx_timeout"],
["drivers/net/ethernet/davicom/dm9000.c", "dm9000_timeout"],
["drivers/net/ethernet/dec/tulip/de2104x.c", "de_tx_timeout"],
["drivers/net/ethernet/dec/tulip/tulip_core.c", "tulip_tx_timeout"],
["drivers/net/ethernet/dec/tulip/winbond-840.c", "tx_timeout"],
["drivers/net/ethernet/dlink/dl2k.c", "rio_tx_timeout"],
["drivers/net/ethernet/dlink/sundance.c", "tx_timeout"],
["drivers/net/ethernet/emulex/benet/be_main.c", "be_tx_timeout"],
["drivers/net/ethernet/ethoc.c", "ethoc_tx_timeout"],
["drivers/net/ethernet/faraday/ftgmac100.c", "ftgmac100_tx_timeout"],
["drivers/net/ethernet/fealnx.c", "fealnx_tx_timeout"],
["drivers/net/ethernet/freescale/dpaa/dpaa_eth.c", "dpaa_tx_timeout"],
["drivers/net/ethernet/freescale/fec_main.c", "fec_timeout"],
["drivers/net/ethernet/freescale/fec_mpc52xx.c", "mpc52xx_fec_tx_timeout"],
["drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c", "fs_timeout"],
["drivers/net/ethernet/freescale/gianfar.c", "gfar_timeout"],
["drivers/net/ethernet/freescale/ucc_geth.c", "ucc_geth_timeout"],
["drivers/net/ethernet/fujitsu/fmvj18x_cs.c", "fjn_tx_timeout"],
["drivers/net/ethernet/google/gve/gve_main.c", "gve_tx_timeout"],
["drivers/net/ethernet/hisilicon/hip04_eth.c", "hip04_timeout"],
["drivers/net/ethernet/hisilicon/hix5hd2_gmac.c", "hix5hd2_net_timeout"],
["drivers/net/ethernet/hisilicon/hns/hns_enet.c", "hns_nic_net_timeout"],
["drivers/net/ethernet/hisilicon/hns3/hns3_enet.c", "hns3_nic_net_timeout"],
["drivers/net/ethernet/huawei/hinic/hinic_main.c", "hinic_tx_timeout"],
["drivers/net/ethernet/i825xx/82596.c", "i596_tx_timeout"],
["drivers/net/ethernet/i825xx/ether1.c", "ether1_timeout"],
["drivers/net/ethernet/i825xx/lib82596.c", "i596_tx_timeout"],
["drivers/net/ethernet/i825xx/sun3_82586.c", "sun3_82586_timeout"],
["drivers/net/ethernet/ibm/ehea/ehea_main.c", "ehea_tx_watchdog"],
["drivers/net/ethernet/ibm/emac/core.c", "emac_tx_timeout"],
["drivers/net/ethernet/ibm/emac/core.c", "emac_tx_timeout"],
["drivers/net/ethernet/ibm/ibmvnic.c", "ibmvnic_tx_timeout"],
["drivers/net/ethernet/intel/e100.c", "e100_tx_timeout"],
["drivers/net/ethernet/intel/e1000/e1000_main.c", "e1000_tx_timeout"],
["drivers/net/ethernet/intel/e1000e/netdev.c", "e1000_tx_timeout"],
["drivers/net/ethernet/intel/fm10k/fm10k_netdev.c", "fm10k_tx_timeout"],
["drivers/net/ethernet/intel/i40e/i40e_main.c", "i40e_tx_timeout"],
["drivers/net/ethernet/intel/iavf/iavf_main.c", "iavf_tx_timeout"],
["drivers/net/ethernet/intel/ice/ice_main.c", "ice_tx_timeout"],
["drivers/net/ethernet/intel/ice/ice_main.c", "ice_tx_timeout"],
["drivers/net/ethernet/intel/igb/igb_main.c", "igb_tx_timeout"],
["drivers/net/ethernet/intel/igbvf/netdev.c", "igbvf_tx_timeout"],
["drivers/net/ethernet/intel/ixgb/ixgb_main.c", "ixgb_tx_timeout"],
["drivers/net/ethernet/intel/ixgbe/ixgbe_debugfs.c", "adapter->netdev->netdev_ops->ndo_tx_timeout(adapter->netdev);"],
["drivers/net/ethernet/intel/ixgbe/ixgbe_main.c", "ixgbe_tx_timeout"],
["drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c", "ixgbevf_tx_timeout"],
["drivers/net/ethernet/jme.c", "jme_tx_timeout"],
["drivers/net/ethernet/korina.c", "korina_tx_timeout"],
["drivers/net/ethernet/lantiq_etop.c", "ltq_etop_tx_timeout"],
["drivers/net/ethernet/marvell/mv643xx_eth.c", "mv643xx_eth_tx_timeout"],
["drivers/net/ethernet/marvell/pxa168_eth.c", "pxa168_eth_tx_timeout"],
["drivers/net/ethernet/marvell/skge.c", "skge_tx_timeout"],
["drivers/net/ethernet/marvell/sky2.c", "sky2_tx_timeout"],
["drivers/net/ethernet/marvell/sky2.c", "sky2_tx_timeout"],
["drivers/net/ethernet/mediatek/mtk_eth_soc.c", "mtk_tx_timeout"],
["drivers/net/ethernet/mellanox/mlx4/en_netdev.c", "mlx4_en_tx_timeout"],
["drivers/net/ethernet/mellanox/mlx4/en_netdev.c", "mlx4_en_tx_timeout"],
["drivers/net/ethernet/mellanox/mlx5/core/en_main.c", "mlx5e_tx_timeout"],
["drivers/net/ethernet/micrel/ks8842.c", "ks8842_tx_timeout"],
["drivers/net/ethernet/micrel/ksz884x.c", "netdev_tx_timeout"],
["drivers/net/ethernet/microchip/enc28j60.c", "enc28j60_tx_timeout"],
["drivers/net/ethernet/microchip/encx24j600.c", "encx24j600_tx_timeout"],
["drivers/net/ethernet/natsemi/sonic.h", "sonic_tx_timeout"],
["drivers/net/ethernet/natsemi/sonic.c", "sonic_tx_timeout"],
["drivers/net/ethernet/natsemi/jazzsonic.c", "sonic_tx_timeout"],
["drivers/net/ethernet/natsemi/macsonic.c", "sonic_tx_timeout"],
["drivers/net/ethernet/natsemi/natsemi.c", "ns_tx_timeout"],
["drivers/net/ethernet/natsemi/ns83820.c", "ns83820_tx_timeout"],
["drivers/net/ethernet/natsemi/xtsonic.c", "sonic_tx_timeout"],
["drivers/net/ethernet/neterion/s2io.h", "s2io_tx_watchdog"],
["drivers/net/ethernet/neterion/s2io.c", "s2io_tx_watchdog"],
["drivers/net/ethernet/neterion/vxge/vxge-main.c", "vxge_tx_watchdog"],
["drivers/net/ethernet/netronome/nfp/nfp_net_common.c", "nfp_net_tx_timeout"],
["drivers/net/ethernet/nvidia/forcedeth.c", "nv_tx_timeout"],
["drivers/net/ethernet/nvidia/forcedeth.c", "nv_tx_timeout"],
["drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c", "pch_gbe_tx_timeout"],
["drivers/net/ethernet/packetengines/hamachi.c", "hamachi_tx_timeout"],
["drivers/net/ethernet/packetengines/yellowfin.c", "yellowfin_tx_timeout"],
["drivers/net/ethernet/pensando/ionic/ionic_lif.c", "ionic_tx_timeout"],
["drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c", "netxen_tx_timeout"],
["drivers/net/ethernet/qlogic/qla3xxx.c", "ql3xxx_tx_timeout"],
["drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c", "qlcnic_tx_timeout"],
["drivers/net/ethernet/qualcomm/emac/emac.c", "emac_tx_timeout"],
["drivers/net/ethernet/qualcomm/qca_spi.c", "qcaspi_netdev_tx_timeout"],
["drivers/net/ethernet/qualcomm/qca_uart.c", "qcauart_netdev_tx_timeout"],
["drivers/net/ethernet/rdc/r6040.c", "r6040_tx_timeout"],
["drivers/net/ethernet/realtek/8139cp.c", "cp_tx_timeout"],
["drivers/net/ethernet/realtek/8139too.c", "rtl8139_tx_timeout"],
["drivers/net/ethernet/realtek/atp.c", "tx_timeout"],
["drivers/net/ethernet/realtek/r8169_main.c", "rtl8169_tx_timeout"],
["drivers/net/ethernet/renesas/ravb_main.c", "ravb_tx_timeout"],
["drivers/net/ethernet/renesas/sh_eth.c", "sh_eth_tx_timeout"],
["drivers/net/ethernet/renesas/sh_eth.c", "sh_eth_tx_timeout"],
["drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c", "sxgbe_tx_timeout"],
["drivers/net/ethernet/seeq/ether3.c", "ether3_timeout"],
["drivers/net/ethernet/seeq/sgiseeq.c", "timeout"],
["drivers/net/ethernet/sfc/efx.c", "efx_watchdog"],
["drivers/net/ethernet/sfc/falcon/efx.c", "ef4_watchdog"],
["drivers/net/ethernet/sgi/ioc3-eth.c", "ioc3_timeout"],
["drivers/net/ethernet/sgi/meth.c", "meth_tx_timeout"],
["drivers/net/ethernet/silan/sc92031.c", "sc92031_tx_timeout"],
["drivers/net/ethernet/sis/sis190.c", "sis190_tx_timeout"],
["drivers/net/ethernet/sis/sis900.c", "sis900_tx_timeout"],
["drivers/net/ethernet/smsc/epic100.c", "epic_tx_timeout"],
["drivers/net/ethernet/smsc/smc911x.c", "smc911x_timeout"],
["drivers/net/ethernet/smsc/smc9194.c", "smc_timeout"],
["drivers/net/ethernet/smsc/smc91c92_cs.c", "smc_tx_timeout"],
["drivers/net/ethernet/smsc/smc91x.c", "smc_timeout"],
["drivers/net/ethernet/stmicro/stmmac/stmmac_main.c", "stmmac_tx_timeout"],
["drivers/net/ethernet/sun/cassini.c", "cas_tx_timeout"],
["drivers/net/ethernet/sun/ldmvsw.c", "sunvnet_tx_timeout_common"],
["drivers/net/ethernet/sun/niu.c", "niu_tx_timeout"],
["drivers/net/ethernet/sun/sunbmac.c", "bigmac_tx_timeout"],
["drivers/net/ethernet/sun/sungem.c", "gem_tx_timeout"],
["drivers/net/ethernet/sun/sunhme.c", "happy_meal_tx_timeout"],
["drivers/net/ethernet/sun/sunqe.c", "qe_tx_timeout"],
["drivers/net/ethernet/sun/sunvnet.c", "sunvnet_tx_timeout_common"],
["drivers/net/ethernet/sun/sunvnet_common.c", "sunvnet_tx_timeout_common"],
["drivers/net/ethernet/sun/sunvnet_common.h", "sunvnet_tx_timeout_common"],
["drivers/net/ethernet/synopsys/dwc-xlgmac-net.c", "xlgmac_tx_timeout"],
["drivers/net/ethernet/ti/cpmac.c", "cpmac_tx_timeout"],
["drivers/net/ethernet/ti/cpsw.c", "cpsw_ndo_tx_timeout"],
["drivers/net/ethernet/ti/cpsw_priv.c", "cpsw_ndo_tx_timeout"],
["drivers/net/ethernet/ti/cpsw_priv.h", "cpsw_ndo_tx_timeout"],
["drivers/net/ethernet/ti/davinci_emac.c", "emac_dev_tx_timeout"],
["drivers/net/ethernet/ti/netcp_core.c", "netcp_ndo_tx_timeout"],
["drivers/net/ethernet/ti/tlan.c", "tlan_tx_timeout"],
["drivers/net/ethernet/toshiba/ps3_gelic_net.h", "gelic_net_tx_timeout"],
["drivers/net/ethernet/toshiba/ps3_gelic_net.c", "gelic_net_tx_timeout"],
["drivers/net/ethernet/toshiba/ps3_gelic_wireless.c", "gelic_net_tx_timeout"],
["drivers/net/ethernet/toshiba/spider_net.c", "spider_net_tx_timeout"],
["drivers/net/ethernet/toshiba/tc35815.c", "tc35815_tx_timeout"],
["drivers/net/ethernet/via/via-rhine.c", "rhine_tx_timeout"],
["drivers/net/ethernet/wiznet/w5100.c", "w5100_tx_timeout"],
["drivers/net/ethernet/wiznet/w5300.c", "w5300_tx_timeout"],
["drivers/net/ethernet/xilinx/xilinx_emaclite.c", "xemaclite_tx_timeout"],
["drivers/net/ethernet/xircom/xirc2ps_cs.c", "xirc_tx_timeout"],
["drivers/net/fjes/fjes_main.c", "fjes_tx_retry"],
["drivers/net/slip/slip.c", "sl_tx_timeout"],
["include/linux/usb/usbnet.h", "usbnet_tx_timeout"],
["drivers/net/usb/aqc111.c", "usbnet_tx_timeout"],
["drivers/net/usb/asix_devices.c", "usbnet_tx_timeout"],
["drivers/net/usb/asix_devices.c", "usbnet_tx_timeout"],
["drivers/net/usb/asix_devices.c", "usbnet_tx_timeout"],
["drivers/net/usb/ax88172a.c", "usbnet_tx_timeout"],
["drivers/net/usb/ax88179_178a.c", "usbnet_tx_timeout"],
["drivers/net/usb/catc.c", "catc_tx_timeout"],
["drivers/net/usb/cdc_mbim.c", "usbnet_tx_timeout"],
["drivers/net/usb/cdc_ncm.c", "usbnet_tx_timeout"],
["drivers/net/usb/dm9601.c", "usbnet_tx_timeout"],
["drivers/net/usb/hso.c", "hso_net_tx_timeout"],
["drivers/net/usb/int51x1.c", "usbnet_tx_timeout"],
["drivers/net/usb/ipheth.c", "ipheth_tx_timeout"],
["drivers/net/usb/kaweth.c", "kaweth_tx_timeout"],
["drivers/net/usb/lan78xx.c", "lan78xx_tx_timeout"],
["drivers/net/usb/mcs7830.c", "usbnet_tx_timeout"],
["drivers/net/usb/pegasus.c", "pegasus_tx_timeout"],
["drivers/net/usb/qmi_wwan.c", "usbnet_tx_timeout"],
["drivers/net/usb/r8152.c", "rtl8152_tx_timeout"],
["drivers/net/usb/rndis_host.c", "usbnet_tx_timeout"],
["drivers/net/usb/rtl8150.c", "rtl8150_tx_timeout"],
["drivers/net/usb/sierra_net.c", "usbnet_tx_timeout"],
["drivers/net/usb/smsc75xx.c", "usbnet_tx_timeout"],
["drivers/net/usb/smsc95xx.c", "usbnet_tx_timeout"],
["drivers/net/usb/sr9700.c", "usbnet_tx_timeout"],
["drivers/net/usb/sr9800.c", "usbnet_tx_timeout"],
["drivers/net/usb/usbnet.c", "usbnet_tx_timeout"],
["drivers/net/vmxnet3/vmxnet3_drv.c", "vmxnet3_tx_timeout"],
["drivers/net/wan/cosa.c", "cosa_net_timeout"],
["drivers/net/wan/farsync.c", "fst_tx_timeout"],
["drivers/net/wan/fsl_ucc_hdlc.c", "uhdlc_tx_timeout"],
["drivers/net/wan/lmc/lmc_main.c", "lmc_driver_timeout"],
["drivers/net/wan/x25_asy.c", "x25_asy_timeout"],
["drivers/net/wimax/i2400m/netdev.c", "i2400m_tx_timeout"],
["drivers/net/wireless/intel/ipw2x00/ipw2100.c", "ipw2100_tx_timeout"],
["drivers/net/wireless/intersil/hostap/hostap_main.c", "prism2_tx_timeout"],
["drivers/net/wireless/intersil/hostap/hostap_main.c", "prism2_tx_timeout"],
["drivers/net/wireless/intersil/hostap/hostap_main.c", "prism2_tx_timeout"],
["drivers/net/wireless/intersil/orinoco/main.c", "orinoco_tx_timeout"],
["drivers/net/wireless/intersil/orinoco/orinoco_usb.c", "orinoco_tx_timeout"],
["drivers/net/wireless/intersil/orinoco/orinoco.h", "orinoco_tx_timeout"],
["drivers/net/wireless/intersil/prism54/islpci_dev.c", "islpci_eth_tx_timeout"],
["drivers/net/wireless/intersil/prism54/islpci_eth.c", "islpci_eth_tx_timeout"],
["drivers/net/wireless/intersil/prism54/islpci_eth.h", "islpci_eth_tx_timeout"],
["drivers/net/wireless/marvell/mwifiex/main.c", "mwifiex_tx_timeout"],
["drivers/net/wireless/quantenna/qtnfmac/core.c", "qtnf_netdev_tx_timeout"],
["drivers/net/wireless/quantenna/qtnfmac/core.h", "qtnf_netdev_tx_timeout"],
["drivers/net/wireless/rndis_wlan.c", "usbnet_tx_timeout"],
["drivers/net/wireless/wl3501_cs.c", "wl3501_tx_timeout"],
["drivers/net/wireless/zydas/zd1201.c", "zd1201_tx_timeout"],
["drivers/s390/net/qeth_core.h", "qeth_tx_timeout"],
["drivers/s390/net/qeth_core_main.c", "qeth_tx_timeout"],
["drivers/s390/net/qeth_l2_main.c", "qeth_tx_timeout"],
["drivers/s390/net/qeth_l2_main.c", "qeth_tx_timeout"],
["drivers/s390/net/qeth_l3_main.c", "qeth_tx_timeout"],
["drivers/s390/net/qeth_l3_main.c", "qeth_tx_timeout"],
["drivers/staging/ks7010/ks_wlan_net.c", "ks_wlan_tx_timeout"],
["drivers/staging/qlge/qlge_main.c", "qlge_tx_timeout"],
["drivers/staging/rtl8192e/rtl8192e/rtl_core.c", "_rtl92e_tx_timeout"],
["drivers/staging/rtl8192u/r8192U_core.c", "tx_timeout"],
["drivers/staging/unisys/visornic/visornic_main.c", "visornic_xmit_timeout"],
["drivers/staging/wlan-ng/p80211netdev.c", "p80211knetdev_tx_timeout"],
["drivers/tty/n_gsm.c", "gsm_mux_net_tx_timeout"],
["drivers/tty/synclink.c", "hdlcdev_tx_timeout"],
["drivers/tty/synclink_gt.c", "hdlcdev_tx_timeout"],
["drivers/tty/synclinkmp.c", "hdlcdev_tx_timeout"],
["net/atm/lec.c", "lec_tx_timeout"],
["net/bluetooth/bnep/netdev.c", "bnep_net_timeout"]
);

for my $p (@work) {
	my @pair = @$p;
	my $file = $pair[0];
	my $func = $pair[1];
	print STDERR $file , ": ", $func,"\n";
	our @ARGV = ($file);
	while (<ARGV>) {
		if (m/($func\s*\(struct\s+net_device\s+\*[A-Za-z_]?[A-Za-z-0-9_]*)(\))/) {
			print STDERR "found $1+$2 in $file\n";
		}
		if (s/($func\s*\(struct\s+net_device\s+\*[A-Za-z_]?[A-Za-z-0-9_]*)(\))/$1, unsigned int txqueue$2/) {
			print STDERR "$func found in $file\n";
		}
		print;
	}
}

where the list of files and functions is simply from:

git grep ndo_tx_timeout, with manual addition of headers
in the rare cases where the function is from a header,
then manually changing the few places which actually
call ndo_tx_timeout.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Martin Habets <mhabets@solarflare.com>

changes from v9:
	fixup a forward declaration
changes from v9:
	more leftovers from v3 change
changes from v8:
        fix up a missing direct call to timeout
        rebased on net-next
changes from v7:
	fixup leftovers from v3 change
changes from v6:
	fix typo in rtl driver
changes from v5:
	add missing files (allow any net device argument name)
changes from v4:
	add a missing driver header
changes from v3:
        change queue # to unsigned
Changes from v2:
        added headers
Changes from v1:
        Fix errors found by kbuild:
        generalize the pattern a bit, to pick up
        a couple of instances missed by the previous
        version.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-12 21:38:57 -08:00
Jeroen de Borst a95069ecb7 gve: Fix the queue page list allocated pages count
In gve_alloc_queue_page_list(), when a page allocation fails,
qpl->num_entries will be wrong.  In this case priv->num_registered_pages
can underflow in gve_free_queue_page_list(), causing subsequent calls
to gve_alloc_queue_page_list() to fail.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-26 15:52:34 -08:00
Adi Suresh db96c2cb48 gve: fix dma sync bug where not all pages synced
The previous commit had a bug where the last page in the memory range
could not be synced. This change fixes the behavior so that all the
required pages are synced.

Fixes: 9cfeeb576d ("gve: Fixes DMA synchronization")
Signed-off-by: Adi Suresh <adisuresh@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19 12:58:18 -08:00
Yangchun Fu 9cfeeb576d gve: Fixes DMA synchronization.
Synces the DMA buffer properly in order for CPU and device to see
the most up-to-data data.

Signed-off-by: Yangchun Fu <yangchun@google.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01 15:00:05 -07:00
Dan Carpenter cc07db5a5b gve: Copy and paste bug in gve_get_stats()
There is a copy and paste error so we have "rx" where "tx" was intended
in the priv->tx[] array.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-21 20:50:26 -07:00
Catherine Sullivan 438b43bdb9 gve: Fix case where desc_cnt and data_cnt can get out of sync
desc_cnt and data_cnt should always be equal. In the case of a dropped
packet desc_cnt was still getting updated (correctly), data_cnt
was not. To eliminate this bug and prevent it from recurring this
patch combines them into one ring level cnt.

Signed-off-by: Catherine Sullivan <csully@google.com>
Reviewed-by: Sagi Shahar <sagis@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-05 13:18:53 -07:00
Chuhong Yuan 8ec1e90069 gve: replace kfree with kvfree
Variables allocated by kvzalloc should not be freed by kfree.
Because they may be allocated by vmalloc.
So we replace kfree with kvfree here.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-18 16:31:27 -07:00
Denis Efremov 14b4c48bb1 gve: Remove the exporting of gve_probe
The function gve_probe is declared static and marked EXPORT_SYMBOL, which
is at best an odd combination. Because the function is not used outside of
the drivers/net/ethernet/google/gve/gve_main.c file it is defined in, this
commit removes the EXPORT_SYMBOL() marking.

Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-14 12:12:59 -07:00
Arnd Bergmann 0287f9ed16 gve: fix unused variable/label warnings
On unusual page sizes, we get harmless warnings:

drivers/net/ethernet/google/gve/gve_rx.c:283:6: error: unused variable 'pagecount' [-Werror,-Wunused-variable]
drivers/net/ethernet/google/gve/gve_rx.c:336:1: error: unused label 'have_skb' [-Werror,-Wunused-label]

Change the preprocessor #if to regular if() to avoid this.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 12:40:58 -07:00
Wei Yongjun 877cb240f6 gve: Fix error return code in gve_alloc_qpls()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-07 19:25:36 -07:00
Colin Ian King a51df9f8da gve: fix -ENOMEM null check on a page allocation
Currently the check to see if a page is allocated is incorrect
and is checking if the pointer page is null, not *page as
intended.  Fix this.

Addresses-Coverity: ("Dereference before null check")
Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-03 13:53:07 -07:00
Catherine Sullivan 3c13ce74b6 gve: Fix u64_stats_sync to initialize start
u64_stats_fetch_begin needs to initialize start.

Signed-off-by: Catherine Sullivan <csully@google.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-03 11:27:46 -07:00
Catherine Sullivan e5b845dc79 gve: Add ethtool support
Add support for the following ethtool commands:

ethtool -s|--change devname [msglvl N] [msglevel type on|off]
ethtool -S|--statistics devname
ethtool -i|--driver devname
ethtool -l|--show-channels devname
ethtool -L|--set-channels devname
ethtool -g|--show-ring devname
ethtool --reset devname

Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: Sagi Shahar <sagis@google.com>
Signed-off-by: Jon Olson <jonolson@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-01 19:36:35 -07:00
Catherine Sullivan 9e5f7d26a4 gve: Add workqueue and reset support
Add support for the workqueue to handle management interrupts and
support for resets.

Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: Sagi Shahar <sagis@google.com>
Signed-off-by: Jon Olson <jonolson@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-01 19:36:35 -07:00
Catherine Sullivan f5cedc84a3 gve: Add transmit and receive support
Add support for passing traffic.

Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: Sagi Shahar <sagis@google.com>
Signed-off-by: Jon Olson <jonolson@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-01 19:36:35 -07:00
Catherine Sullivan 893ce44df5 gve: Add basic driver framework for Compute Engine Virtual NIC
Add a driver framework for the Compute Engine Virtual NIC that will be
available in the future.

At this point the only functionality is loading the driver.

Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: Sagi Shahar <sagis@google.com>
Signed-off-by: Jon Olson <jonolson@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-01 19:36:35 -07:00