Commit 323ebb61e3 ("net: use listified RX for handling GRO_NORMAL
skbs") introduces batching of GRO_NORMAL packets in napi_frags_finish,
and commit 6570bc79c0 ("net: core: use listified Rx for GRO_NORMAL in
napi_gro_receive()") adds the same to napi_skb_finish. However,
dev_gro_receive (that is called just before napi_{frags,skb}_finish) can
also pass skbs to the networking stack: e.g., when the GRO session is
flushed, napi_gro_complete is called, which passes pp directly to
netif_receive_skb_internal, skipping napi->rx_list. It means that the
packet stored in pp will be handled by the stack earlier than the
packets that arrived before, but are still waiting in napi->rx_list. It
leads to TCP reorderings that can be observed in the TCPOFOQueue counter
in netstat.
This commit fixes the reordering issue by making napi_gro_complete also
use napi->rx_list, so that all packets going through GRO will keep their
order. In order to keep napi_gro_flush working properly, gro_normal_list
calls are moved after the flush to clear napi->rx_list.
iwlwifi calls napi_gro_flush directly and does the same thing that is
done by gro_normal_list, so the same change is applied there:
napi_gro_flush is moved to be before the flush of napi->rx_list.
A few other drivers also use napi_gro_flush (brocade/bna/bnad.c,
cortina/gemini.c, hisilicon/hns3/hns3_enet.c). The first two also use
napi_complete_done afterwards, which performs the gro_normal_list flush,
so they are fine. The latter calls napi_gro_receive right after
napi_gro_flush, so it can end up with non-empty napi->rx_list anyway.
Fixes: 323ebb61e3 ("net: use listified RX for handling GRO_NORMAL skbs")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Cc: Alexander Lobakin <alobakin@dlink.ru>
Cc: Edward Cree <ecree@solarflare.com>
Acked-by: Alexander Lobakin <alobakin@dlink.ru>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We needed this abstraction for some CSR registers for
IWL_DEVICE_22560, but that has been removed, so we don't need the
abstraction anymore. Remove it.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
If we have only 2k RBs like on the latest (AX210) hardware, then
even on x86 where PAGE_SIZE is 4k we currently waste half of the
memory.
If this is the case, return partial pages from the allocator and
track the offset in each RBD (to be able to find the data in them
and remap them later.)
This might also address other platforms with larger PAGE_SIZE by
putting more RBs into a single large page.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We don't need to map *everything* of the RX buffers, we won't use
that much, map only the part we're going to use. This save some
IOMMU space (if applicable and it can deal with that) and also
prepares a bit for mapping partial pages for 2K buffers later.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
For HE-capable devices, we need to allocate more receive buffers as
there could be 256 frames aggregated into a single A-MPDU, and then
they might contain A-MSDUs as well. Until 22000 family, the devices
are able to put multiple frames into a single RB and the default RB
size is 4k, but starting from AX210 family this is no longer true.
On the other hand, those newer devices only use 2k receive buffers
(by default).
Modify the code and configuration to allocate an appropriate number
of RBs depending on the device capabilities:
* 4096 for AX210 HE devices, which use 2k buffers by default,
* 2048 for 22000 family devices which use 4k buffers by default,
* 512 for existing 9000 family devices, which doesn't really
change anything since that's the default before this patch,
* 512 also for AX210/22000 family devices that don't do HE.
Theoretically, for devices lower than AX210, we wouldn't have to
allocate that many RBs if the RB size was manually increased, but
to support that the code got more complex, and it didn't really
seem necessary as that's a use case for monitor mode only, where
hopefully the wasted memory isn't really much of a concern.
Note that AX210 devices actually support bigger than 12-bit VID,
which is required here as we want to allocate 4096 buffers plus
some for quick recycling, so adjust the code for that as well.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Commit 6570bc79c0 ("net: core: use listified Rx for GRO_NORMAL in
napi_gro_receive()") has applied batched GRO_NORMAL packets processing
to all napi_gro_receive() users, including mac80211-based drivers.
However, this change has led to a regression in iwlwifi driver [1][2] as
it is required for NAPI users to call napi_complete_done() or
napi_complete() and the end of every polling iteration, whilst iwlwifi
doesn't use NAPI scheduling at all and just calls napi_gro_flush().
In that particular case, packets which have not been already flushed
from napi->rx_list stall in it until at least next Rx cycle.
Fix this by adding a manual flushing of the list to iwlwifi driver right
before napi_gro_flush() call to mimic napi_complete() logics.
I prefer to open-code gro_normal_list() rather than exporting it for 2
reasons:
* to prevent from using it and napi_gro_flush() in any new drivers,
as it is the *really* bad way to use NAPI that should be avoided;
* to keep gro_normal_list() static and don't lose any CC optimizations.
I also don't add the "Fixes:" tag as the mentioned commit was only a
trigger that only exposed an improper usage of NAPI in this particular
driver.
[1] https://lore.kernel.org/netdev/PSXP216MB04388962C411CD0B17A86F47804A0@PSXP216MB0438.KORP216.PROD.OUTLOOK.COM
[2] https://bugzilla.kernel.org/show_bug.cgi?id=205647
Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
Acked-by: Luca Coelho <luciano.coelho@intel.com>
Reported-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Tested-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Reviewed-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a little less efficient now as it's known to be a
multiqueue device in this function, but a future patch will
have to use a variable here anyway, so use rxq->queue_size
now instead to make it clearer.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
These aren't used outside the rx.c file, so make them static.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
This is dead code, nothing uses the IWL_DEVICE_22560 macro and
thus nothing every uses IWL_DEVICE_FAMILY_22560. Remove it all.
While at it, remove some code and definitions used only in this
case, and clean up some comments/names that still refer to it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The new device generation has a slightly different suspend resume flow
Currently, the way the driver instruct the device to move to D3 is by
sending D3_CONFIG_CMD.
Instead of using the host command the indication is by writing to the
doorbell interrupt.
The FW will respond with interrupt to indicate transition completion.
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Add a pointer to the iwl_trans structure and point it to the trans
part of the cfg. This is the first step in disassociating the trans
configuration from the rest of the configuration.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In order to be able to select the cfg depending on the HW revision or
on the RF ID, we need to set up the trans before selecting the cfg.
To do so, move the elements from cfg that are needed by
iwl_trans_alloc() to a separate struct at the top of the cfg, so it
can be used by other cfg types as well, before selecting the rest of
the configuration.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Sometimes the register status can include interrupts that
were masked. We can, for example, get the RF-Kill bit set
in the interrupt status register although this interrupt
was masked. Then if we get the ALIVE interrupt (for example)
that was not masked, we need to *not* service the RF-Kill
interrupt.
Fix this in the MSI-X interrupt handler.
Cc: stable@vger.kernel.org
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Newest devices have a new firmware load mechanism. This
mechanism is called the context info. It means that the
driver doesn't need to load the sections of the firmware.
The driver rather prepares a place in DRAM, with pointers
to the relevant sections of the firmware, and the firmware
loads itself.
At the end of the process, the firmware sends the ALIVE
interrupt. This is different from the previous scheme in
which the driver expected the FH_TX interrupt after each
section being transferred over the DMA.
In order to support this new flow, we enabled all the
interrupts. This broke the assumption that we have in the
code that the RF-Kill interrupt can't interrupt the firmware
load flow.
Change the context info flow to enable only the ALIVE
interrupt, and re-enable all the other interrupts only
after the firmware is alive. Then, we won't see the RF-Kill
interrupt until then. Getting the RF-Kill interrupt while
loading the firmware made us kill the firmware while it is
loading and we ended up dumping garbage instead of the firmware
state.
Re-enable the ALIVE | RX interrupts from the ISR when we
get the ALIVE interrupt to be able to get the RX interrupt
that comes immediately afterwards for the ALIVE
notification. This is needed for non MSI-X only.
Cc: stable@vger.kernel.org
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We added code to restock the buffer upon ALIVE interrupt
when MSI-X is disabled. This was added as part of the context
info code. This code was added only if the ISR debug level
is set which is very unlikely to be related.
Move this code to run even when the ISR debug level is not
set.
Note that gen2 devices work with MSI-X in most cases so that
this path is seldom used.
Cc: stable@vger.kernel.org
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Unite iwl_trans debug related fields under iwl_trans_debug struct to
increase readability and keep iwl_trans clean.
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We have a solitary and inconspicous ` in the middle of a comment in
this function, which should not be there. Remove it.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
If for some reason the device gives us an RX interrupt before we're
ready for it, perhaps during device power-on with misconfigured IRQ
causes mapping or so, we can crash trying to access the queues.
Prevent that by checking that we actually have RXQs and that they
were properly allocated.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Differentiate between SW and HW error interrupts and support ini HW
error trigger.
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The layout of the RBD (receive buffer descriptor) isn't quite right,
the hardware ended up being implemented differently. Switch to the
correct RBD layout. While at it, remove the now useless extra defines.
Also, switch the CD (completion descriptor) to the right format, which
is basically just a code cleanup because the only field we really used
(rbid) is still in the same place. We may need fragmentation later if
we ever want to use it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
AX210 devices assume that the (DRAM) addresses of the rb_stts's for
the different queues are continuous.
So allocate the rb_stts's for all the Rx queues in one place.
Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In AX210 family, UMAC periphery address space moved from
0xA00000 to 0xD00000.
Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Add new device family AX210.
Make the needed changes for this family.
Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Currently there is no way to debug RX/TX paths using prints
without harming tpt. Add prints to debug RX allocation path.
We can still get 1.9 gbps with those on.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Allocator swaps the pending requests with 0 when it starts
working. This means that relying on it n RX path to decide if
to move to emergency is not always a good idea, since it may
be zero, but there are still a lot of unallocated RBs in the
system. Change allocator to decrement the pending requests on
real time. It is more expensive since it accesses the atomic
variable more times, but it gives the RX path a better idea
of the system's status.
Reported-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Fixes: 868a1e863f ("iwlwifi: pcie: avoid empty free RB queue")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
First set of patches for 5.1. Lots of new features in various drivers
but nothing really special standing out.
Major changes:
brcmfmac
* DMI nvram filename quirk for PoV TAB-P1006W-232 tablet
rsi
* support for hardware scan offload
iwlwifi
* support for Target Wakeup Time (TWT) -- a feature that allows the AP
to specify when individual stations can access the medium
* support for mac80211 AMSDU handling
* some new PCI IDs
* relicense the pcie submodule to dual GPL/BSD
* reworked the TOF/CSI (channel estimation matrix) implementation
* Some product name updates in the human-readable strings
mt76
* energy detect regulatory compliance fixes
* preparation for MT7603 support
* channel switch announcement support
mwifiex
* support for sd8977 chipset
qtnfmac
* support for 4addr mode
* convert to SPDX license identifiers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJcWoRZAAoJEG4XJFUm622bOm4H/j2Tt6qS5yz3ioH2I7R+f7e/
8C2JJia+uQs8iChdyCjCFcDymmjB2l5u72JvupwCdERzp3okv/xmJiLW5wFW2z4x
B3Nrd4FV2EMIdsRXg1RWbwZHC4wIY6lFhL1OcUcuNwpb5ab1ppvQFHmH5logd7DC
euFSe02g8xmXdUMciRDKGUdiSiDIApmx9dUfguaqYtepqeW3hmDEEkeDicaf2fjq
cM5qAvssIAgUTqwIImEQQEU7j7A2TgMYkRBNzwv2QG47jy6OlgXkZVByvtq59irg
v4BFNtG6uRWTugxQ/FwYWeqsrnjegFNKm5MFbS6BDxbe05M7QzeK8zm55fE9ZGk=
=cHbd
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-next-for-davem-2019-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 5.1
First set of patches for 5.1. Lots of new features in various drivers
but nothing really special standing out.
Major changes:
brcmfmac
* DMI nvram filename quirk for PoV TAB-P1006W-232 tablet
rsi
* support for hardware scan offload
iwlwifi
* support for Target Wakeup Time (TWT) -- a feature that allows the AP
to specify when individual stations can access the medium
* support for mac80211 AMSDU handling
* some new PCI IDs
* relicense the pcie submodule to dual GPL/BSD
* reworked the TOF/CSI (channel estimation matrix) implementation
* Some product name updates in the human-readable strings
mt76
* energy detect regulatory compliance fixes
* preparation for MT7603 support
* channel switch announcement support
mwifiex
* support for sd8977 chipset
qtnfmac
* support for 4addr mode
* convert to SPDX license identifiers
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
These files have a long history of code changes, but analysing
the remaining code leads to having only a few changes that are
not already owned by Intel, notably from
- Andy Lutomirski <luto@amacapital.net>
- Joonwoo Park <joonwpark81@gmail.com>
- Kirtika Ruchandani <kirtika@chromium.org>
- Rajat Jain <rajatja@google.com>
- Stanislaw Gruszka <sgruszka@redhat.com>
remaining in the code today.
Note that
- I myself was working for Intel and for any possibly code
that might be before my employment there give permission
- Wizery employees were working for Intel
More specifically, we identified the following commits that
(partially may) remain today:
25c03d8e8c Joonwoo Park <joonwpark81@gmail.com> ("iwlwifi: do not schedule tasklet when rcv unused irq")
f36d04abe6 Stanislaw Gruszka <sgruszka@redhat.com> ("iwlwifi: use dma_alloc_coherent")
387f3381f7 Stanislaw Gruszka <sgruszka@redhat.com> ("iwlwifi: fix dma mappings and skbs leak")
2624e96ce1 Stanislaw Gruszka <sgruszka@redhat.com> ("iwlwifi: fix possible data overwrite in hcmd callback")
bfe4b80e9f Stanislaw Gruszka <sgruszka@redhat.com> ("iwlwifi: always check if got h/w access before write")
d536c32b45 Andy Lutomirski <luto@amacapital.net> ("iwlwifi: pcie: log when waking the NIC for hcmd submission fails")
a6d24fad00 Rajat Jain <rajatja@google.com> ("iwlwifi: pcie: dump registers when HW becomes inaccessible")
fb12777ab5 Kirtika Ruchandani <kirtika@chromium.org> ("iwlwifi: Add more call-sites for pcie reg dumper")
3a73a30049 Stanislaw Gruszka <sgruszka@redhat.com> ("iwlwifi: cleanup/fix memory barriers")
aa5affbacb Stanislaw Gruszka <sgruszka@redhat.com> ("iwlwifi: dump stack when fail to gain access to the device")
Align the licenses with their permission to clean up and to
make it all identical.
CC: Joonwoo Park <joonwpark81@gmail.com>
CC: Stanislaw Gruszka <sgruszka@redhat.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Rajat Jain <rajatja@google.com>
CC: Kirtika Ruchandani <kirtika@chromium.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Kirtika Ruchandani <kirtika@chromium.org>
Acked-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Joonwoo Park <joonwpark81@gmail.com>
Acked-by: Rajat Jain <rajatja@google.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In case there are bugs in this area, this data can
help with debugging.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
This function was only used by 9000 A-step devices, which we don't
support anymore, so it can be removed.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We already need to zero out memory for dma_alloc_coherent(), as such
using dma_zalloc_coherent() is superflous. Phase it out.
This change was generated with the following Coccinelle SmPL patch:
@ replace_dma_zalloc_coherent @
expression dev, size, data, handle, flags;
@@
-dma_zalloc_coherent(dev, size, handle, flags)
+dma_alloc_coherent(dev, size, handle, flags)
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
[hch: re-ran the script on the latest tree]
Signed-off-by: Christoph Hellwig <hch@lst.de>
If all free RB queues are empty, the driver will never restock the
free RB queue. That's because the restocking happens in the Rx flow,
and if the free queue is empty there will be no Rx.
Although there's a background worker (a.k.a. allocator) allocating
memory for RBs so that the Rx handler can restock them, the worker may
run only after the free queue has become empty (and then it is too
late for restocking as explained above).
There is a solution for that called 'emergency': If the number of used
RB's reaches half the amount of all RB's, the Rx handler will not wait
for the allocator but immediately allocate memory for the used RB's
and restock the free queue.
But, since the used RB's is per queue, it may happen that the used
RB's are spread between the queues such that the emergency check will
fail for each of the queues
(and still run out of RBs, causing the above symptom).
To fix it, move to emergency mode if the sum of *all* used RBs (for
all Rx queues) reaches half the amount of all RB's
Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The Free Software Foundation address is superfluous and causes
checkpatch to issue a warning when present. Remove all paragraphs
with FSF's address to prevent that.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We offloaded all the RX configuration of init to firmware. However,
the configuration of interrupt coalescing was left hanging - it wasn't
offloaded nor was it written by host.
This write to the CSR is allowed in gen2, so the host can do it.
Without it we have various issues with RX fullness.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Store the default rxq number in a variable, so we won't need
to use the actual number in the code.
Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Allow other device generations to use the utilities that
are used to send and reclaim host commands and to allocate
rx, by making it non-static.
Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We need to drop packets with errors (such as replay,
MIC, ICV, conversion, duplicate and so on).
Drop invalid packets, put the status bits in the metadata and
move the enum definition to the correct place (FW API header).
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We would like to allow other utlities to init msix and rx.
Put their declarations in a place accessible to other utilities.
Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
This makes code less indented and more readable.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
This allows less "dummy" declarations and casting.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The rfh for 22560 devices has changed so it supports now
the same arch of using used and free lists, but different
structures to support the last.
Use the new structures, hw dependent, to manage the lists.
bd, the free list, uses the iwl_rx_transfer_desc,
in which the vid is stored in the structs' rbid
field, and the page address in the addr field.
used_bd, the used list, uses the iwl_rx_completion_desc
struct, in which the vid is stored in the structs' rbid
field.
rb_stts, the hw "write" pointer of rx is stored in a
__le16 array, in which each entry represents the write
pointer per queue.
Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The smallest rb size supported today is 4k rx buffers.
22560 devices use 2k rxb's, so allow using 2k buffers.
Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In 22560 devices the firmware will do all the hw configurations,
but that's not ready yet.
Update the correct registers in the driver until the FW is ready
and does it by itself.
Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In 22560 devices the ROM sendis an interrupt to the host
once the IML reading is done.
Handle this interrupt, and indicate sw error in case the
value is fail.
Additionally, the cause for sw error in 22560 devices
have been changed, so update the cause list.
Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The hw now refers to two new blocks:
* rx tr tail - The Tail index on the free buffers queue TR,
which is update by the device after reading the free buffer
from the tr.
* rx cr tail - Updated by the driver when completing
processing a new completion descriptor in the cr.
Add these two new struct to the rxq, allocate and free them
when needed.
In addition, the register for rx write pointer had been changed
to HBUS_TARG_WRPTR. The way to differentiate tx from rx is the
queue number. TX range is 0-511, and RX's is 512-527.
Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Make sure the rx_allocator worker is canceled before running the
rx_init routine. rx_init frees and re-allocates all rxb's pages. The
rx_allocator worker also allocates pages for the used rxb's. Running
rx_init and rx_allocator simultaniously causes a kernel panic. Fix
that by canceling the work in rx_init.
Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Different device families may have different flag values
for passing a message to the fw (i.e. SW_RESET).
In order to keep the code readable, and avoid conditioning
upon the family, store a value for each flag, which indicates
the bit that needs to be enabled.
Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Please do not apply this to mainline directly, instead please re-run the
coccinelle script shown below and apply its output.
For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't harmful, and changing them results in
churn.
However, for some features, the read/write distinction is critical to
correct operation. To distinguish these cases, separate read/write
accessors must be used. This patch migrates (most) remaining
ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
coccinelle script:
----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()
// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
virtual patch
@ depends on patch @
expression E1, E2;
@@
- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)
@ depends on patch @
expression E;
@@
- ACCESS_ONCE(E)
+ READ_ONCE(E)
----
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Work queues cannot be allocated when a mutex is held because the mutex
may be in use and that would make it sleep. Doing so generates the
following splat with 4.13+:
[ 19.513298] ======================================================
[ 19.513429] WARNING: possible circular locking dependency detected
[ 19.513557] 4.13.0-rc5+ #6 Not tainted
[ 19.513638] ------------------------------------------------------
[ 19.513767] cpuhp/0/12 is trying to acquire lock:
[ 19.513867] (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0
[ 19.514047]
[ 19.514047] but task is already holding lock:
[ 19.514166] (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210
[ 19.514338]
[ 19.514338] which lock already depends on the new lock.
This lock dependency already existed with previous kernel versions,
but it was not detected until commit 49dfe2a677 ("cpuhotplug: Link
lock stacks for hotplug callbacks") was introduced.
Reported-by: David Weinehall <david.weinehall@intel.com>
Reported-by: Jiri Kosina <jikos@kernel.org>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
This allows to modify TFD_TX_CMD_SLOTS to a power of 2
which is smaller than 256.
Note that we still need to set values to wrap at 256
into the scheduler's write pointer, but all the rest of
the code can use shorter transmit queues.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We have tracing for both pre-ICT and ICT interrupts, including all
the data read there. Extend the tracing to MSI-X interrupts.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Print the queue for the existing debug message and add a new
debug message indicating where the RB ended.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Print out both queue IDs to be able to see what went wrong.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Due to a hardware issue, certain power saving had to be
disabled. However, this issue was fixed in B-step, so the
workaround only needs to apply to A-step.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
When the firmware crashes, the transmit queues can't make
any progress. This is why we stop the counter that monitor
the transmit queues' activity.
The call that notifies the error to the op_mode may take
a bit of time, so stop the timer of the transmit queues
earlier.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
When we started using threaded irqs, all the opmode calls were changed
to be called with local_bh disabled. The reason for this was it was
that mac80211 needs that. When we are handling FW errors, mac80211 is
not involved, so we don't need it.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
When toggling the RF-kill pin quickly in succession, the driver can
get rather confused because it might be in the process of shutting
down, expecting all commands to go through quickly due to rfkill,
but the transport already thinks the device is accessible again,
even though it previously shut it down. This leads to bugs, and I
even observed a kernel panic.
Avoid this by making the PCIe code only report that the radio is
enabled again after the higher layers actually decided to shut it
off.
This also pulls out this common RF-kill checking code into a common
function called by both transport generations and also moves it to
the direct method - in the internal helper we don't really care
about the RF-kill status anymore since we won't report it up until
the stop anyway.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In order to debug "hardware" RF-kill flows, add a low-level hook to
allow changing the "hardware" RF-kill from debugfs.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
There's no point in duplicating exactly the same code here
for legacy and MSI-X interrupts, so pull it out into a new
function to call in both places.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Letting the preprocessor/compiler generate the shift/mask by itself
is a win for readability, so use bitfield.h for some registers.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
When applying no-reclaim logic to commands other than the group
zero for legacy commands, commands such as 0x1c (TX_CMD in group
0) can't be used in any other group. Fix that by applying this
logic only for group 0 - it's not and should never be needed for
any other groups.
Reported-by: Sharon Dvir <sharon.dvir@intel.com>
Reported-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Change queue allocation to be dynamic. On transport init only
the command queue is being allocated. Other queues are allocated
on demand.
This is due to the huge amount of queues we will soon enable (512)
and as a preparation for TX Virtual Queue Manager feature (TVQM),
where firmware will assign the actual queue number on demand.
This includes also allocation of the byte count table per queue
and not as a contiguous chunk of memory.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In a000 transport we will allocate queues dynamically.
Right now queue are allocated as one big chunk of memory
and accessed as such.
The dynamic allocation of the queues will require accessing
the queues as pointers.
In order to keep simplicity of pre-a000 tx queues handling,
keep allocating and freeing the memory in the same style,
but move to access the queues in the various functions as
individual pointers.
Dynamic allocation for the a000 devices will be in a separate
patch.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Context information structure is going to be used in a000
devices for firmware self init.
The self init includes firmware self loading from DRAM by
ROM.
This means the TFH relevant firmware loading can be cleaned up.
The firmware loading includes the paging memory as well, so op
mode can stop initializing the paging and sending the DRAM_BLOCK_CMD.
Firmware is doing RFH, TFH and SCD configuration, while driver
only fills the required configurations and addresses in the
context information structure.
The only remaining access to RFH is the write pointer, which
is updated upon alive interrupt after FW configured the RFH.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We don't need to print so much data in the kernel log.
Limit the data to be printed to the queue that actually
got stuck in case of a TFD queue hang, and stop dumping
all the CSR and FH registers. Over the course of time, the
CSR and FH values haven't proven themselves to be really
useful for debugging, and they are now in the firmware dump
anyway.
This comes as a preparation to the addition of more data
required to be printed by the firwmare team.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Currently, when getting a RFKILL interrupt, the transport enters a flow
in which it stops the device, disables other interrupts, etc. After
stopping the device, the transport resets the hw, and sleeps. During
the sleep, a context switch occurs and host commands are sent by upper
layers (e.g. mvm) to the fw. This is possible since the op_mode layer
and the transport layer hold different mutexes.
Since the STATUS_RFKILL bit isn't set, the transport layer doesn't
recognize that RFKILL was toggled on, and no commands can actually be
sent, so it enqueues the command to the tx queue and sets a timer on
the queue.
After switching context back to stopping the device, STATUS_RFKILL is
set, and then the transport can't send the command to the fw.
This eventually results in a queue hang.
Fix this by setting STATUS_RFKILL immediately when
the interrupt is fired.
Signed-off-by: Golan Ben-Ami <golan.ben.ami@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
When resuming, it's possible for the following scenario to occur:
* iwl_pci_resume() enables the RF-kill interrupt
* iwl_pci_resume() reads the RF-kill state (e.g. to 'radio enabled')
* RF_KILL interrupt triggers, and iwl_pcie_irq_handler() reads the
state, now 'radio disabled', and acquires the &trans_pcie->mutex.
* iwl_pcie_irq_handler() further calls iwl_trans_pcie_rf_kill() to
indicate to the higher layers that the radio is now disabled (and
stops the device while at it)
* iwl_pcie_irq_handler() drops the mutex
* iwl_pci_resume() continues, acquires the mutex and calls the higher
layers to indicate that the radio is enabled.
At this point, the device is stopped but the higher layers think it's
available, and can call deeply into the driver to try to enable it.
However, this will fail since the device is actually disabled.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
There's no need to declare a list and then init it manually,
just use the LIST_HEAD() macro.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Log group as well. Remove 0x prefix to match TX logging.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In case the OS provides fewer interrupts than requested, different
causes will share the same interrupt vector as follow:
1.One interrupt less: non rx causes shared with FBQ.
2.Two interrupts less: non rx causes shared with FBQ and RSS.
3.More than two interrupts: we will use fewer RSS queues.
Also make the request depend on the number of online CPUs
instead of possible CPUs.
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The original intent was to have the general iwl_queue shared
between RX and TX queues, but it is not the actual status.
Since it is not shared with any struct but iwl_txq, it adds
unnecessary complexity. Merge those structs.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Move the write_prph_64 of pcie to be transport agnostic.
Add direct write as well, as it is needed for a000 HW.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In MQ environment and new architecture in early stages
we may encounter DMA issues. Track RXB status and bail
out in case we receive index to an RXB that was not
mapped and handed over to HW.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Upon firmware load interrupt (FH_TX), the ISR re-enables the
firmware load interrupt only to avoid races with other
flows as described in the commit below. When the firmware
is completely loaded, the thread that is loading the
firmware will enable all the interrupts to make sure that
the driver gets the ALIVE interrupt.
The problem with that is that the thread that is loading
the firmware is actually racing against the ISR and we can
get to the following situation:
CPU0 CPU1
iwl_pcie_load_given_ucode
...
iwl_pcie_load_firmware_chunk
wait_for_interrupt
<interrupt>
ISR handles CSR_INT_BIT_FH_TX
ISR wakes up the thread on CPU0
/* enable all the interrupts
* to get the ALIVE interrupt
*/
iwl_enable_interrupts
ISR re-enables CSR_INT_BIT_FH_TX only
/* start the firmware */
iwl_write32(trans, CSR_RESET, 0);
BUG! ALIVE interrupt will never arrive since it has been
masked by CPU1.
In order to fix that, change the ISR to first check if
STATUS_INT_ENABLED is set. If so, re-enable all the
interrupts. If STATUS_INT_ENABLED is clear, then we can
check what specific interrupt happened and re-enable only
that specific interrupt (RFKILL or FH_TX).
All the credit for the analysis goes to Kirtika who did the
actual debugging work.
Cc: <stable@vger.kernel.org> [4.5+]
Fixes: a6bd005fe9 ("iwlwifi: pcie: fix RF-Kill vs. firmware load race")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In cases of hardware or DMA error, the vid read from
a zeroed location will be 0, and we will access the rxb
at index 0 in the global table, while it may be NULL or
owned by hardware.
Invalidate vid 0 in order to detect the situation and
bail out.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Somehow we ended up stopping RX using legacy RX registers
even for devices that support RFH. Fix it.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Currently code calls restock for mq devices during the init
function, unlike sq where restock is called after init.
This causes an harmless but alarming deadlock warning from
lockdep, to fix this - unify the init code.
Rename the restock functions while at it.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Add a warning in case packet didn't end up in the HW
destined queue.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We now have 9000 devices that support multiple frames in
a single RB. Enable it.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
For 9000 devices we can have PCIe bus for discrete
devices and IOSF bus for integrated devices.
PCIe supports maximum transfer size of 128B while IOSF
bus supports maximum transfer size of 64B.
Configure RB size accordingly.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Integrated 9000 devices have a bug with shadow registers
value retention.
If driver writes RBD registers while MAC is asleep the
values are stored in shadow registers to be copied whenever
MAC wakes up.
However, in 9000 devices a MAC wakeup is not triggered
and when the bus powers down due to inactivity the shadow
values and dirty bits are lost.
Turn on the chicken-bits that cause MAC wakeup for RX-related
values as well when the device is in D0.
When the device is in low power mode turn the RX wakeup chicken
bits off since driver is idle and this W/A is not needed.
Remove previous W/A which was ineffective.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
When initializing RX we grab NIC access for every read and
write. This is redundant - we can just grab access once.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The RX queues have a shadow register for the write pointer
that enables updates without grabbing NIC access. Use them
instead of the periphery registers because accessing those
is much more expensive.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
CSR registers are always available even when the NIC is not awake, no
need to wake up the NIC before accessing them. This has a huge impact
when we re-enable an interrupt at the end of the ISR since waking up the
NIC can take some time.
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
isr_stats is written twice with the same value, remove one of the
redundant assignments to isr_stats.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Due to hardware bug, upon any shadow free-queue register write
access, a legacy RBD shadow register must be written as well.
This is required in order to trigger a copy of the shadow registers
values after MAC exits sleep state.
Specifically, the driver has to write (any value) to the legacy RBD
register each time FRBDCB is accessed.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
My patch resized the pool size, but neglected to resize
the global table, which is obviously wrong since the global
table maps the pool's rxb to vid one to one. This results
in a panic in 9000 devices.
Add a build bug to avoid such a case in the future.
Fixes: 7b5424361e ("iwlwifi: pcie: fine tune number of rxbs")
Reported-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
When trying to reach high Rx throughput of more than 500Mbps on
a device with a relatively weak CPU (Atom x5-Z8500), CPU utilization
may become a bottleneck. Analysis showed that we are looping in
iwl_pcie_rx_handle for very long periods which led to starvation
of other threads (iwl_pcie_rx_handle runs with _bh disabled).
We were handling Rx and allocating new buffers and the new buffers
were ready quickly enough to be available before we had finished
handling all the buffers available in the hardware. As a
consequence, we called iwl_pcie_rxq_restock to refill the hardware
with the new buffers, and start again handling new buffers without
exiting the function. Since we read the hardware pointer again when
we goto restart, new buffers were handled immediately instead of
exiting the function.
This patch avoids refilling RBs inside rx handling loop, unless an
emergency situation is reached. It also doesn't read the hardware
pointer again unless we are in an emergency (unlikely) case.
This significantly reduce the maximal time we spend in
iwl_pcie_rx_handle with _bh disabled.
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
We kick the allocator when we have 2 RBDs that don't have
attached RBs, and the allocator allocates 8 RBs meaning
that it needs another 6 RBDs to attach the RBs to.
The design is that allocator should always have enough RBDs
to fulfill requests, so we give in advance 6 RBDs to the
allocator so that when it is kicked, it gets additional 2 RBDs
and has enough RBDs.
These RBDs were taken from the Rx queue itself, meaning
that each Rx queue didn't have the maximal number of
RBDs, but MAX - 6.
Change initial number of RBDs in the system to include both
queue size and allocator reserves.
Note the multi-queue is always 511 instead of 512 to avoid a
full queue since we cannot detect this state easily enough in
the 9000 arch.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
128 byte chunk size is supported only on PCIe and not
on IOSF. For now, change it back to 64 byte.
Reported-by: Oren Givon <oren.givon@intel.com>
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Change the code to move rxbs directly from the allocator's
list to the queue's free list. This makes the code more
readable, saves the interim array and the double loop over
the free RBs.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
In 9000 series A0 step the closed_rb_num is not wrapping around
properly. The queue is wrapping around as it should, so we can
W/A it by wrapping the closed_rb_num in the driver.
While at it, extend RX logging and add error handling of other
cases HW values may cause us to access invalid memory locations.
Add also a proper masking of vid value read from HW - this should
not have actual affect, but better to be on the safe side.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Fine tune RFH registers further:
* Set default queue explicitly
* Set RFH to drop frames exceeding RB size
* Set the maximum rx transfer size to DRAM to 128 instead of 64
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Working with MSIX requires prior configuration.
This includes requesting interrupt vectors from the OS,
registering the vectors and mapping the optional causes to the
relevant interrupt. In addition add new interrupt handler
to handle MSIX interrupt.
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
These are a few fixes for the current cycle.
3 out of the 5 patches fix a bugzilla.
* fix a race that users reported when we try to load the firmware
and the hardware rfkill interrupt triggers at the same time.
* Luca fixes a very visible bug in scheduled scan: our firmware
doesn't support scheduled scan with no profile configured and
the supplicant sometimes requests such scheduled scans.
* build system fix
* firmware name update for 8265
* typo fix in return value
When we load the firmware, we hold trans_pcie->mutex to
avoid nested flows. We also rely on the ISR to wake up the
thread when the DMA has finished copying a chunk. During
this flow, we enable the RF-Kill interrupt.
The problem is that the RF-Kill interrupt handler can take
the mutex and bring the device down. This means that if
we load the firmware while the RF-Kill switch is enabled
(which will happen when we load the INIT firmware to read
the device's capabilities and register to mac80211), we
may get an RF-Kill interrupt immediately and the ISR will
be waiting for the mutex held by the thread that is
currently loading the firmware. At this stage, the ISR
won't be able to service the DMA's interrupt needed to
wake up the thread that load the firmware. We are in a
deadlock situation which ends when the thread that loads
the firmware fails on timeout and releases the mutex.
To fix this, take the mutex later in the flow, disable
the interrupts and synchronize_irq() to give a chance to
the RF-Kill interrupt to run and complete.
After that, mask all the interrupts besides the DMA
interrupt and proceed with firmware load. Make sure to
check that there was no RF-Kill interrupt when the
interrupts were disabled.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=111361
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Previous patches enabled new 9000 hardware DMA for one queue
only.
Enable the actual multi-queue path and configuration now.
This requires also per-queue NAPI struct.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
The 9000 series introduces several changes in the device
DMA operation.
As the device now supports multi-queue rx, several DMA channels
should be configured.
The flows of providing the device with the allocated RBDs now
changes as well - the device maintains a separate table of used
and free table.
The hardware may use the free table to feed RBDs to any queue.
This requires maintaing a shared table to map returned RBDs to
the original RXB - for that purpose the VID is introduced - an
internal identifier of the RB placed in the lower 12 bits and
returned by HW in the used data.
Another change is the support of 64 bit DMA address.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
The 9000 series devices will support multi rx queues.
Current code has one static rx queue - change it to allocate
a number of queues per the device capability (pre-9000 devices
have the number of rx queues set to one).
Subsequent generalizations are:
Change the code to access an explicit numbered rx queue only
when the queue number is known - when handling interrupt, when
accessing the default queue and when iterating the queues.
The rest of the functions will receive the rx queue as a pointer.
Generalize the warning in allocation failure to consider the
allocator status instead of a single rx queue status.
Move the rx initial pool of memory buffers to be shared among
all the queues and allocated to the default queue on init.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Host commands now have a group id, express this in printed messages.
Signed-off-by: Sharon Dvir <sharon.dvir@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>