Commit Graph

95 Commits

Author SHA1 Message Date
Linus Torvalds 7203ca412f Significant patch series in this merge are as follows:
- The 10 patch series "__vmalloc()/kvmalloc() and no-block support" from
   Uladzislau Rezki reworks the vmalloc() code to support non-blocking
   allocations (GFP_ATOIC, GFP_NOWAIT).
 
 - The 2 patch series "ksm: fix exec/fork inheritance" from xu xin fixes
   a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not inherited
   across fork/exec.
 
 - The 4 patch series "mm/zswap: misc cleanup of code and documentations"
   from SeongJae Park does some light maintenance work on the zswap code.
 
 - The 5 patch series "mm/page_owner: add debugfs files 'show_handles'
   and 'show_stacks_handles'" from Mauricio Faria de Oliveira enhances the
   /sys/kernel/debug/page_owner debug feature.  It adds unique identifiers
   to differentiate the various stack traces so that userspace monitoring
   tools can better match stack traces over time.
 
 - The 2 patch series "mm/page_alloc: pcp->batch cleanups" from Joshua
   Hahn makes some minor alterations to the page allocator's per-cpu-pages
   feature.
 
 - The 2 patch series "Improve UFFDIO_MOVE scalability by removing
   anon_vma lock" from Lokesh Gidra addresses a scalability issue in
   userfaultfd's UFFDIO_MOVE operation.
 
 - The 2 patch series "kasan: cleanups for kasan_enabled() checks" from
   Sabyrzhan Tasbolatov performs some cleanup in the KASAN code.
 
 - The 2 patch series "drivers/base/node: fold node register and
   unregister functions" from Donet Tom cleans up the NUMA node handling
   code a little.
 
 - The 4 patch series "mm: some optimizations for prot numa" from Kefeng
   Wang provides some cleanups and small optimizations to the NUMA
   allocation hinting code.
 
 - The 5 patch series "mm/page_alloc: Batch callers of
   free_pcppages_bulk" from Joshua Hahn addresses long lock hold times at
   boot on large machines.  These were causing (harmless) softlockup
   warnings.
 
 - The 2 patch series "optimize the logic for handling dirty file folios
   during reclaim" from Baolin Wang removes some now-unnecessary work from
   page reclaim.
 
 - The 10 patch series "mm/damon: allow DAMOS auto-tuned for per-memcg
   per-node memory usage" from SeongJae Park enhances the DAMOS auto-tuning
   feature.
 
 - The 2 patch series "mm/damon: fixes for address alignment issues in
   DAMON_LRU_SORT and DAMON_RECLAIM" from Quanmin Yan fixes DAMON_LRU_SORT
   and DAMON_RECLAIM with certain userspace configuration.
 
 - The 15 patch series "expand mmap_prepare functionality, port more
   users" from Lorenzo Stoakes enhances the new(ish)
   file_operations.mmap_prepare() method and ports additional callsites
   from the old ->mmap() over to ->mmap_prepare().
 
 - The 8 patch series "Fix stale IOTLB entries for kernel address space"
   from Lu Baolu fixes a bug (and possible security issue on non-x86) in
   the IOMMU code.  In some situations the IOMMU could be left hanging onto
   a stale kernel pagetable entry.
 
 - The 4 patch series "mm/huge_memory: cleanup __split_unmapped_folio()"
   from Wei Yang cleans up and optimizes the folio splitting code.
 
 - The 5 patch series "mm, swap: misc cleanup and bugfix" from Kairui
   Song implements some cleanups and a minor fix in the swap discard code.
 
 - The 8 patch series "mm/damon: misc documentation fixups" from SeongJae
   Park does as advertised.
 
 - The 9 patch series "mm/damon: support pin-point targets removal" from
   SeongJae Park permits userspace to remove a specific monitoring target
   in the middle of the current targets list.
 
 - The 2 patch series "mm: MISC follow-up patches for linux/pgalloc.h"
   from Harry Yoo implements a couple of cleanups related to mm header file
   inclusion.
 
 - The 2 patch series "mm/swapfile.c: select swap devices of default
   priority round robin" from Baoquan He improves the selection of swap
   devices for NUMA machines.
 
 - The 3 patch series "mm: Convert memory block states (MEM_*) macros to
   enums" from Israel Batista changes the memory block labels from macros
   to enums so they will appear in kernel debug info.
 
 - The 3 patch series "ksm: perform a range-walk to jump over holes in
   break_ksm" from Pedro Demarchi Gomes addresses an inefficiency when KSM
   unmerges an address range.
 
 - The 22 patch series "mm/damon/tests: fix memory bugs in kunit tests"
   from SeongJae Park fixes leaks and unhandled malloc() failures in DAMON
   userspace unit tests.
 
 - The 2 patch series "some cleanups for pageout()" from Baolin Wang
   cleans up a couple of minor things in the page scanner's
   writeback-for-eviction code.
 
 - The 2 patch series "mm/hugetlb: refactor sysfs/sysctl interfaces" from
   Hui Zhu moves hugetlb's sysfs/sysctl handling code into a new file.
 
 - The 9 patch series "introduce VM_MAYBE_GUARD and make it sticky" from
   Lorenzo Stoakes makes the VMA guard regions available in /proc/pid/smaps
   and improves the mergeability of guarded VMAs.
 
 - The 2 patch series "mm: perform guard region install/remove under VMA
   lock" from Lorenzo Stoakes reduces mmap lock contention for callers
   performing VMA guard region operations.
 
 - The 2 patch series "vma_start_write_killable" from Matthew Wilcox
   starts work in permitting applications to be killed when they are
   waiting on a read_lock on the VMA lock.
 
 - The 11 patch series "mm/damon/tests: add more tests for online
   parameters commit" from SeongJae Park adds additional userspace testing
   of DAMON's "commit" feature.
 
 - The 9 patch series "mm/damon: misc cleanups" from SeongJae Park does
   that.
 
 - The 2 patch series "make VM_SOFTDIRTY a sticky VMA flag" from Lorenzo
   Stoakes addresses the possible loss of a VMA's VM_SOFTDIRTY flag when
   that VMA is merged with another.
 
 - The 16 patch series "mm: support device-private THP" from Balbir Singh
   introduces support for Transparent Huge Page (THP) migration in zone
   device-private memory.
 
 - The 3 patch series "Optimize folio split in memory failure" from Zi
   Yan optimizes folio split operations in the memory failure code.
 
 - The 2 patch series "mm/huge_memory: Define split_type and consolidate
   split support checks" from Wei Yang provides some more cleanups in the
   folio splitting code.
 
 - The 16 patch series "mm: remove is_swap_[pte, pmd]() + non-swap
   entries, introduce leaf entries" from Lorenzo Stoakes cleans up our
   handling of pagetable leaf entries by introducing the concept of
   'software leaf entries', of type softleaf_t.
 
 - The 4 patch series "reparent the THP split queue" from Muchun Song
   reparents the THP split queue to its parent memcg.  This is in
   preparation for addressing the long-standing "dying memcg" problem,
   wherein dead memcg's linger for too long, consuming memory resources.
 
 - The 3 patch series "unify PMD scan results and remove redundant
   cleanup" from Wei Yang does a little cleanup in the hugepage collapse
   code.
 
 - The 6 patch series "zram: introduce writeback bio batching" from
   Sergey Senozhatsky improves zram writeback efficiency by introducing
   batched bio writeback support.
 
 - The 4 patch series "memcg: cleanup the memcg stats interfaces" from
   Shakeel Butt cleans up our handling of the interrupt safety of some
   memcg stats.
 
 - The 4 patch series "make vmalloc gfp flags usage more apparent" from
   Vishal Moola cleans up vmalloc's handling of incoming GFP flags.
 
 - The 6 patch series "mm: Add soft-dirty and uffd-wp support for RISC-V"
   from Chunyan Zhang teches soft dirty and userfaultfd write protect
   tracking to use RISC-V's Svrsw60t59b extension.
 
 - The 5 patch series "mm: swap: small fixes and comment cleanups" from
   Youngjun Park fixes a small bug and cleans up some of the swap code.
 
 - The 4 patch series "initial work on making VMA flags a bitmap" from
   Lorenzo Stoakes starts work on converting the vma struct's flags to a
   bitmap, so we stop running out of them, especially on 32-bit.
 
 - The 2 patch series "mm/swapfile: fix and cleanup swap list iterations"
   from Youngjun Park addresses a possible bug in the swap discard code and
   cleans things up a little.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaTEb0wAKCRDdBJ7gKXxA
 jjfIAP94W4EkCCwNOupnChoG+YWw/JW21anXt5NN+i5svn1yugEAwzvv6A+cAFng
 o+ug/fyrfPZG7PLp2R8WFyGIP0YoBA4=
 =IUzS
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

  "__vmalloc()/kvmalloc() and no-block support" (Uladzislau Rezki)
     Rework the vmalloc() code to support non-blocking allocations
     (GFP_ATOIC, GFP_NOWAIT)

  "ksm: fix exec/fork inheritance" (xu xin)
     Fix a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not
     inherited across fork/exec

  "mm/zswap: misc cleanup of code and documentations" (SeongJae Park)
     Some light maintenance work on the zswap code

  "mm/page_owner: add debugfs files 'show_handles' and 'show_stacks_handles'" (Mauricio Faria de Oliveira)
     Enhance the /sys/kernel/debug/page_owner debug feature by adding
     unique identifiers to differentiate the various stack traces so
     that userspace monitoring tools can better match stack traces over
     time

  "mm/page_alloc: pcp->batch cleanups" (Joshua Hahn)
     Minor alterations to the page allocator's per-cpu-pages feature

  "Improve UFFDIO_MOVE scalability by removing anon_vma lock" (Lokesh Gidra)
     Address a scalability issue in userfaultfd's UFFDIO_MOVE operation

  "kasan: cleanups for kasan_enabled() checks" (Sabyrzhan Tasbolatov)

  "drivers/base/node: fold node register and unregister functions" (Donet Tom)
     Clean up the NUMA node handling code a little

  "mm: some optimizations for prot numa" (Kefeng Wang)
     Cleanups and small optimizations to the NUMA allocation hinting
     code

  "mm/page_alloc: Batch callers of free_pcppages_bulk" (Joshua Hahn)
     Address long lock hold times at boot on large machines. These were
     causing (harmless) softlockup warnings

  "optimize the logic for handling dirty file folios during reclaim" (Baolin Wang)
     Remove some now-unnecessary work from page reclaim

  "mm/damon: allow DAMOS auto-tuned for per-memcg per-node memory usage" (SeongJae Park)
     Enhance the DAMOS auto-tuning feature

  "mm/damon: fixes for address alignment issues in DAMON_LRU_SORT and DAMON_RECLAIM" (Quanmin Yan)
     Fix DAMON_LRU_SORT and DAMON_RECLAIM with certain userspace
     configuration

  "expand mmap_prepare functionality, port more users" (Lorenzo Stoakes)
     Enhance the new(ish) file_operations.mmap_prepare() method and port
     additional callsites from the old ->mmap() over to ->mmap_prepare()

  "Fix stale IOTLB entries for kernel address space" (Lu Baolu)
     Fix a bug (and possible security issue on non-x86) in the IOMMU
     code. In some situations the IOMMU could be left hanging onto a
     stale kernel pagetable entry

  "mm/huge_memory: cleanup __split_unmapped_folio()" (Wei Yang)
     Clean up and optimize the folio splitting code

  "mm, swap: misc cleanup and bugfix" (Kairui Song)
     Some cleanups and a minor fix in the swap discard code

  "mm/damon: misc documentation fixups" (SeongJae Park)

  "mm/damon: support pin-point targets removal" (SeongJae Park)
     Permit userspace to remove a specific monitoring target in the
     middle of the current targets list

  "mm: MISC follow-up patches for linux/pgalloc.h" (Harry Yoo)
     A couple of cleanups related to mm header file inclusion

  "mm/swapfile.c: select swap devices of default priority round robin" (Baoquan He)
     improve the selection of swap devices for NUMA machines

  "mm: Convert memory block states (MEM_*) macros to enums" (Israel Batista)
     Change the memory block labels from macros to enums so they will
     appear in kernel debug info

  "ksm: perform a range-walk to jump over holes in break_ksm" (Pedro Demarchi Gomes)
     Address an inefficiency when KSM unmerges an address range

  "mm/damon/tests: fix memory bugs in kunit tests" (SeongJae Park)
     Fix leaks and unhandled malloc() failures in DAMON userspace unit
     tests

  "some cleanups for pageout()" (Baolin Wang)
     Clean up a couple of minor things in the page scanner's
     writeback-for-eviction code

  "mm/hugetlb: refactor sysfs/sysctl interfaces" (Hui Zhu)
     Move hugetlb's sysfs/sysctl handling code into a new file

  "introduce VM_MAYBE_GUARD and make it sticky" (Lorenzo Stoakes)
     Make the VMA guard regions available in /proc/pid/smaps and
     improves the mergeability of guarded VMAs

  "mm: perform guard region install/remove under VMA lock" (Lorenzo Stoakes)
     Reduce mmap lock contention for callers performing VMA guard region
     operations

  "vma_start_write_killable" (Matthew Wilcox)
     Start work on permitting applications to be killed when they are
     waiting on a read_lock on the VMA lock

  "mm/damon/tests: add more tests for online parameters commit" (SeongJae Park)
     Add additional userspace testing of DAMON's "commit" feature

  "mm/damon: misc cleanups" (SeongJae Park)

  "make VM_SOFTDIRTY a sticky VMA flag" (Lorenzo Stoakes)
     Address the possible loss of a VMA's VM_SOFTDIRTY flag when that
     VMA is merged with another

  "mm: support device-private THP" (Balbir Singh)
     Introduce support for Transparent Huge Page (THP) migration in zone
     device-private memory

  "Optimize folio split in memory failure" (Zi Yan)

  "mm/huge_memory: Define split_type and consolidate split support checks" (Wei Yang)
     Some more cleanups in the folio splitting code

  "mm: remove is_swap_[pte, pmd]() + non-swap entries, introduce leaf entries" (Lorenzo Stoakes)
     Clean up our handling of pagetable leaf entries by introducing the
     concept of 'software leaf entries', of type softleaf_t

  "reparent the THP split queue" (Muchun Song)
     Reparent the THP split queue to its parent memcg. This is in
     preparation for addressing the long-standing "dying memcg" problem,
     wherein dead memcg's linger for too long, consuming memory
     resources

  "unify PMD scan results and remove redundant cleanup" (Wei Yang)
     A little cleanup in the hugepage collapse code

  "zram: introduce writeback bio batching" (Sergey Senozhatsky)
     Improve zram writeback efficiency by introducing batched bio
     writeback support

  "memcg: cleanup the memcg stats interfaces" (Shakeel Butt)
     Clean up our handling of the interrupt safety of some memcg stats

  "make vmalloc gfp flags usage more apparent" (Vishal Moola)
     Clean up vmalloc's handling of incoming GFP flags

  "mm: Add soft-dirty and uffd-wp support for RISC-V" (Chunyan Zhang)
     Teach soft dirty and userfaultfd write protect tracking to use
     RISC-V's Svrsw60t59b extension

  "mm: swap: small fixes and comment cleanups" (Youngjun Park)
     Fix a small bug and clean up some of the swap code

  "initial work on making VMA flags a bitmap" (Lorenzo Stoakes)
     Start work on converting the vma struct's flags to a bitmap, so we
     stop running out of them, especially on 32-bit

  "mm/swapfile: fix and cleanup swap list iterations" (Youngjun Park)
     Address a possible bug in the swap discard code and clean things
     up a little

[ This merge also reverts commit ebb9aeb980 ("vfio/nvgrace-gpu:
  register device memory for poison handling") because it looks
  broken to me, I've asked for clarification   - Linus ]

* tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
  mm: fix vma_start_write_killable() signal handling
  mm/swapfile: use plist_for_each_entry in __folio_throttle_swaprate
  mm/swapfile: fix list iteration when next node is removed during discard
  fs/proc/task_mmu.c: fix make_uffd_wp_huge_pte() huge pte handling
  mm/kfence: add reboot notifier to disable KFENCE on shutdown
  memcg: remove inc/dec_lruvec_kmem_state helpers
  selftests/mm/uffd: initialize char variable to Null
  mm: fix DEBUG_RODATA_TEST indentation in Kconfig
  mm: introduce VMA flags bitmap type
  tools/testing/vma: eliminate dependency on vma->__vm_flags
  mm: simplify and rename mm flags function for clarity
  mm: declare VMA flags by bit
  zram: fix a spelling mistake
  mm/page_alloc: optimize lowmem_reserve max lookup using its semantic monotonicity
  mm/vmscan: skip increasing kswapd_failures when reclaim was boosted
  pagemap: update BUDDY flag documentation
  mm: swap: remove scan_swap_map_slots() references from comments
  mm: swap: change swap_alloc_slow() to void
  mm, swap: remove redundant comment for read_swap_cache_async
  mm, swap: use SWP_SOLIDSTATE to determine if swap is rotational
  ...
2025-12-05 13:52:43 -08:00
Balbir Singh 3a5a065545 mm/zone_device: rename page_free callback to folio_free
Change page_free to folio_free to make the folio support for
zone device-private more consistent. The PCI P2PDMA callback
has also been updated and changed to folio_free() as a result.

For drivers that do not support folios (yet), the folio is
converted back into page via &folio->page and the page is used
as is, in the current callback implementation.

Link: https://lkml.kernel.org/r/20251001065707.920170-3-balbirs@nvidia.com
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Ying Huang <ying.huang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24 15:08:47 -08:00
Leon Romanovsky 395698bd2c PCI/P2PDMA: Provide an access to pci_p2pdma_map_type() function
Provide an access to pci_p2pdma_map_type() function to allow subsystems
to determine the appropriate mapping type for P2PDMA transfers between
a provider and target device.

The pci_p2pdma_map_type() function is the core P2P layer version of
the existing public, but struct page focused, pci_p2pdma_state()
function. It returns the same result. It is required to use the p2p
subsystem from drivers that don't use the struct page layer.

Like __pci_p2pdma_update_state() it is not an exported function. The
idea is that only subsystem code will implement mapping helpers for
taking in phys_addr_t lists, this is deliberately not made accessible
to every driver to prevent abuse.

Following patches will use this function to implement a shared DMA
mapping helper for DMABUF.

Tested-by: Alex Mastro <amastro@fb.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20251120-dmabuf-vfio-v9-4-d7f71607f371@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-20 12:02:00 -07:00
Leon Romanovsky 372d6d1b8a PCI/P2PDMA: Refactor to separate core P2P functionality from memory allocation
Refactor the PCI P2PDMA subsystem to separate the core peer-to-peer DMA
functionality from the optional memory allocation layer. This creates a
two-tier architecture:

The core layer provides P2P mapping functionality for physical addresses
based on PCI device MMIO BARs and integrates with the DMA API for
mapping operations. This layer is required for all P2PDMA users.

The optional upper layer provides memory allocation capabilities
including gen_pool allocator, struct page support, and sysfs interface
for user space access.

This separation allows subsystems like DMABUF to use only the core P2P
mapping functionality without the overhead of memory allocation features
they don't need. The core functionality is now available through the
new pcim_p2pdma_provider() function that returns a p2pdma_provider
structure.

Tested-by: Alex Mastro <amastro@fb.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20251120-dmabuf-vfio-v9-3-d7f71607f371@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-20 12:01:52 -07:00
Leon Romanovsky f58ef9d1d1 PCI/P2PDMA: Separate the mmap() support from the core logic
Currently the P2PDMA code requires a pgmap and a struct page to
function. The was serving three important purposes:

 - DMA API compatibility, where scatterlist required a struct page as
   input

 - Life cycle management, the percpu_ref is used to prevent UAF during
   device hot unplug

 - A way to get the P2P provider data through the pci_p2pdma_pagemap

The DMA API now has a new flow, and has gained phys_addr_t support, so
it no longer needs struct pages to perform P2P mapping.

Lifecycle management can be delegated to the user, DMABUF for instance
has a suitable invalidation protocol that does not require struct page.

Finding the P2P provider data can also be managed by the caller
without need to look it up from the phys_addr.

Split the P2PDMA code into two layers. The optional upper layer,
effectively, provides a way to mmap() P2P memory into a VMA by
providing struct page, pgmap, a genalloc and sysfs.

The lower layer provides the actual P2P infrastructure and is wrapped
up in a new struct p2pdma_provider. Rework the mmap layer to use new
p2pdma_provider based APIs.

Drivers that do not want to put P2P memory into VMA's can allocate a
struct p2pdma_provider after probe() starts and free it before
remove() completes. When DMA mapping the driver must convey the struct
p2pdma_provider to the DMA mapping code along with a phys_addr of the
MMIO BAR slice to map. The driver must ensure that no DMA mapping
outlives the lifetime of the struct p2pdma_provider.

The intended target of this new API layer is DMABUF. There is usually
only a single p2pdma_provider for a DMABUF exporter. Most drivers can
establish the p2pdma_provider during probe, access the single instance
during DMABUF attach and use that to drive the DMA mapping.

DMABUF provides an invalidation mechanism that can guarantee all DMA
is halted and the DMA mappings are undone prior to destroying the
struct p2pdma_provider. This ensures there is no UAF through DMABUFs
that are lingering past driver removal.

The new p2pdma_provider layer cannot be used to create P2P memory that
can be mapped into VMA's, be used with pin_user_pages(), O_DIRECT, and
so on. These use cases must still use the mmap() layer. The
p2pdma_provider layer is principally for DMABUF-like use cases where
DMABUF natively manages the life cycle and access instead of
vmas/pin_user_pages()/struct page.

In addition, remove the bus_off field from pci_p2pdma_map_state since
it duplicates information already available in the pgmap structure.
The bus_offset is only used in one location (pci_p2pdma_bus_addr_map)
and is always identical to pgmap->bus_offset.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Alex Mastro <amastro@fb.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Ankit Agrawal <ankita@nvidia.com>
Link: https://lore.kernel.org/r/20251120-dmabuf-vfio-v9-1-d7f71607f371@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-20 12:00:47 -07:00
Leon Romanovsky 54dbd2a8e9 PCI/P2PDMA: Reduce scope of pci_has_p2pmem()
pci_has_p2pmem() is not used outside of p2pdma.c, and there is no need to
export it for use by modules.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://patch.msgid.link/d40f3f1decf54c9236bc38b48a6aae612a5c182f.1756900291.git.leon@kernel.org
2025-09-03 17:01:39 -05:00
Sungho Kim 6238784e50 PCI/P2PDMA: Fix incorrect pointer usage in devm_kfree() call
The error handling path in pci_p2pdma_add_resource() contains a bug in its
`pgmap_free` label.

Memory is allocated for the `p2p_pgmap` struct, and the pointer is stored
in `p2p_pgmap`. However, the error path calls devm_kfree() with `pgmap`,
which is a pointer to a member field within the `p2p_pgmap` struct, not the
base pointer of the allocation.

Correct the bug by passing the correct base pointer, `p2p_pgmap`, to
devm_kfree().

Signed-off-by: Sungho Kim <sungho.kim@furiosa.ai>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://patch.msgid.link/20250820105714.2939896-1-sungho.kim@furiosa.ai
2025-08-20 15:25:54 -05:00
Thomas Weißschuh fb506e31b3 sysfs: treewide: switch back to attribute_group::bin_attrs
The normal bin_attrs field can now handle const pointers.
This makes the _new variant unnecessary.
Switch all users back.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20250530-sysfs-const-bin_attr-final-v3-4-724bfcf05b99@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-17 10:44:15 +02:00
Christoph Hellwig a25e7962db PCI/P2PDMA: Refactor the p2pdma mapping helpers
The current scheme with a single helper to determine the P2P status
and map a scatterlist segment force users to always use the map_sg
helper to DMA map, which we're trying to get away from because they
are very cache inefficient.

Refactor the code so that there is a single helper that checks the P2P
state for a page, including the result that it is not a P2P page to
simplify the callers, and a second one to perform the address translation
for a bus mapped P2P transfer that does not depend on the scatterlist
structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2025-05-06 08:36:53 +02:00
Alistair Popple 82ba975e4c mm: allow compound zone device pages
Zone device pages are used to represent various type of device memory
managed by device drivers.  Currently compound zone device pages are not
supported.  This is because MEMORY_DEVICE_FS_DAX pages are the only user
of higher order zone device pages and have their own page reference
counting.

A future change will unify FS DAX reference counting with normal page
reference counting rules and remove the special FS DAX reference counting.
Supporting that requires compound zone device pages.

Supporting compound zone device pages requires compound_head() to
distinguish between head and tail pages whilst still preserving the
special struct page fields that are specific to zone device pages.

A tail page is distinguished by having bit zero being set in
page->compound_head, with the remaining bits pointing to the head page. 
For zone device pages page->compound_head is shared with page->pgmap.

The page->pgmap field must be common to all pages within a folio, even if
the folio spans memory sections.  Therefore pgmap is the same for both
head and tail pages and can be moved into the folio and we can use the
standard scheme to find compound_head from a tail page.

Link: https://lkml.kernel.org/r/67055d772e6102accf85161d0b57b0b3944292bf.1740713401.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Asahi Lina <lina@asahilina.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chunyan Zhang <zhang.lyra@gmail.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: linmiaohe <linmiaohe@huawei.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: Michael "Camp Drill Sergeant" Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:39 -07:00
Alistair Popple b7e2823787 mm/mm_init: move p2pdma page refcount initialisation to p2pdma
Currently ZONE_DEVICE page reference counts are initialised by core memory
management code in __init_zone_device_page() as part of the memremap()
call which driver modules make to obtain ZONE_DEVICE pages.  This
initialises page refcounts to 1 before returning them to the driver.

This was presumably done because it drivers had a reference of sorts on
the page.  It also ensured the page could always be mapped with
vm_insert_page() for example and would never get freed (ie.  have a zero
refcount), freeing drivers of manipulating page reference counts.

However it complicates figuring out whether or not a page is free from the
mm perspective because it is no longer possible to just look at the
refcount.  Instead the page type must be known and if GUP is used a
secondary pgmap reference is also sometimes needed.

To simplify this it is desirable to remove the page reference count for
the driver, so core mm can just use the refcount without having to account
for page type or do other types of tracking.  This is possible because
drivers can always assume the page is valid as core kernel will never
offline or remove the struct page.

This means it is now up to drivers to initialise the page refcount as
required.  P2PDMA uses vm_insert_page() to map the page, and that requires
a non-zero reference count when initialising the page so set that when the
page is first mapped.

Link: https://lkml.kernel.org/r/6aedb0ac2886dcc4503cb705273db5b3863a0b66.1740713401.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Asahi Lina <lina@asahilina.net>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chunyan Zhang <zhang.lyra@gmail.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: linmiaohe <linmiaohe@huawei.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: Michael "Camp Drill Sergeant" Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:38 -07:00
Thomas Weißschuh 35fdb302ca PCI/P2PDMA: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Link: https://lore.kernel.org/r/20241202-sysfs-const-bin_attr-pci-v1-3-c32360f495a7@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2024-12-03 15:29:59 -06:00
Thomas Weißschuh 94a20fb9af sysfs: treewide: constify attribute callback of bin_attribute::mmap()
The mmap() callbacks should not modify the struct
bin_attribute passed as argument.
Enforce this by marking the argument as const.

As there are not many callback implementers perform this change
throughout the tree at once.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl
Acked-by: Krzysztof Wilczyński <kw@linux.com>
Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-6-71110628844c@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05 14:00:28 +01:00
Christophe JAILLET 1e5c66afd4 PCI/P2PDMA: Fix a sleeping issue in a RCU read section
It is not allowed to sleep within a RCU read section, so use GFP_ATOMIC
instead of GFP_KERNEL here.

Link: https://lore.kernel.org/r/02d9ec4a10235def0e764ff1f5be881ba12e16e8.1704397858.git.christophe.jaillet@wanadoo.fr
Fixes: ae21f835a5 ("PCI/P2PDMA: Finish RCU conversion of pdev->p2pdma")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2024-02-08 15:31:43 -06:00
Tadeusz Struk 805b196fb3 PCI/P2PDMA: Remove redundant goto
Remove redundant goto in pci_alloc_p2pmem().

Link: https://lore.kernel.org/r/20231023084050.55230-1-tstruk@gmail.com
Signed-off-by: Tadeusz Struk <tstruk@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2023-10-23 12:17:52 -05:00
Gustavo A. R. Silva 4a7ce83349 PCI/P2PDMA: Fix undefined behavior bug in struct pci_p2pdma_pagemap
Struct dev_pagemap is a flexible structure, which means that it contains a
flexible-array member.  If dev_pagemap.nr_range > 1, the memory following
the dev_pagemap could be overwritten.

This is currently not an issue because pci_p2pdma_pagemap is not exposed
outside p2pdma.c, and p2pdma.c only sets dev_pagemap.nr_range to 1.

To prevent problems if p2pdma.c ever uses nr_range > 1, move the flexible
struct dev_pagemap to the end of struct pci_p2pdma_pagemap.

-Wflex-array-member-not-at-end is coming in GCC-14, and we are getting
ready to enable it globally.

Link: https://lore.kernel.org/r/ZRsUL/hATNruwtla@work
Signed-off-by: "Gustavo A. R. Silva" <gustavoars@kernel.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2023-10-03 17:13:38 -05:00
Bjorn Helgaas 86b4ad7d67 PCI: Fix typos in docs and comments
Fix typos in docs and comments.

Link: https://lore.kernel.org/r/20230824193712.542167-11-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25 08:15:38 -05:00
Zheng Zengkai 0e8207f54c PCI/P2PDMA: Use pci_dev_id() to simplify the code
When we have a struct pci_dev *, use pci_dev_id() instead of manually
composing the ID from dev->bus->number and dev->devfn.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20230811111057.31900-1-zhengzengkai@huawei.com
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2023-08-14 12:05:05 -05:00
Cai Huoqing 5376ced54c PCI/P2PDMA: Fix pci_p2pmem_find_many() kernel-doc
Remove reference to pci_p2pmem_dma(), which has never existed.

Link: https://lore.kernel.org/r/20230329024731.5604-1-cai.huoqing@linux.dev
Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2023-04-06 16:37:51 -05:00
Logan Gunthorpe 6606f4c3c4 PCI/P2PDMA: Annotate RCU dereference
A dereference of the __rcu pointer was noticed by sparse:

  drivers/pci/p2pdma.c:199:44: sparse: sparse: dereference of noderef expression

Dereference the __rcu pointer using rcu_dereference_protected() instead of
accessing it directly. It's safe to use rcu_dereference_protected() because
a reference is held on the pgmap's percpu reference counter and thus it
cannot disappear.

Link: https://lore.kernel.org/r/20230209172953.4597-1-logang@deltatee.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2023-02-16 16:31:12 -06:00
Linus Torvalds ce8a79d560 for-6.2/block-2022-12-08
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmOScsgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpi5ID/9pLXFYOq1+uDjU0KO/MdjMjK8Ukr34lCnk
 WkajRLheE8JBKOFDE54XJk56sQSZHX9bTWqziar0h1fioh7FlQR/tVvzsERCm2M9
 2y9THJNJygC68wgybStyiKlshFjl7TD7Kv5N9Y3xP3mkQygT+D6o8fXZk5xQbYyH
 YdFSoq4rJVHxRL03yzQiReGGIYdOUEQQh8l1FiLwLlKa3lXAey1KuxWIzksVN0KK
 aZB4QhiBpOiPgDHUVisq2XtyQjpZ2byoCImPzgrcqk9Jo4esvm/e6esrg4xlsvII
 LKFFkTmbVqjUZtFjqakFHmfuzVor4nU5f+xb90ZHExuuODYckkxWp5rWhf9QwqqI
 0ik6WYgI1/5vnHnX8f2DYzOFQf9qa/rLgg0CshyUODlD6RfHa9vntqYvlIFkmOBd
 Q7KblIoK8YTzUS1M+v7X8JQ7gDR2KwygH37Da2KJS+vgvfIb8kJGr1ZORuhJuJJ7
 Bl69gaNkHTHrqufp7UI64YXfueeuNu2J9z3zwzGoxeaFaofF/phDn0/2gCQE1fQI
 XBhsMw+ETqI6B2SPHMnzYDu2DM1S8ZTOYQlaD4G3uqgWnAM1tG707395uAy5yu4n
 D5azU1fVG4UocoNIyPujpaoSRs2zWZycEFEeUQkhyDDww/j4hlHi6H33eOnk0zsr
 wxzFGfvHfw==
 =k/vv
 -----END PGP SIGNATURE-----

Merge tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:

 - NVMe pull requests via Christoph:
      - Support some passthrough commands without CAP_SYS_ADMIN (Kanchan
        Joshi)
      - Refactor PCIe probing and reset (Christoph Hellwig)
      - Various fabrics authentication fixes and improvements (Sagi
        Grimberg)
      - Avoid fallback to sequential scan due to transient issues (Uday
        Shankar)
      - Implement support for the DEAC bit in Write Zeroes (Christoph
        Hellwig)
      - Allow overriding the IEEE OUI and firmware revision in configfs
        for nvmet (Aleksandr Miloserdov)
      - Force reconnect when number of queue changes in nvmet (Daniel
        Wagner)
      - Minor fixes and improvements (Uros Bizjak, Joel Granados, Sagi
        Grimberg, Christoph Hellwig, Christophe JAILLET)
      - Fix and cleanup nvme-fc req allocation (Chaitanya Kulkarni)
      - Use the common tagset helpers in nvme-pci driver (Christoph
        Hellwig)
      - Cleanup the nvme-pci removal path (Christoph Hellwig)
      - Use kstrtobool() instead of strtobool (Christophe JAILLET)
      - Allow unprivileged passthrough of Identify Controller (Joel
        Granados)
      - Support io stats on the mpath device (Sagi Grimberg)
      - Minor nvmet cleanup (Sagi Grimberg)

 - MD pull requests via Song:
      - Code cleanups (Christoph)
      - Various fixes

 - Floppy pull request from Denis:
      - Fix a memory leak in the init error path (Yuan)

 - Series fixing some batch wakeup issues with sbitmap (Gabriel)

 - Removal of the pktcdvd driver that was deprecated more than 5 years
   ago, and subsequent removal of the devnode callback in struct
   block_device_operations as no users are now left (Greg)

 - Fix for partition read on an exclusively opened bdev (Jan)

 - Series of elevator API cleanups (Jinlong, Christoph)

 - Series of fixes and cleanups for blk-iocost (Kemeng)

 - Series of fixes and cleanups for blk-throttle (Kemeng)

 - Series adding concurrent support for sync queues in BFQ (Yu)

 - Series bringing drbd a bit closer to the out-of-tree maintained
   version (Christian, Joel, Lars, Philipp)

 - Misc drbd fixes (Wang)

 - blk-wbt fixes and tweaks for enable/disable (Yu)

 - Fixes for mq-deadline for zoned devices (Damien)

 - Add support for read-only and offline zones for null_blk
   (Shin'ichiro)

 - Series fixing the delayed holder tracking, as used by DM (Yu,
   Christoph)

 - Series enabling bio alloc caching for IRQ based IO (Pavel)

 - Series enabling userspace peer-to-peer DMA (Logan)

 - BFQ waker fixes (Khazhismel)

 - Series fixing elevator refcount issues (Christoph, Jinlong)

 - Series cleaning up references around queue destruction (Christoph)

 - Series doing quiesce by tagset, enabling cleanups in drivers
   (Christoph, Chao)

 - Series untangling the queue kobject and queue references (Christoph)

 - Misc fixes and cleanups (Bart, David, Dawei, Jinlong, Kemeng, Ye,
   Yang, Waiman, Shin'ichiro, Randy, Pankaj, Christoph)

* tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux: (247 commits)
  blktrace: Fix output non-blktrace event when blk_classic option enabled
  block: sed-opal: Don't include <linux/kernel.h>
  sed-opal: allow using IOC_OPAL_SAVE for locking too
  blk-cgroup: Fix typo in comment
  block: remove bio_set_op_attrs
  nvmet: don't open-code NVME_NS_ATTR_RO enumeration
  nvme-pci: use the tagset alloc/free helpers
  nvme: add the Apple shared tag workaround to nvme_alloc_io_tag_set
  nvme: only set reserved_tags in nvme_alloc_io_tag_set for fabrics controllers
  nvme: consolidate setting the tagset flags
  nvme: pass nr_maps explicitly to nvme_alloc_io_tag_set
  block: bio_copy_data_iter
  nvme-pci: split out a nvme_pci_ctrl_is_dead helper
  nvme-pci: return early on ctrl state mismatch in nvme_reset_work
  nvme-pci: rename nvme_disable_io_queues
  nvme-pci: cleanup nvme_suspend_queue
  nvme-pci: remove nvme_pci_disable
  nvme-pci: remove nvme_disable_admin_queue
  nvme: merge nvme_shutdown_ctrl into nvme_disable_ctrl
  nvme: use nvme_wait_ready in nvme_shutdown_ctrl
  ...
2022-12-13 10:43:59 -08:00
Jason A. Donenfeld 8032bf1233 treewide: use get_random_u32_below() instead of deprecated function
This is a simple mechanical transformation done by:

@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
  (E)

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Reviewed-by: SeongJae Park <sj@kernel.org> # for damon
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-11-18 02:15:15 +01:00
Logan Gunthorpe 7e9c7ef83d PCI/P2PDMA: Allow userspace VMA allocations through sysfs
Create a sysfs bin attribute called "allocate" under the existing
"p2pmem" group. The only allowable operation on this file is the mmap()
call.

When mmap() is called on this attribute, the kernel allocates a chunk of
memory from the genalloc and inserts the pages into the VMA. The
dev_pagemap .page_free callback will indicate when these pages are no
longer used and they will be put back into the genalloc.

On device unbind, remove the sysfs file before the memremap_pages are
cleaned up. This ensures unmap_mapping_range() is called on the files
inode and no new mappings can be created.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20221021174116.7200-9-logang@deltatee.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-09 11:29:21 -07:00
Yang Yingliang e01bae16a7 PCI/P2PDMA: Use for_each_pci_dev() helper
Use for_each_pci_dev() instead of open-coding it.  No functional change.

Link: https://lore.kernel.org/r/20220916140329.679633-1-yangyingliang@huawei.com
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2022-09-19 13:44:38 -05:00
Logan Gunthorpe 0d06132fc8 PCI/P2PDMA: Remove pci_p2pdma_[un]map_sg()
This interface is superseded by support in dma_map_sg() which now supports
heterogeneous scatterlists. There are no longer any users, so remove it.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-07-26 07:28:08 -04:00
Logan Gunthorpe 5e180ff326 PCI/P2PDMA: Introduce helpers for dma_map_sg implementations
Add pci_p2pdma_map_segment() as a helper for dma_map_sg()
implementations. It takes an scatterlist segment that must point to a
pci_p2pdma struct page and will map it if the mapping requires a bus
address.

The return value indicates whether the mapping required a bus address
or whether the caller still needs to map the segment normally. If the
segment should not be mapped, -EREMOTEIO is returned.

This helper uses a state structure to track the changes to the
pgmap across calls and avoid needing to lookup into the xarray for
every page.

The prototype for the helper is added to dma-map-ops.h as it is only
useful to dma map implementations and don't need to pollute the public
pci-p2pdma header.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-07-26 07:27:47 -04:00
Logan Gunthorpe 719c986580 PCI/P2PDMA: Attempt to set map_type if it has not been set
Attempt to find the mapping type for P2PDMA pages on the first
DMA map attempt if it has not been done ahead of time.

Previously, the mapping type was expected to be calculated ahead of
time, but if pages are to come from userspace then there's no
way to ensure the path was checked ahead of time.

This change will calculate the mapping type if it hasn't pre-calculated
so it is no longer invalid to call pci_p2pdma_map_sg() before the mapping
type is calculated, so drop the WARN_ON when that is the case.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-07-26 07:27:47 -04:00
Shlomo Pongratz 1af7c26c59 PCI/P2PDMA: Whitelist Intel Skylake-E Root Ports at any devfn
In 7b94b53db3 ("PCI/P2PDMA: Add Intel Sky Lake-E Root Ports B, C, D to
the whitelist"), Andrew Maier added Skylake-E 2031, 2032, and 2033 Root
Ports to the pci_p2pdma_whitelist[], so we assume P2PDMA between devices
below these ports works.

Previously we only checked the whitelist for a device at devfn 00.0 on the
root bus, which is often a "host bridge".  But these Skylake Root Ports may
be at any devfn and there may be no "host bridge" device.

Generalize pci_host_bridge_dev() so we check the first device on the root
bus, whether it is devfn 00.0 or a PCIe Root Port, against the whitelist.

[bhelgaas: commit log, comment]
Link: https://lore.kernel.org/r/20220410105213.690-2-shlomop@pliops.com
Tested-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Shlomo Pongratz <shlomop@pliops.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Andrew Maier <andrew.maier@eideticom.com>
2022-04-11 13:21:41 -05:00
Michael J. Ruhl feaea1fe8b PCI/P2PDMA: Add Intel 3rd Gen Intel Xeon Scalable Processors to whitelist
In order to do P2P communication the bridge ID of the platform must be in
the P2P device table.

Update the P2P device table with a device ID for the 3rd Gen Intel Xeon
Scalable Processors.

Link: https://lore.kernel.org/r/20220209162801.7647-1-michael.j.ruhl@intel.com
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
2022-02-25 11:03:30 -06:00
Linus Torvalds d0a231f01e pci-v5.17-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmHgpugUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vz59g//eWRLb0j2Vgv84ZH4x1iv6MaBboQr
 2wScnfoN+MIoh+tuM4kRak15X4nB8rJhNZZCzesMUN6PeZvrkoPo4sz/xdzIrA/N
 qY3h8NZ3nC4yCvs/tGem0zZUcSCJsxUAD0eegyMSa142xGIOQTHBSJRflR9osKSo
 bnQlKTkugV8t4kD7NlQ5M3HzN3R+mjsII5JNzCqv2XlzAZG3D8DhPyIpZnRNAOmW
 KiHOVXvQOocfUlvSs5kBlhgR1HgJkGnruCrJ1iDCWQH1Zk0iuVgoZWgVda6Cs3Xv
 gcTJLB7VoSdNZKnct9aMNYPKziHkYc7clilPeDsJs5TlSv3kKERzLj6c/5ZAxFWN
 +RsH+zYHDXJSsL/w0twPnaF5WCuVYUyrs3UiSjUvShKl1T9k9J+Jo8zwUUZx8Xb0
 qXX8jRGMHolBGwPXm2fHEb4bwTUI8emPj29qK4L96KsQ3zKXWB8eGSosxUP52Tti
 RR2WZjkvwlREZCJp6jSEJYkhzoEaVAm8CjKpKUuneX9WcUOsMBSs9k7EXbUy7JeM
 hq5Keuqa8PZo/IK2DYYAchNnBJUDMsWJeduBW12qSmx3J+9victP2qOFu+9skP0a
 85xlO6Cx8beiQh+XnY7jyROvIFuxTnGKHgkq/89Ham/whEzdJ+GRIiYB218kLLCW
 ILdas3C2iiGz99I=
 =Vgg4
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:
   - Use pci_find_vsec_capability() instead of open-coding it (Andy
     Shevchenko)
   - Convert pci_dev_present() stub from macro to static inline to avoid
     'unused variable' errors (Hans de Goede)
   - Convert sysfs slot attributes from default_attrs to default_groups
     (Greg Kroah-Hartman)
   - Use DWORD accesses for LTR, L1 SS to avoid BayHub OZ711LV2 erratum
     (Rajat Jain)
   - Remove unnecessary initialization of static variables (Longji Guo)

  Resource management:
   - Always write Intel I210 ROM BAR on update to work around device
     defect (Bjorn Helgaas)

  PCIe native device hotplug:
   - Fix pciehp lockdep errors on Thunderbolt undock (Hans de Goede)
   - Fix infinite loop in pciehp IRQ handler on power fault (Lukas
     Wunner)

  Power management:
   - Convert amd64-agp, sis-agp, via-agp from legacy PCI power
     management to generic power management (Vaibhav Gupta)

  IOMMU:
   - Add function 1 DMA alias quirk for Marvell 88SE9125 SATA controller
     so it can work with an IOMMU (Yifeng Li)

  Error handling:
   - Add PCI_ERROR_RESPONSE and related definitions for signaling and
     checking for transaction errors on PCI (Naveen Naidu)
   - Fabricate PCI_ERROR_RESPONSE data (~0) in config read wrappers,
     instead of in host controller drivers, when transactions fail on
     PCI (Naveen Naidu)
   - Use PCI_POSSIBLE_ERROR() to check for possible failure of config
     reads (Naveen Naidu)

  Peer-to-peer DMA:
   - Add Logan Gunthorpe as P2PDMA maintainer (Bjorn Helgaas)

  ASPM:
   - Calculate link L0s and L1 exit latencies when needed instead of
     caching them (Saheed O. Bolarinwa)
   - Calculate device L0s and L1 acceptable exit latencies when needed
     instead of caching them (Saheed O. Bolarinwa)
   - Remove struct aspm_latency since it's no longer needed (Saheed O.
     Bolarinwa)

  APM X-Gene PCIe controller driver:
   - Fix IB window setup, which was broken by the fact that IB resources
     are now sorted in address order instead of DT dma-ranges order (Rob
     Herring)

  Apple PCIe controller driver:
   - Enable clock gating to save power (Hector Martin)
   - Fix REFCLK1 enable/poll logic (Hector Martin)

  Broadcom STB PCIe controller driver:
   - Declare bitmap correctly for use by bitmap interfaces (Christophe
     JAILLET)
   - Clean up computation of legacy and non-legacy MSI bitmasks (Florian
     Fainelli)
   - Update suspend/resume/remove error handling to warn about errors
     and not fail the operation (Jim Quinlan)
   - Correct the "pcie" and "msi" interrupt descriptions in DT binding
     (Jim Quinlan)
   - Add DT bindings for endpoint voltage regulators (Jim Quinlan)
   - Split brcm_pcie_setup() into two functions (Jim Quinlan)
   - Add mechanism for turning on voltage regulators for connected
     devices (Jim Quinlan)
   - Turn voltage regulators for connected devices on/off when bus is
     added or removed (Jim Quinlan)
   - When suspending, don't turn off voltage regulators for wakeup
     devices (Jim Quinlan)

  Freescale i.MX6 PCIe controller driver:
   - Add i.MX8MM support (Richard Zhu)

  Freescale Layerscape PCIe controller driver:
   - Use DWC common ops instead of layerscape-specific link-up functions
     (Hou Zhiqiang)

  Intel VMD host bridge driver:
   - Honor platform ACPI _OSC feature negotiation for Root Ports below
     VMD (Kai-Heng Feng)
   - Add support for Raptor Lake SKUs (Karthik L Gopalakrishnan)
   - Reset everything below VMD before enumerating to work around
     failure to enumerate NVMe devices when guest OS reboots (Nirmal
     Patel)

  Bridge emulation (used by Marvell Aardvark and MVEBU):
   - Make emulated ROM BAR read-only by default (Pali Rohár)
   - Make some emulated legacy PCI bits read-only for PCIe devices (Pali
     Rohár)
   - Update reserved bits in emulated PCIe Capability (Pali Rohár)
   - Allow drivers to emulate different PCIe Capability versions (Pali
     Rohár)
   - Set emulated Capabilities List bit for all PCIe devices, since they
     must have at least a PCIe Capability (Pali Rohár)

  Marvell Aardvark PCIe controller driver:
   - Add bridge emulation definitions for PCIe DEVCAP2, DEVCTL2,
     DEVSTA2, LNKCAP2, LNKCTL2, LNKSTA2, SLTCAP2, SLTCTL2, SLTSTA2 (Pali
     Rohár)
   - Add aardvark support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2
     registers (Pali Rohár)
   - Clear all MSIs at setup to avoid spurious interrupts (Pali Rohár)
   - Disable bus mastering when unbinding host controller driver (Pali
     Rohár)
   - Mask all interrupts when unbinding host controller driver (Pali
     Rohár)
   - Fix memory leak in host controller unbind (Pali Rohár)
   - Assert PERST# when unbinding host controller driver (Pali Rohár)
   - Disable link training when unbinding host controller driver (Pali
     Rohár)
   - Disable common PHY when unbinding host controller driver (Pali
     Rohár)
   - Fix resource type checking to check only IORESOURCE_MEM, not
     IORESOURCE_MEM_64, which is a flavor of IORESOURCE_MEM (Pali Rohár)

  Marvell MVEBU PCIe controller driver:
   - Implement pci_remap_iospace() for ARM so mvebu can use
     devm_pci_remap_iospace() instead of the previous ARM-specific
     pci_ioremap_io() interface (Pali Rohár)
   - Use the standard pci_host_probe() instead of the device-specific
     mvebu_pci_host_probe() (Pali Rohár)
   - Replace all uses of ARM-specific pci_ioremap_io() with the ARM
     implementation of the standard pci_remap_iospace() interface and
     remove pci_ioremap_io() (Pali Rohár)
   - Skip initializing invalid Root Ports (Pali Rohár)
   - Check for errors from pci_bridge_emul_init() (Pali Rohár)
   - Ignore any bridges at non-zero function numbers (Pali Rohár)
   - Return ~0 data for invalid config read size (Pali Rohár)
   - Disallow mapping interrupts on emulated bridges (Pali Rohár)
   - Clear Root Port Memory & I/O Space Enable and Bus Master Enable at
     initialization (Pali Rohár)
   - Make type bits in Root Port I/O Base register read-only (Pali
     Rohár)
   - Disable Root Port windows when base/limit set to invalid values
     (Pali Rohár)
   - Set controller to Root Complex mode (Pali Rohár)
   - Set Root Port Class Code to PCI Bridge (Pali Rohár)
   - Update emulated Root Port secondary bus numbers to better reflect
     the actual topology (Pali Rohár)
   - Add PCI_BRIDGE_CTL_BUS_RESET support to emulated Root Ports so
     pci_reset_secondary_bus() can reset connected devices (Pali Rohár)
   - Add PCI_EXP_DEVCTL Error Reporting Enable support to emulated Root
     Ports (Pali Rohár)
   - Add PCI_EXP_RTSTA PME Status bit support to emulated Root Ports
     (Pali Rohár)
   - Add DEVCAP2, DEVCTL2 and LNKCTL2 support to emulated Root Ports on
     Armada XP and newer devices (Pali Rohár)
   - Export mvebu-mbus.c symbols to allow pci-mvebu.c to be a module
     (Pali Rohár)
   - Add support for compiling as a module (Pali Rohár)

  MediaTek PCIe controller driver:
   - Assert PERST# for 100ms to allow power and clock to stabilize
     (qizhong cheng)

  MediaTek PCIe Gen3 controller driver:
   - Disable Mediatek DVFSRC voltage request since lack of DVFSRC to
     respond to the request causes failure to exit L1 PM Substate
     (Jianjun Wang)

  MediaTek MT7621 PCIe controller driver:
   - Declare mt7621_pci_ops static (Sergio Paracuellos)
   - Give pcibios_root_bridge_prepare() access to host bridge windows
     (Sergio Paracuellos)
   - Move MIPS I/O coherency unit setup from driver to
     pcibios_root_bridge_prepare() (Sergio Paracuellos)
   - Add missing MODULE_LICENSE() (Sergio Paracuellos)
   - Allow COMPILE_TEST for all arches (Sergio Paracuellos)

  Microsoft Hyper-V host bridge driver:
   - Add hv-internal interfaces to encapsulate arch IRQ dependencies
     (Sunil Muthuswamy)
   - Add arm64 Hyper-V vPCI support (Sunil Muthuswamy)

  Qualcomm PCIe controller driver:
   - Undo PM setup in qcom_pcie_probe() error handling path (Christophe
     JAILLET)
   - Use __be16 type to store return value from cpu_to_be16()
     (Manivannan Sadhasivam)
   - Constify static dw_pcie_ep_ops (Rikard Falkeborn)

  Renesas R-Car PCIe controller driver:
   - Fix aarch32 abort handler so it doesn't check the wrong bus clock
     before accessing the host controller (Marek Vasut)

  TI Keystone PCIe controller driver:
   - Add register offset for ti,syscon-pcie-id and ti,syscon-pcie-mode
     DT properties (Kishon Vijay Abraham I)

  MicroSemi Switchtec management driver:
   - Add Gen4 automotive device IDs (Kelvin Cao)
   - Declare state_names[] as static so it's not allocated and
     initialized for every call (Kelvin Cao)

  Host controller driver cleanups:
   - Use of_device_get_match_data(), not of_match_device(), when we only
     need the device data in altera, artpec6, cadence, designware-plat,
     dra7xx, keystone, kirin (Fan Fei)
   - Drop pointless of_device_get_match_data() cast in j721e (Bjorn
     Helgaas)
   - Drop redundant struct device * from j721e since struct cdns_pcie
     already has one (Bjorn Helgaas)
   - Rename driver structs to *_pcie in intel-gw, iproc, ls-gen4,
     mediatek-gen3, microchip, mt7621, rcar-gen2, tegra194, uniphier,
     xgene, xilinx, xilinx-cpm for consistency across drivers (Fan Fei)
   - Fix invalid address space conversions in hisi, spear13xx (Bjorn
     Helgaas)

  Miscellaneous:
   - Sort Intel Device IDs by value (Andy Shevchenko)
   - Change Capability offsets to hex to match spec (Baruch Siach)
   - Correct misspellings (Krzysztof Wilczyński)
   - Terminate statement with semicolon in pci_endpoint_test.c (Ming
     Wang)"

* tag 'pci-v5.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (151 commits)
  PCI: mt7621: Allow COMPILE_TEST for all arches
  PCI: mt7621: Add missing MODULE_LICENSE()
  PCI: mt7621: Move MIPS setup to pcibios_root_bridge_prepare()
  PCI: Let pcibios_root_bridge_prepare() access bridge->windows
  PCI: mt7621: Declare mt7621_pci_ops static
  PCI: brcmstb: Do not turn off WOL regulators on suspend
  PCI: brcmstb: Add control of subdevice voltage regulators
  PCI: brcmstb: Add mechanism to turn on subdev regulators
  PCI: brcmstb: Split brcm_pcie_setup() into two funcs
  dt-bindings: PCI: Add bindings for Brcmstb EP voltage regulators
  dt-bindings: PCI: Correct brcmstb interrupts, interrupt-map.
  PCI: brcmstb: Fix function return value handling
  PCI: brcmstb: Do not use __GENMASK
  PCI: brcmstb: Declare 'used' as bitmap, not unsigned long
  PCI: hv: Add arm64 Hyper-V vPCI support
  PCI: hv: Make the code arch neutral by adding arch specific interfaces
  PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors
  x86/PCI: Remove initialization of static variables to false
  PCI: Use DWORD accesses for LTR, L1 SS to avoid erratum
  misc: pci_endpoint_test: Terminate statement with semicolon
  ...
2022-01-16 08:08:11 +02:00
Christophe JAILLET 69f457b18f PCI/P2PDMA: Use percpu_ref_tryget_live_rcu() inside RCU critical section
Since pci_alloc_p2pmem() has already called rcu_read_lock(), we're in an
RCU read-side critical section and don't need to take the lock again.  Use
percpu_ref_tryget_live_rcu() instead of percpu_ref_tryget_live() to save a
few cycles.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/ab80164f4d5b32f9e6240aa4863c3a147ff9c89f.1635974126.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2021-12-15 16:13:13 -06:00
Christoph Hellwig b80892ca02 memremap: remove support for external pgmap refcounts
No driver is left using the external pgmap refcount, so remove the
code to support it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211028151017.50234-1-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-12-04 12:46:09 -08:00
Bjorn Helgaas ebf275b856 Merge branch 'pci/sysfs'
- Check for CAP_SYS_ADMIN before validating sysfs user input, not after
  (Krzysztof Wilczyński)

- Always return -EINVAL from sysfs "store" functions for invalid user input
  instead of -EINVAL sometimes and -ERANGE others (Krzysztof Wilczyński)

- Use kstrtobool() directly instead of the strtobool() wrapper (Krzysztof
  Wilczyński)

* pci/sysfs:
  PCI: Use kstrtobool() directly, sans strtobool() wrapper
  PCI/sysfs: Return -EINVAL consistently from "store" functions
  PCI/sysfs: Check CAP_SYS_ADMIN before parsing user input

# Conflicts:
#	drivers/pci/iov.c
2021-11-05 11:28:46 -05:00
Krzysztof Wilczyński e0f7b19223 PCI: Use kstrtobool() directly, sans strtobool() wrapper
strtobool() is a wrapper around kstrtobool() that has been added for
backward compatibility.

There is no reason to use the old API, so use kstrtobool() directly.

Related: ef95159907 ("lib: move strtobool() to kstrtobool()")

Link: https://lore.kernel.org/r/20210915230127.2495723-3-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-09-28 18:06:29 -05:00
Wang Lu 3a19407913 PCI/P2PDMA: Apply bus offset correctly in DMA address calculation
The bus offset is bus address - physical address, so the calculation in
__pci_p2pdma_map_sg should be: bus address = physical address + bus offset.

Correct the dma_address computation in __pci_p2pdma_map_sg().

Link: https://lore.kernel.org/r/20210909032528.24517-1-wanglu@dapustor.com
Signed-off-by: Wang Lu <wanglu@dapustor.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2021-09-20 14:46:02 -05:00
Bjorn Helgaas 7132700067 Merge branch 'pci/sysfs'
- Fix dsm_label_utf16s_to_utf8s() buffer overrun (Krzysztof Wilczyński)

- Use sysfs_emit() and sysfs_emit_at() in "show" functions (Krzysztof
  Wilczyński)

- Fix 'resource_alignment' newline issues (Krzysztof Wilczyński)

- Add newline to 'devspec' sysfs file (Krzysztof Wilczyński)

* pci/sysfs:
  PCI/sysfs: Add 'devspec' newline
  PCI/sysfs: Fix 'resource_alignment' newline issues
  PCI/sysfs: Use sysfs_emit() and sysfs_emit_at() in "show" functions
  PCI/sysfs: Rely on lengths from scnprintf(), dsm_label_utf16s_to_utf8s()
  PCI/sysfs: Fix dsm_label_utf16s_to_utf8s() buffer overrun

# Conflicts:
#	drivers/pci/p2pdma.c
2021-07-06 10:56:25 -05:00
Eric Dumazet ae21f835a5 PCI/P2PDMA: Finish RCU conversion of pdev->p2pdma
While looking at pci_alloc_p2pmem() I found RCU protection was not properly
applied there, as pdev->p2pdma was potentially read multiple times.

Fix pci_alloc_p2pmem(), add __rcu qualifier to p2pdma field of struct
pci_dev, and fix all other accesses to this field with proper RCU verbs.

Link: https://lore.kernel.org/r/20210701095621.3129283-1-eric.dumazet@gmail.com
Fixes: 1570175abd ("PCI/P2PDMA: track pgmap references per resource, not globally")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
2021-07-06 10:56:02 -05:00
Christoph Hellwig d1b8dc09dd PCI/P2PDMA: Simplify distance calculation
Merge __calc_map_type_and_dist() and calc_map_type_and_dist_warn() into
calc_map_type_and_dist() to simplify the code a bit.  This now means we add
the devfn strings to the acs_buf unconditionally even if the buffer is not
printed, but that is not a lot of overhead and keeps the code much simpler.

Link: https://lore.kernel.org/r/20210614055310.3960791-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2021-06-16 16:09:41 -05:00
Logan Gunthorpe 3ec0c3ec2d PCI/P2PDMA: Avoid pci_get_slot(), which may sleep
In order to use upstream_bridge_distance_warn() from a dma_map function, it
must not sleep. However, pci_get_slot() takes the pci_bus_sem so it might
sleep.

In order to avoid this, try to get the host bridge's device from the first
element in the device list. It should be impossible for the host bridge's
device to go away while references are held on child devices, so the first
element should not be able to change and, thus, this should be safe.

Introduce a static function called pci_host_bridge_dev() to obtain the host
bridge's root device.

Link: https://lore.kernel.org/r/20210610160609.28447-7-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-06-10 18:01:55 -05:00
Logan Gunthorpe 7e2faa1710 PCI/P2PDMA: Refactor pci_p2pdma_map_type()
All callers of pci_p2pdma_map_type() have a struct dev_pgmap and a struct
device (of the client doing the DMA transfer). Thus move the conversion to
struct pci_devs for the provider and client into this function.

Link: https://lore.kernel.org/r/20210610160609.28447-6-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-06-10 18:01:41 -05:00
Logan Gunthorpe cf201bfe8c PCI/P2PDMA: Warn if host bridge not in whitelist
If the host bridge is not in the whitelist print a warning in the
calc_map_type_and_dist_warn() path detailing the vendor and device IDs that
would need to be added to the whitelist.

Suggested-by: Don Dutile <ddutile@redhat.com>
Link: https://lore.kernel.org/r/20210610160609.28447-5-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-06-10 18:01:41 -05:00
Logan Gunthorpe f9c125b9eb PCI/P2PDMA: Use correct calc_map_type_and_dist() return type
Instead of using an int for the return value of this function, use the
correct enum pci_p2pdma_map_type.

Link: https://lore.kernel.org/r/20210610160609.28447-4-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-06-10 18:01:41 -05:00
Logan Gunthorpe e4ece59abd PCI/P2PDMA: Collect acs list in stack buffer to avoid sleeping
In order to call the calc_map_type_and_dist_warn() function from a dma_map
operation, the function must not sleep. The only reason it sleeps is to
allocate memory for the seq_buf to print a verbose warning telling the user
how to disable ACS for that path.

Instead of allocating the memory with kmalloc(), allocate a smaller buffer
on the stack. A 128 byte buffer is enough to print 10 PCI device names. A
system with 10 bridge ports between two devices that have ACS enabled would
be unusually large, so this should still be a reasonable limit.

This also cleans up the awkward (and broken) return with -ENOMEM which
contradicts the return type and the caller was not prepared for.

Link: https://lore.kernel.org/r/20210610160609.28447-3-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-06-10 18:01:41 -05:00
Logan Gunthorpe 6389d43745 PCI/P2PDMA: Rename upstream_bridge_distance() and rework doc
The function upstream_bridge_distance() has evolved such that its name is
no longer entirely reflective of what the function does.  It not only
calculates the distance between two peers but also calculates how the DMA
addresses for those two peers should be mapped.

Rename it to calc_map_type_and_dist() and rework the documentation to
better describe the two pieces of information the function returns.

[bhelgaas: tweak comment wording]
Link: https://lore.kernel.org/r/20210610160609.28447-2-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-06-10 18:01:24 -05:00
Krzysztof Wilczyński f8cf6e513e PCI/sysfs: Use sysfs_emit() and sysfs_emit_at() in "show" functions
The sysfs_emit() and sysfs_emit_at() functions were introduced to make
it less ambiguous which function is preferred when writing to the output
buffer in a device attribute's "show" callback [1].

Convert the PCI sysfs object "show" functions from sprintf(), snprintf()
and scnprintf() to sysfs_emit() and sysfs_emit_at() accordingly, as the
latter is aware of the PAGE_SIZE buffer and correctly returns the number
of bytes written into the buffer.

No functional change intended.

[1] Documentation/filesystems/sysfs.rst

Related commit: ad025f8e46 ("PCI/sysfs: Use sysfs_emit() and
sysfs_emit_at() in "show" functions").

Link: https://lore.kernel.org/r/20210603000112.703037-2-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2021-06-03 22:14:47 -05:00
Linus Torvalds 009bd55dfc RDMA 5.11 pull request
A smaller set of patches, nothing stands out as being particularly major
 this cycle:
 
 - Driver bug fixes and updates: bnxt_re, cxgb4, rxe, hns, i40iw, cxgb4,
   mlx4 and mlx5
 
 - Bug fixes and polishing for the new rts ULP
 
 - Cleanup of uverbs checking for allowed driver operations
 
 - Use sysfs_emit all over the place
 
 - Lots of bug fixes and clarity improvements for hns
 
 - hip09 support for hns
 
 - NDR and 50/100Gb signaling rates
 
 - Remove dma_virt_ops and go back to using the IB DMA wrappers
 
 - mlx5 optimizations for contiguous DMA regions
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl/aNXUACgkQOG33FX4g
 mxqlMQ/+O6UhxKnDAnMB+HzDGvOm+KXNHOQBuzxz4ZWXqtUrW8WU5ca3PhXovc4z
 /QX0HhMhQmVsva5mjp1OGVATxQ2E+yasqFLg4QXAFWFR3N7s0u/sikE9i1DoPvOC
 lsmLTeRauCFaE4mJD5nvYwm+riECX0GmyVVW7v6V05xwAp0hwdhyU7Kb6Yh3lxsE
 umTz+onPNJcD6Tc4snziyC5QEp5ebEjAaj4dVI1YPR5X0c2RwC5E1CIDI6u4OQ2k
 j7/+Kvo8LNdYNERGiR169x6c1L7WS6dYnGMMeXRgyy0BVbVdRGDnvCV9VRmF66w5
 99fHfDjNMNmqbGNt/4/gwNdVrR9aI4jMZWCh7SmsguX6XwNOlhYldy3x3WnlkfkQ
 e4O0huJceJqcB2Uya70GqufnAetRXsbjzcvWxpR5YAwRmcRkm1f6aGK3BxPjWEbr
 BbYRpiKMxxT4yTe65BuuThzx6g4pNQHe0z3BM/dzMJQAX+PZcs1CPQR8F8PbCrZR
 Ad7qw4HJ587PoSxPi3toVMpYZRP6cISh1zx9q/JCj8cxH9Ri4MovUCS3cF63Ny3B
 1LJ2q0x8FuLLjgZJogKUyEkS8OO6q7NL8WumjvrYWWx19+jcYsV81jTRGSkH3bfY
 F7Esv5K2T1F2gVsCe1ZFFplQg6ja1afIcc+LEl8cMJSyTdoSub4=
 =9t8b
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A smaller set of patches, nothing stands out as being particularly
  major this cycle. The biggest item would be the new HIP09 HW support
  from HNS, otherwise it was pretty quiet for new work here:

   - Driver bug fixes and updates: bnxt_re, cxgb4, rxe, hns, i40iw,
     cxgb4, mlx4 and mlx5

   - Bug fixes and polishing for the new rts ULP

   - Cleanup of uverbs checking for allowed driver operations

   - Use sysfs_emit all over the place

   - Lots of bug fixes and clarity improvements for hns

   - hip09 support for hns

   - NDR and 50/100Gb signaling rates

   - Remove dma_virt_ops and go back to using the IB DMA wrappers

   - mlx5 optimizations for contiguous DMA regions"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (147 commits)
  RDMA/cma: Don't overwrite sgid_attr after device is released
  RDMA/mlx5: Fix MR cache memory leak
  RDMA/rxe: Use acquire/release for memory ordering
  RDMA/hns: Simplify AEQE process for different types of queue
  RDMA/hns: Fix inaccurate prints
  RDMA/hns: Fix incorrect symbol types
  RDMA/hns: Clear redundant variable initialization
  RDMA/hns: Fix coding style issues
  RDMA/hns: Remove unnecessary access right set during INIT2INIT
  RDMA/hns: WARN_ON if get a reserved sl from users
  RDMA/hns: Avoid filling sl in high 3 bits of vlan_id
  RDMA/hns: Do shift on traffic class when using RoCEv2
  RDMA/hns: Normalization the judgment of some features
  RDMA/hns: Limit the length of data copied between kernel and userspace
  RDMA/mlx4: Remove bogus dev_base_lock usage
  RDMA/uverbs: Fix incorrect variable type
  RDMA/core: Do not indicate device ready when device enablement fails
  RDMA/core: Clean up cq pool mechanism
  RDMA/core: Update kernel documentation for ib_create_named_qp()
  MAINTAINERS: SOFT-ROCE: Change Zhu Yanjun's email address
  ...
2020-12-16 13:42:26 -08:00
Mauro Carvalho Chehab 2f0cd59c6f PCI: Fix kernel-doc markup
Update kernel-doc so the names in the doc match the prototypes.

Link: https://lore.kernel.org/r/f19caf7a68f8365c8b573a42b4ac89ec21925c73.1603469755.git.mchehab+huawei@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-11-30 11:39:11 -06:00
Christoph Hellwig 73063ec58c PCI/P2PDMA: Cleanup __pci_p2pdma_map_sg a bit
Remove the pointless paddr variable that was only used once.

Link: https://lore.kernel.org/r/20201106181941.1878556-10-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-17 15:22:07 -04:00
Christoph Hellwig 4d34d52c25 PCI/P2PDMA: Remove the DMA_VIRT_OPS hacks
Now that all users of dma_virt_ops are gone we can remove the workaround
for it in the PCI peer to peer code.

Link: https://lore.kernel.org/r/20201106181941.1878556-9-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-17 15:22:07 -04:00
Linus Torvalds 00937f36b0 pci-v5.10-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl+QUFkUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vw6SQ/9FHiAlHIa48/l5ZweqAuN3XnU8hoO
 sqMoJE8eqTkIYIT0aQdW6b1sDB0YE6b4UVxzg+UL/E0qYeJqgIUakig7QkyyF1qU
 aT5hq2ic+lk88G7AAxK3kgQGPk+JvP1EFIyOu6HBWzzDDzgLme1Iuh/5ulc2/lo+
 E4biy0WOnI8vMfCieXGK4bSpc17Rn0+3N4cuVwZXBlntsvicE90VqeWBzqti1sk5
 R6gkZuW+EIUNHHL7TLlkCeYZq6QNbXWzhfKCiaGW2wW4eJ4Ek1/ncQjyTbCFytKU
 7OIYvrH20XO3L5GEfJ5fdbWErI1dRpoHO4NmhWljyBcVh44VYnM2ixhA7TuJ+TOk
 OtMbtoJAlP+QDlVdAW6rmRYmMPLFK/AQl5Aq7ftY22b2rYXqP20BobPy2MpDT71T
 sGC8z0ABl/ijo23g3I+3/2VzP/RzGhZJ0ZqagrXj8jHtg8SVy2fLcR5nr/dlrgFk
 TG83zML6ui1KViyx5nzElaEtw18aTqP61CNQxijQtNoYwKBTtRKNTrdRr4Qo7Hi6
 6S+No3+4z8Kf8d90y0LkJQqr7JRkG6nI3AhXHO3rxXpXJOD2+QzlpwBZTQnASqq7
 3kC1doUPmN97rFUYPQWWyOs6xSMcGbGIz8Uus3shH6yDtNxgpnIVoctH55hTEh6w
 nSY/4ssIfzJxZCE=
 =RCFo
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:
   - Print IRQ number used by PCIe Link Bandwidth Notification (Dongdong
     Liu)
   - Add schedule point in pci_read_config() to reduce max latency
     (Jiang Biao)
   - Add Kconfig options for MPS/MRRS strategy (Jim Quinlan)

  Resource management:
   - Fix pci_iounmap() memory leak when !CONFIG_GENERIC_IOMAP (Lorenzo
     Pieralisi)

  PCIe native device hotplug:
   - Reduce noisiness on hot removal (Lukas Wunner)

  Power management:
   - Revert "PCI/PM: Apply D2 delay as milliseconds, not microseconds"
     that was done on the basis of spec typo (Bjorn Helgaas)
   - Rename pci_dev.d3_delay to d3hot_delay to remove D3hot/D3cold
     ambiguity (Krzysztof Wilczyński)
   - Remove unused pcibios_pm_ops (Vaibhav Gupta)

  IOMMU:
   - Enable Translation Blocking for external devices to harden against
     DMA attacks (Rajat Jain)

  Error handling:
   - Add an ACPI APEI notifier chain for vendor CPER records to enable
     device-specific error handling (Shiju Jose)

  ASPM:
   - Remove struct aspm_register_info to simplify code (Saheed O.
     Bolarinwa)

  Amlogic Meson PCIe controller driver:
   - Build as module by default (Kevin Hilman)

  Ampere Altra PCIe controller driver:
   - Add MCFG quirk to work around non-standard ECAM implementation
     (Tuan Phan)

  Broadcom iProc PCIe controller driver:
   - Set affinity mask on MSI interrupts (Mark Tomlinson)

  Broadcom STB PCIe controller driver:
   - Make PCIE_BRCMSTB depend on ARCH_BRCMSTB (Jim Quinlan)
   - Add DT bindings for more Brcmstb chips (Jim Quinlan)
   - Add bcm7278 register info (Jim Quinlan)
   - Add bcm7278 PERST# support (Jim Quinlan)
   - Add suspend and resume pm_ops (Jim Quinlan)
   - Add control of rescal reset (Jim Quinlan)
   - Set additional internal memory DMA viewport sizes (Jim Quinlan)
   - Accommodate MSI for older chips (Jim Quinlan)
   - Set bus max burst size by chip type (Jim Quinlan)
   - Add support for bcm7211, bcm7216, bcm7445, bcm7278 (Jim Quinlan)

  Freescale i.MX6 PCIe controller driver:
   - Use dev_err_probe() to reduce redundant messages (Anson Huang)

  Freescale Layerscape PCIe controller driver:
   - Enforce 4K DMA buffer alignment in endpoint test (Hou Zhiqiang)
   - Add DT compatible strings for ls1088a, ls2088a (Xiaowei Bao)
   - Add endpoint support for ls1088a, ls2088a (Xiaowei Bao)
   - Add endpoint test support for lS1088a (Xiaowei Bao)
   - Add MSI-X support for ls1088a (Xiaowei Bao)

  HiSilicon HIP PCIe controller driver:
   - Handle HIP-specific errors via ACPI APEI (Yicong Yang)

  HiSilicon Kirin PCIe controller driver:
   - Return -EPROBE_DEFER if the GPIO isn't ready (Bean Huo)

  Intel VMD host bridge driver:
   - Factor out physical offset, bus offset, IRQ domain, IRQ allocation
     (Jon Derrick)
   - Use generic PCI PM correctly (Jon Derrick)

  Marvell Aardvark PCIe controller driver:
   - Fix compilation on s390 (Pali Rohár)
   - Implement driver 'remove' function and allow to build it as module
     (Pali Rohár)
   - Move PCIe reset card code to advk_pcie_train_link() (Pali Rohár)
   - Convert mvebu a3700 internal SMCC firmware return codes to errno
     (Pali Rohár)
   - Fix initialization with old Marvell's Arm Trusted Firmware (Pali
     Rohár)

  Microsoft Hyper-V host bridge driver:
   - Fix hibernation in case interrupts are not re-created (Dexuan Cui)

  NVIDIA Tegra PCIe controller driver:
   - Stop checking return value of debugfs_create() functions (Greg
     Kroah-Hartman)
   - Convert to use DEFINE_SEQ_ATTRIBUTE macro (Liu Shixin)

  Qualcomm PCIe controller driver:
   - Reset PCIe to work around Qsdk U-Boot issue (Ansuel Smith)

  Renesas R-Car PCIe controller driver:
   - Add DT documentation for r8a774a1, r8a774b1, r8a774e1 endpoints
     (Lad Prabhakar)
   - Add RZ/G2M, RZ/G2N, RZ/G2H IDs to endpoint test (Lad Prabhakar)
   - Add DT support for r8a7742 (Lad Prabhakar)

  Socionext UniPhier Pro5 controller driver:
   - Add DT descriptions of iATU register (host and endpoint) (Kunihiko
     Hayashi)

  Synopsys DesignWare PCIe controller driver:
   - Add link up check in dw_child_pcie_ops.map_bus() (racy, but seems
     unavoidable) (Hou Zhiqiang)
   - Fix endpoint Header Type check so multi-function devices work (Hou
     Zhiqiang)
   - Skip PCIE_MSI_INTR0* programming if MSI is disabled (Jisheng Zhang)
   - Stop leaking MSI page in suspend/resume (Jisheng Zhang)
   - Add common iATU register support instead of keystone-specific code
     (Kunihiko Hayashi)
   - Major config space access and other cleanups in dwc core and
     drivers that use it (al, exynos, histb, imx6, intel-gw, keystone,
     kirin, meson, qcom, tegra) (Rob Herring)
   - Add multiple PFs support for endpoint (Xiaowei Bao)
   - Add MSI-X doorbell mode in endpoint mode (Xiaowei Bao)

  Miscellaneous:
   - Use fallthrough pseudo-keyword (Gustavo A. R. Silva)
   - Fix "0 used as NULL pointer" warnings (Gustavo Pimentel)
   - Fix "cast truncates bits from constant value" warnings (Gustavo
     Pimentel)
   - Remove redundant zeroing for sg_init_table() (Julia Lawall)
   - Use scnprintf(), not snprintf(), in sysfs "show" functions
     (Krzysztof Wilczyński)
   - Remove unused assignments (Krzysztof Wilczyński)
   - Fix "0 used as NULL pointer" warning (Krzysztof Wilczyński)
   - Simplify bool comparisons (Krzysztof Wilczyński)
   - Use for_each_child_of_node() and for_each_node_by_name() (Qinglang
     Miao)
   - Simplify return expressions (Qinglang Miao)"

* tag 'pci-v5.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (147 commits)
  PCI: vmd: Update VMD PM to correctly use generic PCI PM
  PCI: vmd: Create IRQ allocation helper
  PCI: vmd: Create IRQ Domain configuration helper
  PCI: vmd: Create bus offset configuration helper
  PCI: vmd: Create physical offset helper
  PCI: v3-semi: Remove unneeded break
  PCI: dwc: Add link up check in dw_child_pcie_ops.map_bus()
  PCI/ASPM: Remove struct pcie_link_state.l1ss
  PCI/ASPM: Remove struct aspm_register_info.l1ss_cap
  PCI/ASPM: Pass L1SS Capabilities value, not struct aspm_register_info
  PCI/ASPM: Remove struct aspm_register_info.l1ss_ctl1
  PCI/ASPM: Remove struct aspm_register_info.l1ss_ctl2 (unused)
  PCI/ASPM: Remove struct aspm_register_info.l1ss_cap_ptr
  PCI/ASPM: Remove struct aspm_register_info.latency_encoding
  PCI/ASPM: Remove struct aspm_register_info.enabled
  PCI/ASPM: Remove struct aspm_register_info.support
  PCI/ASPM: Use 'parent' and 'child' for readability
  PCI/ASPM: Move LTR path check to where it's used
  PCI/ASPM: Move pci_clear_and_set_dword() earlier
  PCI: dwc: Fix MSI page leakage in suspend/resume
  ...
2020-10-22 12:41:00 -07:00