Commit Graph

1647 Commits

Author SHA1 Message Date
Linus Torvalds 7203ca412f Significant patch series in this merge are as follows:
- The 10 patch series "__vmalloc()/kvmalloc() and no-block support" from
   Uladzislau Rezki reworks the vmalloc() code to support non-blocking
   allocations (GFP_ATOIC, GFP_NOWAIT).
 
 - The 2 patch series "ksm: fix exec/fork inheritance" from xu xin fixes
   a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not inherited
   across fork/exec.
 
 - The 4 patch series "mm/zswap: misc cleanup of code and documentations"
   from SeongJae Park does some light maintenance work on the zswap code.
 
 - The 5 patch series "mm/page_owner: add debugfs files 'show_handles'
   and 'show_stacks_handles'" from Mauricio Faria de Oliveira enhances the
   /sys/kernel/debug/page_owner debug feature.  It adds unique identifiers
   to differentiate the various stack traces so that userspace monitoring
   tools can better match stack traces over time.
 
 - The 2 patch series "mm/page_alloc: pcp->batch cleanups" from Joshua
   Hahn makes some minor alterations to the page allocator's per-cpu-pages
   feature.
 
 - The 2 patch series "Improve UFFDIO_MOVE scalability by removing
   anon_vma lock" from Lokesh Gidra addresses a scalability issue in
   userfaultfd's UFFDIO_MOVE operation.
 
 - The 2 patch series "kasan: cleanups for kasan_enabled() checks" from
   Sabyrzhan Tasbolatov performs some cleanup in the KASAN code.
 
 - The 2 patch series "drivers/base/node: fold node register and
   unregister functions" from Donet Tom cleans up the NUMA node handling
   code a little.
 
 - The 4 patch series "mm: some optimizations for prot numa" from Kefeng
   Wang provides some cleanups and small optimizations to the NUMA
   allocation hinting code.
 
 - The 5 patch series "mm/page_alloc: Batch callers of
   free_pcppages_bulk" from Joshua Hahn addresses long lock hold times at
   boot on large machines.  These were causing (harmless) softlockup
   warnings.
 
 - The 2 patch series "optimize the logic for handling dirty file folios
   during reclaim" from Baolin Wang removes some now-unnecessary work from
   page reclaim.
 
 - The 10 patch series "mm/damon: allow DAMOS auto-tuned for per-memcg
   per-node memory usage" from SeongJae Park enhances the DAMOS auto-tuning
   feature.
 
 - The 2 patch series "mm/damon: fixes for address alignment issues in
   DAMON_LRU_SORT and DAMON_RECLAIM" from Quanmin Yan fixes DAMON_LRU_SORT
   and DAMON_RECLAIM with certain userspace configuration.
 
 - The 15 patch series "expand mmap_prepare functionality, port more
   users" from Lorenzo Stoakes enhances the new(ish)
   file_operations.mmap_prepare() method and ports additional callsites
   from the old ->mmap() over to ->mmap_prepare().
 
 - The 8 patch series "Fix stale IOTLB entries for kernel address space"
   from Lu Baolu fixes a bug (and possible security issue on non-x86) in
   the IOMMU code.  In some situations the IOMMU could be left hanging onto
   a stale kernel pagetable entry.
 
 - The 4 patch series "mm/huge_memory: cleanup __split_unmapped_folio()"
   from Wei Yang cleans up and optimizes the folio splitting code.
 
 - The 5 patch series "mm, swap: misc cleanup and bugfix" from Kairui
   Song implements some cleanups and a minor fix in the swap discard code.
 
 - The 8 patch series "mm/damon: misc documentation fixups" from SeongJae
   Park does as advertised.
 
 - The 9 patch series "mm/damon: support pin-point targets removal" from
   SeongJae Park permits userspace to remove a specific monitoring target
   in the middle of the current targets list.
 
 - The 2 patch series "mm: MISC follow-up patches for linux/pgalloc.h"
   from Harry Yoo implements a couple of cleanups related to mm header file
   inclusion.
 
 - The 2 patch series "mm/swapfile.c: select swap devices of default
   priority round robin" from Baoquan He improves the selection of swap
   devices for NUMA machines.
 
 - The 3 patch series "mm: Convert memory block states (MEM_*) macros to
   enums" from Israel Batista changes the memory block labels from macros
   to enums so they will appear in kernel debug info.
 
 - The 3 patch series "ksm: perform a range-walk to jump over holes in
   break_ksm" from Pedro Demarchi Gomes addresses an inefficiency when KSM
   unmerges an address range.
 
 - The 22 patch series "mm/damon/tests: fix memory bugs in kunit tests"
   from SeongJae Park fixes leaks and unhandled malloc() failures in DAMON
   userspace unit tests.
 
 - The 2 patch series "some cleanups for pageout()" from Baolin Wang
   cleans up a couple of minor things in the page scanner's
   writeback-for-eviction code.
 
 - The 2 patch series "mm/hugetlb: refactor sysfs/sysctl interfaces" from
   Hui Zhu moves hugetlb's sysfs/sysctl handling code into a new file.
 
 - The 9 patch series "introduce VM_MAYBE_GUARD and make it sticky" from
   Lorenzo Stoakes makes the VMA guard regions available in /proc/pid/smaps
   and improves the mergeability of guarded VMAs.
 
 - The 2 patch series "mm: perform guard region install/remove under VMA
   lock" from Lorenzo Stoakes reduces mmap lock contention for callers
   performing VMA guard region operations.
 
 - The 2 patch series "vma_start_write_killable" from Matthew Wilcox
   starts work in permitting applications to be killed when they are
   waiting on a read_lock on the VMA lock.
 
 - The 11 patch series "mm/damon/tests: add more tests for online
   parameters commit" from SeongJae Park adds additional userspace testing
   of DAMON's "commit" feature.
 
 - The 9 patch series "mm/damon: misc cleanups" from SeongJae Park does
   that.
 
 - The 2 patch series "make VM_SOFTDIRTY a sticky VMA flag" from Lorenzo
   Stoakes addresses the possible loss of a VMA's VM_SOFTDIRTY flag when
   that VMA is merged with another.
 
 - The 16 patch series "mm: support device-private THP" from Balbir Singh
   introduces support for Transparent Huge Page (THP) migration in zone
   device-private memory.
 
 - The 3 patch series "Optimize folio split in memory failure" from Zi
   Yan optimizes folio split operations in the memory failure code.
 
 - The 2 patch series "mm/huge_memory: Define split_type and consolidate
   split support checks" from Wei Yang provides some more cleanups in the
   folio splitting code.
 
 - The 16 patch series "mm: remove is_swap_[pte, pmd]() + non-swap
   entries, introduce leaf entries" from Lorenzo Stoakes cleans up our
   handling of pagetable leaf entries by introducing the concept of
   'software leaf entries', of type softleaf_t.
 
 - The 4 patch series "reparent the THP split queue" from Muchun Song
   reparents the THP split queue to its parent memcg.  This is in
   preparation for addressing the long-standing "dying memcg" problem,
   wherein dead memcg's linger for too long, consuming memory resources.
 
 - The 3 patch series "unify PMD scan results and remove redundant
   cleanup" from Wei Yang does a little cleanup in the hugepage collapse
   code.
 
 - The 6 patch series "zram: introduce writeback bio batching" from
   Sergey Senozhatsky improves zram writeback efficiency by introducing
   batched bio writeback support.
 
 - The 4 patch series "memcg: cleanup the memcg stats interfaces" from
   Shakeel Butt cleans up our handling of the interrupt safety of some
   memcg stats.
 
 - The 4 patch series "make vmalloc gfp flags usage more apparent" from
   Vishal Moola cleans up vmalloc's handling of incoming GFP flags.
 
 - The 6 patch series "mm: Add soft-dirty and uffd-wp support for RISC-V"
   from Chunyan Zhang teches soft dirty and userfaultfd write protect
   tracking to use RISC-V's Svrsw60t59b extension.
 
 - The 5 patch series "mm: swap: small fixes and comment cleanups" from
   Youngjun Park fixes a small bug and cleans up some of the swap code.
 
 - The 4 patch series "initial work on making VMA flags a bitmap" from
   Lorenzo Stoakes starts work on converting the vma struct's flags to a
   bitmap, so we stop running out of them, especially on 32-bit.
 
 - The 2 patch series "mm/swapfile: fix and cleanup swap list iterations"
   from Youngjun Park addresses a possible bug in the swap discard code and
   cleans things up a little.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaTEb0wAKCRDdBJ7gKXxA
 jjfIAP94W4EkCCwNOupnChoG+YWw/JW21anXt5NN+i5svn1yugEAwzvv6A+cAFng
 o+ug/fyrfPZG7PLp2R8WFyGIP0YoBA4=
 =IUzS
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

  "__vmalloc()/kvmalloc() and no-block support" (Uladzislau Rezki)
     Rework the vmalloc() code to support non-blocking allocations
     (GFP_ATOIC, GFP_NOWAIT)

  "ksm: fix exec/fork inheritance" (xu xin)
     Fix a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not
     inherited across fork/exec

  "mm/zswap: misc cleanup of code and documentations" (SeongJae Park)
     Some light maintenance work on the zswap code

  "mm/page_owner: add debugfs files 'show_handles' and 'show_stacks_handles'" (Mauricio Faria de Oliveira)
     Enhance the /sys/kernel/debug/page_owner debug feature by adding
     unique identifiers to differentiate the various stack traces so
     that userspace monitoring tools can better match stack traces over
     time

  "mm/page_alloc: pcp->batch cleanups" (Joshua Hahn)
     Minor alterations to the page allocator's per-cpu-pages feature

  "Improve UFFDIO_MOVE scalability by removing anon_vma lock" (Lokesh Gidra)
     Address a scalability issue in userfaultfd's UFFDIO_MOVE operation

  "kasan: cleanups for kasan_enabled() checks" (Sabyrzhan Tasbolatov)

  "drivers/base/node: fold node register and unregister functions" (Donet Tom)
     Clean up the NUMA node handling code a little

  "mm: some optimizations for prot numa" (Kefeng Wang)
     Cleanups and small optimizations to the NUMA allocation hinting
     code

  "mm/page_alloc: Batch callers of free_pcppages_bulk" (Joshua Hahn)
     Address long lock hold times at boot on large machines. These were
     causing (harmless) softlockup warnings

  "optimize the logic for handling dirty file folios during reclaim" (Baolin Wang)
     Remove some now-unnecessary work from page reclaim

  "mm/damon: allow DAMOS auto-tuned for per-memcg per-node memory usage" (SeongJae Park)
     Enhance the DAMOS auto-tuning feature

  "mm/damon: fixes for address alignment issues in DAMON_LRU_SORT and DAMON_RECLAIM" (Quanmin Yan)
     Fix DAMON_LRU_SORT and DAMON_RECLAIM with certain userspace
     configuration

  "expand mmap_prepare functionality, port more users" (Lorenzo Stoakes)
     Enhance the new(ish) file_operations.mmap_prepare() method and port
     additional callsites from the old ->mmap() over to ->mmap_prepare()

  "Fix stale IOTLB entries for kernel address space" (Lu Baolu)
     Fix a bug (and possible security issue on non-x86) in the IOMMU
     code. In some situations the IOMMU could be left hanging onto a
     stale kernel pagetable entry

  "mm/huge_memory: cleanup __split_unmapped_folio()" (Wei Yang)
     Clean up and optimize the folio splitting code

  "mm, swap: misc cleanup and bugfix" (Kairui Song)
     Some cleanups and a minor fix in the swap discard code

  "mm/damon: misc documentation fixups" (SeongJae Park)

  "mm/damon: support pin-point targets removal" (SeongJae Park)
     Permit userspace to remove a specific monitoring target in the
     middle of the current targets list

  "mm: MISC follow-up patches for linux/pgalloc.h" (Harry Yoo)
     A couple of cleanups related to mm header file inclusion

  "mm/swapfile.c: select swap devices of default priority round robin" (Baoquan He)
     improve the selection of swap devices for NUMA machines

  "mm: Convert memory block states (MEM_*) macros to enums" (Israel Batista)
     Change the memory block labels from macros to enums so they will
     appear in kernel debug info

  "ksm: perform a range-walk to jump over holes in break_ksm" (Pedro Demarchi Gomes)
     Address an inefficiency when KSM unmerges an address range

  "mm/damon/tests: fix memory bugs in kunit tests" (SeongJae Park)
     Fix leaks and unhandled malloc() failures in DAMON userspace unit
     tests

  "some cleanups for pageout()" (Baolin Wang)
     Clean up a couple of minor things in the page scanner's
     writeback-for-eviction code

  "mm/hugetlb: refactor sysfs/sysctl interfaces" (Hui Zhu)
     Move hugetlb's sysfs/sysctl handling code into a new file

  "introduce VM_MAYBE_GUARD and make it sticky" (Lorenzo Stoakes)
     Make the VMA guard regions available in /proc/pid/smaps and
     improves the mergeability of guarded VMAs

  "mm: perform guard region install/remove under VMA lock" (Lorenzo Stoakes)
     Reduce mmap lock contention for callers performing VMA guard region
     operations

  "vma_start_write_killable" (Matthew Wilcox)
     Start work on permitting applications to be killed when they are
     waiting on a read_lock on the VMA lock

  "mm/damon/tests: add more tests for online parameters commit" (SeongJae Park)
     Add additional userspace testing of DAMON's "commit" feature

  "mm/damon: misc cleanups" (SeongJae Park)

  "make VM_SOFTDIRTY a sticky VMA flag" (Lorenzo Stoakes)
     Address the possible loss of a VMA's VM_SOFTDIRTY flag when that
     VMA is merged with another

  "mm: support device-private THP" (Balbir Singh)
     Introduce support for Transparent Huge Page (THP) migration in zone
     device-private memory

  "Optimize folio split in memory failure" (Zi Yan)

  "mm/huge_memory: Define split_type and consolidate split support checks" (Wei Yang)
     Some more cleanups in the folio splitting code

  "mm: remove is_swap_[pte, pmd]() + non-swap entries, introduce leaf entries" (Lorenzo Stoakes)
     Clean up our handling of pagetable leaf entries by introducing the
     concept of 'software leaf entries', of type softleaf_t

  "reparent the THP split queue" (Muchun Song)
     Reparent the THP split queue to its parent memcg. This is in
     preparation for addressing the long-standing "dying memcg" problem,
     wherein dead memcg's linger for too long, consuming memory
     resources

  "unify PMD scan results and remove redundant cleanup" (Wei Yang)
     A little cleanup in the hugepage collapse code

  "zram: introduce writeback bio batching" (Sergey Senozhatsky)
     Improve zram writeback efficiency by introducing batched bio
     writeback support

  "memcg: cleanup the memcg stats interfaces" (Shakeel Butt)
     Clean up our handling of the interrupt safety of some memcg stats

  "make vmalloc gfp flags usage more apparent" (Vishal Moola)
     Clean up vmalloc's handling of incoming GFP flags

  "mm: Add soft-dirty and uffd-wp support for RISC-V" (Chunyan Zhang)
     Teach soft dirty and userfaultfd write protect tracking to use
     RISC-V's Svrsw60t59b extension

  "mm: swap: small fixes and comment cleanups" (Youngjun Park)
     Fix a small bug and clean up some of the swap code

  "initial work on making VMA flags a bitmap" (Lorenzo Stoakes)
     Start work on converting the vma struct's flags to a bitmap, so we
     stop running out of them, especially on 32-bit

  "mm/swapfile: fix and cleanup swap list iterations" (Youngjun Park)
     Address a possible bug in the swap discard code and clean things
     up a little

[ This merge also reverts commit ebb9aeb980 ("vfio/nvgrace-gpu:
  register device memory for poison handling") because it looks
  broken to me, I've asked for clarification   - Linus ]

* tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
  mm: fix vma_start_write_killable() signal handling
  mm/swapfile: use plist_for_each_entry in __folio_throttle_swaprate
  mm/swapfile: fix list iteration when next node is removed during discard
  fs/proc/task_mmu.c: fix make_uffd_wp_huge_pte() huge pte handling
  mm/kfence: add reboot notifier to disable KFENCE on shutdown
  memcg: remove inc/dec_lruvec_kmem_state helpers
  selftests/mm/uffd: initialize char variable to Null
  mm: fix DEBUG_RODATA_TEST indentation in Kconfig
  mm: introduce VMA flags bitmap type
  tools/testing/vma: eliminate dependency on vma->__vm_flags
  mm: simplify and rename mm flags function for clarity
  mm: declare VMA flags by bit
  zram: fix a spelling mistake
  mm/page_alloc: optimize lowmem_reserve max lookup using its semantic monotonicity
  mm/vmscan: skip increasing kswapd_failures when reclaim was boosted
  pagemap: update BUDDY flag documentation
  mm: swap: remove scan_swap_map_slots() references from comments
  mm: swap: change swap_alloc_slow() to void
  mm, swap: remove redundant comment for read_swap_cache_async
  mm, swap: use SWP_SOLIDSTATE to determine if swap is rotational
  ...
2025-12-05 13:52:43 -08:00
Linus Torvalds 1dce50698a Scoped user mode access and related changes:
- Implement the missing u64 user access function on ARM when
    CONFIG_CPU_SPECTRE=n. This makes it possible to access a 64bit value in
    generic code with [unsafe_]get_user(). All other architectures and ARM
    variants provide the relevant accessors already.
 
  - Ensure that ASM GOTO jump label usage in the user mode access helpers
    always goes through a local C scope label indirection inside the
    helpers. This is required because compilers are not supporting that a
    ASM GOTO target leaves a auto cleanup scope. GCC silently fails to emit
    the cleanup invocation and CLANG fails the build.
 
    This provides generic wrapper macros and the conversion of affected
    architecture code to use them.
 
  - Scoped user mode access with auto cleanup
 
    Access to user mode memory can be required in hot code paths, but if it
    has to be done with user controlled pointers, the access is shielded
    with a speculation barrier, so that the CPU cannot speculate around the
    address range check. Those speculation barriers impact performance quite
    significantly. This can be avoided by "masking" the provided pointer so
    it is guaranteed to be in the valid user memory access range and
    otherwise to point to a guaranteed unpopulated address space. This has
    to be done without branches so it creates an address dependency for the
    access, which the CPU cannot speculate ahead.
 
    This results in repeating and error prone programming patterns:
 
      	    if (can_do_masked_user_access())
                     from = masked_user_read_access_begin((from));
             else if (!user_read_access_begin(from, sizeof(*from)))
                     return -EFAULT;
             unsafe_get_user(val, from, Efault);
             user_read_access_end();
             return 0;
       Efault:
             user_read_access_end();
             return -EFAULT;
 
     which can be replaced with scopes and automatic cleanup:
 
             scoped_user_read_access(from, Efault)
                     unsafe_get_user(val, from, Efault);
             return 0;
        Efault:
             return -EFAULT;
 
  - Convert code which implements the above pattern over to
    scope_user.*.access(). This also corrects a couple of imbalanced
    masked_*_begin() instances which are harmless on most architectures, but
    prevent PowerPC from implementing the masking optimization.
 
  - Add a missing speculation barrier in copy_from_user_iter()
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmksRfITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVhBEACEySjWcyCrD1e0ZFMFAOJZFI2BShav
 reotzCzmHYQdpVukDRxc64BgM2vN4yB04xnyMhi2o4hSTiIJhz1NzbKggsQJhVoA
 psYz+xEI161HuLZnUBUBuF9RRko/HVsbGqO2JFCuOKor4GCycvjVgupR3EIN9h5T
 HZEWGIgaTmN7MBj0QRrJgJkaaSTnPKOwWaNMV/F9pfk27zuB7vuV8WM9P3FaJYG+
 JGa9td7VGaBpWavxgMJqfdvXWBCVDDfZ1dunWx8tPTnLxKZZZD6HlfQXhZTr2n1e
 rtJpGgfVBx5Uqxn4RrhS0I7QeK1b9rrt3IU7EkFoaa3Z8LU5B7cHlm7KyicyoHhy
 SzFFUszssznT/0OhA5fmgPRlqI295HynW2p1L4Xy9hC0EZ2vXJPG5rO6X3x6QwSR
 asjRB7x/6JzWQUzE7/nhXd9KcB66wvQxhnjp7GqulF74aPBCtIdXXDD68YEDYkbi
 dPC3NRBr0ePbsGVGWbYvYIPWcvo1u814C2io1zKwmVbiN6lCYURgQK861vfAZUP8
 oP5D2a6ENgezDKoJo6eJ82inuDu64qZy7OOkU/aO3cbOuWGVyY9CjYD11x85Nr0k
 UNabSOfvcmhmobtYUiAgLLrjX1grQUG3F74ZQTw513mwgMObuDAAoS11GPjY6HL6
 b99WUJRv8jP66A==
 =6no0
 -----END PGP SIGNATURE-----

Merge tag 'core-uaccess-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scoped user access updates from Thomas Gleixner:
 "Scoped user mode access and related changes:

   - Implement the missing u64 user access function on ARM when
     CONFIG_CPU_SPECTRE=n.

     This makes it possible to access a 64bit value in generic code with
     [unsafe_]get_user(). All other architectures and ARM variants
     provide the relevant accessors already.

   - Ensure that ASM GOTO jump label usage in the user mode access
     helpers always goes through a local C scope label indirection
     inside the helpers.

     This is required because compilers are not supporting that a ASM
     GOTO target leaves a auto cleanup scope. GCC silently fails to emit
     the cleanup invocation and CLANG fails the build.

     [ Editor's note: gcc-16 will have fixed the code generation issue
       in commit f68fe3ddda4 ("eh: Invoke cleanups/destructors in asm
       goto jumps [PR122835]"). But we obviously have to deal with clang
       and older versions of gcc, so.. - Linus ]

     This provides generic wrapper macros and the conversion of affected
     architecture code to use them.

   - Scoped user mode access with auto cleanup

     Access to user mode memory can be required in hot code paths, but
     if it has to be done with user controlled pointers, the access is
     shielded with a speculation barrier, so that the CPU cannot
     speculate around the address range check. Those speculation
     barriers impact performance quite significantly.

     This cost can be avoided by "masking" the provided pointer so it is
     guaranteed to be in the valid user memory access range and
     otherwise to point to a guaranteed unpopulated address space. This
     has to be done without branches so it creates an address dependency
     for the access, which the CPU cannot speculate ahead.

     This results in repeating and error prone programming patterns:

       	    if (can_do_masked_user_access())
                      from = masked_user_read_access_begin((from));
              else if (!user_read_access_begin(from, sizeof(*from)))
                      return -EFAULT;
              unsafe_get_user(val, from, Efault);
              user_read_access_end();
              return 0;
        Efault:
              user_read_access_end();
              return -EFAULT;

      which can be replaced with scopes and automatic cleanup:

              scoped_user_read_access(from, Efault)
                      unsafe_get_user(val, from, Efault);
              return 0;
         Efault:
              return -EFAULT;

   - Convert code which implements the above pattern over to
     scope_user.*.access(). This also corrects a couple of imbalanced
     masked_*_begin() instances which are harmless on most
     architectures, but prevent PowerPC from implementing the masking
     optimization.

   - Add a missing speculation barrier in copy_from_user_iter()"

* tag 'core-uaccess-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lib/strn*,uaccess: Use masked_user_{read/write}_access_begin when required
  scm: Convert put_cmsg() to scoped user access
  iov_iter: Add missing speculation barrier to copy_from_user_iter()
  iov_iter: Convert copy_from_user_iter() to masked user access
  select: Convert to scoped user access
  x86/futex: Convert to scoped user access
  futex: Convert to get/put_user_inline()
  uaccess: Provide put/get_user_inline()
  uaccess: Provide scoped user access regions
  arm64: uaccess: Use unsafe wrappers for ASM GOTO
  s390/uaccess: Use unsafe wrappers for ASM GOTO
  riscv/uaccess: Use unsafe wrappers for ASM GOTO
  powerpc/uaccess: Use unsafe wrappers for ASM GOTO
  x86/uaccess: Use unsafe wrappers for ASM GOTO
  uaccess: Provide ASM GOTO safe wrappers for unsafe_*_user()
  ARM: uaccess: Implement missing __get_user_asm_dword()
2025-12-02 08:01:39 -08:00
Linus Torvalds 4a26e7032d Core kernel bug handling infrastructure changes for v6.19:
- Improve WARN(), which has vararg printf like arguments,
     to work with the x86 #UD based WARN-optimizing infrastructure
     by hiding the format in the bug_table and replacing this
     first argument with the address of the bug-table entry,
     while making the actual function that's called a UD1 instruction.
     (Peter Zijlstra)
 
   - Introduce the CONFIG_DEBUG_BUGVERBOSE_DETAILED Kconfig switch
     (Ingo Molnar, s390 support by Heiko Carstens)
 
 Fixes and cleanups:
 
   - bugs/s390: Remove private WARN_ON() implementation (Heiko Carstens)
 
   - <asm/bugs.h>: Make i386 use GENERIC_BUG_RELATIVE_POINTERS
     (Peter Zijlstra)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmktffcRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gLCQ//ZMHTpv6UF+xY4Kq3rdVFJWRR7Prs2hDn
 4V/cFHu7Xb2lRwpJDBNWnbiEkLbGR5qlWD+0CpAMK4mGuL9VjnoeflEzXxOPP9W1
 7SWw1HDc6NkjHM3BNnvkPQbzWF4RL4ZGs4Lbb1nv7pjoTSdBMXNrD5RL+iNmfHHd
 QfOG9ZiRyD5A/b07bfyjIXNaeC/Hot+FeVXTMfD7/vCfc2ywhL2Sm5G/igY/shTY
 Una7q8sbFDD/bFFtWSR2eFQeHQQfT6c/Pu39ZcAIdTlLk1FtZ+A2wcRtwYv/FdPV
 6KDOAxZK7fLMHoCdNTswsg+LbazhABOb/V1J7TaHq2tF/PeyN+B+ucxVY2KFcxQJ
 V5eG5crMrUIL22QO/UaT3dPWxGbJYkNlAWl416tAKdgXA52W4OsPd+k4DGjeP569
 KogAy3SY0D/80v1QN2HcFEJMvr7W2SukxtErqdtA5OKt/ZevB49lGqZoBecqASDO
 QjI1K0yLKnb+erbMIpCpNj+o/Fr1JQgWbYVpwipL2GON4an6vrowimGTsUG7qqxN
 Hwb7IaTNnWQ/4iyCkVV44q6Ln1gyP2hz5Lnzo7QJINUdzp98UjJLAEK4NRG44T4L
 p0t5NMxnvREpAv35rwy03xhmITYeQMOqzN9JEBBvegQyld5ghbThxI7g9BDzcXyL
 Bd0mF7WOV9M=
 =bMUL
 -----END PGP SIGNATURE-----

Merge tag 'core-bugs-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull bug handling infrastructure updates from Ingo Molnar:
 "Core updates:

   - Improve WARN(), which has vararg printf like arguments, to work
     with the x86 #UD based WARN-optimizing infrastructure by hiding the
     format in the bug_table and replacing this first argument with the
     address of the bug-table entry, while making the actual function
     that's called a UD1 instruction (Peter Zijlstra)

   - Introduce the CONFIG_DEBUG_BUGVERBOSE_DETAILED Kconfig switch (Ingo
     Molnar, s390 support by Heiko Carstens)

  Fixes and cleanups:

   - bugs/s390: Remove private WARN_ON() implementation (Heiko Carstens)

   - <asm/bugs.h>: Make i386 use GENERIC_BUG_RELATIVE_POINTERS (Peter
     Zijlstra)"

* tag 'core-bugs-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  x86/bugs: Make i386 use GENERIC_BUG_RELATIVE_POINTERS
  x86/bug: Fix BUG_FORMAT vs KASLR
  x86_64/bug: Inline the UD1
  x86/bug: Implement WARN_ONCE()
  x86_64/bug: Implement __WARN_printf()
  x86/bug: Use BUG_FORMAT for DEBUG_BUGVERBOSE_DETAILED
  x86/bug: Add BUG_FORMAT basics
  bug: Allow architectures to provide __WARN_printf()
  bug: Implement WARN_ON() using __WARN_FLAGS()
  bug: Add report_bug_entry()
  bug: Add BUG_FORMAT_ARGS infrastructure
  bug: Clean up CONFIG_GENERIC_BUG_RELATIVE_POINTERS
  bug: Add BUG_FORMAT infrastructure
  x86: Rework __bug_table helpers
  bugs/s390: Remove private WARN_ON() implementation
  bugs/core: Reorganize fields in the first line of WARNING output, add ->comm[] output
  bugs/sh: Concatenate 'cond_str' with '__FILE__' in __WARN_FLAGS(), to extend WARN_ON/BUG_ON output
  bugs/parisc: Concatenate 'cond_str' with '__FILE__' in __WARN_FLAGS(), to extend WARN_ON/BUG_ON output
  bugs/riscv: Concatenate 'cond_str' with '__FILE__' in __BUG_FLAGS(), to extend WARN_ON/BUG_ON output
  bugs/riscv: Pass in 'cond_str' to __BUG_FLAGS()
  ...
2025-12-01 21:33:01 -08:00
Chunyan Zhang c64da3950c riscv: mm: add userfaultfd write-protect support
The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59
for software, this patch uses bit 60 for uffd-wp tracking

Additionally for tracking the uffd-wp state as a PTE swap bit, we borrow
bit 4 which is not involved into swap entry computation.

Link: https://lkml.kernel.org/r/20251113072806.795029-6-zhangchunyan@iscas.ac.cn
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Conor Dooley <conor.dooley@microchip.com>
Cc: Conor Dooley <conor@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Deepak Gupta <debug@rivosinc.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yuanchu Xie <yuanchu@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24 15:08:55 -08:00
Chunyan Zhang 2a3ebad4db riscv: mm: add soft-dirty page tracking support
The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59
for software, this patch uses bit 59 for soft-dirty.

To add swap PTE soft-dirty tracking, we borrow bit 3 which is available
for swap PTEs on RISC-V systems.

Link: https://lkml.kernel.org/r/20251113072806.795029-5-zhangchunyan@iscas.ac.cn
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Conor Dooley <conor.dooley@microchip.com>
Cc: Conor Dooley <conor@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yuanchu Xie <yuanchu@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24 15:08:55 -08:00
Chunyan Zhang 59f6acb4be riscv: add RISC-V Svrsw60t59b extension support
The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59
for software to use.

Link: https://lkml.kernel.org/r/20251113072806.795029-4-zhangchunyan@iscas.ac.cn
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Conor Dooley <conor.dooley@microchip.com>
Cc: Conor Dooley <conor@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yuanchu Xie <yuanchu@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24 15:08:55 -08:00
Peter Zijlstra 2ace527183 Merge branch 'objtool/core'
Bring in the UDB and objtool data annotations to avoid conflicts while further extending the bug exceptions.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2025-11-21 11:21:20 +01:00
Chao-ying Fu 91f815b707 riscv: Update MIPS vendor id to 0x127
[1] defines MIPS vendor id as 0x127. All previous MIPS RISC-V patches
were tested on QEMU, also modified to use 0x722 as MIPS_VENDOR_ID. This
new value should reflect real hardware.

[1] https://mips.com/wp-content/uploads/2025/06/P8700_Programmers_Reference_Manual_Rev1.84_5-31-2025.pdf

Fixes: a8fed1bc03 ("riscv: Add xmipsexectl as a vendor extension")
Signed-off-by: Chao-ying Fu <cfu@wavecomp.com>
Signed-off-by: Aleksa Paunovic <aleksa.paunovic@htecgroup.com>
Link: https://patch.msgid.link/20251113-mips-vendorid-v2-1-3279489b7f84@htecgroup.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Paul WAlmsley <pjw@kernel.org>
2025-11-15 15:27:02 -07:00
Thomas Gleixner 0988ea18c6 riscv/uaccess: Use unsafe wrappers for ASM GOTO
ASM GOTO is miscompiled by GCC when it is used inside a auto cleanup scope:

bool foo(u32 __user *p, u32 val)
{
	scoped_guard(pagefault)
		unsafe_put_user(val, p, efault);
	return true;
efault:
	return false;
}

It ends up leaking the pagefault disable counter in the fault path. clang
at least fails the build.

Rename unsafe_*_user() to arch_unsafe_*_user() which makes the generic
uaccess header wrap it with a local label that makes both compilers emit
correct code. Same for the kernel_nofault() variants.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20251027083745.419351819@linutronix.de
2025-11-03 15:26:10 +01:00
Ben Dooks 44aa25c000 riscv: asm: use .insn for making custom instructions
The assembler has .insn for building custom instructions
now, so change the .4byte to .insn. This ensures the output
is marked as an instruction and not as data which may
confuse both debuggers and anything else that relies on
this sort of marking.

Add an ASM_INSN_I() wrapper in asm.h to allow the selecting
of how this is output so older assemblers are still good.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Link: https://lore.kernel.org/r/20251024171640.65232-1-ben.dooks@codethink.co.uk
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-27 18:58:37 -06:00
Jingwei Wang 5d15d2ad36 riscv: hwprobe: Fix stale vDSO data for late-initialized keys at boot
The hwprobe vDSO data for some keys, like MISALIGNED_VECTOR_PERF,
is determined by an asynchronous kthread. This can create a race
condition where the kthread finishes after the vDSO data has
already been populated, causing userspace to read stale values.

To fix this race, a new 'ready' flag is added to the vDSO data,
initialized to 'false' during arch_initcall_sync. This flag is
checked by both the vDSO's user-space code and the riscv_hwprobe
syscall. The syscall serves as a one-time gate, using a completion
to wait for any pending probes before populating the data and
setting the flag to 'true', thus ensuring userspace reads fresh
values on its first request.

Reported-by: Tsukasa OI <research_trasio@irq.a4lg.com>
Closes: https://lore.kernel.org/linux-riscv/760d637b-b13b-4518-b6bf-883d55d44e7f@irq.a4lg.com/
Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Co-developed-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Jingwei Wang <wangjingwei@iscas.ac.cn>
Link: https://lore.kernel.org/r/20250811142035.105820-1-wangjingwei@iscas.ac.cn
[pjw@kernel.org: fix checkpatch issues]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-17 22:23:11 -06:00
Paul Walmsley 492c513ec6 riscv: add a forward declaration for cpuinfo_op
Add a forward declaration for cpuinfo_op to resolve a sparse warning.

Link: https://lore.kernel.org/r/b831f349-5d0c-f7ac-8362-acb20bc6221a@kernel.org
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-17 22:10:01 -06:00
Samuel Holland 768e054de0 riscv: Remove the PER_CPU_OFFSET_SHIFT macro
__per_cpu_offset is an array of unsigned long, so we can reuse the
existing RISCV_LGPTR macro.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20251015225604.3860409-1-samuel.holland@sifive.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-17 22:00:29 -06:00
Samuel Holland 5898fc01ff riscv: mm: Define MAX_POSSIBLE_PHYSMEM_BITS for zsmalloc
This definition is used by zsmalloc to optimize memory allocation. On
riscv64, it is the same as MAX_PHYSMEM_BITS from asm/sparsemem.h, but
that definition depends on CONFIG_SPARSEMEM. The correct definition is
already provided for riscv32.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20251015233327.3885003-1-samuel.holland@sifive.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-17 22:00:29 -06:00
Anup Patel ca525d53f9 RISC-V: Define pgprot_dmacoherent() for non-coherent devices
The pgprot_dmacoherent() is used when allocating memory for
non-coherent devices and by default pgprot_dmacoherent() is
same as pgprot_noncached() unless architecture overrides it.

Currently, there is no pgprot_dmacoherent() definition for
RISC-V hence non-coherent device memory is being mapped as
IO thereby making CPU access to such memory slow.

Define pgprot_dmacoherent() to be same as pgprot_writecombine()
for RISC-V so that CPU access non-coherent device memory as
NOCACHE which is better than accessing it as IO.

Fixes: ff689fd21c ("riscv: add RISC-V Svpbmt extension support")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Han Gao <rabenda.cn@gmail.com>
Tested-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org>
Link: https://lore.kernel.org/r/20250820152316.1012757-1-apatel@ventanamicro.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-17 21:30:05 -06:00
Miquel Sabaté Solà 781380d2cd riscv: kgdb: Ensure that BUFMAX > NUMREGBYTES
The current value of BUFMAX is similar as in other architectures, but as
per documentation on KGDB (see
'Documentation/process/debugging/kgdb.rst'), BUFMAX has to be larger
than NUMREGBYTES.

Some NUMREGBYTES architectures (e.g. powerpc or hexagon) actually define
BUFMAX in relation to NUMREGBYTES, and thus this condition is always
guaranteed. Since 2048 is a value that is generally accepted on all
architectures, and that is larger than the current value of NUMREGBYTES,
we can keep this value in arch/riscv, but we can at least add an
'static_assert' as an extra measure just in case NUMREGBYTES changes in
the future for some unforseen reason.

Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
Link: https://lore.kernel.org/r/20250915143252.154955-1-mikisabate@gmail.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-10-09 19:36:45 -06:00
Linus Torvalds 86bcf7be1e RISC-V updates for the v6.18 merge window (part two)
Second set of RISC-V updates for the v6.18 merge window, consisting
 of:
 
 - Support for the RISC-V-standardized RPMI interface.
 
   RPMI is a platform management communication mechanism between OSes
   running on application processors, and a remote platform management
   processor.  Similar to ARM SCMI, TI SCI, etc.  This includes irqchip,
   mailbox, and clk changes.
 
 - Support for the RISC-V-standardized MPXY SBI extension.
 
   MPXY is a RISC-V-specific standard implementing a shared memory
   mailbox between S-mode operating systems (e.g., Linux) and M-mode
   firmware (e.g., OpenSBI).  It is part of this PR since one of its
   use cases is to enable M-mode firmware to act as a single RPMI client
   for all RPMI activity on a core (including S-mode RPMI activity).
   Includes a mailbox driver.
 
 - Some ACPI-related updates to enable the use of RPMI and MPXY.
 
 - The addition of Linux-wide memcpy_{from,to}_le32() static inline
   functions, for RPMI use.
 
 - An ACPI Kconfig change to enable boot logos on any ACPI-using
   architecture (including RISC-V)
 
 - A RISC-V defconfig change to add GPIO keyboard and event device
   support, for front panel shutdown or reboot buttons
 
 This PR also includes a recent, one-line Kconfig patch from Geert to
 keep non-RISC-V users from being asked about building the RPMI virtual
 clock driver when !COMPILE_TEST.  THere's nothing preventing
 non-RISC-V SoCs from implementing RPMI, but until some users show up,
 let's not annoy others with it.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAmjfNccACgkQx4+xDQu9
 KkugVBAAsa9Oc6iEV1rwj1clWDrvz/bFJRxr/oH3PX0yJ4JJcZaBTydeW3fyJTJ8
 CzmRClGJ18YWjapJXYhMp1AS/VfsULu2AVfUzZUqHTjVZyxgt8xFgygpI+BHIyUN
 vD26iKJz/JvnytRmUi7mMtS0O48nTzdMiiOmu4Ved68YMJCRJKFZw8+rWVcAzrwb
 FHZlIE5Fcb1PRaUDg/45Baj0nEr+NRGKDLsR1rbocbmCmRMnz3ufPTcXk128+3gC
 VB1rQplcMBf2RpCl7p4LW2N746hcbg/RogfpjFy7KLlnEH+Xoh2nCxcWHaiEgR9q
 6JPsYBeekA54ZZsdoNBg1i5rGk3j/G1XGaV1bo7HDLTvShSByhaYrhAedQZEbw//
 xC3Eb7EQ6rNYUUjXiX0y5nhvl+nVlu/FmcsZmcP30ppOV4MQasTZ0zqfso23xhjL
 2e06PwTqsmXDeDNDQ4ruBKrpu8tkA7ZZvjCMq1rvSWjTPObzuGBe/ENrdBUOBb2E
 6UUeAGCZpQm1IxTcKHHxaIDT5ami745kqaBrXanIMKPX1JdCs7ahUqqWzC0LEgSy
 qB/T12bYg5O/yKXdXJuAuTHFb3TOPn6l8aNxRJve+uFwv4r1XXptdal9Yg2xoBWo
 EoGktm8KAp5Ndn5BntXI4xG4Ia3HOsj9YA7y4Iep4EO94JZk3Fk=
 =Ys1m
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.18-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Paul Walmsley:

 - Support for the RISC-V-standardized RPMI interface.

   RPMI is a platform management communication mechanism between OSes
   running on application processors, and a remote platform management
   processor. Similar to ARM SCMI, TI SCI, etc. This includes irqchip,
   mailbox, and clk changes.

 - Support for the RISC-V-standardized MPXY SBI extension.

   MPXY is a RISC-V-specific standard implementing a shared memory
   mailbox between S-mode operating systems (e.g., Linux) and M-mode
   firmware (e.g., OpenSBI). It is part of this PR since one of its use
   cases is to enable M-mode firmware to act as a single RPMI client for
   all RPMI activity on a core (including S-mode RPMI activity).
   Includes a mailbox driver.

 - Some ACPI-related updates to enable the use of RPMI and MPXY.

 - The addition of Linux-wide memcpy_{from,to}_le32() static inline
   functions, for RPMI use.

 - An ACPI Kconfig change to enable boot logos on any ACPI-using
   architecture (including RISC-V)

 - A RISC-V defconfig change to add GPIO keyboard and event device
   support, for front panel shutdown or reboot buttons

* tag 'riscv-for-linus-6.18-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (26 commits)
  clk: COMMON_CLK_RPMI should depend on RISCV
  ACPI: support BGRT table on RISC-V
  MAINTAINERS: Add entry for RISC-V RPMI and MPXY drivers
  RISC-V: Enable GPIO keyboard and event device in RV64 defconfig
  irqchip/riscv-rpmi-sysmsi: Add ACPI support
  mailbox/riscv-sbi-mpxy: Add ACPI support
  irqchip/irq-riscv-imsic-early: Export imsic_acpi_get_fwnode()
  ACPI: RISC-V: Add RPMI System MSI to GSI mapping
  ACPI: RISC-V: Add support to update gsi range
  ACPI: RISC-V: Create interrupt controller list in sorted order
  ACPI: scan: Update honor list for RPMI System MSI
  ACPI: Add support for nargs_prop in acpi_fwnode_get_reference_args()
  ACPI: property: Refactor acpi_fwnode_get_reference_args() to support nargs_prop
  irqchip: Add driver for the RPMI system MSI service group
  dt-bindings: Add RPMI system MSI interrupt controller bindings
  dt-bindings: Add RPMI system MSI message proxy bindings
  clk: Add clock driver for the RISC-V RPMI clock service group
  dt-bindings: clock: Add RPMI clock service controller bindings
  dt-bindings: clock: Add RPMI clock service message proxy bindings
  mailbox: Add RISC-V SBI message proxy (MPXY) based mailbox driver
  ...
2025-10-04 10:36:22 -07:00
Linus Torvalds f3826aa996 guest_memfd:
* Add support for host userspace mapping of guest_memfd-backed memory for VM
   types that do NOT use support KVM_MEMORY_ATTRIBUTE_PRIVATE (which isn't
   precisely the same thing as CoCo VMs, since x86's SEV-MEM and SEV-ES have
   no way to detect private vs. shared).
 
   This lays the groundwork for removal of guest memory from the kernel direct
   map, as well as for limited mmap() for guest_memfd-backed memory.
 
   For more information see:
   * a6ad54137a ("Merge branch 'guest-memfd-mmap' into HEAD", 2025-08-27)
   * https://github.com/firecracker-microvm/firecracker/tree/feature/secret-hiding
     (guest_memfd in Firecracker)
   * https://lore.kernel.org/all/20250221160728.1584559-1-roypat@amazon.co.uk/
     (direct map removal)
   * https://lore.kernel.org/all/20250328153133.3504118-1-tabba@google.com/
     (mmap support)
 
 ARM:
 
 * Add support for FF-A 1.2 as the secure memory conduit for pKVM,
   allowing more registers to be used as part of the message payload.
 
 * Change the way pKVM allocates its VM handles, making sure that the
   privileged hypervisor is never tricked into using uninitialised
   data.
 
 * Speed up MMIO range registration by avoiding unnecessary RCU
   synchronisation, which results in VMs starting much quicker.
 
 * Add the dump of the instruction stream when panic-ing in the EL2
   payload, just like the rest of the kernel has always done. This will
   hopefully help debugging non-VHE setups.
 
 * Add 52bit PA support to the stage-1 page-table walker, and make use
   of it to populate the fault level reported to the guest on failing
   to translate a stage-1 walk.
 
 * Add NV support to the GICv3-on-GICv5 emulation code, ensuring
   feature parity for guests, irrespective of the host platform.
 
 * Fix some really ugly architecture problems when dealing with debug
   in a nested VM. This has some bad performance impacts, but is at
   least correct.
 
 * Add enough infrastructure to be able to disable EL2 features and
   give effective values to the EL2 control registers. This then allows
   a bunch of features to be turned off, which helps cross-host
   migration.
 
 * Large rework of the selftest infrastructure to allow most tests to
   transparently run at EL2. This is the first step towards enabling
   NV testing.
 
 * Various fixes and improvements all over the map, including one BE
   fix, just in time for the removal of the feature.
 
 LoongArch:
 
 * Detect page table walk feature on new hardware
 
 * Add sign extension with kernel MMIO/IOCSR emulation
 
 * Improve in-kernel IPI emulation
 
 * Improve in-kernel PCH-PIC emulation
 
 * Move kvm_iocsr tracepoint out of generic code
 
 RISC-V:
 
 * Added SBI FWFT extension for Guest/VM with misaligned delegation and
   pointer masking PMLEN features
 
 * Added ONE_REG interface for SBI FWFT extension
 
 * Added Zicbop and bfloat16 extensions for Guest/VM
 
 * Enabled more common KVM selftests for RISC-V
 
 * Added SBI v3.0 PMU enhancements in KVM and perf driver
 
 s390:
 
 * Improve interrupt cpu for wakeup, in particular the heuristic to decide
   which vCPU to deliver a floating interrupt to.
 
 * Clear the PTE when discarding a swapped page because of CMMA; this
   bug was introduced in 6.16 when refactoring gmap code.
 
 x86 selftests:
 
 * Add #DE coverage in the fastops test (the only exception that's guest-
   triggerable in fastop-emulated instructions).
 
 * Fix PMU selftests errors encountered on Granite Rapids (GNR), Sierra
   Forest (SRF) and Clearwater Forest (CWF).
 
 * Minor cleanups and improvements
 
 x86 (guest side):
 
 * For the legacy PCI hole (memory between TOLUD and 4GiB) to UC when
   overriding guest MTRR for TDX/SNP to fix an issue where ACPI auto-mapping
   could map devices as WB and prevent the device drivers from mapping their
   devices with UC/UC-.
 
 * Make kvm_async_pf_task_wake() a local static helper and remove its
   export.
 
 * Use native qspinlocks when running in a VM with dedicated vCPU=>pCPU
   bindings even when PV_UNHALT is unsupported.
 
 Generic:
 
 * Remove a redundant __GFP_NOWARN from kvm_setup_async_pf() as __GFP_NOWARN is
   now included in GFP_NOWAIT.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmjcGSkUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPSPAgAnJDswU4fZ5YdJr6jGzsbSQ6utlIV
 FeEltLKQIM7Aq/uvL6PLN5Kx1Pb/d9r9ag39mDT6lq9fOfJdOLjJr2SBXPTCsrPS
 6hyNL1mlgo5qzs54T8dkMbQThlSgA4zaehsc0zl8vnwil6ygoAdrtTHqZm6V0hu/
 F/sVlikCsLix1hC0KtzwscyWYcjWtXfVoi9eU5WY6ALpQaVXfRUtwyOhGDkldr+m
 i3iDiGiLAZ5Iu3igUCIOEzSSQY0FgLJpzbwJAeUxIvomDkHGJLaR14ijvM+NkRZi
 FBo2CLbjrwXb56Rbh2ABcq0CGJ3EiU3L+CC34UaRLzbtl/2BtpetkC3irA==
 =fyov
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "This excludes the bulk of the x86 changes, which I will send
  separately. They have two not complex but relatively unusual conflicts
  so I will wait for other dust to settle.

  guest_memfd:

   - Add support for host userspace mapping of guest_memfd-backed memory
     for VM types that do NOT use support KVM_MEMORY_ATTRIBUTE_PRIVATE
     (which isn't precisely the same thing as CoCo VMs, since x86's
     SEV-MEM and SEV-ES have no way to detect private vs. shared).

     This lays the groundwork for removal of guest memory from the
     kernel direct map, as well as for limited mmap() for
     guest_memfd-backed memory.

     For more information see:
       - commit a6ad54137a ("Merge branch 'guest-memfd-mmap' into HEAD")
       - guest_memfd in Firecracker:
           https://github.com/firecracker-microvm/firecracker/tree/feature/secret-hiding
       - direct map removal:
           https://lore.kernel.org/all/20250221160728.1584559-1-roypat@amazon.co.uk/
       - mmap support:
           https://lore.kernel.org/all/20250328153133.3504118-1-tabba@google.com/

  ARM:

   - Add support for FF-A 1.2 as the secure memory conduit for pKVM,
     allowing more registers to be used as part of the message payload.

   - Change the way pKVM allocates its VM handles, making sure that the
     privileged hypervisor is never tricked into using uninitialised
     data.

   - Speed up MMIO range registration by avoiding unnecessary RCU
     synchronisation, which results in VMs starting much quicker.

   - Add the dump of the instruction stream when panic-ing in the EL2
     payload, just like the rest of the kernel has always done. This
     will hopefully help debugging non-VHE setups.

   - Add 52bit PA support to the stage-1 page-table walker, and make use
     of it to populate the fault level reported to the guest on failing
     to translate a stage-1 walk.

   - Add NV support to the GICv3-on-GICv5 emulation code, ensuring
     feature parity for guests, irrespective of the host platform.

   - Fix some really ugly architecture problems when dealing with debug
     in a nested VM. This has some bad performance impacts, but is at
     least correct.

   - Add enough infrastructure to be able to disable EL2 features and
     give effective values to the EL2 control registers. This then
     allows a bunch of features to be turned off, which helps cross-host
     migration.

   - Large rework of the selftest infrastructure to allow most tests to
     transparently run at EL2. This is the first step towards enabling
     NV testing.

   - Various fixes and improvements all over the map, including one BE
     fix, just in time for the removal of the feature.

  LoongArch:

   - Detect page table walk feature on new hardware

   - Add sign extension with kernel MMIO/IOCSR emulation

   - Improve in-kernel IPI emulation

   - Improve in-kernel PCH-PIC emulation

   - Move kvm_iocsr tracepoint out of generic code

  RISC-V:

   - Added SBI FWFT extension for Guest/VM with misaligned delegation
     and pointer masking PMLEN features

   - Added ONE_REG interface for SBI FWFT extension

   - Added Zicbop and bfloat16 extensions for Guest/VM

   - Enabled more common KVM selftests for RISC-V

   - Added SBI v3.0 PMU enhancements in KVM and perf driver

  s390:

   - Improve interrupt cpu for wakeup, in particular the heuristic to
     decide which vCPU to deliver a floating interrupt to.

   - Clear the PTE when discarding a swapped page because of CMMA; this
     bug was introduced in 6.16 when refactoring gmap code.

  x86 selftests:

   - Add #DE coverage in the fastops test (the only exception that's
     guest- triggerable in fastop-emulated instructions).

   - Fix PMU selftests errors encountered on Granite Rapids (GNR),
     Sierra Forest (SRF) and Clearwater Forest (CWF).

   - Minor cleanups and improvements

  x86 (guest side):

   - For the legacy PCI hole (memory between TOLUD and 4GiB) to UC when
     overriding guest MTRR for TDX/SNP to fix an issue where ACPI
     auto-mapping could map devices as WB and prevent the device drivers
     from mapping their devices with UC/UC-.

   - Make kvm_async_pf_task_wake() a local static helper and remove its
     export.

   - Use native qspinlocks when running in a VM with dedicated
     vCPU=>pCPU bindings even when PV_UNHALT is unsupported.

  Generic:

   - Remove a redundant __GFP_NOWARN from kvm_setup_async_pf() as
     __GFP_NOWARN is now included in GFP_NOWAIT.

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (178 commits)
  KVM: s390: Fix to clear PTE when discarding a swapped page
  KVM: arm64: selftests: Cover ID_AA64ISAR3_EL1 in set_id_regs
  KVM: arm64: selftests: Remove a duplicate register listing in set_id_regs
  KVM: arm64: selftests: Cope with arch silliness in EL2 selftest
  KVM: arm64: selftests: Add basic test for running in VHE EL2
  KVM: arm64: selftests: Enable EL2 by default
  KVM: arm64: selftests: Initialize HCR_EL2
  KVM: arm64: selftests: Use the vCPU attr for setting nr of PMU counters
  KVM: arm64: selftests: Use hyp timer IRQs when test runs at EL2
  KVM: arm64: selftests: Select SMCCC conduit based on current EL
  KVM: arm64: selftests: Provide helper for getting default vCPU target
  KVM: arm64: selftests: Alias EL1 registers to EL2 counterparts
  KVM: arm64: selftests: Create a VGICv3 for 'default' VMs
  KVM: arm64: selftests: Add unsanitised helpers for VGICv3 creation
  KVM: arm64: selftests: Add helper to check for VGICv3 support
  KVM: arm64: selftests: Initialize VGICv3 only once
  KVM: arm64: selftests: Provide kvm_arch_vm_post_create() in library code
  KVM: selftests: Add ex_str() to print human friendly name of exception vectors
  selftests/kvm: remove stale TODO in xapic_state_test
  KVM: selftests: Handle Intel Atom errata that leads to PMU event overcount
  ...
2025-10-04 08:52:16 -07:00
Linus Torvalds 8804d970fa Summary of significant series in this pull request:
- The 3 patch series "mm, swap: improve cluster scan strategy" from
   Kairui Song improves performance and reduces the failure rate of swap
   cluster allocation.
 
 - The 4 patch series "support large align and nid in Rust allocators"
   from Vitaly Wool permits Rust allocators to set NUMA node and large
   alignment when perforning slub and vmalloc reallocs.
 
 - The 2 patch series "mm/damon/vaddr: support stat-purpose DAMOS" from
   Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets
   for virtual address spaces for ops-level DAMOS filters.
 
 - The 3 patch series "execute PROCMAP_QUERY ioctl under per-vma lock"
   from Suren Baghdasaryan reduces mmap_lock contention during reads of
   /proc/pid/maps.
 
 - The 2 patch series "mm/mincore: minor clean up for swap cache
   checking" from Kairui Song performs some cleanup in the swap code.
 
 - The 11 patch series "mm: vm_normal_page*() improvements" from David
   Hildenbrand provides code cleanup in the pagemap code.
 
 - The 5 patch series "add persistent huge zero folio support" from
   Pankaj Raghav provides a block layer speedup by optionalls making the
   huge_zero_pagepersistent, instead of releasing it when its refcount
   falls to zero.
 
 - The 3 patch series "kho: fixes and cleanups" from Mike Rapoport adds a
   few touchups to the recently added Kexec Handover feature.
 
 - The 10 patch series "mm: make mm->flags a bitmap and 64-bit on all
   arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap.  To
   end the constant struggle with space shortage on 32-bit conflicting with
   64-bit's needs.
 
 - The 2 patch series "mm/swapfile.c and swap.h cleanup" from Chris Li
   cleans up some swap code.
 
 - The 7 patch series "selftests/mm: Fix false positives and skip
   unsupported tests" from Donet Tom fixes a few things in our selftests
   code.
 
 - The 7 patch series "prctl: extend PR_SET_THP_DISABLE to only provide
   THPs when advised" from David Hildenbrand "allows individual processes
   to opt-out of THP=always into THP=madvise, without affecting other
   workloads on the system".
 
   It's a long story - the [1/N] changelog spells out the considerations.
 
 - The 11 patch series "Add and use memdesc_flags_t" from Matthew Wilcox
   gets us started on the memdesc project.  Please see
   https://kernelnewbies.org/MatthewWilcox/Memdescs and
   https://blogs.oracle.com/linux/post/introducing-memdesc.
 
 - The 3 patch series "Tiny optimization for large read operations" from
   Chi Zhiling improves the efficiency of the pagecache read path.
 
 - The 5 patch series "Better split_huge_page_test result check" from Zi
   Yan improves our folio splitting selftest code.
 
 - The 2 patch series "test that rmap behaves as expected" from Wei Yang
   adds some rmap selftests.
 
 - The 3 patch series "remove write_cache_pages()" from Christoph Hellwig
   removes that function and converts its two remaining callers.
 
 - The 2 patch series "selftests/mm: uffd-stress fixes" from Dev Jain
   fixes some UFFD selftests issues.
 
 - The 3 patch series "introduce kernel file mapped folios" from Boris
   Burkov introduces the concept of "kernel file pages".  Using these
   permits btrfs to account its metadata pages to the root cgroup, rather
   than to the cgroups of random inappropriate tasks.
 
 - The 2 patch series "mm/pageblock: improve readability of some
   pageblock handling" from Wei Yang provides some readability improvements
   to the page allocator code.
 
 - The 11 patch series "mm/damon: support ARM32 with LPAE" from SeongJae
   Park teaches DAMON to understand arm32 highmem.
 
 - The 4 patch series "tools: testing: Use existing atomic.h for
   vma/maple tests" from Brendan Jackman performs some code cleanups and
   deduplication under tools/testing/.
 
 - The 2 patch series "maple_tree: Fix testing for 32bit compiles" from
   Liam Howlett fixes a couple of 32-bit issues in
   tools/testing/radix-tree.c.
 
 - The 2 patch series "kasan: unify kasan_enabled() and remove
   arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN
   arch-specific initialization code into a common arch-neutral
   implementation.
 
 - The 3 patch series "mm: remove zpool" from Johannes Weiner removes
   zspool - an indirection layer which now only redirects to a single thing
   (zsmalloc).
 
 - The 2 patch series "mm: task_stack: Stack handling cleanups" from
   Pasha Tatashin makes a couple of cleanups in the fork code.
 
 - The 37 patch series "mm: remove nth_page()" from David Hildenbrand
   makes rather a lot of adjustments at various nth_page() callsites,
   eventually permitting the removal of that undesirable helper function.
 
 - The 2 patch series "introduce kasan.write_only option in hw-tags" from
   Yeoreum Yun creates a KASAN read-only mode for ARM, using that
   architecture's memory tagging feature.  It is felt that a read-only mode
   KASAN is suitable for use in production systems rather than debug-only.
 
 - The 3 patch series "mm: hugetlb: cleanup hugetlb folio allocation"
   from Kefeng Wang does some tidying in the hugetlb folio allocation code.
 
 - The 12 patch series "mm: establish const-correctness for pointer
   parameters" from Max Kellermann makes quite a number of the MM API
   functions more accurate about the constness of their arguments.  This
   was getting in the way of subsystems (in this case CEPH) when they
   attempt to improving their own const/non-const accuracy.
 
 - The 7 patch series "Cleanup free_pages() misuse" from Vishal Moola
   fixes a number of code sites which were confused over when to use
   free_pages() vs __free_pages().
 
 - The 3 patch series "Add Rust abstraction for Maple Trees" from Alice
   Ryhl makes the mapletree code accessible to Rust.  Required by nouveau
   and by its forthcoming successor: the new Rust Nova driver.
 
 - The 2 patch series "selftests/mm: split_huge_page_test:
   split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and
   some cleanups to the thp selftesting code.
 
 - The 14 patch series "mm, swap: introduce swap table as swap cache
   (phase I)" from Chris Li and Kairui Song is the first step along the
   path to implementing "swap tables" - a new approach to swap allocation
   and state tracking which is expected to yield speed and space
   improvements.  This patchset itself yields a 5-20% performance benefit
   in some situations.
 
 - The 3 patch series "Some ptdesc cleanups" from Matthew Wilcox utilizes
   the new memdesc layer to clean up the ptdesc code a little.
 
 - The 3 patch series "Fix va_high_addr_switch.sh test failure" from
   Chunyu Hu fixes some issues in our 5-level pagetable selftesting code.
 
 - The 2 patch series "Minor fixes for memory allocation profiling" from
   Suren Baghdasaryan addresses a couple of minor issues in relatively new
   memory allocation profiling feature.
 
 - The 3 patch series "Small cleanups" from Matthew Wilcox has a few
   cleanups in preparation for more memdesc work.
 
 - The 2 patch series "mm/damon: add addr_unit for DAMON_LRU_SORT and
   DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in
   furtherance of supporting arm highmem.
 
 - The 2 patch series "selftests/mm: Add -Wunreachable-code and fix
   warnings" from Muhammad Anjum adds that compiler check to selftests code
   and fixes the fallout, by removing dead code.
 
 - The 10 patch series "Improvements to Victim Process Thawing and OOM
   Reaper Traversal Order" from zhongjinji makes a number of improvements
   in the OOM killer: mainly thawing a more appropriate group of victim
   threads so they can release resources.
 
 - The 5 patch series "mm/damon: misc fixups and improvements for 6.18"
   from SeongJae Park is a bunch of small and unrelated fixups for DAMON.
 
 - The 7 patch series "mm/damon: define and use DAMON initialization
   check function" from SeongJae Park implement reliability and
   maintainability improvements to a recently-added bug fix.
 
 - The 2 patch series "mm/damon/stat: expose auto-tuned intervals and
   non-idle ages" from SeongJae Park provides additional transparency to
   userspace clients of the DAMON_STAT information.
 
 - The 2 patch series "Expand scope of khugepaged anonymous collapse"
   from Dev Jain removes some constraints on khubepaged's collapsing of
   anon VMAs.  It also increases the success rate of MADV_COLLAPSE against
   an anon vma.
 
 - The 2 patch series "mm: do not assume file == vma->vm_file in
   compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards
   removal of file_operations.mmap().  This patchset concentrates upon
   clearing up the treatment of stacked filesystems.
 
 - The 6 patch series "mm: Improve mlock tracking for large folios" from
   Kiryl Shutsemau provides some fixes and improvements to mlock's tracking
   of large folios.  /proc/meminfo's "Mlocked" field became more accurate.
 
 - The 2 patch series "mm/ksm: Fix incorrect accounting of KSM counters
   during fork" from Donet Tom fixes several user-visible KSM stats
   inaccuracies across forks and adds selftest code to verify these
   counters.
 
 - The 2 patch series "mm_slot: fix the usage of mm_slot_entry" from Wei
   Yang addresses some potential but presently benign issues in KSM's
   mm_slot handling.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaN3cywAKCRDdBJ7gKXxA
 jtaPAQDmIuIu7+XnVUK5V11hsQ/5QtsUeLHV3OsAn4yW5/3dEQD/UddRU08ePN+1
 2VRB0EwkLAdfMWW7TfiNZ+yhuoiL/AA=
 =4mhY
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - "mm, swap: improve cluster scan strategy" from Kairui Song improves
   performance and reduces the failure rate of swap cluster allocation

 - "support large align and nid in Rust allocators" from Vitaly Wool
   permits Rust allocators to set NUMA node and large alignment when
   perforning slub and vmalloc reallocs

 - "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend
   DAMOS_STAT's handling of the DAMON operations sets for virtual
   address spaces for ops-level DAMOS filters

 - "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren
   Baghdasaryan reduces mmap_lock contention during reads of
   /proc/pid/maps

 - "mm/mincore: minor clean up for swap cache checking" from Kairui Song
   performs some cleanup in the swap code

 - "mm: vm_normal_page*() improvements" from David Hildenbrand provides
   code cleanup in the pagemap code

 - "add persistent huge zero folio support" from Pankaj Raghav provides
   a block layer speedup by optionalls making the
   huge_zero_pagepersistent, instead of releasing it when its refcount
   falls to zero

 - "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to
   the recently added Kexec Handover feature

 - "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo
   Stoakes turns mm_struct.flags into a bitmap. To end the constant
   struggle with space shortage on 32-bit conflicting with 64-bit's
   needs

 - "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap
   code

 - "selftests/mm: Fix false positives and skip unsupported tests" from
   Donet Tom fixes a few things in our selftests code

 - "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised"
   from David Hildenbrand "allows individual processes to opt-out of
   THP=always into THP=madvise, without affecting other workloads on the
   system".

   It's a long story - the [1/N] changelog spells out the considerations

 - "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on
   the memdesc project. Please see

      https://kernelnewbies.org/MatthewWilcox/Memdescs and
      https://blogs.oracle.com/linux/post/introducing-memdesc

 - "Tiny optimization for large read operations" from Chi Zhiling
   improves the efficiency of the pagecache read path

 - "Better split_huge_page_test result check" from Zi Yan improves our
   folio splitting selftest code

 - "test that rmap behaves as expected" from Wei Yang adds some rmap
   selftests

 - "remove write_cache_pages()" from Christoph Hellwig removes that
   function and converts its two remaining callers

 - "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD
   selftests issues

 - "introduce kernel file mapped folios" from Boris Burkov introduces
   the concept of "kernel file pages". Using these permits btrfs to
   account its metadata pages to the root cgroup, rather than to the
   cgroups of random inappropriate tasks

 - "mm/pageblock: improve readability of some pageblock handling" from
   Wei Yang provides some readability improvements to the page allocator
   code

 - "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON
   to understand arm32 highmem

 - "tools: testing: Use existing atomic.h for vma/maple tests" from
   Brendan Jackman performs some code cleanups and deduplication under
   tools/testing/

 - "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes
   a couple of 32-bit issues in tools/testing/radix-tree.c

 - "kasan: unify kasan_enabled() and remove arch-specific
   implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific
   initialization code into a common arch-neutral implementation

 - "mm: remove zpool" from Johannes Weiner removes zspool - an
   indirection layer which now only redirects to a single thing
   (zsmalloc)

 - "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a
   couple of cleanups in the fork code

 - "mm: remove nth_page()" from David Hildenbrand makes rather a lot of
   adjustments at various nth_page() callsites, eventually permitting
   the removal of that undesirable helper function

 - "introduce kasan.write_only option in hw-tags" from Yeoreum Yun
   creates a KASAN read-only mode for ARM, using that architecture's
   memory tagging feature. It is felt that a read-only mode KASAN is
   suitable for use in production systems rather than debug-only

 - "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does
   some tidying in the hugetlb folio allocation code

 - "mm: establish const-correctness for pointer parameters" from Max
   Kellermann makes quite a number of the MM API functions more accurate
   about the constness of their arguments. This was getting in the way
   of subsystems (in this case CEPH) when they attempt to improving
   their own const/non-const accuracy

 - "Cleanup free_pages() misuse" from Vishal Moola fixes a number of
   code sites which were confused over when to use free_pages() vs
   __free_pages()

 - "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the
   mapletree code accessible to Rust. Required by nouveau and by its
   forthcoming successor: the new Rust Nova driver

 - "selftests/mm: split_huge_page_test: split_pte_mapped_thp
   improvements" from David Hildenbrand adds a fix and some cleanups to
   the thp selftesting code

 - "mm, swap: introduce swap table as swap cache (phase I)" from Chris
   Li and Kairui Song is the first step along the path to implementing
   "swap tables" - a new approach to swap allocation and state tracking
   which is expected to yield speed and space improvements. This
   patchset itself yields a 5-20% performance benefit in some situations

 - "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc
   layer to clean up the ptdesc code a little

 - "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some
   issues in our 5-level pagetable selftesting code

 - "Minor fixes for memory allocation profiling" from Suren Baghdasaryan
   addresses a couple of minor issues in relatively new memory
   allocation profiling feature

 - "Small cleanups" from Matthew Wilcox has a few cleanups in
   preparation for more memdesc work

 - "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from
   Quanmin Yan makes some changes to DAMON in furtherance of supporting
   arm highmem

 - "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad
   Anjum adds that compiler check to selftests code and fixes the
   fallout, by removing dead code

 - "Improvements to Victim Process Thawing and OOM Reaper Traversal
   Order" from zhongjinji makes a number of improvements in the OOM
   killer: mainly thawing a more appropriate group of victim threads so
   they can release resources

 - "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park
   is a bunch of small and unrelated fixups for DAMON

 - "mm/damon: define and use DAMON initialization check function" from
   SeongJae Park implement reliability and maintainability improvements
   to a recently-added bug fix

 - "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from
   SeongJae Park provides additional transparency to userspace clients
   of the DAMON_STAT information

 - "Expand scope of khugepaged anonymous collapse" from Dev Jain removes
   some constraints on khubepaged's collapsing of anon VMAs. It also
   increases the success rate of MADV_COLLAPSE against an anon vma

 - "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()"
   from Lorenzo Stoakes moves us further towards removal of
   file_operations.mmap(). This patchset concentrates upon clearing up
   the treatment of stacked filesystems

 - "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau
   provides some fixes and improvements to mlock's tracking of large
   folios. /proc/meminfo's "Mlocked" field became more accurate

 - "mm/ksm: Fix incorrect accounting of KSM counters during fork" from
   Donet Tom fixes several user-visible KSM stats inaccuracies across
   forks and adds selftest code to verify these counters

 - "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses
   some potential but presently benign issues in KSM's mm_slot handling

* tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits)
  mm: swap: check for stable address space before operating on the VMA
  mm: convert folio_page() back to a macro
  mm/khugepaged: use start_addr/addr for improved readability
  hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
  alloc_tag: fix boot failure due to NULL pointer dereference
  mm: silence data-race in update_hiwater_rss
  mm/memory-failure: don't select MEMORY_ISOLATION
  mm/khugepaged: remove definition of struct khugepaged_mm_slot
  mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL
  hugetlb: increase number of reserving hugepages via cmdline
  selftests/mm: add fork inheritance test for ksm_merging_pages counter
  mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
  drivers/base/node: fix double free in register_one_node()
  mm: remove PMD alignment constraint in execmem_vmalloc()
  mm/memory_hotplug: fix typo 'esecially' -> 'especially'
  mm/rmap: improve mlock tracking for large folios
  mm/filemap: map entire large folio faultaround
  mm/fault: try to map the entire file folio in finish_fault()
  mm/rmap: mlock large folios in try_to_unmap_one()
  mm/rmap: fix a mlock race condition in folio_referenced_one()
  ...
2025-10-02 18:18:33 -07:00
Linus Torvalds ae28ed4578 bpf-next-6.18
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmjZH40ACgkQ6rmadz2v
 bTrG7w//X/5CyDoKIYJCqynYRdMtfqYuCe8Jhud4p5++iBVqkDyS6Y8EFLqZVyg/
 UHTqaSE4Nz8/pma0WSjhUYn6Chs1AeH+Rw/g109SovE/YGkek2KNwY3o2hDrtPMX
 +oD0my8qF2HLKgEyteXXyZ5Ju+AaF92JFiGko4/wNTX8O99F9nyz2pTkrctS9Vl9
 VwuTxrEXpmhqrhP3WCxkfNfcbs9HP+AALpgOXZKdMI6T4KI0N1gnJ0ZWJbiXZ8oT
 tug0MTPkNRidYMl0wHY2LZ6ZG8Q3a7Sgc+M0xFzaHGvGlJbBg1HjsDMtT6j34CrG
 TIVJ/O8F6EJzAnQ5Hio0FJk8IIgMRgvng5Kd5GXidU+mE6zokTyHIHOXitYkBQNH
 Hk+lGA7+E2cYqUqKvB5PFoyo+jlucuIH7YwrQlyGfqz+98n65xCgZKcmdVXr0hdB
 9v3WmwJFtVIoPErUvBC3KRANQYhFk4eVk1eiGV/20+eIVyUuNbX6wqSWSA9uEXLy
 n5fm/vlk4RjZmrPZHxcJ0dsl9LTF1VvQQHkgoC1Sz/Cc+jA6k4I+ECVHAqEbk36p
 1TUF52yPOD2ViaJKkj+962JaaaXlUn6+Dq7f1GMP6VuyHjz4gsI3mOo4XarqNdWd
 c7TnYmlGO/cGwqd4DdbmWiF1DDsrBcBzdbC8+FgffxQHLPXGzUg=
 =LeQi
 -----END PGP SIGNATURE-----

Merge tag 'bpf-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Pull bpf updates from Alexei Starovoitov:

 - Support pulling non-linear xdp data with bpf_xdp_pull_data() kfunc
   (Amery Hung)

   Applied as a stable branch in bpf-next and net-next trees.

 - Support reading skb metadata via bpf_dynptr (Jakub Sitnicki)

   Also a stable branch in bpf-next and net-next trees.

 - Enforce expected_attach_type for tailcall compatibility (Daniel
   Borkmann)

 - Replace path-sensitive with path-insensitive live stack analysis in
   the verifier (Eduard Zingerman)

   This is a significant change in the verification logic. More details,
   motivation, long term plans are in the cover letter/merge commit.

 - Support signed BPF programs (KP Singh)

   This is another major feature that took years to materialize.

   Algorithm details are in the cover letter/marge commit

 - Add support for may_goto instruction to s390 JIT (Ilya Leoshkevich)

 - Add support for may_goto instruction to arm64 JIT (Puranjay Mohan)

 - Fix USDT SIB argument handling in libbpf (Jiawei Zhao)

 - Allow uprobe-bpf program to change context registers (Jiri Olsa)

 - Support signed loads from BPF arena (Kumar Kartikeya Dwivedi and
   Puranjay Mohan)

 - Allow access to union arguments in tracing programs (Leon Hwang)

 - Optimize rcu_read_lock() + migrate_disable() combination where it's
   used in BPF subsystem (Menglong Dong)

 - Introduce bpf_task_work_schedule*() kfuncs to schedule deferred
   execution of BPF callback in the context of a specific task using the
   kernel’s task_work infrastructure (Mykyta Yatsenko)

 - Enforce RCU protection for KF_RCU_PROTECTED kfuncs (Kumar Kartikeya
   Dwivedi)

 - Add stress test for rqspinlock in NMI (Kumar Kartikeya Dwivedi)

 - Improve the precision of tnum multiplier verifier operation
   (Nandakumar Edamana)

 - Use tnums to improve is_branch_taken() logic (Paul Chaignon)

 - Add support for atomic operations in arena in riscv JIT (Pu Lehui)

 - Report arena faults to BPF error stream (Puranjay Mohan)

 - Search for tracefs at /sys/kernel/tracing first in bpftool (Quentin
   Monnet)

 - Add bpf_strcasecmp() kfunc (Rong Tao)

 - Support lookup_and_delete_elem command in BPF_MAP_STACK_TRACE (Tao
   Chen)

* tag 'bpf-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (197 commits)
  libbpf: Replace AF_ALG with open coded SHA-256
  selftests/bpf: Add stress test for rqspinlock in NMI
  selftests/bpf: Add test case for different expected_attach_type
  bpf: Enforce expected_attach_type for tailcall compatibility
  bpftool: Remove duplicate string.h header
  bpf: Remove duplicate crypto/sha2.h header
  libbpf: Fix error when st-prefix_ops and ops from differ btf
  selftests/bpf: Test changing packet data from kfunc
  selftests/bpf: Add stacktrace map lookup_and_delete_elem test case
  selftests/bpf: Refactor stacktrace_map case with skeleton
  bpf: Add lookup_and_delete_elem for BPF_MAP_STACK_TRACE
  selftests/bpf: Fix flaky bpf_cookie selftest
  selftests/bpf: Test changing packet data from global functions with a kfunc
  bpf: Emit struct bpf_xdp_sock type in vmlinux BTF
  selftests/bpf: Task_work selftest cleanup fixes
  MAINTAINERS: Delete inactive maintainers from AF_XDP
  bpf: Mark kfuncs as __noclone
  selftests/bpf: Add kprobe multi write ctx attach test
  selftests/bpf: Add kprobe write ctx attach test
  selftests/bpf: Add uprobe context ip register change test
  ...
2025-09-30 17:58:11 -07:00
Linus Torvalds 7601d18be0 A set of changes to consolidate the generic TIF bits accross architectures
All architectures define the same set of generic TIF bits. This makes it
 pointlessly hard to add a new generic TIF bit or to change an existing one.
 
 Provide a generic variant and convert the architectures which utilize the
 generic entry code over to use it. The TIF space is divided into 16 generic
 bits and 16 architecture specific bits, which turned out to provide enough
 space on both sides.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmjaP78THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTKyD/9NEg3DiN6aM959o1MhvhDk7jNdECXm
 QOvZ85wJYhCv3Pb7RSgMxJL/dR9aWV2b24TSzEeki0awbR4nPeUz82vExmgS7rWX
 9rQLPrOZsrYd76IcAVoV5Ua/g48c+9TM8kLzoZFa9JEYMPmyTEiA3gy4bgab/aov
 L2b903ZrCNkWRKM1Wz5V8xdyPEzhE+cLhbWoPbeCqfzxqbv4+WWKQlPmqamQw+yq
 /61Xhq0tmx3+4hn1IB/Rc8yMTbAK0EwN7SHM+l7yaJ3ijlkGSele4S3mKAHv2I3c
 vODIFwdQ8pRbC1C5eMBnUKRm7Cmf+8CB3m+OIA5ghj10TPFiTUzQ6iG6nUSVniJm
 QB21LHYSWroeQBRibnT5k7RiW5QjtTQmcsjvO7S2rZ/7CkMr7LMu6kN1P0TSiOvc
 SKxs3MO72KaRU/JrEjLqvT2tvdpg2hpffg69U0jA1xCeFULE1jrqo3GwL8dPDk7z
 zKbC73JNg4QJDdi+hIn5nl0fRGVszLzkDum5eyCpCLY/W7BSiQ7q/ayzt9upsSOm
 uc7sqeIgelQMRDMoMNcQUsMnApT0JHOS74WQ03SfahZESj8eFoOb8pr7vaqu4lfi
 6LV4fpwZPBTMDcQ36r2JLuUTqHNHNtWn4xXjQ72ngovIVUL9A2H0DqK+1JhLJ6yX
 tknZ9WVmWVf+JQ==
 =6L98
 -----END PGP SIGNATURE-----

Merge tag 'core-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull TIF bit unification updates from Thomas Gleixner:
 "A set of changes to consolidate the generic TIF (thread info flag)
  bits accross architectures.

  All architectures define the same set of generic TIF bits. This makes
  it pointlessly hard to add a new generic TIF bit or to change an
  existing one.

  Provide a generic variant and convert the architectures which utilize
  the generic entry code over to use it. The TIF space is divided into
  16 generic bits and 16 architecture specific bits, which turned out to
  provide enough space on both sides"

* tag 'core-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  LoongArch: Fix bitflag conflict for TIF_FIXADE
  riscv: Use generic TIF bits
  loongarch: Use generic TIF bits
  s390/entry: Remove unused TIF flags
  s390: Use generic TIF bits
  x86: Use generic TIF bits
  asm-generic: Provide generic TIF infrastructure
2025-09-30 14:36:20 -07:00
Linus Torvalds cb7e3669c6 RISC-V updates for the v6.18 merge window (part one)
First set of RISC-V updates for the v6.18 merge window, including:
 
 - Replacement of __ASSEMBLY__ with __ASSEMBLER__ in header files (other
   architectures have already merged this type of cleanup)
 
 - The introduction of ioremap_wc() for RISC-V
 
 - Cleanup of the RISC-V kprobes code to use mostly-extant macros rather than
   open code
 
 - A RISC-V kprobes unit test
 
 - An architecture-specific endianness swap macro set implementation,
   leveraging some dedicated RISC-V instructions for this purpose if they
   are available
 
 - The ability to identity and communicate to userspace the presence of a
   MIPS P8700-specific ISA extension, and to leverage its MIPS-specific PAUSE
   implementation in cpu_relax()
 
 - Several other miscellaneous cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAmjaMVIACgkQx4+xDQu9
 Kkva4g/9GE4gzwM+KHlc4e1sfsaTo4oXe9eV+Hj3gUJdM+g8dCtNFchPRCKFjHYb
 X9lm2YVL9Q3cZ7yWWy/DJtZ66Yz9foQ4laX7uYbjsdbdjbsAXLXhjNp6uk4nBrqb
 uW7Uq+Qel8qq66J/B4Z/U0UzF3e5MaptQ6sLuNRONY9OJUxG76zTSJVKwiVNsGaX
 8W59b7ALFlCIwCVyXGm/KO1EELjl8FKDENWXFE1v6T8XvfXfYuhcvUMp84ebbzHV
 D4kQrO3nxKQVgKEdCtW8xxt0/aWkYQ8zbQg8bD0gDDzMYb5uiJDcMFaa8ZuEYiLg
 EcAJX8LmE5GGlTcJf8/jxvaA87hisNsGFvNPXX1OuI26w2TUNC2u80wE6Q6a3aNu
 74oUtZEaPhdFuB31A0rALC8Zb2zalnwwjAL7xRlyAozUuye8Ej7mE7w97WTjIfWz
 7ZL19/+C1uawljzYLn+FJBIcl1wTyvOgx6T4TJlmOsa4OFnCreFx+a0Az6+Rx1GC
 XGsRyDkGPSabIbNVGxUCyTr4w+VpW1WlDjrjwyLC0DQTFW8wyP44W9/K2scp+CqJ
 bSCcAz8QtGAeZ5UlSmXYTOV69xXqZPaom7fVk5RFHoy24en5DSo7kj1NJ3ChupRD
 8ACpALcIgw/VEo0Tyqqy3dyVhPDuYaZMY3WgGGj9Cz18U37e/Ho=
 =nK3D
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.18-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Paul Walmsley

 - Replacement of __ASSEMBLY__ with __ASSEMBLER__ in header files (other
   architectures have already merged this type of cleanup)

 - The introduction of ioremap_wc() for RISC-V

 - Cleanup of the RISC-V kprobes code to use mostly-extant macros rather
   than open code

 - A RISC-V kprobes unit test

 - An architecture-specific endianness swap macro set implementation,
   leveraging some dedicated RISC-V instructions for this purpose if
   they are available

 - The ability to identity and communicate to userspace the presence
   of a MIPS P8700-specific ISA extension, and to leverage its
   MIPS-specific PAUSE implementation in cpu_relax()

 - Several other miscellaneous cleanups

* tag 'riscv-for-linus-6.18-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (39 commits)
  riscv: errata: Fix the PAUSE Opcode for MIPS P8700
  riscv: hwprobe: Document MIPS xmipsexectl vendor extension
  riscv: hwprobe: Add MIPS vendor extension probing
  riscv: Add xmipsexectl instructions
  riscv: Add xmipsexectl as a vendor extension
  dt-bindings: riscv: Add xmipsexectl ISA extension description
  riscv: cpufeature: add validation for zfa, zfh and zfhmin
  perf: riscv: skip empty batches in counter start
  selftests: riscv: Add README for RISC-V KSelfTest
  riscv: sbi: Switch to new sys-off handler API
  riscv: Move vendor errata definitions to new header
  RISC-V: ACPI: enable parsing the BGRT table
  riscv: Enable ARCH_HAVE_NMI_SAFE_CMPXCHG
  riscv: pi: use 'targets' instead of extra-y in Makefile
  riscv: introduce asm/swab.h
  riscv: mmap(): use unsigned offset type in riscv_sys_mmap
  drivers/perf: riscv: Remove redundant ternary operators
  riscv: mm: Use mmu-type from FDT to limit SATP mode
  riscv: mm: Return intended SATP mode for noXlvl options
  riscv: kprobes: Remove duplication of RV_EXTRACT_ITYPE_IMM
  ...
2025-09-29 19:01:08 -07:00
Linus Torvalds a5ba183bde hardening updates for v6.18-rc1
- Clean up usage of TRAILING_OVERLAP() (Gustavo A. R. Silva)
 
 - lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
   (Junjie Cao)
 
 - Add str_assert_deassert() helper (Lad Prabhakar)
 
 - gcc-plugins: Remove TODO_verify_il for GCC >= 16
 
 - kconfig: Fix BrokenPipeError warnings in selftests
 
 - kconfig: Add transitional symbol attribute for migration support
 
 - kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaNraNQAKCRA2KwveOeQk
 u/DkAPwKPP5BSmVR2wkdpQaXIr3PGA+cbBYp34DMJNujZ9piIwD/WZ+HfGTLoERy
 +2Q6HLj9hUdd+Rx3IZ8/w1QmnhUIUAU=
 =AwV9
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:
 "One notable addition is the creation of the 'transitional' keyword for
  kconfig so CONFIG renaming can go more smoothly.

  This has been a long-standing deficiency, and with the renaming of
  CONFIG_CFI_CLANG to CONFIG_CFI (since GCC will soon have KCFI
  support), this came up again.

  The breadth of the diffstat is mainly this renaming.

   - Clean up usage of TRAILING_OVERLAP() (Gustavo A. R. Silva)

   - lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
     (Junjie Cao)

   - Add str_assert_deassert() helper (Lad Prabhakar)

   - gcc-plugins: Remove TODO_verify_il for GCC >= 16

   - kconfig: Fix BrokenPipeError warnings in selftests

   - kconfig: Add transitional symbol attribute for migration support

   - kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI"

* tag 'hardening-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  lib/string_choices: Add str_assert_deassert() helper
  kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI
  kconfig: Add transitional symbol attribute for migration support
  kconfig: Fix BrokenPipeError warnings in selftests
  gcc-plugins: Remove TODO_verify_il for GCC >= 16
  stddef: Introduce __TRAILING_OVERLAP()
  stddef: Remove token-pasting in TRAILING_OVERLAP()
  lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
2025-09-29 17:48:27 -07:00
Linus Torvalds 8c1ed30218 ffs-const update for v6.18-rc1
- PCI: Fix theoretical underflow in use of ffs().
 
 - Universally apply __attribute_const__ to all architecture's ffs()-family
   of functions.
 
 - Add KUnit tests for ffs() behavior and const-ness.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaNrWngAKCRA2KwveOeQk
 u3ZGAPwJTscARU4MspnqpbuAV601dG1TNoJG+8JYH84r+R2jjQEAlmBZB0jaHbC2
 qFWjHivD/0ofvihKfAPFgxlakyV1XAg=
 =diXF
 -----END PGP SIGNATURE-----

Merge tag 'ffs-const-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull ffs const-attribute cleanups from Kees Cook:
 "While working on various hardening refactoring a while back we
  encountered inconsistencies in the application of __attribute_const__
  on the ffs() family of functions.

  This series fixes this across all archs and adds KUnit tests.

  Notably, this found a theoretical underflow in PCI (also fixed here)
  and uncovered an inefficiency in ARC (fixed in the ARC arch PR). I
  kept the series separate from the general hardening PR since it is a
  stand-alone "topic".

   - PCI: Fix theoretical underflow in use of ffs().

   - Universally apply __attribute_const__ to all architecture's
     ffs()-family of functions.

   - Add KUnit tests for ffs() behavior and const-ness"

* tag 'ffs-const-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  KUnit: ffs: Validate all the __attribute_const__ annotations
  sparc: Add __attribute_const__ to ffs()-family implementations
  xtensa: Add __attribute_const__ to ffs()-family implementations
  s390: Add __attribute_const__ to ffs()-family implementations
  parisc: Add __attribute_const__ to ffs()-family implementations
  mips: Add __attribute_const__ to ffs()-family implementations
  m68k: Add __attribute_const__ to ffs()-family implementations
  openrisc: Add __attribute_const__ to ffs()-family implementations
  riscv: Add __attribute_const__ to ffs()-family implementations
  hexagon: Add __attribute_const__ to ffs()-family implementations
  alpha: Add __attribute_const__ to ffs()-family implementations
  sh: Add __attribute_const__ to ffs()-family implementations
  powerpc: Add __attribute_const__ to ffs()-family implementations
  x86: Add __attribute_const__ to ffs()-family implementations
  csky: Add __attribute_const__ to ffs()-family implementations
  bitops: Add __attribute_const__ to generic ffs()-family implementations
  KUnit: Introduce ffs()-family tests
  PCI: Test for bit underflow in pcie_set_readrq()
2025-09-29 16:31:35 -07:00
Sunil V L bb96fb5a79 ACPI: RISC-V: Add RPMI System MSI to GSI mapping
The RPMI System MSI device will provide GSIs to downstream devices
(such as GED) so add it to the RISC-V GSI to fwnode mapping.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
Link: https://lore.kernel.org/r/20250818040920.272664-20-apatel@ventanamicro.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-25 19:49:09 -06:00
Sunil V L 4d185fdeef ACPI: RISC-V: Add support to update gsi range
Some RISC-V interrupt controllers like RPMI based system MSI interrupt
controllers do not have MADT entry defined. These interrupt controllers
exist only in the namespace. ACPI spec defines _GSB method to get the
GSI base of the interrupt controller, However, there is no such standard
method to get the GSI range. To support such interrupt controllers, set
the GSI range of such interrupt controllers to non-overlapping range and
provide API for interrupt controller driver to update it with proper
value.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
Link: https://lore.kernel.org/r/20250818040920.272664-19-apatel@ventanamicro.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-25 19:48:54 -06:00
Anup Patel 508da38677 RISC-V: Add defines for the SBI message proxy extension
Add defines for the new SBI message proxy extension which is part
of the SBI v3.0 specification.

Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Co-developed-by: Rahul Pathak <rpathak@ventanamicro.com>
Signed-off-by: Rahul Pathak <rpathak@ventanamicro.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Link: https://lore.kernel.org/r/20250818040920.272664-4-apatel@ventanamicro.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-24 16:57:43 -06:00
Kees Cook 23ef9d4397 kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI
The kernel's CFI implementation uses the KCFI ABI specifically, and is
not strictly tied to a particular compiler. In preparation for GCC
supporting KCFI, rename CONFIG_CFI_CLANG to CONFIG_CFI (along with
associated options).

Use new "transitional" Kconfig option for old CONFIG_CFI_CLANG that will
enable CONFIG_CFI during olddefconfig.

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250923213422.1105654-3-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-09-24 14:29:14 -07:00
Alexandre Ghiti 546e42c8c6 riscv: Use an atomic xchg in pudp_huge_get_and_clear()
Make sure we return the right pud value and not a value that could
have been overwritten in between by a different core.

Fixes: c3cc2a4a3a ("riscv: Add support for PUD THP")
Cc: stable@vger.kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250814-dev-alex-thp_pud_xchg-v1-1-b4704dfae206@rivosinc.com
[pjw@kernel.org: use xchg rather than atomic_long_xchg; avoid atomic op for !CONFIG_SMP like x86]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-23 18:25:52 -06:00
Djordje Todorovic 0b0ca959d2 riscv: errata: Fix the PAUSE Opcode for MIPS P8700
Add ERRATA_MIPS and ERRATA_MIPS_P8700_PAUSE_OPCODE configs.
Handle errata for the MIPS PAUSE instruction.

Signed-off-by: Djordje Todorovic <djordje.todorovic@htecgroup.com>
Signed-off-by: Aleksandar Rikalo <arikalo@gmail.com>
Signed-off-by: Raj Vishwanathan4 <rvishwanathan@mips.com>
Signed-off-by: Aleksa Paunovic <aleksa.paunovic@htecgroup.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250724-p8700-pause-v5-7-a6cbbe1c3412@htecgroup.com
[pjw@kernel.org: updated to apply and compile; fixed a checkpatch issue]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-19 10:33:56 -06:00
Aleksa Paunovic bb4b0f8a1b riscv: hwprobe: Add MIPS vendor extension probing
Add a new hwprobe key "RISCV_HWPROBE_KEY_VENDOR_EXT_MIPS_0" which allows
userspace to probe for the new xmipsexectl vendor extension.

Signed-off-by: Aleksa Paunovic <aleksa.paunovic@htecgroup.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250724-p8700-pause-v5-4-a6cbbe1c3412@htecgroup.com
[pjw@kernel.org: fixed some checkpatch issues]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-19 10:33:56 -06:00
Aleksa Paunovic 1d4ce63e33 riscv: Add xmipsexectl instructions
Add xmipsexectl instruction opcodes. This includes the MIPS.PAUSE,
MIPS.EHB, and MIPS.IHB instructions.

Signed-off-by: Aleksa Paunovic <aleksa.paunovic@htecgroup.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250724-p8700-pause-v5-3-a6cbbe1c3412@htecgroup.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-19 10:33:56 -06:00
Aleksa Paunovic a8fed1bc03 riscv: Add xmipsexectl as a vendor extension
Add support for MIPS vendor extensions. Add support for the xmipsexectl
vendor extension.

Signed-off-by: Aleksa Paunovic <aleksa.paunovic@htecgroup.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250724-p8700-pause-v5-2-a6cbbe1c3412@htecgroup.com
[pjw@kernel.org: added the MIPS vendor ID from another patch to fix the build]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-18 20:36:00 -06:00
Guo Ren (Alibaba DAMO Academy) 16d18e3eaf riscv: Move vendor errata definitions to new header
Move vendor errata definitions into errata_list_vendors.h.

Signed-off-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Han Gao <rabenda.cn@gmail.com>
Link: https://lore.kernel.org/r/20250713155321.2064856-2-guoren@kernel.org
[pjw@kernel.org: updated to apply and to make the whitespace consistent]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-18 08:22:00 -06:00
Ignacio Encinas cc2294d3f9 riscv: introduce asm/swab.h
Implement endianness swap macros for RISC-V.

Use the rev8 instruction when Zbb is available. Otherwise, rely on the
default mask-and-shift implementation.

Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Ignacio Encinas <ignacio@iencinas.com>
Link: https://lore.kernel.org/r/20250723-riscv-swab-v6-1-fc11e9a2efc9@iencinas.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-18 08:20:25 -06:00
Thomas Gleixner 41e871f2b6 riscv: Use generic TIF bits
No point in defining generic items and the upcoming RSEQ optimizations are
only available with this _and_ the generic entry infrastructure, which is
already used by RISCV. So no further action required here.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2025-09-17 08:14:05 +02:00
Nam Cao 518c550eeb riscv: kprobes: Move branch_funct3 to insn.h
Similar to other instruction-processing macros/functions, branch_funct3
should be in insn.h.

Move it into insn.h as RV_EXTRACT_FUNCT3. This new name matches the style
in insn.h.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/linux-riscv/200c29a26338f19d09963fa02562787e8cfa06f2.1747215274.git.namcao@linutronix.de/
[pjw@kernel.org: updated to use RV_X_MASK and to apply]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16 18:46:43 -06:00
Nam Cao 5fe5914027 riscv: kprobes: Move branch_rs2_idx to insn.h
Similar to other instruction-processing macros/functions, branch_rs2_idx
should be in insn.h.

Move it into insn.h as RV_EXTRACT_RS2_REG. This new name matches the style
in insn.h.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/linux-riscv/107d4a6c1818bf169be2407b273a0483e6d55bbb.1747215274.git.namcao@linutronix.de/
[pjw@kernel.org: updated to use RV_X_MASK and to apply]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16 18:46:43 -06:00
Alexandre Ghiti a601732236 riscv: Move all duplicate insn parsing macros into asm/insn.h
kernel/traps_misaligned.c and kvm/vcpu_insn.c define the same macros to
extract information from the instructions.

Let's move the definitions into asm/insn.h to avoid this duplication.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20250620-dev-alex-insn_duplicate_v5_manual-v5-3-d865dc9ad180@rivosinc.com
[pjw@kernel.org: updated to apply]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16 16:29:07 -06:00
Alexandre Ghiti 833bbb0d91 riscv: Strengthen duplicate and inconsistent definition of RV_X()
RV_X() macro is defined in two different ways which is error prone.

So harmonize its first definition and add another macro RV_X_MASK() for
the second one.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250620-dev-alex-insn_duplicate_v5_manual-v5-2-d865dc9ad180@rivosinc.com
[pjw@kernel.org: upcase the macro name to conform with previous practice]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16 16:28:30 -06:00
Alexandre Ghiti 932131fd3e riscv: Fix typo EXRACT -> EXTRACT
Simply fix a typo.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20250620-dev-alex-insn_duplicate_v5_manual-v5-1-d865dc9ad180@rivosinc.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16 16:28:02 -06:00
Thomas Huth f811f58597 riscv: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-uapi headers
While the GCC and Clang compilers already define __ASSEMBLER__
automatically when compiling assembly code, __ASSEMBLY__ is a
macro that only gets defined by the Makefiles in the kernel.
This can be very confusing when switching between userspace
and kernelspace coding, or when dealing with uapi headers that
rather should use __ASSEMBLER__ instead. So let's standardize on
the __ASSEMBLER__ macro that is provided by the compilers now.

This originally was a completely mechanical patch (done with a
simple "sed -i" statement), with some manual fixups during
rebasing of the patch later.

Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: linux-riscv@lists.infradead.org
Signed-off-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20250606070952.498274-3-thuth@redhat.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16 16:25:30 -06:00
Yunhui Cui 3a8ee3a9f4 riscv: introduce ioremap_wc()
Compared with IO attributes, NC attributes can improve performance,
specifically in these aspects: Relaxed Order, Gathering, Supports Read
Speculation, Supports Unaligned Access.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Signed-off-by: Qingfang Deng <qingfang.deng@siflower.com.cn>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250722091504.45974-2-cuiyunhui@bytedance.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16 16:24:27 -06:00
Atish Patra dbdadd943a RISC-V: KVM: Upgrade the supported SBI version to 3.0
Upgrade the SBI version to v3.0 so that corresponding features
can be enabled in the guest.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Acked-by: Paul Walmsley <pjw@kernel.org>
Link: https://lore.kernel.org/r/20250909-pmu_event_info-v6-8-d8f80cacb884@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16 11:49:31 +05:30
Atish Patra e309fd113b RISC-V: KVM: Implement get event info function
The new get_event_info funciton allows the guest to query the presence
of multiple events with single SBI call. Currently, the perf driver
in linux guest invokes it for all the standard SBI PMU events. Support
the SBI function implementation in KVM as well.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Acked-by: Paul Walmsley <pjw@kernel.org>
Link: https://lore.kernel.org/r/20250909-pmu_event_info-v6-7-d8f80cacb884@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16 11:49:31 +05:30
Atish Patra adffbd06d0 drivers/perf: riscv: Implement PMU event info function
With the new SBI PMU event info function, we can query the availability
of the all standard SBI PMU events at boot time with a single ecall.
This improves the bootime by avoiding making an SBI call for each
standard PMU event. Since this function is defined only in SBI v3.0,
invoke this only if the underlying SBI implementation is v3.0 or higher.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Acked-by: Paul Walmsley <pjw@kernel.org>
Link: https://lore.kernel.org/r/20250909-pmu_event_info-v6-4-d8f80cacb884@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16 11:49:31 +05:30
Atish Patra 656ef2ea30 drivers/perf: riscv: Add raw event v2 support
SBI v3.0 introduced a new raw event type that allows wider
mhpmeventX width to be programmed via CFG_MATCH.

Use the raw event v2 if SBI v3.0 is available.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Acked-by: Paul Walmsley <pjw@kernel.org>
Link: https://lore.kernel.org/r/20250909-pmu_event_info-v6-2-d8f80cacb884@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16 11:49:31 +05:30
Anup Patel 48d67106f4 RISC-V: KVM: Implement ONE_REG interface for SBI FWFT state
The KVM user-space needs a way to save/restore the state of
SBI FWFT features so implement SBI extension ONE_REG callbacks.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250823155947.1354229-6-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16 10:54:21 +05:30
Anup Patel 85e7850e0d RISC-V: KVM: Move copy_sbi_ext_reg_indices() to SBI implementation
The ONE_REG handling of SBI extension enable/disable registers and
SBI extension state registers is already under SBI implementation.
On similar lines, let's move copy_sbi_ext_reg_indices() under SBI
implementation.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250823155947.1354229-5-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16 10:54:18 +05:30
Anup Patel 699a53aedb RISC-V: KVM: Introduce optional ONE_REG callbacks for SBI extensions
SBI extensions can have per-VCPU state which needs to be saved/restored
through ONE_REG interface for Guest/VM migration. Introduce optional
ONE_REG callbacks for SBI extensions so that ONE_REG implementation
for an SBI extenion is part of the extension sources.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250823155947.1354229-4-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16 10:54:14 +05:30