mirror of https://github.com/torvalds/linux.git
25284 Commits
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
a3ebb59eee |
VFIO updates for v6.19-rc1
- Move libvfio selftest artifacts in preparation of more tightly
coupled integration with KVM selftests. (David Matlack)
- Fix comment typo in mtty driver. (Chu Guangqing)
- Support for new hardware revision in the hisi_acc vfio-pci variant
driver where the migration registers can now be accessed via the PF.
When enabled for this support, the full BAR can be exposed to the
user. (Longfang Liu)
- Fix vfio cdev support for VF token passing, using the correct size
for the kernel structure, thereby actually allowing userspace to
provide a non-zero UUID token. Also set the match token callback for
the hisi_acc, fixing VF token support for this this vfio-pci variant
driver. (Raghavendra Rao Ananta)
- Introduce internal callbacks on vfio devices to simplify and
consolidate duplicate code for generating VFIO_DEVICE_GET_REGION_INFO
data, removing various ioctl intercepts with a more structured
solution. (Jason Gunthorpe)
- Introduce dma-buf support for vfio-pci devices, allowing MMIO regions
to be exposed through dma-buf objects with lifecycle managed through
move operations. This enables low-level interactions such as a
vfio-pci based SPDK drivers interacting directly with dma-buf capable
RDMA devices to enable peer-to-peer operations. IOMMUFD is also now
able to build upon this support to fill a long standing feature gap
versus the legacy vfio type1 IOMMU backend with an implementation of
P2P support for VM use cases that better manages the lifecycle of the
P2P mapping. (Leon Romanovsky, Jason Gunthorpe, Vivek Kasireddy)
- Convert eventfd triggering for error and request signals to use RCU
mechanisms in order to avoid a 3-way lockdep reported deadlock issue.
(Alex Williamson)
- Fix a 32-bit overflow introduced via dma-buf support manifesting with
large DMA buffers. (Alex Mastro)
- Convert nvgrace-gpu vfio-pci variant driver to insert mappings on
fault rather than at mmap time. This conversion serves both to make
use of huge PFNMAPs but also to both avoid corrected RAS events
during reset by now being subject to vfio-pci-core's use of
unmap_mapping_range(), and to enable a device readiness test after
reset. (Ankit Agrawal)
- Refactoring of vfio selftests to support multi-device tests and split
code to provide better separation between IOMMU and device objects.
This work also enables a new test suite addition to measure parallel
device initialization latency. (David Matlack)
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmkvV3IRHGFsZXhAc2hh
emJvdC5vcmcACgkQI5ubbjuwiyIpIQ/9GwpjLH5Vdv0v2d9mkHmZIWFpG/tr3zJa
+spQqOjO0etASc67PtIJArT9pWib+s6O8OaG7iFrdNR65HCSsXSZbIGbMThPODfy
DdDj1ipAqMVwcaCZT8un2N8Sktu9YpFQMvc5IoXWWYhw88vili7bBx+OTrEFV2T0
6qQijSBdhw1TXVFHG6BGSmqmisyMepIebA6GmPWdfYu6BfoWBYMdcMjDwd1J61Q5
DDwFRzn/Dz2Tvb1jbXiiRMRuFIuegFQii+wtd30S/cRPFZhZLWzc+drimC6oOFiQ
qL19vQQsBPnLtGvch40HsET/AbY5w0pLCkYX5qacxP3sq27+N+KuotzCvbnVMN+H
e2BqOCujyoce8z1Br6BzV71Lr2yzPDcc5pXTuEuuBT+J/ptOY8hfEikOj85s5Wzj
aKsTrdDRGMrn/o11NkGSzYwFcMs9MxCX9mo98U6OkWDr0+cmPLf4LGZgpJudWg4E
POUlzPpnzJrTlX5d+OqCdKJG0a1hTlTa2udzRa5hCDANHaZWLoAssfgSEKfV9xt1
PzOMf0UIJmPJmFcw/OpMO72/5xp8O4WslJS0ulSm6vrAJDtutLApHZ7bJ44KniNd
4vte+gOjyZY8ibTDKRULhXVlCDxkEnZjRBbApgI9HJD61IElOzjqohRuRx77J09B
7c8OSLI8d1U=
=tpee
-----END PGP SIGNATURE-----
Merge tag 'vfio-v6.19-rc1' of https://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:
- Move libvfio selftest artifacts in preparation of more tightly
coupled integration with KVM selftests (David Matlack)
- Fix comment typo in mtty driver (Chu Guangqing)
- Support for new hardware revision in the hisi_acc vfio-pci variant
driver where the migration registers can now be accessed via the PF.
When enabled for this support, the full BAR can be exposed to the
user (Longfang Liu)
- Fix vfio cdev support for VF token passing, using the correct size
for the kernel structure, thereby actually allowing userspace to
provide a non-zero UUID token. Also set the match token callback for
the hisi_acc, fixing VF token support for this this vfio-pci variant
driver (Raghavendra Rao Ananta)
- Introduce internal callbacks on vfio devices to simplify and
consolidate duplicate code for generating VFIO_DEVICE_GET_REGION_INFO
data, removing various ioctl intercepts with a more structured
solution (Jason Gunthorpe)
- Introduce dma-buf support for vfio-pci devices, allowing MMIO regions
to be exposed through dma-buf objects with lifecycle managed through
move operations. This enables low-level interactions such as a
vfio-pci based SPDK drivers interacting directly with dma-buf capable
RDMA devices to enable peer-to-peer operations. IOMMUFD is also now
able to build upon this support to fill a long standing feature gap
versus the legacy vfio type1 IOMMU backend with an implementation of
P2P support for VM use cases that better manages the lifecycle of the
P2P mapping (Leon Romanovsky, Jason Gunthorpe, Vivek Kasireddy)
- Convert eventfd triggering for error and request signals to use RCU
mechanisms in order to avoid a 3-way lockdep reported deadlock issue
(Alex Williamson)
- Fix a 32-bit overflow introduced via dma-buf support manifesting with
large DMA buffers (Alex Mastro)
- Convert nvgrace-gpu vfio-pci variant driver to insert mappings on
fault rather than at mmap time. This conversion serves both to make
use of huge PFNMAPs but also to both avoid corrected RAS events
during reset by now being subject to vfio-pci-core's use of
unmap_mapping_range(), and to enable a device readiness test after
reset (Ankit Agrawal)
- Refactoring of vfio selftests to support multi-device tests and split
code to provide better separation between IOMMU and device objects.
This work also enables a new test suite addition to measure parallel
device initialization latency (David Matlack)
* tag 'vfio-v6.19-rc1' of https://github.com/awilliam/linux-vfio: (65 commits)
vfio: selftests: Add vfio_pci_device_init_perf_test
vfio: selftests: Eliminate INVALID_IOVA
vfio: selftests: Split libvfio.h into separate header files
vfio: selftests: Move vfio_selftests_*() helpers into libvfio.c
vfio: selftests: Rename vfio_util.h to libvfio.h
vfio: selftests: Stop passing device for IOMMU operations
vfio: selftests: Move IOVA allocator into iova_allocator.c
vfio: selftests: Move IOMMU library code into iommu.c
vfio: selftests: Rename struct vfio_dma_region to dma_region
vfio: selftests: Upgrade driver logging to dev_err()
vfio: selftests: Prefix logs with device BDF where relevant
vfio: selftests: Eliminate overly chatty logging
vfio: selftests: Support multiple devices in the same container/iommufd
vfio: selftests: Introduce struct iommu
vfio: selftests: Rename struct vfio_iommu_mode to iommu_mode
vfio: selftests: Allow passing multiple BDFs on the command line
vfio: selftests: Split run.sh into separate scripts
vfio: selftests: Move run.sh into scripts directory
vfio/nvgrace-gpu: wait for the GPU mem to be ready
vfio/nvgrace-gpu: Inform devmem unmapped after reset
...
|
|
|
|
b687034b1a |
slab updates for 6.19
-----BEGIN PGP SIGNATURE-----
iQFPBAABCAA5FiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmksibgbFIAAAAAABAAO
bWFudTIsMi41KzEuMTEsMiwyAAoJELvgsHXSRYiavR8H/jTNKlb8jZtre1Q2xIGJ
PgU8+fc4PGX8C6XuKRgb4KYL+zn3VSnTyxLUc3ObKIRTrGOJOBw3YT8R0LvrMOJx
Ibx/6o0o+vjnDxmq6QGcuYdytDdL/rL6Gh8PR1dyWAqPz6jGtraP0nCJu7Y9jRZ0
JHbyRTfpC8I6fTZv/WHocTsUDUu/+M4jQx3kMAMgSSTc7IAF+El5GqhpwEaWv7u/
6D0px1lXI3rGimzmHeLy+CEjW041MTkxPH3GNzkiZwi2WUwI+ZEteMcs29KHcCOe
/sdqmlzn2CPxzqG3TkJ4LbJE3XThYkqxe56LmBVJnhHFe+vCF8urEX9UUTtMn1dh
3zs=
=iQ4N
-----END PGP SIGNATURE-----
Merge tag 'slab-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
- mempool_alloc_bulk() support for upcoming users in the block layer
that need to allocate multiple objects at once with the mempool's
guaranteed progress semantics, which is not achievable with an
allocation single objects in a loop. Along with refactoring and
various improvements (Christoph Hellwig)
- Preparations for the upcoming separation of struct slab from struct
page, mostly by removing the struct folio layer, as the purpose of
struct folio has shifted since it became used in slab code (Matthew
Wilcox)
- Modernisation of slab's boot param API usage, which removes some
unexpected parsing corner cases (Petr Tesarik)
- Refactoring of freelist_aba_t (now struct freelist_counters) and
associated functions for double cmpxchg, enabled by -fms-extensions
(Vlastimil Babka)
- Cleanups and improvements related to sheaves caching layer, that were
part of the full conversion to sheaves, which is planned for the next
release (Vlastimil Babka)
* tag 'slab-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (42 commits)
slab: Remove unnecessary call to compound_head() in alloc_from_pcs()
mempool: clarify behavior of mempool_alloc_preallocated()
mempool: drop the file name in the top of file comment
mempool: de-typedef
mempool: remove mempool_{init,create}_kvmalloc_pool
mempool: legitimize the io_schedule_timeout in mempool_alloc_from_pool
mempool: add mempool_{alloc,free}_bulk
mempool: factor out a mempool_alloc_from_pool helper
slab: Remove references to folios from virt_to_slab()
kasan: Remove references to folio in __kasan_mempool_poison_object()
memcg: Convert mem_cgroup_from_obj_folio() to mem_cgroup_from_obj_slab()
mempool: factor out a mempool_adjust_gfp helper
mempool: add error injection support
mempool: improve kerneldoc comments
mm: improve kerneldoc comments for __alloc_pages_bulk
fault-inject: make enum fault_flags available unconditionally
usercopy: Remove folio references from check_heap_object()
slab: Remove folio references from kfree_nolock()
slab: Remove folio references from kfree_rcu_sheaf()
slab: Remove folio references from build_detached_freelist()
...
|
|
|
|
51e3b98d73 |
selinux/stable-6.19 PR 20251201
-----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmkuAKEUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPKeA/8DSW+sTkQ9BMGGnyuH1uU/r84qtVh Ft6pnIPzrogE/GKcQeFgFA9D7gQbB8J39PSxZLS3lp0UiuPCuq+D09L+uzDKzDCD Avfe84dwsI5OiplPKyHiG3bF9W2+A1zkwH2j+5uC6yF8v9J9vglo4u5vAYeE2wxA X4b2r9jMm7WJ/KFNiSiiLGEhOSjVVUrJULcmWMRPPruplPDC4dLnqYTWTbkrfF8h /oXv/+ssqbj6FqfL4WaRnjN8GgZcwaWy1qu9LVlZ40iphpbVAyPBJPLJS6Q4hhOl mMHUbYkxALPyW7riQxoXAegQjJyGgKn8Bli9U6bkiKFA2yeIhJFX+OyV1SlOAs/J g6s5XfeCzqY0Tw3eqvT1YRhp10GcA7EtBYvhAe5ARq7PkMoqxmiI587piVX9hbos a0AH9CDNoOw+8QXx27sOoD1YIaiYD9fikXKymrzRRaW/GX6i43XIKiELBMuKoIVZ iwualvQiGBLLczzm5rdqPcLgp09Agn4AHfvFWXKFgS4+IJGKjeeXNOjsp9oFEivq RnXmDpa+nBud5zeTSeSpOY2L0pvuIG5N25N6U9bTsDe+4Y6p0qIAUy8e4sQ0PA8P xyp9/fcNr9jwHeLTjDbxZqZ+MU3GLIIVPdl0zq4z2J8nhkW3wD3pQX6B4qPIuXLx YP3nwhAT9T+hU7w= =IvVa -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Improve the granularity of SELinux labeling for memfd files Currently when creating a memfd file, SELinux treats it the same as any other tmpfs, or hugetlbfs, file. While simple, the drawback is that it is not possible to differentiate between memfd and tmpfs files. This adds a call to the security_inode_init_security_anon() LSM hook and wires up SELinux to provide a set of memfd specific access controls, including the ability to control the execution of memfds. As usual, the commit message has more information. - Improve the SELinux AVC lookup performance Adopt MurmurHash3 for the SELinux AVC hash function instead of the custom hash function currently used. MurmurHash3 is already used for the SELinux access vector table so the impact to the code is minimal, and performance tests have shown improvements in both hash distribution and latency. See the commit message for the performance measurments. - Introduce a Kconfig option for the SELinux AVC bucket/slot size While we have the ability to grow the number of AVC hash buckets today, the size of the buckets (slot size) is fixed at 512. This pull request makes that slot size configurable at build time through a new Kconfig knob, CONFIG_SECURITY_SELINUX_AVC_HASH_BITS. * tag 'selinux-pr-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: improve bucket distribution uniformity of avc_hash() selinux: Move avtab_hash() to a shared location for future reuse selinux: Introduce a new config to make avc cache slot size adjustable memfd,selinux: call security_inode_init_security_anon() |
|
|
|
44fc84337b |
arm64 updates for 6.19:
Core features:
- Basic Arm MPAM (Memory system resource Partitioning And Monitoring)
driver under drivers/resctrl/ which makes use of the fs/rectrl/ API
Perf and PMU:
- Avoid cycle counter on multi-threaded CPUs
- Extend CSPMU device probing and add additional filtering support for
NVIDIA implementations
- Add support for the PMUs on the NoC S3 interconnect
- Add additional compatible strings for new Cortex and C1 CPUs
- Add support for data source filtering to the SPE driver
- Add support for i.MX8QM and "DB" PMU in the imx PMU driver
Memory managemennt:
- Avoid broadcast TLBI if page reused in write fault
- Elide TLB invalidation if the old PTE was not valid
- Drop redundant cpu_set_*_tcr_t0sz() macros
- Propagate pgtable_alloc() errors outside of __create_pgd_mapping()
- Propagate return value from __change_memory_common()
ACPI and EFI:
- Call EFI runtime services without disabling preemption
- Remove unused ACPI function
Miscellaneous:
- ptrace support to disable streaming on SME-only systems
- Improve sysreg generation to include a 'Prefix' descriptor
- Replace __ASSEMBLY__ with __ASSEMBLER__
- Align register dumps in the kselftest zt-test
- Remove some no longer used macros/functions
- Various spelling corrections
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmkvMjkACgkQa9axLQDI
XvGaGg//dtT/ZAqrWa6Yniv1LOlh837C07YdxAYTTuJ+I87DnrxIqjwbW+ye+bF+
61RTkioeCUm3PH+ncO9gPVNi4ASZ1db3/Rc8Fb6rr1TYOI1sMIeBsbbVdRJgsbX6
zu9197jOBHscTAeDceB6jZBDyW8iSLINPZ7LN6lGxXsZM/Vn5zfE0heKEEio6Fsx
+AzO2vos0XcwBR9vFGXtiCDx57T+/cXUtrWfA0Cjz4nvHSgD8+ghS+Jwv+kHMt1L
zrarqbeQfj+Iixm9PVHiazv+8THo9QdNl1yGLxDmJ4LEVPewjW5jBs8+5e8e3/Gj
p5JEvmSyWvKTTbFoM5vhxC72A7yuT1QwAk2iCyFIxMbQ25PndHboKVp/569DzOkT
+6CjI88sVSP6D7bVlN6pFlzc/Fa07YagnDMnMCSfk4LBjUfE3jYb+usaFydyv/rl
jwZbJrnSF/H+uQlyoJFgOEXSoQdDsll3dv6yEsUCwbd8RqXbAe3svbguOUHSdvIj
sCViezGZQ7Rkn6D21AfF9j6e7ceaSDaf5DWMxPI3dAxFKG8TJbCBsToR59NnoSj+
bNEozbZ1mCxmwH8i43wZ6P0RkClvJnoXcvRA+TJj02fSZACO39d3XDNswfXWL41r
KiWGUJZyn2lPKtiAWVX6pSBtDJ+5rFhuoFgADLX6trkxDe9/EMQ=
=4Sb6
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"These are the arm64 updates for 6.19.
The biggest part is the Arm MPAM driver under drivers/resctrl/.
There's a patch touching mm/ to handle spurious faults for huge pmd
(similar to the pte version). The corresponding arm64 part allows us
to avoid the TLB maintenance if a (huge) page is reused after a write
fault. There's EFI refactoring to allow runtime services with
preemption enabled and the rest is the usual perf/PMU updates and
several cleanups/typos.
Summary:
Core features:
- Basic Arm MPAM (Memory system resource Partitioning And Monitoring)
driver under drivers/resctrl/ which makes use of the fs/rectrl/ API
Perf and PMU:
- Avoid cycle counter on multi-threaded CPUs
- Extend CSPMU device probing and add additional filtering support
for NVIDIA implementations
- Add support for the PMUs on the NoC S3 interconnect
- Add additional compatible strings for new Cortex and C1 CPUs
- Add support for data source filtering to the SPE driver
- Add support for i.MX8QM and "DB" PMU in the imx PMU driver
Memory managemennt:
- Avoid broadcast TLBI if page reused in write fault
- Elide TLB invalidation if the old PTE was not valid
- Drop redundant cpu_set_*_tcr_t0sz() macros
- Propagate pgtable_alloc() errors outside of __create_pgd_mapping()
- Propagate return value from __change_memory_common()
ACPI and EFI:
- Call EFI runtime services without disabling preemption
- Remove unused ACPI function
Miscellaneous:
- ptrace support to disable streaming on SME-only systems
- Improve sysreg generation to include a 'Prefix' descriptor
- Replace __ASSEMBLY__ with __ASSEMBLER__
- Align register dumps in the kselftest zt-test
- Remove some no longer used macros/functions
- Various spelling corrections"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits)
arm64/mm: Document why linear map split failure upon vm_reset_perms is not problematic
arm64/pageattr: Propagate return value from __change_memory_common
arm64/sysreg: Remove unused define ARM64_FEATURE_FIELD_BITS
KVM: arm64: selftests: Consider all 7 possible levels of cache
KVM: arm64: selftests: Remove ARM64_FEATURE_FIELD_BITS and its last user
arm64: atomics: lse: Remove unused parameters from ATOMIC_FETCH_OP_AND macros
Documentation/arm64: Fix the typo of register names
ACPI: GTDT: Get rid of acpi_arch_timer_mem_init()
perf: arm_spe: Add support for filtering on data source
perf: Add perf_event_attr::config4
perf/imx_ddr: Add support for PMU in DB (system interconnects)
perf/imx_ddr: Get and enable optional clks
perf/imx_ddr: Move ida_alloc() from ddr_perf_init() to ddr_perf_probe()
dt-bindings: perf: fsl-imx-ddr: Add compatible string for i.MX8QM, i.MX8QXP and i.MX8DXL
arm64: remove duplicate ARCH_HAS_MEM_ENCRYPT
arm64: mm: use untagged address to calculate page index
MAINTAINERS: new entry for MPAM Driver
arm_mpam: Add kunit tests for props_mismatch()
arm_mpam: Add kunit test for bitmap reset
arm_mpam: Add helper to reset saved mbwu state
...
|
|
|
|
2547f79b0b |
s390 updates for 6.19 merge window
- Provide a new interface for dynamic configuration and deconfiguration of hotplug memory, allowing with and without memmap_on_memory support. This makes the way memory hotplug is handled on s390 much more similar to other architectures - Remove compat support. There shouldn't be any compat user space around anymore, therefore get rid of a lot of code which also doesn't need to be tested anymore - Add stackprotector support. GCC 16 will get new compiler options, which allow to generate code required for kernel stackprotector support - Merge pai_crypto and pai_ext PMU drivers into a new driver. This removes a lot of duplicated code. The new driver is also extendable and allows to support new PMUs - Add driver override support for AP queues - Rework and extend zcrypt and AP trace events to allow for tracing of crypto requests - Support block sizes larger than 65535 bytes for CCW tape devices - Since the rework of the virtual kernel address space the module area and the kernel image are within the same 4GB area. This eliminates the need of weak per cpu variables. Get rid of ARCH_MODULE_NEEDS_WEAK_PER_CPU - Various other small improvements and fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmktZioACgkQIg7DeRsp bsK4Rw//VzkvHyzOtGKZ8Hb4S+Sh/PFlaZQXNhj+Xt5gWoOhP1uPmmhBe6LxjYaB J9Ns3hpONQ1dTHV7VVkds8FvM/SBcGe8m5RpefmChC/bjm5UEOV/MppKtA0aLnEH hJmdubIrrRAXKggxlHEfRLzBsFvV/rJ9Xf16FhRxGDc4pgmgkI1NPQ41/dyCHklQ dB3YrFVPIETywVYYVB/G3h11JgF5Z6CKtjYCdSx72Fkbj65+6JPfcPgLKMpcJuPd UxUXtCo1FCXlP70jsz8JQI8cdieG0KDQTtnZP4P/pqjQ3wirOqvMewNa9t9xmQ2e p6Rc1Vx5DESkq9bRWtQEaprTVVzK7DDLH3RuZwB+uLrcLGD8JvVS6/m9n9CgzBMT BnJXG2sLZH+gdQy+DSD/fVDD7OvIk8TGrH+OFwVIKhrT/J3B2E7ZSYyZZCNIS7VG yiuypoDGYg3ZpYjH9+qOXWB3nc0vQWrlFzb1bsQu1omJGmunLv4jtTjAKGN82C33 auBsIYAlQW20X7DV0vZa59PwqwtBqtdQQcTidwtSztzKogRXAdK8KKHtN60JM4S2 7sWFOFCQaTChAeDNw6MF5EtULb551nwH2RtJ9x3CrJj+OGK6clbQNcxIA7Oy0veR Sl9v1lMfeKOgDrPdDy3ArQBJ8WLlF9qX9wLKbiaNyIKmkz2ymkg= =CNrb -----END PGP SIGNATURE----- Merge tag 's390-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: - Provide a new interface for dynamic configuration and deconfiguration of hotplug memory, allowing with and without memmap_on_memory support. This makes the way memory hotplug is handled on s390 much more similar to other architectures - Remove compat support. There shouldn't be any compat user space around anymore, therefore get rid of a lot of code which also doesn't need to be tested anymore - Add stackprotector support. GCC 16 will get new compiler options, which allow to generate code required for kernel stackprotector support - Merge pai_crypto and pai_ext PMU drivers into a new driver. This removes a lot of duplicated code. The new driver is also extendable and allows to support new PMUs - Add driver override support for AP queues - Rework and extend zcrypt and AP trace events to allow for tracing of crypto requests - Support block sizes larger than 65535 bytes for CCW tape devices - Since the rework of the virtual kernel address space the module area and the kernel image are within the same 4GB area. This eliminates the need of weak per cpu variables. Get rid of ARCH_MODULE_NEEDS_WEAK_PER_CPU - Various other small improvements and fixes * tag 's390-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (92 commits) watchdog: diag288_wdt: Remove KMSG_COMPONENT macro s390/entry: Use lay instead of aghik s390/vdso: Get rid of -m64 flag handling s390/vdso: Rename vdso64 to vdso s390: Rename head64.S to head.S s390/vdso: Use common STABS_DEBUG and DWARF_DEBUG macros s390: Add stackprotector support s390/modules: Simplify module_finalize() slightly s390: Remove KMSG_COMPONENT macro s390/percpu: Get rid of ARCH_MODULE_NEEDS_WEAK_PER_CPU s390/ap: Restrict driver_override versus apmask and aqmask use s390/ap: Rename mutex ap_perms_mutex to ap_attr_mutex s390/ap: Support driver_override for AP queue devices s390/ap: Use all-bits-one apmask/aqmask for vfio in_use() checks s390/debug: Update description of resize operation s390/syscalls: Switch to generic system call table generation s390/syscalls: Remove system call table pointer from thread_struct s390/uapi: Remove 31 bit support from uapi header files s390: Remove compat support tools: Remove s390 compat support ... |
|
|
|
1b5dd29869 |
vfs-6.19-rc1.fd_prepare.fs
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZwAKCRCRxhvAZXjc
op0AAP4oNVJkFyvgKoPos5K2EGFB8M7merGhpYtsOoeg8UK6OwD/UySQErHsXQDR
sUDDa5uFOhfrkcfM8REtAN4wF8p5qAc=
=QgFD
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.19-rc1.fd_prepare.fs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull fd prepare updates from Christian Brauner:
"This adds the FD_ADD() and FD_PREPARE() primitive. They simplify the
common pattern of get_unused_fd_flags() + create file + fd_install()
that is used extensively throughout the kernel and currently requires
cumbersome cleanup paths.
FD_ADD() - For simple cases where a file is installed immediately:
fd = FD_ADD(O_CLOEXEC, vfio_device_open_file(device));
if (fd < 0)
vfio_device_put_registration(device);
return fd;
FD_PREPARE() - For cases requiring access to the fd or file, or
additional work before publishing:
FD_PREPARE(fdf, O_CLOEXEC, sync_file->file);
if (fdf.err) {
fput(sync_file->file);
return fdf.err;
}
data.fence = fd_prepare_fd(fdf);
if (copy_to_user((void __user *)arg, &data, sizeof(data)))
return -EFAULT;
return fd_publish(fdf);
The primitives are centered around struct fd_prepare. FD_PREPARE()
encapsulates all allocation and cleanup logic and must be followed by
a call to fd_publish() which associates the fd with the file and
installs it into the caller's fdtable. If fd_publish() isn't called,
both are deallocated automatically. FD_ADD() is a shorthand that does
fd_publish() immediately and never exposes the struct to the caller.
I've implemented this in a way that it's compatible with the cleanup
infrastructure while also being usable separately. IOW, it's centered
around struct fd_prepare which is aliased to class_fd_prepare_t and so
we can make use of all the basica guard infrastructure"
* tag 'vfs-6.19-rc1.fd_prepare.fs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (42 commits)
io_uring: convert io_create_mock_file() to FD_PREPARE()
file: convert replace_fd() to FD_PREPARE()
vfio: convert vfio_group_ioctl_get_device_fd() to FD_ADD()
tty: convert ptm_open_peer() to FD_ADD()
ntsync: convert ntsync_obj_get_fd() to FD_PREPARE()
media: convert media_request_alloc() to FD_PREPARE()
hv: convert mshv_ioctl_create_partition() to FD_ADD()
gpio: convert linehandle_create() to FD_PREPARE()
pseries: port papr_rtas_setup_file_interface() to FD_ADD()
pseries: convert papr_platform_dump_create_handle() to FD_ADD()
spufs: convert spufs_gang_open() to FD_PREPARE()
papr-hvpipe: convert papr_hvpipe_dev_create_handle() to FD_PREPARE()
spufs: convert spufs_context_open() to FD_PREPARE()
net/socket: convert __sys_accept4_file() to FD_ADD()
net/socket: convert sock_map_fd() to FD_ADD()
net/kcm: convert kcm_ioctl() to FD_PREPARE()
net/handshake: convert handshake_nl_accept_doit() to FD_PREPARE()
secretmem: convert memfd_secret() to FD_ADD()
memfd: convert memfd_create() to FD_ADD()
bpf: convert bpf_token_create() to FD_PREPARE()
...
|
|
|
|
f2e74ecfba |
vfs-6.19-rc1.folio
-----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZQAKCRCRxhvAZXjc onGBAQDtqeO0jZzS7q9UxlJ84Wj/H9w+9INpO4jMxtWK4svhUAEAghG4qVxRvkE2 Qh+wrpTPIC7OCQ78k8psDRmkj9cn8QA= =FCVN -----END PGP SIGNATURE----- Merge tag 'vfs-6.19-rc1.folio' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull folio updates from Christian Brauner: "Add a new folio_next_pos() helper function that returns the file position of the first byte after the current folio. This is a common operation in filesystems when needing to know the end of the current folio. The helper is lifted from btrfs which already had its own version, and is now used across multiple filesystems and subsystems: - btrfs - buffer - ext4 - f2fs - gfs2 - iomap - netfs - xfs - mm This fixes a long-standing bug in ocfs2 on 32-bit systems with files larger than 2GiB. Presumably this is not a common configuration, but the fix is backported anyway. The other filesystems did not have bugs, they were just mildly inefficient. This also introduce uoff_t as the unsigned version of loff_t. A recent commit inadvertently changed a comparison from being unsigned (on 64-bit systems) to being signed (which it had always been on 32-bit systems), leading to sporadic fstests failures. Generally file sizes are restricted to being a signed integer, but in places where -1 is passed to indicate "up to the end of the file", it is convenient to have an unsigned type to ensure comparisons are always unsigned regardless of architecture" * tag 'vfs-6.19-rc1.folio' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: Add uoff_t mm: Use folio_next_pos() xfs: Use folio_next_pos() netfs: Use folio_next_pos() iomap: Use folio_next_pos() gfs2: Use folio_next_pos() f2fs: Use folio_next_pos() ext4: Use folio_next_pos() buffer: Use folio_next_pos() btrfs: Use folio_next_pos() filemap: Add folio_next_pos() |
|
|
|
ebaeabfa5a |
vfs-6.19-rc1.writeback
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZQAKCRCRxhvAZXjc
or4UAP9FbpFsZd0DpsYnKuv7kFepl291PuR0x2dKmseJ/wcf8AEAzI8FR5wd/fey
25ZNdExoUojAOj5wVn+jUep3u54jBws=
=/toi
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.19-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull writeback updates from Christian Brauner:
"Features:
- Allow file systems to increase the minimum writeback chunk size.
The relatively low minimal writeback size of 4MiB means that
written back inodes on rotational media are switched a lot. Besides
introducing additional seeks, this also can lead to extreme file
fragmentation on zoned devices when a lot of files are cached
relative to the available writeback bandwidth.
This adds a superblock field that allows the file system to
override the default size, and sets it to the zone size for zoned
XFS.
- Add logging for slow writeback when it exceeds
sysctl_hung_task_timeout_secs. This helps identify tasks waiting
for a long time and pinpoint potential issues. Recording the
starting jiffies is also useful when debugging a crashed vmcore.
- Wake up waiting tasks when finishing the writeback of a chunk
Cleanups:
- filemap_* writeback interface cleanups.
Adding filemap_fdatawrite_wbc ended up being a mistake, as all but
the original btrfs caller should be using better high level
interfaces instead.
This series removes all these low-level interfaces, switches btrfs
to a more specific interface, and cleans up other too low-level
interfaces. With this the writeback_control that is passed to the
writeback code is only initialized in three places.
- Remove __filemap_fdatawrite, __filemap_fdatawrite_range, and
filemap_fdatawrite_wbc
- Add filemap_flush_nr helper for btrfs
- Push struct writeback_control into start_delalloc_inodes in btrfs
- Rename filemap_fdatawrite_range_kick to filemap_flush_range
- Stop opencoding filemap_fdatawrite_range in 9p, ocfs2, and mm
- Make wbc_to_tag() inline and use it in fs"
* tag 'vfs-6.19-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: Make wbc_to_tag() inline and use it in fs.
xfs: set s_min_writeback_pages for zoned file systems
writeback: allow the file system to override MIN_WRITEBACK_PAGES
writeback: cleanup writeback_chunk_size
mm: rename filemap_fdatawrite_range_kick to filemap_flush_range
mm: remove __filemap_fdatawrite_range
mm: remove filemap_fdatawrite_wbc
mm: remove __filemap_fdatawrite
mm,btrfs: add a filemap_flush_nr helper
btrfs: push struct writeback_control into start_delalloc_inodes
btrfs: use the local tmp_inode variable in start_delalloc_inodes
ocfs2: don't opencode filemap_fdatawrite_range in ocfs2_journal_submit_inode_data_buffers
9p: don't opencode filemap_fdatawrite_range in v9fs_mmap_vm_close
mm: don't opencode filemap_fdatawrite_range in filemap_invalidate_inode
writeback: Add logging for slow writeback (exceeds sysctl_hung_task_timeout_secs)
writeback: Wake up waiting tasks when finishing the writeback of a chunk.
|
|
|
|
9368f0f941 |
vfs-6.19-rc1.inode
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZAAKCRCRxhvAZXjc
omMSAP9GLhavxyWQ24Q+49CNWWRQWDY1wTOiUK2BwtIvZ0YEcAD8D1dAiMckL5pC
RwEAVA5p+y+qi+bZP0KXCBxQddoTIQM=
=zo/J
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.19-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs inode updates from Christian Brauner:
"Features:
- Hide inode->i_state behind accessors. Open-coded accesses prevent
asserting they are done correctly. One obvious aspect is locking,
but significantly more can be checked. For example it can be
detected when the code is clearing flags which are already missing,
or is setting flags when it is illegal (e.g., I_FREEING when
->i_count > 0)
- Provide accessors for ->i_state, converts all filesystems using
coccinelle and manual conversions (btrfs, ceph, smb, f2fs, gfs2,
overlayfs, nilfs2, xfs), and makes plain ->i_state access fail to
compile
- Rework I_NEW handling to operate without fences, simplifying the
code after the accessor infrastructure is in place
Cleanups:
- Move wait_on_inode() from writeback.h to fs.h
- Spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb
for clarity
- Cosmetic fixes to LRU handling
- Push list presence check into inode_io_list_del()
- Touch up predicts in __d_lookup_rcu()
- ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage
- Assert on ->i_count in iput_final()
- Assert ->i_lock held in __iget()
Fixes:
- Add missing fences to I_NEW handling"
* tag 'vfs-6.19-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (22 commits)
dcache: touch up predicts in __d_lookup_rcu()
fs: push list presence check into inode_io_list_del()
fs: cosmetic fixes to lru handling
fs: rework I_NEW handling to operate without fences
fs: make plain ->i_state access fail to compile
xfs: use the new ->i_state accessors
nilfs2: use the new ->i_state accessors
overlayfs: use the new ->i_state accessors
gfs2: use the new ->i_state accessors
f2fs: use the new ->i_state accessors
smb: use the new ->i_state accessors
ceph: use the new ->i_state accessors
btrfs: use the new ->i_state accessors
Manual conversion to use ->i_state accessors of all places not covered by coccinelle
Coccinelle-based conversion to use ->i_state accessors
fs: provide accessors for ->i_state
fs: spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb
fs: move wait_on_inode() from writeback.h to fs.h
fs: add missing fences to I_NEW handling
ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage
...
|
|
|
|
1885cdbfbb |
vfs-6.19-rc1.iomap
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZAAKCRCRxhvAZXjc
ooCXAQCwzX2GS/55QHV6JXBBoNxguuSQ5dCj91ZmTfHzij0xNAEAhKEBw7iMGX72
c2/x+xYf+Pc6mAfxdus5RLMggqBFPAk=
=jInB
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.19-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull iomap updates from Christian Brauner:
"FUSE iomap Support for Buffered Reads:
This adds iomap support for FUSE buffered reads and readahead. This
enables granular uptodate tracking with large folios so only
non-uptodate portions need to be read. Also fixes a race condition
with large folios + writeback cache that could cause data corruption
on partial writes followed by reads.
- Refactored iomap read/readahead bio logic into helpers
- Added caller-provided callbacks for read operations
- Moved buffered IO bio logic into new file
- FUSE now uses iomap for read_folio and readahead
Zero Range Folio Batch Support:
Add folio batch support for iomap_zero_range() to handle dirty
folios over unwritten mappings. Fix raciness issues where dirty data
could be lost during zero range operations.
- filemap_get_folios_tag_range() helper for dirty folio lookup
- Optional zero range dirty folio processing
- XFS fills dirty folios on zero range of unwritten mappings
- Removed old partial EOF zeroing optimization
DIO Write Completions from Interrupt Context:
Restore pre-iomap behavior where pure overwrite completions run
inline rather than being deferred to workqueue. Reduces context
switches for high-performance workloads like ScyllaDB.
- Removed unused IOCB_DIO_CALLER_COMP code
- Error completions always run in user context (fixes zonefs)
- Reworked REQ_FUA selection logic
- Inverted IOMAP_DIO_INLINE_COMP to IOMAP_DIO_OFFLOAD_COMP
Buffered IO Cleanups:
Some performance and code clarity improvements:
- Replace manual bitmap scanning with find_next_bit()
- Simplify read skip logic for writes
- Optimize pending async writeback accounting
- Better variable naming
- Documentation for iomap_finish_folio_write() requirements
Misaligned Vectors for Zoned XFS:
Enables sub-block aligned vectors in XFS always-COW mode for zoned
devices via new IOMAP_DIO_FSBLOCK_ALIGNED flag.
Bug Fixes:
- Allocate s_dio_done_wq for async reads (fixes syzbot report after
error completion changes)
- Fix iomap_read_end() for already uptodate folios (regression fix)"
* tag 'vfs-6.19-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (40 commits)
iomap: allocate s_dio_done_wq for async reads as well
iomap: fix iomap_read_end() for already uptodate folios
iomap: invert the polarity of IOMAP_DIO_INLINE_COMP
iomap: support write completions from interrupt context
iomap: rework REQ_FUA selection
iomap: always run error completions in user context
fs, iomap: remove IOCB_DIO_CALLER_COMP
iomap: use find_next_bit() for uptodate bitmap scanning
iomap: use find_next_bit() for dirty bitmap scanning
iomap: simplify when reads can be skipped for writes
iomap: simplify ->read_folio_range() error handling for reads
iomap: optimize pending async writeback accounting
docs: document iomap writeback's iomap_finish_folio_write() requirement
iomap: account for unaligned end offsets when truncating read range
iomap: rename bytes_pending/bytes_accounted to bytes_submitted/bytes_not_submitted
xfs: support sub-block aligned vectors in always COW mode
iomap: add IOMAP_DIO_FSBLOCK_ALIGNED flag
xfs: error tag to force zeroing on debug kernels
iomap: remove old partial eof zeroing optimization
xfs: fill dirty folios on zero range of unwritten mappings
...
|
|
|
|
910c361f9a
|
secretmem: convert memfd_secret() to FD_ADD()
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-26-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org> |
|
|
|
1afcbbe5d6
|
memfd: convert memfd_create() to FD_ADD()
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-25-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org> |
|
|
|
9eb220eddd |
8 hotfixes. 4 are cc:stable, 7 are against mm/.
All are singletons - please see the respective changelogs for details. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaSdaWwAKCRDdBJ7gKXxA jlRmAP92O8ez4IeVq1VVhVfGVf9Xbjd8zFTykQXuGM/3yDRoTgEA+kLDFTTaJ5Wb MGlgQAeFXqlfMZfBljhyzbV8Ubz1BAc= =4UwS -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2025-11-26-11-51' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "8 hotfixes. 4 are cc:stable, 7 are against mm/. All are singletons - please see the respective changelogs for details" * tag 'mm-hotfixes-stable-2025-11-26-11-51' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/filemap: fix logic around SIGBUS in filemap_map_pages() mm/huge_memory: fix NULL pointer deference when splitting folio MAINTAINERS: add test_kho to KHO's entry mailmap: add entry for Sam Protsenko selftests/mm: fix division-by-zero in uffd-unit-tests mm/mmap_lock: reset maple state on lock_vma_under_rcu() retry mm/memfd: fix information leak in hugetlb folios mm: swap: remove duplicate nr_swap_pages decrement in get_swap_page_of_type() |
|
|
|
a8ec08bf32 |
Merge branch 'slab/for-6.19/mempool_alloc_bulk' into slab/for-next
Merges series "mempool_alloc_bulk and various mempool improvements v3" from Christoph Hellwig. From the cover letter [1]: This series adds a bulk version of mempool_alloc that makes allocating multiple objects deadlock safe. The initial users is the blk-crypto-fallback code: https://lore.kernel.org/linux-block/20251031093517.1603379-1-hch@lst.de/ with which v1 was posted, but I also have a few other users in mind. Link: https://lore.kernel.org/all/20251113084022.1255121-1-hch@lst.de/ [1] |
|
|
|
ed80cc758b |
Merge branch 'slab/for-6.19/freelist_aba_t_cleanups' into slab/for-next
Merge series "slab: cmpxchg cleanups enabled by -fms-extensions" From the cover letter [1]: After learning about -fms-extensions being enabled for 6.19, I realized there is some cleanup potential in slub code by extending the definition and usage of freelist_aba_t, as it can now become an unnamed member of struct slab. This series performs the cleanup, with no functional changes intended. Additionally we turn freelist_aba_t to struct freelist_counters as it doesn't meet any criteria for being a typedef, per Documentation/process/coding-style.rst Based on the tag kbuild-ms-extensions-6.19 from git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linuxV Link: https://lore.kernel.org/all/20251107-slab-fms-cleanup-v1-0-650b1491ac9e@suse.cz/#t [1] |
|
|
|
e5d7764e13 |
Merge branch 'slab/for-6.19/memdesc_prep' into slab/for-next
Merge series "Prepare slab for memdescs" by Matthew Wilcox. From the cover letter [1]: When we separate struct folio, struct page and struct slab from each other, converting to folios then to slabs will be nonsense. It made sense under the 'folio is just a head page' interpretation, but with full separation, page_folio() will return NULL for a page which belongs to a slab. This patch series removes almost all mentions of folio from slab. There are a few folio_test_slab() invocations left around the tree that I haven't decided how to handle yet. We're not yet quite at the point of separately allocating struct slab, but that's what I'll be working on next. Link: https://lore.kernel.org/all/20251113000932.1589073-1-willy@infradead.org/ [1] |
|
|
|
3065c20d5d |
Merge branch 'slab/for-6.19/sheaves_cleanups' into slab/for-next
Merge series "slab: preparatory cleanups before adding sheaves to all caches" [1] Cleanups that were written as part of the full sheaves conversion, which is not fully ready yet, but they are useful on their own. Link: https://lore.kernel.org/all/20251105-sheaves-cleanups-v1-0-b8218e1ac7ef@suse.cz/ [1] |
|
|
|
b55590558f |
slab: Remove unnecessary call to compound_head() in alloc_from_pcs()
Each page knows which node it belongs to, so there's no need to convert to a folio. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Link: https://patch.msgid.link/20251124142329.1691780-1-willy@infradead.org Signed-off-by: Vlastimil Babka <vbabka@suse.cz> |
|
|
|
4c6b40877b
|
fs: cosmetic fixes to lru handling
1. inode_bit_waitqueue() was somehow placed between __inode_add_lru() and inode_add_lru(). move it up 2. assert ->i_lock is held in __inode_add_lru instead of just claiming it is needed 3. s/__inode_add_lru/__inode_lru_list_add/ for consistency with itself (inode_lru_list_del()) and similar routines for sb and io list management 4. push list presence check into inode_lru_list_del(), just like sb and io list Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://patch.msgid.link/20251029131428.654761-2-mjguzik@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org> |
|
|
|
37d369fa97
|
fs: Add uoff_t
In a recent commit, I inadvertently changed a comparison from being an unsigned comparison (on 64-bit systems) to being a signed comparison (which it had always been on 32-bit systems). This led to a sporadic fstests failure. To make sure this comparison is always unsigned, introduce a new type, uoff_t which is the unsigned version of loff_t. Generally file sizes are restricted to being a signed integer, but in these two places it is convenient to pass -1 to indicate "up to the end of the file". Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://patch.msgid.link/20251123220518.1447261-1-willy@infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org> |
|
|
|
7c9580f44f |
mm/filemap: fix logic around SIGBUS in filemap_map_pages()
Chris noticed that filemap_map_pages() calculates can_map_large only once
for the first page in the fault around range. The value is not valid for
the following pages in the range and must be recalculated.
Instead of recalculating can_map_large on each iteration, pass down
file_end to filemap_map_folio_range() and let it make the decision on what
can be mapped.
Link: https://lkml.kernel.org/r/20251120161411.859078-1-kirill@shutemov.name
Fixes:
|
|
|
|
cff47b9e39 |
mm/huge_memory: fix NULL pointer deference when splitting folio
Commit |
|
|
|
270065f514 |
mm/mmap_lock: reset maple state on lock_vma_under_rcu() retry
The retry in lock_vma_under_rcu() drops the rcu read lock before reacquiring the lock and trying again. This may cause a use-after-free if the maple node the maple state was using was freed. The maple state is protected by the rcu read lock. When the lock is dropped, the state cannot be reused as it tracks pointers to objects that may be freed during the time where the lock was not held. Any time the rcu read lock is dropped, the maple state must be invalidated. Resetting the address and state to MA_START is the safest course of action, which will result in the next operation starting from the top of the tree. Prior to commit |
|
|
|
de8798965f |
mm/memfd: fix information leak in hugetlb folios
When allocating hugetlb folios for memfd, three initialization steps are
missing:
1. Folios are not zeroed, leading to kernel memory disclosure to userspace
2. Folios are not marked uptodate before adding to page cache
3. hugetlb_fault_mutex is not taken before hugetlb_add_to_page_cache()
The memfd allocation path bypasses the normal page fault handler
(hugetlb_no_page) which would handle all of these initialization steps.
This is problematic especially for udmabuf use cases where folios are
pinned and directly accessed by userspace via DMA.
Fix by matching the initialization pattern used in hugetlb_no_page():
- Zero the folio using folio_zero_user() which is optimized for huge pages
- Mark it uptodate with folio_mark_uptodate()
- Take hugetlb_fault_mutex before adding to page cache to prevent races
The folio_zero_user() change also fixes a potential security issue where
uninitialized kernel memory could be disclosed to userspace through read()
or mmap() operations on the memfd.
Link: https://lkml.kernel.org/r/20251112145034.2320452-1-kartikey406@gmail.com
Fixes:
|
|
|
|
f5e31a196e |
mm: swap: remove duplicate nr_swap_pages decrement in get_swap_page_of_type()
After commit |
|
|
|
4823329146 |
mempool: clarify behavior of mempool_alloc_preallocated()
The documentation of that function promises to never sleep. However on PREEMPT_RT a spinlock_t might in fact sleep. Reword the documentation so users can predict its behavior better. mempool could also replace spinlock_t with raw_spinlock_t which doesn't sleep even on PREEMPT_RT but that would take away the improved preemptibility of sleeping locks. Link: https://lkml.kernel.org/r/20251014-mempool-doc-v1-1-bc9ebf169700@linutronix.de Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Christoph Lameter <cl@gentwo.org> Cc: David Rientjes <rientjes@google.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> |
|
|
|
07723a41ee |
mempool: drop the file name in the top of file comment
Mentioning the name of the file is redundant, so drop it. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251113084022.1255121-12-hch@lst.de Signed-off-by: Vlastimil Babka <vbabka@suse.cz> |
|
|
|
0cab6873b7 |
mempool: de-typedef
Switch all uses of the deprecated mempool_t typedef in the core mempool code to use struct mempool instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251113084022.1255121-11-hch@lst.de Signed-off-by: Vlastimil Babka <vbabka@suse.cz> |
|
|
|
8b41fb80a2 |
mempool: remove mempool_{init,create}_kvmalloc_pool
This was added for bcachefs and is unused now. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251113084022.1255121-10-hch@lst.de Signed-off-by: Vlastimil Babka <vbabka@suse.cz> |
|
|
|
9c4391767f |
mempool: legitimize the io_schedule_timeout in mempool_alloc_from_pool
The timeout here is and old workaround with a Fixme comment. But thinking about it, it makes sense to keep it, so reword the comment. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251113084022.1255121-9-hch@lst.de Signed-off-by: Vlastimil Babka <vbabka@suse.cz> |
|
|
|
ac529d86ad |
mempool: add mempool_{alloc,free}_bulk
Add a version of the mempool allocator that works for batch allocations of multiple objects. Calling mempool_alloc in a loop is not safe because it could deadlock if multiple threads are performing such an allocation at the same time. As an extra benefit the interface is build so that the same array can be used for alloc_pages_bulk / release_pages so that at least for page backed mempools the fast path can use a nice batch optimization. Note that mempool_alloc_bulk does not take a gfp_mask argument as it must always be able to sleep and doesn't support any non-trivial modifiers. NOFO or NOIO constrainst must be set through the scoped API. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251113084022.1255121-8-hch@lst.de Signed-off-by: Vlastimil Babka <vbabka@suse.cz> |
|
|
|
1742d97df6 |
mempool: factor out a mempool_alloc_from_pool helper
Add a helper for the mempool_alloc slowpath to better separate it from the fast path, and also use it to implement mempool_alloc_preallocated which shares the same logic. [hughd@google.com: fix lack of retrying with __GFP_DIRECT_RECLAIM] [vbabka@suse.cz: really use limited flags for first mempool attempt] Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251113084022.1255121-7-hch@lst.de Signed-off-by: Vlastimil Babka <vbabka@suse.cz> |
|
|
|
d4504262f7 |
PCI/P2PDMA: Simplify bus address mapping API
Update the pci_p2pdma_bus_addr_map() function to take a direct pointer to the p2pdma_provider structure instead of the pci_p2pdma_map_state. This simplifies the API by removing the need for callers to extract the provider from the state structure. The change updates all callers across the kernel (block layer, IOMMU, DMA direct, and HMM) to pass the provider pointer directly, making the code more explicit and reducing unnecessary indirection. This also removes the runtime warning check since callers now have direct control over which provider they use. Tested-by: Alex Mastro <amastro@fb.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Ankit Agrawal <ankita@nvidia.com> Link: https://lore.kernel.org/r/20251120-dmabuf-vfio-v9-2-d7f71607f371@nvidia.com Signed-off-by: Alex Williamson <alex@shazbot.org> |
|
|
|
c966813ea1 |
slab fix for 6.18-rc7
-----BEGIN PGP SIGNATURE----- iQFPBAABCAA5FiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmkfYFEbFIAAAAAABAAO bWFudTIsMi41KzEuMTEsMiwyAAoJELvgsHXSRYiaLtEIAKDAXWXlwCO3CVDb5USk t1KFTpmQ3UCPDd48eTRbeC6D0B3Jh7+JOa/f96yxQrQOABm2P3xuU30HcqLsWbfV D08MB2u/eyjcghBAqZ95WzdKUcMdzx90qlCnwUE/tDbcFhEc3FutPjqRUQ2iJcyu dk3K6yNl+LiQiw+BVLu+WgQD1fFStuSQ4H3oDHSL2ep0C+vv6jVBEKoiybHFvexQ okrVVBgL7RlPbz+n6t4bsaR64jGa+P9DeiGDGU9gM+kOdP5dYKcmOq0q5dliNOVw 6Bnf9T+zykU6NdQZjwx32eZZoocCg+DT2K+NDvLqeg8PsAGktwQWYDEKnM0yoRMk 7nY= =Lmv8 -----END PGP SIGNATURE----- Merge tag 'slab-for-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fix from Vlastimil Babka: - Fix mempool poisoning order>0 pages with CONFIG_HIGHMEM (Vlastimil Babka) * tag 'slab-for-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm/mempool: fix poisoning order>0 pages with HIGHMEM |
|
|
|
2df79e4d72 |
memblock: fix memblock_estimated_nr_free_pages() for soft-reserved memory
The "soft-reserved" memory regions (EFI_MEMORY_SP) are added to the memblock.reserved, but not to the memblock.memory. It causes memblock_estimated_nr_free_pages() to return a value smaller value than expected, or if it underflows, an extremely large value. Calculate the number of estimated free pages using memblock_reserved_kern_size() instead of memblock_reserved_size() to fix the issue. -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmkda44QHHJwcHRAa2Vy bmVsLm9yZwAKCRA5A4Ymyw79kWR1B/0XkJdjP2gH7fxnAZc2h2f3zsRQP/70Hcgr xJy7UE7+2e6KWLzl8vcI4Oyr+7cRbtAa6AYfk2HTcIP+M2Af34kzVgLZceuAW/zr bpyaNV7t23CcQwtY+6etGM2Nlzw6lTi/BF+EAS+rcgx5lrKJ0wpACm/1tplU3nJB DKfumkJgQt02tgwBByXB0SXUjcntiQ/uEWm27EJvD6YTDOprt9316G+7GPRPVaOy y0Se9dFqZ7xWP2sWWwYiSyS57fPgBSB7+XR8/bnsutib8GvA6AmYUaJdo5MavlYZ mz3ZHmvjb0acCDgvrV564RLp23lX29WPSvHFwrlyU4v1g5pDv4dt =tr0l -----END PGP SIGNATURE----- Merge tag 'fixes-2025-11-19' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock fix from Mike Rapoport: "Fix memblock_estimated_nr_free_pages() for soft-reserved memory The "soft-reserved" memory regions (EFI_MEMORY_SP) are added to the memblock.reserved, but not to the memblock.memory. It causes memblock_estimated_nr_free_pages() to return a value smaller value than expected, or if it underflows, an extremely large value. Calculate the number of estimated free pages using memblock_reserved_kern_size() instead of memblock_reserved_size() to fix the issue" * tag 'fixes-2025-11-19' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: memblock: fix memblock_estimated_nr_free_pages() for soft-reserved memory |
|
|
|
79301c7d60 |
mm: add spurious fault fixing support for huge pmd
The page faults may be spurious because of the racy access to the page table. For example, a non-populated virtual page is accessed on 2 CPUs simultaneously, thus the page faults are triggered on both CPUs. However, it's possible that one CPU (say CPU A) cannot find the reason for the page fault if the other CPU (say CPU B) has changed the page table before the PTE is checked on CPU A. Most of the time, the spurious page faults can be ignored safely. However, if the page fault is for the write access, it's possible that a stale read-only TLB entry exists in the local CPU and needs to be flushed on some architectures. This is called the spurious page fault fixing. In the current kernel, there is spurious fault fixing support for pte, but not for huge pmd because no architectures need it. But in the next patch in the series, we will change the write protection fault handling logic on arm64, so that some stale huge pmd entries may remain in the TLB. These entries need to be flushed via the huge pmd spurious fault fixing mechanism. Signed-off-by: Huang Ying <ying.huang@linux.alibaba.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Cc: Will Deacon <will@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <yang@os.amperecomputing.com> Cc: Christoph Lameter (Ampere) <cl@gentwo.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Barry Song <baohua@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Yin Fengwei <fengwei_yin@linux.alibaba.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
|
|
|
5bebe8de19 |
mm/huge_memory: Fix initialization of huge zero folio
The recent fix to properly initialize the tags of the huge zero folio
had an unfortunate not-so-subtle side effect: it caused the actual
*contents* of the huge zero folio to not be initialized at all when the
hardware didn't support the memory tagging.
The reason was the unfortunate semantics of tag_clear_highpage(): on
hardware that didn't do the tagging, it would silently just not do
anything at all. And since this is done only on arm64 with MTE support,
that basically meant most hardware.
It wasn't necessarily immediately obvious since the huge zero page isn't
necessarily very heavily used - or because it might already be zero
because all-zeroes is the most common pattern. But it ends up causing
random odd user space failures when you do hit it.
The unfortunate semantics have been around for a while, but became a
real bug only when we started actively using __GFP_ZEROTAGS in the
generic get_huge_zero_folio() function - before that, it had only ever
been used in code that checked that the hardware supported it.
Fix this by simply changing the semantics of tag_clear_highpage() to
return whether it actually successfully did something or not. While at
it, also make it initialize multiple pages in one go, since that's
actually what the only caller wants it to do and it simplifies the whole
logic.
Fixes:
|
|
|
|
e7c375b181 |
vfs-6.18-rc7.fixes
-----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaRtBJwAKCRCRxhvAZXjc ou5CAQCJb5y2ULKklblICU1wR7Nr15WvTW7VVOcv44RJ22S3NgEAy4DLDBFBw8zC 8e7Hp8gxbjsq8ZJmU088aobFcqbZOwk= =TAnu -----END PGP SIGNATURE----- Merge tag 'vfs-6.18-rc7.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix unitialized variable in statmount_string() - Fix hostfs mounting when passing host root during boot - Fix dynamic lookup to fail on cell lookup failure - Fix missing file type when reading bfs inodes from disk - Enforce checking of sb_min_blocksize() calls and update all callers accordingly - Restore write access before closing files opened by open_exec() in binfmt_misc - Always freeze efivarfs during suspend/hibernate cycles - Fix statmount()'s and listmount()'s grab_requested_mnt_ns() helper to actually allow mount namespace file descriptor in addition to mount namespace ids - Fix tmpfs remount when noswap is specified - Switch Landlock to iput_not_last() to remove false-positives from might_sleep() annotations in iput() - Remove dead node_to_mnt_ns() code - Ensure that per-queue kobjects are successfully created * tag 'vfs-6.18-rc7.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: landlock: fix splats from iput() after it started calling might_sleep() fs: add iput_not_last() shmem: fix tmpfs reconfiguration (remount) when noswap is set fs/namespace: correctly handle errors returned by grab_requested_mnt_ns power: always freeze efivarfs binfmt_misc: restore write access before closing files opened by open_exec() block: add __must_check attribute to sb_min_blocksize() virtio-fs: fix incorrect check for fsvq->kobj xfs: check the return value of sb_min_blocksize() in xfs_fs_fill_super isofs: check the return value of sb_min_blocksize() in isofs_fill_super exfat: check return value of sb_min_blocksize in exfat_read_boot_sector vfat: fix missing sb_min_blocksize() return value checks mnt: Remove dead code which might prevent from building bfs: Reconstruct file type when loading from disk afs: Fix dynamic lookup to fail on cell lookup failure hostfs: Fix only passing host root in boot stage with new mount fs: Fix uninitialized 'offp' in statmount_string() |
|
|
|
7ba45f1504 |
7 hotfixes. 5 are cc:stable, 4 are against mm/.
All are singletons - please see the respective changelogs for details. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaRoauQAKCRDdBJ7gKXxA jtNFAQDEMH0+zRGz/Larkf9cgmdKcDgij1DP2gP/3i8PWAoaGQD8C9evZxu1h9wC rFbaSkPDeSdDafo3RZfpo1gqE0LdEA4= =oew8 -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2025-11-16-10-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "7 hotfixes. 5 are cc:stable, 4 are against mm/ All are singletons - please see the respective changelogs for details" * tag 'mm-hotfixes-stable-2025-11-16-10-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm, swap: fix potential UAF issue for VMA readahead selftests/user_events: fix type cast for write_index packed member in perf_test lib/test_kho: check if KHO is enabled mm/huge_memory: fix folio split check for anon folios in swapcache MAINTAINERS: update David Hildenbrand's email address crash: fix crashkernel resource shrink mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlb |
|
|
|
1c2a936edd |
mm, swap: fix potential UAF issue for VMA readahead
Since commit |
|
|
|
f1d47cafe5 |
mm/huge_memory: fix folio split check for anon folios in swapcache
Both uniform and non uniform split check missed the check to prevent
splitting anon folios in swapcache to non-zero order.
Splitting anon folios in swapcache to non-zero order can cause data
corruption since swapcache only support PMD order and order-0 entries.
This can happen when one use split_huge_pages under debugfs to split
anon folios in swapcache.
In-tree callers do not perform such an illegal operation. Only debugfs
interface could trigger it. I will put adding a test case on my TODO
list.
Fix the check.
Link: https://lkml.kernel.org/r/20251105162910.752266-1-ziy@nvidia.com
Fixes:
|
|
|
|
39231e8d6b |
mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlb
In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support runtime allocation of gigantic hugetlb folios. In the meantime it evolved into a generic way for the architecture to state that it supports gigantic hugetlb folios. In commit |
|
|
|
ec33b59542 |
mm/mempool: fix poisoning order>0 pages with HIGHMEM
The kernel test has reported:
BUG: unable to handle page fault for address: fffba000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
*pde = 03171067 *pte = 00000000
Oops: Oops: 0002 [#1]
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G T 6.18.0-rc2-00031-gec7f31b2a2d3 #1 NONE a1d066dfe789f54bc7645c7989957d2bdee593ca
Tainted: [T]=RANDSTRUCT
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
EIP: memset (arch/x86/include/asm/string_32.h:168 arch/x86/lib/memcpy_32.c:17)
Code: a5 8b 4d f4 83 e1 03 74 02 f3 a4 83 c4 04 5e 5f 5d 2e e9 73 41 01 00 90 90 90 3e 8d 74 26 00 55 89 e5 57 56 89 c6 89 d0 89 f7 <f3> aa 89 f0 5e 5f 5d 2e e9 53 41 01 00 cc cc cc 55 89 e5 53 57 56
EAX: 0000006b EBX: 00000015 ECX: 001fefff EDX: 0000006b
ESI: fffb9000 EDI: fffba000 EBP: c611fbf0 ESP: c611fbe8
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00010287
CR0: 80050033 CR2: fffba000 CR3: 0316e000 CR4: 00040690
Call Trace:
poison_element (mm/mempool.c:83 mm/mempool.c:102)
mempool_init_node (mm/mempool.c:142 mm/mempool.c:226)
mempool_init_noprof (mm/mempool.c:250 (discriminator 1))
? mempool_alloc_pages (mm/mempool.c:640)
bio_integrity_initfn (block/bio-integrity.c:483 (discriminator 8))
? mempool_alloc_pages (mm/mempool.c:640)
do_one_initcall (init/main.c:1283)
Christoph found out this is due to the poisoning code not dealing
properly with CONFIG_HIGHMEM because only the first page is mapped but
then the whole potentially high-order page is accessed.
We could give up on HIGHMEM here, but it's straightforward to fix this
with a loop that's mapping, poisoning or checking and unmapping
individual pages.
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202511111411.9ebfa1ba-lkp@intel.com
Analyzed-by: Christoph Hellwig <hch@lst.de>
Fixes:
|
|
|
|
9b9e43704d |
slab fix for 6.18-rc6
-----BEGIN PGP SIGNATURE----- iQFPBAABCAA5FiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmkWKmwbFIAAAAAABAAO bWFudTIsMi41KzEuMTEsMiwyAAoJELvgsHXSRYia6V4H/3fH24KLh0jsSK1I0Ifk Eus5+Lv79/78HkpTHEMb/KeSZ8hNEtGAjZq5aBdV/9lXhEfDg9nXok0qqfSVdynx OsRp3xz1lOTJxZnkWTNkl0fBwASCiKG586UrFyCkl1h/mqhy7TpBilBxyLpNI/kO aCRf9mjAGmqliwZzV555LywKg8tcaDDop+6Q6qEL0kWt9W++GVgqLMfP3Jh71Hl/ HU7uuIkFJqfrBDFmtuNEnR3Nta+k5NIENNjcEMAjSQWHzMgCK7l3sapOm70+/FAS 7XLjvxJVonIj805qqxyEXqO32MEun+eMKPN4+VPSTa96O5lwsSQTOhO44i5iwFUz 82M= =wl8O -----END PGP SIGNATURE----- Merge tag 'slab-for-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fix from Vlastimil Babka: - Fix memory leak of objects from remote NUMA node when bulk freeing to a cache with sheaves (Harry Yoo) * tag 'slab-for-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm/slub: fix memory leak in free_to_pcs_bulk() |
|
|
|
76ade24433 |
slab: Remove references to folios from virt_to_slab()
Use page_slab() instead of virt_to_folio() which will work perfectly when struct slab is separated from struct folio. This was the last user of folio_slab(), so delete it. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://patch.msgid.link/20251113000932.1589073-17-willy@infradead.org Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> |
|
|
|
bbe7117305 |
kasan: Remove references to folio in __kasan_mempool_poison_object()
In preparation for splitting struct slab from struct page and struct folio, remove mentions of struct folio from this function. There is a mild improvement for large kmalloc objects as we will avoid calling compound_head() for them. We can discard the comment as using PageLargeKmalloc() rather than !folio_test_slab() makes it obvious. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: David Hildenbrand <david@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: kasan-dev <kasan-dev@googlegroups.com> Link: https://patch.msgid.link/20251113000932.1589073-16-willy@infradead.org Acked-by: Harry Yoo <harry.yoo@oracle.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> |
|
|
|
b8557d109e |
memcg: Convert mem_cgroup_from_obj_folio() to mem_cgroup_from_obj_slab()
In preparation for splitting struct slab from struct page and struct folio, convert the pointer to a slab rather than a folio. This means we can end up passing a NULL slab pointer to mem_cgroup_from_obj_slab() if the pointer is not to a page allocated to slab, and we handle that appropriately by returning NULL. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Muchun Song <muchun.song@linux.dev> Cc: cgroups@vger.kernel.org Link: https://patch.msgid.link/20251113000932.1589073-15-willy@infradead.org Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> |
|
|
|
cbcff934fa |
mm/slub: fix memory leak in free_to_pcs_bulk()
The commit |
|
|
|
3d2492401d |
mempool: factor out a mempool_adjust_gfp helper
Add a helper to better isolate and document the gfp flags adjustments. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20251113084022.1255121-6-hch@lst.de Signed-off-by: Vlastimil Babka <vbabka@suse.cz> |
|
|
|
b77fc08e39 |
mempool: add error injection support
Add a call to should_fail_ex that forces mempool to actually allocate
from the pool to stress the mempool implementation when enabled through
debugfs. By default should_fail{,_ex} prints a very verbose stack trace
that clutters the kernel log, slows down execution and triggers the
kernel bug detection in xfstests. Pass FAULT_NOWARN and print a
single-line message notating the caller instead so that full tests
can be run with fault injection.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://patch.msgid.link/20251113084022.1255121-5-hch@lst.de
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
|