linux

Commit Graph

Author	SHA1	Message	Date
Brendan Jackman	571a4b62ed	selftests/mm: skip map_populate on weird filesystems It seems that 9pfs does not allow truncating unlinked files, Mark Brown has noted that NFS may also behave this way. It doesn't seem quite right to call this a "bug" but it's probably a special enough case that it makes sense for the test to just SKIP if it happens. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-7-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:39 -07:00
Brendan Jackman	bf6d575e24	selftests/mm: don't fail uffd-stress if too many CPUs This calculation divides a fixed parameter by an environment-dependent parameter i.e. the number of CPUs. The simple way to avoid machine-specific failures here is to just put a cap on the max value of the latter. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-6-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Suggested-by: Mateusz Guzik <mjguzik@gmail.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:38 -07:00
Brendan Jackman	db0f1c138f	selftests/mm: print some details when uffd-stress gets bad params So this can be debugged more easily. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-5-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:38 -07:00
Brendan Jackman	f3b5535abc	selftests/mm/uffd: rename nr_cpus -> nr_parallel A later commit will bound this variable so it no longer necessarily matches the number of CPUs. Rename it appropriately. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-4-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:38 -07:00
Brendan Jackman	f4b3e6c7f1	selftests/mm: skip uffd-wp-mremap if userfaultfd not available It's obvious that this should fail in that case, but still, save the reader the effort of figuring out that they've run into this by just SKIPping Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-3-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:38 -07:00
Brendan Jackman	0046dbed80	selftests/mm: skip uffd-stress if userfaultfd not available It's pretty obvious that the test wouldn't work if you don't have the feature enabled. But, it's still useful to SKIP instead of failing so the reader can immediately tell that this is the reason why. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-2-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:37 -07:00
Brendan Jackman	800ddf3cd7	selftests/mm: report errno when things fail in gup_longterm Patch series "selftests/mm: Some cleanups from trying to run them", v4. I never had much luck running mm selftests so I spent a few hours digging into why. Looks like most of the reason is missing SKIP checks, so this series is just adding a bunch of those that I found. I did not do anything like all of them, just the ones I spotted in gup_longterm, gup_test, mmap, userfaultfd and memfd_secret. It's a bit unfortunate to have to skip those tests when ftruncate() fails, but I don't have time to dig deep enough into it to actually make them pass. I have observed the issue on 9pfs and heard rumours that NFS has a similar problem. I'm now able to run these test groups successfully: - mmap - gup_test - compaction - migration - page_frag - userfaultfd - mlock I've never gone past "Waiting for hugetlb memory to get depleted", in the hugetlb tests. I don't know if they are stuck or if they would eventually work if I was patient enough (testing on a 1G machine). I have not investigated further. I had some issues with mlock tests failing due to -ENOSRCH from mlock2(), I can no longer reproduce that though, things work OK now. Of the remaining tests there may be others that work fine, but there's no convenient way to survey the whole output of run_vmtests.sh so I'm just going test by test here. In my spare moments I am slowly chipping away at a setup to run these tests continuously in a reasonably hermetic QEMU environment via virtme-ng: `5fad4b9c59/README.md` Hopefully that will eventually offer a way to provide a "canned" environment where the tests are known to work, which can be fairly easily reproduced by any developer. This patch (of 12): Just reporting failure doesn't tell you what went wrong. This can fail in different ways so report errno to help the reader get started debugging. Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-0-dec210a658f5@google.com Link: https://lkml.kernel.org/r/20250311-mm-selftests-v4-1-dec210a658f5@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:37 -07:00
Ujwal Kundur	af3b45aac5	selftests/mm: fix spelling Fix misspelling flagged by codespell. Link: https://lkml.kernel.org/r/20250215081803.1793-1-ujwal.kundur@gmail.com Signed-off-by: Ujwal Kundur <ujwal.kundur@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:23 -07:00
Suren Baghdasaryan	3104138517	mm: make vma cache SLAB_TYPESAFE_BY_RCU To enable SLAB_TYPESAFE_BY_RCU for vma cache we need to ensure that object reuse before RCU grace period is over will be detected by lock_vma_under_rcu(). Current checks are sufficient as long as vma is detached before it is freed. The only place this is not currently happening is in exit_mmap(). Add the missing vma_mark_detached() in exit_mmap(). Another issue which might trick lock_vma_under_rcu() during vma reuse is vm_area_dup(), which copies the entire content of the vma into a new one, overriding new vma's vm_refcnt and temporarily making it appear as attached. This might trick a racing lock_vma_under_rcu() to operate on a reused vma if it found the vma before it got reused. To prevent this situation, we should ensure that vm_refcnt stays at detached state (0) when it is copied and advances to attached state only after it is added into the vma tree. Introduce vm_area_init_from() which preserves new vma's vm_refcnt and use it in vm_area_dup(). Since all vmas are in detached state with no current readers when they are freed, lock_vma_under_rcu() will not be able to take vm_refcnt after vma got detached even if vma is reused. vma_mark_attached() in modified to include a release fence to ensure all stores to the vma happen before vm_refcnt gets initialized. Finally, make vm_area_cachep SLAB_TYPESAFE_BY_RCU. This will facilitate vm_area_struct reuse and will minimize the number of call_rcu() calls. [surenb@google.com: remove atomic_set_release() usage in tools/] Link: https://lkml.kernel.org/r/20250217054351.2973666-1-surenb@google.com Link: https://lkml.kernel.org/r/20250213224655.1680278-18-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:21 -07:00
Suren Baghdasaryan	6bef4c2f97	mm: move lesser used vma_area_struct members into the last cacheline Move several vma_area_struct members which are rarely or never used during page fault handling into the last cacheline to better pack vm_area_struct. As a result vm_area_struct will fit into 3 as opposed to 4 cachelines. New typical vm_area_struct layout: struct vm_area_struct { union { struct { long unsigned int vm_start; /* 0 8 / long unsigned int vm_end; / 8 8 / }; / 0 16 / freeptr_t vm_freeptr; / 0 8 / }; / 0 16 / struct mm_struct vm_mm; /* 16 8 / pgprot_t vm_page_prot; / 24 8 / union { const vm_flags_t vm_flags; / 32 8 / vm_flags_t __vm_flags; / 32 8 / }; / 32 8 / unsigned int vm_lock_seq; / 40 4 / / XXX 4 bytes hole, try to pack / struct list_head anon_vma_chain; / 48 16 / / --- cacheline 1 boundary (64 bytes) --- / struct anon_vma anon_vma; /* 64 8 / const struct vm_operations_struct vm_ops; /* 72 8 / long unsigned int vm_pgoff; / 80 8 / struct file vm_file; /* 88 8 / void vm_private_data; /* 96 8 / atomic_long_t swap_readahead_info; / 104 8 / struct mempolicy vm_policy; /* 112 8 / struct vma_numab_state numab_state; /* 120 8 / / --- cacheline 2 boundary (128 bytes) --- / refcount_t vm_refcnt (__aligned__(64)); / 128 4 / / XXX 4 bytes hole, try to pack / struct { struct rb_node rb (__aligned__(8)); / 136 24 / long unsigned int rb_subtree_last; / 160 8 / } __attribute__((__aligned__(8))) shared; / 136 32 / struct anon_vma_name anon_name; /* 168 8 / struct vm_userfaultfd_ctx vm_userfaultfd_ctx; / 176 8 / / size: 192, cachelines: 3, members: 18 / / sum members: 176, holes: 2, sum holes: 8 / / padding: 8 / / forced alignments: 2, forced holes: 1, sum forced holes: 4 */ } __attribute__((__aligned__(64))); Memory consumption per 1000 VMAs becomes 48 pages: slabinfo after vm_area_struct changes: <name> ... <objsize> <objperslab> <pagesperslab> : ... vm_area_struct ... 192 42 2 : ... Link: https://lkml.kernel.org/r/20250213224655.1680278-14-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:20 -07:00
Suren Baghdasaryan	f35ab95ca0	mm: replace vm_lock and detached flag with a reference count rw_semaphore is a sizable structure of 40 bytes and consumes considerable space for each vm_area_struct. However vma_lock has two important specifics which can be used to replace rw_semaphore with a simpler structure: 1. Readers never wait. They try to take the vma_lock and fall back to mmap_lock if that fails. 2. Only one writer at a time will ever try to write-lock a vma_lock because writers first take mmap_lock in write mode. Because of these requirements, full rw_semaphore functionality is not needed and we can replace rw_semaphore and the vma->detached flag with a refcount (vm_refcnt). When vma is in detached state, vm_refcnt is 0 and only a call to vma_mark_attached() can take it out of this state. Note that unlike before, now we enforce both vma_mark_attached() and vma_mark_detached() to be done only after vma has been write-locked. vma_mark_attached() changes vm_refcnt to 1 to indicate that it has been attached to the vma tree. When a reader takes read lock, it increments vm_refcnt, unless the top usable bit of vm_refcnt (0x40000000) is set, indicating presence of a writer. When writer takes write lock, it sets the top usable bit to indicate its presence. If there are readers, writer will wait using newly introduced mm->vma_writer_wait. Since all writers take mmap_lock in write mode first, there can be only one writer at a time. The last reader to release the lock will signal the writer to wake up. refcount might overflow if there are many competing readers, in which case read-locking will fail. Readers are expected to handle such failures. In summary: 1. all readers increment the vm_refcnt; 2. writer sets top usable (writer) bit of vm_refcnt; 3. readers cannot increment the vm_refcnt if the writer bit is set; 4. in the presence of readers, writer must wait for the vm_refcnt to drop to 1 (plus the VMA_LOCK_OFFSET writer bit), indicating an attached vma with no readers; 5. vm_refcnt overflow is handled by the readers. While this vm_lock replacement does not yet result in a smaller vm_area_struct (it stays at 256 bytes due to cacheline alignment), it allows for further size optimization by structure member regrouping to bring the size of vm_area_struct below 192 bytes. [surenb@google.com: fix a crash due to vma_end_read() that should have been removed] Link: https://lkml.kernel.org/r/20250220200208.323769-1-surenb@google.com Link: https://lkml.kernel.org/r/20250213224655.1680278-13-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Matthew Wilcox <willy@infradead.org> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:20 -07:00
Suren Baghdasaryan	55e50223bf	mm: introduce vma_iter_store_attached() to use with attached vmas vma_iter_store() functions can be used both when adding a new vma and when updating an existing one. However for existing ones we do not need to mark them attached as they are already marked that way. With vma->detached being a separate flag, double-marking a vmas as attached or detached is not an issue because the flag will simply be overwritten with the same value. However once we fold this flag into the refcount later in this series, re-attaching or re-detaching a vma becomes an issue since these operations will be incrementing/decrementing a refcount. Introduce vma_iter_store_new() and vma_iter_store_overwrite() to replace vma_iter_store() and avoid re-attaching a vma during vma update. Add assertions in vma_mark_attached()/vma_mark_detached() to catch invalid usage. Update vma tests to check for vma detached state correctness. Link: https://lkml.kernel.org/r/20250213224655.1680278-5-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:18 -07:00
Suren Baghdasaryan	8ef95d8f15	mm: mark vma as detached until it's added into vma tree Current implementation does not set detached flag when a VMA is first allocated. This does not represent the real state of the VMA, which is detached until it is added into mm's VMA tree. Fix this by marking new VMAs as detached and resetting detached flag only after VMA is added into a tree. Introduce vma_mark_attached() to make the API more readable and to simplify possible future cleanup when vma->vm_mm might be used to indicate detached vma and vma_mark_attached() will need an additional mm parameter. Link: https://lkml.kernel.org/r/20250213224655.1680278-4-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:17 -07:00
Suren Baghdasaryan	7b6218ae12	mm: move per-vma lock into vm_area_struct Back when per-vma locks were introduces, vm_lock was moved out of vm_area_struct in [1] because of the performance regression caused by false cacheline sharing. Recent investigation [2] revealed that the regressions is limited to a rather old Broadwell microarchitecture and even there it can be mitigated by disabling adjacent cacheline prefetching, see [3]. Splitting single logical structure into multiple ones leads to more complicated management, extra pointer dereferences and overall less maintainable code. When that split-away part is a lock, it complicates things even further. With no performance benefits, there are no reasons for this split. Merging the vm_lock back into vm_area_struct also allows vm_area_struct to use SLAB_TYPESAFE_BY_RCU later in this patchset. Move vm_lock back into vm_area_struct, aligning it at the cacheline boundary and changing the cache to be cacheline-aligned as well. With kernel compiled using defconfig, this causes VMA memory consumption to grow from 160 (vm_area_struct) + 40 (vm_lock) bytes to 256 bytes: slabinfo before: <name> ... <objsize> <objperslab> <pagesperslab> : ... vma_lock ... 40 102 1 : ... vm_area_struct ... 160 51 2 : ... slabinfo after moving vm_lock: <name> ... <objsize> <objperslab> <pagesperslab> : ... vm_area_struct ... 256 32 2 : ... Aggregate VMA memory consumption per 1000 VMAs grows from 50 to 64 pages, which is 5.5MB per 100000 VMAs. Note that the size of this structure is dependent on the kernel configuration and typically the original size is higher than 160 bytes. Therefore these calculations are close to the worst case scenario. A more realistic vm_area_struct usage before this change is: <name> ... <objsize> <objperslab> <pagesperslab> : ... vma_lock ... 40 102 1 : ... vm_area_struct ... 176 46 2 : ... Aggregate VMA memory consumption per 1000 VMAs grows from 54 to 64 pages, which is 3.9MB per 100000 VMAs. This memory consumption growth can be addressed later by optimizing the vm_lock. [1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/ [2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/ [3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/ Link: https://lkml.kernel.org/r/20250213224655.1680278-3-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Tested-by: Shivank Garg <shivankg@amd.com> Link: https://lkml.kernel.org/r/5e19ec93-8307-47c2-bb13-3ddf7150624e@amd.com Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Sourav Panda <souravpanda@google.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:17 -07:00
Lorenzo Stoakes	0b6d4853d1	tools/selftests: add file/shmem-backed mapping guard region tests Extend the guard region self tests to explicitly assert that guard regions work correctly for functionality specific to file-backed and shmem mappings. In addition to testing all of the existing guard region functionality that is currently tested against anonymous mappings against file-backed and shmem mappings (except those which are exclusive to anonymous mapping), we now also: * Test that MADV_SEQUENTIAL does not cause unexpected readahead behaviour. * Test that MAP_PRIVATE behaves as expected with guard regions installed in both a shared and private mapping of an fd. * Test that a read-only file can correctly establish guard regions. * Test a probable fault-around case does not interfere with guard regions (or vice-versa). * Test that truncation does not eliminate guard regions. * Test that hole punching functions as expected in the presence of guard regions. * Test that a read-only mapping of a memfd write sealed mapping can have guard regions established within it and function correctly without violation of the seal. * Test that guard regions installed into a mapping of the anonymous zero page function correctly. Link: https://lkml.kernel.org/r/90c16bec5fcaafcd1700dfa3e9988c3e1aa9ac1d.1739469950.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:15 -07:00
Lorenzo Stoakes	272f37d3e9	tools/selftests: expand all guard region tests to file-backed Extend the guard region tests to allow for test fixture variants for anon, shmem, and local file files. This allows us to assert that each of the expected behaviours of anonymous memory also applies correctly to file-backed (both shmem and an a file created locally in the current working directory) and thus asserts the same correctness guarantees as all the remaining tests do. The fixture teardown is now performed in the parent process rather than child forked ones, meaning cleanup is always performed, including unlinking any generated temporary files. Additionally the variant fixture data type now contains an enum value indicating the type of backing store and the mmap() invocation is abstracted to allow for the mapping of whichever backing store the variant is testing. We adjust tests as necessary to account for the fact they may now reference files rather than anonymous memory. Link: https://lkml.kernel.org/r/ab42228d2bd9b8aa18e9faebcd5c88732a7e5820.1739469950.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:15 -07:00
Lorenzo Stoakes	ce1c0824fc	selftests/mm: rename guard-pages to guard-regions The feature formerly referred to as guard pages is more correctly referred to as 'guard regions', as in fact no pages are ever allocated in the process of installing the regions. To avoid confusion, rename the tests accordingly. [lorenzo.stoakes@oracle.com: fix guard regions invocation] Link: https://lkml.kernel.org/r/13426c71-d069-4407-9340-b227ff8b8736@lucifer.local Link: https://lkml.kernel.org/r/1c3cd04a3f69b5756b94bda701ac88325a9be18b.1739469950.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:15 -07:00
Mark Brown	85968b6a20	selftests/mm: allow tests to run with no huge pages support Currently the mm selftests refuse to run if huge pages are not available in the current system but this is an optional feature and not all the tests actually require them. Change the test during startup to be non-fatal and skip or omit tests which actually rely on having huge pages, allowing the other tests to be run. The gup_test does support using madvise() to configure huge pages but it ignores the error code so we just let it run. Link: https://lkml.kernel.org/r/20250212-kselftest-mm-no-hugepages-v1-2-44702f538522@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Nico Pache <npache@redhat.com> Cc: Mariano Pache <npache@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:14 -07:00
Eric Salem	3fae696393	selftests: mm: fix typo Fix misspelling. Link: https://lkml.kernel.org/r/77e0e915-36c3-4c95-84b8-0b73aaa17951@gmail.com Signed-off-by: Eric Salem <ericsalem@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:11 -07:00
Mark Brown	023fff71d8	selftests/mm: fix thuge-gen test name uniqueness The thuge-gen test_mmap() and test_shmget() tests are repeatedly run for a variety of sizes but always report the result of their test with the same name, meaning that automated sysetms running the tests are unable to distinguish between the various tests. Add the supplied sizes to the logged test names to distinguish between runs. My test automation was getting pretty confused about what was going on - the test names are a pretty important external interface. Link: https://lkml.kernel.org/r/20250204-kselftest-mm-fix-dups-v1-1-6afe417ef4bb@kernel.org Fixes: `b38bd9b2c4` ("selftests/mm: thuge-gen: conform to TAP format output") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:03 -07:00
Lorenzo Stoakes	c372473a54	mm: completely abstract unnecessary adj_start calculation The adj_start calculation has been a constant source of confusion in the VMA merge code. There are two cases to consider, one where we adjust the start of the vmg->middle VMA (i.e. the vmg->__adjust_middle_start merge flag is set), in which case adj_start is calculated as: (1) adj_start = vmg->end - vmg->middle->vm_start And the case where we adjust the start of the vmg->next VMA (i.e. the vmg->__adjust_next_start merge flag is set), in which case adj_start is calculated as: (2) adj_start = -(vmg->middle->vm_end - vmg->end) We apply (1) thusly: vmg->middle->vm_start = vmg->middle->vm_start + vmg->end - vmg->middle->vm_start Which simplifies to: vmg->middle->vm_start = vmg->end Similarly, we apply (2) as: vmg->next->vm_start = vmg->next->vm_start + -(vmg->middle->vm_end - vmg->end) Noting that for these VMAs to be mergeable vmg->middle->vm_end == vmg->next->vm_start and so this simplifies to: vmg->next->vm_start = vmg->next->vm_start + -(vmg->next->vm_start - vmg->end) Which simplifies to: vmg->next->vm_start = vmg->end Therefore in each case, we simply need to adjust the start of the VMA to vmg->end (!) and can do away with this adj_start calculation. The only caveat is that we must ensure we update the vm_pgoff field correctly. We therefore abstract this entire calculation to a new function vmg_adjust_set_range() which performs this calculation and sets the adjusted VMA's new range using the general vma_set_range() function. We also must update vma_adjust_trans_huge() which expects the now-abstracted adj_start parameter. It turns out this is wholly unnecessary. In vma_adjust_trans_huge() the relevant code is: if (adjust_next > 0) { struct vm_area_struct *next = find_vma(vma->vm_mm, vma->vm_end); unsigned long nstart = next->vm_start; nstart += adjust_next; split_huge_pmd_if_needed(next, nstart); } The only case where this is relevant is when vmg->__adjust_middle_start is specified (in which case adj_next would have been positive), i.e. the one in which the vma specified is vmg->prev and this the sought 'next' VMA would be vmg->middle. We can therefore eliminate the find_vma() invocation altogether and simply provide the vmg->middle VMA in this instance, or NULL otherwise. Again we have an adj_next offset calculation: next->vm_start + vmg->end - vmg->middle->vm_start Where next == vmg->middle this simplifies to vmg->end as previously demonstrated. Therefore nstart is equal to vmg->end, which is already passed to vma_adjust_trans_huge() via the 'end' parameter and so this code (rather delightfully) simplifies to: if (next) split_huge_pmd_if_needed(next, end); With these changes in place, it becomes silly for commit_merge() to return vmg->target, as it is always the same and threaded through vmg, so we finally change commit_merge() to return an error value once again. This patch has no change in functional behaviour. Link: https://lkml.kernel.org/r/7bce2cd4b5afb56211822835d145471280c3dccc.1738326519.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:02 -07:00
Lorenzo Stoakes	fe3e9cf0d7	mm: eliminate adj_start parameter from commit_merge() Introduce internal vmg->__adjust_middle_start and vmg->__adjust_next_start merge flags, enabling us to indicate to commit_merge() that we are performing a merge which either spans only part of vmg->middle, or part of vmg->next respectively. In the former instance, we change the start of vmg->middle to match the attributes of vmg->prev, without spanning all of vmg->middle. This implies that vmg->prev->vm_end and vmg->middle->vm_start are both increased to form the new merged VMA (vmg->prev) and the new subsequent VMA (vmg->middle). In the latter case, we change the end of vmg->middle to match the attributes of vmg->next, without spanning all of vmg->next. This implies that vmg->middle->vm_end and vmg->next->vm_start are both decreased to form the new merged VMA (vmg->next) and the new prior VMA (vmg->middle). Since we now have a stable set of prev, middle, next VMAs threaded through vmg and with these flags set know what is happening, we can perform the calculation in commit_merge() instead. This allows us to drop the confusing adj_start parameter and instead pass semantic information to commit_merge(). In the latter case the -(middle->vm_end - start) calculation becomes -(middle->vm-end - vmg->end), however this is correct as vmg->end is set to the start parameter. This is because in this case (rather confusingly), we manipulate vmg->middle, but ultimately return vmg->next, whose range will be correctly specified. At this point vmg->start, end is the new range for the prior VMA rather than the merged one. This patch has no change in functional behaviour. Link: https://lkml.kernel.org/r/bcec0cd980b373a5eb02236cb033034ce1effe42.1738326519.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:02 -07:00
Lorenzo Stoakes	6ab2d9c7c6	mm: further refactor commit_merge() The current VMA merge mechanism contains a number of confusing mechanisms around removal of VMAs on merge and the shrinking of the VMA adjacent to vma->target in the case of merges which result in a partial merge with that adjacent VMA. Since we now have a STABLE set of VMAs - prev, middle, next - we are now able to have the caller of commit_merge() explicitly tell us which VMAs need deleting, using newly introduced internal VMA merge flags. Doing so allows us to embed this state within the VMG and remove the confusing remove, remove2 parameters from commit_merge(). We additionally are able to eliminate the highly confusing and misleading 'expanded' parameter - a parameter that in reality refers to whether or not the return VMA is the target one or the one immediately adjacent. We can infer which is the case from whether or not the adj_start parameter is negative. This also allows us to simplify further logic around iterator configuration and VMA iterator stores. Doing so means we can also eliminate the adjust parameter, as we are able to infer which VMA ought to be adjusted from adj_start - a positive value implies we adjust the start of 'middle', a negative one implies we adjust the start of 'next'. We are then able to have commit_merge() explicitly return the target VMA, or NULL on inability to pre-allocate memory. Errors were previously filtered so behaviour does not change. We additionally move from the slightly odd use of a bitwise-flag enum vmg->merge_flags field to vmg bitfields. This patch has no change in functional behaviour. Link: https://lkml.kernel.org/r/7bf2ed24af68aac18672b7acebbd9102f48c5b03.1738326519.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:02 -07:00
Lorenzo Stoakes	3a75ccba04	mm: simplify vma merge structure and expand comments Patch series "mm: further simplify VMA merge operation", v3. While significant efforts have been made to improve the VMA merge operation, there remains remnants of the bad (or rather confusing) old days, which make the code difficult to understand, more bug prone and thus harder to modify. This series attempts to significantly improve matters in a number of respects - with a focus on simplifying the commit_merge() function which actually actions the merge operation - and importantly, adjusting the two most confusing merge cases - those in which we 'adjust' the VMA immediately adjacent to the one being merged. One source of confusion are the VMAs being threaded through the operation themselves - vmg->prev, vmg->vma and vmg->next. At the start of the operation, vmg->vma is either NULL if a new VMA is propose to be added, or if not then a pointer to an existing VMA being modified, and prev/next are (perhaps not present) VMAs sat immediately before and after the range specified in vmg->start, end, respectively. However, during the VMA merge operation, we change vmg->start, end and pgoff to span the newly merged range and vmg->vma to either be: a. The ultimately returned VMA (in most cases) or b. A VMA which we will manipulate, but ultimately instead return vmg->next. Case b. especially here is confusing for somebody reading this code, but the fact we update this state, along with vmg->start, end, pgoff only makes matters worse. We simplify things by replacing vmg->vma with vmg->middle and never changing it - this is always either NULL (for a new VMA) or the VMA being modified between vmg->prev and vmg->next. We further simplify by placing the merged VMA in a new vmg->target field - whether case b. above is the case or not. The reader of the code can now simply rely on vmg->middle being the middle VMA and vmg->target being the ultimately merged VMA. We additionally tackle the confusing cases where we 'adjust' VMAs other than the one we ultimately return as the merged VMA (this includes case b. above). These are: (1) merge <-----------> \|------\|\|--------\| \|------------\|---\| \| prev \|\| middle \| -> \| target \| m \| \|------\|\|--------\| \|------------\|---\| In which case middle must be adjusted so middle->vm_start is increased as well as performing the merge. (2) (equivalent to case b. above) <-------------> \|---------\|\|------\| \|---\|-------------\| \| middle \|\| next \| -> \| m \| target \| \|---------\|\|------\| \|---\|-------------\| In which case next must be adjusted so next->vm_start is decreased as well as performing the merge. This cases have previously been performed by calculating and passing around a dubious and confusing 'adj_start' parameter along side a pointer to an 'adjust' VMA indicating which VMA requires additional adjustment (middle in case 1 and next in case 2). With the VMG structure in place we are able to avoid this by simply setting a merge flag to describe each case: (1) Sets the vmg->__adjust_middle_start flag (2) Sets the vmg->__adjust_next_start flag By doing so it turns out we can vastly simplify the logic and calculate what is required to perform the operation. Taken together the refactorings make it far easier to understand what is being done even in these more confusing cases, make the code far more maintainable, debuggable, and testable, providing more internal state indicating what is happening in the merge operation. The changes have no functional net impact on the merge operation and everything should still behave as it did before. This patch (of 5): The merge code, while much improved, still has a number of points of confusion. As part of a broader series cleaning this up to make this more maintainable, we start by addressing some confusion around vma_merge_struct fields. So far, the caller either provides no vmg->vma (a new VMA) or supplies the existing VMA which is being altered, setting vmg->start,end,pgoff to the proposed VMA dimensions. vmg->vma is then updated, as are vmg->start,end,pgoff as the merge process proceeds and the appropriate merge strategy is determined. This is rather confusing, as vmg->vma starts off as the 'middle' VMA between vmg->prev,next, but becomes the 'target' VMA, except in one specific edge case (merge next, shrink middle). Int his patch we introduce vmg->middle to describe the VMA that is between vmg->prev and vmg->next, and does NOT change during the merge operation. We replace vmg->vma with vmg->target, and use this only during the merge operation itself. Aside from the merge right, shrink middle case, this becomes the VMA that forms the basis of the VMA that is returned. This edge case can be addressed in a future commit. We also add a number of comments to explain what is going on. Finally, we adjust the ASCII diagrams showing each merge case in vma_merge_existing_range() to be clearer - the arrow range previously showed the vmg->start, end spanned area, but it is clearer to change this to show the final merged VMA. This patch has no change in functional behaviour. Link: https://lkml.kernel.org/r/cover.1738326519.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/4dfe60f1419d55e5d0516f56349695d73a57184c.1738326519.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:06:01 -07:00
Zi Yan	ad4c9bb541	selftests/mm: test splitting file-backed THP to any lower order Now split_huge_page*() supports shmem THP split to any lower order. Test it. The test now reads file content out after split to check if the split corrupts the file data. Link: https://lkml.kernel.org/r/20250122161928.1240637-3-ziy@nvidia.com Signed-off-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <yang@os.amperecomputing.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:05:56 -07:00
Zi Yan	035a112e5f	selftests/mm: make file-backed THP split work by writing PMD size data Commit `acd7ccb284` ("mm: shmem: add large folio support for tmpfs") changes huge=always to allocate THP/mTHP based on write size and split_huge_page_test does not write PMD size data, so file-back THP is not created during the test. Fix it by writing PMD size data. Link: https://lkml.kernel.org/r/20250122161928.1240637-1-ziy@nvidia.com Signed-off-by: Zi Yan <ziy@nvidia.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <yang@os.amperecomputing.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 22:05:56 -07:00
Rafael Aquini	67a2f86846	selftests/mm: run_vmtests.sh: fix half_ufd_size_MB calculation We noticed that uffd-stress test was always failing to run when invoked for the hugetlb profiles on x86_64 systems with a processor count of 64 or bigger: ... # ------------------------------------ # running ./uffd-stress hugetlb 128 32 # ------------------------------------ # ERROR: invalid MiB (errno=9, @uffd-stress.c:459) ... # [FAIL] not ok 3 uffd-stress hugetlb 128 32 # exit=1 ... The problem boils down to how run_vmtests.sh (mis)calculates the size of the region it feeds to uffd-stress. The latter expects to see an amount of MiB while the former is just giving out the number of free hugepages halved down. This measurement discrepancy ends up violating uffd-stress' assertion on number of hugetlb pages allocated per CPU, causing it to bail out with the error above. This commit fixes that issue by adjusting run_vmtests.sh's half_ufd_size_MB calculation so it properly renders the region size in MiB, as expected, while maintaining all of its original constraints in place. Link: https://lkml.kernel.org/r/20250218192251.53243-1-aquini@redhat.com Fixes: `2e47a445d7` ("selftests/mm: run_vmtests.sh: fix hugetlb mem size calculation") Signed-off-by: Rafael Aquini <raquini@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-16 17:40:25 -07:00
Rae Moar	2e0cf2b32f	kunit: tool: add test to check parsing late test plan Add test to check for the infinite loop caused by the inability to parse a late test plan. The test parses the following output: TAP version 13 ok 4 test4 1..4 Link: https://lore.kernel.org/r/20250313192714.1380005-1-rmoar@google.com Signed-off-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <shuah@kernel.org>	2025-03-15 18:13:43 -06:00
Rae Moar	1d4c06d519	kunit: tool: Fix bug in parsing test plan A bug was identified where the KTAP below caused an infinite loop: TAP version 13 ok 4 test_case 1..4 The infinite loop was caused by the parser not parsing a test plan if following a test result line. Fix this bug by parsing test plan line to avoid the infinite loop. Link: https://lore.kernel.org/r/20250313192714.1380005-1-rmoar@google.com Signed-off-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <shuah@kernel.org>	2025-03-15 18:13:38 -06:00
Saket Kumar Bhaskar	8c10109a97	selftests/bpf: Fix sockopt selftest failure on powerpc The SO_RCVLOWAT option is defined as 18 in the selftest header, which matches the generic definition. However, on powerpc, SO_RCVLOWAT is defined as 16. This discrepancy causes sol_socket_sockopt() to fail with the default switch case on powerpc. This commit fixes by defining SO_RCVLOWAT as 16 for powerpc. Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Link: https://lore.kernel.org/bpf/20250311084647.3686544-1-skb99@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:49:24 -07:00
Viktor Malik	de07b18289	selftests/bpf: Fix string read in strncmp benchmark The strncmp benchmark uses the bpf_strncmp helper and a hand-written loop to compare two strings. The values of the strings are filled from userspace. One of the strings is non-const (in .bss) while the other is const (in .rodata) since that is the requirement of bpf_strncmp. The problem is that in the hand-written loop, Clang optimizes the reads from the const string to always return 0 which breaks the benchmark. Use barrier_var to prevent the optimization. The effect can be seen on the strncmp-no-helper variant. Before this change: # ./bench strncmp-no-helper Setting up benchmark 'strncmp-no-helper'... Benchmark 'strncmp-no-helper' started. Iter 0 (112.309us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Iter 1 (-23.238us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Iter 2 ( 58.994us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Iter 3 (-30.466us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Iter 4 ( 29.996us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Iter 5 ( 16.949us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Iter 6 (-60.035us): hits 0.000M/s ( 0.000M/prod), drops 0.000M/s, total operations 0.000M/s Summary: hits 0.000 ± 0.000M/s ( 0.000M/prod), drops 0.000 ± 0.000M/s, total operations 0.000 ± 0.000M/s After this change: # ./bench strncmp-no-helper Setting up benchmark 'strncmp-no-helper'... Benchmark 'strncmp-no-helper' started. Iter 0 ( 77.711us): hits 5.534M/s ( 5.534M/prod), drops 0.000M/s, total operations 5.534M/s Iter 1 ( 11.215us): hits 6.006M/s ( 6.006M/prod), drops 0.000M/s, total operations 6.006M/s Iter 2 (-14.253us): hits 5.931M/s ( 5.931M/prod), drops 0.000M/s, total operations 5.931M/s Iter 3 ( 59.087us): hits 6.005M/s ( 6.005M/prod), drops 0.000M/s, total operations 6.005M/s Iter 4 (-21.379us): hits 6.010M/s ( 6.010M/prod), drops 0.000M/s, total operations 6.010M/s Iter 5 (-20.310us): hits 5.861M/s ( 5.861M/prod), drops 0.000M/s, total operations 5.861M/s Iter 6 ( 53.937us): hits 6.004M/s ( 6.004M/prod), drops 0.000M/s, total operations 6.004M/s Summary: hits 5.969 ± 0.061M/s ( 5.969M/prod), drops 0.000 ± 0.000M/s, total operations 5.969 ± 0.061M/s Fixes: `9c42652f8b` ("selftests/bpf: Add benchmark for bpf_strncmp() helper") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Viktor Malik <vmalik@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20250313122852.1365202-1-vmalik@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:49:24 -07:00
Kumar Kartikeya Dwivedi	1f375aef6c	selftests/bpf: Fix arena_spin_lock compilation on PowerPC Venkat reported a compilation error for BPF selftests on PowerPC [0]. The crux of the error is the following message: In file included from progs/arena_spin_lock.c:7: /root/bpf-next/tools/testing/selftests/bpf/bpf_arena_spin_lock.h:122:8: error: member reference base type '__attribute__((address_space(1))) u32' (aka '__attribute__((address_space(1))) unsigned int') is not a structure or union 122 \| old = atomic_read(&lock->val); This is because PowerPC overrides the qspinlock type changing the lock->val member's type from atomic_t to u32. To remedy this, import the asm-generic version in the arena spin lock header, name it __qspinlock (since it's aliased to arena_spinlock_t, the actual name hardly matters), and adjust the selftest to not depend on the type in vmlinux.h. [0]: https://lore.kernel.org/bpf/7bc80a3b-d708-4735-aa3b-6a8c21720f9d@linux.ibm.com Fixes: `88d706ba7c` ("selftests/bpf: Introduce arena spin lock") Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Link: https://lore.kernel.org/bpf/20250311154244.3775505-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:59 -07:00
Blaise Boscaccy	7987f1627e	selftests/bpf: Add a kernel flag test for LSM bpf hook This test exercises the kernel flag added to security_bpf by effectively blocking light-skeletons from loading while allowing normal skeletons to function as-is. Since this should work with any arbitrary BPF program, an existing program from LSKELS_EXTRA was used as a test payload. Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250310221737.821889-3-bboscaccy@linux.microsoft.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:58 -07:00
Blaise Boscaccy	082f1db02c	security: Propagate caller information in bpf hooks Certain bpf syscall subcommands are available for usage from both userspace and the kernel. LSM modules or eBPF gatekeeper programs may need to take a different course of action depending on whether or not a BPF syscall originated from the kernel or userspace. Additionally, some of the bpf_attr struct fields contain pointers to arbitrary memory. Currently the functionality to determine whether or not a pointer refers to kernel memory or userspace memory is exposed to the bpf verifier, but that information is missing from various LSM hooks. Here we augment the LSM hooks to provide this data, by simply passing a boolean flag indicating whether or not the call originated in the kernel, in any hook that contains a bpf_attr struct that corresponds to a subcommand that may be called from the kernel. Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com> Acked-by: Song Liu <song@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20250310221737.821889-2-bboscaccy@linux.microsoft.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:58 -07:00
Chen Ni	a03d375330	selftests/bpf: Convert comma to semicolon Replace comma between expressions with semicolons. Using a ',' in place of a ';' can have unintended side effects. Although that is not the case here, it is seems best to use ';' unless ',' is intended. Found by inspection. No functional change intended. Compile tested only. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Anton Protopopov <aspsk@isovalent.com> Link: https://lore.kernel.org/bpf/20250310032045.651068-1-nichen@iscas.ac.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:58 -07:00
Anton Protopopov	caa4237a79	selftests/bpf: Fix selection of static vs. dynamic LLVM The Makefile uses the exit code of the `llvm-config --link-static --libs` command to choose between statically-linked and dynamically-linked LLVMs. The stdout and stderr of that command are redirected to /dev/null. To redirect the output the "&>" construction is used, which might not be supported by /bin/sh, which is executed by make for $(shell ...) commands. On such systems the test will fail even if static LLVM is actually supported. Replace "&>" by ">/dev/null 2>&1" to fix this. Fixes: `2a9d30fac8` ("selftests/bpf: Support dynamically linking LLVM if static is not available") Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/bpf/20250310145112.1261241-1-aspsk@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:58 -07:00
Emil Tsalapatis	c06707ff07	selftests: bpf: fix duplicate selftests in cpumask_success. The BPF cpumask selftests are currently run twice in test_progs/cpumask.c, once by traversing cpumask_success_testcases, and once by invoking RUN_TESTS(cpumask_success). Remove the invocation of RUN_TESTS to properly run the selftests only once. Now that the tests are run only through cpumask_success_testscases, add to it the missing test_refcount_null_tracking testcase. Also remove the __success annotation from it, since it is now loaded and invoked by the runner. Signed-off-by: Emil Tsalapatis (Meta) <emil@etsalapatis.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20250309230427.26603-5-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:58 -07:00
Emil Tsalapatis	918ba2636d	selftests: bpf: add bpf_cpumask_populate selftests Add selftests for the bpf_cpumask_populate helper that sets a bpf_cpumask to a bit pattern provided by a BPF program. Signed-off-by: Emil Tsalapatis (Meta) <emil@etsalapatis.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20250309230427.26603-3-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Bastien Curutchet (eBPF Foundation)	1041b8bc9f	selftests/bpf: lwt_seg6local: Move test to test_progs test_lwt_seg6local.sh isn't used by the BPF CI. Add a new file in the test_progs framework to migrate the tests done by test_lwt_seg6local.sh. It uses the same network topology and the same BPF programs located in progs/test_lwt_seg6local.c. Use the network helpers instead of `nc` to exchange the final packet. Remove test_lwt_seg6local.sh and its Makefile entry. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250307-seg6local-v1-2-990fff8f180d@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Bastien Curutchet (eBPF Foundation)	b1d85ff517	selftests/bpf: lwt_seg6local: Remove unused routes Some routes in fb00:: are initialized during setup, even though they aren't needed by the test as the UDP packets will travel through the lightweight tunnels. Remove these unnecessary routes. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250307-seg6local-v1-1-990fff8f180d@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Feng Yang	339c1f8ea1	selftests/bpf: Fix cap_enable_effective() return code The caller of cap_enable_effective() expects negative error code. Fix it. Before: failed to restore CAP_SYS_ADMIN: -1, Unknown error -1 After: failed to restore CAP_SYS_ADMIN: -3, No such process failed to restore CAP_SYS_ADMIN: -22, Invalid argument Signed-off-by: Feng Yang <yangfeng@kylinos.cn> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250305022234.44932-1-yangfeng59949@163.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Amery Hung	8bda5b787d	selftests/bpf: Fix dangling stdout seen by traffic monitor thread Traffic monitor thread may see dangling stdout as the main thread closes and reassigns stdout without protection. This happens when the main thread finishes one subtest and moves to another one in the same netns_new() scope. The issue can be reproduced by running test_progs repeatedly with traffic monitor enabled: for ((i=1;i<=100;i++)); do ./test_progs -a flow_dissector_skb* -m '*' done For restoring stdout in crash_handler(), since it does not really care about closing stdout, simlpy flush stdout and restore it to the original one. Then, Fix the issue by consolidating stdio_restore_cleanup() and stdio_restore(), and protecting the use/close/assignment of stdout with a lock. The locking in the main thread is always performed regradless of whether traffic monitor is running or not for simplicity. It won't have any side-effect. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/20250305182057.2802606-3-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Amery Hung	34a25aabcd	selftests/bpf: Allow assigning traffic monitor print function Allow users to change traffic monitor's print function. If not provided, traffic monitor will print to stdout by default. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250305182057.2802606-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Amery Hung	aeb5bbb025	selftests/bpf: Clean up call sites of stdio_restore() reset_affinity() and save_ns() are only called in run_one_test(). There is no need to call stdio_restore() in reset_affinity() and save_ns() if stdio_restore() is moved right after a test finishes in run_one_test(). Also remove an unnecessary check of env.stdout_saved in crash_handler() by moving env.stdout_saved assignment to the beginning of main(). Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/20250305182057.2802606-1-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Bastien Curutchet (eBPF Foundation)	f5e288943e	selftests/bpf: Move test_lwt_ip_encap to test_progs test_lwt_ip_encap.sh isn't used by the BPF CI. Add a new file in the test_progs framework to migrate the tests done by test_lwt_ip_encap.sh. It uses the same network topology and the same BPF programs located in progs/test_lwt_ip_encap.c. Rework the GSO part to avoid using nc and dd. Remove test_lwt_ip_encap.sh and its Makefile entry. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250304-lwt_ip-v1-1-8fdeb9e79a56@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:57 -07:00
Kumar Kartikeya Dwivedi	2dfc8186d6	selftests/bpf: Add tests for arena spin lock Add some basic selftests for qspinlock built over BPF arena using cond_break_label macro. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250306035431.2186189-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:56 -07:00
Kumar Kartikeya Dwivedi	88d706ba7c	selftests/bpf: Introduce arena spin lock Implement queued spin lock algorithm as BPF program for lock words living in BPF arena. The algorithm is copied from kernel/locking/qspinlock.c and adapted for BPF use. We first implement abstract helpers for portable atomics and acquire/release load instructions, by relying on X86_64 presence to elide expensive barriers and rely on implementation details of the JIT, and fall back to slow but correct implementations elsewhere. When support for acquire/release load/stores lands, we can improve this state. Then, the qspinlock algorithm is adapted to remove dependence on multi-word atomics due to lack of support in BPF ISA. For instance, xchg_tail cannot use 16-bit xchg, and needs to be a implemented as a 32-bit try_cmpxchg loop. Loops which are seemingly infinite from verifier PoV are annotated with cond_break_label macro to return an error. Only 1024 NR_CPUs are supported. Note that the slow path is a global function, hence the verifier doesn't know the return value's precision. The recommended way of usage is to always test against zero for success, and not ret < 0 for error, as the verifier would assume ret > 0 has not been accounted for. Add comments in the function documentation about this quirk. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250306035431.2186189-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:56 -07:00
Kumar Kartikeya Dwivedi	4b7ede0be3	selftests/bpf: Introduce cond_break_label Add a new cond_break_label macro that jumps to the specified label when the cond_break termination check fires, and allows us to better handle the uncontrolled termination of the loop. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250306035431.2186189-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:56 -07:00
Eduard Zingerman	871ef8d50e	bpf: correct use/def for may_goto instruction may_goto instruction does not use any registers, but in compute_insn_live_regs() it was treated as a regular conditional jump of kind BPF_K with r0 as source register. Thus unnecessarily marking r0 as used. Fixes: `14c8552db6` ("bpf: simple DFA-based live registers analysis") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250305085436.2731464-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:30 -07:00
Eduard Zingerman	2ea8f6a1cd	selftests/bpf: test cases for compute_live_registers() Cover instructions from each kind: - assignment - arithmetic - store/load - endian conversion - atomics - branches, conditional branches, may_goto, calls - LD_ABS/LD_IND - address_space_cast Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250304195024.2478889-6-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:30 -07:00
Eduard Zingerman	14c8552db6	bpf: simple DFA-based live registers analysis Compute may-live registers before each instruction in the program. The register is live before the instruction I if it is read by I or some instruction S following I during program execution and is not overwritten between I and S. This information would be used in the next patch as a hint in func_states_equal(). Use a simple algorithm described in [1] to compute this information: - define the following: - I.use : a set of all registers read by instruction I; - I.def : a set of all registers written by instruction I; - I.in : a set of all registers that may be alive before I execution; - I.out : a set of all registers that may be alive after I execution; - I.successors : a set of instructions S that might immediately follow I for some program execution; - associate separate empty sets 'I.in' and 'I.out' with each instruction; - visit each instruction in a postorder and update corresponding 'I.in' and 'I.out' sets as follows: I.out = U [S.in for S in I.successors] I.in = (I.out / I.def) U I.use (where U stands for set union, / stands for set difference) - repeat the computation while I.{in,out} changes for any instruction. On implementation side keep things as simple, as possible: - check_cfg() already marks instructions EXPLORED in post-order, modify it to save the index of each EXPLORED instruction in a vector; - represent I.{in,out,use,def} as bitmasks; - don't split the program into basic blocks and don't maintain the work queue, instead: - do fixed-point computation by visiting each instruction; - maintain a simple 'changed' flag if I.{in,out} for any instruction change; Measurements show that even such simplistic implementation does not add measurable verification time overhead (for selftests, at-least). Note on check_cfg() ex_insn_beg/ex_done change: To avoid out of bounds access to env->cfg.insn_postorder array, it should be guaranteed that instruction transitions to EXPLORED state only once. Previously this was not the fact for incorrect programs with direct calls to exception callbacks. The 'align' selftest needs adjustment to skip computed insn/live registers printout. Otherwise it matches lines from the live registers printout. [1] https://en.wikipedia.org/wiki/Live-variable_analysis Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250304195024.2478889-4-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:29 -07:00
Peilin Ye	ff3afe5da9	selftests/bpf: Add selftests for load-acquire and store-release instructions Add several ./test_progs tests: - arena_atomics/load_acquire - arena_atomics/store_release - verifier_load_acquire/* - verifier_store_release/* - verifier_precision/bpf_load_acquire - verifier_precision/bpf_store_release The last two tests are added to check if backtrack_insn() handles the new instructions correctly. Additionally, the last test also makes sure that the verifier "remembers" the value (in src_reg) we store-release into e.g. a stack slot. For example, if we take a look at the test program: #0: r1 = 8; /* store_release((u64 )(r10 - 8), r1); / #1: .8byte %[store_release]; #2: r1 = (u64 )(r10 - 8); #3: r2 = r10; #4: r2 += r1; #5: r0 = 0; #6: exit; At #1, if the verifier doesn't remember that we wrote 8 to the stack, then later at #4 we would be adding an unbounded scalar value to the stack pointer, which would cause the program to be rejected: VERIFIER LOG: ============= ... math between fp pointer and register with unbounded min value is not allowed For easier CI integration, instead of using built-ins like __atomic_{load,store}_n() which depend on the new __BPF_FEATURE_LOAD_ACQ_STORE_REL pre-defined macro, manually craft load-acquire/store-release instructions using __imm_insn(), as suggested by Eduard. All new tests depend on: (1) Clang major version >= 18, and (2) ENABLE_ATOMICS_TESTS is defined (currently implies -mcpu=v3 or v4), and (3) JIT supports load-acquire/store-release (currently arm64 and x86-64) In .../progs/arena_atomics.c: /* 8-byte-aligned / __u8 __arena_global load_acquire8_value = 0x12; / 1-byte hole */ __u16 __arena_global load_acquire16_value = 0x1234; That 1-byte hole in the .addr_space.1 ELF section caused clang-17 to crash: fatal error: error in backend: unable to write nop sequence of 1 bytes To work around such llvm-17 CI job failures, conditionally define __arena_global variables as 64-bit if __clang_major__ < 18, to make sure .addr_space.1 has no holes. Ideally we should avoid compiling this file using clang-17 at all (arena tests depend on __BPF_FEATURE_ADDR_SPACE_CAST, and are skipped for llvm-17 anyway), but that is a separate topic. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/1b46c6feaf0f1b6984d9ec80e500cc7383e9da1a.1741049567.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:29 -07:00
Kumar Kartikeya Dwivedi	2fb761823e	bpf, x86: Add x86 JIT support for timed may_goto Implement the arch_bpf_timed_may_goto function using inline assembly to have control over which registers are spilled, and use our special protocol of using BPF_REG_AX as an argument into the function, and as the return value when going back. Emit call depth accounting for the call made from this stub, and ensure we don't have naked returns (when rethunk mitigations are enabled) by falling back to the RET macro (instead of retq). After popping all saved registers, the return address into the BPF program should be on top of the stack. Since the JIT support is now enabled, ensure selftests which are checking the produced may_goto sequences do not break by adjusting them. Make sure we still test the old may_goto sequence on other architectures, while testing the new sequence on x86_64. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250304003239.2390751-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:28 -07:00
Mykyta Yatsenko	6419d08b6c	selftests/bpf: Add tests for bpf_object__prepare Add selftests, checking that running bpf_object__prepare successfully creates maps before load step. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250303135752.158343-5-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:28 -07:00
Bastien Curutchet (eBPF Foundation)	a54e700696	selftests/bpf: test_tunnel: Remove test_tunnel.sh All tests from test_tunnel.sh have been migrated into test test_progs. The last test remaining in the script is the test_ipip() that is already covered in the test_prog framework by the NONE case of test_ipip_tunnel(). Remove the test_tunnel.sh script and its Makefile entry Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-10-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	05cd60ab57	selftests/bpf: test_tunnel: Move ip6tnl tunnel tests to test_progs ip6tnl tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6tnl tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_ipip6() and test_ip6ip6() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-9-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	260f2da62d	selftests/bpf: test_tunnel: Move ip6geneve tunnel test to test_progs ip6geneve tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6geneve tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_ip6geneve() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-8-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	bd477738e6	selftests/bpf: test_tunnel: Move geneve tunnel test to test_progs geneve tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test geneve tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_geneve() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-7-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	ea60b6a524	selftests/bpf: test_tunnel: Move ip6erspan tunnel test to test_progs ip6erspan tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6erspan tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_ip6erspan() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-6-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	cadb08a4d3	selftests/bpf: test_tunnel: Move erspan tunnel tests to test_progs erspan tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test erspan tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_erspan() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-5-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	856818b28f	selftests/bpf: test_tunnel: Move ip6gre tunnel test to test_progs ip6gre tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6gre tunnels. It uses the same network topology and the same BPF programs than the script. Disable the IPv6 DAD feature because it can take lot of time and cause some tests to fail depending on the environment they're run on. Remove test_ip6gre() and test_ip6gretap() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-4-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	257dfd1c6b	selftests/bpf: test_tunnel: Move gre tunnel test to test_progs gre tunnels are tested in the test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test gre tunnels. It uses the same network topology and the same BPF programs than the script. Remove test_gre() and test_gre_no_tunnel_key() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-3-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:27 -07:00
Bastien Curutchet (eBPF Foundation)	fcb39996a2	selftests/bpf: test_tunnel: Add ping helpers All tests use more or less the same ping commands as final validation. Also test_ping()'s return value is checked with ASSERT_OK() while this check is already done by the SYS() macro inside test_ping(). Create helpers around test_ping() and use them in the tests to avoid code duplication. Remove the unnecessary ASSERT_OK() from the tests. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-2-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:26 -07:00
Bastien Curutchet (eBPF Foundation)	5d6aa606c1	selftests/bpf: test_tunnel: Add generic_attach* helpers A fair amount of code duplication is present among tests to attach BPF programs. Create generic_attach* helpers that attach BPF programs to a given interface. Use ASSERT_OK_FD() instead of ASSERT_GE() to check fd's validity. Use these helpers in all the available tests. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-1-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:26 -07:00
Eduard Zingerman	346e4ca462	veristat: Report program type guess results to sdterr In order not to pollute CSV output, e.g.: $ ./veristat -o csv exceptions_ext.bpf.o > test.csv Using guessed program type 'sched_cls' for exceptions_ext.bpf.o/extension... Using guessed program type 'sched_cls' for exceptions_ext.bpf.o/throwing_extension... Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com> Link: https://lore.kernel.org/bpf/20250301000147.1583999-4-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:26 -07:00
Eduard Zingerman	2d95b3f582	veristat: Strerror expects positive number (errno) Before: ./veristat -G @foobar iters.bpf.o Failed to open presets in 'foobar': Unknown error -2 ... After: ./veristat -G @foobar iters.bpf.o Failed to open presets in 'foobar': No such file or directory ... Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com> Link: https://lore.kernel.org/bpf/20250301000147.1583999-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:26 -07:00
Eduard Zingerman	c0d078da7a	veristat: @files-list.txt notation for object files list Allow reading object file list from file. E.g. the following command: ./veristat @list.txt Is equivalent to the following invocation: ./veristat line-1 line-2 ... line-N Where line-i corresponds to lines from list.txt. Lines starting with '#' are ignored. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com> Link: https://lore.kernel.org/bpf/20250301000147.1583999-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:26 -07:00
Kumar Kartikeya Dwivedi	72ed076abf	selftests/bpf: Add tests for extending sleepable global subprogs Add tests for freplace behavior with the combination of sleepable and non-sleepable global subprogs. The changes_pkt_data selftest did all the hardwork, so simply rename it and include new support for more summarization tests for might_sleep bit. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250301151846.1552362-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:25 -07:00
Kumar Kartikeya Dwivedi	b2bb703434	selftests/bpf: Test sleepable global subprogs in atomic contexts Add tests for rejecting sleepable and accepting non-sleepable global function calls in atomic contexts. For spin locks, we still reject all global function calls. Once resilient spin locks land, we will carefully lift in cases where we deem it safe. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250301151846.1552362-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:25 -07:00
Alexis Lothoré (eBPF Foundation)	93cf4e537e	bpf/selftests: test_select_reuseport_kern: Remove unused header test_select_reuseport_kern.c is currently including <stdlib.h>, but it does not use any definition from there. Remove stdlib.h inclusion from test_select_reuseport_kern.c Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250227-remove_wrong_header-v1-1-bc94eb4e2f73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:25 -07:00
Yonghong Song	2222aa1c89	selftests/bpf: Add selftests allowing cgroup prog pre-ordering Add a few selftests with cgroup prog pre-ordering. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250224230121.283601-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:25 -07:00
Jiayuan Chen	7f260af1f2	selftests/bpf: Fixes for test_maps test BPF CI has failed 3 times in the last 24 hours. Add retry for ENOMEM. It's similar to the optimization plan: commit `2f553b032c` ("selftsets/bpf: Retry map update for non-preallocated per-cpu map") Failed CI: https://github.com/kernel-patches/bpf/actions/runs/13549227497/job/37868926343 https://github.com/kernel-patches/bpf/actions/runs/13548089029/job/37865812030 https://github.com/kernel-patches/bpf/actions/runs/13553536268/job/37883329296 selftests/bpf: Fixes for test_maps test Fork 100 tasks to 'test_update_delete' Fork 100 tasks to 'test_update_delete' Fork 100 tasks to 'test_update_delete' Fork 100 tasks to 'test_update_delete' ...... test_task_storage_map_stress_lookup:PASS test_maps: OK, 0 SKIPPED Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://lore.kernel.org/r/20250227142646.59711-4-jiayuan.chen@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:25 -07:00
Jiayuan Chen	93a279b65a	selftests/bpf: Allow auto port binding for bpf nf Allow auto port binding for bpf nf test to avoid binding conflict. ./test_progs -a bpf_nf 24/1 bpf_nf/xdp-ct:OK 24/2 bpf_nf/tc-bpf-ct:OK 24/3 bpf_nf/alloc_release:OK 24/4 bpf_nf/insert_insert:OK 24/5 bpf_nf/lookup_insert:OK 24/6 bpf_nf/set_timeout_after_insert:OK 24/7 bpf_nf/set_status_after_insert:OK 24/8 bpf_nf/change_timeout_after_alloc:OK 24/9 bpf_nf/change_status_after_alloc:OK 24/10 bpf_nf/write_not_allowlisted_field:OK 24 bpf_nf:OK Summary: 1/10 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://lore.kernel.org/r/20250227142646.59711-3-jiayuan.chen@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:24 -07:00
Jiayuan Chen	acf0d6f681	selftests/bpf: Allow auto port binding for cgroup connect Allow auto port binding for cgroup connect test to avoid binding conflict. Result: ./test_progs -a cgroup_v1v2 59 cgroup_v1v2:OK Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://lore.kernel.org/r/20250227142646.59711-2-jiayuan.chen@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:24 -07:00
Mykyta Yatsenko	064e9aacfd	selftests/bpf: Add tests for bpf_dynptr_copy Add XDP setup type for dynptr tests, enabling testing for non-contiguous buffer. Add 2 tests: - test_dynptr_copy - verify correctness for the fast (contiguous buffer) code path. - test_dynptr_copy_xdp - verifies code paths that handle non-contiguous buffer. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250226183201.332713-4-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-03-15 11:48:20 -07:00
Alison Schofield	114a89b433	cxl/test: Define a CFMWS capable of a 3 way HB interleave The CXL unit test cxl-xor-region.sh is skipping a 1+1+1 region interleave test case because the window is not defined. Additionally, upcoming expansion of 3 way HB interleave test cases (like 2+2+2) require the same window. Replace an unused CFMWS with a 3-way capable CFMWS in the set of CFMWS's loaded when interleave_arithmetic=1. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Li Zhijian <lizhijian@fujitsu.com> Link: https://patch.msgid.link/20250226221931.2352061-1-alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 16:27:40 -07:00
Dave Jiang	763e15d047	Merge branch 'for-6.15/extended-linear-cache' into cxl-for-next2 Add support for Extended Linear Cache for CXL. Add enumeration support of the cache. Add MCE notification of the aliased memory address.	2025-03-14 16:22:34 -07:00
Dave Jiang	d781a45270	Merge branch 'for-6.15/dirty-shutdown' into cxl-for-next2 Add support for Global Persistent Flush (GPF) and dirty shutdown accounting.	2025-03-14 16:11:42 -07:00
Davidlohr Bueso	6eb52f63ea	tools/testing/cxl: Set Shutdown State support Add support to emulate the CXL Set Shutdown State operation. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20250220220235.276831-5-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:55:27 -07:00
Yuquan Wang	2da9ad027e	cxl/pmem: debug invalid serial number data In a nvdimm interleave-set each device with an invalid or zero serial number may cause pmem region initialization to fail, but in cxl case such device could still set cookies of nd_interleave_set and create a nvdimm pmem region. This adds the validation of serial number in cxl pmem region creation. The event of no serial number would cause to fail to set the cookie and pmem region. For cxl-test to work properly, always +1 on mock device's serial number. Signed-off-by: Yuquan Wang <wangyuquan1236@phytium.com.cn> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20250219040029.515451-2-wangyuquan1236@phytium.com.cn Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:01:04 -07:00
Dave Jiang	9387c6aec0	Merge branch 'for-6.15/fw-first-error-logging' into cxl-for-next2 Add logging support for CXL CPER endpoint and port protocol errors. Including the 2 patches that was completed later. Link: https://lore.kernel.org/linux-cxl/20250123084421.127697-1-Smita.KoralahalliChannabasappa@amd.com/ Link: https://lore.kernel.org/linux-cxl/20250310223839.31342-1-Smita.KoralahalliChannabasappa@amd.com/	2025-03-14 14:27:17 -07:00
Smita Koralahalli	36f257e3b0	acpi/ghes, cxl/pci: Process CXL CPER Protocol Errors When PCIe AER is in FW-First, OS should process CXL Protocol errors from CPER records. Introduce support for handling and logging CXL Protocol errors. The defined trace events cxl_aer_uncorrectable_error and cxl_aer_correctable_error trace native CXL AER endpoint errors. Reuse them to trace FW-First Protocol errors. Since the CXL code is required to be called from process context and GHES is in interrupt context, use workqueues for processing. Similar to CXL CPER event handling, use kfifo to handle errors as it simplifies queue processing by providing lock free fifo operations. Add the ability for the CXL sub-system to register a workqueue to process CXL CPER protocol errors. [DJ: return cxl_cper_register_prot_err_work() directly in cxl_ras_init()] Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://patch.msgid.link/20250310223839.31342-2-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 14:21:45 -07:00
Tamir Duberstein	97c1f302f2	scanf: convert self-test to KUnit Convert the scanf() self-test to a KUnit test. In the interest of keeping the patch reasonably-sized this doesn't refactor the tests into proper parameterized tests - it's all one big test case. Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Tamir Duberstein <tamird@gmail.com> Link: https://lore.kernel.org/r/20250307-scanf-kunit-convert-v9-3-b98820fa39ff@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>	2025-03-14 13:55:37 -07:00
Niklas Cassel	24a42582b0	selftests: pci_endpoint: Use IRQ_TYPE_* defines from UAPI header In order to improve readability, use the IRQ_TYPE_* defines from the UAPI header rather than using raw values. No functional change. Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Link: https://lore.kernel.org/r/20250310111016.859445-12-cassel@kernel.org	2025-03-14 16:12:04 +00:00
Matthieu Baerts (NGI0)	a1e36ec363	selftests: drv-net: fix merge conflicts resolution After the recent merge between net-next and net, I got some conflicts on my side because the merge resolution was different from Stephen's one [1] I applied on my side in the MPTCP tree. It looks like the code that is now in net-next is using the old way to retrieve the local and remote addresses. This patch is now using the new way, like what was in Stephen's email [1]. Also, in get_interface_info(), there were no conflicts in this area, because that was new code from 'net', but a small adaptation was needed there as well to get the remote address. Fixes: `941defcea7` ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") Link: https://lore.kernel.org/20250311115758.17a1d414@canb.auug.org.au [1] Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250314-net-next-drv-net-ping-fix-merge-v1-1-0d5c19daf707@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-14 10:24:14 +01:00
Paolo Abeni	941defcea7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.14-rc6). Conflicts: tools/testing/selftests/drivers/net/ping.py `75cc19c8ff` ("selftests: drv-net: add xdp cases for ping.py") `de94e86974` ("selftests: drv-net: store addresses in dict indexed by ipver") https://lore.kernel.org/netdev/20250311115758.17a1d414@canb.auug.org.au/ net/core/devmem.c `a70f891e0f` ("net: devmem: do not WARN conditionally after netdev_rx_queue_restart()") `1d22d3060b` ("net: drop rtnl_lock for queue_mgmt operations") https://lore.kernel.org/netdev/20250313114929.43744df1@canb.auug.org.au/ Adjacent changes: tools/testing/selftests/net/Makefile `6f50175cca` ("selftests: Add IPv6 link-local address generation tests for GRE devices.") `2e5584e0f9` ("selftests/net: expand cmsg_ipv6.sh with ipv4") drivers/net/ethernet/broadcom/bnxt/bnxt.c `661958552e` ("eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue restart logic") `fe96d717d3` ("bnxt_en: Extend queue stop/start for TX rings") Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-13 23:08:11 +01:00
Jason Xing	a1e0783e10	selftests/bpf: Add bpf_getsockopt() for TCP_BPF_DELACK_MAX and TCP_BPF_RTO_MIN Add selftests for TCP_BPF_DELACK_MAX and TCP_BPF_RTO_MIN BPF socket cases. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250312153523.9860-5-kerneljasonxing@gmail.com	2025-03-13 14:42:53 -07:00
Linus Torvalds	4003c9e787	Including fixes from netfilter, bluetooth and wireless. No known regressions outstanding. Current release - regressions: - wifi: nl80211: fix assoc link handling - eth: lan78xx: sanitize return values of register read/write functions Current release - new code bugs: - ethtool: tsinfo: fix dump command - bluetooth: btusb: configure altsetting for HCI_USER_CHANNEL - eth: mlx5: DR, use the right action structs for STEv3 Previous releases - regressions: - netfilter: nf_tables: make destruction work queue pernet - gre: fix IPv6 link-local address generation. - wifi: iwlwifi: fix TSO preparation - bluetooth: revert "bluetooth: hci_core: fix sleeping function called from invalid context" - ovs: revert "openvswitch: switch to per-action label counting in conntrack" - eth: ice: fix switchdev slow-path in LAG - eth: bonding: fix incorrect MAC address setting to receive NS messages Previous releases - always broken: - core: prevent TX of unreadable skbs - sched: prevent creation of classes with TC_H_ROOT - netfilter: nft_exthdr: fix offset with ipv4_find_option() - wifi: cfg80211: cancel wiphy_work before freeing wiphy - mctp: copy headers if cloned - phy: nxp-c45-tja11xx: add errata for TJA112XA/B - eth: bnxt: fix kernel panic in the bnxt_get_queue_stats{rx \| tx} - eth: mlx5: bridge, fix the crash caused by LAG state check Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmfS+yMSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOknd8P/3XdIaDXrwVl9BSi37qToRs40i+6qPep 5MZN+wakJkuHQqooADF/UarqHWEtQwFOTJalgsPbrUpsespA0lhmzSRHsk+2Xy// 8o8RuZ2VngN1/UVBaoXb0QOY28CgtDs0SdW6/jST/dEp3bHvW+ZTDR2aaivKAWtJ gdWIWR+Ctw1lpetKJer3loPXjPd2doDedSlf64CiHw7dRFTwiE2AcrTRkYdd18Ml HJjiBkqKMnUTxqWxOU3GJxdl7Qru7CvAfn3XQ0PjCO9hADPJb/uWZDRENauk3/jE 6lJc+EP5AoG3Ce8MLzHr/aayZR3YtyhJ2TaWmyPbe6BJq7tPH84ydKMZnF7UnOd5 Jp4T6rykkJOYnUdiYGiOPHcTxQxUugKUrL3rNvGJep+x3JAi/DJJkHNWEx+EBB55 3YbDTOUiY1bCGOoGJIZRwNzw52O0+7qINvCtAN21sMu4pKr/nYMzO8L9ldFExllj EuOcxr813V09NyoJs+wWcy9FdlVmcPwCsT4Pylrdb+mA1XohPzMwibH3yHk4XCZ6 Aijz5Bb78i4vg+kZzNZmVY11J5yL0/eC8DefMgUe3+JKwyM2TiSn5hHK6tYbTLtM yAjbTkeu/FD6alSRAnIAL+YBHg4ri3fXXjU4vLZA5n3wWIjfqpL7JAFgnK57Bd8Z 1X5k+52VjQzj =lIrc -----END PGP SIGNATURE----- Merge tag 'net-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter, bluetooth and wireless. No known regressions outstanding. Current release - regressions: - wifi: nl80211: fix assoc link handling - eth: lan78xx: sanitize return values of register read/write functions Current release - new code bugs: - ethtool: tsinfo: fix dump command - bluetooth: btusb: configure altsetting for HCI_USER_CHANNEL - eth: mlx5: DR, use the right action structs for STEv3 Previous releases - regressions: - netfilter: nf_tables: make destruction work queue pernet - gre: fix IPv6 link-local address generation. - wifi: iwlwifi: fix TSO preparation - bluetooth: revert "bluetooth: hci_core: fix sleeping function called from invalid context" - ovs: revert "openvswitch: switch to per-action label counting in conntrack" - eth: - ice: fix switchdev slow-path in LAG - bonding: fix incorrect MAC address setting to receive NS messages Previous releases - always broken: - core: prevent TX of unreadable skbs - sched: prevent creation of classes with TC_H_ROOT - netfilter: nft_exthdr: fix offset with ipv4_find_option() - wifi: cfg80211: cancel wiphy_work before freeing wiphy - mctp: copy headers if cloned - phy: nxp-c45-tja11xx: add errata for TJA112XA/B - eth: - bnxt: fix kernel panic in the bnxt_get_queue_stats{rx \| tx} - mlx5: bridge, fix the crash caused by LAG state check" * tag 'net-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits) net: mana: cleanup mana struct after debugfs_remove() net/mlx5e: Prevent bridge link show failure for non-eswitch-allowed devices net/mlx5: Bridge, fix the crash caused by LAG state check net/mlx5: Lag, Check shared fdb before creating MultiPort E-Switch net/mlx5: Fix incorrect IRQ pool usage when releasing IRQs net/mlx5: HWS, Rightsize bwc matcher priority net/mlx5: DR, use the right action structs for STEv3 Revert "openvswitch: switch to per-action label counting in conntrack" net: openvswitch: remove misbehaving actions length check selftests: Add IPv6 link-local address generation tests for GRE devices. gre: Fix IPv6 link-local address generation. netfilter: nft_exthdr: fix offset with ipv4_find_option() selftests/tc-testing: Add a test case for DRR class with TC_H_ROOT net_sched: Prevent creation of classes with TC_H_ROOT ipvs: prevent integer overflow in do_ip_vs_get_ctl() selftests: netfilter: skip br_netfilter queue tests if kernel is tainted netfilter: nf_conncount: Fully initialize struct nf_conncount_tuple in insert_tree() wifi: mac80211: fix MPDU length parsing for EHT 5/6 GHz qlcnic: fix memory leak issues in qlcnic_sriov_common.c rtase: Fix improper release of ring list entries in rtase_sw_reset ...	2025-03-13 07:58:48 -10:00
Tamir Duberstein	7a79e7daa8	printf: convert self-test to KUnit Convert the printf() self-test to a KUnit test. In the interest of keeping the patch reasonably-sized this doesn't refactor the tests into proper parameterized tests - it's all one big test case. Signed-off-by: Tamir Duberstein <tamird@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20250307-printf-kunit-convert-v6-1-4d85c361c241@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>	2025-03-13 10:26:33 -07:00
Paolo Abeni	2409fa66e2	netfilter pull request 25-03-13 -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEjF9xRqF1emXiQiqU1w0aZmrPKyEFAmfSqyUACgkQ1w0aZmrP KyGocBAAu0+rdAcW1wwSYk53tRN663WpbbYg/AchJ9ilLH2IXdCzzbeuekRzvSmS /gS2WNANmckL314300RV5vTd9+PCbANjh1248O4S8ZQnhmomoiB3xD4QM9fu8HKZ RE+koHmnHESl+TiiUzgw31OtZEdSpjn7zUzm+qAznBeDXfyBE1YWQ9q8LALi0+3m RW/waYykwRktzHCANBYNF0oqt6siBrHnI9KHgXIfQCEK0mc3A/s8MfO1IJqC177Y It/+4a6qsaGZZsSwGCrQ/40CO3rItzF7jefpeAjv/di224EQ+p2sWZgEiLPCMtvk 6WSfYhKNiDnKGAe1u9hCG2RdaSQ8Og7BsBWFeB56gZbK7gPoFY2VCM/hATsBTYG8 bn8xQKFoA9y7dYp8AP6a33OlzV+/qOfJa2kXQ90yjS4yWnMY+q59j1D564z/WhZ9 gVZjrcUDSsv3ETtp7vk7s+0SoI3EvSLb/umcjHZR6oeLkNFWpQ4r9NqY0GX6+zDC 89YR2w/RbO9WMRb52OAqY3HdKmOflwEIN5mDWCAB4sIPUgCpoDzvIjC2xLdr6lUW ydoisH2B87luG0p7zKP7iGE1KIp8wm+zO+XciOrcjyie5YNMHyCFRbzGAsd47CLB UjNrrlAS0tU65Nb+JzsrZOYye2cghE+KC8fKHvpwjk54lsaPHl0= =OTtx -----END PGP SIGNATURE----- Merge tag 'nf-25-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for net: 1) Missing initialization of cpu and jiffies32 fields in conncount, from Kohei Enju. 2) Skip several tests in case kernel is tainted, otherwise tests bogusly report failure too as they also check for tainted kernel, from Florian Westphal. 3) Fix a hyphothetical integer overflow in do_ip_vs_get_ctl() leading to bogus error logs, from Dan Carpenter. 4) Fix incorrect offset in ipv4 option match in nft_exthdr, from Alexey Kashavkin. netfilter pull request 25-03-13 * tag 'nf-25-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_exthdr: fix offset with ipv4_find_option() ipvs: prevent integer overflow in do_ip_vs_get_ctl() selftests: netfilter: skip br_netfilter queue tests if kernel is tainted netfilter: nf_conncount: Fully initialize struct nf_conncount_tuple in insert_tree() ==================== Link: https://patch.msgid.link/20250313095636.2186-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-13 15:07:39 +01:00
Thomas Gleixner	8e63360d86	selftests/timers/posix-timers: Add a test for exact allocation mode The exact timer ID allocation mode is used by CRIU to restore timers with a given ID. Add a test case for it. It's skipped on older kernels when the prctl() fails. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/8734fl2tkx.ffs@tglx	2025-03-13 12:07:18 +01:00
Guillaume Nault	6f50175cca	selftests: Add IPv6 link-local address generation tests for GRE devices. GRE devices have their special code for IPv6 link-local address generation that has been the source of several regressions in the past. Add selftest to check that all gre, ip6gre, gretap and ip6gretap get an IPv6 link-link local address in accordance with the net.ipv6.conf.<dev>.addr_gen_mode sysctl. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/2d6772af8e1da9016b2180ec3f8d9ee99f470c77.1741375285.git.gnault@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-13 10:17:42 +01:00
Sebastian Ott	edfd826b8b	KVM: arm64: selftests: Test that TGRAN*_2 fields are writable Userspace can write to these fields for non-NV guests; add test that do just that. Signed-off-by: Sebastian Ott <sebott@redhat.com> Link: https://lore.kernel.org/kvmarm/20250306184013.30008-1-sebott@redhat.com/ Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-03-12 13:40:53 -07:00
Jakub Kicinski	188107b2c4	selftests: net: bump GRO timeout for gro/setup_veth Commit `51bef03e1a` ("selftests/net: deflake GRO tests") recently switched to NAPI suspension, and lowered the timeout from 1ms to 100us. This started causing flakes in netdev-run CI. Let's bump it to 200us. In a quick test of a debug kernel I see failures with 100us, with 200us in 5 runs I see 2 completely clean runs and 3 with a single retry (GRO test will retry up to 5 times). Reviewed-by: Kevin Krakauer <krakauer@google.com> Link: https://patch.msgid.link/20250310110821.385621-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-12 13:20:04 -07:00
Cong Wang	bb7737de5f	selftests/tc-testing: Add a test case for DRR class with TC_H_ROOT Integrate the reproduer from Mingi to TDC. All test results: 1..4 ok 1 0385 - Create DRR with default setting ok 2 2375 - Delete DRR with handle ok 3 3092 - Show DRR class ok 4 4009 - Reject creation of DRR class with classid TC_H_ROOT Cc: Mingi Cho <mincho@theori.io> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250306232355.93864-3-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-12 12:52:14 -07:00
Florian Westphal	c21b02fd9c	selftests: netfilter: skip br_netfilter queue tests if kernel is tainted These scripts fail if the kernel is tainted which leads to wrong test failure reports in CI environments when an unrelated test triggers some splat. Check taint state at start of script and SKIP if its already dodgy. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2025-03-12 15:28:34 +01:00
Michael Jeanson	e6644c967d	rseq/selftests: Ensure the rseq ABI TLS is actually 1024 bytes Adding the aligned(1024) attribute to the definition of __rseq_abi did not increase its size to 1024, for this attribute to impact the size of __rseq_abi it would need to be added to the declaration of 'struct rseq_abi'. We only want to increase the size of the TLS allocation to ensure registration will succeed with future extended ABI. Use a union with a dummy member to ensure we allocate 1024 bytes. Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/r/20250311192222.323453-1-mjeanson@efficios.com	2025-03-12 13:19:47 +01:00
Madhavan Srinivasan	73276cee1a	selftest/powerpc/mm/pkey: fix build-break introduced by commit `00894c3fc9` Build break was reported in the powerpc mailing list for next-20250218 with below errors make[1]: Nothing to be done for 'all'. BUILD_TARGET=/root/venkat/linux-next/tools/testing/selftests/powerpc/mm; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C mm all CC pkey_exec_prot In file included from pkey_exec_prot.c:18: /root/venkat/linux-next/tools/testing/selftests/powerpc/include/pkeys.h: In function ‘pkeys_unsupported’: /root/venkat/linux-next/tools/testing/selftests/powerpc/include/pkeys.h:96:34: error: ‘PKEY_UNRESTRICTED’ undeclared (first use in this function) 96 \| pkey = sys_pkey_alloc(0, PKEY_UNRESTRICTED); \| ^~~~~~~~~~~~~~~~~ https://lore.kernel.org/all/20250113170619.484698-2-yury.khrustalev@arm.com/ patchset has been queued to arm64/for-next/pkey_unrestricted which is causing a build break in the selftest/powerpc builds. Commit `6d61527d93` ("mm/pkey: Add PKEY_UNRESTRICTED macro") added a macro PKEY_UNRESTRICTED to handle implicit literal value of 0x0 (which is "unrestricted"). Add the same to selftest/powerpc/pkeys.h to fix the reported build break. Fixes: `00894c3fc9` ("selftests/powerpc: Use PKEY_UNRESTRICTED macro") Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Closes: https://lore.kernel.org/lkml/3267ea6e-5a1a-4752-96ef-8351c912d386@linux.ibm.com/T/ Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://lore.kernel.org/r/20250311084129.39308-1-maddy@linux.ibm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-03-11 15:16:10 +00:00
Hangbin Liu	9318dc2357	selftests: bonding: fix incorrect mac address The correct mac address for NS target 2001:db8::254 is 33:33:ff:00:02:54, not 33:33:00:00:02:54. The same with client maddress. Fixes: `86fb6173d1` ("selftests: bonding: add ns multicast group testing") Acked-by: Jay Vosburgh <jv@jvosburgh.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250306023923.38777-3-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-11 13:19:27 +01:00
Miklos Szeredi	e1c24b52ad	selftests: add tests for mount notification Provide coverage for all mnt_notify_add() instances. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://lore.kernel.org/r/20250307204046.322691-1-mszeredi@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-11 12:04:02 +01:00
Ming Lei	390174c91d	selftests: ublk: improve test usability Add UBLK_TEST_QUIET, so we can print test result(PASS/SKIP/FAIL) only. Also always run from test script's current directory, then the same test script can be started from other work directory. This way helps a lot to reuse this test source code and scripts for other projects(liburing, blktests, ...) Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-12-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	af83ccc7db	selftests: ublk: add stress test for covering IO vs. killing ublk server Add stress_test_01 for running IO vs. killing ublk server, so io_uring exit & cancel code path can be covered, same with ublk's cancel code path. Especially IO buffer lifetime is one big thing for ublk zero copy, the added test can verify if this area works as expected. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-11-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	c60ac48eab	selftests: ublk: add one stress test for covering IO vs. removing device Add stress_test_01 for running IO vs. removing device for verifying that ublk device removal can work as expected when heavy IO workloads are in progress. null, loop and loop/zc are covered in this tests. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-10-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	87a9265213	selftests: ublk: load/unload ublk_drv when preparing & cleaning up tests Load ublk_drv module in _prep_test(), and unload it in _cleanup_test(), so that test can always be done in consistent state. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-9-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	c2cb669a86	selftests: ublk: move zero copy feature check into _add_ublk_dev() Move zero copy feature check into _add_ublk_dev() since we will have more tests which requires to cover zero copy. Then one check function of _check_add_dev() has to be added for dealing with cleanup since '_add_ublk_dev()' is run in sub-shell, and we can't exit from it to terminal shell. Meantime always return error code from _add_ublk_dev(). Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-8-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	c83b089a70	selftests: ublk: don't pass ${dev_id} to _cleanup_test() More devices can be created in single tests, so simply remove all ublk devices in _cleanup_test(), meantime remove the ${dev_id} argument of _cleanup_test(). Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-7-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	632051ffbd	selftests: ublk: support shellcheck and fix all warning Add shellcheck, meantime fixes all warnings. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250303124324.3563605-6-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:42 -06:00
Ming Lei	5b2db7a8c7	selftests: ublk: fix parsing '-a' argument The argument of '-a' doesn't follow any value, so fix it by putting it with '-z' together. Fixes: `bedc9cbc5f` ("selftests: ublk: add ublk zero copy test") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250303124324.3563605-5-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 16:24:18 -06:00
Taehee Yoo	75cc19c8ff	selftests: drv-net: add xdp cases for ping.py ping.py has 3 cases, test_v4, test_v6 and test_tcp. But these cases are not executed on the XDP environment. So, it adds XDP environment, existing tests(test_v4, test_v6, and test_tcp) are executed too on the below XDP environment. So, it adds XDP cases. 1. xdp-generic + single-buffer 2. xdp-generic + multi-buffer 3. xdp-native + single-buffer 4. xdp-native + multi-buffer 5. xdp-offload It also makes test_{v4 \| v6 \| tcp} sending large size packets. this may help to check whether multi-buffer is working or not. Note that the physical interface may be down and then up when xdp is attached or detached. This takes some period to activate traffic. So sleep(10) is added if the test interface is the physical interface. netdevsim and veth type interfaces skip sleep. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://patch.msgid.link/20250309134219.91670-9-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-10 13:31:12 -07:00
Willem de Bruijn	0922cb68ed	selftests/net: expand cmsg_ip with MSG_MORE UDP send with MSG_MORE takes a slightly different path than the lockless fast path. For completeness, add coverage to this case too. Pass MSG_MORE on the initial sendmsg, then follow up with a zero byte write to unplug the cork. Unrelated: also add two missing endlines in usage(). Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250307033620.411611-4-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-10 13:13:04 -07:00
Greg Kroah-Hartman	993a47bd7b	Linux 6.14-rc6 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmfOKBUeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG1aQH/iC+Oyij4VxAjBek BOXIT/p6CwlIXb8ObiWWcRjDPizlcxb3RaV8J2RO+IqaQ2wltxpFANq2G7Re2FPm SNcEpIURAOVcxHGedcfFA91srO5F4FzNTO8LVp7MIbcgMYy3pdk+dbZmi6A691R+ t9pb74m+MAnF1o/MUx7pUlhAT/4ymuuR0F7WCSg4h0Xwe5m0nlJY89kJBC7PCjyd n3mdhsz3rDSLmt/z/T7HGD89r8sYSvm9cOKtL3ELgGTrm7boQV8ii9Y9w04DI8PQ JmIernugcCxmhH36mVUAHgJf2+/T388xFUh/D5+skeUOUZpaJZG866rnb32WpsHc eWLFUeg= =Wypt -----END PGP SIGNATURE----- Merge 6.14-rc6 into driver-core-next We need the driver core fix in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-03-10 17:37:25 +01:00
Ming Lei	2ecdcdfee5	selftests: ublk: add --foreground command line Add --foreground command for helping to debug. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250303124324.3563605-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 09:17:13 -06:00
Ming Lei	9d80f48c5e	selftests: ublk: fix build failure Fixes the following build failure: ublk//file_backed.c: In function ‘backing_file_tgt_init’: ublk//file_backed.c:28:42: error: ‘O_DIRECT’ undeclared (first use in this function); did you mean ‘O_DIRECTORY’? 28 \| fd = open(file, O_RDWR \| O_DIRECT); \| ^~~~~~~~ \| O_DIRECTORY when trying to reuse this same utility for liburing test. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250303124324.3563605-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 09:17:13 -06:00
Ming Lei	9894e0eaae	selftests: ublk: make ublk_stop_io_daemon() more reliable Improve ublk_stop_io_daemon() in the following ways: - don't wait if ->ublksrv_pid becomes -1, which means that the disk has been stopped - don't wait if ublk char device doesn't exist any more, so we can avoid to rely on inoitfy for wait until the char device is closed And this way may reduce time of delete command a lot. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250303124324.3563605-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-03-10 09:17:13 -06:00
Linus Torvalds	a382b06d29	KVM/arm64 fixes for 6.14, take #4 * Fix a couple of bugs affecting pKVM's PSCI relay implementation when running in the hVHE mode, resulting in the host being entered with the MMU in an unknown state, and EL2 being in the wrong mode. x86: * Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow. * Ensure DEBUGCTL is context switched on AMD to avoid running the guest with the host's value, which can lead to unexpected bus lock #DBs. * Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't properly emulate BTF. KVM's lack of context switching has meant BTF has always been broken to some extent. * Always save DR masks for SNP vCPUs if DebugSwap is supported, as the guest can enable DebugSwap without KVM's knowledge. * Fix a bug in mmu_stress_tests where a vCPU could finish the "writes to RO memory" phase without actually generating a write-protection fault. * Fix a printf() goof in the SEV smoke test that causes build failures with -Werror. * Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when PERFMON_V2 isn't supported by KVM. -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmfNSeUUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNKngf/cLgQAT9AF4nFqcwh5b5uucKHVJ8W uTiGlWqLAf2UN53L63eZ/7vKQWGQYkOTFvormR14Jam6IYtytsZw1xLBH4fGtUyB qVjk0EPzaKGqn3LrgyneQNCXdyxJv7EBVBgoOKH0pvOksoW2E5ZizhhtRFtL7nCE Yk8FQKpP0mIBk04RMsvzJVEFKIb4OZgJadWo0gryg1oF2aAv7mxQjyqUWsBDsb3q 99c0ElSBfV39FeT8xeok4k7S5jbBWii2KiaH72ZsNiBu0rYmEuLwIoygCNNWL9Wu FPdQ+r//YrzfCJSXwGPfdUaRaF4p2642S6oiXQuusNNUmhK6/MRo3mZo8A== =XQHm -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Paolo Bonzini: "arm64: - Fix a couple of bugs affecting pKVM's PSCI relay implementation when running in the hVHE mode, resulting in the host being entered with the MMU in an unknown state, and EL2 being in the wrong mode x86: - Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow - Ensure DEBUGCTL is context switched on AMD to avoid running the guest with the host's value, which can lead to unexpected bus lock #DBs - Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't properly emulate BTF. KVM's lack of context switching has meant BTF has always been broken to some extent - Always save DR masks for SNP vCPUs if DebugSwap is supported, as the guest can enable DebugSwap without KVM's knowledge - Fix a bug in mmu_stress_tests where a vCPU could finish the "writes to RO memory" phase without actually generating a write-protection fault - Fix a printf() goof in the SEV smoke test that causes build failures with -Werror - Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when PERFMON_V2 isn't supported by KVM" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Explicitly zero EAX and EBX when PERFMON_V2 isn't supported by KVM KVM: selftests: Fix printf() format goof in SEV smoke test KVM: selftests: Ensure all vCPUs hit -EFAULT during initial RO stage KVM: SVM: Don't rely on DebugSwap to restore host DR0..DR3 KVM: SVM: Save host DR masks on CPUs with DebugSwap KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu() KVM: arm64: Initialize HCR_EL2.E2H early KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs KVM: SVM: Manually context switch DEBUGCTL if LBR virtualization is disabled KVM: x86: Snapshot the host's DEBUGCTL in common x86 KVM: SVM: Suppress DEBUGCTL.BTF on AMD KVM: SVM: Drop DEBUGCTL[5:2] from guest's effective value KVM: selftests: Assert that STI blocking isn't set after event injection KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the STI shadow	2025-03-09 09:04:08 -10:00
Paolo Bonzini	ea9bd29a9c	KVM x86 fixes for 6.14-rcN #2 - Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow. - Ensure DEBUGCTL is context switched on AMD to avoid running the guest with the host's value, which can lead to unexpected bus lock #DBs. - Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't properly emulate BTF. KVM's lack of context switching has meant BTF has always been broken to some extent. - Always save DR masks for SNP vCPUs if DebugSwap is supported, as the guest can enable DebugSwap without KVM's knowledge. - Fix a bug in mmu_stress_tests where a vCPU could finish the "writes to RO memory" phase without actually generating a write-protection fault. - Fix a printf() goof in the SEV smoke test that causes build failures with -Werror. - Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when PERFMON_V2 isn't supported by KVM. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmfLlhUACgkQOlYIJqCj N/0x7w/+MhqJdHbshL7Gzw+rcXwCROiCkqsxFP+YoTXte8uaHS5CEfcMYjE8SuGp KBpgLo4Lj1dVTXiCjemlY5sn6CDiuSs74X8A88ksuu5hVsFByJUgyWU9iw8J/crZ B2vj8huhqa8OCEPe5JujWfnfyAkKE5tUA4GFi73vhHMcftTNj+ftxT33/Pfg7y7M xOvWFWS6ZshKrouRKzI7ZFEYLwp0lr4U3dzO5rCRAd5J4MSBWRx6Dx2um5dyEYKJ xgwl4ylM4S/+78u1+0nQnToM0UWHJ3e7x8nze6UXYTZIrBr/lSeKlbhOPnEWJcJB Eemnur9ORI2BRPUReqBKluCZsSK+E5B/HPCVt5cxtuRIuUOD+kW17LPgnPyE4Sso eVt+XAvQc7EjrpWDSHr3ZQZZM89l9zHhuSAQ0npO6y71s0FzEVZQoDamNmOLAPjH Qg+qhBV2l6pyfqhqiLzADasYLOl57cJsfiMjM331ALLqAn57jzd+B8c4hdB2Xg4s KPuy8w8uBaY9zpd9YDBLLr7JJVs35KexNZMjT2vqBYXcScyLgmAuSQXy3hub6Mzn gI5ZXIKG8eO9v2jejfClI6/OEdtEwgSGEVwuBKB16pMrIxqpguMTMTWLVRn5G+oo qA8anmKaac62GaB66JE/Wjy069OPIGYnHSU2nal0Tej6kG0xv6E= =as6u -----END PGP SIGNATURE----- Merge tag 'kvm-x86-fixes-6.14-rcN.2' of https://github.com/kvm-x86/linux into HEAD KVM x86 fixes for 6.14-rcN #2 - Set RFLAGS.IF in C code on SVM to get VMRUN out of the STI shadow. - Ensure DEBUGCTL is context switched on AMD to avoid running the guest with the host's value, which can lead to unexpected bus lock #DBs. - Suppress DEBUGCTL.BTF on AMD (to match Intel), as KVM doesn't properly emulate BTF. KVM's lack of context switching has meant BTF has always been broken to some extent. - Always save DR masks for SNP vCPUs if DebugSwap is supported, as the guest can enable DebugSwap without KVM's knowledge. - Fix a bug in mmu_stress_tests where a vCPU could finish the "writes to RO memory" phase without actually generating a write-protection fault. - Fix a printf() goof in the SEV smoke test that causes build failures with -Werror. - Explicitly zero EAX and EBX in CPUID.0x8000_0022 output when PERFMON_V2 isn't supported by KVM.	2025-03-09 03:44:06 -04:00
Linus Torvalds	1110ce6a1e	33 hotfixes. 24 are cc:stable and the remainder address post-6.13 issues or aren't considered necessary for -stable kernels. 26 are for MM and 7 are for non-MM. - "mm: memory_failure: unmap poisoned folio during migrate properly" from Ma Wupeng fixes a couple of two year old bugs involving the migration of hwpoisoned folios. - "selftests/damon: three fixes for false results" from SeongJae Park fixes three one year old bugs in the SAMON selftest code. The remainder are singletons and doubletons. Please see the individual changelogs for details. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ8zgnAAKCRDdBJ7gKXxA jmTzAP9LsTIpkPkRXDpwxPR/Si5KPwOkE6sGj4ETEqbX3vUvcAEA/Lp7oafc7Vqr XxlC1VFush1ZK29Tecxzvnapl2/VSAs= =zBvY -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "33 hotfixes. 24 are cc:stable and the remainder address post-6.13 issues or aren't considered necessary for -stable kernels. 26 are for MM and 7 are for non-MM. - "mm: memory_failure: unmap poisoned folio during migrate properly" from Ma Wupeng fixes a couple of two year old bugs involving the migration of hwpoisoned folios. - "selftests/damon: three fixes for false results" from SeongJae Park fixes three one year old bugs in the SAMON selftest code. The remainder are singletons and doubletons. Please see the individual changelogs for details" * tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (33 commits) mm/page_alloc: fix uninitialized variable rapidio: add check for rio_add_net() in rio_scan_alloc_net() rapidio: fix an API misues when rio_add_net() fails MAINTAINERS: .mailmap: update Sumit Garg's email address Revert "mm/page_alloc.c: don't show protection in zone's ->lowmem_reserve[] for empty zone" mm: fix finish_fault() handling for large folios mm: don't skip arch_sync_kernel_mappings() in error paths mm: shmem: remove unnecessary warning in shmem_writepage() userfaultfd: fix PTE unmapping stack-allocated PTE copies userfaultfd: do not block on locking a large folio with raised refcount mm: zswap: use ATOMIC_LONG_INIT to initialize zswap_stored_pages mm: shmem: fix potential data corruption during shmem swapin mm: fix kernel BUG when userfaultfd_move encounters swapcache selftests/damon/damon_nr_regions: sort collected regiosn before checking with min/max boundaries selftests/damon/damon_nr_regions: set ops update for merge results check to 100ms selftests/damon/damos_quota: make real expectation of quota exceeds include/linux/log2.h: mark is_power_of_2() with __always_inline NFS: fix nfs_release_folio() to not deadlock via kcompactd writeback mm, swap: avoid BUG_ON in relocate_cluster() mm: swap: use correct step in loop to wait all clusters in wait_for_allocation() ...	2025-03-08 14:34:06 -10:00
Kunihiko Hayashi	a28d2f2398	selftests: pci_endpoint: Add GET_IRQTYPE checks to each interrupt test Add GET_IRQTYPE API checks to each interrupt test. While at it, change pci_ep_ioctl() to get the appropriate return value from ioctl(). Suggested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Link: https://lore.kernel.org/r/20250225110252.28866-2-hayashi.kunihiko@socionext.com [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>	2025-03-08 14:36:00 +00:00
Niklas Cassel	af1451b673	selftests: pci_endpoint: Skip disabled BARs Currently BARs that have been disabled by the endpoint controller driver will result in a test FAIL. Returning FAIL for a BAR that is disabled seems overly pessimistic. There are EPC that disables one or more BARs intentionally. One reason for this is that there are certain EPCs that are hardwired to expose internal PCIe controller registers over a certain BAR, so the EPC driver disables such a BAR, such that the host will not overwrite random registers during testing. Such a BAR will be disabled by the EPC driver's init function, and the BAR will be marked as BAR_RESERVED, such that it will be unavailable to endpoint function drivers. Let's return FAIL only for BARs that are actually enabled and failed the test, and let's return skip for BARs that are not even enabled. Signed-off-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250123120147.3603409-4-cassel@kernel.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>	2025-03-08 14:35:57 +00:00
Jakub Kicinski	93b1e05517	bpf-next-for-netdev -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQ6NaUOruQGUkvPdG4raS+Z+3y5EwUCZ8qHFwAKCRAraS+Z+3y5 E6QRAQCI/irYxpT6nBgT3ejyO6CIHjbHjdNP5KRkE8JT61zmJgD8Diet4dwwOxLK c1Pq5Jnl9KLpEPG99TcjtH3c++N3fww= =ltml -----END PGP SIGNATURE----- Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-03-06 We've added 6 non-merge commits during the last 13 day(s) which contain a total of 6 files changed, 230 insertions(+), 56 deletions(-). The main changes are: 1) Add XDP metadata support for tun driver, from Marcus Wichelmann. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: selftests/bpf: Fix file descriptor assertion in open_tuntap helper selftests/bpf: Add test for XDP metadata support in tun driver selftests/bpf: Refactor xdp_context_functional test and bpf program selftests/bpf: Move open_tuntap to network helpers net: tun: Enable transfer of XDP metadata to skb net: tun: Enable XDP metadata support ==================== Link: https://patch.msgid.link/20250307055335.441298-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-07 19:08:49 -08:00
Willem de Bruijn	7ae495a537	selftests/net: add proc_net_pktgen to .gitignore Ensure git doesn't pick up this new target. Fixes: `03544faad7` ("selftest: net: add proc_net_pktgen") Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250307031356.368350-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-07 19:08:14 -08:00
Linus Torvalds	2a520073e7	s390 updates for 6.14-rc6 - Fix return address recovery of traced function in ftrace to ensure reliable stack unwinding - Fix compiler warnings and runtime crashes of vDSO selftests on s390 by introducing a dedicated GNU hash bucket pointer with correct 32-bit entry size - Fix test_monitor_call() inline asm, which misses CC clobber, by switching to an instruction that doesn't modify CC -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmfLedEACgkQjYWKoQLX FBhXMQf/bIaJEOV/Ww5PWY+AqtHPpQjhP2cvOfSRoh2py38Gp1sy/6ddpn8lcl3R skyeW4XHLpARYGg6NNOAOgFr43c7wQYrr+7uSW0zska+hOlSOi5TtycEOgu7o+VE Nig5XRuBPc01CBokFc+gbvMTjp2fTVlzTQ2/F/B7py62j2c21Y2/n7yZ4hhfQolM oiWlRDXzCRHM/sfa3FlML6sSgfIcHhhMSZLpdmRDhir9vBm1VpeGHa+FM5V8Pzsk KsiuoH7+ylbw+swVn2V/p3fqfS7JoopYdng8h40eLRkeHaTi052+NmMwv8gYAnCk BxfC1re5KlZ6kgWh0CDLUoYQHdwHxw== =KCjH -----END PGP SIGNATURE----- Merge tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix return address recovery of traced function in ftrace to ensure reliable stack unwinding - Fix compiler warnings and runtime crashes of vDSO selftests on s390 by introducing a dedicated GNU hash bucket pointer with correct 32-bit entry size - Fix test_monitor_call() inline asm, which misses CC clobber, by switching to an instruction that doesn't modify CC * tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/ftrace: Fix return address recovery of traced function selftests/vDSO: Fix GNU hash table entry size for s390x s390/traps: Fix test_monitor_call() inline assembly	2025-03-07 16:21:02 -10:00
Thomas Weißschuh	a782d45c86	selftests/nolibc: stop testing constructor order The execution order of constructors in undefined and depends on the toolchain. While recent toolchains seems to have a stable order, it doesn't work for older ones and may also change at any time. Stop validating the order and instead only validate that all constructors are executed. Reported-by: Willy Tarreau <w@1wt.eu> Closes: https://lore.kernel.org/lkml/20250301110735.GA18621@1wt.eu/ Link: https://lore.kernel.org/r/20250306-nolibc-constructor-order-v1-1-68fd161cc5ec@weissschuh.net Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-03-07 07:34:12 +01:00
Jakub Kicinski	1cc3462159	selftests: openvswitch: don't hardcode the drop reason subsys WiFi removed one of their subsys entries from drop reasons, in commit `286e696770` ("wifi: mac80211: Drop cooked monitor support") SKB_DROP_REASON_SUBSYS_OPENVSWITCH is now 2 not 3. The drop reasons are not uAPI, read the correct value from debug info. We need to enable vmlinux BTF, otherwise pahole needs a few GB of memory to decode the enum name. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Aaron Conole <aconole@redhat.com> Link: https://patch.msgid.link/20250304180615.945945-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-06 16:45:43 -08:00
Jakub Kicinski	56a586961b	selftests: net: bpf_offload: add 'libbpf_global' to ignored maps After installing pahole on the CI image we have a new map created by libbpf. Ignore it otherwise we see: Exception: Time out waiting for map counts to stabilize want 2, have 3 Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250304233204.1139251-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-06 15:41:37 -08:00
Jakub Kicinski	f9d2f5ddd4	selftests: net: fix error message in bpf_offload We hit a following exception on timeout, nmaps is never set: Test bpftool bound info reporting (own ns)... Traceback (most recent call last): File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 1128, in <module> check_dev_info(False, "") File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 583, in check_dev_info maps = bpftool_map_list_wait(expected=2, ns=ns) File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 215, in bpftool_map_list_wait raise Exception("Time out waiting for map counts to stabilize want %d, have %d" % (expected, nmaps)) NameError: name 'nmaps' is not defined Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250304233204.1139251-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-06 15:41:37 -08:00
Louis Taylor	6e406202a4	selftests/nolibc: use O_RDONLY flag instead of 0 This doesn't matter much, but is what the standard says. Signed-off-by: Louis Taylor <louis@kragniz.eu> Link: https://lore.kernel.org/r/20250306184147.208723-5-louis@kragniz.eu Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-03-06 22:30:21 +01:00
Louis Taylor	b2edaad7f5	tools/nolibc: add support for openat(2) openat is useful to avoid needing to construct relative paths, so expose a wrapper for using it directly. Signed-off-by: Louis Taylor <louis@kragniz.eu> Link: https://lore.kernel.org/r/20250306184147.208723-1-louis@kragniz.eu Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-03-06 22:30:20 +01:00
Ingo Molnar	82354fce16	Merge branch 'sched/urgent' into sched/core, to pick up dependent commits Signed-off-by: Ingo Molnar <mingo@kernel.org>	2025-03-06 22:26:36 +01:00
Jakub Kicinski	2525e16a2b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.14-rc6). Conflicts: net/ethtool/cabletest.c `2bcf4772e4` ("net: ethtool: try to protect all callback with netdev instance lock") `637399bf7e` ("net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device") No Adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-06 13:03:35 -08:00
Marcus Wichelmann	49306d5bfc	selftests/bpf: Fix file descriptor assertion in open_tuntap helper The open_tuntap helper function uses open() to get a file descriptor for /dev/net/tun. The open(2) manpage writes this about its return value: On success, open(), openat(), and creat() return the new file descriptor (a nonnegative integer). On error, -1 is returned and errno is set to indicate the error. This means that the fd > 0 assertion in the open_tuntap helper is incorrect and should rather check for fd >= 0. When running the BPF selftests locally, this incorrect assertion was not an issue, but the BPF kernel-patches CI failed because of this: open_tuntap:FAIL:open(/dev/net/tun) unexpected open(/dev/net/tun): actual 0 <= expected 0 Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250305213438.3863922-7-marcus.wichelmann@hetzner-cloud.de	2025-03-06 12:31:08 -08:00
Marcus Wichelmann	73eeecc3cd	selftests/bpf: Add test for XDP metadata support in tun driver Add a selftest that creates a tap device, attaches XDP and TC programs, writes a packet with a test payload into the tap device and checks the test result. This test ensures that the XDP metadata support in the tun driver is enabled and that the metadata size is correctly passed to the skb. See the previous commit ("selftests/bpf: refactor xdp_context_functional test and bpf program") for details about the test design. The test runs in its own network namespace. This provides some extra safety against conflicting interface names. Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250305213438.3863922-6-marcus.wichelmann@hetzner-cloud.de	2025-03-06 12:31:08 -08:00
Marcus Wichelmann	b46aa22b66	selftests/bpf: Refactor xdp_context_functional test and bpf program The existing XDP metadata test works by creating a veth pair and attaching XDP & TC programs that drop the packet when the condition of the test isn't fulfilled. The test then pings through the veth pair and succeeds when the ping comes through. While this test works great for a veth pair, it is hard to replicate for tap devices to test the XDP metadata support of them. A similar test for the tun driver would either involve logic to reply to the ping request, or would have to capture the packet to check if it was dropped or not. To make the testing of other drivers easier while still maximizing code reuse, this commit refactors the existing xdp_context_functional test to use a test_result map. Instead of conditionally passing or dropping the packet, the TC program is changed to copy the received metadata into the value of that single-entry array map. Tests can then verify that the map value matches the expectation. This testing logic is easy to adapt to other network drivers as the only remaining requirement is that there is some way to send a custom Ethernet packet through it that triggers the XDP & TC programs. The Ethernet header of that custom packet is all-zero, because it is not required to be valid for the test to work. The zero ethertype also helps to filter out packets that are not related to the test and would otherwise interfere with it. The payload of the Ethernet packet is used as the test data that is expected to be passed as metadata from the XDP to the TC program and written to the map. It has a fixed size of 32 bytes which is a reasonable size that should be supported by both drivers. Additional packet headers are not necessary for the test and were therefore skipped to keep the testing code short. This new testing methodology no longer requires the veth interfaces to have IP addresses assigned, therefore these were removed. Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250305213438.3863922-5-marcus.wichelmann@hetzner-cloud.de	2025-03-06 12:31:08 -08:00
Marcus Wichelmann	d5ca409c86	selftests/bpf: Move open_tuntap to network helpers To test the XDP metadata functionality of the tun driver, it's necessary to create a new tap device first. A helper function for this already exists in lwt_helpers.h. Move it to the common network helpers header, so it can be reused in other tests. Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250305213438.3863922-4-marcus.wichelmann@hetzner-cloud.de	2025-03-06 12:31:08 -08:00
SeongJae Park	582ccf78f6	selftests/damon/damon_nr_regions: sort collected regiosn before checking with min/max boundaries damon_nr_regions.py starts DAMON, periodically collect number of regions in snapshots, and see if it is in the requested range. The check code assumes the numbers are sorted on the collection list, but there is no such guarantee. Hence this can result in false positive test success. Sort the list before doing the check. Link: https://lkml.kernel.org/r/20250225222333.505646-4-sj@kernel.org Fixes: `781497347d` ("selftests/damon: implement test for min/max_nr_regions") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-05 21:36:16 -08:00
SeongJae Park	695469c07a	selftests/damon/damon_nr_regions: set ops update for merge results check to 100ms damon_nr_regions.py updates max_nr_regions to a number smaller than expected number of real regions and confirms DAMON respect the harsh limit. To give time for DAMON to make changes for the regions, 3 aggregation intervals (300 milliseconds) are given. The internal mechanism works with not only the max_nr_regions, but also sz_limit, though. It avoids merging region if that casn make region of size larger than sz_limit. In the test, sz_limit is set too small to achive the new max_nr_regions, unless it is updated for the new min_nr_regions. But the update is done only once per operations set update interval, which is one second by default. Hence, the test randomly incurs false positive failures. Fix it by setting the ops interval same to aggregation interval, to make sure sz_limit is updated by the time of the check. Link: https://lkml.kernel.org/r/20250225222333.505646-3-sj@kernel.org Fixes: `8bf890c816` ("selftests/damon/damon_nr_regions: test online-tuned max_nr_regions") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-05 21:36:16 -08:00
SeongJae Park	1c684d77df	selftests/damon/damos_quota: make real expectation of quota exceeds Patch series "selftests/damon: three fixes for false results". Fix three DAMON selftest bugs that cause two and one false positive failures and successes. This patch (of 3): damos_quota.py assumes the quota will always exceeded. But whether quota will be exceeded or not depend on the monitoring results. Actually the monitored workload has chaning access pattern and hence sometimes the quota may not really be exceeded. As a result, false positive test failures happen. Expect how much time the quota will be exceeded by checking the monitoring results, and use it instead of the naive assumption. Link: https://lkml.kernel.org/r/20250225222333.505646-1-sj@kernel.org Link: https://lkml.kernel.org/r/20250225222333.505646-2-sj@kernel.org Fixes: `51f58c9da1` ("selftests/damon: add a test for DAMOS quota") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-05 21:36:16 -08:00
SeongJae Park	349db086a6	selftests/damon/damos_quota_goal: handle minimum quota that cannot be further reduced damos_quota_goal.py selftest see if DAMOS quota goals tuning feature increases or reduces the effective size quota for given score as expected. The tuning feature sets the minimum quota size as one byte, so if the effective size quota is already one, we cannot expect it further be reduced. However the test is not aware of the edge case, and fails since it shown no expected change of the effective quota. Handle the case by updating the failure logic for no change to see if it was the case, and simply skips to next test input. Link: https://lkml.kernel.org/r/20250217182304.45215-1-sj@kernel.org Fixes: `f1c07c0a16` ("selftests/damon: add a test for DAMOS quota goal") Signed-off-by: SeongJae Park <sj@kernel.org> Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202502171423.b28a918d-lkp@intel.com Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> [6.10.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-05 21:36:12 -08:00
John Hubbard	0a7565ee6e	Revert "selftests/mm: remove local __NR_* definitions" This reverts commit `a5c6bc5900`. The general approach described in commit `e076eaca59` ("selftests: break the dependency upon local header files") was taken one step too far here: it should not have been extended to include the syscall numbers. This is because doing so would require per-arch support in tools/include/uapi, and no such support exists. This revert fixes two separate reports of test failures, from Dave Hansen[1], and Li Wang[2]. An excerpt of Dave's report: Before this commit (`a5c6bc5900`) things are fine. But after, I get: running PKEY tests for unsupported CPU/OS An excerpt of Li's report: I just found that mlock2_() return a wrong value in mlock2-test [1] https://lore.kernel.org/dc585017-6740-4cab-a536-b12b37a7582d@intel.com [2] https://lore.kernel.org/CAEemH2eW=UMu9+turT2jRie7+6ewUazXmA6kL+VBo3cGDGU6RA@mail.gmail.com Link: https://lkml.kernel.org/r/20250214033850.235171-1-jhubbard@nvidia.com Fixes: `a5c6bc5900` ("selftests/mm: remove local __NR_* definitions") Signed-off-by: John Hubbard <jhubbard@nvidia.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Li Wang <liwang@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jeff Xu <jeffxu@chromium.org> Cc: Andrei Vagin <avagin@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Kees Cook <kees@kernel.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-05 21:36:12 -08:00
Atish Patra	ee4e778c58	KVM: riscv: selftests: Allow number of interrupts to be configurable It is helpful to vary the number of the LCOFI interrupts generated by the overflow test. Allow additional argument for overflow test to accommodate that. It can be easily cross-validated with /proc/interrupts output in the host. Signed-off-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250303-kvm_pmu_improve-v2-4-41d177e45929@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-03-06 09:57:07 +05:30
Atish Patra	4b506adfea	KVM: riscv: selftests: Change command line option The PMU test commandline option takes an argument to disable a certain test. The initial assumption behind this was a common use case is just to run all the test most of the time. However, running a single test seems more useful instead. Especially, the overflow test has been helpful to validate PMU virtualizaiton interrupt changes. Switching the command line option to run a single test instead of disabling a single test also allows to provide additional test specific arguments to the test. The default without any options remains unchanged which continues to run all the tests. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20250303-kvm_pmu_improve-v2-3-41d177e45929@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-03-06 09:57:07 +05:30
Atish Patra	1f6bbe1255	KVM: riscv: selftests: Do not start the counter in the overflow handler There is no need to start the counter in the overflow handler as we intend to trigger precise number of LCOFI interrupts through these tests. The overflow irq handler has already stopped the counter. As a result, the stop call from the test function may return already stopped error which is fine as well. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20250303-kvm_pmu_improve-v2-2-41d177e45929@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-03-06 09:57:07 +05:30
Catalin Marinas	306219d59b	kselftest/arm64: mte: Skip the hugetlb tests if MTE not supported on such mappings While the kselftest was added at the same time with the kernel support for MTE on hugetlb mappings, the tests may be run on older kernels. Skip the tests if PROT_MTE is not supported on MAP_HUGETLB mappings. Fixes: `27879e8cb6` ("selftests: arm64: add hugetlb mte tests") Cc: Yang Shi <yang@os.amperecomputing.com> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reviewed-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Link: https://lore.kernel.org/r/20250221093331.2184245-3-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-03-05 18:54:02 +00:00
Catalin Marinas	7ae95109c6	kselftest/arm64: mte: Use the correct naming for tag check modes in check_hugetlb_options.c The architecture doesn't define precise/imprecise MTE tag check modes, only synchronous and asynchronous. Use the correct naming and also ensure they match the MTE_{ASYNC,SYNC}_ERR type. Fixes: `27879e8cb6` ("selftests: arm64: add hugetlb mte tests") Cc: Yang Shi <yang@os.amperecomputing.com> Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Link: https://lore.kernel.org/r/20250221093331.2184245-2-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-03-05 18:54:02 +00:00
Wojtek Wasko	76868642e4	testptp: Add option to open PHC in readonly mode PTP Hardware Clocks no longer require WRITE permission to perform readonly operations, such as listing device capabilities or listening to EXTTS events once they have been enabled by a process with WRITE permissions. Add '-r' option to testptp to open the PHC in readonly mode instead of the default read-write mode. Skip enabling EXTTS if readonly mode is requested. Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Wojtek Wasko <wwasko@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2025-03-05 12:43:54 +00:00
Christian Brauner	56f235da15	selftests/pidfd: add seventh PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-16-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	1c4f2dbe42	selftests/pidfd: add sixth PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-15-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	2adf6ca638	selftests/pidfd: add fifth PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-14-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	2e94e4c649	selftests/pidfd: add fourth PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-13-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	a79975f05e	selftests/pidfd: add third PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-12-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	86c1dfdd52	selftests/pidfd: add second PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-11-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	853ab1ff2c	selftests/pidfd: add first PIDFD_INFO_EXIT selftest Add a selftest for PIDFD_INFO_EXIT behavior. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-10-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:21 +01:00
Christian Brauner	ddf5315526	selftests/pidfd: expand common pidfd header Move more infrastructure to the pidfd header. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-9-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:20 +01:00
Christian Brauner	18938f7180	pidfs/selftests: ensure correct headers for ioctl handling Ensure that necessary ioctl infrastructure is available. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-8-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:20 +01:00
Christian Brauner	17457a47d2	selftests/pidfd: fix header inclusion Ensure that necessary defines are present. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-7-c8c3d8361705@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 13:26:20 +01:00
Boqun Feng	467c890f2d	Merge branches 'docs.2025.02.04a', 'lazypreempt.2025.03.04a', 'misc.2025.03.04a', 'srcu.2025.02.05a' and 'torture.2025.02.05a'	2025-03-04 18:47:32 -08:00
Paul E. McKenney	8d608f0801	rcutorture: Make scenario TREE07 build CONFIG_PREEMPT_LAZY=y This commit tests lazy preemption by causing the TREE07 rcutorture scenario to build its kernel with CONFIG_PREEMPT_LAZY=y. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>	2025-03-04 18:46:47 -08:00
Paul E. McKenney	118559a994	rcutorture: Make scenario TREE10 build CONFIG_PREEMPT_LAZY=y This commit tests lazy preemption by causing the TREE10 rcutorture scenario to build its kernel with CONFIG_PREEMPT_LAZY=y. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>	2025-03-04 18:46:47 -08:00
Uladzislau Rezki (Sony)	a6cea3954e	rcu: Update TREE05.boot to test normal synchronize_rcu() Add extra parameters for rcutorture module. One is the "nfakewriters" which is set -1. There will be created number of test-kthreads which correspond to number of CPUs in a test system. Those threads randomly invoke synchronize_rcu() call. Apart of that "rcu_normal" is set to 1, because it is specifically for a normal synchronize_rcu() testing, also a newly added parameter which is "rcu_normal_wake_from_gp" is set to 1 also. That prevents interaction with other callbacks in a system. Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Link: https://lore.kernel.org/r/20250227131613.52683-2-urezki@gmail.com Signed-off-by: Boqun Feng <boqun.feng@gmail.com>	2025-03-04 18:44:29 -08:00
Jakub Kicinski	ea4342739d	selftests: drv-net: use env.rpath in the HDS test Commit `29b036be1b` ("selftests: drv-net: test XDP, HDS auto and the ioctl path") added a new test case in the net tree, now that this code has made its way to net-next convert it to use the env.rpath() helper instead of manually computing the relative path. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250228212956.25399-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-04 17:14:23 -08:00
Gang Yan	ba24001665	selftests: mptcp: add a test for mptcp_diag_dump_one This patch introduces a new 'chk_diag' test in diag.sh. It retrieves the token for a specified MPTCP socket (msk) using the 'ss' command and then accesses the 'mptcp_diag_dump_one' in kernel via ./mptcp_diag to verify if the correct token is returned. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/524 Signed-off-by: Gang Yan <yangang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-2-f933c4275676@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-04 16:57:38 -08:00
Gang Yan	00f5e338cf	selftests: mptcp: Add a tool to get specific msk_info This patch enables the retrieval of the mptcp_info structure corresponding to a specified MPTCP socket (msk). When multiple MPTCP connections are present, specific information can be obtained for a given connection through the 'mptcp_diag_dump_one' by using the 'token' associated with the msk. Signed-off-by: Gang Yan <yangang@kylinos.cn> Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-1-f933c4275676@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-04 16:57:37 -08:00
Yi Lai	df6f8c4d72	selftests/pcie_bwctrl: Add 'set_pcie_speed.sh' to TEST_PROGS The test shell script "set_pcie_speed.sh" is not installed in INSTALL_PATH. Attempting to execute set_pcie_cooling_state.sh shows warning: ./set_pcie_cooling_state.sh: line 119: ./set_pcie_speed.sh: No such file or directory Add "set_pcie_speed.sh" to TEST_PROGS. Link: https://lore.kernel.org/r/Z8FfK8rN30lKzvVV@ly-workstation Fixes: `838f12c3d5` ("selftests/pcie_bwctrl: Create selftests") Signed-off-by: Yi Lai <yi1.lai@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2025-03-04 16:49:19 -06:00
Thomas Weißschuh	a22ee38d2e	selftests/vDSO: Fix GNU hash table entry size for s390x Commit `14be4e6f35` ("selftests: vDSO: fix ELF hash table entry size for s390x") changed the type of the ELF hash table entries to 64bit on s390x. However the GNU hash tables entries are always 32bit. The "bucket" pointer is shared between both hash algorithms. On s390, this caused the GNU hash algorithm to access its 32-bit entries as if they were 64-bit, triggering compiler warnings (assignment between "Elf64_Xword " and "Elf64_Word ") and runtime crashes. Introduce a new dedicated "gnu_bucket" pointer which is used by the GNU hash. Fixes: `e0746bde6f` ("selftests/vDSO: support DT_GNU_HASH") Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20250217-selftests-vdso-s390-gnu-hash-v2-1-f6c2532ffe2a@linutronix.de Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2025-03-04 17:15:18 +01:00
Bharadwaj Raju	82ef781f24	selftests/ftrace: add 'poll' binary to gitignore When building this test, a binary file 'poll' is generated and should be gitignore'd. Link: https://lore.kernel.org/r/20250210160138.4745-1-bharadwaj.raju777@gmail.com Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <shuah@kernel.org>	2025-03-04 08:51:17 -07:00
Breno Leitao	d7a2522426	netconsole: selftest: add task name append testing Add test coverage for the netconsole task name feature to the existing sysdata selftest script. This extends the test infrastructure to verify that task names are correctly appended when enabled and absent when disabled. The test validates that: - Task names appear in the expected format "taskname=<name>" - Task names are included when the feature is enabled - Task names are excluded when the feature is disabled - The feature works correctly alongside other sysdata fields like CPU Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-04 15:28:28 +01:00
Yi Liu	1062d81086	iommufd: Disallow allocating nested parent domain with fault ID Allocating a domain with a fault ID indicates that the domain is faultable. However, there is a gap for the nested parent domain to support PRI. Some hardware lacks the capability to distinguish whether PRI occurs at stage 1 or stage 2. This limitation may require software-based page table walking to resolve. Since no in-tree IOMMU driver currently supports this functionality, it is disallowed. For more details, refer to the related discussion at [1]. [1] https://lore.kernel.org/linux-iommu/bd1655c6-8b2f-4cfa-adb1-badc00d01811@intel.com/ Link: https://patch.msgid.link/r/20250226104012.82079-1-yi.l.liu@intel.com Suggested-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-03-04 09:34:54 -04:00
Nicolas Dichtel	0c493da863	net: rename netns_local to netns_immutable The name 'netns_local' is confusing. A following commit will export it via netlink, so let's use a more explicit name. Reported-by: Eric Dumazet <edumazet@google.com> Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-04 12:44:48 +01:00
Ingo Molnar	cfdaa618de	Merge branch 'x86/cpu' into x86/asm, to pick up dependent commits Signed-off-by: Ingo Molnar <mingo@kernel.org>	2025-03-04 11:19:21 +01:00
Ingo Molnar	1b4c36f9b1	Merge branch 'x86/urgent' into x86/cpu, to pick up dependent commits Signed-off-by: Ingo Molnar <mingo@kernel.org>	2025-03-04 11:15:26 +01:00
Peter Seiderer	03544faad7	selftest: net: add proc_net_pktgen Add some test for /proc/net/pktgen/... interface. - enable 'CONFIG_NET_PKTGEN=m' in tools/testing/selftests/net/config Signed-off-by: Peter Seiderer <ps.report@gmx.net> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-03-04 10:57:58 +01:00
Christian Brauner	e5c10b19b3	selftests: test subdirectory mounting This tests mounting a subdirectory without ever having to expose the filesystem to a non-anonymous mount namespace. Link: https://lore.kernel.org/r/20250225-work-mount-propagation-v1-3-e6e3724500eb@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:55 +01:00
Christian Brauner	12e40cbb0e	selftests: add test for detached mount tree propagation Test that detached mount trees receive propagation events. Link: https://lore.kernel.org/r/20250225-work-mount-propagation-v1-2-e6e3724500eb@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:55 +01:00
Christian Brauner	7bece690a9	selftests: seventh test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-16-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:54 +01:00
Christian Brauner	0e6eef95df	selftests: sixth test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-15-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:54 +01:00
Christian Brauner	d346ae7a64	selftests: fifth test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-14-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:53 +01:00
Christian Brauner	1768f9bd53	selftests: fourth test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-13-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:53 +01:00
Christian Brauner	4fd0aea1ea	selftests: third test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-12-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:53 +01:00
Christian Brauner	89ddfd4af1	selftests: second test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-11-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:53 +01:00
Christian Brauner	d83a55b826	selftests: first test for mounting detached mounts onto detached mounts Add a test to verify that detached mounts behave correctly. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-10-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:53 +01:00
Christian Brauner	3f1724dd56	selftests: create detached mounts from detached mounts Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-7-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-04 09:29:53 +01:00
Jakub Kicinski	d110dbf149	selftests: net: report output format as TAP 13 in Python tests The Python lib based tests report that they are producing "KTAP version 1", but really we aren't making use of any KTAP features, like subtests. Our output is plain TAP. Report TAP 13 instead of KTAP 1, this is what mptcp tests do, and what NIPA knows how to parse best. For HW testing we need precise subtest result tracking. Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250228180007.83325-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-03-03 15:03:19 -08:00
Thomas Weißschuh	8770a9183f	selftests: vDSO: vdso_standalone_test_x86: Switch to nolibc vdso_standalone_test_x86 provides its own ASM syscall wrappers and _start() implementation. The in-tree nolibc library already provides this functionality for multiple architectures. By making use of nolibc, the standalone testcase can be built from the exact same codebase as the non-standalone version. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-16-28e14e031ed8@linutronix.de	2025-03-03 20:00:13 +01:00
Thomas Weißschuh	4f65df6a58	selftests: vDSO: vdso_test_gettimeofday: Make compatible with nolibc nolibc does not provide sys/time.h and sys/auxv.h, instead their definitions are available unconditionally. Guard the includes so they are not attempted on nolibc. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-15-28e14e031ed8@linutronix.de	2025-03-03 20:00:13 +01:00
Thomas Weißschuh	97a8814124	selftests: vDSO: vdso_test_gettimeofday: Clean up includes Some unnecessary headers are included, remove them. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-14-28e14e031ed8@linutronix.de	2025-03-03 20:00:13 +01:00
Thomas Weißschuh	032e871686	selftests: vDSO: parse_vdso: Test __SIZEOF_LONG__ instead of ULONG_MAX According to limits.h(2) ULONG_MAX is only guaranteed to expand to an expression, not a symbolic constant which can be evaluated by the preprocessor. Specifically the definition of ULONG_MAX from nolibc can not be evaluated by the preprocessor. To provide compatibility with nolibc, check with __SIZEOF_LONG__ instead, with is provided directly by the preprocessor and therefore always a symbolic constant. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-13-28e14e031ed8@linutronix.de	2025-03-03 20:00:13 +01:00
Thomas Weißschuh	c9fbaa8795	selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers To allow the usage of parse_vdso.c together with a limited libc like nolibc, use the kernels own elf.h and auxvec.h headers. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-12-28e14e031ed8@linutronix.de	2025-03-03 20:00:12 +01:00
Thomas Weißschuh	09dcec6470	selftests: vDSO: parse_vdso: Drop vdso_init_from_auxv() There are no users left. This also removes the usage of ElfXX_auxv_t, which is not formally standardized. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-11-28e14e031ed8@linutronix.de	2025-03-03 20:00:12 +01:00
Thomas Weißschuh	05c204acf5	selftests: vDSO: vdso_standalone_test_x86: Use vdso_init_form_sysinfo_ehdr vdso_standalone_test_x86 is the only user of vdso_init_from_auxv(). Instead of combining the parsing the aux vector with the parsing of the vDSO, split them apart into getauxval() and the regular vdso_init_from_sysinfo_ehdr(). The implementation of getauxval() is taken from tools/include/nolibc/stdlib.h. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-10-28e14e031ed8@linutronix.de	2025-03-03 20:00:12 +01:00
Thomas Weißschuh	1a59f5d315	selftests: Add headers target Some selftests need access to a full UAPI headers tree, for example when building with nolibc which heavily relies on UAPI headers. A reference to such a tree is available in the KHDR_INCLUDES variable, but there is currently no way to populate such a tree automatically. Provide a target that the tests can depend on to get access to usable UAPI headers. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-8-28e14e031ed8@linutronix.de	2025-03-03 20:00:12 +01:00
Tejun Heo	8a9b1585e2	sched_ext: Merge branch 'for-6.14-fixes' into for-6.15 Pull for-6.14-fixes to receive: `9360dfe4cb` ("sched_ext: Validate prev_cpu in scx_bpf_select_cpu_dfl()") which conflicts with: `337d1b354a` ("sched_ext: Move built-in idle CPU selection policy to a separate file") Signed-off-by: Tejun Heo <tj@kernel.org>	2025-03-03 08:02:22 -10:00
Sean Christopherson	3b2d3db368	KVM: selftests: Fix printf() format goof in SEV smoke test Print out the index of mismatching XSAVE bytes using unsigned decimal format. Some versions of clang complain about trying to print an integer as an unsigned char. x86/sev_smoke_test.c:55:51: error: format specifies type 'unsigned char' but the argument has type 'int' [-Werror,-Wformat] Fixes: `8c53183dba` ("selftests: kvm: add test for transferring FPU state into VMSA") Link: https://lore.kernel.org/r/20250228233852.3855676-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-03-03 07:45:34 -08:00
Sean Christopherson	d88ed5fb7c	KVM: selftests: Ensure all vCPUs hit -EFAULT during initial RO stage During the initial mprotect(RO) stage of mmu_stress_test, keep vCPUs spinning until all vCPUs have hit -EFAULT, i.e. until all vCPUs have tried to write to a read-only page. If a vCPU manages to complete an entire iteration of the loop without hitting a read-only page, and the vCPU observes mprotect_ro_done before starting a second iteration, then the vCPU will prematurely fall through to GUEST_SYNC(3) (on x86 and arm64) and get out of sequence. Replace the "do-while (!r)" loop around the associated _vcpu_run() with a single invocation, as barring a KVM bug, the vCPU is guaranteed to hit -EFAULT, and retrying on success is super confusion, hides KVM bugs, and complicates this fix. The do-while loop was semi-unintentionally added specifically to fudge around a KVM x86 bug, and said bug is unhittable without modifying the test to force x86 down the !(x86\|\|arm64) path. On x86, if forced emulation is enabled, vcpu_arch_put_guest() may trigger emulation of the store to memory. Due a (very, very) longstanding bug in KVM x86's emulator, emulate writes to guest memory that fail during __kvm_write_guest_page() unconditionally return KVM_EXIT_MMIO. While that is desirable in the !memslot case, it's wrong in this case as the failure happens due to __copy_to_user() hitting a read-only page, not an emulated MMIO region. But as above, x86 only uses vcpu_arch_put_guest() if the __x86_64__ guards are clobbered to force x86 down the common path, and of course the unexpected MMIO is a KVM bug, i.e. should cause a test failure. Fixes: `b6c304aec6` ("KVM: selftests: Verify KVM correctly handles mprotect(PROT_READ)") Reported-by: Yan Zhao <yan.y.zhao@intel.com> Closes: https://lore.kernel.org/all/20250208105318.16861-1-yan.y.zhao@intel.com Debugged-by: Yan Zhao <yan.y.zhao@intel.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Link: https://lore.kernel.org/r/20250228230804.3845860-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-03-03 07:37:28 -08:00
Mirsad Todorovac	40fc756101	selftests/x86/syscall: Fix coccinelle WARNING recommending the use of ARRAY_SIZE() Coccinelle gives WARNING recommending the use of ARRAY_SIZE() macro definition to improve the code readability: ./tools/testing/selftests/x86/syscall_numbering.c:316:35-36: WARNING: Use ARRAY_SIZE Fixes: `15c82d98a0` ("selftests/x86/syscall: Update and extend syscall_numbering_64") Signed-off-by: Mirsad Todorovac <mtodorovac69@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Link: https://lore.kernel.org/r/20241101111523.1293193-2-mtodorovac69@gmail.com	2025-03-03 12:38:49 +01:00
Thomas Weißschuh	cb839e0cc8	selftests/nolibc: add armthumb configuration While nolibc does support ARM Thumb instructions, that support was not tested specifically. Add a new test configuration for it. Tested-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250301-nolibc-armthumb-v1-2-d1f04abb5f6d@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-03-02 13:25:20 +01:00
Thomas Weißschuh	f8bedb30d6	selftests/nolibc: explicitly enable ARM mode The default could also be -mthumb. Explicitly use -marm to keep everything predictable. Tested-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250301-nolibc-armthumb-v1-1-d1f04abb5f6d@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-03-02 13:25:20 +01:00
Linus Torvalds	5c44ddaf7d	Tracing fixes for v6.14: - Fix crash from bad histogram entry An error path in the histogram creation could leave an entry in a link list that gets freed. Then when a new entry is added it can cause a u-a-f bug. This is fixed by restructuring the code so that the histogram is consistent on failure and everything is cleaned up appropriately. - Fix fprobe self test The fprobe self test relies on no function being attached by ftrace. BPF programs can attach to functions via ftrace and systemd now does so. This causes those functions to appear in the enabled_functions list which holds all functions attached by ftrace. The selftest also uses that file to see if functions are being connected correctly. It counts the functions in the file, but if there's already functions in the file, it fails. Instead, add the number of functions in the file at the start of the test to all the calculations during the test. - Fix potential division by zero of the function profiler stddev The calculated divisor that calculates the standard deviation of the function times can overflow. If the overflow happens to land on zero, that can cause a division by zero. Check for zero from the calculation before doing the division. TODO: Catch when it ever overflows and report it accordingly. For now, just prevent the system from crashing. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ8HqYBQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qpoXAP90gvO2LfjItjZVjBYudr4GOzcsjAAK cZ2vL2LJp3hT4QD+Kud2YaZqzrV8tvFFBikO7FvEV3zZpnw48895pIgcoww= =NLe0 -----END PGP SIGNATURE----- Merge tag 'trace-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix crash from bad histogram entry An error path in the histogram creation could leave an entry in a link list that gets freed. Then when a new entry is added it can cause a u-a-f bug. This is fixed by restructuring the code so that the histogram is consistent on failure and everything is cleaned up appropriately. - Fix fprobe self test The fprobe self test relies on no function being attached by ftrace. BPF programs can attach to functions via ftrace and systemd now does so. This causes those functions to appear in the enabled_functions list which holds all functions attached by ftrace. The selftest also uses that file to see if functions are being connected correctly. It counts the functions in the file, but if there's already functions in the file, it fails. Instead, add the number of functions in the file at the start of the test to all the calculations during the test. - Fix potential division by zero of the function profiler stddev The calculated divisor that calculates the standard deviation of the function times can overflow. If the overflow happens to land on zero, that can cause a division by zero. Check for zero from the calculation before doing the division. TODO: Catch when it ever overflows and report it accordingly. For now, just prevent the system from crashing. * tag 'trace-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ftrace: Avoid potential division by zero in function_stat_show() selftests/ftrace: Let fprobe test consider already enabled functions tracing: Fix bad hist from corrupting named_triggers list	2025-02-28 15:43:32 -08:00
Sean Christopherson	62838fa5ea	KVM: selftests: Relax assertion on HLT exits if CPU supports Idle HLT If the CPU supports Idle HLT, which elides HLT VM-Exits if the vCPU has an unmasked pending IRQ or NMI, relax the xAPIC IPI test's assertion on the number of HLT exits to only require that the number of exits is less than or equal to the number of HLT instructions that were executed. I.e. don't fail the test if Idle HLT does what it's supposed to do. Note, unfortunately there's no way to determine if KVM supports Idle HLT, as this_cpu_has() checks raw CPU support, and kvm_cpu_has() checks what can be exposed to L1, i.e. the latter would check if KVM supports nested Idle HLT. But, since the assert is purely bonus coverage, checking for CPU support is good enough. Cc: Manali Shukla <Manali.Shukla@amd.com> Tested-by: Manali Shukla <Manali.Shukla@amd.com> Link: https://lore.kernel.org/r/20250226231809.3183093-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-28 15:42:28 -08:00
Sean Christopherson	f3513a335e	KVM: selftests: Assert that STI blocking isn't set after event injection Add an L1 (guest) assert to the nested exceptions test to verify that KVM doesn't put VMRUN in an STI shadow (AMD CPUs bleed the shadow into the guest's int_state if a #VMEXIT occurs before VMRUN fully completes). Add a similar assert to the VMX side as well, because why not. Reviewed-by: Jim Mattson <jmattson@google.com> Link: https://lore.kernel.org/r/20250224165442.2338294-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-28 09:15:23 -08:00
Colin Ian King	75418e222e	KVM: selftests: Fix spelling mistake "UFFDIO_CONINUE" -> "UFFDIO_CONTINUE" There is a spelling mistake in a PER_PAGE_DEBUG debug message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20250227220819.656780-1-colin.i.king@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-28 09:12:52 -08:00
Ming Lei	bedc9cbc5f	selftests: ublk: add ublk zero copy test Enable zero copy on file backed target, meantime add one fio test for covering write verify, another test for mkfs/mount/umount. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250228161919.2869102-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-02-28 09:28:08 -07:00
Ming Lei	5d95bfb535	selftests: ublk: add file backed ublk Add file backed ublk target code, meantime add one fio test for covering write verify, another test for mkfs/mount/umount. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250228161919.2869102-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-02-28 09:28:08 -07:00
Ming Lei	6aecda00b7	selftests: ublk: add kernel selftests for ublk Both ublk driver and userspace heavily depends on io_uring subsystem, and tools/testing/selftests/ should be the best place for holding this cross-subsystem tests. Add basic read/write IO test over this ublk null disk, and make sure ublk working. More tests will be added. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250228161919.2869102-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-02-28 09:28:08 -07:00
Kevin Krakauer	51bef03e1a	selftests/net: deflake GRO tests GRO tests are timing dependent and can easily flake. This is partially mitigated in gro.sh by giving each subtest 3 chances to pass. However, this still flakes on some machines. Reduce the flakiness by: - Bumping retries to 6. - Setting napi_defer_hard_irqs to 1 to reduce the chance that GRO is flushed prematurely. This also lets us reduce the gro_flush_timeout from 1ms to 100us. Tested: Ran `gro.sh -t large` 1000 times. There were no failures with this change. Ran inside strace to increase flakiness. Signed-off-by: Kevin Krakauer <krakauer@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250226192725.621969-4-krakauer@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-27 18:38:23 -08:00
Kevin Krakauer	41cda57284	selftests/net: only print passing message in GRO tests when tests pass gro.c:main no longer erroneously claims a test passes when running as a sender. Tested: Ran `gro.sh -t large` to verify the sender no longer prints a status. Signed-off-by: Kevin Krakauer <krakauer@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250226192725.621969-3-krakauer@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-27 18:38:23 -08:00
Kevin Krakauer	784e6abd99	selftests/net: have `gro.sh -t` return a correct exit code Modify gro.sh to return a useful exit code when the -t flag is used. It formerly returned 0 no matter what. Tested: Ran `gro.sh -t large` and verified that test failures return 1. Signed-off-by: Kevin Krakauer <krakauer@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250226192725.621969-2-krakauer@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-27 18:38:23 -08:00
Heiko Carstens	3908b6baf2	selftests/ftrace: Let fprobe test consider already enabled functions The fprobe test fails on Fedora 41 since the fprobe test assumption that the number of enabled_functions is zero before the test starts is not necessarily true. Some user space tools, like systemd, add BPF programs that attach to functions. Those will show up in the enabled_functions table and must be taken into account by the fprobe test. Therefore count the number of lines of enabled_functions before tests start, and use that as base when comparing expected results. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/20250226142703.910860-1-hca@linux.ibm.com Fixes: `e85c5e9792` ("selftests/ftrace: Update fprobe test to check enabled_functions file") Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-02-27 21:02:09 -05:00
Jakub Kicinski	357660d759	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.14-rc5). Conflicts: drivers/net/ethernet/cadence/macb_main.c `fa52f15c74` ("net: cadence: macb: Synchronize stats calculations") `75696dd0fd` ("net: cadence: macb: Convert to get_stats64") https://lore.kernel.org/20250224125848.68ee63e5@canb.auug.org.au Adjacent changes: drivers/net/ethernet/intel/ice/ice_sriov.c `79990cf5e7` ("ice: Fix deinitializing VF in error path") `a203163274` ("ice: simplify VF MSI-X managing") net/ipv4/tcp.c `18912c5206` ("tcp: devmem: don't write truncated dmabuf CMSGs to userspace") `297d389e9e` ("net: prefix devmem specific helpers") net/mptcp/subflow.c `8668860b0a` ("mptcp: reset when MPTCP opts are dropped after join") `c3349a22c2` ("mptcp: consolidate subflow cleanup") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-27 10:20:58 -08:00
Linus Torvalds	1e15510b71	Including fixes from bluetooth. We didn't get netfilter or wireless PRs this week, so next week's PR is probably going to be bigger. A healthy dose of fixes for bugs introduced in the current release nonetheless. Current release - regressions: - Bluetooth: always allow SCO packets for user channel - af_unix: fix memory leak in unix_dgram_sendmsg() - rxrpc: - remove redundant peer->mtu_lock causing lockdep splats - fix spinlock flavor issues with the peer record hash - eth: iavf: fix circular lock dependency with netdev_lock - net: use rtnl_net_dev_lock() in register_netdevice_notifier_dev_net() RDMA driver register notifier after the device Current release - new code bugs: - ethtool: fix ioctl confusing drivers about desired HDS user config - eth: ixgbe: fix media cage present detection for E610 device Previous releases - regressions: - loopback: avoid sending IP packets without an Ethernet header - mptcp: reset connection when MPTCP opts are dropped after join Previous releases - always broken: - net: better track kernel sockets lifetime - ipv6: fix dst ref loop on input in seg6 and rpl lw tunnels - phy: qca807x: use right value from DTS for DAC_DSP_BIAS_CURRENT - eth: enetc: number of error handling fixes - dsa: rtl8366rb: reshuffle the code to fix config / build issue with LED support Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmfAj8MACgkQMUZtbf5S IrtoTRAAj0XNWXGWZdOuVub0xhtjsPLoZktux4AzsELqaynextkJW6w9pG5qVrWu UZt3a3bC7u6+JoTgb+GQVhyjuuVjv6NOSuLK3FS+NePW8ijhLP5oTg6eD0MQS60Z wa9yQx3yL1Kvb6b80Go/3WgRX9V6Rx8zlROAl/gOlZ9NKB0rSVqnueZGPjGZJf1a ayyXsmzRykshbr5Ic0e+b74hFP3DGxVgHjIob1C4kk/Q+WOfQKnm3C3fnZ/R2QcS 7B7kSk9WokvNwk3hJc7ZtFxJbrQKSSuRI8nCD93hBjTn76yJjlPicJ9b6HJoGhE/ Pwt7fBnDCCA00x6ejD3OrurR+/80PbPtyvNtgMMTD49wSwxQpQ6YpTMInnodCzAV NvIhkkXBprI0kiTT4dDpNoeFMKD3i07etKpvMfEoDzZR7vgUsj6aClSmuxILeU9a crFC4Vp5SgyU1/lUPDiG4dfbd8s4hfM4bZ+d0zAtth3/rQA7/EA6dLqbRXXWX7h5 Gl6egKWPsSl+WUgFjpBjYfhqrQsc06hxaCh0SQYH6SnS3i+PlMU2uRJYZMLQ66rX QsSQOyqCEHwd1qnrLedg9rCniv+DzOJf+qh+H0eY9WhuOay+8T52OHLxpRjSHxBo SCP+qQxSX0qhH5DtUiOV50Fwg19UhJJyWd0COfv5SIGm/I1dUOY= =+Ci7 -----END PGP SIGNATURE----- Merge tag 'net-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth. We didn't get netfilter or wireless PRs this week, so next week's PR is probably going to be bigger. A healthy dose of fixes for bugs introduced in the current release nonetheless. Current release - regressions: - Bluetooth: always allow SCO packets for user channel - af_unix: fix memory leak in unix_dgram_sendmsg() - rxrpc: - remove redundant peer->mtu_lock causing lockdep splats - fix spinlock flavor issues with the peer record hash - eth: iavf: fix circular lock dependency with netdev_lock - net: use rtnl_net_dev_lock() in register_netdevice_notifier_dev_net() RDMA driver register notifier after the device Current release - new code bugs: - ethtool: fix ioctl confusing drivers about desired HDS user config - eth: ixgbe: fix media cage present detection for E610 device Previous releases - regressions: - loopback: avoid sending IP packets without an Ethernet header - mptcp: reset connection when MPTCP opts are dropped after join Previous releases - always broken: - net: better track kernel sockets lifetime - ipv6: fix dst ref loop on input in seg6 and rpl lw tunnels - phy: qca807x: use right value from DTS for DAC_DSP_BIAS_CURRENT - eth: enetc: number of error handling fixes - dsa: rtl8366rb: reshuffle the code to fix config / build issue with LED support" * tag 'net-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (53 commits) net: ti: icss-iep: Reject perout generation request idpf: fix checksums set in idpf_rx_rsc() selftests: drv-net: Check if combined-count exists net: ipv6: fix dst ref loop on input in rpl lwt net: ipv6: fix dst ref loop on input in seg6 lwt usbnet: gl620a: fix endpoint checking in genelink_bind() net/mlx5: IRQ, Fix null string in debug print net/mlx5: Restore missing trace event when enabling vport QoS net/mlx5: Fix vport QoS cleanup on error net: mvpp2: cls: Fixed Non IP flow, with vlan tag flow defination. af_unix: Fix memory leak in unix_dgram_sendmsg() net: Handle napi_schedule() calls from non-interrupt net: Clear old fragment checksum value in napi_reuse_skb gve: unlink old napi when stopping a queue using queue API net: Use rtnl_net_dev_lock() in register_netdevice_notifier_dev_net(). tcp: Defer ts_recent changes until req is owned net: enetc: fix the off-by-one issue in enetc_map_tx_tso_buffs() net: enetc: remove the mm_lock from the ENETC v4 driver net: enetc: add missing enetc4_link_deinit() net: enetc: update UDP checksum when updating originTimestamp field ...	2025-02-27 09:32:42 -08:00
Joe Damato	1cbddbddee	selftests: drv-net: Check if combined-count exists Some drivers, like tg3, do not set combined-count: $ ethtool -l enp4s0f1 Channel parameters for enp4s0f1: Pre-set maximums: RX: 4 TX: 4 Other: n/a Combined: n/a Current hardware settings: RX: 4 TX: 1 Other: n/a Combined: n/a In the case where combined-count is not set, the ethtool netlink code in the kernel elides the value and the code in the test: netnl.channels_get(...) With a tg3 device, the returned dictionary looks like: {'header': {'dev-index': 3, 'dev-name': 'enp4s0f1'}, 'rx-max': 4, 'rx-count': 4, 'tx-max': 4, 'tx-count': 1} Note that the key 'combined-count' is missing. As a result of this missing key the test raises an exception: # Exception\| if channels['combined-count'] == 0: # Exception\| ~~~~~~~~^^^^^^^^^^^^^^^^^^ # Exception\| KeyError: 'combined-count' Change the test to check if 'combined-count' is a key in the dictionary first and if not assume that this means the driver has separate RX and TX queues. With this change, the test now passes successfully on tg3 and mlx5 (which does have a 'combined-count'). Fixes: `1cf2704242` ("net: selftest: add test for netdev netlink queue-get API") Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250226181957.212189-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-27 07:30:32 -08:00
Colin Ian King	bd64e9d6aa	selftests/x86/xstate: Fix spelling mistake "hader" -> "header" There is a spelling mistake in a sig_print() message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250227091533.599213-1-colin.i.king@gmail.com	2025-02-27 10:49:31 +01:00
Bharadwaj Raju	29fa7d7934	selftests/sysctl: fix wording of help messages Change wording of test number recommendation to "the recommended number of times". Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>	2025-02-27 10:02:12 +01:00
Jakub Kicinski	185646a8a0	selftests: drv-net: add tests for napi IRQ affinity notifiers Add tests to check that the napi retained the IRQ after down/up, multiple changes in the number of rx queues and after attaching/releasing XDP program. Tested on ice and idpf: # NETIF=<iface> tools/testing/selftests/drivers/net/hw/irq.py KTAP version 1 1..4 ok 1 irq.check_irqs_reported ok 2 irq.check_reconfig_queues ok 3 irq.check_reconfig_xdp ok 4 irq.check_down # Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0 Tested-by: Ahmed Zaki <ahmed.zaki@intel.com> Link: https://patch.msgid.link/20250224232228.990783-7-ahmed.zaki@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-26 19:51:38 -08:00
Willem de Bruijn	2e5584e0f9	selftests/net: expand cmsg_ipv6.sh with ipv4 Expand IPV6_TCLASS to also cover IP_TOS. Expand IPV6_HOPLIMIT to also cover IP_TTL. Expand csmg_sender.c to allow setting IPv4 setsockopts. Also rename struct v6 to cmsg to match its expanded scope. Don't bother updating all occurrences of tclass and hoplimit. Rename cmsg_ipv6.sh to cmsg_ip.sh to match the expanded scope. Be careful around the subtle API difference between TCLASS and TOS. IP_TOS includes ECN bits. Add a test to verify that these are masked when making routing decisions. Diff is more concise with --word-diff Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250225022431.2083926-3-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-26 18:59:00 -08:00
Willem de Bruijn	c1d6d629ab	selftests/net: prepare cmsg_ipv6.sh for ipv4 Move IPV6_TCLASS and IPV6_HOPLIMIT into loops, to be able to use them for IP_TOS and IP_TTL in a follow-on patch. Indentation in this file is a mix of four spaces and tabs for double indents. To minimize code churn, maintain that pattern. Very small diff if viewing with -w. Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250225022431.2083926-2-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-26 18:59:00 -08:00
Shameer Kolothum	f69656656f	KVM: selftests: Add test for KVM_REG_ARM_VENDOR_HYP_BMAP_2 One difference here with other pseudo-firmware bitmap registers is that the default/reset value for the supported hypercall function-ids is 0 at present. Hence, modify the test accordingly. Reviewed-by: Sebastian Ott <sebott@redhat.com> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Link: https://lore.kernel.org/r/20250221140229.12588-7-shameerali.kolothum.thodi@huawei.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-02-26 13:30:37 -08:00
Dave Jiang	516e5bd0b6	cxl: Add mce notifier to emit aliased address for extended linear cache Below is a setup with extended linear cache configuration with an example layout of memory region shown below presented as a single memory region consists of 256G memory where there's 128G of DRAM and 128G of CXL memory. The kernel sees a region of total 256G of system memory. 128G DRAM 128G CXL memory \|-----------------------------------\|-------------------------------------\| Data resides in either DRAM or far memory (FM) with no replication. Hot data is swapped into DRAM by the hardware behind the scenes. When error is detected in one location, it is possible that error also resides in the aliased location. Therefore when a memory location that is flagged by MCE is part of the special region, the aliased memory location needs to be offlined as well. Add an mce notify callback to identify if the MCE address location is part of an extended linear cache region and handle accordingly. Added symbol export to set_mce_nospec() in x86 code in order to call set_mce_nospec() from the CXL MCE notify callback. Link: https://lore.kernel.org/linux-cxl/668333b17e4b2_5639294fd@dwillia2-xfh.jf.intel.com.notmuch/ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250226162224.3633792-5-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 14:13:49 -07:00
Thomas Weißschuh	3bd53b2fa5	Revert "selftests: kselftest: Fix build failure with NOLIBC" This reverts commit `16767502aa`. Nolibc gained support for uname(2) and sscanf(3) which are the dependencies of ksft_min_kernel_version(). So re-enable support for ksft_min_kernel_version() under nolibc. Acked-by: Shuah Khan <skhan@linuxfoundation.org> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250209-nolibc-scanf-v2-2-c29dea32f1cd@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-26 22:13:48 +01:00
Thomas Weißschuh	22edf1f8d4	tools/nolibc: add support for [v]sscanf() These functions are used often, also in selftests. sscanf() itself is also used by kselftest.h itself. The implementation is limited and only supports numeric arguments. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250209-nolibc-scanf-v2-1-c29dea32f1cd@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-26 22:13:48 +01:00
Heiko Carstens	dc4b165855	selftests/ftrace: Use readelf to find entry point in uprobe test The uprobe events test fails on s390, but also on x86 (Fedora 41). The problem appears to be that there is an assumption that adding a uprobe to the beginning of the executable mapping of /bin/sh is sufficient to trigger a uprobe event when /bin/sh is executed. This assumption is not necessarily true. Therefore use "readelf -h" to find the entry point address of /bin/sh and use this address when adding the uprobe event. This adds a dependency to readelf which is not always installed. Therefore add a check and exit with exit_unresolved if it is not installed. Link: https://lore.kernel.org/r/20250220130102.2079179-1-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-26 13:53:58 -07:00
Dave Jiang	0ec9849b63	acpi/hmat / cxl: Add extended linear cache support for CXL The current cxl region size only indicates the size of the CXL memory region without accounting for the extended linear cache size. Retrieve the cache size from HMAT and append that to the cxl region size for the cxl region range that matches the SRAT range that has extended linear cache enabled. The SRAT defines the whole memory range that includes the extended linear cache and the CXL memory region. The new HMAT ECN/ECR to the Memory Side Cache Information Structure defines the size of the extended linear cache size and matches to the SRAT Memory Affinity Structure by the memory proxmity domain. Add a helper to match the cxl range to the SRAT memory range in order to retrieve the cache size. There are several places that checks the cxl region range against the decoder range. Use new helper to check between the two ranges and address the new cache size. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250226162224.3633792-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 13:45:22 -07:00
Linus Torvalds	c0d35086a2	Landlock fix for v6.14-rc5 -----BEGIN PGP SIGNATURE----- iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZ785FhAcbWljQGRpZ2lr b2QubmV0AAoJEOXj0OiMgvbSILQBAMFwpFClzjVeWyLFNd/gaTlPWeedvnag+yZu CK9q39jOAP9tj1unQBFpsI7jsTk6ZxPVxb4DymPIirZMb/5FuxUnDQ== =QrSM -----END PGP SIGNATURE----- Merge tag 'landlock-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock fixes from Mickaël Salaün: "Fixes to TCP socket identification, documentation, and tests" * tag 'landlock-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Add binaries to .gitignore selftests/landlock: Test that MPTCP actions are not restricted selftests/landlock: Test TCP accesses with protocol=IPPROTO_TCP landlock: Fix non-TCP sockets restriction landlock: Minor typo and grammar fixes in IPC scoping documentation landlock: Fix grammar error selftests/landlock: Enable the new CONFIG_AF_UNIX_OOB	2025-02-26 11:55:44 -08:00
Andrea Righi	5ae5161820	selftests/sched_ext: Add NUMA-aware scheduler test Add a selftest to validate the behavior of the NUMA-aware scheduler functionalities, including idle CPU selection within nodes, per-node DSQs and CPU to node mapping. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-02-26 08:49:02 -10:00
Mykyta Yatsenko	3d1033caf0	selftests/bpf: Introduce veristat test Introducing test for veristat, part of test_progs. Test cases cover functionality of setting global variables in BPF program. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20250225163101.121043-3-mykyta.yatsenko5@gmail.com	2025-02-26 10:45:01 -08:00
Mykyta Yatsenko	e3c9abd0d1	selftests/bpf: Implement setting global variables in veristat To better verify some complex BPF programs we'd like to preset global variables. This patch introduces CLI argument `--set-global-vars` or `-G` to veristat, that allows presetting values to global variables defined in BPF program. For example: prog.c: ``` enum Enum { ELEMENT1 = 0, ELEMENT2 = 5 }; const volatile __s64 a = 5; const volatile __u8 b = 5; const volatile enum Enum c = ELEMENT2; const volatile bool d = false; char arr[4] = {0}; SEC("tp_btf/sched_switch") int BPF_PROG(...) { bpf_printk("%c\n", arr[a]); bpf_printk("%c\n", arr[b]); bpf_printk("%c\n", arr[c]); bpf_printk("%c\n", arr[d]); return 0; } ``` By default verification of the program fails: ``` ./veristat prog.bpf.o ``` By presetting global variables, we can make verification pass: ``` ./veristat wq.bpf.o -G "a = 0" -G "b = 1" -G "c = 2" -G "d = 3" ``` Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20250225163101.121043-2-mykyta.yatsenko5@gmail.com	2025-02-26 10:45:00 -08:00
Ihor Solodrai	0ba0ef012e	selftests/bpf: Test bpf_usdt_arg_size() function Update usdt tests to also check for correct behavior of bpf_usdt_arg_size(). Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20250224235756.2612606-2-ihor.solodrai@linux.dev	2025-02-26 08:59:44 -08:00
Dave Jiang	44818d387e	cxl/test: Add Get Supported Features mailbox command support Add cxl-test emulation of Get Supported Features mailbox command. Currently only adding a test feature with feature identifier of all f's for testing. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250220194438.2281088-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 08:51:31 -07:00
Dave Jiang	f0e6a2329b	cxl: Add Get Supported Features command for kernel usage CXL spec r3.2 8.2.9.6.1 Get Supported Features (Opcode 0500h) The command retrieve the list of supported device-specific features (identified by UUID) and general information about each Feature. The driver will retrieve the Feature entries in order to make checks and provide information for the Get Feature and Set Feature command. One of the main piece of information retrieved are the effects a Set Feature command would have for a particular feature. The retrieved Feature entries are stored in the cxl_mailbox context. The setup of Features is initiated via devm_cxl_setup_features() during the pci probe function before the cxl_memdev is enumerated. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250220194438.2281088-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 08:51:27 -07:00
Mahe Tardy	9138048bb5	selftests/bpf: add cgroup_skb netns cookie tests Add netns cookie test that verifies the helper is now supported and work in the context of cgroup_skb programs. Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com> Link: https://lore.kernel.org/r/20250225125031.258740-2-mahe.tardy@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-26 07:35:51 -08:00
Chang S. Bae	bfc98dbcb3	selftests/x86/avx: Add AVX tests Add xstate testing specifically for those vector register states, validating kernel's context switching and ensuring ABI compliance. Use the established xstate testing framework. Alternatively, this invocation could be placed directly in xstate.c::main(). However, the current test file naming convention, which clearly specifies the tested area, seems reasonable. Adding avx.c considerably aligns with that convention. The test output should be like this for ZMM_Hi256 as an example: $ avx_64 ... [RUN] AVX-512 ZMM_Hi256: check context switches, 10 iterations, 5 threads. [OK] No incorrect case was found. [RUN] AVX-512 ZMM_Hi256: inject xstate via ptrace(). [OK] 'xfeatures' in SW reserved area was correctly written [OK] xstate was correctly updated. [RUN] AVX-512 ZMM_Hi256: load xstate and raise SIGUSR1 [OK] 'magic1' is valid [OK] 'xfeatures' in SW reserved area is valid [OK] 'xfeatures' in XSAVE header is valid [OK] xstate delivery was successful [OK] 'magic2' is valid [RUN] AVX-512 ZMM_Hi256: load new xstate from sighandler and check it after sigreturn [OK] xstate was restored correctly But systems without AVX-512 will look like: ... The kernel does not support feature number: 5 The kernel does not support feature number: 6 The kernel does not support feature number: 7 Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-10-chang.seok.bae@intel.com	2025-02-26 13:05:30 +01:00
Chang S. Bae	fa826c1f2c	selftests/x86/xstate: Clarify supported xstates The established xstate test code is designed to be generic, but certain xstates require special handling and cannot be tested without additional adjustments. Clarify which xstates are currently supported, and enforce testing only for them. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-9-chang.seok.bae@intel.com	2025-02-26 13:05:30 +01:00
Chang S. Bae	10d8a204c5	selftests/x86/xstate: Consolidate test invocations into a single entry Currently, each of the three xstate tests runs as a separate invocation, requiring the xstate number to be passed and state information to be reconstructed repeatedly. This approach arose from their individual and isolated development, but now it makes sense to unify them. Introduce a wrapper function that first verifies feature availability from the kernel and constructs the necessary state information once. The wrapper then sequentially invokes all tests to ensure consistent execution. Update the AMX test to use this unified invocation. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-8-chang.seok.bae@intel.com	2025-02-26 13:05:29 +01:00
Chang S. Bae	e075d9fa16	selftests/x86/xstate: Introduce signal ABI test With the refactored test cases, another xstate exposure to userspace is through signal delivery. While amx.c includes signal-related scenarios, its primary focus is on xstate permission management, which is largely specific to dynamic states. The remaining gap is testing xstate preservation and restoration across signal delivery. The kernel defines an ABI for presenting xstate in the signal frame, closely resembling the hardware XSAVE format, where xstate modification is also possible. Introduce a new test case to verify xstate preservation across signal delivery and return, that is ensuring ABI compatibility by: - Loading xstate before raising a signal. - Verifying correct exposure in the signal frame - Modifying xstate in the signal frame before returning. - Checking the state restoration upon signal return. Integrate this test into the AMX test suite as an initial usage site. Expected output: $ amx_64 ... [RUN] AMX Tile data: load xstate and raise SIGUSR1 [OK] 'magic1' is valid [OK] 'xfeatures' in SW reserved area is valid [OK] 'xfeatures' in XSAVE header is valid [OK] xstate delivery was successful [OK] 'magic2' is valid [RUN] AMX Tile data: load new xstate from sighandler and check it after sigreturn [OK] xstate was restored correctly Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-7-chang.seok.bae@intel.com	2025-02-26 13:05:29 +01:00
Chang S. Bae	7cb2fbe419	selftests/x86/xstate: Refactor ptrace ABI test Following the refactoring of the context switching test, the ptrace test is another component reusable for other xstate features. As part of this restructuring, add a missing check to validate the user_xstateregs->xstate_fx_sw field in the ABI. Also, replace err() and fatal_error() with ksft_exit_fail_msg() for consistency in error handling. Expected output: $ amx_64 ... [RUN] AMX Tile data: inject xstate via ptrace(). [OK] 'xfeatures' in SW reserved area was correctly written [OK] xstate was correctly updated. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-6-chang.seok.bae@intel.com	2025-02-26 13:05:29 +01:00
Chang S. Bae	40f6852ef2	selftests/x86/xstate: Refactor context switching test The existing context switching and ptrace tests in amx.c are not specific to dynamic states, making them reusable for general xstate testing. As a first step, move the context switching test to xstate.c. Refactor the test code to allow specifying which xstate component being tested. To decouple the test from dynamic states, remove the permission request code. In fact, The permission request inside the test wrapper was redundant. Additionally, replace fatal_error() with ksft_exit_fail_msg() for consistency in error handling. Expected output: $ amx_64 ... [RUN] AMX Tile data: check context switches, 10 iterations, 5 threads. [OK] No incorrect case was found. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-5-chang.seok.bae@intel.com	2025-02-26 13:05:29 +01:00
Chang S. Bae	3fcb4d6146	selftests/x86/xstate: Enumerate and name xstate components After moving essential helpers from amx.c, the code remains neutral regarding which xstate components it handles. However, explicitly listing known components helps users identify which features are ready for testing. Enumerate xstate components to facilitate identification. Extend struct xstate_info to include a name field, providing a human-readable identifier. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-4-chang.seok.bae@intel.com	2025-02-26 13:05:28 +01:00
Chang S. Bae	0f6d91a327	selftests/x86/xstate: Refactor XSAVE helpers for general use The AMX test introduced several XSAVE-related helper functions, but so far, it has been the only user of them. These helpers can be generalized for broader test of multiple xstate features. Move most XSAVE-related code into xsave.h, making it shareable. The restructuring includes: * Establishing low-level XSAVE helpers for saving and restoring register states, as well as handling XSAVE buffers. * Generalizing state data manipuldations: set_rand_data() * Introducing a generic feature query helper: get_xstate_info() While doing so, remove unused defines in amx.c. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-3-chang.seok.bae@intel.com	2025-02-26 13:05:28 +01:00
Chang S. Bae	dbd6b649e7	selftests/x86: Consolidate redundant signal helper functions The x86 selftests frequently register and clean up signal handlers, but the sethandler() and clearhandler() functions have been redundantly copied across multiple .c files. Move these functions to helpers.h to enable reuse across tests, eliminating around 250 lines of duplicate code. Converge the error handling by using ksft_exit_fail_msg(), which is functionally equivalent with err() within the selftest framework. This change is a prerequisite for the upcoming xstate selftest, which requires signal handling for registering and cleaning up handlers. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250226010731.2456-2-chang.seok.bae@intel.com	2025-02-26 13:05:28 +01:00
Sebastian Ott	a88c7c2244	KVM: selftests: arm64: Test writes to MIDR,REVIDR,AIDR Assert that MIDR_EL1, REVIDR_EL1, AIDR_EL1 are writable from userspace, that the changed values are visible to guests, and that they are preserved across a vCPU reset. Signed-off-by: Sebastian Ott <sebott@redhat.com> Link: https://lore.kernel.org/r/20250225005401.679536-6-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-02-26 01:32:23 -08:00
Amery Hung	4e4136c644	selftests/bpf: Test gen_pro/epilogue that generate kfuncs Test gen_prologue and gen_epilogue that generate kfuncs that have not been seen in the main program. The main bpf program and return value checks are identical to pro_epilogue.c introduced in commit `47e69431b5` ("selftests/bpf: Test gen_prologue and gen_epilogue"). However, now when bpf_testmod_st_ops detects a program name with prefix "test_kfunc_", it generates slightly different prologue and epilogue: They still add 1000 to args->a in prologue, add 10000 to args->a and set r0 to 2 * args->a in epilogue, but involve kfuncs. At high level, the alternative version of prologue and epilogue look like this: cgrp = bpf_cgroup_from_id(0); if (cgrp) bpf_cgroup_release(cgrp); else /* Perform what original bpf_testmod_st_ops prologue or * epilogue does */ Since 0 is never a valid cgroup id, the original prologue or epilogue logic will be performed. As a result, the __retval check should expect the exact same return value. Signed-off-by: Amery Hung <ameryhung@gmail.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250225233545.285481-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-25 19:04:43 -08:00
Gal Pressman	da87cabaf8	selftests: drv-net-hw: Add a test for symmetric RSS hash Add a selftest that verifies symmetric RSS hash is working as intended. The test runs iterations of traffic, swapping the src/dst UDP ports, and verifies that the same RX queue is receiving the traffic in both cases. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250224174416.499070-5-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-25 18:31:05 -08:00
Gal Pressman	0163250039	selftests: drv-net: Make rand_port() get a port more reliably Instead of guessing a port and checking whether it's available, get an available port from the OS. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250224174416.499070-4-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-25 18:31:05 -08:00
Hangbin Liu	0f58804080	selftests/net: ensure mptcp is enabled in netns Some distributions may not enable MPTCP by default. All other MPTCP tests source mptcp_lib.sh to ensure MPTCP is enabled before testing. However, the ip_local_port_range test is the only one that does not include this step. Let's also ensure MPTCP is enabled in netns for ip_local_port_range so that it passes on all distributions. Suggested-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250224094013.13159-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-25 18:13:28 -08:00
Linus Torvalds	9f5270d758	perf tools fixes for v6.14: 2nd batch - Fix tools/ quiet build Makefile infrastructure that was broken when working on tools/perf/ without testing on other tools/ living utilities. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZ74OJwAKCRCyPKLppCJ+ J30mAPsHCA8A+CNq/5yW2VhFLV1GgCSL5oWqxXRn7QjhSrCQBQEAot2u4O5zXs7M sg+mPlYiS1oT+zmvTLlXrN+bVyWP9A4= =jH1N -----END PGP SIGNATURE----- Merge tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix tools/ quiet build Makefile infrastructure that was broken when working on tools/perf/ without testing on other tools/ living utilities. * tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: tools: Remove redundant quiet setup tools: Unify top-level quiet infrastructure	2025-02-25 13:32:32 -08:00
liuye	c4f23a9d6e	selftests/x86/lam: Fix minor memory in do_uring() Exception branch returns without freeing 'fi'. Signed-off-by: liuye <liuye@kylinos.cn> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250114082650.113105-1-liuye@kylinos.cn	2025-02-25 15:03:06 +01:00
Linus Torvalds	2a1944bff5	RISC-V Fixes for 6.14-rc5 * A fix for cacheinfo DT probing to avoid reading non-boolean properties as booleans. * A fix for cpufeature to use bitmap_equal() instead of memcmp(), so unused bits are ignored. * Fixes for cmpxchg and futex cmpxchg that properly encode the sign extension requirements on inline asm, which results in spurious successes. This manifests in at least inode_set_ctime_current, but is likely just a disaster waiting to happen. * A fix for the rseq selftests, which was using an invalid constraint. * A pair of fixes for signal frame size handling: * We were reserving space for an extra empty extension context header on systems with extended signal context, thus resulting in unnecessarily large allocations. * We weren't properly checking for available extensions before calculating the signal stack size, which resulted in undersized stack allocations on some systems (at least those with T-Head custom vectors). Also, we've added Alex as a reviewer. He's been helping out a ton lately, thanks! -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAme8rsQTHHBhbG1lckBk YWJiZWx0LmNvbQAKCRDvTKFQLMurQVumD/9EZz6+ejftG1u7M/YzFfIa6ZOqYtoi 7aaecvsKNhVu0zvLU2vDAj+lTeokutQKAI9hByoVDVBzCllW/AYnpentQXTbHbl4 3pyIeOPKMXEDyru8heQ/u7h2qhq1AN54btPHx9UdIc5/vyrQIF2Ec4jo1GojJmyj qNHSyeTOB8nhBHYzM2RYAw6r0s27UX5wfL09Cm97b7KJigJ6GGPsI+KjruOqVfST i7wqbH1ufQoXrU8DhICA5wT/Q7f2WqJ3W2IyP186EJOAXhEG9xzOMQzVwCm2KI/I Rf5mhE5MoS9+ZmyhpATRc8Dwr1iZjIh7nKIFJh4/IGcSCuwsnWEFRvihmqCnphl3 /fFLstBflFAEKadmc7LgRJDTZYAjVAe/kSrl3v9ArVBU7d07/dAUOXcgjJy6fygx 7yrMnT3Ew4J160iDvFIalUdjEgYsKbwNygT/D6XlcXuR/KoWA+HYQ7S4eTgbqpiz 366JAhZQ9xyQxTlca13sdkz+2eBjU/SAX8O9/WYpDp6J5uqOFtyQ52a2qEzQFznF MRtFe2YERlM8L0vqJhNFEDefo7FUuZIyOU8evhJA+L7X8b3FpsEx5xUzamXdeGsQ 0sEbsBPhNToVF0DWn8HtdpMYYX95xujj83QneMawaGjpS7cpY93OVUNFAJ/a/uuJ bGLv+1HIdW0LoQ== =+DfW -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A fix for cacheinfo DT probing to avoid reading non-boolean properties as booleans. - A fix for cpufeature to use bitmap_equal() instead of memcmp(), so unused bits are ignored. - Fixes for cmpxchg and futex cmpxchg that properly encode the sign extension requirements on inline asm, which results in spurious successes. This manifests in at least inode_set_ctime_current, but is likely just a disaster waiting to happen. - A fix for the rseq selftests, which was using an invalid constraint. - A pair of fixes for signal frame size handling: - We were reserving space for an extra empty extension context header on systems with extended signal context, thus resulting in unnecessarily large allocations. - We weren't properly checking for available extensions before calculating the signal stack size, which resulted in undersized stack allocations on some systems (at least those with T-Head custom vectors). Also, we've added Alex as a reviewer. He's been helping out a ton lately, thanks! * tag 'riscv-for-linus-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: MAINTAINERS: Add myself as a riscv reviewer riscv: signal: fix signal_minsigstksz riscv: signal: fix signal frame size rseq/selftests: Fix riscv rseq_offset_deref_addv inline asm riscv/futex: sign extend compare value in atomic cmpxchg riscv/atomic: Do proper sign extension also for unsigned in arch_cmpxchg riscv: cpufeature: use bitmap_equal() instead of memcmp() riscv: cacheinfo: Use of_property_present() for non-boolean properties	2025-02-24 16:40:32 -08:00
Martin K. Petersen	adc4fb9c81	Merge patch series "Initial support for RK3576 UFS controller" Shawn Lin <shawn.lin@rock-chips.com> says: This patchset adds initial UFS controller supprt for RK3576 SoC. Patch 1 is the dt-bindings. Patch 2-4 deal with rpm and spm support in advanced suggested by Ulf. Patch 5 exports two new APIs for host driver. Patch 6 and 7 are the host driver and dtsi support. Link: https://lore.kernel.org/r/1738736156-119203-1-git-send-email-shawn.lin@rock-chips.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2025-02-24 19:24:14 -05:00
Yiqian Xun	e402c70856	selftests/user_events: Fix failures caused by test code In parse_abi function,the dyn_test fails because the enable_file isn’t closed after successfully registering an event. By adding wait_for_delete(), the dyn_test now passes as expected. Link: https://lore.kernel.org/r/20250221033555.326716-1-realxxyq@163.com Signed-off-by: Yiqian Xun <xunyiqian@kylinos.cn> Acked-by: Beau Belgrave <beaub@linux.microsoft.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-24 16:37:17 -07:00
Jakub Kicinski	29b036be1b	selftests: drv-net: test XDP, HDS auto and the ioctl path Test XDP and HDS interaction. While at it add a test for using the IOCTL, as that turned out to be the real culprit. Testing bnxt: # NETIF=eth0 ./ksft-net-drv/drivers/net/hds.py KTAP version 1 1..12 ok 1 hds.get_hds ok 2 hds.get_hds_thresh ok 3 hds.set_hds_disable # SKIP disabling of HDS not supported by the device ok 4 hds.set_hds_enable ok 5 hds.set_hds_thresh_zero ok 6 hds.set_hds_thresh_max ok 7 hds.set_hds_thresh_gt ok 8 hds.set_xdp ok 9 hds.enabled_set_xdp ok 10 hds.ioctl ok 11 hds.ioctl_set_xdp ok 12 hds.ioctl_enabled_set_xdp # Totals: pass:11 fail:0 xfail:0 xpass:0 skip:1 error:0 and netdevsim: # ./ksft-net-drv/drivers/net/hds.py KTAP version 1 1..12 ok 1 hds.get_hds ok 2 hds.get_hds_thresh ok 3 hds.set_hds_disable ok 4 hds.set_hds_enable ok 5 hds.set_hds_thresh_zero ok 6 hds.set_hds_thresh_max ok 7 hds.set_hds_thresh_gt ok 8 hds.set_xdp ok 9 hds.enabled_set_xdp ok 10 hds.ioctl ok 11 hds.ioctl_set_xdp ok 12 hds.ioctl_enabled_set_xdp # Totals: pass:12 fail:0 xfail:0 xpass:0 skip:0 error:0 Netdevsim needs a sane default for tx/rx ring size. ethtool 6.11 is needed for the --disable-netlink option. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Tested-by: Taehee Yoo <ap420073@gmail.com> Link: https://patch.msgid.link/20250221025141.1132944-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-24 14:16:37 -08:00
David Wei	89baa22d75	io_uring/zcrx: add selftest case for recvzc with read limit Add a selftest case to iou-zcrx where the sender sends 4x4K = 16K and the receiver does 4x4K recvzc requests. Validate that the requests complete successfully and that the data is not corrupted. Signed-off-by: David Wei <dw@davidwei.uk> Link: https://lore.kernel.org/r/20250224041319.2389785-3-dw@davidwei.uk Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-02-24 12:56:13 -07:00
Sean Christopherson	2428865bf0	KVM: selftests: Add a nested (forced) emulation intercept test for x86 Add a rudimentary test for validating KVM's handling of L1 hypervisor intercepts during instruction emulation on behalf of L2. To minimize complexity and avoid overlap with other tests, only validate KVM's handling of instructions that L1 wants to intercept, i.e. that generate a nested VM-Exit. Full testing of emulation on behalf of L2 is better achieved by running existing (forced) emulation tests in a VM, (although on VMX, getting L0 to emulate on #UD requires modifying either L1 KVM to not intercept #UD, or modifying L0 KVM to prioritize L0's exception intercepts over L1's intercepts, as is done by KVM for SVM). Since emulation should never be successful, i.e. L2 always exits to L1, dynamically generate the L2 code stream instead of adding a helper for each instruction. Doing so requires hand coding instruction opcodes, but makes it significantly easier for the test to compute the expected "next RIP" and instruction length. Link: https://lore.kernel.org/r/20250201015518.689704-12-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-24 09:01:07 -08:00
Linus Torvalds	b8c8c1414f	tracing fixes for v6.14: Function graph accounting fixes: - Fix the manage ops hashes The function graph registers a "manager ops" and "sub-ops" to ftrace. The manager ops does not have any callback but calls the sub-ops callbacks. The manage ops hashes (what is used to tell ftrace what functions to attach to) is built on the sub-ops it manages. There was an error in the way it built the hash. An empty hash means to attach to all functions. When the manager ops had one sub-ops it properly copied its hash. But when the manager ops had more than one sub-ops, it went into a loop to make a set of all functions it needed to add to the hash. If any of the subops hashes was empty, that would mean to attach to all functions. The error was that the first iteration of the loop passed in an empty hash to start with in order to add the other hashes. That starting hash was mistaken as to attach to all functions. This made the manage ops attach to all functions whenever it had two or more sub-ops, even if each sub-op was attached to only a single function. - Do not add duplicate entries to the manager ops hash If two or more subops hashes trace the same function, an entry for that function will be added to the manager ops for each subops. This causes waste and extra overhead. Fprobe accounting fixes: - Remove last function from fprobe hash Fprobes has a ftrace hash to manage which functions an fprobe is attached to. It also has a counter of how many fprobes are attached. When the last fprobe is removed, it unregisters the fprobe from ftrace but does not remove the functions the last fprobe was attached to from the hash. This leaves the old functions attached. When a new fprobe is added, the fprobe infrastructure attaches to not only the functions of the new fprobe, but also to the functions of the last fprobe. - Fix accounting of the fprobe counter When a fprobe is added, it updates a counter. If the counter goes from zero to one, it attaches its ops to ftrace. When an fprobe is removed, the counter is decremented. If the counter goes from 1 to zero, it removes the fprobes ops from ftrace. There was an issue where if two fprobes trace the same function, the addition of each fprobe would increment the counter. But when removing the first of the fprobes, it would notice that another fprobe is still attached to one of its functions no it does not remove the functions from the ftrace ops. But it also did not decrement the counter. When the last fprobe is removed, the counter is still one. This leaves the fprobes callback still registered with ftrace and it being called by the functions defined by the fprobes ops hash. Worse yet, because all the functions from the fprobe ops hash have been removed, that tells ftrace that it wants to trace all functions. Thus, this puts the state of the system where every function is calling the fprobe callback handler (which does nothing as there are no registered fprobes), but this causes a good 13% slow down of the entire system. Other updates: - Add a selftest to test the above issues to prevent regressions. - Fix preempt count accounting in function tracing Better recursion protection was added to function tracing which added another layer of preempt disable. As the preempt_count gets traced in the event, it needs to subtract the amount of preempt disabling the tracer does to record what the preempt_count was when the trace was triggered. - Fix memory leak in output of set_event A variable is passed by the seq_file functions in the location that is set by the return of the next() function. The start() function allocates it and the stop() function frees it. But when the last item is found, the next() returns NULL which leaks the data that was allocated in start(). The m->private is used for something else, so have next() free the data when it returns NULL, as stop() will then just receive NULL in that case. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ7j6ARQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6quFxAQDrO8tjYbhLqg/LMOQyzwn/EF3Jx9ub 87961mA0rKTkYwEAhPNzTZ6GwKyKc4ny/R338KgNY69wWnOK6k/BTxCRmwk= =TOah -----END PGP SIGNATURE----- Merge tag 'ftrace-v6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: "Function graph accounting fixes: - Fix the manage ops hashes The function graph registers a "manager ops" and "sub-ops" to ftrace. The manager ops does not have any callback but calls the sub-ops callbacks. The manage ops hashes (what is used to tell ftrace what functions to attach to) is built on the sub-ops it manages. There was an error in the way it built the hash. An empty hash means to attach to all functions. When the manager ops had one sub-ops it properly copied its hash. But when the manager ops had more than one sub-ops, it went into a loop to make a set of all functions it needed to add to the hash. If any of the subops hashes was empty, that would mean to attach to all functions. The error was that the first iteration of the loop passed in an empty hash to start with in order to add the other hashes. That starting hash was mistaken as to attach to all functions. This made the manage ops attach to all functions whenever it had two or more sub-ops, even if each sub-op was attached to only a single function. - Do not add duplicate entries to the manager ops hash If two or more subops hashes trace the same function, an entry for that function will be added to the manager ops for each subops. This causes waste and extra overhead. Fprobe accounting fixes: - Remove last function from fprobe hash Fprobes has a ftrace hash to manage which functions an fprobe is attached to. It also has a counter of how many fprobes are attached. When the last fprobe is removed, it unregisters the fprobe from ftrace but does not remove the functions the last fprobe was attached to from the hash. This leaves the old functions attached. When a new fprobe is added, the fprobe infrastructure attaches to not only the functions of the new fprobe, but also to the functions of the last fprobe. - Fix accounting of the fprobe counter When a fprobe is added, it updates a counter. If the counter goes from zero to one, it attaches its ops to ftrace. When an fprobe is removed, the counter is decremented. If the counter goes from 1 to zero, it removes the fprobes ops from ftrace. There was an issue where if two fprobes trace the same function, the addition of each fprobe would increment the counter. But when removing the first of the fprobes, it would notice that another fprobe is still attached to one of its functions no it does not remove the functions from the ftrace ops. But it also did not decrement the counter, so when the last fprobe is removed, the counter is still one. This leaves the fprobes callback still registered with ftrace and it being called by the functions defined by the fprobes ops hash. Worse yet, because all the functions from the fprobe ops hash have been removed, that tells ftrace that it wants to trace all functions. Thus, this puts the state of the system where every function is calling the fprobe callback handler (which does nothing as there are no registered fprobes), but this causes a good 13% slow down of the entire system. Other updates: - Add a selftest to test the above issues to prevent regressions. - Fix preempt count accounting in function tracing Better recursion protection was added to function tracing which added another layer of preempt disable. As the preempt_count gets traced in the event, it needs to subtract the amount of preempt disabling the tracer does to record what the preempt_count was when the trace was triggered. - Fix memory leak in output of set_event A variable is passed by the seq_file functions in the location that is set by the return of the next() function. The start() function allocates it and the stop() function frees it. But when the last item is found, the next() returns NULL which leaks the data that was allocated in start(). The m->private is used for something else, so have next() free the data when it returns NULL, as stop() will then just receive NULL in that case" * tag 'ftrace-v6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix memory leak when reading set_event file ftrace: Correct preemption accounting for function tracing. selftests/ftrace: Update fprobe test to check enabled_functions file fprobe: Fix accounting of when to unregister from function graph fprobe: Always unregister fgraph function from ops ftrace: Do not add duplicate entries in subops manager ops ftrace: Fix accounting of adding subops to a manager ops	2025-02-22 09:03:54 -08:00
Tamir Duberstein	ba6be7ba2d	selftests: remove reference to prime_numbers.sh Remove a leftover shell script reference from commit `313b38a6ec` ("lib/prime_numbers: convert self-test to KUnit"). Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202502171110.708d965a-lkp@intel.com Fixes: `313b38a6ec` ("lib/prime_numbers: convert self-test to KUnit") Signed-off-by: Tamir Duberstein <tamird@gmail.com> Link: https://lore.kernel.org/r/20250217-fix-prime-numbers-v1-1-eb0ca7235e60@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>	2025-02-22 06:44:45 -08:00
Michael Jeanson	3c27b40830	selftests/rseq: Add rseq syscall errors test This test adds coverage of expected errors during rseq registration and unregistration, it disables glibc integration and will thus always exercise the rseq syscall explictly. Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20250121213402.1754762-1-mjeanson@efficios.com	2025-02-22 14:13:45 +01:00
Maciej Wieczor-Retman	782b819827	selftests/lam: Test get_user() LAM pointer handling Recent change in how get_user() handles pointers: https://lore.kernel.org/all/20241024013214.129639-1-torvalds@linux-foundation.org/ has a specific case for LAM. It assigns a different bitmask that's later used to check whether a pointer comes from userland in get_user(). Add test case to LAM that utilizes a ioctl (FIOASYNC) syscall which uses get_user() in its implementation. Execute the syscall with differently tagged pointers to verify that valid user pointers are passing through and invalid kernel/non-canonical pointers are not. Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/1624d9d1b9502517053a056652d50dc5d26884ac.1737990375.git.maciej.wieczor-retman@intel.com	2025-02-22 13:17:07 +01:00
Maciej Wieczor-Retman	51f909dcd1	selftests/lam: Skip test if LAM is disabled Until LASS is merged into the kernel: https://lore.kernel.org/all/20241028160917.1380714-1-alexander.shishkin@linux.intel.com/ LAM is left disabled in the config file. Running the LAM selftest with disabled LAM only results in unhelpful output. Use one of LAM syscalls() to determine whether the kernel was compiled with LAM support (CONFIG_ADDRESS_MASKING) or not. Skip running the tests in the latter case. Merge CPUID checking function with the one mentioned above to achieve a single function that shows LAM's availability from both CPU and the kernel. Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/251d0f45f6a768030115e8d04bc85458910cb0dc.1737990375.git.maciej.wieczor-retman@intel.com	2025-02-22 13:17:07 +01:00
Maciej Wieczor-Retman	ec8f5b4659	selftests/lam: Move cpu_has_la57() to use cpuinfo flag In current form cpu_has_la57() reports platform's support for LA57 through reading the output of cpuid. A much more useful information is whether 5-level paging is actually enabled on the running system. Check whether 5-level paging is enabled by trying to map a page in the high linear address space. Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/8b1ca51b13e6d94b5a42b6930d81b692cbb0bcbb.1737990375.git.maciej.wieczor-retman@intel.com	2025-02-22 13:17:07 +01:00
Hangbin Liu	465b210fdc	selftests: fib_nexthops: do not mark skipped tests as failed The current test marks all unexpected return values as failed and sets ret to 1. If a test is skipped, the entire test also returns 1, incorrectly indicating failure. To fix this, add a skipped variable and set ret to 4 if it was previously 0. Otherwise, keep ret set to 1. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250220085326.1512814-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:23:29 -08:00
Ido Schimmel	e818d1d1a6	selftests: fib_rule_tests: Add DSCP mask match tests Add tests for FIB rules that match on DSCP with a mask. Test both good and bad flows and both the input and output paths. # ./fib_rule_tests.sh IPv6 FIB rule tests [...] TEST: rule6 check: dscp redirect to table [ OK ] TEST: rule6 check: dscp no redirect to table [ OK ] TEST: rule6 del by pref: dscp redirect to table [ OK ] TEST: rule6 check: iif dscp redirect to table [ OK ] TEST: rule6 check: iif dscp no redirect to table [ OK ] TEST: rule6 del by pref: iif dscp redirect to table [ OK ] TEST: rule6 check: dscp masked redirect to table [ OK ] TEST: rule6 check: dscp masked no redirect to table [ OK ] TEST: rule6 del by pref: dscp masked redirect to table [ OK ] TEST: rule6 check: iif dscp masked redirect to table [ OK ] TEST: rule6 check: iif dscp masked no redirect to table [ OK ] TEST: rule6 del by pref: iif dscp masked redirect to table [ OK ] [...] Tests passed: 316 Tests failed: 0 Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/20250220080525.831924-7-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 16:08:49 -08:00
Jakub Kicinski	e87700965a	bpf-next-for-netdev -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQ6NaUOruQGUkvPdG4raS+Z+3y5EwUCZ7ffOQAKCRAraS+Z+3y5 EzVHAP9h/QkeYoOZW9gul08I8vFiZsFe/lbOSLJWxeVfxb9JhgD/cMqby3qAxQK6 lsdNQ9jYG2232Wym89ag7fvTBK15Wg4= =gkN2 -----END PGP SIGNATURE----- Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-02-20 We've added 19 non-merge commits during the last 8 day(s) which contain a total of 35 files changed, 1126 insertions(+), 53 deletions(-). The main changes are: 1) Add TCP_RTO_MAX_MS support to bpf_set/getsockopt, from Jason Xing 2) Add network TX timestamping support to BPF sock_ops, from Jason Xing 3) Add TX metadata Launch Time support, from Song Yoong Siang * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: igc: Add launch time support to XDP ZC igc: Refactor empty frame insertion for launch time support net: stmmac: Add launch time support to XDP ZC selftests/bpf: Add launch time request to xdp_hw_metadata xsk: Add launch time hardware offload support to XDP Tx metadata selftests/bpf: Add simple bpf tests in the tx path for timestamping feature bpf: Support selective sampling for bpf timestamping bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback net-timestamp: Prepare for isolating two modes of SO_TIMESTAMPING bpf: Disable unsafe helpers in TX timestamping callbacks bpf: Prevent unsafe access to the sock fields in the BPF timestamping callback bpf: Prepare the sock_ops ctx and call bpf prog for TX timestamping bpf: Add networking timestamping support to bpf_get/setsockopt() selftests/bpf: Add rto max for bpf_setsockopt test bpf: Support TCP_RTO_MAX_MS for bpf_setsockopt ==================== Link: https://patch.msgid.link/20250221022104.386462-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:59:47 -08:00
Xiao Liang	85cb3711ac	selftests: net: Add test cases for link and peer netns - Add test for creating link in another netns when a link of the same name and ifindex exists in current netns. - Add test to verify that link is created in target netns directly - no link new/del events should be generated in link netns or current netns. - Add test cases to verify that link-netns is set as expected for various drivers and combination of namespace-related parameters. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Link: https://patch.msgid.link/20250219125039.18024-14-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:03 -08:00
Xiao Liang	0303294162	selftests: net: Add python context manager for netns entering Change netns of current thread and switch back on context exit. For example: with NetNSEnter("ns1"): ip("link add dummy0 type dummy") The command be executed in netns "ns1". Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Link: https://patch.msgid.link/20250219125039.18024-13-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-21 15:28:03 -08:00
Steven Rostedt	e85c5e9792	selftests/ftrace: Update fprobe test to check enabled_functions file A few bugs were found in the fprobe accounting logic along with it using the function graph infrastructure. Update the fprobe selftest to catch those bugs in case they or something similar shows up in the future. The test now checks the enabled_functions file which shows all the functions attached to ftrace or fgraph. When enabling a fprobe, make sure that its corresponding function is also added to that file. Also add two more fprobes to enable to make sure that the fprobe logic works properly with multiple probes. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/20250220202055.733001756@goodmis.org Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Tested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-02-21 09:36:12 -05:00
Chandra Pratap	ae9ebda1bc	selftests: fix spelling/grammar errors in sysctl/sysctl.sh Fix the grammatical/spelling errors in sysctl/sysctl.sh. This fixes all errors pointed out by codespell in the file. Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>	2025-02-21 09:27:04 +01:00
Amery Hung	63817c7711	selftests/bpf: Test struct_ops program with __ref arg calling bpf_tail_call Test if the verifier rejects struct_ops program with __ref argument calling bpf_tail_call(). Signed-off-by: Amery Hung <ameryhung@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250220221532.1079331-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-20 18:44:35 -08:00
Alexei Starovoitov	bd4319b6c2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf bpf-6.14-rc4 Cross-merge bpf fixes after downstream PR (bpf-6.14-rc4). Minor conflict: kernel/bpf/btf.c Adjacent changes: kernel/bpf/arena.c kernel/bpf/btf.c kernel/bpf/syscall.c kernel/bpf/verifier.c mm/memory.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-20 18:13:57 -08:00
Jakub Kicinski	932a9249f7	selftests: drv-net: rename queues check_xdp to check_xsk The test is for AF_XDP, we refer to AF_XDP as XSK. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	4fde839846	selftests: drv-net: improve the use of ksft helpers in XSK queue test Avoid exceptions when xsk attr is not present, and add a proper ksft helper for "not in" condition. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	7147713799	selftests: drv-net: add a way to wait for a local process We use wait_port_listen() extensively to wait for a process we spawned to be ready. Not all processes will open listening sockets. Add a method of explicitly waiting for a child to be ready. Pass a FD to the spawned process and wait for it to write a message to us. FD number is passed via KSFT_READY_FD env variable. Similarly use KSFT_WAIT_FD to let the child process for a sign that we are done and child should exit. Sending a signal to a child with shell=True can get tricky. Make use of this method in the queues test to make it less flaky. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	d3726ab45c	selftests: drv-net: probe for AF_XDP sockets more explicitly Separate the support check from socket binding for easier refactoring. Use: ./helper - - just to probe if we can open the socket. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	bab59dcf71	selftests: drv-net: add missing new line in xdp_helper Kurt and Joe report missing new line at the end of Usage. Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	dabd31baa3	selftests: drv-net: use cfg.rpath() in netlink xsk attr test The cfg.rpath() helper was been recently added to make formatting paths for helper binaries easier. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:58:25 -08:00
Jakub Kicinski	846742f7e3	selftests: drv-net: add a warning for bkg + shell + terminate Joe Damato reports that some shells will fork before running the command when python does "sh -c $cmd", while bash on my machine does an exec of $cmd directly. This will have implications for our ability to terminate the child process on various configurations of bash and other shells. Warn about using bkg(... shell=True, termininate=True) most background commands can hopefully exit cleanly (exit_wait). Link: https://lore.kernel.org/Z7Yld21sv_Ip3gQx@LQ3V64L9R2 Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250219234956.520599-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 17:57:29 -08:00
Linus Torvalds	319fc77f8f	BPF fixes: - Fix a soft-lockup in BPF arena_map_free on 64k page size kernels (Alan Maguire) - Fix a missing allocation failure check in BPF verifier's acquire_lock_state (Kumar Kartikeya Dwivedi) - Fix a NULL-pointer dereference in trace_kfree_skb by adding kfree_skb to the raw_tp_null_args set (Kuniyuki Iwashima) - Fix a deadlock when freeing BPF cgroup storage (Abel Wu) - Fix a syzbot-reported deadlock when holding BPF map's freeze_mutex (Andrii Nakryiko) - Fix a use-after-free issue in bpf_test_init when eth_skb_pkt_type is accessing skb data not containing an Ethernet header (Shigeru Yoshida) - Fix skipping non-existing keys in generic_map_lookup_batch (Yan Zhai) - Several BPF sockmap fixes to address incorrect TCP copied_seq calculations, which prevented correct data reads from recv(2) in user space (Jiayuan Chen) - Two fixes for BPF map lookup nullness elision (Daniel Xu) - Fix a NULL-pointer dereference from vmlinux BTF lookup in bpf_sk_storage_tracing_allowed (Jared Kangas) Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> -----BEGIN PGP SIGNATURE----- iIsEABYKADMWIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZ7evlRUcZGFuaWVsQGlv Z2VhcmJveC5uZXQACgkQ2yufC7HISIPwHgD/dTvM00Ck4Q73fPivyT7tcqxeXJlD D6ggzWl/SG9LAbwA/2/cSgAM9Jm1g7ddvn/S9QaDYOs5GmFl6urq6krs+tYD =FCs9 -----END PGP SIGNATURE----- Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull BPF fixes from Daniel Borkmann: - Fix a soft-lockup in BPF arena_map_free on 64k page size kernels (Alan Maguire) - Fix a missing allocation failure check in BPF verifier's acquire_lock_state (Kumar Kartikeya Dwivedi) - Fix a NULL-pointer dereference in trace_kfree_skb by adding kfree_skb to the raw_tp_null_args set (Kuniyuki Iwashima) - Fix a deadlock when freeing BPF cgroup storage (Abel Wu) - Fix a syzbot-reported deadlock when holding BPF map's freeze_mutex (Andrii Nakryiko) - Fix a use-after-free issue in bpf_test_init when eth_skb_pkt_type is accessing skb data not containing an Ethernet header (Shigeru Yoshida) - Fix skipping non-existing keys in generic_map_lookup_batch (Yan Zhai) - Several BPF sockmap fixes to address incorrect TCP copied_seq calculations, which prevented correct data reads from recv(2) in user space (Jiayuan Chen) - Two fixes for BPF map lookup nullness elision (Daniel Xu) - Fix a NULL-pointer dereference from vmlinux BTF lookup in bpf_sk_storage_tracing_allowed (Jared Kangas) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests: bpf: test batch lookup on array of maps with holes bpf: skip non exist keys in generic_map_lookup_batch bpf: Handle allocation failure in acquire_lock_state bpf: verifier: Disambiguate get_constant_map_key() errors bpf: selftests: Test constant key extraction on irrelevant maps bpf: verifier: Do not extract constant map keys for irrelevant maps bpf: Fix softlockup in arena_map_free on 64k page kernel net: Add rx_skb of kfree_skb to raw_tp_null_args[]. bpf: Fix deadlock when freeing cgroup storage selftests/bpf: Add strparser test for bpf selftests/bpf: Fix invalid flag of recv() bpf: Disable non stream socket for strparser bpf: Fix wrong copied_seq calculation strparser: Add read_sock callback bpf: avoid holding freeze_mutex during mmap operation bpf: unify VM_WRITE vs VM_MAYWRITE use in BPF map mmaping logic selftests/bpf: Adjust data size to have ETH_HLEN bpf, test_run: Fix use-after-free issue in eth_skb_pkt_type() bpf: Remove unnecessary BTF lookups in bpf_sk_storage_tracing_allowed	2025-02-20 15:37:17 -08:00
Song Yoong Siang	6164847e54	selftests/bpf: Add launch time request to xdp_hw_metadata Add launch time hardware offload request to xdp_hw_metadata. Users can configure the delta of launch time relative to HW RX-time using the "-l" argument. By default, the delta is set to 0 ns, which means the launch time is disabled. By setting the delta to a non-zero value, the launch time hardware offload feature will be enabled and requested. Additionally, users can configure the Tx Queue to be enabled with the launch time hardware offload using the "-L" argument. By default, Tx Queue 0 will be used. Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250216093430.957880-3-yoong.siang.song@intel.com	2025-02-20 15:13:45 -08:00
Jason Xing	f4924aec58	selftests/bpf: Add simple bpf tests in the tx path for timestamping feature BPF program calculates a couple of latency deltas between each tx timestamping callbacks. It can be used in the real world to diagnose the kernel behaviour in the tx path. Check the safety issues by accessing a few bpf calls in bpf_test_access_bpf_calls() which are implemented in the patch 3 and 4. Check if the bpf timestamping can co-exist with socket timestamping. There remains a few realistic things[1][2] to highlight: 1. in general a packet may pass through multiple qdiscs. For instance with bonding or tunnel virtual devices in the egress path. 2. packets may be resent, in which case an ACK might precede a repeat SCHED and SND. 3. erroneous or malicious peers may also just never send an ACK. [1]: https://lore.kernel.org/all/67a389af981b0_14e0832949d@willemb.c.googlers.com.notmuch/ [2]: https://lore.kernel.org/all/c329a0c1-239b-4ca1-91f2-cb30b8dd2f6a@linux.dev/ Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-13-kerneljasonxing@gmail.com	2025-02-20 14:30:07 -08:00
Thomas Weißschuh	9c812b01f1	tools/nolibc: add support for 32-bit s390 32-bit s390 is very close to the existing 64-bit implementation. Some special handling is necessary as there is neither LLVM nor QEMU support. Also the kernel itself can not build natively for 32-bit s390, so instead the test program is executed with a 64-bit kernel. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250206-nolibc-s390-v2-2-991ad97e3d58@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-20 22:06:32 +01:00
Thomas Weißschuh	3d1e67c615	selftests/nolibc: rename s390 to s390x Support for 32-bit s390 is about to be added. As "s39032" would look horrible, use the another naming scheme. 32-bit s390 is "s390" and 64-bit s390 is "s390x", similar to how it is handled in various toolchain components. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250206-nolibc-s390-v2-1-991ad97e3d58@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-20 22:06:18 +01:00
Thomas Weißschuh	00ddf4cc97	selftests/nolibc: only run constructor tests on nolibc The nolibc testsuite can be run against other libcs to test for interoperability. Some aspects of the constructor execution are not standardized and musl does not provide all tested feature, for one it does not provide arguments to the constructors, anymore? Skip the constructor tests on non-nolibc configurations. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20250212-nolibc-test-constructor-v1-1-c963875b3da4@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>	2025-02-20 22:04:12 +01:00
Steven Rostedt	e35896f236	selftests/tracing: Allow some more tests to run in instances The tests: trigger-action-hist-xfail.tc trigger-onchange-action-hist.tc trigger-snapshot-action-hist.tc trigger-hist-expressions.tc can all run in an instance. Test them in an instance as well. Link: https://lore.kernel.org/r/20250220185846.451234966@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-20 13:15:14 -07:00
Steven Rostedt	a58cc70af2	selftests/ftrace: Clean up triggers after setting them The triggers set in trigger-onchange-action-hist.tc and trigger-snapshot-action-hist.tc are not cleaned up at the end. These tests can also be done in instances and without cleaning up the triggers, the instances can not be removed as they are still "busy". Link: https://lore.kernel.org/r/20250220185846.291817731@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-20 13:15:07 -07:00
Steven Rostedt	4a3134b114	selftests/tracing: Test only toplevel README file not the instances For the tests that have both a README attribute as well as the instance flag to run the tests as an instance, the instance version will always exit with UNSUPPORTED. That's because the instance directory does not contain a README file. Currently, the tests check for a README file in the directory that the test runs in and if there's a requirement for something to be present in the README file, it will not find it, as the instance directory doesn't have it. Have the tests check if the current directory is an instance directory, and if it is, check two directories above the current directory for the README file: /sys/kernel/tracing/README /sys/kernel/tracing/instances/foo/../../README Link: https://lore.kernel.org/r/20250220185846.130216270@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-20 13:15:01 -07:00
Jakub Kicinski	5d6ba5ab85	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.14-rc4). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-20 10:37:30 -08:00
Linus Torvalds	27eddbf344	Smaller than usual with no fixes from any subtree. Current release - regressions: - core: fix race of rtnl_net_lock(dev_net(dev)). Previous releases - regressions: - core: remove the single page frag cache for good - flow_dissector: fix handling of mixed port and port-range keys - sched: cls_api: fix error handling causing NULL dereference - tcp: - adjust rcvq_space after updating scaling ratio - drop secpath at the same time as we currently drop dst - eth: gtp: suppress list corruption splat in gtp_net_exit_batch_rtnl(). Previous releases - always broken: - vsock: - fix variables initialization during resuming - for connectible sockets allow only connected - eth: geneve: fix use-after-free in geneve_find_dev(). - eth: ibmvnic: don't reference skb after sending to VIOS Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAme3Dc0SHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkQToQAKKTNLnFkYGFevxOukoZtU1l4ZOiaavg FlSsCVQrXfOvq5tIbkT7X86F3jYBNElNHvSDAFNQu5ZayEjqJHljSQE17rVVhepX Gvo9Jfk/ERwwd9vD78+KSvGMzSYAiIFoZmgLjaNyiFH53gMku8K41NRQAI2EeUzG G7lK/XYHvp/N7Ikt6ZMTFFW5NH7Wt2oeKZ8DbNBhK9+9lCet4s6PWzmGYrYkGUyy 0iuYh09tKCGrBH0BpHMIGokqttVawjdg4tCJUMXurVoOWus0dKXuIdpKQuCdm8jO lV89xotE7Icw9CJTFQQZNn2C94J3RL4xtqo4fOuO1ztll+ma/nbPp9lJgy2P1WRI DqWJart1YuzC4Ef1NQiNeqRjxKoLDHbuu2A1zdAHMyHxZsKaQWowhhjylQLYsU3N d0WHlo43VQf95wx9cqdsGmEopk5YPLqBS1Wz/uL/m7y/+dmK4nKCcmqhIQ1QG23L SOhuOmHDs3mDkXusxr7NULCwGzjy3rPSExGrM68kiDEUX7cWFvVGmrRHf2bBiv7A gJ6UZ3J4au477adHjj2lBn3n1iIQDSN04gjPYKjunSEaMI2nqr55dN/aAnKFTjtm fMPXMWOHDfkjtX0iV69/345dXyzL5r4JHNwtTwoLx1QRMm6WoznJveHI6qXQ1RGb HdG8y9A9ZfLn =pEjU -----END PGP SIGNATURE----- Merge tag 'net-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Smaller than usual with no fixes from any subtree. Current release - regressions: - core: fix race of rtnl_net_lock(dev_net(dev)) Previous releases - regressions: - core: remove the single page frag cache for good - flow_dissector: fix handling of mixed port and port-range keys - sched: cls_api: fix error handling causing NULL dereference - tcp: - adjust rcvq_space after updating scaling ratio - drop secpath at the same time as we currently drop dst - eth: gtp: suppress list corruption splat in gtp_net_exit_batch_rtnl(). Previous releases - always broken: - vsock: - fix variables initialization during resuming - for connectible sockets allow only connected - eth: - geneve: fix use-after-free in geneve_find_dev() - ibmvnic: don't reference skb after sending to VIOS" * tag 'net-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) Revert "net: skb: introduce and use a single page frag cache" net: allow small head cache usage with large MAX_SKB_FRAGS values nfp: bpf: Add check for nfp_app_ctrl_msg_alloc() tcp: drop secpath at the same time as we currently drop dst net: axienet: Set mac_managed_pm arp: switch to dev_getbyhwaddr() in arp_req_set_public() net: Add non-RCU dev_getbyhwaddr() helper sctp: Fix undefined behavior in left shift operation selftests/bpf: Add a specific dst port matching flow_dissector: Fix port range key handling in BPF conversion selftests/net/forwarding: Add a test case for tc-flower of mixed port and port-range flow_dissector: Fix handling of mixed port and port-range keys geneve: Suppress list corruption splat in geneve_destroy_tunnels(). gtp: Suppress list corruption splat in gtp_net_exit_batch_rtnl(). dev: Use rtnl_net_dev_lock() in unregister_netdev(). net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). net: Add net_passive_inc() and net_passive_dec(). net: pse-pd: pd692x0: Fix power limit retrieval MAINTAINERS: trim the GVE entry gve: set xdp redirect target only when it is available ...	2025-02-20 10:19:54 -08:00
Christian Brauner	540dcf0f44	selftests/nsfs: add ioctl validation tests Add simple tests to validate that non-nsfs ioctls are rejected. Link: https://lore.kernel.org/r/20250219-work-nsfs-v1-2-21128d73c5e8@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-20 09:13:52 +01:00
Jakub Kicinski	0d0f4174f6	selftests: drv-net: add a simple TSO test Add a simple test for TSO. Send a few MB of data and check device stats to verify that the device was performing segmentation. Do the same thing over a few tunnel types. Injecting GSO packets directly would give us more ability to test corner cases, but perhaps starting simple is good enough? # ./ksft-net-drv/drivers/net/hw/tso.py # Detected qstat for LSO wire-packets KTAP version 1 1..14 ok 1 tso.ipv4 # SKIP Test requires IPv4 connectivity ok 2 tso.vxlan4_ipv4 # SKIP Test requires IPv4 connectivity ok 3 tso.vxlan6_ipv4 # SKIP Test requires IPv4 connectivity ok 4 tso.vxlan_csum4_ipv4 # SKIP Test requires IPv4 connectivity ok 5 tso.vxlan_csum6_ipv4 # SKIP Test requires IPv4 connectivity ok 6 tso.gre4_ipv4 # SKIP Test requires IPv4 connectivity ok 7 tso.gre6_ipv4 # SKIP Test requires IPv4 connectivity ok 8 tso.ipv6 ok 9 tso.vxlan4_ipv6 ok 10 tso.vxlan6_ipv6 ok 11 tso.vxlan_csum4_ipv6 ok 12 tso.vxlan_csum6_ipv6 # Testing with mangleid enabled ok 13 tso.gre4_ipv6 ok 14 tso.gre6_ipv6 # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:7 error:0 Note that the test currently depends on the driver reporting the LSO count via qstat, which appears to be relatively rare (virtio, cisco/enic, sfc/efc; but virtio needs host support). Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250218225426.77726-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 19:08:50 -08:00
Jakub Kicinski	de94e86974	selftests: drv-net: store addresses in dict indexed by ipver Looks like more and more tests want to iterate over IP version, run the same test over ipv4 and ipv6. The current naming of members in the env class makes it a bit awkward, we have separate members for ipv4 and ipv6 parameters. Store the parameters inside dicts, so that tests can easily index them with ip version. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250218225426.77726-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 19:08:50 -08:00
Jakub Kicinski	2aefca8e1f	selftests: drv-net: get detailed interface info We already record output of ip link for NETIF in env for easy access. Record the detailed version. TSO test will want to know the max tso size. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250218225426.77726-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 19:08:50 -08:00
Jakub Kicinski	2217bcb491	selftests: drv-net: resolve remote interface name Find out and record in env the name of the interface which remote host will use for the IP address provided via config. Interface name is useful for mausezahn and for setting up tunnels. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250218225426.77726-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 19:08:49 -08:00
Suchit	23dcacff2d	selftests: net: Fix minor typos in MPTCP and psock tests Fixes minor spelling errors: - `simult_flows.sh`: "al testcases" -> "all testcases" - `psock_tpacket.c`: "accross" -> "across" Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250218165923.20740-1-suchitkarunakaran@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 19:02:48 -08:00
Cong Wang	15de6ba95d	selftests/bpf: Add a specific dst port matching After this patch: #102/1 flow_dissector_classification/ipv4:OK #102/2 flow_dissector_classification/ipv4_continue_dissect:OK #102/3 flow_dissector_classification/ipip:OK #102/4 flow_dissector_classification/gre:OK #102/5 flow_dissector_classification/port_range:OK #102/6 flow_dissector_classification/ipv6:OK #102 flow_dissector_classification:OK Summary: 1/6 PASSED, 0 SKIPPED, 0 FAILED Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Link: https://patch.msgid.link/20250218043210.732959-5-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 18:54:59 -08:00
Cong Wang	dfc1580f96	selftests/net/forwarding: Add a test case for tc-flower of mixed port and port-range After this patch: # ./tc_flower_port_range.sh TEST: Port range matching - IPv4 UDP [ OK ] TEST: Port range matching - IPv4 TCP [ OK ] TEST: Port range matching - IPv6 UDP [ OK ] TEST: Port range matching - IPv6 TCP [ OK ] TEST: Port range matching - IPv4 UDP Drop [ OK ] Cc: Qiang Zhang <dtzq01@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250218043210.732959-3-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 18:54:59 -08:00
Ido Schimmel	f5d783c088	selftests: fib_rule_tests: Add port mask match tests Add tests for FIB rules that match on source and destination ports with a mask. Test both good and bad flows. # ./fib_rule_tests.sh IPv6 FIB rule tests [...] TEST: rule6 check: sport and dport redirect to table [ OK ] TEST: rule6 check: sport and dport no redirect to table [ OK ] TEST: rule6 del by pref: sport and dport redirect to table [ OK ] TEST: rule6 check: sport and dport range redirect to table [ OK ] TEST: rule6 check: sport and dport range no redirect to table [ OK ] TEST: rule6 del by pref: sport and dport range redirect to table [ OK ] TEST: rule6 check: sport and dport masked redirect to table [ OK ] TEST: rule6 check: sport and dport masked no redirect to table [ OK ] TEST: rule6 del by pref: sport and dport masked redirect to table [ OK ] [...] Tests passed: 292 Tests failed: 0 Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250217134109.311176-9-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 18:43:38 -08:00
Ido Schimmel	94694aa641	selftests: fib_rule_tests: Add port range match tests Currently, only matching on specific ports is tested. Add port range testing to make sure this use case does not regress. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250217134109.311176-8-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-19 18:43:38 -08:00
Jordan Rome	7042882abc	selftests/bpf: Add tests for bpf_copy_from_user_task_str This adds tests for both the happy path and the error path (with and without the BPF_F_PAD_ZEROS flag). Signed-off-by: Jordan Rome <linux@jordanrome.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250213152125.1837400-3-linux@jordanrome.com	2025-02-19 17:01:36 -08:00
Alexis Lothoré (eBPF Foundation)	ac13c50872	selftests/bpf: Enable kprobe_multi tests for ARM64 The kprobe_multi feature was disabled on ARM64 due to the lack of fprobe support. The fprobe rewrite on function_graph has been recently merged and thus brought support for fprobes on arm64. This then enables kprobe_multi support on arm64, and so the corresponding tests can now be run on this architecture. Remove the tests depending on kprobe_multi from DENYLIST.aarch64 to allow those to run in CI. CONFIG_FPROBE is already correctly set in tools/testing/selftests/bpf/config Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250219-enable_kprobe_multi_tests-v1-1-faeec99240c8@bootlin.com	2025-02-19 14:57:05 -08:00
Jason Xing	7a93ba8048	selftests/bpf: Add rto max for bpf_setsockopt test Test the TCP_RTO_MAX_MS optname in the existing setget_sockopt test. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250219081333.56378-3-kerneljasonxing@gmail.com	2025-02-19 12:30:52 -08:00
Bastien Curutchet (eBPF Foundation)	157feaaf18	selftests/bpf: ns_current_pid_tgid: Use test_progs's ns_ feature Two subtests use the test_in_netns() function to run the test in a dedicated network namespace. This can now be done directly through the test_progs framework with a test name starting with 'ns_'. Replace the use of test_in_netns() by test_ns_* calls. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250219-b4-tc_links-v2-4-14504db136b7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-19 09:46:02 -08:00
Bastien Curutchet (eBPF Foundation)	207cd7578a	selftests/bpf: tc_links/tc_opts: Unserialize tests Tests are serialized because they all use the loopback interface. Replace the 'serial_test_' prefixes with 'test_ns_' to benefit from the new test_prog feature which creates a dedicated namespace for each test, allowing them to run in parallel. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250219-b4-tc_links-v2-3-14504db136b7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-19 09:46:02 -08:00
Bastien Curutchet (eBPF Foundation)	c047e0e0e4	selftests/bpf: Optionally open a dedicated namespace to run test in it Some tests are serialized to prevent interference with others. Open a dedicated network namespace when a test name starts with 'ns_' to allow more test parallelization. Use the test name as namespace name to avoid conflict between namespaces. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250219-b4-tc_links-v2-2-14504db136b7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-19 09:46:02 -08:00
Bastien Curutchet (eBPF Foundation)	4a06c5251a	selftests/bpf: ns_current_pid_tgid: Rename the test function Next patch will add a new feature to test_prog to run tests in a dedicated namespace if the test name starts with 'ns_'. Here the test name already starts with 'ns_' and creates some namespaces which would conflict with the new feature. Rename the test to avoid this conflict. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250219-b4-tc_links-v2-1-14504db136b7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-19 09:46:02 -08:00
Christian Brauner	a1579f6bf6	selftests/ovl: add third selftest for "override_creds" Add a simple test to verify that the new "override_creds" option works. Link: https://lore.kernel.org/r/20250219-work-overlayfs-v3-4-46af55e4ceda@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-19 14:32:12 +01:00
Christian Brauner	6e5ed6587e	selftests/ovl: add second selftest for "override_creds" Add a simple test to verify that the new "override_creds" option works. Link: https://lore.kernel.org/r/20250219-work-overlayfs-v3-3-46af55e4ceda@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-19 14:32:12 +01:00
Christian Brauner	c68946ee7e	selftests/filesystems: add utils.{c,h} Add a new set of helpers that will be used in follow-up patches. Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-19 14:32:12 +01:00
Christian Brauner	96f0943259	selftests/ovl: add first selftest for "override_creds" Add a simple test to verify that the new "override_creds" option works. Link: https://lore.kernel.org/r/20250219-work-overlayfs-v3-2-46af55e4ceda@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-19 14:32:12 +01:00
Eduard Zingerman	6361cd26e4	selftests/bpf: check states pruning for deeply nested iterator A test case with ridiculously deep bpf_for() nesting and a conditional update of a stack location. Consider the innermost loop structure: 1: bpf_for(o, 0, 10) 2: if (unlikely(bpf_get_prandom_u32())) 3: buf[0] = 42; 4: <exit> Assuming that verifier.c:clean_live_states() operates w/o change from the previous patch (e.g. as on current master) verification would proceed as follows: - at (1) state {buf[0]=?,o=drained}: - checkpoint - push visit to (2) for later - at (4) {buf[0]=?,o=drained} - pop (2) {buf[0]=?,o=active}, push visit to (3) for later - at (1) {buf[0]=?,o=active} - checkpoint - push visit to (2) for later - at (4) {buf[0]=?,o=drained} - pop (2) {buf[0]=?,o=active}, push visit to (3) for later - at (1) {buf[0]=?,o=active}: - checkpoint reached, checkpoint's branch count becomes 0 - checkpoint is processed by clean_live_states() and becomes {o=active} - pop (3) {buf[0]=42,o=active} - at (1), {buf[0]=42,o=active} - checkpoint - push visit to (2) for later - at (4) {buf[0]=42,o=drained} - pop (2) {buf[0]=42,o=active}, push visit to (3) for later - at (1) {buf[0]=42,o=active}, checkpoint reached - pop (3) {buf[0]=42,o=active} - at (1) {buf[0]=42,o=active}: - checkpoint reached, checkpoint's branch count becomes 0 - checkpoint is processed by clean_live_states() and becomes {o=active} - ... Note how clean_live_states() converted the checkpoint {buf[0]=42,o=active} to {o=active} and it can no longer be matched against {buf[0]=<any>,o=active}, because iterator based states are compared using stacksafe(... RANGE_WITHIN), that requires stack slots to have same types. At the same time there are still states {buf[0]=42,o=active} pushed to DFS stack. This behaviour becomes exacerbated with multiple nesting levels, here are veristat results: - nesting level 1: 69 insns - nesting level 2: 258 insns - nesting level 3: 900 insns - nesting level 4: 4754 insns - nesting level 5: 35944 insns - nesting level 6: 312558 insns - nesting level 7: 1M limit Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250215110411.3236773-5-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-18 19:22:59 -08:00
Eduard Zingerman	6da35da1a3	selftests/bpf: test correct loop_entry update in copy_verifier_state A somewhat cumbersome test case sensitive to correct copying of bpf_verifier_state->loop_entry fields in verifier.c:copy_verifier_state(). W/o the fix from a previous commit the program is accepted as safe. 1: /* poison block / 2: if (random() != 24) { // assume false branch is placed first 3: i = iter_new(); 4: while (iter_next(i)); 5: iter_destroy(i); 6: return; 7: } 8: 9: / dfs_depth block / 10: for (i = 10; i > 0; i--); 11: 12: / main block / 13: i = iter_new(); // fp[-16] 14: b = -24; // r8 15: for (;;) { 16: if (iter_next(i)) 17: break; 18: if (random() == 77) { // assume false branch is placed first 19: (u64 )(r10 + b) = 7; // this is not safe when b == -25 20: iter_destroy(i); 21: return; 22: } 23: if (random() == 42) { // assume false branch is placed first 24: b = -25; 25: } 26: } 27: iter_destroy(i); The goal of this example is to: (a) poison env->cur_state->loop_entry with a state S, such that S->branches == 0; (b) set state S as a loop_entry for all checkpoints in / main block /, thus forcing NOT_EXACT states comparisons; (c) exploit incorrect loop_entry set for checkpoint at line 18 by first creating a checkpoint with b == -24 and then pruning the state with b == -25 using that checkpoint. The / poison block / is responsible for goal (a). It forces verifier to first validate some unrelated iterator based loop, which leads to an update_loop_entry() call in is_state_visited(), which places checkpoint created at line 4 as env->cur_state->loop_entry. Starting from line 8, the branch count for that checkpoint is 0. The / dfs_depth block / is responsible for goal (b). It abuses the fact that update_loop_entry(cur, hdr) only updates cur->loop_entry when hdr->dfs_depth <= cur->dfs_depth. After line 12 every state has dfs_depth bigger then dfs_depth of poisoned env->cur_state->loop_entry. Thus the above condition is never true for lines 12-27. The / main block */ is responsible for goal (c). Verification proceeds as follows: - checkpoint {b=-24,i=active} created at line 16; - jump 18->23 is verified first, jump to 19 pushed to stack; - jump 23->26 is verified first, jump to 24 pushed to stack; - checkpoint {b=-24,i=active} created at line 15; - current state is pruned by checkpoint created at line 16, this sets branches count for checkpoint at line 15 to 0; - jump to 24 is popped from stack; - line 16 is reached in state {b=-25,i=active}; - this is pruned by a previous checkpoint {b=-24,i=active}: - checkpoint's loop_entry is poisoned and has branch count of 0, hence states are compared using NOT_EXACT rules; - b is not marked precise yet. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250215110411.3236773-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-18 19:22:58 -08:00
Chandra Mohan Sundar	f29e41454b	selftests: net: Fix few spelling mistakes Fix few spelling mistakes in net selftests Signed-off-by: Chandra Mohan Sundar <chandru.dav@gmail.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250217141520.81033-1-chandru.dav@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-18 18:10:31 -08:00
Yan Zhai	d66b773917	selftests: bpf: test batch lookup on array of maps with holes Iterating through array of maps may encounter non existing keys. The batch operation should not fail on when this happens. Signed-off-by: Yan Zhai <yan@cloudflare.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/9007237b9606dc2ee44465a4447fe46e13f3bea6.1739171594.git.yan@cloudflare.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-18 17:27:37 -08:00
Bastien Curutchet (eBPF Foundation)	e06f5bfd93	selftests/bpf: Remove test_xdp_redirect_multi.sh The tests done by test_xdp_redirect_multi.sh are now fully covered by the CI through test_xdp_veth.c. Remove test_xdp_redirect_multi.sh Remove xdp_redirect_multi.c that was used by the script to load and attach the BPF programs. Remove their entries in the Makefile Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250212-redirect-multi-v5-6-fd0d39fca6e6@bootlin.com	2025-02-18 13:56:34 -08:00
Bastien Curutchet (eBPF Foundation)	a93bfd824d	selftests/bpf: test_xdp_veth: Add XDP program on egress test XDP programs loaded on egress is tested by test_xdp_redirect_multi.sh but not by the test_progs framework. Add a test case in test_xdp_veth.c to test the XDP program on egress. Use the same BPF program than test_xdp_redirect_multi.sh that replaces the source MAC address by one provided through a BPF map. Use a BPF program that stores the source MAC of received packets in a map to check the test results. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250212-redirect-multi-v5-5-fd0d39fca6e6@bootlin.com	2025-02-18 13:56:34 -08:00
Bastien Curutchet (eBPF Foundation)	1e7e634542	selftests/bpf: test_xdp_veth: Add XDP broadcast redirection tests XDP redirections with BPF_F_BROADCAST and BPF_F_EXCLUDE_INGRESS flags are tested by test_xdp_redirect_multi.sh but not within the test_progs framework. Add a broadcast test case in test_xdp_veth.c to test them. Use the same BPF programs than the one used by test_xdp_redirect_multi.sh. Use a BPF map to select the broadcast flags. Use a BPF map with an entry per veth to check whether packets are received or not Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250212-redirect-multi-v5-4-fd0d39fca6e6@bootlin.com	2025-02-18 13:56:34 -08:00
Bastien Curutchet (eBPF Foundation)	09c8bb1fae	selftests/bpf: Optionally select broadcasting flags Broadcasting flags are hardcoded for each kind for protocol. Create a redirect_flags map that allows to select the broadcasting flags to use in the bpf_redirect_map(). The protocol ID is used as a key. Set the old hardcoded values as default if the map isn't filled by the BPF caller. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250212-redirect-multi-v5-3-fd0d39fca6e6@bootlin.com	2025-02-18 13:56:34 -08:00
Bastien Curutchet (eBPF Foundation)	19a9484c1b	selftests/bpf: test_xdp_veth: Use a dedicated namespace Tests use the root network namespace, so they aren't fully independent of each other. For instance, the index of the created veth interfaces is incremented every time a new test is launched. Wrap the network topology in a network namespace to ensure full isolation. Use the append_tid() helper to ensure the uniqueness of this namespace's name during parallel runs. Remove the use of the append_tid() on the veth names as they now belong to an already unique namespace. Simplify cleanup_network() by directly deleting the namespaces Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250212-redirect-multi-v5-2-fd0d39fca6e6@bootlin.com	2025-02-18 13:56:34 -08:00
Bastien Curutchet (eBPF Foundation)	6bdac0e317	selftests/bpf: test_xdp_veth: Create struct net_configuration The network configuration is defined by a table of struct veth_configuration. This isn't convenient if we want to add a network configuration that isn't linked to a veth pair. Create a struct net_configuration that holds the veth_configuration table to ease adding new configuration attributes in upcoming patch. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250212-redirect-multi-v5-1-fd0d39fca6e6@bootlin.com	2025-02-18 13:56:34 -08:00
Brendan Jackman	43ebec94e1	kunit: tool: Build GDB scripts Following a similar rationale as commit `e4835f1da4` ("kunit: tool: Build compile_commands.json"), make a common developer tool available by default for KUnit users. Compared to compile_commands.json, there is a little more work to be done to build the GDB scripts. Is it enough to affect development cycle duration? Unscientific evaluation: rm -rf .kunit; time tools/testing/kunit/kunit.py build --kunitconfig ./lib/kunit/.kunitconfig --jobs 96 Without this patch it took 14.77s, with this patch it took 14.83. So, although `make scripts_gdb` is pretty slow, presumably most of that is just the overhead of running Kbuild at all, actually building the scripts is approximately free. Note also, to actually get the GDB scripts the user needs to enable CONFIG_SCRIPTS_GDB, but building the scripts_gdb target without that is still harmless. Link: https://lore.kernel.org/r/20250121-kunit-gdb-v1-1-faedfd0653ef@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-18 14:28:01 -07:00
Charlie Jenkins	42367eca76	tools: Remove redundant quiet setup Q is exported from Makefile.include so it is not necessary to manually set it. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <qmo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Benjamin Tissoires <bentiss@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: KP Singh <kpsingh@kernel.org> Cc: Lukasz Luba <lukasz.luba@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Mykola Lysenko <mykolal@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20250213-quiet_tools-v3-2-07de4482a581@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-02-18 16:27:43 -03:00
Petr Machata	eae1e92a1d	selftests: test_vxlan_fdb_changelink: Add a test for MC remote change Changes to MC remote need to be reflected in actual group memberships. Add a test to verify that it is the case. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-18 13:06:44 +01:00
Petr Machata	24adf47ea9	selftests: test_vxlan_fdb_changelink: Convert to lib.sh Instead of inlining equivalents, use lib.sh-provided primitives. Use defer to manage vx lifetime. This will make it easier to extend the test in the next patch. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-18 13:06:44 +01:00
Petr Machata	f802f172d7	selftests: forwarding: lib: Move require_command to net, generalize This helper could be useful to more than just forwarding tests. Move it upstairs and port over to log_test_skip(). Split the function into two parts: the bit that actually checks and reports skip, which is in a new function check_command(). And a bit that exits the test script if the check fails. This allows users consistent checking behavior while giving an option to bail out from a single test without bailing out of the whole script. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-18 13:06:43 +01:00
Michal Luczaj	85928e9c43	selftest/bpf: Add vsock test for sockmap rejecting unconnected Verify that for a connectible AF_VSOCK socket, merely having a transport assigned is insufficient; socket must be connected for the sockmap to accept. This does not test datagram vsocks. Even though it hardly matters. VMCI is the only transport that features VSOCK_TRANSPORT_F_DGRAM, but it has an unimplemented vsock_transport::readskb() callback, making it unsupported by BPF/sockmap. Signed-off-by: Michal Luczaj <mhal@rbox.co> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-18 12:00:01 +01:00
Michal Luczaj	8350695bfb	selftest/bpf: Adapt vsock_delete_on_close to sockmap rejecting unconnected Commit `515745445e` ("selftest/bpf: Add test for vsock removal from sockmap on close()") added test that checked if proto::close() callback was invoked on AF_VSOCK socket release. I.e. it verified that a close()d vsock does indeed get removed from the sockmap. It was done simply by creating a socket pair and attempting to replace a close()d one with its peer. Since, due to a recent change, sockmap does not allow updating index with a non-established connectible vsock, redo it with a freshly established one. Signed-off-by: Michal Luczaj <mhal@rbox.co> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-02-18 12:00:01 +01:00
Mark Brown	5dcf52e2ce	selftests/mm: fix check for running THP tests When testing if we should try to compact memory or drop caches before we run the THP or HugeTLB tests we use \| as an or operator. This doesn't work since run_vmtests.sh is written in shell where this is used to pipe the output of the first argument into the second. Instead use the shell's -o operator. Link: https://lkml.kernel.org/r/20250212-kselftest-mm-no-hugepages-v1-1-44702f538522@kernel.org Fixes: `b433ffa8db` ("selftests: mm: perform some system cleanup before using hugepages") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Nico Pache <npache@redhat.com> Cc: Mariano Pache <npache@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-02-17 22:40:04 -08:00
Amery Hung	af17bad9fb	selftests/bpf: Test returning referenced kptr from struct_ops programs Test struct_ops programs returning referenced kptr. When the return type of a struct_ops operator is pointer to struct, the verifier should only allow programs that return a scalar NULL or a non-local kptr with the correct type in its unmodified form. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250217190640.1748177-6-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-17 18:47:27 -08:00
Amery Hung	6991ec6beb	selftests/bpf: Test referenced kptr arguments of struct_ops programs Test referenced kptr acquired through struct_ops argument tagged with "__ref". The success case checks whether 1) a reference to the correct type is acquired, and 2) the referenced kptr argument can be accessed in multiple paths as long as it hasn't been released. In the fail cases, we first confirm that a referenced kptr acquried through a struct_ops argument is not allowed to be leaked. Then, we make sure this new referenced kptr acquiring mechanism does not accidentally allow referenced kptrs to flow into global subprograms through their arguments. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250217190640.1748177-4-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-17 18:47:27 -08:00
Joe Damato	788e52e2b6	selftests: drv-net: Test queue xsk attribute Test that queues which are used for AF_XDP have the xsk nest attribute. The attribute is currently empty, but its existence means the AF_XDP is being used for the queue. Enable CONFIG_XDP_SOCKETS for selftests/drivers/net tests, as well. Signed-off-by: Joe Damato <jdamato@fastly.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250214211255.14194-4-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-17 16:46:03 -08:00
Anna Emese Nyiri	c935af429e	selftests: net: add support for testing SO_RCVMARK and SO_RCVPRIORITY Introduce tests to verify the correct functionality of the SO_RCVMARK and SO_RCVPRIORITY socket options. Suggested-by: Jakub Kicinski <kuba@kernel.org> Suggested-by: Ferenc Fejes <fejes@inf.elte.hu> Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250214205828.48503-1-annaemesenyiri@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-17 16:45:19 -08:00
Pranav Tyagi	dbcbec81c9	selftests: net: fix grammar in reuseaddr_ports_exhausted.c log message This patch fixes a grammatical error in a test log message in reuseaddr_ports_exhausted.c for better clarity. Signed-off-by: Pranav Tyagi <pranav.tyagi03@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250213152612.4434-1-pranav.tyagi03@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-02-17 16:34:36 -08:00
Yury Khrustalev	00894c3fc9	selftests/powerpc: Use PKEY_UNRESTRICTED macro Replace literal 0 with macro PKEY_UNRESTRICTED where pkey_*() functions are used in mm selftests for memory protection keys for ppc target. Signed-off-by: Yury Khrustalev <yury.khrustalev@arm.com> Suggested-by: Kevin Brodsky <kevin.brodsky@arm.com> Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com> Link: https://lore.kernel.org/r/20250113170619.484698-4-yury.khrustalev@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-02-17 18:16:36 +00:00
Yury Khrustalev	3809cefe93	selftests/mm: Use PKEY_UNRESTRICTED macro Replace literal 0 with macro PKEY_UNRESTRICTED where pkey_*() functions are used in mm selftests for memory protection keys. Signed-off-by: Yury Khrustalev <yury.khrustalev@arm.com> Suggested-by: Joey Gouly <joey.gouly@arm.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250113170619.484698-3-yury.khrustalev@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2025-02-17 18:16:36 +00:00
David Wei	71082faa2c	io_uring/zcrx: add selftest Add a selftest for io_uring zero copy Rx. This test cannot run locally and requires a remote host to be configured in net.config. The remote host must have hardware support for zero copy Rx as listed in the documentation page. The test will restore the NIC config back to before the test and is idempotent. liburing is required to compile the test and be installed on the remote host running the test. Signed-off-by: David Wei <dw@davidwei.uk> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20250215000947.789731-12-dw@davidwei.uk Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-02-17 05:41:09 -07:00
Jens Axboe	5c496ff11d	Merge commit '71f0dd5a3293d75d26d405ffbaedfdda4836af32' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next into for-6.15/io_uring-rx-zc Merge networking zerocopy receive tree, to get the prep patches for the io_uring rx zc support. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (63 commits) net: add helpers for setting a memory provider on an rx queue net: page_pool: add memory provider helpers net: prepare for non devmem TCP memory providers net: page_pool: add a mp hook to unregister_netdevice* net: page_pool: add callback for mp info printing netdev: add io_uring memory provider info net: page_pool: create hooks for custom memory providers net: generalise net_iov chunk owners net: prefix devmem specific helpers net: page_pool: don't cast mp param to devmem tools: ynl: add all headers to makefile deps eth: fbnic: set IFF_UNICAST_FLT to avoid enabling promiscuous mode when adding unicast addrs eth: fbnic: add MAC address TCAM to debugfs tools: ynl-gen: support limits using definitions tools: ynl-gen: don't output external constants net/mlx5e: Avoid WARN_ON when configuring MQPRIO with HTB offload enabled net/mlx5e: Remove unused mlx5e_tc_flow_action struct net/mlx5: Remove stray semicolon in LAG port selection table creation net/mlx5e: Support FEC settings for 200G per lane link modes net/mlx5: Add support for 200Gbps per lane link modes ...	2025-02-17 05:38:28 -07:00
Greg Kroah-Hartman	2ce177e9b3	Merge 6.14-rc3 into driver-core-next We need the faux_device changes in here for future work. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-02-17 07:24:33 +01:00
Linus Torvalds	82ff316456	ARM: - Large set of fixes for vector handling, specially in the interactions between host and guest state. This fixes a number of bugs affecting actual deployments, and greatly simplifies the FP/SIMD/SVE handling. Thanks to Mark Rutland for dealing with this thankless task. - Fix an ugly race between vcpu and vgic creation/init, resulting in unexpected behaviours. - Fix use of kernel VAs at EL2 when emulating timers with nVHE. - Small set of pKVM improvements and cleanups. x86: - Fix broken SNP support with KVM module built-in, ensuring the PSP module is initialized before KVM even when the module infrastructure cannot be used to order initcalls - Reject Hyper-V SEND_IPI hypercalls if the local APIC isn't being emulated by KVM to fix a NULL pointer dereference. - Enter guest mode (L2) from KVM's perspective before initializing the vCPU's nested NPT MMU so that the MMU is properly tagged for L2, not L1. - Load the guest's DR6 outside of the innermost .vcpu_run() loop, as the guest's value may be stale if a VM-Exit is handled in the fastpath. -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmev2ykUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMvxwf/bw2u08moAYWAjJLROFvfiKXnznLS iqJ2+jcw0lJ7wDqm4Zw8M5t74Rd+y5yzkLkZOyjav9yBB09zRkItiTHljCNMOQnt 2QptBa3CUN8N+rNnvVRt6dMkhw7z6n7eoFRSIDY2Y9PgiTapbFXPV1gFkMPO6+0f SyF4LCr0iuDkJdvGAZJAH/Mp8nG6dv/A6a+Q+R1RkbKn9c2OdWw4VMfhIzimFGN6 0RFjbfXXvyO0aU/W/VHwvvuhcjGkAZWfHDdaTXqbvSMhayW562UPVMVBwXdVBmDj Dk1gCKcbm4WyktbXYW6iOYj3MgdK96eI24ozps4R0aDexsrTRY4IfH4KEg== =20Ql -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "ARM: - Large set of fixes for vector handling, especially in the interactions between host and guest state. This fixes a number of bugs affecting actual deployments, and greatly simplifies the FP/SIMD/SVE handling. Thanks to Mark Rutland for dealing with this thankless task. - Fix an ugly race between vcpu and vgic creation/init, resulting in unexpected behaviours - Fix use of kernel VAs at EL2 when emulating timers with nVHE - Small set of pKVM improvements and cleanups x86: - Fix broken SNP support with KVM module built-in, ensuring the PSP module is initialized before KVM even when the module infrastructure cannot be used to order initcalls - Reject Hyper-V SEND_IPI hypercalls if the local APIC isn't being emulated by KVM to fix a NULL pointer dereference - Enter guest mode (L2) from KVM's perspective before initializing the vCPU's nested NPT MMU so that the MMU is properly tagged for L2, not L1 - Load the guest's DR6 outside of the innermost .vcpu_run() loop, as the guest's value may be stale if a VM-Exit is handled in the fastpath" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits) x86/sev: Fix broken SNP support with KVM module built-in KVM: SVM: Ensure PSP module is initialized if KVM module is built-in crypto: ccp: Add external API interface for PSP module initialization KVM: arm64: vgic: Hoist SGI/PPI alloc from vgic_init() to kvm_create_vgic() KVM: arm64: timer: Drop warning on failed interrupt signalling KVM: arm64: Fix alignment of kvm_hyp_memcache allocations KVM: arm64: Convert timer offset VA when accessed in HYP code KVM: arm64: Simplify warning in kvm_arch_vcpu_load_fp() KVM: arm64: Eagerly switch ZCR_EL{1,2} KVM: arm64: Mark some header functions as inline KVM: arm64: Refactor exit handlers KVM: arm64: Refactor CPTR trap deactivation KVM: arm64: Remove VHE host restore of CPACR_EL1.SMEN KVM: arm64: Remove VHE host restore of CPACR_EL1.ZEN KVM: arm64: Remove host FPSIMD saving for non-protected KVM KVM: arm64: Unconditionally save+flush host FPSIMD/SVE/SME state KVM: x86: Load DR6 with guest value only before entering .vcpu_run() loop KVM: nSVM: Enter guest mode before initializing nested NPT MMU KVM: selftests: Add CPUID tests for Hyper-V features that need in-kernel APIC KVM: selftests: Manage CPUID array in Hyper-V CPUID test's core helper ...	2025-02-16 10:25:12 -08:00
Thomas Weißschuh	e275f44e0a	kunit: qemu_configs: sparc: use Zilog console The driver for the 8250 console is not used, as no port is found. Instead the prom0 bootconsole is used the whole time. The prom driver translates '\n' to '\r\n' before handing of the message off to the firmware. The firmware performs the same translation again. In the final output produced by QEMU each line ends with '\r\r\n'. This breaks the kunit parser, which can only handle '\r\n' and '\n'. Use the Zilog console instead. It works correctly, is the one documented by the QEMU manual and also saves a bit of codesize: Before=4051011, After=4023326, chg -0.68% Observed on QEMU 9.2.0. Link: https://lore.kernel.org/r/20250214-kunit-qemu-sparc-console-v1-1-ba1dfdf8f0b1@linutronix.de Fixes: `87c9c16317` ("kunit: tool: add support for QEMU") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-15 19:06:56 -07:00
Brendan Jackman	08fafac4c9	kunit: tool: Use qboot on QEMU x86_64 As noted in [0], SeaBIOS (QEMU default) makes a mess of the terminal, qboot does not. It turns out this is actually useful with kunit.py, since the user is exposed to this issue if they set --raw_output=all. qboot is also faster than SeaBIOS, but it's is marginal for this usecase. [0] https://lore.kernel.org/all/CA+i-1C0wYb-gZ8Mwh3WSVpbk-LF-Uo+njVbASJPe1WXDURoV7A@mail.gmail.com/ Both SeaBIOS and qboot are x86-specific. Link: https://lore.kernel.org/r/20250124-kunit-qboot-v1-1-815e4d4c6f7c@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-02-15 19:06:41 -07:00
Sebastian Andrzej Siewior	633488947e	kernfs: Use RCU to access kernfs_node::parent. kernfs_rename_lock is used to obtain stable kernfs_node::{name\|parent} pointer. This is a preparation to access kernfs_node::parent under RCU and ensure that the pointer remains stable under the RCU lifetime guarantees. For a complete path, as it is done in kernfs_path_from_node(), the kernfs_rename_lock is still required in order to obtain a stable parent relationship while computing the relevant node depth. This must not change while the nodes are inspected in order to build the path. If the kernfs user never moves the nodes (changes the parent) then the kernfs_rename_lock is not required and the RCU guarantees are sufficient. This "restriction" can be set with KERNFS_ROOT_INVARIANT_PARENT. Otherwise the lock is required. Rename kernfs_node::parent to kernfs_node::__parent to denote the RCU access and use RCU accessor while accessing the node. Make cgroup use KERNFS_ROOT_INVARIANT_PARENT since the parent here can not change. Acked-by: Tejun Heo <tj@kernel.org> Cc: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20250213145023.2820193-6-bigeasy@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-02-15 17:46:32 +01:00
Andrii Nakryiko	4eb93fea59	selftests/bpf: add test for LDX/STX/ST relocations over array field Add a simple repro for the issue of miscalculating LDX/STX/ST CO-RE relocation size adjustment when the CO-RE relocation target type is an ARRAY. We need to make sure that compiler generates LDX/STX/ST instruction with CO-RE relocation against entire ARRAY type, not ARRAY's element. With the code pattern in selftest, we get this: 59: 61 71 00 00 00 00 00 00 w1 = (u32 )(r7 + 0x0) 00000000000001d8: CO-RE <byte_off> [5] struct core_reloc_arrays::a (0:0) Where offset of `int a[5]` is embedded (through CO-RE relocation) into memory load instruction itself. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250207014809.1573841-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-14 19:58:14 -08:00
Jiayuan Chen	72266ee83f	selftests/bpf: Add selftest for may_goto Added test cases to ensure that programs with stack sizes exceeding 512 bytes are restricted in non-JITed mode, and can be executed normally in JITed mode, even with stack sizes exceeding 512 bytes due to the presence of may_goto instructions. Test result: echo "0" > /proc/sys/net/core/bpf_jit_enable ./test_progs -t verifier_stack_ptr ... stack size 512 with may_goto with jit:SKIP stack size 512 with may_goto without jit:OK ... Summary: 1/27 PASSED, 25 SKIPPED, 0 FAILED echo "1" > /proc/sys/net/core/bpf_jit_enable ./test_progs -t verifier_stack_ptr ... stack size 512 with may_goto with jit:OK stack size 512 with may_goto without jit:SKIP ... Summary: 1/27 PASSED, 25 SKIPPED, 0 FAILED Signed-off-by: Jiayuan Chen <mrpre@163.com> Link: https://lore.kernel.org/r/20250214091823.46042-4-mrpre@163.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-14 19:55:15 -08:00
Jiayuan Chen	b38c72ab80	selftests/bpf: Introduce __load_if_JITed annotation for tests In some cases, the verification logic under the interpreter and JIT differs, such as may_goto, and the test program behaves differently under different runtime modes, requiring separate verification logic for each result. Introduce __load_if_JITed and __load_if_no_JITed annotation for tests. Signed-off-by: Jiayuan Chen <mrpre@163.com> Link: https://lore.kernel.org/r/20250214091823.46042-3-mrpre@163.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-02-14 19:55:15 -08:00
Stafford Horne	713e788c0e	rseq/selftests: Fix riscv rseq_offset_deref_addv inline asm When working on OpenRISC support for restartable sequences I noticed and fixed these two issues with the riscv support bits. 1 The 'inc' argument to RSEQ_ASM_OP_R_DEREF_ADDV was being implicitly passed to the macro. Fix this by adding 'inc' to the list of macro arguments. 2 The inline asm input constraints for 'inc' and 'off' use "er", The riscv gcc port does not have an "e" constraint, this looks to be copied from the x86 port. Fix this by just using an "r" constraint. I have compile tested this only for riscv. However, the same fixes I use in the OpenRISC rseq selftests and everything passes with no issues. Fixes: `171586a6ab` ("selftests/rseq: riscv: Template memory ordering and percpu access mode") Signed-off-by: Stafford Horne <shorne@gmail.com> Tested-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250114170721.3613280-1-shorne@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-02-14 13:06:36 -08:00
Linus Torvalds	04f41cbf03	sched_ext: Fixes for v6.14-rc2 - Fix lock imbalance in a corner case of dispatch_to_local_dsq(). - Migration disabled tasks were confusing some BPF schedulers and its handling had a bug. Fix it and simplify the default behavior by dispatching them automatically. - ops.tick(), ops.disable() and ops.exit_task() were incorrectly disallowing kfuncs that require the task argument to be the rq operation is currently operating on and thus is rq-locked. Allow them. - Fix autogroup migration handling bug which was occasionally triggering a warning in the cgroup migration path. - tools/sched_ext, selftest and other misc updates. -----BEGIN PGP SIGNATURE----- iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZ695uA4cdGpAa2VybmVs Lm9yZwAKCRCxYfJx3gVYGeCfAQDmUixMNJCIrRphYsWcYUzlGLZyyRpQEEYFtRMO UC266gD+PUV2UvuO5sAVH8HVnGdOqkXaE/IRG+TC7fQH3ruPlgI= =LFd1 -----END PGP SIGNATURE----- Merge tag 'sched_ext-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Fix lock imbalance in a corner case of dispatch_to_local_dsq() - Migration disabled tasks were confusing some BPF schedulers and its handling had a bug. Fix it and simplify the default behavior by dispatching them automatically - ops.tick(), ops.disable() and ops.exit_task() were incorrectly disallowing kfuncs that require the task argument to be the rq operation is currently operating on and thus is rq-locked. Allow them. - Fix autogroup migration handling bug which was occasionally triggering a warning in the cgroup migration path - tools/sched_ext, selftest and other misc updates * tag 'sched_ext-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Use SCX_CALL_OP_TASK in task_tick_scx sched_ext: Fix the incorrect bpf_list kfunc API in common.bpf.h. sched_ext: selftests: Fix grammar in tests description sched_ext: Fix incorrect assumption about migration disabled tasks in task_can_run_on_remote_rq() sched_ext: Fix migration disabled handling in targeted dispatches sched_ext: Implement auto local dispatching of migration disabled tasks sched_ext: Fix incorrect time delta calculation in time_delta() sched_ext: Fix lock imbalance in dispatch_to_local_dsq() sched_ext: selftests/dsp_local_on: Fix selftest on UP systems tools/sched_ext: Add helper to check task migration state sched_ext: Fix incorrect autogroup migration detection sched_ext: selftests/dsp_local_on: Fix sporadic failures selftests/sched_ext: Fix enum resolution sched_ext: Include task weight in the error state dump sched_ext: Fixes typos in comments	2025-02-14 11:14:24 -08:00
Linus Torvalds	80868f5d3d	cgroup: Fixes for v6.14-rc2 - Fix a race window where a newly forked task could escape cgroup.kill. - Remove incorrectly included steal time from cpu.stat::usage_usec. - Minor update in selftest. -----BEGIN PGP SIGNATURE----- iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZ6928A4cdGpAa2VybmVs Lm9yZwAKCRCxYfJx3gVYGYrbAPsEtoH5GFw7VtKIy4fS23QbtxUuwW0fERrPWGyt JtQv3gD5AboBUrGWdgiM5c2bIXT3f+Bn9w3HhLiPaB8ieN/0kA8= =3BBH -----END PGP SIGNATURE----- Merge tag 'cgroup-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - Fix a race window where a newly forked task could escape cgroup.kill - Remove incorrectly included steal time from cpu.stat::usage_usec - Minor update in selftest * tag 'cgroup-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Remove steal time from usage_usec selftests/cgroup: use bash in test_cpuset_v1_hp.sh cgroup: fix race between fork and cgroup.kill	2025-02-14 11:00:42 -08:00
Sean Christopherson	16fc7cb406	KVM: selftests: Add infrastructure for getting vCPU binary stats Now that the binary stats cache infrastructure is largely scope agnostic, add support for vCPU-scoped stats. Like VM stats, open and cache the stats FD when the vCPU is created so that it's guaranteed to be valid when vcpu_get_stats() is invoked. Account for the extra per-vCPU file descriptor in kvm_set_files_rlimit(), so that tests that create large VMs don't run afoul of resource limits. To sanity check that the infrastructure actually works, and to get a bit of bonus coverage, add an assert in x86's xapic_ipi_test to verify that the number of HLTs executed by the test matches the number of HLT exits observed by KVM. Tested-by: Manali Shukla <Manali.Shukla@amd.com> Link: https://lore.kernel.org/r/20250111005049.1247555-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-14 07:02:13 -08:00
Sean Christopherson	9b56532b8a	KVM: selftests: Adjust number of files rlimit for all "standard" VMs Move the max vCPUs test's RLIMIT_NOFILE adjustments to common code, and use the new helper to adjust the resource limit for non-barebones VMs by default. x86's recalc_apic_map_test creates 512 vCPUs, and a future change will open the binary stats fd for all vCPUs, which will put the recalc APIC test above some distros' default limit of 1024. Link: https://lore.kernel.org/r/20250111005049.1247555-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-14 07:02:12 -08:00
Sean Christopherson	ea7179f995	KVM: selftests: Get VM's binary stats FD when opening VM Get and cache a VM's binary stats FD when the VM is opened, as opposed to waiting until the stats are first used. Opening the stats FD outside of __vm_get_stat() will allow converting it to a scope-agnostic helper. Note, this doesn't interfere with kvm_binary_stats_test's testcase that verifies a stats FD can be used after its own VM's FD is closed, as the cached FD is also closed during kvm_vm_free(). Link: https://lore.kernel.org/r/20250111005049.1247555-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-14 07:02:11 -08:00
Sean Christopherson	e65faf71bd	KVM: selftests: Add struct and helpers to wrap binary stats cache Add a struct and helpers to manage the binary stats cache, which is currently used only for VM-scoped stats. This will allow expanding the selftests infrastructure to provide support for vCPU-scoped binary stats, which, except for the ioctl to get the stats FD are identical to VM-scoped stats. Defer converting __vm_get_stat() to a scope-agnostic helper to a future patch, as getting the stats FD from KVM needs to be moved elsewhere before it can be made completely scope-agnostic. Link: https://lore.kernel.org/r/20250111005049.1247555-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-14 07:02:09 -08:00
Sean Christopherson	b0c3f5df92	KVM: selftests: Macrofy vm_get_stat() to auto-generate stat name string Turn vm_get_stat() into a macro that generates a string for the stat name, as opposed to taking a string. This will allow hardening stat usage in the future to generate errors on unknown stats at compile time. No functional change intended. Link: https://lore.kernel.org/r/20250111005049.1247555-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-14 07:01:55 -08:00
Sean Christopherson	eead13d493	KVM: selftests: Assert that __vm_get_stat() actually finds a stat Fail the test if it attempts to read a stat that doesn't exist, e.g. due to a typo (hooray, strings), or because the test tried to get a stat for the wrong scope. As is, there's no indiciation of failure and @data is left untouched, e.g. holds '0' or random stack data in most cases. Fixes: `8448ec5993` ("KVM: selftests: Add NX huge pages test") Link: https://lore.kernel.org/r/20250111005049.1247555-4-seanjc@google.com [sean: fixup spelling mistake, courtesy of Colin Ian King] Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-02-14 07:01:27 -08:00
Bharadwaj Raju	78332fdb95	selftests/landlock: Add binaries to .gitignore Building the test creates binaries 'wait-pipe' and 'sandbox-and-launch' which need to be gitignore'd. Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com> Link: https://lore.kernel.org/r/20250210161101.6024-1-bharadwaj.raju777@gmail.com [mic: Sort entries] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2025-02-14 09:23:11 +01:00

... 5 6 7 8 9 ...

18741 Commits