ASPM doesn't need to be disabled if pcie dpm is disabled.
So ASPM can be independantly enabled.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add interface for hardware init by vcn instance.
v2: fix code format
Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
VFs query RAS error counts directly from host with
AMDGPU_RAS_VIRT_ERROR_COUNT_QUERY. When ACA is enabled,
an unusable aca_sysfs is created rather than amdgpu_ras_sysfs_create()
Likewise, VFs depend on host support to query CPERs, rather than ACA component.
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Zhigang Luo <Zhigang.luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In reference to memory carved out for APUs,
s/cave out/carve out/
Reviewed-by: shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
"interrupt" becomes "irq" in:
dce_vX_0_set_hpd_interrupt_state()
dce_vX_0_set_crtc_interrupt_state()
dce_vX_0_set_pageflip_interrupt_state()
It is easier when going through the code to just change the DCE number in
the functions' name to find and compare them across DCE versions.
Also, it standardizes function mapping inside a given structure where .set
and .process are both set to functions with a "_irq" suffix.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In DCE6, DCE8, DCE10, DCE11, "hdp" is replaced by "hpd" and
replace "type" by "hpd" for a uniform parameter naming usage across DCEs.
In link_factory.c, there is a missing "p" to "types"
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Don't fetch it again if we already have it. It seems the
registers don't reliably have the value at resume in some
cases.
Fixes: 785f0f9fe7 ("drm/amdgpu: Add mes v12_0 ip block support (v4)")
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The user can set any speed value.
If speed is greater than UINT_MAX/8, division by zero is possible.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 1e866f1fe5 ("drm/amd/pm: Prevent divide by zero")
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Rework heuristics for resolving the fault IPA (HPFAR_EL2 v. re-walk
stage-1 page tables) to align with the architecture. This avoids
possibly taking an SEA at EL2 on the page table walk or using an
architecturally UNKNOWN fault IPA.
* Use acquire/release semantics in the KVM FF-A proxy to avoid reading
a stale value for the FF-A version.
* Fix KVM guest driver to match PV CPUID hypercall ABI.
* Use Inner Shareable Normal Write-Back mappings at stage-1 in KVM
selftests, which is the only memory type for which atomic
instructions are architecturally guaranteed to work.
s390:
* Don't use %pK for debug printing and tracepoints.
x86:
* Use a separate subclass when acquiring KVM's per-CPU posted interrupts
wakeup lock in the scheduled out path, i.e. when adding a vCPU on
the list of vCPUs to wake, to workaround a false positive deadlock.
The schedule out code runs with a scheduler lock that the wakeup
handler takes in the opposite order; but it does so with IRQs disabled
and cannot run concurrently with a wakeup.
* Explicitly zero-initialize on-stack CPUID unions
* Allow building irqbypass.ko as as module when kvm.ko is a module
* Wrap relatively expensive sanity check with KVM_PROVE_MMU
* Acquire SRCU in KVM_GET_MP_STATE to protect guest memory accesses
selftests:
* Add more scenarios to the MONITOR/MWAIT test.
* Add option to rseq test to override /dev/cpu_dma_latency
* Bring list of exit reasons up to date
* Cleanup Makefile to list once tests that are valid on all architectures
Other:
* Documentation fixes
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmf083IUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroN1dgf/QwfpZcHoMNQSnrc1jMy2LHrArln2
XfmsOGZTU7kyoLQsLWGAPNocOveGdiemTDsj5ZXoNMnqV8hCBr+tZuv2gWI1rr/o
kiGerdIgSZ9piTjBlJkVAaOzbWhg2DUnr7qVVzEzFY9+rPNyQ81vgAfU7h56KhYB
optecozmBrHHAxvQZwmPeL9UyPWFjOF1BY/8LTMx7X+aVuCX6qx1JqO3a3ylAw4J
tGXv6qFJfuCnu1d1b4X0ILce0iMUTOjQzvTcIm+BKjYycecl+3j1aczC/BOorIgc
mf0+XeauhcTduK73pirnvx2b05eOxntgkOpwJytO2RP6pE0uK+2Th/C3Qg==
=ba/Y
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Rework heuristics for resolving the fault IPA (HPFAR_EL2 v. re-walk
stage-1 page tables) to align with the architecture. This avoids
possibly taking an SEA at EL2 on the page table walk or using an
architecturally UNKNOWN fault IPA
- Use acquire/release semantics in the KVM FF-A proxy to avoid
reading a stale value for the FF-A version
- Fix KVM guest driver to match PV CPUID hypercall ABI
- Use Inner Shareable Normal Write-Back mappings at stage-1 in KVM
selftests, which is the only memory type for which atomic
instructions are architecturally guaranteed to work
s390:
- Don't use %pK for debug printing and tracepoints
x86:
- Use a separate subclass when acquiring KVM's per-CPU posted
interrupts wakeup lock in the scheduled out path, i.e. when adding
a vCPU on the list of vCPUs to wake, to workaround a false positive
deadlock. The schedule out code runs with a scheduler lock that the
wakeup handler takes in the opposite order; but it does so with
IRQs disabled and cannot run concurrently with a wakeup
- Explicitly zero-initialize on-stack CPUID unions
- Allow building irqbypass.ko as as module when kvm.ko is a module
- Wrap relatively expensive sanity check with KVM_PROVE_MMU
- Acquire SRCU in KVM_GET_MP_STATE to protect guest memory accesses
selftests:
- Add more scenarios to the MONITOR/MWAIT test
- Add option to rseq test to override /dev/cpu_dma_latency
- Bring list of exit reasons up to date
- Cleanup Makefile to list once tests that are valid on all
architectures
Other:
- Documentation fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (26 commits)
KVM: arm64: Use acquire/release to communicate FF-A version negotiation
KVM: arm64: selftests: Explicitly set the page attrs to Inner-Shareable
KVM: arm64: selftests: Introduce and use hardware-definition macros
KVM: VMX: Use separate subclasses for PI wakeup lock to squash false positive
KVM: VMX: Assert that IRQs are disabled when putting vCPU on PI wakeup list
KVM: x86: Explicitly zero-initialize on-stack CPUID unions
KVM: Allow building irqbypass.ko as as module when kvm.ko is a module
KVM: x86/mmu: Wrap sanity check on number of TDP MMU pages with KVM_PROVE_MMU
KVM: selftests: Add option to rseq test to override /dev/cpu_dma_latency
KVM: x86: Acquire SRCU in KVM_GET_MP_STATE to protect guest memory accesses
Documentation: kvm: remove KVM_CAP_MIPS_TE
Documentation: kvm: organize capabilities in the right section
Documentation: kvm: fix some definition lists
Documentation: kvm: drop "Capability" heading from capabilities
Documentation: kvm: give correct name for KVM_CAP_SPAPR_MULTITCE
Documentation: KVM: KVM_GET_SUPPORTED_CPUID now exposes TSC_DEADLINE
selftests: kvm: list once tests that are valid on all architectures
selftests: kvm: bring list of exit reasons up to date
selftests: kvm: revamp MONITOR/MWAIT tests
KVM: arm64: Don't translate FAR if invalid/unsafe
...
This is normally handled in the gfx IP suspend callbacks, but
for S0ix, those are skipped because we don't want to touch
gfx. So handle it in device suspend.
Fixes: b9467983b7 ("drm/amdgpu: add dynamic workload profile switching for gfx10")
Fixes: 963537ca23 ("drm/amdgpu: add dynamic workload profile switching for gfx11")
Fixes: 5f95a15495 ("drm/amdgpu: add dynamic workload profile switching for gfx12")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pause the workload setting in dm when doing idle optimization
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the callback for implementation for swsmu.
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To be used for display idle optimizations when
we want to pause non-default profiles.
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In dev core dump, dump the full header fifo for
each queue. Each FIFO has 8 entries.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In dev core dump, dump the full header fifo for
each queue. Each FIFO has 8 entries.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In dev core dump, dump the full header fifo for
each queue. Each FIFO has 8 entries.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
ANNOTATE_IGNORE_ALTERNATIVE adds additional noise to the code generated
by CLAC/STAC alternatives, hurting readability for those whose read
uaccess-related code generation on a regular basis.
Remove the annotation specifically for the "NOP patched with CLAC/STAC"
case in favor of a manual check.
Leave the other uses of that annotation in place as they're less common
and more difficult to detect.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/fc972ba4995d826fcfb8d02733a14be8d670900b.1744098446.git.jpoimboe@kernel.org
- fprobe: Fix to remove fprobe_hlist_node when module unloading
When a fprobe target module is removed, the fprobe_hlist_node
should be removed from the fprobe's hash table to prevent reusing
accidentally if another module is loaded at the same address.
- fprobe: Fix to lock module while registering fprobe
The module containing the function to be probeed is locked using a
reference counter until the fprobe registration is complete, which
prevents use after free.
- fprobe-events: Fix possible UAF on modules
Basically as same as above, but in the fprobe-events layer we also
need to get module reference counter when we find the tracepoint
in the module.
-----BEGIN PGP SIGNATURE-----
iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmf0kJ8bHG1hc2FtaS5o
aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8b+bEIALiFuYBn2y26OJnfaRnW
rgSC2JupswEVg7HwsN5kA/x1ypXl9SPfJGjbiHLUTq9+4KGOBTmY+k5/OpVO+Qkh
3nYKOkZxKRTglA7hRSTH0rxDV1eobps4nv/xkPjprugcjCGU54+4yb9Hq7Kyflpa
o8p+VS/0VOJ9f3Iy9a9JRfu9qE7Qzz9USCj4N64WMgx/qczPe27twqFEaUpTf1VW
Sw9twtKnqGs9hNE2QmhlzUBuq6gOZMXkjH6t1U4pMWBGB51JqZ5ZBhC4kL/5XEIZ
bEau82El5qdieQC2B7c0RxldceKa4t4QUlJDalZGKpxvTXrCw9rFyv0dRe2cXnKm
Yo0=
=I+MO
-----END PGP SIGNATURE-----
Merge tag 'probes-fixes-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- fprobe: remove fprobe_hlist_node when module unloading
When a fprobe target module is removed, the fprobe_hlist_node should
be removed from the fprobe's hash table to prevent reusing
accidentally if another module is loaded at the same address.
- fprobe: lock module while registering fprobe
The module containing the function to be probeed is locked using a
reference counter until the fprobe registration is complete, which
prevents use after free.
- fprobe-events: fix possible UAF on modules
Basically as same as above, but in the fprobe-events layer we also
need to get module reference counter when we find the tracepoint in
the module.
* tag 'probes-fixes-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: fprobe: Cleanup fprobe hash when module unloading
tracing: fprobe events: Fix possible UAF on modules
tracing: fprobe: Fix to lock module while registering fprobe
With CONFIG_PREFIX_SYMBOLS, objtool adds __pfx prefix symbols
to claim the compiler emitted call padding bytes. When
CONFIG_X86_KERNEL_IBT is not selected, the symbols are added to
individual object files and for Rust objects, they end up being
exported, resulting in warnings with CONFIG_GENDWARFKSYMS as the
symbols have no debugging information:
warning: gendwarfksyms: symbol_print_versions: no information for symbol __pfx_rust_helper_put_task_struct
warning: gendwarfksyms: symbol_print_versions: no information for symbol __pfx_rust_helper_task_euid
warning: gendwarfksyms: symbol_print_versions: no information for symbol __pfx_rust_helper_readq_relaxed
...
Filter out the __pfx prefix from exported symbols similarly to
the existing __cfi and __odr_asan prefixes.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Cc: stable@vger.kernel.org
Fixes: ac61506bf2 ("rust: Use gendwarfksyms + extended modversions for CONFIG_MODVERSIONS")
Link: https://lore.kernel.org/r/20250318231815.917621-2-samitolvanen@google.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
- A number of cpuset remote partition related fixes and cleanups along with
selftest updates.
- A change from this merge window made cgroup_rstat_updated_list() called
outside cgroup_rstat_lock leading to list corruptions. Fix it by
relocating the call inside the lock.
-----BEGIN PGP SIGNATURE-----
iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZ/QMSQ4cdGpAa2VybmVs
Lm9yZwAKCRCxYfJx3gVYGebUAP0bdg/hIX5OjhREbaDKWoUyAHnHqMdg3Dvngvhp
d9aOqQD/b1jdVfDINFtb2qjOpizPjyI0ycQxrr9K3DrSYmUAKAs=
=hFhq
-----END PGP SIGNATURE-----
Merge tag 'cgroup-for-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- A number of cpuset remote partition related fixes and cleanups along
with selftest updates.
- A change from this merge window made cgroup_rstat_updated_list()
called outside cgroup_rstat_lock leading to list corruptions. Fix it
by relocating the call inside the lock.
* tag 'cgroup-for-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup/cpuset: Fix race between newly created partition and dying one
cgroup: rstat: call cgroup_rstat_updated_list with cgroup_rstat_lock
selftest/cgroup: Add a remote partition transition test to test_cpuset_prs.sh
selftest/cgroup: Clean up and restructure test_cpuset_prs.sh
selftest/cgroup: Update test_cpuset_prs.sh to use | as effective CPUs and state separator
cgroup/cpuset: Remove unneeded goto in sched_partition_write() and rename it
cgroup/cpuset: Code cleanup and comment update
cgroup/cpuset: Don't allow creation of local partition over a remote one
cgroup/cpuset: Remove remote_partition_check() & make update_cpumasks_hier() handle remote partition
cgroup/cpuset: Fix error handling in remote_partition_disable()
cgroup/cpuset: Fix incorrect isolated_cpus update in update_parent_effective_cpumask()
"Normal" comments in Rust (`//`) are also formatted in Markdown, like
the documentation (`///` and `//!`), see
Documentation/rust/coding-guidelines.rst
Thus use Markdown autolinks for a couple links that were missing it.
It also helps to get proper linking in some software like kitty [1].
Suggested-by: Benno Lossin <benno.lossin@proton.me>
Link: https://github.com/Rust-for-Linux/pin-init/pull/32#discussion_r2023103712 [1]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Link: dd230d61bf
Fixes: 84837cf6fa ("rust: pin-init: change examples to the user-space version")
Cc: stable@vger.kernel.org
[ Change case in title. Reworded commit message. - Benno ]
Signed-off-by: Benno Lossin <benno.lossin@proton.me>
Link: https://lore.kernel.org/r/20250407201755.649153-3-benno.lossin@proton.me
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Similar to what was done for `Zeroable<NonNull<T>>` in commit
df27cef153 ("rust: init: fix `Zeroable` implementation for
`Option<NonNull<T>>` and `Option<KBox<T>>`"), the latest Rust
documentation [1] says it guarantees that `transmute::<_,
Option<T>>([0u8; size_of::<T>()])` is sound and produces
`Option::<T>::None` only in some cases. In particular, it says:
`Box<U>` (specifically, only `Box<U, Global>`) when `U: Sized`
Thus restrict the `impl` to `Sized`, and use similar wording as in that
commit too.
Link: https://doc.rust-lang.org/stable/std/option/index.html#representation [1]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Link: a6007cf555
Fixes: 9b2299af3b ("rust: pin-init: add `std` and `alloc` support from the user-space version")
Cc: stable@vger.kernel.org
[ Adjust mentioned commit to the one from the kernel. - Benno ]
Signed-off-by: Benno Lossin <benno.lossin@proton.me>
Link: https://lore.kernel.org/r/20250407201755.649153-2-benno.lossin@proton.me
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Finish cleaning up the CRC kconfig options by removing the remaining
unnecessary prompts and an unnecessary 'default y', removing
CONFIG_LIBCRC32C, and documenting all the CRC library options.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZ/P7QhQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKyoOAQCynFcS1dWuD27S+SdUREmBjMAoZo5M
zdsIvlPv9KLycgD/QX5lXjW3KIYY6jQ8vHUuLVwfDl/JEp4GJS9dLGU+agg=
=0R1T
-----END PGP SIGNATURE-----
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC cleanups from Eric Biggers:
"Finish cleaning up the CRC kconfig options by removing the remaining
unnecessary prompts and an unnecessary 'default y', removing
CONFIG_LIBCRC32C, and documenting all the CRC library options"
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crc: remove CONFIG_LIBCRC32C
lib/crc: document all the CRC library kconfig options
lib/crc: remove unnecessary prompt for CONFIG_CRC_ITU_T
lib/crc: remove unnecessary prompt for CONFIG_CRC_T10DIF
lib/crc: remove unnecessary prompt for CONFIG_CRC16
lib/crc: remove unnecessary prompt for CONFIG_CRC_CCITT
lib/crc: remove unnecessary prompt for CONFIG_CRC32 and drop 'default y'
A recent optimization change in LLVM [1] aims to transform certain loop
idioms into calls to strlen() or wcslen(). This change transforms the
first while loop in UniStrcat() into a call to wcslen(), breaking the
build when UniStrcat() gets inlined into alloc_path_with_tree_prefix():
ld.lld: error: undefined symbol: wcslen
>>> referenced by nls_ucs2_utils.h:54 (fs/smb/client/../../nls/nls_ucs2_utils.h:54)
>>> vmlinux.o:(alloc_path_with_tree_prefix)
>>> referenced by nls_ucs2_utils.h:54 (fs/smb/client/../../nls/nls_ucs2_utils.h:54)
>>> vmlinux.o:(alloc_path_with_tree_prefix)
Disable this optimization with '-fno-builtin-wcslen', which prevents the
compiler from assuming that wcslen() is available in the kernel's C
library.
[ More to the point - it's not that we couldn't implement wcslen(), it's
that this isn't an optimization at all in the context of the kernel.
Replacing a simple inlined loop with a function call to the same loop
is just stupid and pointless if you don't have long strings and fancy
libraries with vectorization support etc.
For the regular 'strlen()' cases, we want the compiler to do this in
order to handle the trivial case of constant strings. And we do have
optimized versions of 'strlen()' on some architectures. But for
wcslen? Just no. - Linus ]
Cc: stable@vger.kernel.org
Link: 9694844d7e [1]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Perf can hang while freeing a sigtrap event if a related deferred
signal hadn't managed to be sent before the file got closed:
perf_event_overflow()
task_work_add(perf_pending_task)
fput()
task_work_add(____fput())
task_work_run()
____fput()
perf_release()
perf_event_release_kernel()
_free_event()
perf_pending_task_sync()
task_work_cancel() -> FAILED
rcuwait_wait_event()
Once task_work_run() is running, the list of pending callbacks is
removed from the task_struct and from this point on task_work_cancel()
can't remove any pending and not yet started work items, hence the
task_work_cancel() failure and the hang on rcuwait_wait_event().
Task work could be changed to remove one work at a time, so a work
running on the current task can always cancel a pending one, however
the wait / wake design is still subject to inverted dependencies when
remote targets are involved, as pictured by Oleg:
T1 T2
fd = perf_event_open(pid => T2->pid); fd = perf_event_open(pid => T1->pid);
close(fd) close(fd)
<IRQ> <IRQ>
perf_event_overflow() perf_event_overflow()
task_work_add(perf_pending_task) task_work_add(perf_pending_task)
</IRQ> </IRQ>
fput() fput()
task_work_add(____fput()) task_work_add(____fput())
task_work_run() task_work_run()
____fput() ____fput()
perf_release() perf_release()
perf_event_release_kernel() perf_event_release_kernel()
_free_event() _free_event()
perf_pending_task_sync() perf_pending_task_sync()
rcuwait_wait_event() rcuwait_wait_event()
Therefore the only option left is to acquire the event reference count
upon queueing the perf task work and release it from the task work, just
like it was done before 3a5465418f ("perf: Fix event leak upon exec and file release")
but without the leaks it fixed.
Some adjustments are necessary to make it work:
* A child event might dereference its parent upon freeing. Care must be
taken to release the parent last.
* Some places assuming the event doesn't have any reference held and
therefore can be freed right away must instead put the reference and
let the reference counting to its job.
Reported-by: "Yi Lai" <yi1.lai@linux.intel.com>
Closes: https://lore.kernel.org/all/Zx9Losv4YcJowaP%2F@ly-workstation/
Reported-by: syzbot+3c4321e10eea460eb606@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/673adf75.050a0220.87769.0024.GAE@google.com/
Fixes: 3a5465418f ("perf: Fix event leak upon exec and file release")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250304135446.18905-1-frederic@kernel.org
SCX_OPS_HAS_CGROUP_WEIGHT was only used to suppress the missing cgroup
weight support warnings. Now that the warnings are removed, the flag doesn't
do anything. Mark it for deprecation and remove its usage from scx_flatcg.
v2: Actually include the scx_flatcg update.
Signed-off-by: Tejun Heo <tj@kernel.org>
Suggested-and-reviewed-by: Andrea Righi <arighi@nvidia.com>
sched_ext generates warnings when cpu.weight / cpu.idle are set to
non-default values if the BPF scheduler doesn't implement weight support.
These warnings don't provide much value while adding constant annoyance. A
BPF scheduler may not implement any particular behavior and there's nothing
particularly special about missing cgroup weight support. Drop the warnings.
Signed-off-by: Tejun Heo <tj@kernel.org>
Replace kzalloc with kvzalloc for the exit_dump buffer allocation, which
can require large contiguous memory depending on the implementation.
This change prevents allocation failures by allowing the system to fall
back to vmalloc when contiguous memory allocation fails.
Since this buffer is only used for debugging purposes, physical memory
contiguity is not required, making vmalloc a suitable alternative.
Cc: stable@vger.kernel.org
Fixes: 07814a9439 ("sched_ext: Print debug dump after an error exit")
Suggested-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Commit d072acda48 ("rust: use custom FFI integer types") did not
update rust-analyzer to include the new crate.
To enable rust-analyzer support for these custom ffi types, add the
`ffi` crate as a dependency to the `bindings`, `uapi` and `kernel`
crates, which all directly depend on it.
Fixes: d072acda48 ("rust: use custom FFI integer types")
Signed-off-by: Lukas Fischer <kernel@o1oo11oo.de>
Reviewed-by: Tamir Duberstein <tamird@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250404125150.85783-2-kernel@o1oo11oo.de
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Some operations require checking, or ignoring, specific bits in an address
value. For example, this can be comparing address values to identify unique
structures.
Currently, the full address value is compared when filtering for duplicates.
This results in over counting and creation of extra records. This gives the
impression that more unique events occurred than did in reality.
Mask the address for physical rows on MI300.
[ bp: Simplify. ]
Fixes: 6f15e617cc ("RAS: Introduce a FRU memory poison manager")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
landlock_put_hierarchy() can be called when an error occurs in
landlock_merge_ruleset() due to insufficient memory. In this case, the
domain's audit details might not have been allocated yet, which would
cause landlock_free_hierarchy_details() to print a warning (but still
safely handle this case).
We could keep the WARN_ON_ONCE(!hierarchy) but it's not worth it for
this kind of function, so let's remove it entirely.
Cc: Paul Moore <paul@paul-moore.com>
Reported-by: syzbot+8bca99e91de7e060e4ea@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20250331104709.897062-1-mic@digikod.net
Reviewed-by: Günther Noack <gnoack@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
A number of test suites call functions that expect the returned
drm_display_mode to be destroyed eventually.
However, none of the tests called drm_mode_destroy, which results in a
memory leak.
Since drm_mode_destroy takes two pointers as argument, we can't use a
kunit wrapper. Let's just create a helper every test suite can use.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250408-drm-kunit-drm-display-mode-memleak-v1-1-996305a2e75a@kernel.org
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Commit 5d2b55e55e ("panel/boe-tv101wum-ll2: Use refcounted allocation
in place of devm_kzalloc()") switched from a kmalloc + drm_panel_init
call to a devm_drm_panel_alloc one.
However, the variable it was storing the allocated pointer in doesn't
exist, resulting in a compilation breakage.
Fixes: 5d2b55e55e ("panel/boe-tv101wum-ll2: Use refcounted allocation in place of devm_kzalloc()")
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Tested-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250408122008.1676235-3-mripard@kernel.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Commit 77dcbce637 ("panel/th101mb31ig002-28a: Use refcounted
allocation in place of devm_kzalloc()") switched from a kmalloc +
drm_panel_init call to a devm_drm_panel_alloc one.
However, the variable it was storing the allocated pointer in doesn't
exist, resulting in a compilation breakage.
Fixes: 77dcbce637 ("panel/th101mb31ig002-28a: Use refcounted allocation in place of devm_kzalloc()")
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Tested-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250408122008.1676235-2-mripard@kernel.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Commit 9d7d7c3c9a ("panel/auo-a030jtn01: Use refcounted allocation in
place of devm_kzalloc()") switched from a kmalloc + drm_panel_init call
to a devm_drm_panel_alloc one.
However, the variable it was storing the allocated pointer in doesn't
exist, resulting in a compilation breakage.
Fixes: 9d7d7c3c9a ("panel/auo-a030jtn01: Use refcounted allocation in place of devm_kzalloc()")
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Tested-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250408122008.1676235-1-mripard@kernel.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
There's a consistent pattern where the .cleanup_data() callback is
called when .prepare_data() fails, when it should really be called to
clean after a successful .prepare_data() as per the documentation.
Rewrite the error-handling paths to make sure we don't cleanup
un-prepared data.
Fixes: c781ff12a2 ("ethtool: Allow network drivers to dump arbitrary EEPROM data")
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250407130511.75621-1-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The tfilter_notify() and tfilter_del_notify() functions assume that
NLMSG_GOODSIZE is always enough to dump the filter chain. This is not
always the case, which can lead to silent notify failures (because the
return code of tfilter_notify() is not always checked). In particular,
this can lead to NLM_F_ECHO not being honoured even though an action
succeeds, which forces userspace to create workarounds[0].
Fix this by increasing the message size if dumping the filter chain into
the allocated skb fails. Use the size of the incoming skb as a size hint
if set, so we can start at a larger value when appropriate.
To trigger this, run the following commands:
# ip link add type veth
# tc qdisc replace dev veth0 root handle 1: fq_codel
# tc -echo filter add dev veth0 parent 1: u32 match u32 0 0 $(for i in $(seq 32); do echo action pedit munge ip dport set 22; done)
Before this fix, tc just returns:
Not a filter(cmd 2)
After the fix, we get the correct echo:
added filter dev veth0 parent 1: protocol all pref 49152 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 terminal flowid not_in_hw
match 00000000/00000000 at 0
action order 1: pedit action pass keys 1
index 1 ref 1 bind 1
key #0 at 20: val 00000016 mask ffff0000
[repeated 32 times]
[0] 106ef21860
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Frode Nordahl <frode.nordahl@canonical.com>
Closes: https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2018500
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20250407105542.16601-1-toke@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
WX_RXD_IPV6EX was incorrectly defined in Rx ring descriptor. In fact, this
field stores the 802.1ad ID from which the packet was received. The wrong
definition caused the statistics rx_csum_offload_errors to fail to grow
when receiving the 802.1ad packet with incorrect checksum.
Fixes: ef4f3c19f9 ("net: wangxun: libwx add rx offload functions")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Link: https://patch.msgid.link/20250407103322.273241-1-jiawenwu@trustnetic.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When running as a PVH dom0 the ACPI tables exposed to Linux are (mostly)
the native ones, thus exposing the C and P states, that can lead to
attachment of CPU idle and frequency drivers. However the entity in
control of the CPU C and P states is Xen, as dom0 doesn't have a full view
of the system load, neither has all CPUs assigned and identity pinned.
Like it's done for classic PV guests, prevent Linux from using idle or
frequency state drivers when running as a PVH dom0.
On an AMD EPYC 7543P system without this fix a Linux PVH dom0 will keep the
host CPUs spinning at 100% even when dom0 is completely idle, as it's
attempting to use the acpi_idle driver.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20250407101842.67228-1-roger.pau@citrix.com>
The current code configures the Physical Function (PF) root node at TL1
and the Virtual Function (VF) root node at TL2.
This ensure at any given point of time PF traffic gets more priority.
PF root node
TL1
/ \
TL2 TL2 VF root node
/ \
TL3 TL3
/ \
TL4 TL4
/ \
SMQ SMQ
Due to a bug in the current code, the TL2 parent queue index on the
VF interface is not being configured, leading to 'SMQ Flush' errors
Fixes: 5e6808b4c6 ("octeontx2-pf: Add support for HTB offload")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250407070341.2765426-1-hkelam@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>