Commit Graph

440 Commits

Author SHA1 Message Date
Linus Torvalds 51d90a15fe ARM:
- Support for userspace handling of synchronous external aborts (SEAs),
   allowing the VMM to potentially handle the abort in a non-fatal
   manner.
 
 - Large rework of the VGIC's list register handling with the goal of
   supporting more active/pending IRQs than available list registers in
   hardware. In addition, the VGIC now supports EOImode==1 style
   deactivations for IRQs which may occur on a separate vCPU than the
   one that acked the IRQ.
 
 - Support for FEAT_XNX (user / privileged execute permissions) and
   FEAT_HAF (hardware update to the Access Flag) in the software page
   table walkers and shadow MMU.
 
 - Allow page table destruction to reschedule, fixing long need_resched
   latencies observed when destroying a large VM.
 
 - Minor fixes to KVM and selftests
 
 Loongarch:
 
 - Get VM PMU capability from HW GCFG register.
 
 - Add AVEC basic support.
 
 - Use 64-bit register definition for EIOINTC.
 
 - Add KVM timer test cases for tools/selftests.
 
 RISC/V:
 
 - SBI message passing (MPXY) support for KVM guest
 
 - Give a new, more specific error subcode for the case when in-kernel
   AIA virtualization fails to allocate IMSIC VS-file
 
 - Support KVM_DIRTY_LOG_INITIALLY_SET, enabling dirty log gradually
   in small chunks
 
 - Fix guest page fault within HLV* instructions
 
 - Flush VS-stage TLB after VCPU migration for Andes cores
 
 s390:
 
 - Always allocate ESCA (Extended System Control Area), instead of
   starting with the basic SCA and converting to ESCA with the
   addition of the 65th vCPU.  The price is increased number of
   exits (and worse performance) on z10 and earlier processor;
   ESCA was introduced by z114/z196 in 2010.
 
 - VIRT_XFER_TO_GUEST_WORK support
 
 - Operation exception forwarding support
 
 - Cleanups
 
 x86:
 
 - Skip the costly "zap all SPTEs" on an MMIO generation wrap if MMIO SPTE
   caching is disabled, as there can't be any relevant SPTEs to zap.
 
 - Relocate a misplaced export.
 
 - Fix an async #PF bug where KVM would clear the completion queue when the
   guest transitioned in and out of paging mode, e.g. when handling an SMI and
   then returning to paged mode via RSM.
 
 - Leave KVM's user-return notifier registered even when disabling
   virtualization, as long as kvm.ko is loaded.  On reboot/shutdown, keeping
   the notifier registered is ok; the kernel does not use the MSRs and the
   callback will run cleanly and restore host MSRs if the CPU manages to
   return to userspace before the system goes down.
 
 - Use the checked version of {get,put}_user().
 
 - Fix a long-lurking bug where KVM's lack of catch-up logic for periodic APIC
   timers can result in a hard lockup in the host.
 
 - Revert the periodic kvmclock sync logic now that KVM doesn't use a
   clocksource that's subject to NTP corrections.
 
 - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the latter
   behind CONFIG_CPU_MITIGATIONS.
 
 - Context switch XCR0, XSS, and PKRU outside of the entry/exit fast path;
   the only reason they were handled in the fast path was to paper of a bug
   in the core #MC code, and that has long since been fixed.
 
 - Add emulator support for AVX MOV instructions, to play nice with emulated
   devices whose guest drivers like to access PCI BARs with large multi-byte
   instructions.
 
 x86 (AMD):
 
 - Fix a few missing "VMCB dirty" bugs.
 
 - Fix the worst of KVM's lack of EFER.LMSLE emulation.
 
 - Add AVIC support for addressing 4k vCPUs in x2AVIC mode.
 
 - Fix incorrect handling of selective CR0 writes when checking intercepts
   during emulation of L2 instructions.
 
 - Fix a currently-benign bug where KVM would clobber SPEC_CTRL[63:32] on
   VMRUN and #VMEXIT.
 
 - Fix a bug where KVM corrupt the guest code stream when re-injecting a soft
   interrupt if the guest patched the underlying code after the VM-Exit, e.g.
   when Linux patches code with a temporary INT3.
 
 - Add KVM_X86_SNP_POLICY_BITS to advertise supported SNP policy bits to
   userspace, and extend KVM "support" to all policy bits that don't require
   any actual support from KVM.
 
 x86 (Intel):
 
 - Use the root role from kvm_mmu_page to construct EPTPs instead of the
   current vCPU state, partly as worthwhile cleanup, but mostly to pave the
   way for tracking per-root TLB flushes, and elide EPT flushes on pCPU
   migration if the root is clean from a previous flush.
 
 - Add a few missing nested consistency checks.
 
 - Rip out support for doing "early" consistency checks via hardware as the
   functionality hasn't been used in years and is no longer useful in general;
   replace it with an off-by-default module param to WARN if hardware fails
   a check that KVM does not perform.
 
 - Fix a currently-benign bug where KVM would drop the guest's SPEC_CTRL[63:32]
   on VM-Enter.
 
 - Misc cleanups.
 
 - Overhaul the TDX code to address systemic races where KVM (acting on behalf
   of userspace) could inadvertantly trigger lock contention in the TDX-Module;
   KVM was either working around these in weird, ugly ways, or was simply
   oblivious to them (though even Yan's devilish selftests could only break
   individual VMs, not the host kernel)
 
 - Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a TDX vCPU,
   if creating said vCPU failed partway through.
 
 - Fix a few sparse warnings (bad annotation, 0 != NULL).
 
 - Use struct_size() to simplify copying TDX capabilities to userspace.
 
 - Fix a bug where TDX would effectively corrupt user-return MSR values if the
   TDX Module rejects VP.ENTER and thus doesn't clobber host MSRs as expected.
 
 Selftests:
 
 - Fix a math goof in mmu_stress_test when running on a single-CPU system/VM.
 
 - Forcefully override ARCH from x86_64 to x86 to play nice with specifying
   ARCH=x86_64 on the command line.
 
 - Extend a bunch of nested VMX to validate nested SVM as well.
 
 - Add support for LA57 in the core VM_MODE_xxx macro, and add a test to
   verify KVM can save/restore nested VMX state when L1 is using 5-level
   paging, but L2 is not.
 
 - Clean up the guest paging code in anticipation of sharing the core logic for
   nested EPT and nested NPT.
 
 guest_memfd:
 
 - Add NUMA mempolicy support for guest_memfd, and clean up a variety of
   rough edges in guest_memfd along the way.
 
 - Define a CLASS to automatically handle get+put when grabbing a guest_memfd
   from a memslot to make it harder to leak references.
 
 - Enhance KVM selftests to make it easer to develop and debug selftests like
   those added for guest_memfd NUMA support, e.g. where test and/or KVM bugs
   often result in hard-to-debug SIGBUS errors.
 
 - Misc cleanups.
 
 Generic:
 
 - Use the recently-added WQ_PERCPU when creating the per-CPU workqueue for
   irqfd cleanup.
 
 - Fix a goof in the dirty ring documentation.
 
 - Fix choice of target for directed yield across different calls to
   kvm_vcpu_on_spin(); the function was always starting from the first
   vCPU instead of continuing the round-robin search.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmkvMa8UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMlFwf+Ow7zOYUuELSQ+Jn+hOYXiCNrdBDx
 ZamvMU8kLPr7XX0Zog6HgcMm//qyA6k5nSfqCjfsQZrIhRA/gWJ61jz1OX/Jxq18
 pJ9Vz6epnEPYiOtBwz+v8OS8MqDqVNzj2i6W1/cLPQE50c1Hhw64HWS5CSxDQiHW
 A7PVfl5YU12lW1vG3uE0sNESDt4Eh/spNM17iddXdF4ZUOGublserjDGjbc17E7H
 8BX3DkC2plqkJKwtjg0ae62hREkITZZc7RqsnftUkEhn0N0H9+rb6NKUyzIVh9NZ
 bCtCjtrKN9zfZ0Mujnms3ugBOVqNIputu/DtPnnFKXtXWSrHrgGSNv5ewA==
 =PEcw
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:

   - Support for userspace handling of synchronous external aborts
     (SEAs), allowing the VMM to potentially handle the abort in a
     non-fatal manner

   - Large rework of the VGIC's list register handling with the goal of
     supporting more active/pending IRQs than available list registers
     in hardware. In addition, the VGIC now supports EOImode==1 style
     deactivations for IRQs which may occur on a separate vCPU than the
     one that acked the IRQ

   - Support for FEAT_XNX (user / privileged execute permissions) and
     FEAT_HAF (hardware update to the Access Flag) in the software page
     table walkers and shadow MMU

   - Allow page table destruction to reschedule, fixing long
     need_resched latencies observed when destroying a large VM

   - Minor fixes to KVM and selftests

  Loongarch:

   - Get VM PMU capability from HW GCFG register

   - Add AVEC basic support

   - Use 64-bit register definition for EIOINTC

   - Add KVM timer test cases for tools/selftests

  RISC/V:

   - SBI message passing (MPXY) support for KVM guest

   - Give a new, more specific error subcode for the case when in-kernel
     AIA virtualization fails to allocate IMSIC VS-file

   - Support KVM_DIRTY_LOG_INITIALLY_SET, enabling dirty log gradually
     in small chunks

   - Fix guest page fault within HLV* instructions

   - Flush VS-stage TLB after VCPU migration for Andes cores

  s390:

   - Always allocate ESCA (Extended System Control Area), instead of
     starting with the basic SCA and converting to ESCA with the
     addition of the 65th vCPU. The price is increased number of exits
     (and worse performance) on z10 and earlier processor; ESCA was
     introduced by z114/z196 in 2010

   - VIRT_XFER_TO_GUEST_WORK support

   - Operation exception forwarding support

   - Cleanups

  x86:

   - Skip the costly "zap all SPTEs" on an MMIO generation wrap if MMIO
     SPTE caching is disabled, as there can't be any relevant SPTEs to
     zap

   - Relocate a misplaced export

   - Fix an async #PF bug where KVM would clear the completion queue
     when the guest transitioned in and out of paging mode, e.g. when
     handling an SMI and then returning to paged mode via RSM

   - Leave KVM's user-return notifier registered even when disabling
     virtualization, as long as kvm.ko is loaded. On reboot/shutdown,
     keeping the notifier registered is ok; the kernel does not use the
     MSRs and the callback will run cleanly and restore host MSRs if the
     CPU manages to return to userspace before the system goes down

   - Use the checked version of {get,put}_user()

   - Fix a long-lurking bug where KVM's lack of catch-up logic for
     periodic APIC timers can result in a hard lockup in the host

   - Revert the periodic kvmclock sync logic now that KVM doesn't use a
     clocksource that's subject to NTP corrections

   - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the
     latter behind CONFIG_CPU_MITIGATIONS

   - Context switch XCR0, XSS, and PKRU outside of the entry/exit fast
     path; the only reason they were handled in the fast path was to
     paper of a bug in the core #MC code, and that has long since been
     fixed

   - Add emulator support for AVX MOV instructions, to play nice with
     emulated devices whose guest drivers like to access PCI BARs with
     large multi-byte instructions

  x86 (AMD):

   - Fix a few missing "VMCB dirty" bugs

   - Fix the worst of KVM's lack of EFER.LMSLE emulation

   - Add AVIC support for addressing 4k vCPUs in x2AVIC mode

   - Fix incorrect handling of selective CR0 writes when checking
     intercepts during emulation of L2 instructions

   - Fix a currently-benign bug where KVM would clobber SPEC_CTRL[63:32]
     on VMRUN and #VMEXIT

   - Fix a bug where KVM corrupt the guest code stream when re-injecting
     a soft interrupt if the guest patched the underlying code after the
     VM-Exit, e.g. when Linux patches code with a temporary INT3

   - Add KVM_X86_SNP_POLICY_BITS to advertise supported SNP policy bits
     to userspace, and extend KVM "support" to all policy bits that
     don't require any actual support from KVM

  x86 (Intel):

   - Use the root role from kvm_mmu_page to construct EPTPs instead of
     the current vCPU state, partly as worthwhile cleanup, but mostly to
     pave the way for tracking per-root TLB flushes, and elide EPT
     flushes on pCPU migration if the root is clean from a previous
     flush

   - Add a few missing nested consistency checks

   - Rip out support for doing "early" consistency checks via hardware
     as the functionality hasn't been used in years and is no longer
     useful in general; replace it with an off-by-default module param
     to WARN if hardware fails a check that KVM does not perform

   - Fix a currently-benign bug where KVM would drop the guest's
     SPEC_CTRL[63:32] on VM-Enter

   - Misc cleanups

   - Overhaul the TDX code to address systemic races where KVM (acting
     on behalf of userspace) could inadvertantly trigger lock contention
     in the TDX-Module; KVM was either working around these in weird,
     ugly ways, or was simply oblivious to them (though even Yan's
     devilish selftests could only break individual VMs, not the host
     kernel)

   - Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a
     TDX vCPU, if creating said vCPU failed partway through

   - Fix a few sparse warnings (bad annotation, 0 != NULL)

   - Use struct_size() to simplify copying TDX capabilities to userspace

   - Fix a bug where TDX would effectively corrupt user-return MSR
     values if the TDX Module rejects VP.ENTER and thus doesn't clobber
     host MSRs as expected

  Selftests:

   - Fix a math goof in mmu_stress_test when running on a single-CPU
     system/VM

   - Forcefully override ARCH from x86_64 to x86 to play nice with
     specifying ARCH=x86_64 on the command line

   - Extend a bunch of nested VMX to validate nested SVM as well

   - Add support for LA57 in the core VM_MODE_xxx macro, and add a test
     to verify KVM can save/restore nested VMX state when L1 is using
     5-level paging, but L2 is not

   - Clean up the guest paging code in anticipation of sharing the core
     logic for nested EPT and nested NPT

  guest_memfd:

   - Add NUMA mempolicy support for guest_memfd, and clean up a variety
     of rough edges in guest_memfd along the way

   - Define a CLASS to automatically handle get+put when grabbing a
     guest_memfd from a memslot to make it harder to leak references

   - Enhance KVM selftests to make it easer to develop and debug
     selftests like those added for guest_memfd NUMA support, e.g. where
     test and/or KVM bugs often result in hard-to-debug SIGBUS errors

   - Misc cleanups

  Generic:

   - Use the recently-added WQ_PERCPU when creating the per-CPU
     workqueue for irqfd cleanup

   - Fix a goof in the dirty ring documentation

   - Fix choice of target for directed yield across different calls to
     kvm_vcpu_on_spin(); the function was always starting from the first
     vCPU instead of continuing the round-robin search"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (260 commits)
  KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS
  KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2
  KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX}
  KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected"
  KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot()
  KVM: arm64: Add endian casting to kvm_swap_s[12]_desc()
  KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n
  KVM: arm64: selftests: Add test for AT emulation
  KVM: arm64: nv: Expose hardware access flag management to NV guests
  KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW
  KVM: arm64: Implement HW access flag management in stage-1 SW PTW
  KVM: arm64: Propagate PTW errors up to AT emulation
  KVM: arm64: Add helper for swapping guest descriptor
  KVM: arm64: nv: Use pgtable definitions in stage-2 walk
  KVM: arm64: Handle endianness in read helper for emulated PTW
  KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW
  KVM: arm64: Call helper for reading descriptors directly
  KVM: arm64: nv: Advertise support for FEAT_XNX
  KVM: arm64: Teach ptdump about FEAT_XNX permissions
  KVM: s390: Use generic VIRT_XFER_TO_GUEST_WORK functions
  ...
2025-12-05 17:01:20 -08:00
Linus Torvalds e2aa39b368 * Make MSR-induced taint easier for users to track down
* Restrict KVM-specific exports to KVM itself
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmkuHIIACgkQaDWVMHDJ
 krAxMg//RQUz6JnQnMASuN/NhrjIANRjcPJI9S0LoKcTbZ0nZ5aH6oR1VOFszLLa
 ShGcUO2RuDbCl2wPAG/lRWV8eL/4k4mZi0zNT7vEKTkX/EZn5RDV59p88zCo62KV
 835OpX8W9Hvyiichw51RoVrJxEcqgCmlUYO2fCwtk2rpntUCOVQgHMeLhhqMsZ0e
 yMQECAE75oXQ4vhAG+zO7/KmLqVbSGgqpXYw6DOZGEJF0T7tdZIgFhd25WAPgcf0
 UN8VmTX971Eq67OrUX9OojN6+SxBqQ7vc+qBtd5bDlkZsRxVyV157Zso2PCPbsm2
 FkE65eJBa9qacqvwkCPND6J7gvE/Sm8DaLVafLPKDNWTaqSo4cfKJD7P/sgN1L69
 O8QsiLfafy8ITIA8AXS90C8x/puhqk15OKW2kJFFfUkhrGdu72/AxVlo6JcM1N0u
 qkDXUNBSX9/LHkRT9AtkLch27MEFXRKxsajjx2lFoBIR2VjIijm9314cRczHGZEV
 R/pqBh21yL/ZTriNIgmEPrFOV4zDxaOsHRh8YSEFAXRe2xWvm7dZwNSPRSh7hMT+
 q0ABPuYqTZ4PDGMaAB0gNRqmR9aQKpVMY+4xmTdmqscYkgV4usZQcrQOeiKVwh7F
 KdMC5tr4yFOMMl8CaMgOK+27ZrSYI1hwtXCc/orAhOwxhg62Z40=
 =tjcN
 -----END PGP SIGNATURE-----

Merge tag 'x86_misc_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 updates from Dave Hansen:
 "The most significant are some changes to ensure that symbols exported
  for KVM are used only by KVM modules themselves, along with some
  related cleanups.

  In true x86/misc fashion, the other patch is completely unrelated and
  just enhances an existing pr_warn() to make it clear to users how they
  have tainted their kernel when something is mucking with MSRs.

  Summary:

   - Make MSR-induced taint easier for users to track down

   - Restrict KVM-specific exports to KVM itself"

* tag 'x86_misc_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Restrict KVM-induced symbol exports to KVM modules where obvious/possible
  x86/mm: Drop unnecessary export of "ptdump_walk_pgd_level_debugfs"
  x86/mtrr: Drop unnecessary export of "mtrr_state"
  x86/bugs: Drop unnecessary export of "x86_spec_ctrl_base"
  x86/msr: Add CPU_OUT_OF_SPEC taint name to "unrecognized" pr_warn(msg)
2025-12-02 14:16:42 -08:00
Linus Torvalds a9a10e920e - Convert the tsx= cmdline parsing to use early_param()
- Cleanup forward declarations gunk in bugs.c
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmkt3IMACgkQEsHwGGHe
 VUqPxRAAkul5R9zZgR9XcV/ZRA6iulo4M0o38Z2COeWtlDtDXpAJ5vNarDBB1l+K
 ea9YzbfN4iECXNXOCtSrEr7SmJ2ld8xBSyzISeWx2V1AXi4A6A8GX574egtDDCaV
 vN+eTT1ki9YuJLhgi0eZ+uHkdoTLhChrBehuvlcHnEOXN1N9D/P8/QfDYG7IyCxu
 /jSnuXyQ3AA2uM3IMEAGLcAarI+cU4HgMsF2Za5Dp8SKhSbhpgpEfFkp8+p+bUO/
 nKLgUj2dqibFZd+hvhrwtrnsDL8rTJxLr0dVwg7MZeBea4GliUsaT8jBRBMQddIL
 DGjAR2niGeGM42DWvFUVZczcgdP0NI8ChpZ3zCjdICrZBGMkZe70iqcPaL0BELob
 toeBjlHIrlUnXIfZhVBYj59+E+q4yjSwvOd23FYqj1XKcgSR+j305dxo9jTAndLV
 M/f8o8au3tlccE308OQ/XgIkXFBlGLjrNlak32ze4P2nHFMTlYJBCbAFIti3qcpT
 Er7q866s7kzagphmZ/2vf0I7JuYI2le/8OnneVFz7l+SVXY1SNEpA7qRM5bA220J
 n2Orfaen3CgotX7c3yT/XP0c9Wqbd29LMXiQbOdGMMONv5O4aaWMsfRPhlCEYiTK
 VZ/FklfXrVxtSqa96zBny4MlvtKoYwYklk+McuGF5M4iDXGCthw=
 =kAho
 -----END PGP SIGNATURE-----

Merge tag 'x86_bugs_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 CPU mitigation updates from Borislav Petkov:

 - Convert the tsx= cmdline parsing to use early_param()

 - Cleanup forward declarations gunk in bugs.c

* tag 'x86_bugs_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/bugs: Get rid of the forward declarations
  x86/tsx: Get the tsx= command line parameter with early_param()
  x86/tsx: Make tsx_ctrl_state static
2025-12-02 13:27:09 -08:00
Paolo Bonzini e64dcfab57 KVM x86 misc changes for 6.19:
- Fix an async #PF bug where KVM would clear the completion queue when the
    guest transitioned in and out of paging mode, e.g. when handling an SMI and
    then returning to paged mode via RSM.
 
  - Fix a bug where TDX would effectively corrupt user-return MSR values if the
    TDX Module rejects VP.ENTER and thus doesn't clobber host MSRs as expected.
 
  - Leave the user-return notifier used to restore MSRs registered when
    disabling virtualization, and instead pin kvm.ko.  Restoring host MSRs via
    IPI callback is either pointless (clean reboot) or dangerous (forced reboot)
    since KVM has no idea what code it's interrupting.
 
  - Use the checked version of {get,put}_user(), as Linus wants to kill them
    off, and they're measurably faster on modern CPUs due to the unchecked
    versions containing an LFENCE.
 
  - Fix a long-lurking bug where KVM's lack of catch-up logic for periodic APIC
    timers can result in a hard lockup in the host.
 
  - Revert the periodic kvmclock sync logic now that KVM doesn't use a
    clocksource that's subject to NPT corrections.
 
  - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the latter
    behind CONFIG_CPU_MITIGATIONS.
 
  - Context switch XCR0, XSS, and PKRU outside of the entry/exit fastpath as
    the only reason they were handled in the faspath was to paper of a bug in
    the core #MC code that has long since been fixed.
 
  - Add emulator support for AVX MOV instructions to play nice with emulated
    devices whose PCI BARs guest drivers like to access with large multi-byte
    instructions.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmkmTw0ACgkQOlYIJqCj
 N/1UDg/9GaIrOk+qBiPQhS8jxTfumL+2DSQTWyg2Fm8E6alpar/PgWbhO0+y4iYR
 6vncg04iEFPxUQIB5TpBoD5sLYarrS9uT0HGKNEdA84P2LthCqRTsSAWL9lxey7+
 MmoNN9l1IZfe+rn7nh8oK0UYFzhqa34DO81Vdl7otohf/dNyTf47KXsMllAGdSxX
 9ipF0Xr2C6N5d4r5zBJT38vLV89CjBmydvegxbUo9fsp9tOWWz8dwidjR+ZMjFxE
 HNtj4bLHw0JefCJKN0SlKO3Q4IuNPeFNVu7FRkdb1IJI6OOMoGzAufhPFp21+iwI
 4Ost2Kd1RuMB2rjlLjeVq4ygK0BAKkZuELUfQPEavih7yf3v0yhP3ojjGQphFGxm
 8pSOjxUQ9CBELwfAJqrf92Z9Tpya+Goq3qL/aa1E9p7TSpSU9NxQv2nOuMfGsDxg
 xmSjfUOGA7pBSYH17ORdqytHya4kqlqtI7v3FvTg+zCXFhmU41HVyqWBkONWIVLn
 wBGNcj0fgwhF3Q5aHHNAwa4En+S2jF/7f7WuF/B3EG3DARCZdPvfd5NOq2rs95Pj
 V/VG9AHy5r86y7DAzeu0a0JmxW9mzL0TEZW+lJyU12oz1jDvMUOF38wkWjUDAj/N
 dp2GYndbpsn8tlnAuv6DHBBNKlfCXbqi2fXTymWpsY4GKLULqos=
 =cORg
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-misc-6.19' of https://github.com/kvm-x86/linux into HEAD

KVM x86 misc changes for 6.19:

 - Fix an async #PF bug where KVM would clear the completion queue when the
   guest transitioned in and out of paging mode, e.g. when handling an SMI and
   then returning to paged mode via RSM.

 - Fix a bug where TDX would effectively corrupt user-return MSR values if the
   TDX Module rejects VP.ENTER and thus doesn't clobber host MSRs as expected.

 - Leave the user-return notifier used to restore MSRs registered when
   disabling virtualization, and instead pin kvm.ko.  Restoring host MSRs via
   IPI callback is either pointless (clean reboot) or dangerous (forced reboot)
   since KVM has no idea what code it's interrupting.

 - Use the checked version of {get,put}_user(), as Linus wants to kill them
   off, and they're measurably faster on modern CPUs due to the unchecked
   versions containing an LFENCE.

 - Fix a long-lurking bug where KVM's lack of catch-up logic for periodic APIC
   timers can result in a hard lockup in the host.

 - Revert the periodic kvmclock sync logic now that KVM doesn't use a
   clocksource that's subject to NPT corrections.

 - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the latter
   behind CONFIG_CPU_MITIGATIONS.

 - Context switch XCR0, XSS, and PKRU outside of the entry/exit fastpath as
   the only reason they were handled in the faspath was to paper of a bug in
   the core #MC code that has long since been fixed.

 - Add emulator support for AVX MOV instructions to play nice with emulated
   devices whose PCI BARs guest drivers like to access with large multi-byte
   instructions.
2025-11-26 09:34:21 +01:00
Sean Christopherson f6106d41ec x86/bugs: Use an x86 feature to track the MMIO Stale Data mitigation
Convert the MMIO Stale Data mitigation tracking from a static branch into
an x86 feature flag so that it can be used via ALTERNATIVE_2 in KVM.

No functional change intended.

Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Link: https://patch.msgid.link/20251113233746.1703361-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-18 16:22:42 -08:00
Pawan Gupta aba7de6088 x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well
TSA mitigation:

  d8010d4ba4 ("x86/bugs: Add a Transient Scheduler Attacks mitigation")

introduced VM_CLEAR_CPU_BUFFERS for guests on AMD CPUs. Currently on Intel
CLEAR_CPU_BUFFERS is being used for guests which has a much broader scope
(kernel->user also).

Make mitigations on Intel consistent with TSA. This would help handling the
guest-only mitigations better in future.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
[sean: make CLEAR_CPU_BUF_VM mutually exclusive with the MMIO mitigation]
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Link: https://patch.msgid.link/20251113233746.1703361-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-18 16:22:40 -08:00
Borislav Petkov (AMD) e67997021f x86/bugs: Get rid of the forward declarations
Get rid of the forward declarations of the mitigation functions by
moving their single caller below them.

No functional changes.

Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/20251105200447.GBaQut3w4dLilZrX-z@fat_crate.local
2025-11-14 20:32:21 +01:00
Sean Christopherson 6276c67f2b x86: Restrict KVM-induced symbol exports to KVM modules where obvious/possible
Extend KVM's export macro framework to provide EXPORT_SYMBOL_FOR_KVM(),
and use the helper macro to export symbols for KVM throughout x86 if and
only if KVM will build one or more modules, and only for those modules.

To avoid unnecessary exports when CONFIG_KVM=m but kvm.ko will not be
built (because no vendor modules are selected), let arch code #define
EXPORT_SYMBOL_FOR_KVM to suppress/override the exports.

Note, the set of symbols to restrict to KVM was generated by manual search
and audit; any "misses" are due to human error, not some grand plan.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Tested-by: Kai Huang <kai.huang@intel.com>
Link: https://patch.msgid.link/20251112173944.1380633-5-seanjc%40google.com
2025-11-12 15:29:38 -08:00
Sean Christopherson ed02882460 x86/bugs: Drop unnecessary export of "x86_spec_ctrl_base"
Don't export x86_spec_ctrl_base as it's used only in bugs.c and process.c,
neither of which can be built into a module.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20251112173944.1380633-2-seanjc%40google.com
2025-11-12 15:24:42 -08:00
Andy Shevchenko 84dfce65a7 x86/bugs: Remove dead code which might prevent from building
Clang, in particular, is not happy about dead code:

arch/x86/kernel/cpu/bugs.c:1830:20: error: unused function 'match_option' [-Werror,-Wunused-function]
 1830 | static inline bool match_option(const char *arg, int arglen, const char *opt)
      |                    ^~~~~~~~~~~~
1 error generated.

Remove a leftover from the previous cleanup.

Fixes: 02ac6cc8c5 ("x86/bugs: Simplify SSB cmdline parsing")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20251024125959.1526277-1-andriy.shevchenko%40linux.intel.com
2025-10-24 09:42:00 -07:00
David Kaplan 204ced4108 x86/bugs: Qualify RETBLEED_INTEL_MSG
When retbleed mitigation is disabled, the kernel already prints an info
message that the system is vulnerable.  Recent code restructuring also
inadvertently led to RETBLEED_INTEL_MSG being printed as an error, which is
unnecessary as retbleed mitigation was already explicitly disabled (by config
option, cmdline, etc.).

Qualify this print statement so the warning is not printed unless an actual
retbleed mitigation was selected and is being disabled due to incompatibility
with spectre_v2.

Fixes: e3b78a7ad5 ("x86/bugs: Restructure retbleed mitigation")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220624
Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20251003171936.155391-1-david.kaplan@amd.com
2025-10-21 12:32:28 +02:00
David Kaplan 930f2361fe x86/bugs: Report correct retbleed mitigation status
On Intel CPUs, the default retbleed mitigation is IBRS/eIBRS but this
requires that a similar spectre_v2 mitigation is applied.  If the user
selects a different spectre_v2 mitigation (like spectre_v2=retpoline) a
warning is printed but sysfs will still report 'Mitigation: IBRS' or
'Mitigation: Enhanced IBRS'.  This is incorrect because retbleed is not
mitigated, and IBRS is not actually set.

Fix this by choosing RETBLEED_MITIGATION_NONE in this scenario so the
kernel correctly reports the system as vulnerable to retbleed.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250915134706.3201818-1-david.kaplan@amd.com
2025-09-16 13:32:18 +02:00
David Kaplan d1cc1baef6 x86/bugs: Fix reporting of LFENCE retpoline
The LFENCE retpoline mitigation is not secure but the kernel prints
inconsistent messages about this fact.  The dmesg log says 'Mitigation:
LFENCE', implying the system is mitigated.  But sysfs reports 'Vulnerable:
LFENCE' implying the system (correctly) is not mitigated.

Fix this by printing a consistent 'Vulnerable: LFENCE' string everywhere
when this mitigation is selected.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250915134706.3201818-1-david.kaplan@amd.com
2025-09-16 13:21:21 +02:00
David Kaplan 30ef245c6f x86/bugs: Fix spectre_v2 forcing
There were two oddities with spectre_v2 command line options.

First, any option other than 'off' or 'auto' would force spectre_v2
mitigations even if the CPU (hypothetically) wasn't vulnerable to spectre_v2.
That was inconsistent with all the other bugs where mitigations are ignored
unless an explicit 'force' option is specified.

Second, even though spectre_v2 mitigations would be enabled in these cases,
the X86_BUG_SPECTRE_V2 bit wasn't set.  This is again inconsistent with the
forcing behavior of other bugs and arguably incorrect as it doesn't make sense
to enable a mitigation if the X86_BUG bit isn't set.

Fix both issues by only forcing spectre_v2 mitigations when the
'spectre_v2=on' option is specified (which was already called
SPECTRE_V2_CMD_FORCE) and setting the relevant X86_BUG_* bits in that case.

This also allows for simplifying bhi_update_mitigation() because
spectre_v2_cmd will now always be SPECTRE_V2_CMD_NONE if the CPU is immune to
spectre_v2.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250915134706.3201818-1-david.kaplan@amd.com
2025-09-16 12:59:55 +02:00
David Kaplan 440d20154a x86/bugs: Remove uses of cpu_mitigations_off()
cpu_mitigations_off() is no longer needed because all bugs use attack
vector controls to select a mitigation, and cpu_mitigations_off() is
equivalent to no attack vectors being selected.

Remove the few remaining unnecessary uses of this function in this file.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
2025-09-15 21:10:44 +02:00
David Kaplan 02ac6cc8c5 x86/bugs: Simplify SSB cmdline parsing
Simplify the SSB command line parsing by selecting a mitigation directly, as
is done in most of the simpler vulnerabilities.  Use early_param() instead of
cmdline_find_option() for consistency with the other mitigation selections.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Link: https://lore.kernel.org/r/20250819192200.2003074-4-david.kaplan@amd.com
2025-09-15 18:04:20 +02:00
David Kaplan 9a9f8147ae x86/bugs: Use early_param() for spectre_v2
Most of the mitigations in bugs.c use early_param() for command line parsing.
Rework the spectre_v2 and nospectre_v2 command line options to be
consistent with the others.

Remove spec_v2_print_cond() as informing the user of the their cmdline
choice isn't interesting.

  [ bp: Zap spectre_v2_check_cmd(). ]

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Link: https://lore.kernel.org/r/20250819192200.2003074-3-david.kaplan@amd.com
2025-09-15 17:26:49 +02:00
David Kaplan 8edb9e7711 x86/bugs: Use early_param() for spectre_v2_user
Most of the mitigations in bugs.c use early_param() to parse their command
line options.  Modify spectre_v2_user to use early_param() for consistency.

Remove spec_v2_user_print_cond() because informing a user about their
cmdline choice isn't very interesting and the chosen mitigation is already
printed in spectre_v2_user_update_mitigation().

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Link: https://lore.kernel.org/r/20250819192200.2003074-2-david.kaplan@amd.com
2025-09-15 17:07:22 +02:00
David Kaplan 5799d5d8a6 x86/bugs: Add attack vector controls for VMSCAPE
Use attack vector controls to select whether VMSCAPE requires mitigation,
similar to other bugs.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2025-09-12 23:19:29 +02:00
Linus Torvalds 223ba8ee0a Mitigate VMSCAPE issue with indirect branch predictor flushes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmi58uwACgkQaDWVMHDJ
 krCIBxAAj/8/RBSSK6ULtDLKbmpRKMVpwEE1Yt8vK95Z/50gVSidtQtofIet+CPY
 NeN5Y4Aip3w/JFoIQafop8ZASOFjNjhqVEjE75RdtdDacQCyluqWg/2PrJpKkBVv
 OWTVVVPD9aSZAY0Tk/79ABV8Fbp/EBID5mhJ40GrBhkLZku2ALDj1eQINEjoBedB
 2+sCO1MMqynlmglt8FltwFtl0rHgtlhGviuc/QmsxH9FrLIGBlgciW4Rma+LOtAE
 4iD1Ij/ICuwA78kPAgrxvs+B1w3QGZhTPvOHjj0c9kKM3jBqphWoMWFUKbFfUK8i
 6rM0jZMB8iaUcKJ+Ra+stNmvddLkbya7J9wwHgQWi/kxEMZMxbbbOXwfl1Ya8sha
 n/kKxm8Lsrjex3RTnd1hoXvGY2blr0dZ97jfjgOqVuYBZih5yWzixQbuf3TAbCZO
 Kb+fbfC7EsI1N0zuFh42Q1hT0zxYYshNIxtGPjDwspJRkHvhmNjNswXr7sccXhFo
 P5araDcYN0ul85SlAhQRMB17mle47ETSgh04LRM4Rq3rbweXzghoRj//WcY4YqYS
 qSJEFzSC7hVwNabG+NBexUaZL8bZRMoE7qx5lmo0q+tTMIQkEG2rqrFz9b1d4JON
 g6aKyrD8YyRCoBjZAF0tjCwhQgxSKXGsVwzBYl0+RcY+1Lo1L2U=
 =8wrr
 -----END PGP SIGNATURE-----

Merge tag 'vmscape-for-linus-20250904' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull vmescape mitigation fixes from Dave Hansen:
 "Mitigate vmscape issue with indirect branch predictor flushes.

  vmscape is a vulnerability that essentially takes Spectre-v2 and
  attacks host userspace from a guest. It particularly affects
  hypervisors like QEMU.

  Even if a hypervisor may not have any sensitive data like disk
  encryption keys, guest-userspace may be able to attack the
  guest-kernel using the hypervisor as a confused deputy.

  There are many ways to mitigate vmscape using the existing Spectre-v2
  defenses like IBRS variants or the IBPB flushes. This series focuses
  solely on IBPB because it works universally across vendors and all
  vulnerable processors. Further work doing vendor and model-specific
  optimizations can build on top of this if needed / wanted.

  Do the normal issue mitigation dance:

   - Add the CPU bug boilerplate

   - Add a list of vulnerable CPUs

   - Use IBPB to flush the branch predictors after running guests"

* tag 'vmscape-for-linus-20250904' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vmscape: Add old Intel CPUs to affected list
  x86/vmscape: Warn when STIBP is disabled with SMT
  x86/bugs: Move cpu_bugs_smt_update() down
  x86/vmscape: Enable the mitigation
  x86/vmscape: Add conditional IBPB mitigation
  x86/vmscape: Enumerate VMSCAPE bug
  Documentation/hw-vuln: Add VMSCAPE documentation
2025-09-10 20:52:16 -07:00
David Kaplan 8b3641dfb6 x86/bugs: Add attack vector controls for SSB
Attack vector controls for SSB were missed in the initial attack vector series.
The default mitigation for SSB requires user-space opt-in so it is only
relevant for user->user attacks.  Check with attack vector controls when
the command is auto - i.e., no explicit user selection has been done.

Fixes: 2d31d28746 ("x86/bugs: Define attack vectors relevant for each bug")
Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250819192200.2003074-5-david.kaplan@amd.com
2025-08-27 18:17:12 +02:00
Li RongQing d4932a1b14 x86/bugs: Fix GDS mitigation selecting when mitigation is off
The current GDS mitigation logic incorrectly returns early when the
attack vector mitigation is turned off, which leads to two problems:

1. CPUs without ARCH_CAP_GDS_CTRL support are incorrectly marked with
   GDS_MITIGATION_OFF when they should be marked as
   GDS_MITIGATION_UCODE_NEEDED.

2. The mitigation state checks and locking verification that follow are
   skipped, which means:
   - fail to detect if the mitigation was locked
   - miss the warning when trying to disable a locked mitigation

Remove the early return to ensure proper mitigation state handling. This
allows:
- Proper mitigation classification for non-ARCH_CAP_GDS_CTRL CPUs
- Complete mitigation state verification

This also addresses the failed MSR 0x123 write attempt at boot on
non-ARCH_CAP_GDS_CTRL CPUs:

  unchecked MSR access error: WRMSR to 0x123 (tried to write 0x0000000000000010) at rIP: ... (update_gds_msr)
  Call Trace:
   identify_secondary_cpu
   start_secondary
   common_startup_64
   WARNING: CPU: 1 PID: 0 at arch/x86/kernel/cpu/bugs.c:1053 update_gds_msr

  [ bp: Massage, zap superfluous braces. ]

Fixes: 8c7261abcb ("x86/bugs: Add attack vector controls for GDS")
Suggested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Link: https://lore.kernel.org/20250819023356.2012-1-lirongqing@baidu.com
2025-08-19 10:38:04 +02:00
Pawan Gupta b7cc988723 x86/vmscape: Warn when STIBP is disabled with SMT
Cross-thread attacks are generally harder as they require the victim to be
co-located on a core. However, with VMSCAPE the adversary targets belong to
the same guest execution, that are more likely to get co-located. In
particular, a thread that is currently executing userspace hypervisor
(after the IBPB) may still be targeted by a guest execution from a sibling
thread.

Issue a warning about the potential risk, except when:

- SMT is disabled
- STIBP is enabled system-wide
- Intel eIBRS is enabled (which implies STIBP protection)

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
2025-08-14 10:38:18 -07:00
Pawan Gupta 6449f5baf9 x86/bugs: Move cpu_bugs_smt_update() down
cpu_bugs_smt_update() uses global variables from different mitigations. For
SMT updates it can't currently use vmscape_mitigation that is defined after
it.

Since cpu_bugs_smt_update() depends on many other mitigations, move it
after all mitigations are defined. With that, it can use vmscape_mitigation
in a moment.

No functional change.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
2025-08-14 10:37:48 -07:00
Pawan Gupta 556c1ad666 x86/vmscape: Enable the mitigation
Enable the previously added mitigation for VMscape. Add the cmdline
vmscape={off|ibpb|force} and sysfs reporting.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
2025-08-14 10:37:33 -07:00
Pawan Gupta 2f8f173413 x86/vmscape: Add conditional IBPB mitigation
VMSCAPE is a vulnerability that exploits insufficient branch predictor
isolation between a guest and a userspace hypervisor (like QEMU). Existing
mitigations already protect kernel/KVM from a malicious guest. Userspace
can additionally be protected by flushing the branch predictors after a
VMexit.

Since it is the userspace that consumes the poisoned branch predictors,
conditionally issue an IBPB after a VMexit and before returning to
userspace. Workloads that frequently switch between hypervisor and
userspace will incur the most overhead from the new IBPB.

This new IBPB is not integrated with the existing IBPB sites. For
instance, a task can use the existing speculation control prctl() to
get an IBPB at context switch time. With this implementation, the
IBPB is doubled up: one at context switch and another before running
userspace.

The intent is to integrate and optimize these cases post-embargo.

[ dhansen: elaborate on suboptimal IBPB solution ]

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Sean Christopherson <seanjc@google.com>
2025-08-14 10:37:18 -07:00
David Kaplan 4fa7d880ae x86/bugs: Select best SRSO mitigation
The SRSO bug can theoretically be used to conduct user->user or guest->guest
attacks and requires a mitigation (namely IBPB instead of SBPB on context
switch) for these.  So mark SRSO as being applicable to the user->user and
guest->guest attack vectors.

Additionally, SRSO supports multiple mitigations which mitigate different
potential attack vectors.  Some CPUs are also immune to SRSO from
certain attack vectors (like user->kernel).

Use the specific attack vectors requiring mitigation to select the best
SRSO mitigation to avoid unnecessary performance hits.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250721160310.1804203-1-david.kaplan@amd.com
2025-08-11 17:32:36 +02:00
David Kaplan a026dc61cf x86/bugs: Print enabled attack vectors
Print the status of enabled attack vectors and SMT mitigation status in the
boot log for easier reporting and debugging.  This information will also be
available through sysfs.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-21-david.kaplan@amd.com
2025-07-11 17:56:41 +02:00
David Kaplan 6b21d2f0dc x86/bugs: Add attack vector controls for TSA
Use attack vector controls to determine which TSA mitigation to use.

  [ bp: Simplify the condition in the select function for better
    readability. ]

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250709155844.3279471-1-david.kaplan@amd.com
2025-07-11 17:56:41 +02:00
David Kaplan 0cdd2c4f35 x86/bugs: Add attack vector controls for ITS
Use attack vector controls to determine if ITS mitigation is required.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-19-david.kaplan@amd.com
2025-07-11 17:56:41 +02:00
David Kaplan eda718fde6 x86/bugs: Add attack vector controls for SRSO
Use attack vector controls to determine if SRSO mitigation is required.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-18-david.kaplan@amd.com
2025-07-11 17:56:41 +02:00
David Kaplan 2f970a5269 x86/bugs: Add attack vector controls for L1TF
Use attack vector controls to determine if L1TF mitigation is required.

Disable SMT if cross-thread protection is desired.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-17-david.kaplan@amd.com
2025-07-11 17:56:41 +02:00
David Kaplan fdf99228e2 x86/bugs: Add attack vector controls for spectre_v2
Use attack vector controls to determine if spectre_v2 mitigation is
required.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-16-david.kaplan@amd.com
2025-07-11 17:56:41 +02:00
David Kaplan ddcd4d3cb3 x86/bugs: Add attack vector controls for BHI
Use attack vector controls to determine if BHI mitigation is required.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-15-david.kaplan@amd.com
2025-07-11 17:56:41 +02:00
David Kaplan 07a659edcf x86/bugs: Add attack vector controls for spectre_v2_user
Use attack vector controls to determine if spectre_v2_user mitigation is
required.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-14-david.kaplan@amd.com
2025-07-11 17:56:41 +02:00
David Kaplan 9687eb2399 x86/bugs: Add attack vector controls for retbleed
Use attack vector controls to determine if retbleed mitigation is
required.

Disable SMT if cross-thread protection is desired and STIBP is not
available.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-13-david.kaplan@amd.com
2025-07-11 17:56:41 +02:00
David Kaplan 19a5f3ea43 x86/bugs: Add attack vector controls for spectre_v1
Use attack vector controls to determine if spectre_v1 mitigation is
required.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-12-david.kaplan@amd.com
2025-07-11 17:56:41 +02:00
David Kaplan 8c7261abcb x86/bugs: Add attack vector controls for GDS
Use attack vector controls to determine if GDS mitigation is required.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-11-david.kaplan@amd.com
2025-07-11 17:56:41 +02:00
David Kaplan 71dc301c26 x86/bugs: Add attack vector controls for SRBDS
Use attack vector controls to determine if SRBDS mitigation is required.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-10-david.kaplan@amd.com
2025-07-11 17:56:40 +02:00
David Kaplan 54b53dca65 x86/bugs: Add attack vector controls for RFDS
Use attack vector controls to determine if RFDS mitigation is required.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-9-david.kaplan@amd.com
2025-07-11 17:56:40 +02:00
David Kaplan de6f0921ba x86/bugs: Add attack vector controls for MMIO
Use attack vectors controls to determine if MMIO mitigation is required.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-8-david.kaplan@amd.com
2025-07-11 17:56:40 +02:00
David Kaplan 736565d4ed x86/bugs: Add attack vector controls for TAA
Use attack vector controls to determine if TAA mitigation is required.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-7-david.kaplan@amd.com
2025-07-11 17:56:40 +02:00
David Kaplan e3a88d4c06 x86/bugs: Add attack vector controls for MDS
Use attack vector controls to determine if MDS mitigation is required.
The global mitigations=off command now simply disables all attack vectors
so explicit checking of mitigations=off is no longer needed.

If cross-thread attack mitigations are required, disable SMT.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-6-david.kaplan@amd.com
2025-07-11 17:56:40 +02:00
David Kaplan 2d31d28746 x86/bugs: Define attack vectors relevant for each bug
Add a function which defines which vulnerabilities should be mitigated
based on the selected attack vector controls.  The selections here are
based on the individual characteristics of each vulnerability.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250707183316.1349127-5-david.kaplan@amd.com
2025-07-11 17:56:40 +02:00
Borislav Petkov (AMD) fde494e905 Add the mitigation logic for Transient Scheduler Attacks (TSA)
TSA are new aspeculative side channel attacks related to the execution
 timing of instructions under specific microarchitectural conditions. In
 some cases, an attacker may be able to use this timing information to
 infer data from other contexts, resulting in information leakage.
 
 Add the usual controls of the mitigation and integrate it into the
 existing speculation bugs infrastructure in the kernel.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmhSsvQACgkQEsHwGGHe
 VUrWNw//V+ZabYq3Nnvh4jEe6Altobnpn8bOIWmcBx6I3xuuArb9bLqcbKerDIcC
 POVVW6zrdNigDe/U4aqaJXE7qCRX55uTYbhp8OLH0zzqX3Pjl/hUnEXWtMtlXj/G
 CIM5mqjqEFp5JRGXetdjjuvjG1IPf+CbjKqj2WXbi//T6F3LiAFxkzdUhd+clBF/
 ztWchjwUmqU0WJd6+Smb8ZnvWrLoZuOFldjhFad820B7fqkdJhzjHMmwBHJKUEZu
 oABv8B0/4IALrx6LenCspWS4OuTOGG7DKyIgzitByXygXXb4L3ZUKpuqkxBU7hFx
 bscwtOP7e5HIYAekx6ZSLZoZpYQXr1iH0aRGrjwapi3ASIpUwI0UA9ck2PdGo0IY
 0GvmN0vbybskewBQyG819BM+DCau5pOLWuL7cYmaD2eTNoOHOknMDNlO8VzXqJxa
 NnignSuEWFm2vNV1FXEav2YbVjlanV6JleiPDGBe5Xd9dnxZTvg9HuP2NkYio4dZ
 mb/kEU/kTcN8nWh0Q96tX45kmj0vCbBgrSQkmUpyAugp38n69D1tp3ii9D/hyQFH
 hKGcFC9m+rYVx1NLyAxhTGxaEqF801d5Qawwud8HsnQudTpCdSXD9fcBg9aCbWEa
 FymtDpIeUQrFAjDpVEp6Syh3odKvLXsGEzL+DVvqKDuA8r6DxFo=
 =2cLl
 -----END PGP SIGNATURE-----

Merge tag 'tsa_x86_bugs_for_6.16' into tip-x86-bugs

Pick up TSA changes from mainline so that attack vectors work can
continue ontop.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2025-07-09 18:16:53 +02:00
David Kaplan 98b5dab4d2 x86/bugs: Clean up SRSO microcode handling
SRSO microcode only exists for Zen3/Zen4 CPUs.  For those CPUs, the microcode
is required for any mitigation other than Safe-RET to be effective.  Safe-RET
can still protect user->kernel and guest->host attacks without microcode.

Clarify this in the code and ensure that SRSO_MITIGATION_UCODE_NEEDED is
selected for any mitigation besides Safe-RET if the required microcode isn't
present.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250625155805.600376-4-david.kaplan@amd.com
2025-06-26 13:32:31 +02:00
David Kaplan ff54ae7314 x86/bugs: Use IBPB for retbleed if used by SRSO
If spec_rstack_overflow=ibpb then this mitigates retbleed as well.  This
is relevant for AMD Zen1 and Zen2 CPUs which are vulnerable to both bugs.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: H . Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250625155805.600376-3-david.kaplan@amd.com
2025-06-26 10:56:39 +02:00
David Kaplan 1fd5eb0286 x86/bugs: Add SRSO_MITIGATION_NOSMT
AMD Zen1 and Zen2 CPUs with SMT disabled are not vulnerable to SRSO.

Instead of overloading the X86_FEATURE_SRSO_NO bit to indicate this,
define a separate mitigation to make the code cleaner.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: H . Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250625155805.600376-2-david.kaplan@amd.com
2025-06-26 10:56:39 +02:00
Pawan Gupta ab9f2388e0 x86/bugs: Allow ITS stuffing in eIBRS+retpoline mode also
After a recent restructuring of the ITS mitigation, RSB stuffing can no longer
be enabled in eIBRS+Retpoline mode. Before ITS, retbleed mitigation only
allowed stuffing when eIBRS was not enabled. This was perfectly fine since
eIBRS mitigates retbleed.

However, RSB stuffing mitigation for ITS is still needed with eIBRS. The
restructuring solely relies on retbleed to deploy stuffing, and does not allow
it when eIBRS is enabled. This behavior is different from what was before the
restructuring. Fix it by allowing stuffing in eIBRS+retpoline mode also.

Fixes: 61ab72c2c6 ("x86/bugs: Restructure ITS mitigation")
Closes: https://lore.kernel.org/lkml/20250519235101.2vm6sc5txyoykb2r@desk/
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250611-eibrs-fix-v4-7-5ff86cac6c61@linux.intel.com
2025-06-24 14:12:41 +02:00
Pawan Gupta e2a9c03192 x86/bugs: Remove its=stuff dependency on retbleed
Allow ITS to enable stuffing independent of retbleed. The dependency is only
on retpoline. It is a valid case for retbleed to be mitigated by eIBRS while
ITS deploys stuffing at the same time.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250611-eibrs-fix-v4-6-5ff86cac6c61@linux.intel.com
2025-06-23 12:29:04 +02:00