Commit Graph

33 Commits

Author SHA1 Message Date
Bibo Mao d8cc4fee3f LoongArch: KVM: Remove duplicated cache attribute setting
Cache attribute comes from GPA->HPA secondary mmu page table and is
configured when kvm is enabled. It is the same for all VMs, so remove
duplicated cache attribute setting on vCPU context switch.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-02-13 12:02:56 +08:00
Bibo Mao 2737dee106 LoongArch: KVM: Add hypercall service support for usermode VMM
Some VMMs provides special hypercall service in usermode, KVM should not
handle the usermode hypercall service, thus pass it to usermode, let the
usermode VMM handle it.

Here a new code KVM_HCALL_CODE_USER_SERVICE is added for the user-mode
hypercall service, KVM lets all six registers visible to usermode VMM.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-01-13 21:37:17 +08:00
Huacai Chen 589e6cc759 LoongArch: KVM: Protect kvm_check_requests() with SRCU
When we enable lockdep we get such a warning:

 =============================
 WARNING: suspicious RCU usage
 6.12.0-rc7+ #1891 Tainted: G        W
 -----------------------------
 include/linux/kvm_host.h:1043 suspicious rcu_dereference_check() usage!
 other info that might help us debug this:
 rcu_scheduler_active = 2, debug_locks = 1
 1 lock held by qemu-system-loo/948:
  #0: 90000001184a00a8 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0xf4/0xe20 [kvm]
 stack backtrace:
 CPU: 0 UID: 0 PID: 948 Comm: qemu-system-loo Tainted: G        W          6.12.0-rc7+ #1891
 Tainted: [W]=WARN
 Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.0-prebeta9 10/21/2022
 Stack : 0000000000000089 9000000005a0db9c 90000000071519c8 900000012c578000
         900000012c57b920 0000000000000000 900000012c57b928 9000000007e53788
         900000000815bcc8 900000000815bcc0 900000012c57b790 0000000000000001
         0000000000000001 4b031894b9d6b725 0000000004dec000 90000001003299c0
         0000000000000414 0000000000000001 000000000000002d 0000000000000003
         0000000000000030 00000000000003b4 0000000004dec000 90000001184a0000
         900000000806d000 9000000007e53788 00000000000000b4 0000000000000004
         0000000000000004 0000000000000000 0000000000000000 9000000107baf600
         9000000008916000 9000000007e53788 9000000005924778 0000000010000044
         00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d
         ...
 Call Trace:
 [<9000000005924778>] show_stack+0x38/0x180
 [<90000000071519c4>] dump_stack_lvl+0x94/0xe4
 [<90000000059eb754>] lockdep_rcu_suspicious+0x194/0x240
 [<ffff8000022143bc>] kvm_gfn_to_hva_cache_init+0xfc/0x120 [kvm]
 [<ffff80000222ade4>] kvm_pre_enter_guest+0x3a4/0x520 [kvm]
 [<ffff80000222b3dc>] kvm_handle_exit+0x23c/0x480 [kvm]

Fix it by protecting kvm_check_requests() with SRCU.

Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-12-02 16:42:10 +08:00
Xianglai Li c532de5a67 LoongArch: KVM: Add IPI device support
Add device model for IPI interrupt controller, implement basic create &
destroy interfaces, and register device model to kvm device table.

Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-11-13 16:18:27 +08:00
Huacai Chen 73adbd92f3 LoongArch: KVM: Mark hrtimer to expire in hard interrupt context
Like commit 2c0d278f32 ("KVM: LAPIC: Mark hrtimer to expire in hard
interrupt context") and commit 9090825fa9 ("KVM: arm/arm64: Let the
timer expire in hardirq context on RT"), On PREEMPT_RT enabled kernels
unmarked hrtimers are moved into soft interrupt expiry mode by default.
Then the timers are canceled from an preempt-notifier which is invoked
with disabled preemption which is not allowed on PREEMPT_RT.

The timer callback is short so in could be invoked in hard-IRQ context.
So let the timer expire on hard-IRQ context even on -RT.

This fix a "scheduling while atomic" bug for PREEMPT_RT enabled kernels:

 BUG: scheduling while atomic: qemu-system-loo/1011/0x00000002
 Modules linked in: amdgpu rfkill nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ns
 CPU: 1 UID: 0 PID: 1011 Comm: qemu-system-loo Tainted: G        W          6.12.0-rc2+ #1774
 Tainted: [W]=WARN
 Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.0-prebeta9 10/21/2022
 Stack : ffffffffffffffff 0000000000000000 9000000004e3ea38 9000000116744000
         90000001167475a0 0000000000000000 90000001167475a8 9000000005644830
         90000000058dc000 90000000058dbff8 9000000116747420 0000000000000001
         0000000000000001 6a613fc938313980 000000000790c000 90000001001c1140
         00000000000003fe 0000000000000001 000000000000000d 0000000000000003
         0000000000000030 00000000000003f3 000000000790c000 9000000116747830
         90000000057ef000 0000000000000000 9000000005644830 0000000000000004
         0000000000000000 90000000057f4b58 0000000000000001 9000000116747868
         900000000451b600 9000000005644830 9000000003a13998 0000000010000020
         00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d
         ...
 Call Trace:
 [<9000000003a13998>] show_stack+0x38/0x180
 [<9000000004e3ea34>] dump_stack_lvl+0x84/0xc0
 [<9000000003a71708>] __schedule_bug+0x48/0x60
 [<9000000004e45734>] __schedule+0x1114/0x1660
 [<9000000004e46040>] schedule_rtlock+0x20/0x60
 [<9000000004e4e330>] rtlock_slowlock_locked+0x3f0/0x10a0
 [<9000000004e4f038>] rt_spin_lock+0x58/0x80
 [<9000000003b02d68>] hrtimer_cancel_wait_running+0x68/0xc0
 [<9000000003b02e30>] hrtimer_cancel+0x70/0x80
 [<ffff80000235eb70>] kvm_restore_timer+0x50/0x1a0 [kvm]
 [<ffff8000023616c8>] kvm_arch_vcpu_load+0x68/0x2a0 [kvm]
 [<ffff80000234c2d4>] kvm_sched_in+0x34/0x60 [kvm]
 [<9000000003a749a0>] finish_task_switch.isra.0+0x140/0x2e0
 [<9000000004e44a70>] __schedule+0x450/0x1660
 [<9000000004e45cb0>] schedule+0x30/0x180
 [<ffff800002354c70>] kvm_vcpu_block+0x70/0x120 [kvm]
 [<ffff800002354d80>] kvm_vcpu_halt+0x60/0x3e0 [kvm]
 [<ffff80000235b194>] kvm_handle_gspr+0x3f4/0x4e0 [kvm]
 [<ffff80000235f548>] kvm_handle_exit+0x1c8/0x260 [kvm]

Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-10-23 22:15:44 +08:00
Bibo Mao cdc118f802 LoongArch: KVM: Enable paravirt feature control from VMM
Export kernel paravirt features to user space, so that VMM can control
each single paravirt feature. By default paravirt features will be the
same with kvm supported features if VMM does not set it.

Also a new feature KVM_FEATURE_VIRT_EXTIOI is added which can be set
from user space. This feature indicates that the virt EIOINTC can route
interrupts to 256 vCPUs, rather than 4 vCPUs like with real HW.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-09-12 20:53:40 +08:00
Song Gao f4e40ea9f7 LoongArch: KVM: Add PMU support for guest
On LoongArch, the host and guest have their own PMU CSRs registers and
they share PMU hardware resources. A set of PMU CSRs consists of a CTRL
register and a CNTR register. We can set which PMU CSRs are used by the
guest by writing to the GCFG register [24:26] bits.

On KVM side:
- Save the host PMU CSRs into structure kvm_context.
- If the host supports the PMU feature.
  - When entering guest mode, save the host PMU CSRs and restore the guest PMU CSRs.
  - When exiting guest mode, save the guest PMU CSRs and restore the host PMU CSRs.

Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-09-12 20:53:40 +08:00
Bibo Mao acc7f20d54 LoongArch: KVM: Add vm migration support for LBT registers
Every vcpu has separate LBT registers. And there are four scr registers,
one flags and ftop register for LBT extension. When VM migrates, VMM
needs to get LBT registers for every vcpu.

Here macro KVM_REG_LOONGARCH_LBT is added for new vcpu lbt register type,
the following macro is added to get/put LBT registers.
  KVM_REG_LOONGARCH_LBT_SCR0
  KVM_REG_LOONGARCH_LBT_SCR1
  KVM_REG_LOONGARCH_LBT_SCR2
  KVM_REG_LOONGARCH_LBT_SCR3
  KVM_REG_LOONGARCH_LBT_EFLAGS
  KVM_REG_LOONGARCH_LBT_FTOP

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-09-11 23:26:32 +08:00
Bibo Mao b67ee19a90 LoongArch: KVM: Add Binary Translation extension support
Loongson Binary Translation (LBT) is used to accelerate binary translation,
which contains 4 scratch registers (scr0 to scr3), x86/ARM eflags (eflags)
and x87 fpu stack pointer (ftop).

Like FPU extension, here a lazy enabling method is used for LBT. the LBT
context is saved/restored on the vcpu context switch path.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-09-11 23:26:32 +08:00
Bibo Mao a53f48b632 LoongArch: KVM: Add VM feature detection function
Loongson SIMD Extension (LSX), Loongson Advanced SIMD Extension (LASX)
and Loongson Binary Translation (LBT) features are defined in register
CPUCFG2. Two kinds of LSX/LASX/LBT feature detection are added here, one
is VCPU feature, and the other is VM feature. VCPU feature dection can
only work with VCPU thread itself, and requires VCPU thread is created
already. So LSX/LASX/LBT feature detection for VM is added also, it can
be done even if VM is not created, and also can be done by any threads
besides VCPU threads.

Here ioctl command KVM_HAS_DEVICE_ATTR is added for VM, and macro
KVM_LOONGARCH_VM_FEAT_CTRL is added to check supported feature. And
five sub-features relative with LSX/LASX/LBT are added as following:
 KVM_LOONGARCH_VM_FEAT_LSX
 KVM_LOONGARCH_VM_FEAT_LASX
 KVM_LOONGARCH_VM_FEAT_X86BT
 KVM_LOONGARCH_VM_FEAT_ARMBT
 KVM_LOONGARCH_VM_FEAT_MIPSBT

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-09-11 23:26:32 +08:00
Bibo Mao 4956e07f05 LoongArch: KVM: Invalidate guest steal time address on vCPU reset
If ParaVirt steal time feature is enabled, there is a percpu gpa address
passed from guest vCPU and host modifies guest memory space with this gpa
address. When vCPU is reset normally, it will notify host and invalidate
gpa address.

However if VM is crashed and VMM reboots VM forcely, the vCPU reboot
notification callback will not be called in VM. Host needs invalidate
the gpa address, else host will modify guest memory during VM reboots.
Here it is invalidated from the vCPU KVM_REG_LOONGARCH_VCPU_RESET ioctl
interface.

Also funciton kvm_reset_timer() is removed at vCPU reset stage, since SW
emulated timer is only used in vCPU block state. When a vCPU is removed
from the block waiting queue, kvm_restore_timer() is called and SW timer
is cancelled. And the timer register is also cleared at VMM when a vCPU
is reset.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-08-26 23:11:32 +08:00
Paolo Bonzini 86014c1e20 KVM generic changes for 6.11
- Enable halt poll shrinking by default, as Intel found it to be a clear win.
 
  - Setup empty IRQ routing when creating a VM to avoid having to synchronize
    SRCU when creating a split IRQCHIP on x86.
 
  - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
    that arch code can use for hooking both sched_in() and sched_out().
 
  - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
    truncating a bogus value from userspace, e.g. to help userspace detect bugs.
 
  - Mark a vCPU as preempted if and only if it's scheduled out while in the
    KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
    memory when retrieving guest state during live migration blackout.
 
  - A few minor cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmaRuOYACgkQOlYIJqCj
 N/1UnQ/8CI5Qfr+/0gzYgtWmtEMczGG+rMNpzD3XVqPjJjXcMcBiQnplnzUVLhha
 vlPdYVK7vgmEt003XGzV55mik46LHL+DX/v4hI3HEdblfyCeNLW3fKEWVRB44qJe
 o+YUQwSK42SORUp9oXuQINxhA//U9EnI7CQxlJ8w8wenv5IJKfIGr01DefmfGPAV
 PKm9t6WLcNqvhZMEyy/zmzM3KVPCJL0NcwI97x6sHxFpQYIDtL0E/VexA4AFqMoT
 QK7cSDC/2US41Zvem/r/GzM/ucdF6vb9suzZYBohwhxtVhwJe2CDeYQZvtNKJ1U7
 GOHPaKL6nBWdZCm/yyWbbX2nstY1lHqxhN3JD0X8wqU5rNcwm2b8Vfyav0Ehc7H+
 jVbDTshOx4YJmIgajoKjgM050rdBK59TdfVL+l+AAV5q/TlHocalYtvkEBdGmIDg
 2td9UHSime6sp20vQfczUEz4bgrQsh4l2Fa/qU2jFwLievnBw0AvEaMximkSGMJe
 b8XfjmdTjlOesWAejANKtQolfrq14+1wYw0zZZ8PA+uNVpKdoovmcqSOcaDC9bT8
 GO/NFUvoG+lkcvJcIlo1SSl81SmGLosijwxWfGvFAqsgpR3/3l3dYp0QtztoCNJO
 d3+HnjgYn5o5FwufuTD3eUOXH4AFjG108DH0o25XrIkb2Kymy0o=
 =BalU
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-generic-6.11' of https://github.com/kvm-x86/linux into HEAD

KVM generic changes for 6.11

 - Enable halt poll shrinking by default, as Intel found it to be a clear win.

 - Setup empty IRQ routing when creating a VM to avoid having to synchronize
   SRCU when creating a split IRQCHIP on x86.

 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
   that arch code can use for hooking both sched_in() and sched_out().

 - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
   truncating a bogus value from userspace, e.g. to help userspace detect bugs.

 - Mark a vCPU as preempted if and only if it's scheduled out while in the
   KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
   memory when retrieving guest state during live migration blackout.

 - A few minor cleanups
2024-07-16 09:51:36 -04:00
Bibo Mao b4ba157044 LoongArch: KVM: Add PV steal time support in host side
Add ParaVirt steal time feature in host side, VM can search supported
features provided by KVM hypervisor, a feature KVM_FEATURE_STEAL_TIME
is added here. Like x86, steal time structure is saved in guest memory,
one hypercall function KVM_HCALL_FUNC_NOTIFY is added to notify KVM to
enable this feature.

One CPU attr ioctl command KVM_LOONGARCH_VCPU_PVTIME_CTRL is added to
save and restore the base address of steal time structure when a VM is
migrated.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-07-09 16:25:51 +08:00
Bibo Mao b5d4e2325d LoongArch: KVM: Delay secondary mmu tlb flush until guest entry
With hardware assisted virtualization, there are two level HW mmu, one
is GVA to GPA mapping, the other is GPA to HPA mapping which is called
secondary mmu in generic. If there is page fault for secondary mmu,
there needs tlb flush operation indexed with fault GPA address and VMID.
VMID is stored at register CSR.GSTAT and will be reload or recalculated
before guest entry.

Currently CSR.GSTAT is not saved and restored during VCPU context
switch, instead it is recalculated during guest entry. So CSR.GSTAT is
effective only when a VCPU runs in guest mode, however it may not be
effective if the VCPU exits to host mode. Since register CSR.GSTAT may
be stale, it may records the VMID of the last schedule-out VCPU, rather
than the current VCPU.

Function kvm_flush_tlb_gpa() should be called with its real VMID, so
here move it to the guest entrance. Also an arch-specific request id
KVM_REQ_TLB_FLUSH_GPA is added to flush tlb for secondary mmu, and it
can be optimized if VMID is updated, since all guest tlb entries will
be invalid if VMID is updated.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-07-09 16:25:50 +08:00
Bibo Mao e306e51490 LoongArch: KVM: Sync pending interrupt when getting ESTAT from user mode
Currently interrupts are posted and cleared with the asynchronous mode,
meanwhile they are saved in SW state vcpu::arch::irq_pending and vcpu::
arch::irq_clear. When vcpu is ready to run, pending interrupt is written
back to CSR.ESTAT register from SW state vcpu::arch::irq_pending at the
guest entrance.

During VM migration stage, vcpu is put into stopped state, however
pending interrupts are not synced to CSR.ESTAT register. So there will
be interrupt lost when VCPU is migrated to another host machines.

Here in this patch when ESTAT CSR register is read from VMM user mode,
pending interrupts are synchronized to ESTAT also. So that VMM can get
correct pending interrupts.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-07-09 16:25:50 +08:00
David Matlack a6816314af KVM: Introduce vcpu->wants_to_run
Introduce vcpu->wants_to_run to indicate when a vCPU is in its core run
loop, i.e. when the vCPU is running the KVM_RUN ioctl and immediate_exit
was not set.

Replace all references to vcpu->run->immediate_exit with
!vcpu->wants_to_run to avoid TOCTOU races with userspace. For example, a
malicious userspace could invoked KVM_RUN with immediate_exit=true and
then after KVM reads it to set wants_to_run=false, flip it to false.
This would result in the vCPU running in KVM_RUN with
wants_to_run=false. This wouldn't cause any real bugs today but is a
dangerous landmine.

Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20240503181734.1467938-2-dmatlack@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-18 09:20:01 -07:00
Bibo Mao 163e9fc695 LoongArch: KVM: Add software breakpoint support
When VM runs in kvm mode, system will not exit to host mode when
executing a general software breakpoint instruction such as INSN_BREAK,
trap exception happens in guest mode rather than host mode. In order to
debug guest kernel on host side, one mechanism should be used to let VM
exit to host mode.

Here a hypercall instruction with a special code is used for software
breakpoint usage. VM exits to host mode and kvm hypervisor identifies
the special hypercall code and sets exit_reason with KVM_EXIT_DEBUG. And
then let qemu handle it.

Idea comes from ppc kvm, one api KVM_REG_LOONGARCH_DEBUG_INST is added
to get the hypercall code. VMM needs get sw breakpoint instruction with
this api and set the corresponding sw break point for guest kernel.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-05-06 22:00:47 +08:00
Bibo Mao e33bda7ee5 LoongArch: KVM: Add PV IPI support on host side
On LoongArch system, IPI hw uses iocsr registers. There are one iocsr
register access on IPI sending, and two iocsr access on IPI receiving
for the IPI interrupt handler. In VM mode all iocsr accessing will cause
VM to trap into hypervisor. So with one IPI hw notification there will
be three times of trap.

In this patch PV IPI is added for VM, hypercall instruction is used for
IPI sender, and hypervisor will inject an SWI to the destination vcpu.
During the SWI interrupt handler, only CSR.ESTAT register is written to
clear irq. CSR.ESTAT register access will not trap into hypervisor, so
with PV IPI supported, there is one trap with IPI sender, and no trap
with IPI receiver, there is only one trap with IPI notification.

Also this patch adds IPI multicast support, the method is similar with
x86. With IPI multicast support, IPI notification can be sent to at
most 128 vcpus at one time. It greatly reduces the times of trapping
into hypervisor.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-05-06 22:00:47 +08:00
Bibo Mao 73516e9da5 LoongArch: KVM: Add vcpu mapping from physical cpuid
Physical CPUID is used for interrupt routing for irqchips such as ipi,
msgint and eiointc interrupt controllers. Physical CPUID is stored at
the CSR register LOONGARCH_CSR_CPUID, it can not be changed once vcpu
is created and the physical CPUIDs of two vcpus cannot be the same.

Different irqchips have different size declaration about physical CPUID,
the max CPUID value for CSR LOONGARCH_CSR_CPUID on Loongson-3A5000 is
512, the max CPUID supported by IPI hardware is 1024, while for eiointc
irqchip is 256, and for msgint irqchip is 65536.

The smallest value from all interrupt controllers is selected now, and
the max cpuid size is defines as 256 by KVM which comes from the eiointc
irqchip.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-05-06 22:00:47 +08:00
Bibo Mao aebd3bd586 LoongArch: KVM: Set reserved bits as zero in CPUCFG
Supported CPUCFG information comes from function _kvm_get_cpucfg_mask().
A bit should be zero if it is reserved by HW or if it is not supported
by KVM.

Also LoongArch software page table walk feature defined in CPUCFG2_LSPW
is supported by KVM, it should be enabled by default.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-03-06 09:12:13 +08:00
WANG Xuerui f0f5c4894f LoongArch: KVM: Streamline kvm_check_cpucfg() and improve comments
All the checks currently done in kvm_check_cpucfg can be realized with
early returns, so just do that to avoid extra cognitive burden related
to the return value handling.

While at it, clean up comments of _kvm_get_cpucfg_mask() and
kvm_check_cpucfg(), by removing comments that are merely restatement of
the code nearby, and paraphrasing the rest so they read more natural for
English speakers (that likely are not familiar with the actual Chinese-
influenced grammar).

No functional changes intended.

Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-23 14:36:31 +08:00
WANG Xuerui ec83f39d2b LoongArch: KVM: Rename _kvm_get_cpucfg() to _kvm_get_cpucfg_mask()
The function is not actually a getter of guest CPUCFG, but rather
validation of the input CPUCFG ID plus information about the supported
bit flags of that CPUCFG leaf. So rename it to avoid confusion.

Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-23 14:36:31 +08:00
WANG Xuerui 179af5751a LoongArch: KVM: Fix input validation of _kvm_get_cpucfg() & kvm_check_cpucfg()
The range check for the CPUCFG ID is wrong (should have been a ||
instead of &&) and useless in effect, so fix the obvious mistake.

Furthermore, the juggling of the temp return value is unnecessary,
because it is semantically equivalent and more readable to just
return at every switch case's end. This is done too to avoid potential
bugs in the future related to the unwanted complexity.

Also, the return value of _kvm_get_cpucfg is meant to be checked, but
this was not done, so bad CPUCFG IDs wrongly fall back to the default
case and 0 is incorrectly returned; check the return value to fix the
UAPI behavior.

While at it, also remove the redundant range check in kvm_check_cpucfg,
because out-of-range CPUCFG IDs are already rejected by the -EINVAL
as returned by _kvm_get_cpucfg().

Fixes: db1ecca22e ("LoongArch: KVM: Add LSX (128bit SIMD) support")
Fixes: 118e10cd89 ("LoongArch: KVM: Add LASX (256bit SIMD) support")
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-23 14:36:31 +08:00
Tianrui Zhao 118e10cd89 LoongArch: KVM: Add LASX (256bit SIMD) support
This patch adds LASX (256bit SIMD) support for LoongArch KVM.

There will be LASX exception in KVM when guest use the LASX instructions.
KVM will enable LASX and restore the vector registers for guest and then
return to guest to continue running.

Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-12-19 10:48:28 +08:00
Tianrui Zhao db1ecca22e LoongArch: KVM: Add LSX (128bit SIMD) support
This patch adds LSX (128bit SIMD) support for LoongArch KVM.

There will be LSX exception in KVM when guest use the LSX instructions.
KVM will enable LSX and restore the vector registers for guest and then
return to guest to continue running.

Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-12-19 10:48:28 +08:00
Bibo Mao 1ab9c60994 LoongArch: KVM: Remove kvm_acquire_timer() before entering guest
Timer emulation method in VM is switch to SW timer, there are two
places where timer emulation is needed. One is during vcpu thread
context switch, the other is halt-polling with idle instruction
emulation. SW timer switching is removed during halt-polling mode,
so it is not necessary to disable SW timer before entering to guest.

This patch removes SW timer handling before entering guest mode, and
put it in HW timer restoring flow when vcpu thread is sched-in. With
this patch, vm timer emulation is simpler, there is SW/HW timer
switch only in vcpu thread context switch scenario.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-12-19 10:48:28 +08:00
Bibo Mao 1612673201 LoongArch: KVM: Remove SW timer switch when vcpu is halt polling
With halt-polling supported, there is checking for pending events or
interrupts when vcpu executes idle instruction. Pending interrupts
include injected SW interrupts and passthrough HW interrupts, such as
HW timer interrupts, since HW timer works still even if vcpu exists from
VM mode.

Since HW timer pending interrupt can be set directly with CSR status
register, and pending HW timer interrupt checking is used in vcpu block
checking function, it is not necessary to switch to SW timer during
halt-polling. This patch adds preemption disabling in function
kvm_cpu_has_pending_timer(), and removes SW timer switching in idle
instruction emulation function.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-12-19 10:48:27 +08:00
Tianrui Zhao 93a9a197b6 LoongArch: KVM: Implement misc vcpu related interfaces
1, Implement LoongArch vcpu status description such as idle exits
counter, signal exits counter, cpucfg exits counter, etc.

2, Implement some misc vcpu relaterd interfaces, such as vcpu runnable,
vcpu should kick, vcpu dump regs, etc.

Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-10-02 10:01:28 +08:00
Tianrui Zhao 1f4c39b989 LoongArch: KVM: Implement vcpu load and vcpu put operations
Implement LoongArch vcpu load and vcpu put operations, including load
csr value into hardware and save csr value into vcpu structure.

Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-10-02 10:01:28 +08:00
Tianrui Zhao f45ad5b8aa LoongArch: KVM: Implement vcpu interrupt operations
Implement vcpu interrupt operations such as vcpu set irq and vcpu
clear irq, using set_gcsr_estat() to set irq which is parsed by the
irq bitmap.

Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-10-02 10:01:28 +08:00
Tianrui Zhao 84be4212dc LoongArch: KVM: Implement fpu operations for vcpu
Implement LoongArch fpu related interface for vcpu, such as get fpu, set
fpu, own fpu and lose fpu, etc.

Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-10-02 10:01:28 +08:00
Tianrui Zhao f6deff355b LoongArch: KVM: Implement basic vcpu ioctl interfaces
Implement basic vcpu ioctl interfaces, including:

1, vcpu KVM_ENABLE_CAP ioctl interface.

2, vcpu get registers and set registers operations, it is called when
user space use the ioctl interface to get or set regs.

Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-10-02 10:01:27 +08:00
Tianrui Zhao 2fc3bd86db LoongArch: KVM: Implement basic vcpu interfaces
Implement basic vcpu interfaces, including:

1, vcpu create and destroy interface, saving info into vcpu arch
structure such as vcpu exception entrance, vcpu enter guest pointer,
etc. Init vcpu timer and set address translation mode when vcpu create.

2, vcpu run interface, handling mmio, iocsr reading fault and deliver
interrupt, lose fpu before vcpu enter guest.

3, vcpu handle exit interface, getting the exit code by ESTAT register
and using kvm exception vector to handle it.

Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-10-02 10:01:27 +08:00