Windows Server 2016 with Hyper-V enabled fails to boot on OVMF with SMM
(OVMF_CODE-need-smm.fd). Turns out that the SMM emulation code in KVM
does not handle nested virtualization very well, leading to a whole bunch
of issues.
For example, Hyper-V uses descriptor table exiting (SECONDARY_EXEC_DESC),
so when the SMM handler tries to switch from real mode, a VM exit occurs
and is forwarded to a clueless L1.
This series fixes it by switching the vcpu to !guest_mode, i.e. to the L1
state, before entering SMM and then switching back to L2 as part of
emulating the RSM instruction.
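
The idea can be illustrated with a toy model. This is only a sketch, not
code from the series: struct toy_vcpu, toy_enter_smm() and toy_leave_smm()
are hypothetical names, and the single saved flag stands in for whatever
state the real callbacks have to preserve across SMM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; not the kernel's struct kvm_vcpu. */
struct toy_vcpu {
	bool guest_mode;           /* true while L2 is running */
	bool smm;                  /* true while the vCPU is in SMM */
	bool smm_saved_guest_mode; /* assumption: remembers L2 across SMM */
};

/* SMI arrives: remember whether L2 was active, drop to L1 state, enter SMM. */
static void toy_enter_smm(struct toy_vcpu *v)
{
	assert(!v->smm);
	v->smm_saved_guest_mode = v->guest_mode;
	v->guest_mode = false;
	v->smm = true;
}

/* RSM emulation: leave SMM, then resume L2 if it had been interrupted. */
static void toy_leave_smm(struct toy_vcpu *v)
{
	assert(v->smm);
	v->smm = false;
	v->guest_mode = v->smm_saved_guest_mode;
	v->smm_saved_guest_mode = false;
}
```

In this model the SMM handler always runs with guest_mode cleared, so
nothing it does can be turned into a VM exit for L1.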
Patches 1 and 2 are common to both Intel and AMD, patches 3-4 fix Intel,
and patches 5-6 AMD.
v1->v2:
* Moved left_smm detection to emulator_set_hflags (couldn't quite get rid
of the field despite my original claim) (Paolo)
* Moved the kvm_x86_ops->post_leave_smm() call a few statements down so
it really runs after all state has been synced.
* Added the smi_allowed callback (new patch 2) to avoid running into
WARN_ON_ONCE(vmx->nested.nested_run_pending) on Intel.
v2->v3:
* Omitted patch 4 ("KVM: nVMX: save nested EPT information in SMRAM state
save map") and replaced it with ("treat CR4.VMXE as reserved in SMM")
(Paolo)
* Implemented smi_allowed on AMD to support SMI interception. Turns out
Windows needs this when running on >1 vCPU.
* Eliminated internal SMM state on AMD and switched to using the SMM state
save area in guest memory instead (Paolo)
v3->v4:
* Changed the order of operations in enter_smm(), now saving the original
(and potentially L2) state into the SMM state save area.
* Made em_rsm() reload the SMM state save area if post_leave_smm() entered
guest mode. This way, SMM handlers see and may change the actual state
of the vCPU at the point where SMI was injected (Radim)
* In patch 4, switched to a different way of avoiding the problem of hitting
the very check the patch is adding.
v4->v5:
* Removed patch 4 (CR4.VMXE protection in SMM, will be done separately),
patch 3 became 4, and new patch 3 fixes a bug in load_vmcs12_host_state()
which prevented SMM exit to L2 from working without first restoring the
state from the SMM state save area.
* Eliminated the first restore from SMM state save area (Paolo)
* Tweaked the HF_SMM_MASK flag manipulation (Paolo)
Ladi Prosek (6):
KVM: x86: introduce ISA specific SMM entry/exit callbacks
KVM: x86: introduce ISA specific smi_allowed callback
KVM: nVMX: set IDTR and GDTR limits when loading L1 host state
KVM: nVMX: fix SMI injection in guest mode
KVM: nSVM: refactor nested_svm_vmrun
KVM: nSVM: fix SMI injection in guest mode
arch/x86/include/asm/kvm_emulate.h | 2 +
arch/x86/include/asm/kvm_host.h | 7 ++
arch/x86/kvm/emulate.c | 9 ++
arch/x86/kvm/svm.c | 205 +++++++++++++++++++++++++------------
arch/x86/kvm/vmx.c | 79 ++++++++++++--
arch/x86/kvm/x86.c | 20 +++-
6 files changed, 245 insertions(+), 77 deletions(-)