aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2011-01-1343-1185/+3078
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (142 commits) KVM: Initialize fpu state in preemptible context KVM: VMX: when entering real mode align segment base to 16 bytes KVM: MMU: handle 'map_writable' in set_spte() function KVM: MMU: audit: allow audit more guests at the same time KVM: Fetch guest cr3 from hardware on demand KVM: Replace reads of vcpu->arch.cr3 by an accessor KVM: MMU: only write protect mappings at pagetable level KVM: VMX: Correct asm constraint in vmcs_load()/vmcs_clear() KVM: MMU: Initialize base_role for tdp mmus KVM: VMX: Optimize atomic EFER load KVM: VMX: Add definitions for more vm entry/exit control bits KVM: SVM: copy instruction bytes from VMCB KVM: SVM: implement enhanced INVLPG intercept KVM: SVM: enhance mov DR intercept handler KVM: SVM: enhance MOV CR intercept handler KVM: SVM: add new SVM feature bit names KVM: cleanup emulate_instruction KVM: move complete_insn_gp() into x86.c KVM: x86: fix CR8 handling KVM guest: Fix kvm clock initialization when it's configured out ...
| * KVM: Initialize fpu state in preemptible contextAvi Kivity2011-01-122-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | init_fpu() (which is indirectly called by the fpu switching code) assumes it is in process context. Rather than makeing init_fpu() use an atomic allocation, which can cause a task to be killed, make sure the fpu is already initialized when we enter the run loop. KVM-Stable-Tag. Reported-and-tested-by: Kirill A. Shutemov <kas@openvz.org> Acked-by: Pekka Enberg <penberg@kernel.org> Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: VMX: when entering real mode align segment base to 16 bytesGleb Natapov2011-01-121-1/+5
| | | | | | | | | | | | | | | | VMX checks that base is equal segment shifted 4 bits left. Otherwise guest entry fails. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: handle 'map_writable' in set_spte() functionXiao Guangrong2011-01-122-11/+4
| | | | | | | | | | | | | | | | | | Move the operation of 'writable' to set_spte() to clean up code [avi: remove unneeded booleanification] Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: audit: allow audit more guests at the same timeXiao Guangrong2011-01-123-30/+39
| | | | | | | | | | | | | | | | | | | | | | | | It only allows to audit one guest in the system since: - 'audit_point' is a glob variable - mmu_audit_disable() is called in kvm_mmu_destroy(), so audit is disabled after a guest exited this patch fix those issues then allow to audit more guests at the same time Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Fetch guest cr3 from hardware on demandAvi Kivity2011-01-125-6/+23
| | | | | | | | | | | | | | | | | | | | Instead of syncing the guest cr3 every exit, which is expensince on vmx with ept enabled, sync it only on demand. [sheng: fix incorrect cr3 seen by Windows XP] Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Replace reads of vcpu->arch.cr3 by an accessorAvi Kivity2011-01-125-20/+27
| | | | | | | | | | | | This allows us to keep cr3 in the VMCS, later on. Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: only write protect mappings at pagetable levelMarcelo Tosatti2011-01-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | If a pagetable contains a writeable large spte, all of its sptes will be write protected, including non-leaf ones, leading to endless pagefaults. Do not write protect pages above PT_PAGE_TABLE_LEVEL, as the spte fault paths assume non-leaf sptes are writable. Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: VMX: Correct asm constraint in vmcs_load()/vmcs_clear()Avi Kivity2011-01-121-2/+2
| | | | | | | | | | | | | | | | 'error' is byte sized, so use a byte register constraint. Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: MMU: Initialize base_role for tdp mmusAvi Kivity2011-01-121-0/+1
| | | | | | | | | | Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: VMX: Optimize atomic EFER loadAvi Kivity2011-01-121-0/+30
| | | | | | | | | | | | | | | | | | When NX is enabled on the host but not on the guest, we use the entry/exit msr load facility, which is slow. Optimize it to use entry/exit efer load, which is ~1200 cycles faster. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: VMX: Add definitions for more vm entry/exit control bitsAvi Kivity2011-01-121-0/+8
| | | | | | | | | | Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: SVM: copy instruction bytes from VMCBAndre Przywara2011-01-128-15/+26
| | | | | | | | | | | | | | | | | | | | | | In case of a nested page fault or an intercepted #PF newer SVM implementations provide a copy of the faulting instruction bytes in the VMCB. Use these bytes to feed the instruction emulator and avoid the costly guest instruction fetch in this case. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: SVM: implement enhanced INVLPG interceptAndre Przywara2011-01-121-1/+6
| | | | | | | | | | | | | | | | | | | | When the DecodeAssist feature is available, the linear address is provided in the VMCB on INVLPG intercepts. Use it directly to avoid any decoding and emulation. This is only useful for shadow paging, though. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: SVM: enhance mov DR intercept handlerAndre Przywara2011-01-121-16/+40
| | | | | | | | | | | | | | | | | | | | Newer SVM implementations provide the GPR number in the VMCB, so that the emulation path is no longer necesarry to handle debug register access intercepts. Implement the handling in svm.c and use it when the info is provided. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: SVM: enhance MOV CR intercept handlerAndre Przywara2011-01-122-11/+81
| | | | | | | | | | | | | | | | | | | | Newer SVM implementations provide the GPR number in the VMCB, so that the emulation path is no longer necesarry to handle CR register access intercepts. Implement the handling in svm.c and use it when the info is provided. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: SVM: add new SVM feature bit namesAndre Przywara2011-01-121-0/+4
| | | | | | | | | | | | | | | | the recent APM Vol.2 and the recent AMD CPUID specification describe new CPUID features bits for SVM. Name them here for later usage. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: cleanup emulate_instructionAndre Przywara2011-01-125-22/+28
| | | | | | | | | | | | | | | | | | | | emulate_instruction had many callers, but only one used all parameters. One parameter was unused, another one is now hidden by a wrapper function (required for a future addition anyway), so most callers use now a shorter parameter list. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: move complete_insn_gp() into x86.cAndre Przywara2011-01-123-12/+15
| | | | | | | | | | | | | | | | move the complete_insn_gp() helper function out of the VMX part into the generic x86 part to make it usable by SVM. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: x86: fix CR8 handlingAndre Przywara2011-01-124-16/+15
| | | | | | | | | | | | | | | | | | | | The handling of CR8 writes in KVM is currently somewhat cumbersome. This patch makes it look like the other CR register handlers and fixes a possible issue in VMX, where the RIP would be incremented despite an injected #GP. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM guest: Fix kvm clock initialization when it's configured outAvi Kivity2011-01-121-0/+2
| | | | | | | | Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Take missing slots_lock for kvm_io_bus_unregister_dev()Takuya Yoshikawa2011-01-122-0/+6
| | | | | | | | | | | | | | | | In KVM_CREATE_IRQCHIP, kvm_io_bus_unregister_dev() is called without taking slots_lock in the error handling path. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: return true when user space query KVM_CAP_USER_NMI extensionLai Jiangshan2011-01-121-0/+1
| | | | | | | | | | | | | | userspace may check this extension in runtime. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Correct kvm_pio tracepoint count fieldAvi Kivity2011-01-121-2/+2
| | | | | | | | | | | | Currently, we record '1' for count regardless of the real count. Fix. Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: Fix incorrect direct page write protection due to ro host pageAvi Kivity2011-01-121-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If KVM sees a read-only host page, it will map it as read-only to prevent breaking a COW. However, if the page was part of a large guest page, KVM incorrectly extends the write protection to the entire large page frame instead of limiting it to the normal host page. This results in the instantiation of a new shadow page with read-only access. If this happens for a MOVS instruction that moves memory between two normal pages, within a single large page frame, and mapped within the guest as a large page, and if, in addition, the source operand is not writeable in the host (perhaps due to KSM), then KVM will instantiate a read-only direct shadow page, instantiate an spte for the source operand, then instantiate a new read/write direct shadow page and instantiate an spte for the destination operand. Since these two sptes are in different shadow pages, MOVS will never see them at the same time and the guest will not make progress. Fix by mapping the direct shadow page read/write, and only marking the host page read-only. Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Fix build error on s390 due to missing tlbs_dirtyAvi Kivity2011-01-121-1/+1
| | | | | | | | | | | | Make it available for all archs. Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add xsetbv interceptJoerg Roedel2011-01-122-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the xsetbv intercept to the AMD part of KVM. This makes AVX usable in a save way for the guest on AVX capable AMD hardware. The patch is tested by using AVX in the guest and host in parallel and checking for data corruption. I also used the KVM xsave unit-tests and they all pass. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: Make the way of accessing lpage_info more genericTakuya Yoshikawa2011-01-122-33/+31
| | | | | | | | | | | | | | | | | | | | | | | | Large page information has two elements but one of them, write_count, alone is accessed by a helper function. This patch replaces this helper function with more generic one which returns newly named kvm_lpage_info structure and use it to access the other element rmap_pde. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: VMX: add module parameter to avoid trapping HLT instructions (v5)Anthony Liguori2011-01-122-2/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | In certain use-cases, we want to allocate guests fixed time slices where idle guest cycles leave the machine idling. There are many approaches to achieve this but the most direct is to simply avoid trapping the HLT instruction which lets the guest directly execute the instruction putting the processor to sleep. Introduce this as a module-level option for kvm-vmx.ko since if you do this for one guest, you probably want to do it for all. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Implement Flush-By-Asid featureJoerg Roedel2011-01-122-2/+10
| | | | | | | | | | | | | | | | This patch adds the new flush-by-asid of upcoming AMD processors to the KVM-AMD module. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Use svm_flush_tlb instead of force_new_asidJoerg Roedel2011-01-121-12/+7
| | | | | | | | | | | | | | | | | | | | This patch replaces all calls to force_new_asid which are intended to flush the guest-tlb by the more appropriate function svm_flush_tlb. As a side-effect the force_new_asid function is removed. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Remove flush_guest_tlb functionJoerg Roedel2011-01-121-5/+0
| | | | | | | | | | | | | | | | This function is unused and there is svm_flush_tlb which does the same. So this function can be removed. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: retry #PF for softmmuXiao Guangrong2011-01-124-6/+17
| | | | | | | | | | | | | | | | Retry #PF for softmmu only when the current vcpu has the same cr3 as the time when #PF occurs Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: fix accessed bit set on prefault pathXiao Guangrong2011-01-121-4/+6
| | | | | | | | | | | | | | Retry #PF is the speculative path, so don't set the accessed bit Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: rename 'no_apf' to 'prefault'Xiao Guangrong2011-01-123-12/+13
| | | | | | | | | | | | | | | | It's the speculative path if 'no_apf = 1' and we will specially handle this speculative path in the later patch, so 'prefault' is better to fit the sense. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add clean-bit for LBR stateJoerg Roedel2011-01-121-0/+2
| | | | | | | | | | | | | | | | | | This patch implements the clean-bit for all LBR related state. This includes the debugctl, br_from, br_to, last_excp_from, and last_excp_to msrs. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add clean-bit for CR2 registerJoerg Roedel2011-01-121-2/+3
| | | | | | | | | | | | | | | | This patch implements the clean-bit for the cr2 register in the vmcb. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add clean-bit for Segements and CPLJoerg Roedel2011-01-121-0/+2
| | | | | | | | | | | | | | | | This patch implements the clean-bit defined for the cs, ds, ss, an es segemnts and the current cpl saved in the vmcb. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add clean-bit for GDT and IDTJoerg Roedel2011-01-121-0/+3
| | | | | | | | | | | | | | | | This patch implements the clean-bit for the base and limit of the gdt and idt in the vmcb. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add clean-bit for DR6 and DR7Joerg Roedel2011-01-121-0/+4
| | | | | | | | | | | | | | | | This patch implements the clean-bit for the dr6 and dr7 debug registers in the vmcb. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add clean-bit for control registersJoerg Roedel2011-01-121-0/+7
| | | | | | | | | | | | | | | | This patch implements the CRx clean-bit for the vmcb. This bit covers cr0, cr3, cr4, and efer. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add clean-bit for NPT stateJoerg Roedel2011-01-121-0/+3
| | | | | | | | | | | | | | | | This patch implements the clean-bit for all nested paging related state in the vmcb. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add clean-bit for interrupt stateJoerg Roedel2011-01-121-1/+7
| | | | | | | | | | | | | | | | | | This patch implements the clean-bit for all interrupt related state in the vmcb. This corresponds to vmcb offset 0x60-0x67. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add clean-bit for the ASIDJoerg Roedel2011-01-121-0/+3
| | | | | | | | | | | | | | | | This patch implements the clean-bit for the asid in the vmcb. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add clean-bit for IOPM_BASE and MSRPM_BASEJoerg Roedel2011-01-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | This patch adds the clean bit for the physical addresses of the MSRPM and the IOPM. It does not need to be set in the code because the only place where these values are changed is the nested-svm vmrun and vmexit path. These functions already mark the complete VMCB as dirty. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add clean-bit for intercetps, tsc-offset and pause filter countJoerg Roedel2011-01-121-0/+7
| | | | | | | | | | | | | | | | | | | | This patch adds the clean-bit for intercepts-vectors, the TSC offset and the pause-filter count to the appropriate places. The IO and MSR permission bitmaps are not subject to this bit. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: SVM: Add clean-bits infrastructure codeRoedel, Joerg2011-01-122-1/+33
| | | | | | | | | | | | | | | | This patch adds the infrastructure for the implementation of the individual clean-bits. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: MMU: Avoid dropping accessed bit while removing write accessTakuya Yoshikawa2011-01-121-1/+1
| | | | | | | | | | | | | | | | | | | | One more "KVM: MMU: Don't drop accessed bit while updating an spte." Sptes are accessed by both kvm and hardware. This patch uses update_spte() to fix the way of removing write access. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: VMX: Return 0 from a failed VMREADAvi Kivity2011-01-121-2/+2
| | | | | | | | | | | | | | | | | | If we execute VMREAD during reboot we'll just skip over it. Instead of returning garbage, return 0, which has a much smaller chance of confusing the code. Otherwise we risk a flood of debug printk()s which block the reboot process if a serial console or netconsole is enabled. Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Don't spin on virt instruction faults during rebootAvi Kivity2011-01-122-11/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | Since vmx blocks INIT signals, we disable virtualization extensions during reboot. This leads to virtualization instructions faulting; we trap these faults and spin while the reboot continues. Unfortunately spinning on a non-preemptible kernel may block a task that reboot depends on; this causes the reboot to hang. Fix by skipping over the instruction and hoping for the best. Signed-off-by: Avi Kivity <avi@redhat.com>