From owner-svn-src-projects@FreeBSD.ORG Fri May 20 03:23:09 2011 Return-Path: Delivered-To: svn-src-projects@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7EC0D106566C; Fri, 20 May 2011 03:23:09 +0000 (UTC) (envelope-from neel@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c]) by mx1.freebsd.org (Postfix) with ESMTP id 6DEBB8FC15; Fri, 20 May 2011 03:23:09 +0000 (UTC) Received: from svn.freebsd.org (localhost [127.0.0.1]) by svn.freebsd.org (8.14.4/8.14.4) with ESMTP id p4K3N9At064793; Fri, 20 May 2011 03:23:09 GMT (envelope-from neel@svn.freebsd.org) Received: (from neel@localhost) by svn.freebsd.org (8.14.4/8.14.4/Submit) id p4K3N9AS064788; Fri, 20 May 2011 03:23:09 GMT (envelope-from neel@svn.freebsd.org) Message-Id: <201105200323.p4K3N9AS064788@svn.freebsd.org> From: Neel Natu Date: Fri, 20 May 2011 03:23:09 +0000 (UTC) To: src-committers@freebsd.org, svn-src-projects@freebsd.org X-SVN-Group: projects MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: Subject: svn commit: r222112 - projects/bhyve/sys/amd64/vmm/intel X-BeenThere: svn-src-projects@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the src " projects" tree" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 03:23:09 -0000 Author: neel Date: Fri May 20 03:23:09 2011 New Revision: 222112 URL: http://svn.freebsd.org/changeset/base/222112 Log: Fix a long standing bug in VMXCTX_GUEST_RESTORE(). There was an assumption by the "callers" of this macro that on "return" the %rsp will be pointing to the 'vmxctx'. The macro was not doing this and thus when trying to restore host state on an error from "vmlaunch" or "vmresume" we were treating the memory locations on the host stack as 'struct vmxctx'. This led to all sorts of weird bugs like double faults or invalid instruction faults. This bug is exposed by the -O2 option used to compile the kernel module. With the -O2 flag the compiler will optimize the following piece of code: int loopstart = 1; ... if (loopstart) { loopstart = 0; vmx_launch(); } else vmx_resume(); into this: vmx_launch(); Since vmx_launch() and vmx_resume() are declared to be __dead2 functions the compiler is free to do this. The compiler has no way to know that the functions return indirectly through vmx_setjmp(). This optimization in turn leads us to trigger the bug in VMXCTX_GUEST_RESTORE(). With this change we can boot a 8.1 guest on a 9.0 host. Reported by: jhb@ Modified: projects/bhyve/sys/amd64/vmm/intel/vmx.c projects/bhyve/sys/amd64/vmm/intel/vmx.h projects/bhyve/sys/amd64/vmm/intel/vmx_genassym.c projects/bhyve/sys/amd64/vmm/intel/vmx_support.S Modified: projects/bhyve/sys/amd64/vmm/intel/vmx.c ============================================================================== --- projects/bhyve/sys/amd64/vmm/intel/vmx.c Fri May 20 02:08:05 2011 (r222111) +++ projects/bhyve/sys/amd64/vmm/intel/vmx.c Fri May 20 03:23:09 2011 (r222112) @@ -1189,7 +1189,7 @@ vmx_exit_process(struct vmx *vmx, int vc static int vmx_run(void *arg, int vcpu, register_t rip, struct vm_exit *vmexit) { - int error, vie, rc, handled, astpending, loopstart; + int error, vie, rc, handled, astpending; uint32_t exit_reason; struct vmx *vmx; struct vmxctx *vmxctx; @@ -1198,7 +1198,7 @@ vmx_run(void *arg, int vcpu, register_t vmx = arg; vmcs = &vmx->vmcs[vcpu]; vmxctx = &vmx->ctx[vcpu]; - loopstart = 1; + vmxctx->launched = 0; /* * XXX Can we avoid doing this every time we do a vm run? @@ -1232,8 +1232,8 @@ vmx_run(void *arg, int vcpu, register_t #endif switch (rc) { case VMX_RETURN_DIRECT: - if (loopstart) { - loopstart = 0; + if (vmxctx->launched == 0) { + vmxctx->launched = 1; vmx_launch(vmxctx); } else vmx_resume(vmxctx); Modified: projects/bhyve/sys/amd64/vmm/intel/vmx.h ============================================================================== --- projects/bhyve/sys/amd64/vmm/intel/vmx.h Fri May 20 02:08:05 2011 (r222111) +++ projects/bhyve/sys/amd64/vmm/intel/vmx.h Fri May 20 03:23:09 2011 (r222112) @@ -34,6 +34,9 @@ #define GUEST_MSR_MAX_ENTRIES 64 /* arbitrary */ struct vmxctx { + register_t tmpstk[32]; /* vmx_return() stack */ + register_t tmpstktop; + register_t guest_rdi; /* Guest state */ register_t guest_rsi; register_t guest_rdx; @@ -63,6 +66,7 @@ struct vmxctx { * XXX todo debug registers and fpu state */ + int launched; /* vmcs launch state */ int launch_error; }; Modified: projects/bhyve/sys/amd64/vmm/intel/vmx_genassym.c ============================================================================== --- projects/bhyve/sys/amd64/vmm/intel/vmx_genassym.c Fri May 20 02:08:05 2011 (r222111) +++ projects/bhyve/sys/amd64/vmm/intel/vmx_genassym.c Fri May 20 03:23:09 2011 (r222112) @@ -43,6 +43,7 @@ __FBSDID("$FreeBSD$"); #include "vmx.h" #include "vmx_cpufunc.h" +ASSYM(VMXCTX_TMPSTKTOP, offsetof(struct vmxctx, tmpstktop)); ASSYM(VMXCTX_GUEST_RDI, offsetof(struct vmxctx, guest_rdi)); ASSYM(VMXCTX_GUEST_RSI, offsetof(struct vmxctx, guest_rsi)); ASSYM(VMXCTX_GUEST_RDX, offsetof(struct vmxctx, guest_rdx)); Modified: projects/bhyve/sys/amd64/vmm/intel/vmx_support.S ============================================================================== --- projects/bhyve/sys/amd64/vmm/intel/vmx_support.S Fri May 20 02:08:05 2011 (r222111) +++ projects/bhyve/sys/amd64/vmm/intel/vmx_support.S Fri May 20 03:23:09 2011 (r222112) @@ -31,15 +31,23 @@ #include "vmx_assym.s" /* - * Assumes that %rdi holds a pointer to the 'vmxctx' + * Assumes that %rdi holds a pointer to the 'vmxctx'. + * + * On "return" all registers are updated to reflect guest state. The two + * exceptions are %rip and %rsp. These registers are atomically switched + * by hardware from the guest area of the vmcs. + * + * We modify %rsp to point to the 'vmxctx' so we can use it to restore + * host context in case of an error with 'vmlaunch' or 'vmresume'. */ #define VMX_GUEST_RESTORE \ /* \ - * Make sure that interrupts are disabled before restoring CR2. \ - * Otherwise there could be a page fault during the interrupt \ - * handler execution that would end up trashing CR2. \ + * Disable interrupts before updating %rsp. The location that \ + * %rsp points to is a 'vmxctx' and not a real stack so we \ + * don't want an interrupt handler to trash it. \ */ \ cli; \ + movq %rdi,%rsp; \ movq VMXCTX_GUEST_CR2(%rdi),%rsi; \ movq %rsi,%cr2; \ movq VMXCTX_GUEST_RSI(%rdi),%rsi; \ @@ -148,6 +156,8 @@ ENTRY(vmx_longjmp) movq %rsp,%rdi movq $VMX_RETURN_LONGJMP,%rsi + + addq $VMXCTX_TMPSTKTOP,%rsp callq vmx_return END(vmx_longjmp) @@ -174,6 +184,8 @@ ENTRY(vmx_resume) /* Return via vmx_setjmp with return value of VMX_RETURN_VMRESUME */ movq %rsp,%rdi movq $VMX_RETURN_VMRESUME,%rsi + + addq $VMXCTX_TMPSTKTOP,%rsp callq vmx_return END(vmx_resume) @@ -200,5 +212,7 @@ ENTRY(vmx_launch) /* Return via vmx_setjmp with return value of VMX_RETURN_VMLAUNCH */ movq %rsp,%rdi movq $VMX_RETURN_VMLAUNCH,%rsi + + addq $VMXCTX_TMPSTKTOP,%rsp callq vmx_return END(vmx_launch)