Date: Sat, 10 May 2008 22:28:53 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Juergen Lock <nox@jelal.kn-bremen.de>
Cc: freebsd-emulation@freebsd.org
Subject: Re: seems I finally found what upset kqemu on amd64 SMP... shared gdt! (please test patch :)
Message-ID: <20080510213519.P3083@besplex.bde.org>
In-Reply-To: <20080509220922.GA13480@saturn.kn-bremen.de>
References: <20080507162713.73A3A5B47@mail.bitblocks.com> <20080508195843.G17500@delplex.bde.org> <20080509220922.GA13480@saturn.kn-bremen.de>

On Sat, 10 May 2008, Juergen Lock wrote:

> On Thu, May 08, 2008 at 09:59:57PM +1000, Bruce Evans wrote:
>> The message in npx.c is actually about violation of an even more
>> fundamental invariant -- the invariant that owning the FPU includes
>> having the TS flag clear so that DNA traps cannot occur.  The bug in
>> kqemu seems to be mismanagement of the TS flag related to this.  I
>> forget if it is the host or the target TS flag that seems to be mismanaged.
>> For the target, it would take a bug in the virtualization of the TS flag
>> to break this invariant (assuming no related bugs in the target kernel).
>>
> Well the `fpcurthread == curthread' bug has been fixed quite a while
> ago already, or do you mean another one?

I didn't know what is already fixed.

>> The message in amd64/machdep.c is about violation of the invariant
>> that the kernel cannot cause DNA traps.  Spurious DNA traps in the
>> ...
>>
> Okay I _think_ I know a little more about this now...  kqemu itself
> doesn't use the fpu, but the guest code it runs can, and in that case the
> DNA trap is just used for (host) lazy fpu context switching like as if the
> code was running in userland regularly.  And I just tested the following
> patch that should get rid of the message by calling fpudna/npxdna directly
> (files/patch-fpucontext is the interesting part:)

This seems reasonable.  Is the following summary of my understanding of
kqemu's implementation of this and your change correct?:
- kqemu runs in kernel mode on the host and needs to have exactly the same
  effect as a DNA exception on the target.
- having exactly the same effect requires calling the host DNA exception
  handler.
- now it uses a software int $7 (dna) to implement the above, but this is
  not permitted in kernel mode (although the software int could be permitted,
  it is hard to distinguish from a hardware exception for unintentional use).
- your change makes it call the DNA trap handler directly.
  This gives the same effect as a permitted software int $7.  It is also
  faster.  It would be better to use an official API for this, but none
  exists.

> ...
> +Index: kqemu-freebsd.c
> +@@ -33,6 +33,11 @@
> +
> + #include <machine/vmparam.h>
> + #include <machine/stdarg.h>
> ++#ifdef __x86_64__
> ++#include <machine/fpu.h>
> ++#else
> ++#include <machine/npx.h>
> ++#endif
> +
> + #include "kqemu-kernel.h"
> +
> +@@ -172,6 +177,15 @@
> + {
> + }
> +
> ++void CDECL kqemu_loadfpucontext(unsigned long cpl)
> ++{
> ++#ifdef __x86_64__
> ++    fpudna();
> ++#else
> ++    npxdna();
> ++#endif
> ++}

Just be sure that the system state is not too different from that of
trap() (directly below a syscall or trap from userland) when this is
called.  Better not have any interrupts disabled or locks held, though
I think npxdna() doesn't care.  The FPU must not be owned already at
this point.

> ++
> + #if __FreeBSD_version < 500000
> + static int
> + curpriority_cmp(struct proc *p)

I guess kqemu duplicates this old mistake instead of calling it because
it is static.  npxdna() is already public so it can be abused easily :-),

Bruce
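
The two rules stated in this message (owning the FPU implies the TS flag
is clear, and the DNA handler must not be entered while the FPU is already
owned) can be illustrated with a toy user-space model.  This is only a
sketch under stated assumptions: `model_cpu`, `model_thread`,
`model_switch()` and `model_dna()` are hypothetical names invented for
illustration, not FreeBSD kernel interfaces, and the real fpudna()/npxdna()
also save and restore FPU register state, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are FreeBSD kernel types. */
struct model_thread {
    int id;
};

struct model_cpu {
    bool ts;                        /* models CR0.TS: when set, the next FPU
                                       instruction raises a DNA trap */
    struct model_thread *fpu_owner; /* models fpcurthread: owner of the live
                                       FPU state, or NULL if nobody owns it */
};

/* Invariant from the message: owning the FPU implies TS is clear. */
bool model_invariant_holds(const struct model_cpu *cpu)
{
    return cpu->fpu_owner == NULL || !cpu->ts;
}

/* Simplified context switch: drop ownership and arm the trap so that the
 * next FPU use by any thread goes through the DNA handler. */
void model_switch(struct model_cpu *cpu)
{
    cpu->fpu_owner = NULL;
    cpu->ts = true;
}

/* Simplified DNA handler.  Returns 0 on success, or -1 if the FPU is
 * already owned, which is the precondition violation the message warns
 * about ("The FPU must not be owned already at this point"). */
int model_dna(struct model_cpu *cpu, struct model_thread *cur)
{
    if (cpu->fpu_owner != NULL)
        return -1;
    cpu->ts = false;        /* corresponds to clearing TS (clts) */
    cpu->fpu_owner = cur;   /* current thread now owns the FPU */
    assert(model_invariant_holds(cpu));
    return 0;
}
```

In this model, calling model_dna() directly behaves the same as taking the
trap, provided the caller has not already taken ownership; a second call
without an intervening model_switch() is rejected.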