Date: Sat, 17 Jun 2006 14:50:52 +0200 From: Juergen Lock <nox@jelal.kn-bremen.de> To: Bruce Evans <bde@zeta.org.au> Cc: freebsd-emulation@FreeBSD.org, nikolas.britton@gmail.com Subject: Re: QEMU: npxdna: fpcurthread == curthread 640962 times Message-ID: <20060617125051.GA38179@saturn.kn-bremen.de> In-Reply-To: <20060617150847.W38142@delplex.bde.org> References: <200606161855.k5GItlP1001526@saturn.kn-bremen.de> <20060617150847.W38142@delplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, Jun 17, 2006 at 04:13:37PM +1000, Bruce Evans wrote: > On Fri, 16 Jun 2006, Juergen Lock wrote: > > >In article <ef10de9a0606131730m7ec33522q2b2273eb3d44d603@mail.gmail.com> > >you write: > >>I'm getting these messages on the console when I run QEMU with kqemu > >>kernel module: > >>npxdna: fpcurthread == curthread 640957 times > >>... > >>What are these and how do I make them stop? > >>[...] > > > >Ok since the previous thread on this, > > http://lists.freebsd.org/pipermail/freebsd-emulation/2006-May/002081.html > >didnt end with a consensus on how to fix it, I've made a band-aid > >patch that simply disables the message for the kqemu case: (which > >_appears_ to be harmless there, maybe because qemu doesnt seem to > >be using the fpu while calling kqemu) > > > >Index: sys/i386/isa/npx.c > >=================================================================== > >RCS file: /home/ncvs/src/sys/i386/isa/npx.c,v > >retrieving revision 1.162.2.1.2.2 > >diff -u -r1.162.2.1.2.2 npx.c > >--- sys/i386/isa/npx.c 30 Apr 2006 05:17:59 -0000 1.162.2.1.2.2 > >+++ sys/i386/isa/npx.c 16 Jun 2006 18:03:23 -0000 > >@@ -756,6 +756,9 @@ > > if (!npx_exists) > > return (0); > > if (PCPU_GET(fpcurthread) == curthread) { > >+#if 1 > >+ if (strcmp(curthread->td_proc->p_comm, "qemu")) > >+#endif > > printf("npxdna: fpcurthread == curthread %d times\n", > > ++err_count); > > stop_emulating(); > > > >Of course I don't ask for something like this be committed, it is just > >for ppl for who the messages cause trouble. Similar patch for amd64 > >(untested): > > This message should be a panic like it used so that the problem gets > fixed. An invariant has been violated. See the other other thread. > > There should be another panic for this, one which occurs if npxdna() is > even called in kernel mode. amd64 has a message instead, and i386 has > nothing, except in my version it has a Debugger() call: > > %%% > Index: trap.c > =================================================================== > RCS file: /home/ncvs/src/sys/i386/i386/trap.c,v > retrieving revision 1.265 > diff -u -2 -r1.265 trap.c > --- trap.c 2 Jun 2004 07:52:33 -0000 1.265 > +++ trap.c 17 Jun 2004 05:38:48 -0000 > @@ -428,4 +478,5 @@ > * registered such use. > */ > + Debugger("T_DNA in kernel mode"); > if (npxdna()) > goto out; > %%% > > Note that Debugger() has rotted in -current. It is now misnamed > kdb_enter() and no longer prints a message when no debugger is configured. > Most uses of it had very little to do with [dgk]db, but (were like the > above one) replacements for panic() in cases you didn't really want > to panic or wanted to debug if a debugger is configured. Such uses > should not exist in production kernels, and it is important for a > message to be printed when no debugger is configured so that the > replacement for panic() doesn't become not even a printf(). > > T_DNA in kernel mode used to cause an unconditional panic too, but in > rev.1.78 of i386/trap.c the panic was made conditional on npxdna() > failing. This was to support using the FPU for optimizing copying and > zeroing on original Pentiums. amd64 never supported this use of the > FPU but still has the above vestige of it. The code in rev.1.78 should > have been > > if (curpcb->pcb_flags & PCB_KERNEL_USING_NPX) { > if (npxdna()) > goto out; > } > break; /* To trap_fatal(). */ > > PCB_KERNEL_USING_NPX should only be set when the kernel is actually using > npx. Your strcmp() acts similarly to a flags test but not so well. > > >And finally a question for the experts: could this be fixed for real > >once the dfly fpu patch mentioned in the `Adding optimized kernel > >copying support' thread, > > http://lists.freebsd.org/pipermail/freebsd-arch/2006-June/005267.html > >is committed? > > No. That patch still depends on the buggy rev.1.78 of trap.c to avoid > a panic in trap(). It should use a flags test like the above. With > such a test, qemu whould panic because it has not claimed the FPU > like the dfly patch does. If qemu claimed the FPU, npxdna() should > still panic due to the violated invariant, but qemu shouldn't claim > the FPU in the same way as the kernel copying support since it wants > to load the user state and not use the FPU itself. I meant if kqemu would then claim the fpu before calling kqemu_exec (afaik it doesn't need the qemu process'es userland fpu state, its just so it can use the fpu.) > > I thought about the npxdna() in trap() mainly in connection with > changing the optimized kernel copying support. I just thought of > another complication from using kqemu. I think kqemu does an "int > $7" to cause a T_DNA to load the user FPU state. If the kernel were > using the FPU, then the kernel FPU state would actually be loaded. > This shouldn't happen in practice since kqemu shoudn't be using the > FPU at the point where it does the "int $7". yeah. Don't get me wrong, I would like to see this fixed for real just like you, the problem is only I don't know how!
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20060617125051.GA38179>