Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 17 Jun 2006 14:50:52 +0200
From:      Juergen Lock <nox@jelal.kn-bremen.de>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        freebsd-emulation@FreeBSD.org, nikolas.britton@gmail.com
Subject:   Re: QEMU: npxdna: fpcurthread == curthread 640962 times
Message-ID:  <20060617125051.GA38179@saturn.kn-bremen.de>
In-Reply-To: <20060617150847.W38142@delplex.bde.org>
References:  <200606161855.k5GItlP1001526@saturn.kn-bremen.de> <20060617150847.W38142@delplex.bde.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, Jun 17, 2006 at 04:13:37PM +1000, Bruce Evans wrote:
> On Fri, 16 Jun 2006, Juergen Lock wrote:
> 
> >In article <ef10de9a0606131730m7ec33522q2b2273eb3d44d603@mail.gmail.com> 
> >you write:
> >>I'm getting these messages on the console when I run QEMU with kqemu
> >>kernel module:
> >>npxdna: fpcurthread == curthread 640957 times
> >>...
> >>What are these and how do I make them stop?
> >>[...]
> >
> >Ok since the previous thread on this,
> >   http://lists.freebsd.org/pipermail/freebsd-emulation/2006-May/002081.html
> >didnt end with a consensus on how to fix it, I've made a band-aid
> >patch that simply disables the message for the kqemu case: (which
> >_appears_ to be harmless there, maybe because qemu doesnt seem to
> >be using the fpu while calling kqemu)
> >
> >Index: sys/i386/isa/npx.c
> >===================================================================
> >RCS file: /home/ncvs/src/sys/i386/isa/npx.c,v
> >retrieving revision 1.162.2.1.2.2
> >diff -u -r1.162.2.1.2.2 npx.c
> >--- sys/i386/isa/npx.c	30 Apr 2006 05:17:59 -0000	1.162.2.1.2.2
> >+++ sys/i386/isa/npx.c	16 Jun 2006 18:03:23 -0000
> >@@ -756,6 +756,9 @@
> >	if (!npx_exists)
> >		return (0);
> >	if (PCPU_GET(fpcurthread) == curthread) {
> >+#if 1
> >+		if (strcmp(curthread->td_proc->p_comm, "qemu"))
> >+#endif
> >		printf("npxdna: fpcurthread == curthread %d times\n",
> >		    ++err_count);
> >		stop_emulating();
> >
> >Of course I don't ask for something like this be committed, it is just
> >for ppl for who the messages cause trouble.  Similar patch for amd64
> >(untested):
> 
> This message should be a panic like it used so that the problem gets
> fixed.  An invariant has been violated.  See the other other thread.
> 
> There should be another panic for this, one which occurs if npxdna() is
> even called in kernel mode.  amd64 has a message instead, and i386 has
> nothing, except in my version it has a Debugger() call:
> 
> %%%
> Index: trap.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/i386/i386/trap.c,v
> retrieving revision 1.265
> diff -u -2 -r1.265 trap.c
> --- trap.c	2 Jun 2004 07:52:33 -0000	1.265
> +++ trap.c	17 Jun 2004 05:38:48 -0000
> @@ -428,4 +478,5 @@
>  			 * registered such use.
>  			 */
> +			Debugger("T_DNA in kernel mode");
>  			if (npxdna())
>  				goto out;
> %%%
> 
> Note that Debugger() has rotted in -current.  It is now misnamed
> kdb_enter() and no longer prints a message when no debugger is configured.
> Most uses of it had very little to do with [dgk]db, but (were like the
> above one) replacements for panic() in cases you didn't really want
> to panic or wanted to debug if a debugger is configured.  Such uses
> should not exist in production kernels, and it is important for a
> message to be printed when no debugger is configured so that the
> replacement for panic() doesn't become not even a printf().
> 
> T_DNA in kernel mode used to cause an unconditional panic too, but in
> rev.1.78 of i386/trap.c the panic was made conditional on npxdna()
> failing.  This was to support using the FPU for optimizing copying and
> zeroing on original Pentiums.  amd64 never supported this use of the
> FPU but still has the above vestige of it.  The code in rev.1.78 should
> have been
> 
> 			if (curpcb->pcb_flags & PCB_KERNEL_USING_NPX) {
> 				if (npxdna())
> 					goto out;
> 			}
> 			break;		/* To trap_fatal(). */
> 
> PCB_KERNEL_USING_NPX should only be set when the kernel is actually using
> npx.  Your strcmp() acts similarly to a flags test but not so well.
> 
> >And finally a question for the experts: could this be fixed for real
> >once the dfly fpu patch mentioned in the `Adding optimized kernel
> >copying support' thread,
> >   http://lists.freebsd.org/pipermail/freebsd-arch/2006-June/005267.html
> >is committed?
> 
> No.  That patch still depends on the buggy rev.1.78 of trap.c to avoid
> a panic in trap().  It should use a flags test like the above.  With
> such a test, qemu whould panic because it has not claimed the FPU
> like the dfly patch does.  If qemu claimed the FPU, npxdna() should
> still panic due to the violated invariant, but qemu shouldn't claim
> the FPU in the same way as the kernel copying support since it wants
> to load the user state and not use the FPU itself.

I meant if kqemu would then claim the fpu before calling kqemu_exec
(afaik it doesn't need the qemu process'es userland fpu state, its
just so it can use the fpu.)
> 
> I thought about the npxdna() in trap() mainly in connection with
> changing the optimized kernel copying support.  I just thought of
> another complication from using kqemu.  I think kqemu does an "int
> $7" to cause a T_DNA to load the user FPU state.  If the kernel were
> using the FPU, then the kernel FPU state would actually be loaded.
> This shouldn't happen in practice since kqemu shoudn't be using the
> FPU at the point where it does the "int $7".

 yeah.  Don't get me wrong, I would like to see this fixed for real
just like you, the problem is only I don't know how!



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20060617125051.GA38179>