Date:      Mon, 14 Jan 2002 15:31:49 +1100
From:      Peter Jeremy <peter.jeremy@alcatel.com.au>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        Terry Lambert <tlambert2@mindspring.com>, Peter Wemm <peter@wemm.org>, Alfred Perlstein <bright@mu.org>, Kelly Yancey <kbyanc@posi.net>, Nate Williams <nate@yogotech.com>, Daniel Eischen <eischen@pcnet1.pcnet.com>, Dan Eischen <eischen@vigrid.com>, Archie Cobbs <archie@dellroad.org>, arch@FreeBSD.ORG
Subject:   Re: Request for review: getcontext, setcontext, etc
Message-ID:  <20020114153148.W561@gsmx07.alcatel.com.au>
In-Reply-To: <20020114120026.S3794-100000@gamplex.bde.org>; from bde@zeta.org.au on Mon, Jan 14, 2002 at 01:31:20PM +1100
References:  <20020114074238.S561@gsmx07.alcatel.com.au> <20020114120026.S3794-100000@gamplex.bde.org>

On 2002-Jan-14 13:31:20 +1100, Bruce Evans <bde@zeta.org.au> wrote:
>> always load FPU context on a switch - this is more expensive for
>> processes that don't use FP, but saves a DNA trap per context switch
>> (assuming they use FP in that slice) for those that do.
>
>Not overall, since most timeslices don't use the FPU (at least for
>processes that I run :-).

My two-year-old figures suggest that even with FP-intensive progs, the
majority of timeslices don't use the FPU.

>> To add some further numbers, in December 1999, I did some measurements
>> on FP switching by patching npx.c.  This was on a PII-266 running then
>> -current.  (The original e-mail was sent to -arch on Mon, 20 Dec 1999
>> 07:34:06 +1100 in a thread titled "Concrete plans for ucontext/
>> mcontext changes around 4.0" - I don't have the message-id available).
>>
>>   ctxt     DNA    FP
>>  swtch    traps  swtch
>> 1754982  281557  59753  build world and a few CVS operations [1]
>>   79044   18811  10341  gnuplot and xv in parallel [2]
>>     800     138    130  parallel FP-intensive progs [3].
>>
>> In the above, `ctxt swtch' is the number of context switches counted
>> via vm.stats.sys.v_swtch.  `DNA traps' is the number of device not
>> available traps registered and `FP swtch' is the number of DNA traps
>> where the FP context loaded is different to that saved on the
>> preceding context switch.
>
>That's a lot more DNA traps than I would have expected for buildworld
>and a bit less than I would have expected for the others.  I guess many
>of the ones for buildworld are for the FP in setjmp() for jumps that
>are never taken.

It's also possible that gcc does some FP operations even when compiling
integer-only code.

>220000 extra FP context switches at 264 cycles each would increase my
>buildworld time by a whole 0.34 seconds or 0.025%.  There may be more
>important things to optimize :-).

Except that removing the lazy FPU switching would turn roughly 280,000
DNA traps into roughly 1,755,000 f*rstor's.  You could probably cut
this number down by changing the FPU switching code to do the f*rstor
on every switch only for processes that have used FP at some point
(as someone else suggested).

If anyone is interested, my original patches are below.  Based on
a quick look, they seem to be still valid (though there will be
some fuzz on -CURRENT due to the added cpu_critical_enter() call).

Index: npx.c
===================================================================
RCS file: /home/peter/cvs/src/sys/i386/isa/npx.c,v
retrieving revision 1.78
diff -u -r1.78 npx.c
--- npx.c	1999/09/21 10:51:47	1.78
+++ npx.c	1999/12/17 09:53:02
@@ -779,6 +779,15 @@
 	}
 }
 
+static int	fp_dna;		/* number of DNA traps */
+static int	fp_swtch;	/* Number of real FP context switches */
+static struct proc *fpuproc;	/* Last proc to use FPU */
+
+SYSCTL_INT(_hw, OID_AUTO, fp_dna, CTLFLAG_RW, &fp_dna, 0,
+	"Number of NPX DNA traps");
+SYSCTL_INT(_hw, OID_AUTO, fp_swtch, CTLFLAG_RW, &fp_swtch, 0,
+	"Number of NPX context switches");
+
 /*
  * Implement device not available (DNA) exception
  *
@@ -797,6 +806,11 @@
 		panic("npxdna");
 	}
 	stop_emulating();
+	fp_dna++;
+	if (curproc != fpuproc) {
+		fpuproc = curproc;
+		fp_swtch++;
+	}
 	/*
 	 * Record new context early in case frstor causes an IRQ13.
 	 */


Peter
