Date: Mon, 14 Jan 2002 15:31:49 +1100
From: Peter Jeremy
Subject: Re: Request for review: getcontext, setcontext, etc
In-reply-to: <20020114120026.S3794-100000@gamplex.bde.org>; from bde@zeta.org.au on Mon, Jan 14, 2002 at 01:31:20PM +1100
To: Bruce Evans
Cc: Terry Lambert, Peter Wemm, Alfred Perlstein, Kelly Yancey, Nate Williams, Daniel Eischen, Dan Eischen, Archie Cobbs, arch@FreeBSD.ORG
Message-id: <20020114153148.W561@gsmx07.alcatel.com.au>
References: <20020114074238.S561@gsmx07.alcatel.com.au> <20020114120026.S3794-100000@gamplex.bde.org>

On 2002-Jan-14 13:31:20 +1100, Bruce Evans wrote:
>> always load FPU context on a switch - this is more expensive for
>> processes that don't use FP, but saves a DNA trap per context switch
>> (assuming they use FP in that slice) for those that do.
>
>Not overall, since most timeslices don't use the FPU (at least for
>processes that I run :-).

My two-year-old figures suggest that even with FP-intensive progs, the
majority of timeslices don't use the FPU.

>> To add some further numbers, in December 1999, I did some measurements
>> on FP switching by patching npx.c.  This was on a PII-266 running then
>> -current.  (The original e-mail was sent to -arch on Mon, 20 Dec 1999
>> 07:34:06 +1100 in a thread titled "Concrete plans for ucontext/
>> mcontext changes around 4.0" - I don't have the message-id available).
>>
>>    ctxt     DNA      FP
>>   swtch    traps    swtch
>> 1754982   281557   59753  build world and a few CVS operations [1]
>>   79044    18811   10341  gnuplot and xv in parallel [2]
>>     800      138     130  parallel FP-intensive progs [3].
>>
>> In the above, `ctxt swtch' is the number of context switches counted
>> via vm.stats.sys.v_swtch.  `DNA traps' is the number of device not
>> available traps registered and `FP swtch' is the number of DNA traps
>> where the FP context loaded is different to that saved on the
>> preceding context switch.
>
>That's a lot more DNA traps than I would have expected for buildworld
>and a bit less than I would have expected for the others.  I guess many
>of the ones for buildworld are for the FP in setjmp() for jumps that
>are never taken.

It's also possible that gcc does some FP operations even when
compiling integer-only code.

>220000 extra FP context switches at 264 cycles each would increase my
>buildworld time by a whole 0.34 seconds or 0.025%.  There may be more
>important things to optimize :-).

Except that removing the lazy FPU switching would translate 280,000
DNA traps into 1,755,000 f*rstor's.  Though you could probably cut
this number down by changing the FPU switching code to always do an
f*rstor if the process ever uses FP.  (As someone else suggested).

If anyone is interested, my original patches are below.
Based on a quick look, they seem to be still valid (though there will
be some fuzz on -CURRENT due to the added cpu_critical_enter() call).

Index: npx.c
===================================================================
RCS file: /home/peter/cvs/src/sys/i386/isa/npx.c,v
retrieving revision 1.78
diff -u -r1.78 npx.c
--- npx.c	1999/09/21 10:51:47	1.78
+++ npx.c	1999/12/17 09:53:02
@@ -779,6 +779,15 @@
 	}
 }
 
+static int	fp_dna;		/* number of DNA traps */
+static int	fp_swtch;	/* Number of real FP context switches */
+static struct proc	*fpuproc;	/* Last proc to use FPU */
+
+SYSCTL_INT(_hw, OID_AUTO, fp_dna, CTLFLAG_RW, &fp_dna, 0,
+	"Number of NPX DNA traps");
+SYSCTL_INT(_hw, OID_AUTO, fp_swtch, CTLFLAG_RW, &fp_swtch, 0,
+	"Number of NPX context switches");
+
 /*
  * Implement device not available (DNA) exception
  *
@@ -797,6 +806,11 @@
 		panic("npxdna");
 	}
 	stop_emulating();
+	fp_dna++;
+	if (curproc != fpuproc) {
+		fpuproc = curproc;
+		fp_swtch++;
+	}
 	/*
 	 * Record new context early in case frstor causes an IRQ13.
 	 */

Peter
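
For anyone who wants to see what the three counters measure without
patching a kernel, here is a minimal user-space sketch.  It assumes a
simplified model of lazy FPU switching: every timeslice starts with a
context switch that arms the trap, so a slice takes at most one DNA
trap, on its first FP instruction.  Only the names fp_dna, fp_swtch and
fpuproc come from the patch; struct slice, struct fp_model, the
fp_model_*() functions and sample_sched are made up for illustration
and are not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* One timeslice: which process runs, and whether it touches the FPU. */
struct slice {
	int pid;
	int uses_fp;
};

/* Counters in the spirit of the patch; the layout is hypothetical. */
struct fp_model {
	long ctxt_swtch;	/* context switches (cf. vm.stats.sys.v_swtch) */
	long fp_dna;		/* DNA traps taken */
	long fp_swtch;		/* DNA traps that loaded a different FP context */
	int fpuproc;		/* last pid to use the FPU, -1 if none yet */
};

/* Made-up schedule, for illustration only. */
const struct slice sample_sched[] = {
	{ 1, 1 }, { 2, 0 }, { 1, 1 }, { 3, 1 },
	{ 2, 0 }, { 3, 1 }, { 3, 0 }, { 1, 1 },
};

/*
 * Replay a schedule under the simplified lazy-switching model
 * described above and return the resulting counters.
 */
struct fp_model
fp_model_run(const struct slice *sched, size_t n)
{
	struct fp_model m = { 0, 0, 0, -1 };
	size_t i;

	assert(sched != NULL || n == 0);
	for (i = 0; i < n; i++) {
		m.ctxt_swtch++;
		if (!sched[i].uses_fp)
			continue;	/* no FP in this slice: no trap */
		m.fp_dna++;
		if (sched[i].pid != m.fpuproc) {
			/* FP context differs from the last one used. */
			m.fpuproc = sched[i].pid;
			m.fp_swtch++;
		}
	}
	return (m);
}

/* Share of context switches that ended in a DNA trap, in percent. */
double
fp_model_dna_percent(const struct fp_model *m)
{
	return (m->ctxt_swtch == 0 ? 0.0 :
	    100.0 * (double)m->fp_dna / (double)m->ctxt_swtch);
}
```

Running fp_model_run() over sample_sched gives 8 context switches, 5
DNA traps and 3 FP switches.  The same ratio applied to the buildworld
row in the table is 281557 / 1754982, or about 16%, which is why the
majority of timeslices appear not to use the FPU.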