From: David Schultz
Date: Tue, 22 Feb 2005 15:18:10 -0500
To: Nate Lawson
Cc: cvs-src@FreeBSD.ORG, Maxim Sobolev, src-committers@FreeBSD.ORG, cvs-all@FreeBSD.ORG
Subject: Re: cvs commit: src/lib/msun/i387 Makefile.inc e_atan2.S e_atan2f.S s_atan.S

On Tue, Feb 22, 2005, Nate Lawson wrote:
> David Schultz wrote:
> >By the way, here are some other results for the Pentium 4, all
> >without SSE.  SSE makes things a bit worse, probably because the
> >x87 and SSE registers are shared, and the Pentium 4 imposes a
> >large penalty for switching between the two sets.
>
> I don't believe this is correct.  MMX and x87 use the same register
> context (hence emms), however the XMM registers (SSE*) are separate.
> It's possible gcc is generating MMX instructions though with your SSE
> command line switch.

Yep, you're right, I was thinking of the MMX register set.  I
compared the code generated by gcc with and without SSE/SSE2, and
found that the only thing it uses SSE2 for is converting from
floating point->integer and back (e.g. CVTTSD2SI instead of i387
control word frobbing and FISTL).  There was also one place where
gcc just got confused and juggled around a bunch of registers on
the i387 stack, but I don't think that accounts for the
difference.  I wonder if CVTTSD2SI and friends are slower than an
OR/MOV/FLDCW/FISTL/FLDCW sequence on the Pentium 4 for some
bizarre reason, or if I missed something else significant while
scanning the diff.
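
For illustration, the conversion discussed above is what a plain
truncating cast from double to int turns into.  The sketch below is not
taken from msun; the function name trunc_to_int is hypothetical and is
only meant to give the compiler something small to translate.  C
requires the cast to round toward zero regardless of the current
rounding mode, which is why a pure-i387 build has to rewrite the control
word around the store, while an SSE2 build can use a single truncating
convert instruction.

```c
/*
 * Hypothetical helper, not from the msun sources.
 *
 * The cast must truncate toward zero.  Without SSE2, a compiler
 * targeting the i387 typically saves the control word, sets the
 * rounding-control bits to "truncate", stores the integer, and then
 * restores the control word.  With SSE2 it can emit a truncating
 * convert such as CVTTSD2SI instead.
 */
int
trunc_to_int(double x)
{
	return ((int)x);
}
```

Compiling this to assembly for a 32-bit x86 target (for example with
gcc -O2 -S), once with and once without -msse2, should show the two
sequences side by side.  The exact flags used for the measurements
above are not stated, so those flags are an assumption.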