Date: Fri, 23 Jan 2004 23:34:42 +1100 (EST)
From: Bruce Evans
To: John Baldwin
cc: freebsd-current@FreeBSD.org
cc: Ruslan Ermilov
Subject: Re: Release Building and /etc/make.conf
Message-ID: <20040123225430.A23195@gamplex.bde.org>
In-Reply-To: <200401211104.49878.jhb@FreeBSD.org>
References: <200401190738.i0J7ccF3020266@postoffice.e-easy.com.au>
 <200401201319.52943.jhb@FreeBSD.org>
 <20040121153615.O7322@gamplex.bde.org>
 <200401211104.49878.jhb@FreeBSD.org>
List-Id: Discussions about the use of FreeBSD-current

On Wed, 21 Jan 2004, John Baldwin wrote:

> On Tuesday 20 January 2004 11:58 pm, Bruce Evans wrote:
> > i386 (or equivalently, no special tuning) is the best default, at least
> > in non-FPU-intensive applications.  In my integer crunching
> > application/benchmark (searching a game tree), it even gives better
> > results than -mcpu=pentiumpro on a pentiumpro class machine (a 366MHz
> > Celeron).  -mcpu=athlon-xp gives even better results.
> >
> > All with -O3 -fomit-frame-pointer
> > -mcpu=athlon-xp   48.42 real  47.31 user  0.41 sys
> >                   51.22 real  50.10 user  0.30 sys
> > -mcpu=i386        51.98 real  50.18 user  0.34 sys
> > -mcpu=pentiumpro  56.38 real  55.26 user  0.34 sys
> > -mcpu=pentium2    56.24 real  55.25 user  0.36 sys
> > -mcpu=pentium3    56.59 real  55.25 user  0.40 sys
> > -mcpu=pentium4    58.52 real  56.96 user  0.36 sys
> > -mcpu=i486        79.17 real  77.69 user  0.32 sys
> > -mcpu=i586        74.80 real  73.07 user  0.48 sys
> >
> > This is just one benchmark, chosen for its potential optimizability.
> > I only did non-exhaustive benchmarks for the makeworld benchmark.  I
> > removed the -mpentiumpro change when I saw the kernel size bloat that
> > it gave.
>
> Does -mcpu=athlon-xp perform worse than the default in other benchmarks that
> you've run?

I haven't run enough to be sure.  It's hard to test all the combinations
for long enough.  Some quick tests with the cc1 application/benchmark:

cc1 compiled with -O3 -fomit-frame-pointer, and:
    -mcpu=i386 (code o3)
    -mcpu=i486 (o4)
    -mcpu=pentiumpro (op)
    -mcpu=athlon-xp (oa)

Times for the "all" part of "make obj; make depend; make all" starting with
an empty object tree and source tree = src/bin on the Celeron and
src/usr.sbin on the Athlon (it doesn't complete because it wants to link
to never-installed unbuilt libraries, but it gets a fair way).  Smallest
real time for 2 runs:

On a Celeron 400 with source tree src/bin:
    o3: 121.94 real   97.14 user  19.94 sys
    o4: 130.83 real  106.59 user  19.07 sys
    oa: 122.69 real   97.58 user  19.39 sys
    op: 124.01 real   99.54 user  19.56 sys

All non-null -mcpu settings are pessimizations, with -mcpu=i486
significantly bad and -mcpu=pentiumpro probably significantly bad.
Optimizing the pentiumpro class machine as an athlon-xp works better
(less worse here) than optimizing it as a pentiumpro in this benchmark
too, but the differences are smaller.

On an Athlon-XP1600 overclocked with source tree src/usr.sbin:
    o3: 67.62 real  57.46 user   9.53 sys
    o4: 69.09 real  57.65 user  10.20 sys
    oa: 67.53 real  56.78 user   9.62 sys
    op: 68.14 real  57.47 user   9.70 sys

Most of the differences are too small to be significant.  Optimizing the
athlon-xp as an athlon-xp at least doesn't pessimize it.

My integer-crunching benchmark shows similarly small differences on
freefall, but that may be just because freefall's gcc is so old.

> > > > Note that CPUTYPE has worse bugs for i386's.  Setting it to a supported
> > > > CPU gives -march instead of -mcpu, so using it gives unportable
> > > > binaries, and bsd.cpu.mk provides no way to get the corresponding -mcpu
> > > > settings.  OTOH, CPUTYPE for alphas gives only -mcpu.
> > >
> > > That is by design.  Note that on all non-i386 architectures such as
> > > alpha, etc. -mcpu means the same thing as -march.  The other
> > > architectures use -mtune to get the same effect as -mcpu on i386.
> >
> > Doesn't make it any less of a bug.
>
> The intent of CPUTYPE is that you can have ports and world optimized for the
> specific machine you are compiling on, it is not set to anything by default,
> so the user only gets -march=foo if they explicitly ask for it.  I fail to
> see how that is a bug.

It is a bug because it implements the least useful option set first.

Bruce