Date: Wed, 5 Sep 2012 15:13:11 -0700
From: Steve Kargl
To: Dimitry Andric
Cc: Garrett Cooper, freebsd-current@freebsd.org
Subject: Re: Compiler performance tests on FreeBSD 10.0-CURRENT
Message-ID: <20120905221310.GA97847@troutmask.apl.washington.edu>

On Wed, Sep 05, 2012 at 11:31:26AM +0200, Dimitry Andric wrote:
> On 2012-09-05 01:40, Garrett Cooper wrote:
> ...
> > Steve does have a point.
> > Posting the results of
> >CFLAGS/CPPFLAGS/LDFLAGS/etc for config.log (and maybe poking through
> >the code to figure out what *FLAGS were used elsewhere) is more
> >valuable than the data is in its current state (unfortunately..
> >autoconf makes things more complicated).
>
> 1) For building the FreeBSD in-tree version of clang 3.2:
>
> -O2 -pipe -fno-strict-aliasing
>
> 2) For building the FreeBSD in-tree version of gcc 4.2.1:
>
> -O2 -pipe
>
> 3) For building Boost 1.50.0:
>
> -ftemplate-depth-128 -O3 -finline-functions
>

Dimitry, thanks for the follow-up.

I performed an unscientific (micro)benchmark of /usr/bin/cc vs
/usr/bin/clang, where cc is the base system's gcc 4.2.1.  Here's
what I found/feared.

Compiling libm on

CPU: AMD Opteron(tm) Processor 248 (2192.01-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0xf5a  Family = f  Model = 5  Stepping = 10
  Features=0x78bfbff
  AMD Features=0xe0500800

with default CFLAGS (i.e., -O2 -pipe) and -march=opteron.  Using
'setenv CC /usr/bin/cc' with 3 runs of

   make clean
   time make -DNO_MAN

yields

       69.39 real        52.00 user        38.55 sys
       69.57 real        52.35 user        38.37 sys
       69.48 real        52.25 user        38.38 sys

Now, repeating with 'setenv CC /usr/bin/clang' yields

       39.65 real        21.86 user        17.37 sys
       40.91 real        21.48 user        17.91 sys
       39.77 real        21.65 user        17.64 sys

So, clang does appear to be faster in this particular compile-speed
benchmark.  However, if I now build my test program for libm's
j0f() function, where the only difference is whether libm was built
with /usr/bin/cc or /usr/bin/clang, I observe the following results.
1234567 x values in the interval [0:25]

                        gcc libm     |    clang libm
                     ----------------|-----------------
      ULP <= 0.6 --> 565515 (45.81%) | 513763 (41.61%)
0.6 < ULP <= 0.7 -->  74148 ( 6.01%) |  67221 ( 5.44%)
0.7 < ULP <= 0.8 -->  69112 ( 5.60%) |  62846 ( 5.09%)
0.8 < ULP <= 0.9 -->  63798 ( 5.17%) |  58217 ( 4.72%)
0.9 < ULP <= 1.0 -->  58679 ( 4.75%) |  53834 ( 4.36%)
1.0 < ULP <= 2.0 --> 328221 (26.59%) | 306728 (24.84%)
2.0 < ULP <= 3.0 -->  65323 ( 5.29%) |  63452 ( 5.14%)
3.0 < ULP        -->   9771 ( 0.79%) | 108506 ( 8.79%)

                      gcc libm           |    clang libm
              -----------------------|--------------------
     MAX ULP: 12152.27637            | 1129606938624.00000
x at MAX ULP: 5.520077 0x1.6148f2p+2 | 2.404833 0x1.33d19p+1

Speed test with gcc libm.

1234567 j0f calls in 0.193427 seconds.
1234567 j0f calls in 0.193410 seconds.
1234567 j0f calls in 0.194158 seconds.

Speed test with clang libm.

1234567 j0f calls in 0.180260 seconds.
1234567 j0f calls in 0.180130 seconds.
1234567 j0f calls in 0.179739 seconds.

So, although the clang-built j0f() appears to be faster than the
gcc-built j0f(), the clang-built j0f() has much worse accuracy.

--
Steve
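
For readers unfamiliar with the ULP columns in the table above, here is a minimal sketch of how an error in ULP can be measured for a single-precision result. It is not the test program used for the numbers in this post. The helper names `to_float32`, `float32_ulp`, and `ulp_error` are hypothetical, and the sketch assumes the reference value is available in higher precision (a double here) and that the ULP is taken at the magnitude of the reference.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def float32_ulp(x: float) -> float:
    """Spacing of single-precision numbers at the magnitude of finite x."""
    if x == 0.0:
        return 2.0 ** -149            # smallest positive subnormal float32
    _, e = math.frexp(abs(x))         # abs(x) = m * 2**e with 0.5 <= m < 1
    return 2.0 ** max(e - 24, -149)   # 24-bit significand, clamped for subnormals


def ulp_error(computed: float, reference: float) -> float:
    """Error of a float32 result, in ULP measured at the reference value."""
    return abs(computed - reference) / float32_ulp(reference)


if __name__ == "__main__":
    # A result one float32 step above 1.0 is off by exactly one ULP.
    print(ulp_error(1.0 + 2.0 ** -23, 1.0))
    # The same kind of small absolute error is huge in ULP when the
    # reference itself is tiny.
    print(ulp_error(1.1e-6, 1.0e-6))
```

Under this definition a small absolute error turns into a very large ULP count wherever the reference value is close to zero. Both "x at MAX ULP" entries in the table lie next to zeros of J0 (near 2.4048 and 5.5201), which is where an effect of this kind would be expected to be largest.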