From: David Chisnall
Subject: Fwd: [cfe-dev] More on atlas and clang
Date: Mon, 11 Mar 2013 08:37:22 +0000
To: "freebsd-numerics@freebsd.org"
Message-Id: <8652E076-8710-4766-8FD0-7774D82A1A0B@FreeBSD.org>

Benchmarks of Atlas with clang, recently posted to the clang list, are attached. Note that the -fvectorize and -fslp-vectorize flags enable the new autovectorisation code in clang, which will be enabled by default in 3.3.

David

Begin forwarded message:

> Hi there,
>
> I have recently undertaken another experimental build of Atlas
> (http://math-atlas.sourceforge.net; briefly speaking, Atlas provides a
> highly complete BLAS/LAPACK implementation optimized for the native
> architecture of the computer on which it is running) on an AVX machine
> (MacMini 2011) using a snapshot of clang 3.3 (r173279) provided by
> MacPorts (http://macports.org), with the -O3, -fPIC, -fvectorize and
> -fslp-vectorize flags.
>
> I am pleased to say that:
>
> 1. The generated AVX code seems fine: a full test session run under an
>    Atlas-based SciPy didn't raise any errors;
> 2. The performance now seems on par with or even (sometimes
>    surprisingly) better than the 'reference GCC', whatever that means
>    (I was unable to get in touch with the Atlas developer at that
>    time), as evidenced by the table below:
>
> Reference clock rate=3292Mhz, new rate=2300Mhz
>    Refrenc : % of clock rate achieved by reference install
>    Present : % of clock rate achieved by present ATLAS install
>
>                    single precision                  double precision
>            ********************************  ********************************
>                 real            complex           real            complex
>            ---------------  ---------------  ---------------  ---------------
> Benchmark  Refrenc Present  Refrenc Present  Refrenc Present  Refrenc Present
> =========  ======= =======  ======= =======  ======= =======  ======= =======
> kSelMM      1289.9  1407.4   1188.7  1229.8    686.7   826.8    647.4   682.1
> kGenMM       198.2   239.7    198.5   237.8    193.9   231.8    196.0   233.8
> kMM_NT       193.7   266.4    195.2   192.9    184.2   187.4    188.5   197.5
> kMM_TN       198.5   211.1    197.9   226.2    189.8   227.6    189.5   223.2
> BIG_MM      1213.8  1346.7   1241.3  1366.5    652.0   789.5    661.4   795.8
> kMV_N        224.3   308.1    438.8   617.3    115.9   152.1    205.8   283.5
> kMV_T        224.6   313.5    460.3   642.9    123.2   159.6    211.3   288.2
> kGER         148.3   192.4    290.2   381.2     73.3    95.6    144.3   184.3
>
> This is in stark contrast with the previous test, where clang was
> lagging about 20% behind the 'reference implementation' based on GCC
> for lines 2, 3 and 4 (kGenMM, kMM_NT and kMM_TN), where compiler
> performance matters most.
>
> So, to summarize in two words: kudos folks!
>
> I will build another version on a Core2Duo machine tonight and see if
> the results are consistent.
>
> Cheers!
> Vincent
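
For anyone who wants to read the table as relative gains rather than raw figures, here is a minimal sketch that computes the percentage change of Present over Refrenc for each entry. The numbers are copied from the forwarded table; the names TABLE, COLUMNS, percent_change and summarize are made up for this illustration and are not part of ATLAS. Because each column is already expressed as a percentage of its own install's clock rate (3292 MHz for the reference, 2300 MHz for the present build), the result compares per-clock efficiency, not raw throughput.

```python
# (Refrenc, Present) pairs per benchmark, in the column order of the table:
# single real, single complex, double real, double complex.
# Values are copied from the forwarded message; the names are illustrative only.
COLUMNS = ("s-real", "s-cplx", "d-real", "d-cplx")

TABLE = {
    "kSelMM": [(1289.9, 1407.4), (1188.7, 1229.8), (686.7, 826.8), (647.4, 682.1)],
    "kGenMM": [(198.2, 239.7), (198.5, 237.8), (193.9, 231.8), (196.0, 233.8)],
    "kMM_NT": [(193.7, 266.4), (195.2, 192.9), (184.2, 187.4), (188.5, 197.5)],
    "kMM_TN": [(198.5, 211.1), (197.9, 226.2), (189.8, 227.6), (189.5, 223.2)],
    "BIG_MM": [(1213.8, 1346.7), (1241.3, 1366.5), (652.0, 789.5), (661.4, 795.8)],
    "kMV_N": [(224.3, 308.1), (438.8, 617.3), (115.9, 152.1), (205.8, 283.5)],
    "kMV_T": [(224.6, 313.5), (460.3, 642.9), (123.2, 159.6), (211.3, 288.2)],
    "kGER": [(148.3, 192.4), (290.2, 381.2), (73.3, 95.6), (144.3, 184.3)],
}


def percent_change(reference, present):
    """Relative change of Present over Refrenc, in percent."""
    return (present - reference) / reference * 100.0


def summarize(table):
    """Map each benchmark name to its four percentage changes, rounded to 0.1."""
    return {
        name: [round(percent_change(ref, cur), 1) for ref, cur in pairs]
        for name, pairs in table.items()
    }


if __name__ == "__main__":
    print(f"{'Benchmark':<10}" + "".join(f"{c:>9}" for c in COLUMNS))
    for name, changes in summarize(TABLE).items():
        print(f"{name:<10}" + "".join(f"{c:>+8.1f}%" for c in changes))
```

Read this way, the only entry in which the clang build falls below the reference is kMM_NT in single-precision complex (192.9 against 195.2).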