Date:      Wed, 14 Apr 2010 18:26:44 +0300
From:      Andriy Gapon <avg@freebsd.org>
To:        Maho NAKATA <chat95@mac.com>
Cc:        alc@freebsd.org, alan.l.cox@gmail.com, freebsd-stable@freebsd.org, als@modulus.org
Subject:   Re: How to reproduce: Re: Only 70% of theoretical peak performance on FreeBSD 8/amd64, Corei7 920
Message-ID:  <4BC5DEB4.1090208@freebsd.org>
In-Reply-To: <20100414.082109.29593248145846106.chat95@mac.com>
References:  <h2yca3526251004122230l909bc93ey916d7fe0dd24fd33@mail.gmail.com>	<4BC402B7.5000400@modulus.org>	<v2gca3526251004122322i709c523ct4f93bcf75a778a8e@mail.gmail.com> <20100414.082109.29593248145846106.chat95@mac.com>

on 14/04/2010 02:21 Maho NAKATA said the following:
> 4. run dgemm. 
> % ./dgemm
> n: 3000
> time : 134.648208 or 16.910525 
> Mflops : 31943.419695
> n: 3100
> time : 148.122279 or 18.615284 
> Mflops : 32017.357408
> n: 3200
> time : 162.488885 or 20.430651 
> Mflops : 32087.318295
> n: 3300
> time : 178.497079 or 22.446093 
> Mflops : 32030.420499
> n: 3400
> time : 195.550715 or 24.586152 
> Mflops : 31981.873273
> n: 3500
> time : 213.403379 or 26.825058 
> Mflops : 31975.513363
> n: 3600
> ...
> above output is on Core i7 920 (2.66GHz; TurboBoost on)

My results:
$ ./dgemm
n: 3000
time : 54.151302 or 28.189781
Mflops : 19162.263125
n: 3100
time : 60.157449 or 32.214141
Mflops : 18501.570537
n: 3200
time : 65.753191 or 34.114872
Mflops : 19216.393378

CPU:
CPU: Intel(R) Core(TM)2 Duo CPU     E7300  @ 2.66GHz (2653.35-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x10676  Stepping = 6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x8e39d<SSE3,DTES64,MON,DS_CPL,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
⋮
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)

FreeBSD:
FreeBSD 8.0-STABLE r205070 amd64

Please note that the system was not dedicated to the test; I had
Xorg+KDE3+thunderbird+skype+kopete+konsole(s) plus a bunch of daemons running.
That probably explains the irregularities in the results.

I am not sure exactly how the theoretical maximum should be calculated; I used
2 (cores) * 2.66G * 4 (flops per cycle) ≈ 21.3G.
And so 19.2G / 21.3G ≈ 90%.
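
The estimate above can be sketched as a short calculation. The reading of the factors is an assumption: 2 is taken as the number of cores, 2.66 as the clock in GHz, and 4 as the double-precision flops per cycle per core. The function names are hypothetical and only serve the illustration.

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak in GFlops: cores * clock (GHz) * flops per cycle per core."""
    return cores * clock_ghz * flops_per_cycle


def efficiency(measured_mflops, peak):
    """Fraction of the theoretical peak reached by a measured Mflops figure."""
    return measured_mflops / 1000.0 / peak


# Assumed reading of "2 * 2.66G * 4": 2 cores, 2.66 GHz, 4 flops/cycle/core.
peak = peak_gflops(cores=2, clock_ghz=2.66, flops_per_cycle=4)

# Best figure from the run above (n = 3200).
best_mflops = 19216.393378

print(f"peak = {peak:.2f} GFlops")
print(f"efficiency = {efficiency(best_mflops, peak):.1%}")
```

The same two functions can be fed the other Mflops figures in the run to see how much the background load moves the ratio.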

Not as bad as what you get, although not as good as what you report for Linux.
But given the impurity and imprecision of my test…

P.S. The machine is obviously two-core :-)
I don't have anything with more CPUs/cores handy.

P.P.S. Having _only glanced_ at the source, I think that there are some things
that GotoBLAS tries to do on Linux but doesn't try to do on FreeBSD,
such as setting CPU affinity for the threads or avoiding HTT pseudo-cores.
Those things are possible on FreeBSD.
Perhaps there are more things like that.

-- 
Andriy Gapon


