Date:      Thu, 15 Apr 2010 16:29:36 +0800
From:      Adrian Chadd <adrian@freebsd.org>
To:        Maho NAKATA <chat95@mac.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: HyperThreading makes worse to me (was Re: How to reproduce: Re: Only 70% of theoretical peak performance on FreeBSD 8/amd64, Corei7 920)
Message-ID:  <m2wd763ac661004150129za1457973zdff5310f17341084@mail.gmail.com>
In-Reply-To: <20100415.094643.450985660335296086.chat95@mac.com>
References:  <20100414.082109.29593248145846106.chat95@mac.com> <4BC5DEB4.1090208@freebsd.org> <x2k6201873e1004140934z6f7518b9j72ffd9e1adc1ad49@mail.gmail.com> <20100415.094643.450985660335296086.chat95@mac.com>

May I make a suggestion?

Would you mind creating a shared Google spreadsheet with your testing
results and a shared Google document with the test setup?

I think having the data in an easily represented, easily shared medium
would be beneficial to everyone.


Adrian

On 15 April 2010 08:46, Maho NAKATA <chat95@mac.com> wrote:
> Hi Andry and Adam
>
> Here are my tests again. No desktop etc.; I ran only dgemm.
> Contrary to Adam's result, Hyper-Threading makes the performance worse.
> All tests were done on a Core i7 920 @ 2.67GHz (Turbo Boost @ 2.8GHz).
>
> Turbo Boost off, Hyper threading off: 82% (35GFlops)       [1]
> Turbo Boost off, Hyper threading off: 72% (30.5GFlops)     [2]
>
> Turbo Boost on,  Hyper threading on:  71% (32GFlops)       [3]
> Turbo Boost off, Hyper threading off: 84-89% (38-40GFlops) [4]
>
> ---my system---
> CPU: Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz (2683.44-MHz K8-class CPU)
>  Origin = "GenuineIntel"  Id = 0x106a5  Stepping = 5
>  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>  Features2=0x98e3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT>
>  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
>  AMD Features2=0x1<LAHF>
>  TSC: P-state invariant
> real memory  = 12884901888 (12288 MB)
> avail memory = 12387717120 (11813 MB)
> ACPI APIC Table: <110909 APIC1026>
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
> FreeBSD/SMP: 1 package(s) x 4 core(s)
> ---my system---
>
> ---DETAILS---
> [1]
> % ./dgemm
> n: 3000
> time : 57.666717 or 16.339074
> Mflops : 33060.624827
> n: 3100
> time : 61.502677 or 16.597376
> Mflops : 35910.025544
> n: 3200
> time : 69.075401 or 19.199833
> Mflops : 34144.297133
> n: 3300
> time : 73.699540 or 19.633594
> Mflops : 36618.756539
> n: 3400
> time : 82.256194 or 22.373651
> Mflops : 35144.518837
> n: 3500
> time : 88.975662 or 24.118761
> Mflops : 35563.394249
> n: 3600
> time : 96.436652 or 26.027588
> Mflops : 35861.148385
> n: 3700
> [2]
> % ./dgemm
> n: 3000
> time : 139.622739 or 17.693806
> Mflops : 30529.327312
> n: 3100
> time : 154.344971 or 19.566886
> Mflops : 30460.247702
> n: 3200
> time : 169.507739 or 21.467100
> Mflops : 30538.116602
> n: 3300
> time : 186.363773 or 23.615281
> Mflops : 30444.600545
> n: 3400
> time : 203.798979 or 25.817667
> Mflops : 30456.322788
> n: 3500
> ...
> [3]
> % ./dgemm
> n: 3000
> time : 134.673079 or 16.958682
> Mflops : 31852.711082
> n: 3100
> time : 148.410085 or 18.663248
> Mflops : 31935.073574
> n: 3200
> time : 162.835473 or 20.468825
> Mflops : 32027.475770
> n: 3300
> time : 179.025370 or 22.479189
> Mflops : 31983.262501
> n: 3400
> time : 195.859710 or 24.663009
> Mflops : 31882.208788
> n: 3500
> [4]
> % ./dgemm
> n: 3000
> time : 54.259647 or 14.684309
> Mflops : 36786.204907
> n: 3100
> time : 60.899147 or 17.124599
> Mflops : 34804.447141
> n: 3200
> time : 64.295342 or 17.490787
> Mflops : 37480.577569
> n: 3300
> time : 69.781247 or 18.288840
> Mflops : 39311.284796
> n: 3400
> time : 79.234397 or 21.829736
> Mflops : 36020.187858
> n: 3500
> time : 83.905419 or 22.381237
> Mflops : 38324.289174
> n: 3600
> time : 92.195022 or 25.105942
> Mflops : 37177.621122
> n: 3700
> time : 97.718841 or 25.434243
> Mflops : 39841.319494
> n: 3800
> time : 105.740463 or 27.414029
> Mflops : 40042.592613
> n: 3900
> time : 113.980157 or 29.678505
> Mflops : 39984.635420
> n: 4000
> time : 122.941569 or 31.946174
> Mflops : 40077.412531
> n: 4100
> ---DETAILS---
>
>
> From: Adam Vande More <amvandemore@gmail.com>
> Subject: Re: How to reproduce: Re: Only 70% of theoretical peak performance on FreeBSD 8/amd64, Corei7 920
> Date: Wed, 14 Apr 2010 11:34:45 -0500
>
>>> > time : 162.488885 or 20.430651
>>> > Mflops : 32087.318295
>>> > n: 3300
>>> > time : 178.497079 or 22.446093
>>> > Mflops : 32030.420499
>>> > n: 3400
>>> > time : 195.550715 or 24.586152
>>> > Mflops : 31981.873273
>>> > n: 3500
>>> > time : 213.403379 or 26.825058
>>> > Mflops : 31975.513363
>>> > n: 3600
>>> > ...
>>> > above output is on Core i7 920 (2.66GHz; TurboBoost on)
>>>
>>> My results:
>>> $ ./dgemm
>>> n: 3000
>>> time : 54.151302 or 28.189781
>>> Mflops : 19162.263125
>>> n: 3100
>>> time : 60.157449 or 32.214141
>>> Mflops : 18501.570537
>>> n: 3200
>>> time : 65.753191 or 34.114872
>>> Mflops : 19216.393378
>>>
>>> CPU:
>>> CPU: Intel(R) Core(TM)2 Duo CPU     E7300  @ 2.66GHz (2653.35-MHz K8-class CPU)
>>>  Origin = "GenuineIntel"  Id = 0x10676  Stepping = 6
>>>
>>>  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>>>
>>>  Features2=0x8e39d<SSE3,DTES64,MON,DS_CPL,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1>
>>>  AMD Features=0x20100800<SYSCALL,NX,LM>
>>>  AMD Features2=0x1<LAHF>
>>>  TSC: P-state invariant
>>> ⋮
>>> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
>>> FreeBSD/SMP: 1 package(s) x 2 core(s)
>>>
>>> FreeBSD:
>>> FreeBSD 8.0-STABLE r205070 amd64
>>>
>>> Please note that the system was not dedicated to the test, I had
>>> Xorg+KDE3+thunderbird+skype+kopete+konsole(s) plus a bunch of daemons
>>> running.
>>> That probably explains irregularities in the results.
>>>
>>> I am not sure how exactly theoretical maximum should be calculated, I used
>>> 2 * 2.66G * 4 ≈ 21.3G.
>>> And so 19.2G / 21.3G ≈ 90%.
>>>
>>> Not as bad as what you get.
>>> Although not as good as what you report for Linux.
>>> But given the impurity and imprecision of my test…
>>>
>>> P.S. the machine is two-core obviously :-)
>>> Don't have anything with more cpus/cores handy.
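
The peak-performance arithmetic quoted above (2 * 2.66G * 4 ≈ 21.3G, then
19.2G / 21.3G ≈ 90%) can be written as a small helper so that every machine
in the thread is compared the same way. This is only a sketch: the function
names are made up for illustration, and reading the factor 4 as
"double-precision flops per core per cycle" is an assumption, not something
the thread states. Applying the same formula to the Core i7 920 figures
(4 cores, 2.67GHz, about 35GFlops) gives a ratio that agrees with the 82%
reported for run [1], but that does not prove the percentages were computed
this way.

```python
# Sketch of the peak-performance arithmetic from the quoted message.
# flops_per_cycle=4 is the factor used in the quoted formula; treating it as
# "double-precision flops per core per cycle" is an assumption.

def peak_gflops(cores, ghz, flops_per_cycle=4):
    """Theoretical peak in GFlops: cores * clock (GHz) * flops per cycle."""
    return cores * ghz * flops_per_cycle


def efficiency(measured_gflops, cores, ghz, flops_per_cycle=4):
    """Fraction of the theoretical peak that was actually achieved."""
    return measured_gflops / peak_gflops(cores, ghz, flops_per_cycle)


# Core 2 Duo E7300: 2 cores at 2.66 GHz, about 19.2 GFlops measured.
print(f"{peak_gflops(2, 2.66):.1f} GFlops peak")   # 21.3 GFlops peak
print(f"{efficiency(19.2, 2, 2.66):.0%} of peak")  # 90% of peak

# Core i7 920: 4 cores at 2.67 GHz, about 35 GFlops measured (run [1]).
print(f"{efficiency(35.0, 4, 2.67):.0%} of peak")  # 82% of peak
```

With Turbo Boost active the clock argument would have to change (the thread
mentions 2.8GHz), which lowers the computed efficiency for the same measured
rate.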
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>
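
For anyone putting the quoted dgemm output into a spreadsheet, the Mflops
column can be recomputed from n and the time. The sketch below is inferred
purely from the numbers in the quoted output, not from the benchmark source:
the reported values are consistent with 10 repetitions of (2n^3 + 2n^2)
floating-point operations divided by the second of the two time values. The
repeat count, the operation count, and the function name are all assumptions.

```python
# Recompute the Mflops column of the quoted dgemm output.
# ASSUMPTION (inferred from the printed numbers, not from the benchmark
# source): each measurement covers 10 dgemm calls of 2*n^3 + 2*n^2 flops,
# and the rate uses the second of the two "time" values.

REPEATS = 10  # assumed number of dgemm calls per measurement


def dgemm_mflops(n, seconds, repeats=REPEATS):
    """Mflops for `repeats` n x n dgemm calls that took `seconds` in total."""
    flops = repeats * (2 * n**3 + 2 * n**2)
    return flops / seconds / 1e6


# (n, second time value, reported Mflops) copied from the quoted output.
samples = [
    (3000, 16.339074, 33060.624827),  # run [1]
    (3100, 16.597376, 35910.025544),  # run [1]
    (3000, 17.693806, 30529.327312),  # run [2]
]

for n, seconds, reported in samples:
    computed = dgemm_mflops(n, seconds)
    print(f"n={n}: reported {reported:.2f}, recomputed {computed:.2f}")
```

If the recomputed and reported columns disagree for other rows, the
assumptions above are the first thing to revisit.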


