Date:      Wed, 15 Mar 2006 13:56:27 -0800
From:      Peter Wemm <peter@wemm.org>
To:        freebsd-amd64@freebsd.org, cokane@cokane.org
Cc:        kono@kth.se
Subject:   Re: amd64 slower than i386 on identical AMD 64 system? / How is hyperthreading handled on amd64?
Message-ID:  <200603151356.27972.peter@wemm.org>
In-Reply-To: <346a80220603141520i2ac1a4br66cbfb213453dcd6@mail.gmail.com>
References:  <20060313221836.5491916A420@hub.freebsd.org> <200603140740.38388.joao@matik.com.br> <346a80220603141520i2ac1a4br66cbfb213453dcd6@mail.gmail.com>

On Tuesday 14 March 2006 03:20 pm, Coleman Kane wrote:
> On 3/14/06, JoaoBR <joao@matik.com.br> wrote:
> > On Tuesday 14 March 2006 07:06, Alexander Konovalenko wrote:
> > > > Hi
> > > > Since some time (>6.0R) I have the impression that amd64 runs
> > > > slower than i386. Now I ran some tests on identical hardware, and
> > > > ubench confirms this. Does anybody have comments on this?
> > >
> > > I have Dual core AMD64 4400+ and FreeBSD RELENG_5. I don't have
> > > FreeBSD i386 installed but you can just compare benchmarks.
> > >
> > > ubench uses all CPU/cores by default, when one ubench is running,
> > > top shows:
> >
> > so where is your comparison? My point was that the same hardware is
> > faster running i386
> >
> > I experience this also on X2 machines but do not have two machines
> > to compare. I have a X2-4400-SMP running amd64 and a X2-4200-SMP
> > running i386, and they give me the same numbers running ubench
> >
> >
> >
> > João
> >
> > > >   PID USERNAME   PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
> > > > 11528 XXXX       111    0  3572K   880K RUN    1   0:12 93.64% 42.29% ubench
> > > > 11529 XXXX       111    0  3572K   880K CPU0   1   0:11 97.21% 41.16% ubench
> > > > 11526 XXXX        -8    0  3572K   880K piperd 0   0:17 41.76% 31.98% ubench
> > >
> > >
> > > > one ubench executed (with no -s flag = use all CPU, default):
> > >
> > > Unix Benchmark Utility v.0.3
> > > Copyright (C) July, 1999 PhysTech, Inc.
> > > Author: Sergei Viznyuk <sv@phystech.com>
> > > http://www.phystech.com/download/ubench.html
> > > > FreeBSD 5.5-PRERELEASE FreeBSD 5.5-PRERELEASE #12: Sun Mar  5
> > > > 17:34:07 CET 2006     XXXX@XXXX:/usr/obj/usr/src/sys/DAEMON64SMP amd64
> > > Ubench CPU:   238149
> > > Ubench MEM:   255459
> > > --------------------
> > > Ubench AVG:   246804
> > >
> > >
> > > two ubench executed with -s flag (use single CPU only):
> > >
> > > Ubench Single CPU:   120184 (0.40s)
> > > Ubench Single MEM:   126787 (0.39s)
> > > -----------------------------------
> > > Ubench Single AVG:   123485
> > >
> > > Ubench Single CPU:   121000 (0.41s)
> > > Ubench Single MEM:   128762 (0.40s)
> > > -----------------------------------
> > > Ubench Single AVG:   124881
> > >
> > >
> > > one ubench executed with -s flag (use single CPU only):
> > >
> > > Ubench Single CPU:   123251 (0.40s)
> > > Ubench Single MEM:   161494 (0.40s)
> > > -----------------------------------
> > > Ubench Single AVG:   142372
> > >
> > >
> > > /Alexander Konovalenko
> > >
> > > +46-8-5537-8142 (office)
> > > +46-7-3752-2116
> > > http://daemon.nanophys.kth.se/~kono
> > >
> > > Royal Institute of Technology (KTH)
> > > Nanostructure Physics Department, Albanova
> > > Roslagstullsbacken 21
> > > 10691 Stockholm
> > > Sweden
> >
>
> I think that the nature of the ubench benchmark should be
> investigated to reveal the reasons behind your dismay. It seems to me
> that your assumption that 64-bit should be faster than 32-bit in all
> cases is wrong. The nature of the processor design, the OS
> implementation, and how ubench does its measurement all need to be
> considered.
>
> First of all, when comparing a 64-bit amd64 to a 32-bit IA-32 system
> it is important to know that this *does not* in fact mean that if you
> tested a loop of:
> long x, y, z;
> x = 1;
> y = 1;
> z = x + y;
>
> the 64-bit machine would do 2X the above calculation. In fact, on
> the 64-bit machine, the memory taken up by x, y, and z would be
> double that on the i386 (long is 8 bytes on amd64 but 4 on i386),
> the add/load instructions would also grow, since 64-bit operands
> need an extra prefix byte, and as far as execution goes, the time
> *should* be about the same for both units. All of this suggests that
> 64-bit would, by its nature, have a slower average than your 32-bit
> system.
>
> In addition, amd64 64-bit mode doubles your register set, increasing
> the amount of memory that needs to be moved around on a context
> switch, and everything is pointing towards.....probably slower.

I tend to agree with this.  ubench is not a useful benchmark for
comparing 32 bit vs 64 bit systems.
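
To make the operand-width point concrete, here is a minimal C sketch.
The function names are made up for illustration and are not taken from
ubench. A kernel written with `long` does 32-bit arithmetic when built
for i386 and 64-bit arithmetic when built for amd64, while the
fixed-width version does 32-bit arithmetic on both:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Width of `long` in bits for whatever target this is compiled for:
 * 32 on FreeBSD/i386 (ILP32), 64 on FreeBSD/amd64 (LP64). */
unsigned long_width_bits(void)
{
    return (unsigned)(sizeof(long) * CHAR_BIT);
}

/* Hypothetical benchmark kernel written with `long`: the operand
 * width, and so the work per iteration, follows the target platform. */
long sum_native_long(long n)
{
    assert(n >= 0);             /* negative counts make no sense here */
    long acc = 0;
    for (long i = 0; i < n; i++)
        acc += i;
    return acc;
}

/* The same kernel with a fixed-width type: always 32-bit arithmetic,
 * whether it is built for i386 or for amd64. */
uint32_t sum_fixed32(uint32_t n)
{
    uint32_t acc = 0;
    for (uint32_t i = 0; i < n; i++)
        acc += i;
    return acc;
}
```

If a benchmark's inner loops look like sum_native_long(), the i386 and
amd64 builds are not doing the same work, so their scores are not
directly comparable.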

However, what might be interesting is to compile a 32 bit binary (and
statically link it) on the i386 system, and compare its runtime there
with its runtime on the 64 bit kernel, using the identical binary.
That way you are measuring the same math operations on both platforms.
Comparing 64 bit operations vs 32 bit operations is apples vs oranges.

Of course, it may still be slower, but at least the results would be
more meaningful.  Don't assume the OS is slower because the compiler
makes the application do twice the work.
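
A rough sketch of such a test, assuming a made-up workload (work32 and
time_work32 are hypothetical names, not part of ubench): the workload
uses only 32-bit arithmetic and is timed with the standard clock()
call. Built once as a static 32 bit binary on the i386 system (for
example with cc -static), the same file could then be run unchanged
under both kernels:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical fixed-width workload: 32-bit operations only. */
uint32_t work32(uint32_t n)
{
    uint32_t acc = 0;
    for (uint32_t i = 0; i < n; i++)
        acc += i ^ (acc >> 3);
    return acc;
}

/* Returns the CPU seconds used by `reps` calls of work32(n), as
 * measured by clock(). The accumulated result is stored through
 * result_out (if non-NULL) so the work cannot be optimized away. */
double time_work32(uint32_t n, int reps, uint32_t *result_out)
{
    assert(reps > 0);
    volatile uint32_t sink = 0;
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        sink += work32(n);
    clock_t end = clock();
    if (result_out != NULL)
        *result_out = sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

The absolute numbers mean little on their own; what matters is the
ratio between the two runs of the identical binary.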

-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5


