Date:      Thu, 28 Jul 2005 08:34:38 -0700
From:      "Boleyn, Erich" <erich.boleyn@amd.com>
To:        ray@redshift.com, freebsd-amd64@freebsd.org
Subject:   RE: Benchmarks: AMD64 vs i386 on Dual 246 Opteron
Message-ID:  <C630E866708D364A984D83C7B58763BC36B0BA@SSVLEXMB1.amd.com>

[NOTE:  I am not subscribed to the list, but got this sent to me by the
 author of the message.  Please copy me on any responses that are relevant]

Include std disclaimer here:  I work for AMD on this stuff, but am answering
generally as myself.

ray@redshift.com [mailto:ray@redshift.com] wrote:

> Freebsd-AMD64 list:
...
> The machine provided was a Dual Opteron 246 using the Tyan S2881
> motherboard.  It had 4GB of RAM and included a single SATA hard drive.
>
> I initially loaded FreeBSD 5.4 AMD64 on the machine, recompiled the
> kernel, etc. and applied all the normal tweaks to Apache, PHP, etc.  The
> machine, while faster than our single 2.4 GHz Xeons, wasn't all that much
> faster (maybe only 10 to 15 percent).
>
> After speaking with AMD and doing further benchmarks, I was about to give
> up on AMD and return the machine.  However, at the last minute, an
> engineer from AMD suggested that perhaps loading the 32 bit version of
> FreeBSD (aka i386) might improve performance, since it was possible that
> the overhead from 64 bit pointers was causing the machine to run slower
> than expected.  He also explained that the AMD should be running about 3
> to 4 times faster than the single Xeon.
>
> While this sounded like a long shot, I loaded FreeBSD 5.4 i386 on the
> machine and after applying the exact same configuration to the OS, Apache,
> PHP and MySQL, re-ran the benchmarks.  Much to my surprise, just changing
> the OS from 64 bit to 32 bit caused the machine to double in speed.  The
> results are attached in an Excel spreadsheet.  So the exact same machine,
> running the identical configuration, performed roughly twice as fast when
> running FreeBSD 5.4 i386 vs FreeBSD 5.4 AMD64.  Something about this seems
> so wrong to me :-)

...[shortened for brevity]...

> The only answer I have so far as to why this may be the case is that
> perhaps i386 uses 32 bit pointers which the CPU(s) can handle faster (thus
> less overhead for the CPU).  But it still seems odd to me that if FreeBSD
> AMD64 is written specifically for the 64 bit CPU, why doesn't the machine
> perform better when running it?

This question isn't as simple as saying "64-bit pointers vs. 32-bit pointers".

Opterons really do run code at much the same speed, clock for clock, in
64-bit and 32-bit mode when the instruction sequence is the same.  That is
almost always NOT the case, since the compiler generates different code for
each mode.

Typically, for well-optimized code, you see:
  --  32-bit x86 code has a smaller data footprint, since pointers are
        only 4 bytes.
  --  64-bit x86 code has fewer spills/reloads of registers (i.e. shorter
        code paths in most routines).

The result is that, for most reasonably well-optimized or identical code
I've run, the 64-bit version is faster.  Large-data-footprint/cache-busting
cases seem to be the exception, and there are plenty of those, of course.
Linux webservers we've tested do seem to perform much better on 64-bit.

Having said that, it's well known that FreeBSD i386 is very highly
optimized.  I'd bet that the considerably less mature FreeBSD AMD64
codebase, or the PHP/Apache software stack, is to blame here.  There is
almost certainly i386 assembly, or just i386-code-path-specific C/C++ code,
involved which is distinct from the version compiled for AMD64.

The i386 version of FreeBSD (and PHP/Apache), and specifically the
i386-specific code paths, have been beaten to death by a very large number
of people optimizing the heck out of them.  I sincerely doubt the same can
be said of the AMD64 code path.


> I'm also wondering if there is a compiler switch on AMD64 that could be
> used (perhaps in /etc/make.conf or something) that would force the AMD64
> version to run in 32 bit mode only - since that would be an interesting
> comparison as well.

Yes, for building 32-bit code:  for GCC, use "-m32"; for LD, use
"-melf_i386".


Erich Boleyn
CPG Architecture
AMD


