Date:      Tue, 6 Mar 2001 10:56:46 -0500 (EST)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        Matt Dillon <dillon@earth.backplane.com>
Cc:        <freebsd-hackers@FreeBSD.ORG>
Subject:   Re: Machines are getting too damn fast
Message-ID:  <15013.2238.953211.516979@grasshopper.cs.duke.edu>
In-Reply-To: <200103060013.f260DHY46910@earth.backplane.com>
References:  <Pine.BSF.4.32.0103051729350.84853-100000@mail.wolves.k12.mo.us> <200103060013.f260DHY46910@earth.backplane.com>

Matt Dillon writes:
 > 
 >     I modified my original C program again, this time to simply read
 >     the data from memory given a block size in kilobytes as an argument.  
 >     I had to throw in a little __asm to do it right, but here are my results.
 >     It shows about 3.2 GBytes/sec from the L2 (well, insofar as my
 >     3-instruction loop goes), and about 1.4 GBytes/sec from main memory.
 > 
 > 
 > NOTE:  cc x.c -O2 -o x
 > 
 > ./x 4
 > 3124.96 MBytes/sec (read)
<...>
 > ./x 1024
 > 1397.90 MBytes/sec (read)
 > 
 >     In contrast I get 1052.50 MBytes/sec on the Dell 2400 from the L2,
 >     and 444 MBytes/sec from main memory.
 > 

FWIW: 1.2GHz Athlon, VIA Apollo KT133 chipset, Asus A7V motherboard,
(PC133 ECC Registered DIMMs)

./x 4
2393.70 MBytes/sec (read)
./x 8
2398.19 MBytes/sec (read)
<...>
./x 1024
627.32 MBytes/sec (read)


And a dual 933MHz PIII, SuperMicro 370DER, ServerWorks HE-SL chipset
(2-way interleaved PC133 ECC Registered DIMMs)

./x 4
1853.54 MBytes/sec (read)
./x 1024
526.19 MBytes/sec (read)
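
For readers who want to reproduce numbers like the ones above, here is a
minimal sketch of a read-only bandwidth test in the spirit of the program
Matt describes (block size in kilobytes as the argument). It is NOT his
x.c: the names read_block and read_bandwidth_mbs, the BW_STANDALONE guard,
the clock()-based timing, and the roughly 1 GB total read are all
assumptions made for illustration, and it uses volatile loads in plain C
where his version used inline asm, so absolute figures will differ.

```c
/* Hypothetical sketch of a read-bandwidth test; not the original x.c. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Sum every 64-bit word of buf once. The volatile qualifier forces the
 * compiler to issue each load instead of optimizing the loop away. */
uint64_t read_block(const volatile uint64_t *buf, size_t nwords)
{
    uint64_t sum = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < nwords; i++)
        sum += buf[i];
    return sum;
}

/* Read a block of `kbytes` KB `passes` times and return MBytes/sec,
 * or -1.0 if the arguments are unusable or the run was too short to time. */
double read_bandwidth_mbs(size_t kbytes, unsigned passes)
{
    size_t nbytes = kbytes * 1024;
    size_t nwords = nbytes / sizeof(uint64_t);
    uint64_t sink = 0;

    if (nwords == 0 || passes == 0)
        return -1.0;

    uint64_t *buf = malloc(nbytes);
    if (buf == NULL)
        return -1.0;

    /* Touch every word first so page faults are not part of the timing. */
    for (size_t i = 0; i < nwords; i++)
        buf[i] = i;

    clock_t start = clock();
    for (unsigned p = 0; p < passes; p++)
        sink += read_block(buf, nwords);
    clock_t end = clock();

    (void)sink;
    free(buf);

    double secs = (double)(end - start) / CLOCKS_PER_SEC;
    if (secs <= 0.0)
        return -1.0;
    return ((double)nbytes * passes) / (1024.0 * 1024.0) / secs;
}

#ifdef BW_STANDALONE
int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s block-size-in-KB\n", argv[0]);
        return 1;
    }

    size_t kbytes = (size_t)strtoul(argv[1], NULL, 10);
    /* Aim for roughly 1 GB of total reads regardless of block size. */
    unsigned passes = kbytes ? (unsigned)(1024 * 1024 / kbytes) : 0;
    if (passes == 0)
        passes = 1;

    double mbs = read_bandwidth_mbs(kbytes, passes);
    if (mbs < 0.0) {
        fprintf(stderr, "bad block size or run too short to time\n");
        return 1;
    }
    printf("%.2f MBytes/sec (read)\n", mbs);
    return 0;
}
#endif
```

Built with something like "cc -O2 -DBW_STANDALONE bw.c -o bw" (file name
assumed), "./bw 4" should stay within the cache and "./bw 1024" should
mostly measure main memory on machines of this era, mirroring the 4 and
1024 runs above.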


There's something diabolic about your previous bw test, though.  I
think it only hits one bank of interleaved RAM.  On the 370DER it gets
only 167MB/sec, while every other bw test I've run on that box (hbench,
lmbench) shows copy perf at around 260MB/sec.  I see the same problem
on a PE4400 (also 2-way interleaved): your test shows copy perf as
111MB/sec, and every other test has it at 230MB/sec.

The Athlon copies at 174MB/sec, which is just about what lmbench, hbench,
etc., and your test all show.

How's your P4 for floating point?  Is real-life perf as good as the
specbench numbers would indicate, or do you need a better compiler
than GCC to get any benefit from it?  My wife is a statistician, and
she runs some really FP-intensive workloads.  For her code, this Athlon
is faster than the ServerWorks box and (barely) faster than a year-old
Alpha UP1000.

Drew


------------------------------------------------------------------------------
Andrew Gallatin, Sr Systems Programmer	http://www.cs.duke.edu/~gallatin
Duke University				Email: gallatin@cs.duke.edu
Department of Computer Science		Phone: (919) 660-6590
