Date:      Fri, 5 Apr 1996 14:15:32 -0800
From:      asami@cs.berkeley.edu (Satoshi Asami)
To:        bde@zeta.org.au
Cc:        bde@zeta.org.au, current@FreeBSD.org, hasty@star-gate.com, nisha@cs.berkeley.edu, tege@matematik.su.se
Subject:   Re: fast memory copy for large data sizes
Message-ID:  <199604052215.OAA08351@sunrise.cs.berkeley.edu>
In-Reply-To: <199604052157.HAA25295@godzilla.zeta.org.au> (message from Bruce Evans on Sat, 6 Apr 1996 07:57:31 +1000)

 * >Well, from the numbers below, it certainly seems faster than yours for 
 * >larger sizes even if things are in the L2 cache!
 * 
 * They aren't in the L2 cache (256K is a tie and yours are faster for 512K
 * but 2*512K isn't in the cache).

I was commenting on the two rightmost columns; the "it" was your
version of the FP copy.  (You can't compare your int copy with our
original FP numbers, because we always started with everything out of
the cache! ;)

 * 				    I get similar results with fildl.  Now
 * trying reading and pushing then popping and writing 32 bytes at a time.
 * This might work better if there were more registers so the stack doesn't
 * have to be used.

Can you elaborate?  Can I use FP registers without using the stack?  I 
thought all the FP registers are in the stack!

 * 			     However, the stack is very fast if it's in the
 * L1 cache (I get 800 MB/s read and 750 MB/s write).

Wow.

 * Have you tried using fldt?  No conversion for that.

What's fldt?  My assembler doesn't know about that instruction....

 * >Ok.  By the way, why is your data lacking smaller sizes for your FP
 * >copy?
 * 
 * I didn't run them all and they weren't interesting (nowhere near 350K/s).

Well, it might be worthwhile to put them side by side and see how they 
compare.
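
One way to get such side-by-side numbers is a small timing harness
along the following lines.  This is only a rough sketch in portable C,
not the assembly routines discussed in this thread: copy64(),
libc_copy() and copy_rate() are hypothetical names made up for
illustration, the 8-byte word copy merely stands in for an FP-register
copy, and clock() is a coarse timer, so small sizes need many
repetitions.  An optimizing compiler may also simplify the loops, so
the figures should be treated with caution.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Signature shared by every copy routine under test. */
typedef void (*copy_fn)(void *dst, const void *src, size_t n);

/*
 * Hypothetical stand-in for a wide copy: moves 8 bytes per iteration,
 * then finishes any remaining tail one byte at a time.
 */
void copy64(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= 8) {
        uint64_t w;
        memcpy(&w, s, 8);   /* fixed-size copies avoid alignment problems */
        memcpy(d, &w, 8);
        d += 8;
        s += 8;
        n -= 8;
    }
    while (n-- > 0)
        *d++ = *s++;
}

/* Reference routine: the C library's memcpy behind the same signature. */
void libc_copy(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}

/*
 * Run fn(dst, src, size) reps times and return the rate in MB/s
 * (1 MB = 1e6 bytes).  Returns 0.0 if the run was too short for
 * clock() to measure.
 */
double copy_rate(copy_fn fn, void *dst, const void *src, size_t size, int reps)
{
    clock_t t0, t1;
    double secs;
    int i;

    assert(reps > 0);
    t0 = clock();
    for (i = 0; i < reps; i++)
        fn(dst, src, size);
    t1 = clock();

    secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
    if (secs <= 0.0)
        return 0.0;
    return (double)size * (double)reps / secs / 1e6;
}
```

Calling copy_rate() with both copy64 and libc_copy over a range of
sizes (say, from a few KB up through 256K, 512K and beyond) would give
one row per size with the two rates next to each other.  Whether the
buffers start in or out of the cache has to be controlled separately,
for example by touching a large unrelated buffer between runs.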

By the way, may we have a copy of your routine?  Is it beerware? :)

Satoshi


