Date:      Sat, 6 Apr 1996 07:57:31 +1000
From:      Bruce Evans <bde@zeta.org.au>
To:        asami@cs.berkeley.edu, bde@zeta.org.au
Cc:        current@FreeBSD.org, hasty@star-gate.com, nisha@cs.berkeley.edu, tege@matematik.su.se
Subject:   Re: fast memory copy for large data sizes
Message-ID:  <199604052157.HAA25295@godzilla.zeta.org.au>

> * This seemed like a bad idea.  I added a test using it (just 8 fldl's
> * followed by 8 fstpl's, storing in reverse order - this works for at
> * least all-zero data) and got good results, but I still think it is a bad
> * idea.  

>Well, from the numbers below, it certainly seems faster than yours for 
>larger sizes even if things are in the L2 cache!

They aren't in the L2 cache (256K is a tie and yours are faster for 512K,
but 2*512K doesn't fit in the cache).  I get similar results with fildl.
I'm now trying reading and pushing, then popping and writing, 32 bytes at
a time.  This might work better if there were more registers, so that the
stack wouldn't have to be used.  However, the stack is very fast if it's
in the L1 cache (I get 800 MB/s read and 750 MB/s write).

>Note that the speed of fldls depends on the actual data.  All-zero data
>is faster than random data (to avoid traps, try ((double *)src)[i] =
>random()), probably because the all-zero bit pattern can be converted
>to floating point (ok, no conversion necessary in this case :) in a
>snap.

Have you tried using fldt?  No conversion for that.

>Ok.  By the way, why is your data lacking smaller sizes for your FP
>copy?

I didn't run them all and they weren't interesting (nowhere near 350K/s).

Bruce


