Date: Sat, 6 Apr 1996 07:57:31 +1000
From: Bruce Evans <bde@zeta.org.au>
To: asami@cs.berkeley.edu, bde@zeta.org.au
Cc: current@FreeBSD.org, hasty@star-gate.com, nisha@cs.berkeley.edu, tege@matematik.su.se
Subject: Re: fast memory copy for large data sizes
Message-ID: <199604052157.HAA25295@godzilla.zeta.org.au>
> * This seemed like a bad idea.  I added a test using it (just 8 fldl's
> * followed by 8 fstpl's, storing in reverse order - this works for at
> * least all-zero data) and got good results, but I still think it is a
> * bad idea.

>Well, from the numbers below, it certainly seems faster than yours for
>larger sizes even if things are in the L2 cache!

They aren't in the L2 cache (256K is a tie and yours are faster for 512K,
but 2*512K isn't in the cache).  I get similar results with fildl.  Now
trying reading and pushing, then popping and writing, 32 bytes at a time.
This might work better if there were more registers, so that the stack
wouldn't have to be used.  However, the stack is very fast if it's in the
L1 cache (I get 800 MB/s read and 750 MB/s write).

>Note that the speed of fldls depends on the actual data.  All-zero data
>is faster than random data (to avoid traps, try ((double *)src)[i] =
>random()), probably because the all-zero bit pattern can be converted
>to floating point (ok, no conversion necessary in this case :) in a
>snap.

Have you tried using fldt?  No conversion for that.

>Ok.  By the way, why is your data lacking smaller sizes for your FP
>copy?

I didn't run them all, and they weren't interesting (nowhere near 350K/s).

Bruce
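[Editor's note: a minimal C sketch of the "8 fldl's followed by 8 fstpl's"
copy idea discussed above.  The real code under discussion was i386
assembly using the x87 register stack; here plain C double assignments
stand in for the FP loads and stores, and the function name fpcopy64 is
made up for illustration.  The stores are issued in reverse order of the
loads, mirroring how fstp pops the last-loaded value first.  As the
thread notes, loading arbitrary byte patterns through fldl (or through a
C double) can trap on signaling NaNs, which is why all-zero data was
used for testing and why fldt, which does no format conversion, was
suggested.]

```c
#include <stddef.h>
#include <string.h>

/*
 * Sketch of the FP-register block copy from the thread: pull a 64-byte
 * chunk (8 doubles) into registers, then store it back out with the
 * stores in reverse order of the loads, as fstp naturally does.
 *
 * Caveat (from the thread): treating raw bytes as doubles is only safe
 * for benign bit patterns such as all-zero data; signaling NaNs can
 * trap when loaded through the x87 fldl path.
 */
static void
fpcopy64(double *dst, const double *src, size_t ndoubles)
{
	size_t i;

	for (i = 0; i + 8 <= ndoubles; i += 8) {
		/* 8 "fldl"s: load a 64-byte chunk into registers. */
		double r0 = src[i + 0], r1 = src[i + 1];
		double r2 = src[i + 2], r3 = src[i + 3];
		double r4 = src[i + 4], r5 = src[i + 5];
		double r6 = src[i + 6], r7 = src[i + 7];

		/* 8 "fstpl"s: store the chunk, popping in reverse order. */
		dst[i + 7] = r7; dst[i + 6] = r6;
		dst[i + 5] = r5; dst[i + 4] = r4;
		dst[i + 3] = r3; dst[i + 2] = r2;
		dst[i + 1] = r1; dst[i + 0] = r0;
	}
	/* Copy any tail shorter than 8 doubles byte-wise. */
	memcpy(dst + i, src + i, (ndoubles - i) * sizeof(double));
}
```

A modern compiler will of course schedule these loads and stores however
it likes; the point of the original assembly was to force 64-bit-wide
moves through the FP unit on a CPU whose integer registers were only 32
bits wide.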