Date:      Sat, 06 Apr 1996 19:54:25 +0200
From:      Torbjorn Granlund <tege@matematik.su.se>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        asami@cs.berkeley.edu, current@freebsd.org, hasty@rah.star-gate.com, mrami@minerva.cis.yale.edu, nisha@cs.berkeley.edu, tege@matematik.su.se
Subject:   Re: optimized bzeros found harmful (was: fast memory copy ...) 
Message-ID:  <199604061754.TAA17355@insanus.matematik.su.se>
In-Reply-To: Your message of "Sat, 06 Apr 1996 09:13:46 +1000." <199604052313.JAA28956@godzilla.zeta.org.au>

  This behaviour is consistent with the data being zeroed usually not being
  in the L2 cache.  RBW is 33% slower in that case on my system.  Other
  cases: if the data is in the L2 cache but not in the L1 cache, then RBW
  is between 0% and 33% faster; if the data is in the L1 cache, then
  RBW is 8.5 times faster (740MB/s!).

This must be a misunderstanding!

If the data is really in the L1 cache, the read-before-write is wasted and
just contributes to the overhead.

The read-before-write is effective if and only if the data is not in the L1
cache.  In that case, it forces allocation of the cache line in the L1
cache, and thereby allows a 14x peak speedup.
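To make the idea concrete, here is a minimal sketch of a read-before-write zeroing loop in portable C. It is not the kernel's bzero. The name rbw_bzero and the 32-byte CACHE_LINE value are assumptions made for illustration, and the sketch only pays off on a CPU whose L1 cache does not allocate a line on a write miss.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed L1 cache line size in bytes; not stated in this message. */
#define CACHE_LINE 32

/*
 * Hypothetical helper: zero n bytes at dst, reading one byte of each
 * cache line before storing to it.  On a cache that does not allocate
 * on write, the read pulls the line into L1 so that the stores that
 * follow hit the cache instead of going out to memory one by one.
 */
void rbw_bzero(void *dst, size_t n)
{
    unsigned char *p = dst;

    while (n > 0) {
        /* Bytes remaining up to the next cache-line boundary. */
        size_t chunk = CACHE_LINE - ((uintptr_t)p % CACHE_LINE);
        if (chunk > n)
            chunk = n;

        /* The volatile access keeps the compiler from dropping the read. */
        (void)*(volatile unsigned char *)p;

        memset(p, 0, chunk);
        p += chunk;
        n -= chunk;
    }
}
```

A call such as rbw_bzero(buf, len) gives the same result as bzero(buf, len). When the line is already in the L1 cache, the extra read is pure overhead, which is the point made above.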

If you observe any other behaviour, your timing framework is misleading you.

All other CPUs I know of have caches that do allocate-on-write.


