Date:      Fri, 19 Feb 1999 12:47:41 +1100
From:      Peter Jeremy <peter.jeremy@auss2.alcatel.com.au>
To:        hackers@FreeBSD.ORG
Subject:   Re: vm_page_zero_fill
Message-ID:  <99Feb19.123711est.40325@border.alcanet.com.au>

Alfred Perlstein <bright@cygnus.rush.net> wrote:
>After playing with "gcc -O -S bcmp.c" on several platforms, i386,
>sparc32, alpha.  It seems to me that the function ought to be
>replaced with this:
[deleted]

The code given is portable, but not optimal for any of these
architectures - especially the Alpha.  The original Alpha chips don't
have byte load/store instructions, so character handling is quite poor
(and gcc 2.7.x doesn't include support for the newer byte instructions).

Optimal code for the Alpha would read aligned 8-byte chunks from
memory, then appropriately re-align and compare them.  (There's some
discussion about this, though not actual code, in the early Alpha
white papers.)
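
As a rough illustration of the aligned-read-then-re-align idea, here is
a minimal C sketch.  It is not the code from the thread or from the
Alpha white papers.  The names bcmp_words and load64 are made up for
this example, the shift directions assume a little-endian machine
(true of Alpha and ix86), and memcpy stands in for the single aligned
load that hand-written assembler would use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load one 8-byte word.  memcpy keeps this well-defined C; a compiler
 * will normally turn it into a single load instruction. */
static uint64_t load64(const unsigned char *p)
{
    uint64_t w;
    memcpy(&w, p, sizeof w);
    return w;
}

/* Hypothetical name.  Returns 0 if the buffers match, non-zero if not.
 * Assumes little-endian byte order for the shift/merge step. */
int bcmp_words(const void *b1, const void *b2, size_t len)
{
    const unsigned char *p1 = b1, *p2 = b2;

    /* Head: compare bytes until p1 sits on an 8-byte boundary. */
    while (len > 0 && ((uintptr_t)p1 & 7) != 0) {
        if (*p1++ != *p2++)
            return 1;
        len--;
    }

    unsigned off = (unsigned)((uintptr_t)p2 & 7);
    if (off == 0) {
        /* Both pointers aligned: plain word compares. */
        while (len >= 8) {
            if (load64(p1) != load64(p2))
                return 1;
            p1 += 8; p2 += 8; len -= 8;
        }
    } else {
        unsigned have = 8 - off;  /* bytes of p2 before the next boundary */
        if (len >= have + 8) {
            const unsigned char *a2 = p2 + have;  /* aligned address in b2 */
            uint64_t carry = 0;

            /* Build the first partial word bytewise so that nothing
             * before the start of the buffer is ever read. */
            for (unsigned i = 0; i < have; i++)
                carry |= (uint64_t)p2[i] << (8 * (off + i));

            /* Each pass reads one aligned word from b2 and merges it
             * with the leftover bytes of the previous one. */
            while (len >= have + 8) {
                uint64_t next = load64(a2);
                uint64_t merged = (carry >> (8 * off)) | (next << (8 * have));
                if (load64(p1) != merged)
                    return 1;
                carry = next;
                p1 += 8; p2 += 8; a2 += 8; len -= 8;
            }
        }
    }

    /* Tail: whatever is left is compared bytewise. */
    while (len > 0) {
        if (*p1++ != *p2++)
            return 1;
        len--;
    }
    return 0;
}
```

Whether this beats a plain byte loop depends on the CPU and the
compiler, so it would need measuring on the target machine.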

A similar strategy probably holds for the SPARC (but with 4-byte loads
except on UltraSPARCs).  Something similar could be done on the ix86,
but I'm not certain about the advantages.

This _is_ one area where carefully hand-crafted code is worth the
effort (especially on the RISC architectures).

>it uses the "rep cmpsl" opcode, i have heard that using "movs/lods/cmps"
>was no longer optimal after the 486 line, but i'm unsure.

Sort of true.  In theory, an explicit loop is faster than "rep cmps"
on post-486 CPUs.  In practice, lack of CPU<->RAM bandwidth tends to
make this less of an issue unless both strings are in L1 cache.
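
For reference, the "explicit loop" alternative to "rep cmpsl" amounts
to something like the following C sketch: one 4-byte load from each
buffer, a compare and a branch per iteration.  The name cmp_words32 is
hypothetical, and the sketch says nothing about which form is actually
faster on a given CPU.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name.  Compares nwords 32-bit words and returns 0 when
 * they all match, non-zero otherwise. */
int cmp_words32(const void *b1, const void *b2, size_t nwords)
{
    const unsigned char *p1 = b1, *p2 = b2;

    while (nwords-- > 0) {
        uint32_t w1, w2;

        /* memcpy is a well-defined way to express a 4-byte load. */
        memcpy(&w1, p1, sizeof w1);
        memcpy(&w2, p2, sizeof w2);
        if (w1 != w2)
            return 1;
        p1 += 4;
        p2 += 4;
    }
    return 0;
}
```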

Peter