Date:      Fri, 28 Mar 2003 06:05:04 +1100
From:      Peter Jeremy <peterjeremy@optushome.com.au>
To:        Dag-Erling Smørgrav <des@ofug.org>
Cc:        cvs-all@freebsd.org
Subject:   Re: Checksum/copy
Message-ID:  <20030327190504.GD11307@cirb503493.alcatel.com.au>
In-Reply-To: <xzp7kalw5j4.fsf@flood.ping.uio.no>
References:  <Pine.BSF.4.21.0303260956250.27748-100000@root.org> <20030326225530.G2075@odysseus.silby.com> <20030327180247.D1825@gamplex.bde.org> <xzp7kalw5j4.fsf@flood.ping.uio.no>

[I think this is getting somewhat off topic for the CVS lists]

On Thu, Mar 27, 2003 at 09:57:35AM +0100, Dag-Erling Smørgrav wrote:
>Might it be a good idea to have separate b{copy,zero} implementations
>for special purposes like pmap_{copy,zero}_page?  Since these cases
>copy or zero a fixed and relatively large amount of data, they should
>lend themselves well to optimization.

I think it would be useful - even ignoring SSE, most of the fast
b{zero,copy} implementations include a fair amount of special code
to handle alignment issues and the odd few bytes at the beginning/end
that don't fit into the main loop's work unit.  Having a known size
and alignment simplifies the code a lot.
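
To illustrate the difference, here is a minimal sketch in portable C.
It is not taken from any FreeBSD source: the names generic_zero,
page_zero_fixed, DEMO_PAGE_SIZE and DEMO_WORD are made up for this
example, and a 4096-byte page is assumed.  The general routine needs a
head loop, a main loop and a tail loop; the page routine needs only the
main loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE	4096u			/* assumed page size (i386) */
#define DEMO_WORD	sizeof(unsigned long)	/* main loop's work unit */

/* General case: any address, any length. */
static void
generic_zero(void *dst, size_t len)
{
	unsigned char *p = dst;

	/* Head: byte stores until p reaches a word boundary. */
	while (len > 0 && (uintptr_t)p % DEMO_WORD != 0) {
		*p++ = 0;
		len--;
	}
	/* Main loop: one word per store. */
	while (len >= DEMO_WORD) {
		*(unsigned long *)(void *)p = 0;
		p += DEMO_WORD;
		len -= DEMO_WORD;
	}
	/* Tail: the odd few bytes left over. */
	while (len > 0) {
		*p++ = 0;
		len--;
	}
}

/* Special case: exactly one page, page-aligned.  No head, no tail. */
static void
page_zero_fixed(void *page)
{
	unsigned long *p = page;
	size_t i;

	assert((uintptr_t)page % DEMO_PAGE_SIZE == 0);
	for (i = 0; i < DEMO_PAGE_SIZE / DEMO_WORD; i += 4) {
		p[i] = 0;
		p[i + 1] = 0;
		p[i + 2] = 0;
		p[i + 3] = 0;
	}
}

/* Self-check: returns 1 if both routines behave as described. */
int
demo_zero_check(void)
{
	static _Alignas(DEMO_PAGE_SIZE) unsigned long a[DEMO_PAGE_SIZE / sizeof(unsigned long)];
	static _Alignas(DEMO_PAGE_SIZE) unsigned long b[DEMO_PAGE_SIZE / sizeof(unsigned long)];
	unsigned char *ab = (unsigned char *)a;

	memset(a, 0xa5, sizeof(a));
	memset(b, 0xa5, sizeof(b));

	/* Unaligned start and odd length: only bytes 3..23 may change. */
	generic_zero(ab + 3, 21);
	if (ab[2] != 0xa5 || ab[3] != 0 || ab[23] != 0 || ab[24] != 0xa5)
		return (0);

	/* On a whole page both routines must produce the same result. */
	generic_zero(a, sizeof(a));
	page_zero_fixed(b);
	return (memcmp(a, b, sizeof(a)) == 0 && ab[0] == 0 &&
	    ab[DEMO_PAGE_SIZE - 1] == 0);
}
```

The unrolling factor of four is arbitrary; the point is only that the
fixed-size routine has no alignment or remainder handling at all.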

>  Zeroing a 4096-byte page on an
>SSE-enabled i386 should take no more than 35 SSE instructions

The downside is that we need multiple implementations to take advantage
of features available in different CPUs.
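
One common way to live with multiple implementations is to pick one
at startup and call it through a function pointer.  The sketch below
assumes that approach; it is not how any existing kernel does it.
DEMO_FEAT_SSE, page_zero_select and page_zero_dispatch are hypothetical
names, and the "SSE" routine is a plain C stand-in that contains no real
SSE instructions.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE	4096u	/* assumed page size (i386) */
#define DEMO_FEAT_SSE	0x1u	/* hypothetical CPU feature bit */

typedef void (*page_zero_fn)(void *);

/* Baseline that runs on every CPU. */
static void
page_zero_base(void *page)
{
	memset(page, 0, DEMO_PAGE_SIZE);
}

/* Stand-in for an SSE routine: 16-byte chunks, plain C, no real SSE. */
static void
page_zero_sse_standin(void *page)
{
	unsigned char *p = page;
	size_t off;

	for (off = 0; off < DEMO_PAGE_SIZE; off += 16)
		memset(p + off, 0, 16);
}

/* Routine currently in use; defaults to the baseline. */
static page_zero_fn page_zero_impl = page_zero_base;

/* Call once at startup with whatever feature bits were detected. */
void
page_zero_select(unsigned int features)
{
	page_zero_impl = (features & DEMO_FEAT_SSE) != 0 ?
	    page_zero_sse_standin : page_zero_base;
}

/* Callers use this and never care which routine is behind it. */
void
page_zero_dispatch(void *page)
{
	page_zero_impl(page);
}

/* Self-check: 1 if the expected routine was picked and the page is zero. */
int
demo_dispatch_check(unsigned int features)
{
	static unsigned char page[DEMO_PAGE_SIZE];
	page_zero_fn want = (features & DEMO_FEAT_SSE) != 0 ?
	    page_zero_sse_standin : page_zero_base;
	size_t i;

	memset(page, 0xa5, sizeof(page));
	page_zero_select(features);
	if (page_zero_impl != want)
		return (0);
	page_zero_dispatch(page);
	for (i = 0; i < sizeof(page); i++)
		if (page[i] != 0)
			return (0);
	return (1);
}
```

The indirect call adds a small cost on every page, which is one more
thing the benchmarks would have to account for.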

I guess it's a "put up your patches and benchmark results" issue.

Peter


