Date:      Tue, 07 Nov 1995 02:56:06 -0800
From:      "Amancio Hasty Jr." <hasty@rah.star-gate.com>
To:        freebsd-hackers@freebsd.org
Message-ID:  <199511071056.CAA02766@rah.star-gate.com>

Return-Path: hasty@rah.star-gate.com
Received: from rah.star-gate.com (rah.star-gate.com [204.188.121.18]) by 
rah.star-gate.com (8.6.12/8.6.9) with SMTP id CAA02714 for 
<freebsd-hackers@freebsd.org>; Tue, 7 Nov 1995 02:52:12 -0800
Message-Id: <199511071052.CAA02714@rah.star-gate.com>
Date: Tue, 07 Nov 95 02:52:13 -0800
Sender: hasty
From: "Amancio Hasty, Jr." <hasty@rah.star-gate.com>
X-Mailer: Mozilla 1.1N (X11; I; FreeBSD 2.1-STABLE i386)
MIME-Version: 1.0
To: freebsd-hackers@freebsd.org
Subject: Re: A question about fast copying with a Pentium processor
X-URL: news:47lm63$6j0@ixnews3.ix.netcom.com
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

Should we start using floating point? 8)

That's a joke; however, I do think that some of you may find this interesting...

	Cheers,
	Amancio

mschmit@ix.netcom.com (Mike Schmit) wrote:
>In <1995Nov5.235249.8471@nmt.edu> borchers@nmt.edu (Brian Borchers) writes: 
>>
>>I've got a question about coding for speed on the Pentium that has me
>>somewhat baffled.  Consider the problem of copying a large number of
>>double precision numbers from one array to another.  Here's C code
>>for the operation:
>>
>>    for (i=0; i<=SIZE-1; i++)
>>      {
>>	b[i]=a[i];
>>      };
>>
>>
>>Using the GNU C Compiler version 2.6.3 (I know, I should move up to the
>>latest version, but that has nothing to do with my question) we get
>>the following code for this loop:
>>
>>L20:
>>	movl (%ebx),%eax
>>	movl 4(%ebx),%edx
>>	movl %eax,(%ecx)
>>	movl %edx,4(%ecx)
>>	addl $8,%ecx
>>	addl $8,%ebx
>>	cmpl %edi,%ecx
>>	jle L20
>>
>>When I run the code on fairly large arrays, I find that my system can copy
>>about 30 Megabytes per second on arrays of four megabytes or so.  
>>
>>I then rewrite the loop as follows:
>>
>>L20:
>>	fldl (%ebx)
>>	fstpl (%ecx) 
>>	addl $8,%ecx
>>	addl $8,%ebx
>>	cmpl %edi,%ecx
>>	jle L20
>>
>>The resulting program copies data at about 60 Megabytes per second.  
>>
>>Thinking about it, I came to the conclusion that both versions of the
>>code should probably be most limited by memory bandwidth.  However, I
>>expect that both codes should be using exactly the same memory
>>bandwidth.
>>
>>Looking at "Optimizations for Intel's 32-Bit Processors", Version 2.0, 
>>I see that on page 25, an approach like that used by gcc is suggested
>>as being twice as fast as the other approach, while in practice, it
>>seems to be twice as slow.  
>>
>>Questions:
>> 
>>         - Why is the first version of the code not as fast as the
>>           second?
>> 
>>         - Why isn't the second version faster than the first (as
>>           indicated by "Optimizations for Intel's 32-Bit Processors")?
>
>     (Did you mean first version?)
>
>> 
>>         - What's going on here?
>>           
>
>I'm not sure why the Intel book says what it does. But the reason you
>are getting a faster copy is that the FP load and store instructions are
>reading and writing memory 8 bytes at a time (and presumably these have
>been properly aligned). The other integer code is just copying 4 bytes
>at a time.
>
>Mike Schmit
>
>-------------------------------------------------------------------
>mschmit@ix.netcom.com           author:
>408-244-6826                    Pentium Processor Programming Tools
>800-765-8086                    ISBN: 0-12-627230-1
>-------------------------------------------------------------------
>
news:47lm63$6j0@ixnews3.ix.netcom.com
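
For anyone who wants to try the comparison on their own hardware, here
is a minimal C sketch of the two copy strategies discussed above. It is
not the original poster's test program: the names copy_words,
copy_doubles and mb_per_sec are made up for illustration, and clock()
is only a coarse timer. A modern optimizing compiler may well turn both
loops into the same memcpy or vector code, so the 4-byte versus 8-byte
difference is only visible if the generated assembly really differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Copy n doubles as 4-byte words, mirroring the movl loop.
   Going through memcpy on a uint32_t avoids aliasing problems. */
static void copy_words(double *dst, const double *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    size_t nbytes = n * sizeof(double);

    assert(nbytes % sizeof(uint32_t) == 0);
    for (size_t i = 0; i < nbytes; i += sizeof(uint32_t)) {
        uint32_t w;
        memcpy(&w, s + i, sizeof w);
        memcpy(d + i, &w, sizeof w);
    }
}

/* Copy n doubles one 8-byte element at a time, mirroring fldl/fstpl. */
static void copy_doubles(double *dst, const double *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Hypothetical helper type: any function with the copy signature. */
typedef void (*copy_fn)(double *, const double *, size_t);

/* Time `reps` copies of n doubles and return megabytes per second,
   or 0.0 if the clock was too coarse to measure the run. */
static double mb_per_sec(copy_fn fn, double *dst, const double *src,
                         size_t n, int reps)
{
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        fn(dst, src, n);
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
    double mbytes = (double)reps * (double)n * sizeof(double)
                    / (1024.0 * 1024.0);
    return secs > 0.0 ? mbytes / secs : 0.0;
}
```

To approximate the test in the post, allocate two arrays of 512K doubles
(4 megabytes each), fill the source, and call mb_per_sec once with
copy_words and once with copy_doubles, using enough repetitions that the
run lasts a second or more.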

-- 
Amancio Hasty                       
Hasty Software Consulting Services
Tel:      415-495-3046
Fax:      415-495-3046
Cellular: 415-309-8434
e-mail:	  hasty@star-gate.com      Powered by FreeBSD


