Date:      Sun, 7 May 2000 06:22:20 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Dan Nelson <dnelson@emsphone.com>
Cc:        Jean-Marc Zucconi <jmz@FreeBSD.ORG>, current@FreeBSD.ORG
Subject:   Re: Can someone explain this?
Message-ID:  <Pine.BSF.4.21.0005070611030.8973-100000@besplex.bde.org>
In-Reply-To: <20000506002203.A6363@dan.emsphone.com>

On Sat, 6 May 2000, Dan Nelson wrote:

> In the last episode (May 05), Jean-Marc Zucconi said:
> > Here is something I don't understand:
> > 
> > $ sh -c  '/usr/bin/time  ./a.out'
> >         2.40 real         2.38 user         0.01 sys
> > $ /usr/bin/time  ./a.out
> >         7.19 real         7.19 user         0.00 sys

> It has to do with your stack.  Calling the program via /bin/sh sets up
> your environment differently, so your program's stack starts at a
> different place.  Try running this:

> Here are some bits from the gcc infopage explaining your options if you
> want consistent speed from programs using doubles:
> 
> `-mpreferred-stack-boundary=NUM'
>      Attempt to keep the stack boundary aligned to a 2 raised to NUM
>      byte boundary.  If `-mpreferred-stack-boundary' is not specified,
>      the default is 4 (16 bytes or 128 bits).
>      The stack is required to be aligned on a 4 byte boundary.  On
>      Pentium and PentiumPro, `double' and `long double' values should be
>      aligned to an 8 byte boundary (see `-malign-double') or suffer
>      significant run time performance penalties.  On Pentium III, the
>      Streaming SIMD Extension (SSE) data type `__m128' suffers similar
>      penalties if it is not 16 byte aligned.

The default of 4 for -mpreferred-stack-boundary perfectly preserves
any initial misalignment of the stack.  Under FreeBSD the stack is
initially misaligned (for doubles) with a probability of 1/2.  There
was some discussion of fixing this when gcc-2.95 was imported, but
nothing was committed.  I use the following local hack:

diff -c2 kern_exec.c~ kern_exec.c
*** kern_exec.c~	Mon May  1 15:56:40 2000
--- kern_exec.c	Mon May  1 15:56:42 2000
***************
*** 627,630 ****
--- 647,659 ----
  		vectp = (char **)
  			(destp - (imgp->argc + imgp->envc + 2) * sizeof(char*));
+ 
+ 	/*
+ 	 * Align stack to a multiple of 0x20.
+ 	 * XXX vectp has the wrong type; we usually want a vm_offset_t;
+ 	 * the suword() family takes a void *, but should take a vm_offset_t.
+ 	 * XXX should align stack for signals too.
+ 	 * XXX should do this more machine/compiler-independently.
+ 	 */
+ 	vectp = (char **)(((vm_offset_t)vectp & ~(vm_offset_t)0x1F) - 4);
  
  	/*

Bruce


