Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 12 Aug 2002 07:10:05 -0700 (PDT)
From:      Bruce Evans <bde@zeta.org.au>
To:        freebsd-bugs@FreeBSD.org
Subject:   Re: i386/41528: better stack alignment patch for lib/csu/i386-elf/
Message-ID:  <200208121410.g7CEA5eM035719@freefall.freebsd.org>

next in thread | raw e-mail | index | archive | help
The following reply was made to PR i386/41528; it has been noted by GNATS.

From: Bruce Evans <bde@zeta.org.au>
To: NIIMI Satoshi <sa2c@sa2c.net>
Cc: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: Re: i386/41528: better stack alignment patch for lib/csu/i386-elf/
Date: Tue, 13 Aug 2002 00:12:56 +1000 (EST)

 On Sun, 11 Aug 2002, NIIMI Satoshi wrote:
 
 > >Description:
 >
 > Although system C compiler, GCC, maintains stack pointer to keep
 > aligned to 2**preferred-stack-boundary byte, C startup routine does
 > not care about this.
 >
 > This causes a big performance penalty for floating point operations
 > with variables in stack frame because IA32 CPUs are optimized to
 > operate with aligned data.
 
 I think stack alignment belongs only in functions that actually store
 floating point or other data that actually needs large alignment, and
 it should be done by the compiler since only the compiler knows the
 alignment requirements.  gcc on i386's requires the stack to be aligned
 4 bytes below a 2^<preferred-stack-boundary> boundary on entry to each
 function, but this is compiler-dependent and pessimal IMO.
 
 > >Fix:
 >
 > (gcc 3.1 masks %esp by himself, so this patch might not be required
 > for -current.)
 
 Right; it isen't needed for -current.  gcc-3.1 treats main() specially
 and aligns the stack using a single andl instruction:
 
 ---
 $ cat z.c
 main()
 {
 	foo();
 }
 # -mpreferred-stack-boundary=4 is to ensure the standard default.
 $ cc -O -S z.c -mpreferred-stack-boundary=4
 $ cat z.s
 
 	.file	"z.c"
 	.text
 	.p2align 2,,3
 .globl main
 	.type	main,@function
 main:
 	pushl	%ebp
 	movl	%esp, %ebp
 	subl	$8, %esp		<-- waste time
 	andl	$-16, %esp		<-- align
 	call	foo
 	leave
 	ret
 .Lfe1:
 	.size	main,.Lfe1-main
 	.ident	"GCC: (GNU) 3.1 [FreeBSD] 20020509 (prerelease)"
 ---
 
 gcc only seems to do the andl for main().  It still does the old stack
 alignment for main() and apparently depends on all other functions doing
 it.
 
 I still use my old kernel hack for alignment:
 
 %%%
 Index: kern_exec.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/kern/kern_exec.c,v
 retrieving revision 1.179
 diff -u -2 -r1.179 kern_exec.c
 --- kern_exec.c	1 Aug 2002 14:31:58 -0000	1.179
 +++ kern_exec.c	2 Aug 2002 13:53:14 -0000
 @@ -846,4 +843,13 @@
 
  	/*
 +	 * Align stack to a multiple of 0x20.
 +	 * XXX vectp has the wrong type; we usually want a vm_offset_t;
 +	 * the suword() family takes a void *, but should take a vm_offset_t.
 +	 * XXX should align stack for signals too.
 +	 * XXX should do this more machine/compiler-independently.
 +	 */
 +	vectp = (char **)(((vm_offset_t)vectp & ~(vm_offset_t)0x1F) - 4);
 +
 +	/*
  	 * vectp also becomes our initial stack base
  	 */
 %%%
 
 This is a wrong place to do alignment (see above).  Alignment to a
 nice power of 2 here would be reasonable, but subtracting 4 isn't.  4
 is magic, and the correct number depends on the language and the
 compiler that the application being exec'ed was compiled with, so it
 shoulnd't be given by a kernel arch-dependent #define.
 
 Fortunately, the problem is now fixed for the most important case of
 language == c && compiler == cc-current.  However, there is still a
 problem for signal handlers.  The i386 signal trampoline is:
 
 %%%
 NON_GPROF_ENTRY(sigcode)
 	call	*SIGF_HANDLER(%esp)	/* call signal handler */
 %%%
 
 This pushes one 32-bit word (for the return address) on top of the
 carelessly aligned argument words.  To work properly for cc-current,
 the stack still needs to be 16-byte aligned before the call.  Fortunately,
 this bug is mostly mott because floating point in signal handlers is
 not useful and doesn't work.
 
 > --- diff begins here ---
 > Index: stable/lib/csu/i386-elf/crt1.c
 > ===================================================================
 > RCS file: /home/ncvs/src/lib/csu/i386-elf/crt1.c,v
 > retrieving revision 1.4.2.1
 > diff -u -r1.4.2.1 crt1.c
 > --- stable/lib/csu/i386-elf/crt1.c	30 Oct 2000 20:32:24 -0000	1.4.2.1
 > +++ stable/lib/csu/i386-elf/crt1.c	10 Aug 2002 19:40:54 -0000
 > @@ -93,7 +93,33 @@
 >      monstartup(&eprol, &etext);
 >  #endif
 >      _init();
 > +#if 0
 >      exit( main(argc, argv, env) );
 > +#else
 > +    /*
 > +     * GCC expects stack frame to be aligned like following figure.
 > +     *
 > +     *  +--------------+
 > +     *  |%ebp (if any) |
 > +     *  +--------------+
 > +     *  |return address|
 > +     *  +--------------+ --- aligned by PREFERRED_STACK_BOUNDARY
 > +     *  |  arguments   |
 > +     *  |      :       |
 > +     *  |      :       |
 > +     */
 > +    __asm__ ("
 > +	subl	$12, %%esp		# create stack frame for arguments
 > +	andl	$~0xf, %%esp		# align stack to 16-byte boundary
 > +	movl	%0, 0(%%esp)
 > +	movl	%1, 4(%%esp)
 > +	movl	%2, 8(%%esp)
 > +	call	main
 > +	movl	%%eax, 0(%%esp)
 > +	call	exit
 > +	hlt				# do not return
 > +    " : : "r"(argc), "r"(argv), "r"(env));
 > +#endif
 >  }
 > ...
 
 I would only use this fix or one like it in RELENG_4.  Maybe my kernel
 hack is better since it "fixes" most applications without a recompile.
 It is simpler because it doesn't use any assembly code or have to recover
 from the kernel pushing the args in a misaligned place.
 
 Minor improvements:
 - remove the comment about %ebp since the caller doesn't handle %ebp
 - remove the hlt since it wouldn't really help if it were reached (it is
   privileged so it just generates SIGBUS, and execution continues at the
   instruction after the hlt if there is a SIGBUS handler and it returns...).
 
 Bruce
 

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200208121410.g7CEA5eM035719>