Date:      Thu, 01 Nov 2007 02:44:25 +0100
From:      Christoph Mallon <christoph.mallon@gmx.de>
To:        Andrey Chernov <ache@nagual.pp.ru>,  Juli Mallett <juli@clockworksquid.com>, src-committers@FreeBSD.ORG, cvs-src@FreeBSD.ORG, cvs-all@FreeBSD.ORG
Subject:   Re: cvs commit: src/include _ctype.h
Message-ID:  <47292F79.9030102@gmx.de>
In-Reply-To: <20071031215526.GC89932@nagual.pp.ru>
References:  <200710272232.l9RMWSbK072082@repoman.freebsd.org>	<20071030200331.GA29309@toxic.magnesium.net> <20071031215526.GC89932@nagual.pp.ru>

Andrey Chernov wrote:
> On Tue, Oct 30, 2007 at 10:03:31AM -1000, Juli Mallett wrote:
>> * "Andrey A. Chernov" <ache@FreeBSD.org> [ 2007-10-27 ]
>> 	[ cvs commit: src/include _ctype.h ]
>>> ache        2007-10-27 22:32:28 UTC
>>>
>>>   FreeBSD src repository
>>>
>>>   Modified files:
>>>     include              _ctype.h 
>>>   Log:
>>>   Micro-optimization of prev. commit, change
>>>   (_c < 0 || _c >= 128) to (_c & ~0x7F)
>> Isn't that a non-optimization in code and a minor pessimization of readability?
>> Maybe I'm getting rusty, but those seem to result in nearly identical code on
>> i386 with a relatively modern GCC.  Did you look at the compiler output for this
>> optimization?  Is there a specific expensive instruction you're trying to avoid?
>> For such thoroughly bit-aligned range checks, you shouldn't even get a branch
>> for the former case.  Is there a platform other than i386 I should look at where
>> the previous expression is more clearly pessimized?  Or a different compiler
>> than GCC?
> 
> For those who doubt, here are two tests compiled with -O2. As you may see, the 
> result is almost identical (andl vs cmpl):
> -------------------- a.c --------------------
> main () {
> 
> 	int c;
> 
> 	return (c & ~0x7f) ? 0 : c * 2;
> }
> -------------------- a.s --------------------
> 	.file	"a.c"
> 	.text
> 	.p2align 4,,15
> .globl main
> 	.type	main, @function
> main:
> 	leal	4(%esp), %ecx
> 	andl	$-16, %esp
> 	pushl	-4(%ecx)
> 	movl	%eax, %edx
> 	andl	$-128, %edx
> 	addl	%eax, %eax
> 	cmpl	$1, %edx
> 	sbbl	%edx, %edx
> 	pushl	%ebp
> 	andl	%edx, %eax
> 	movl	%esp, %ebp
> 	pushl	%ecx
> 	popl	%ecx
> 	popl	%ebp
> 	leal	-4(%ecx), %esp
> 	ret
> 	.size	main, .-main
> 	.ident	"GCC: (GNU) 4.2.1 20070719  [FreeBSD]"
> -------------------- a1.c --------------------
> main () {
> 
> 	int c;
> 
> 	return (c < 0 || c >= 128) ? 0 : c * 2;
> 
> 
> }
> -------------------- a1.s --------------------
> 	.file	"a1.c"
> 	.text
> 	.p2align 4,,15
> .globl main
> 	.type	main, @function
> main:
> 	leal	4(%esp), %ecx
> 	andl	$-16, %esp
> 	pushl	-4(%ecx)
> 	addl	%eax, %eax
> 	cmpl	$128, %eax
> 	sbbl	%edx, %edx
> 	andl	%edx, %eax
> 	pushl	%ebp
> 	movl	%esp, %ebp
> 	pushl	%ecx
> 	popl	%ecx
> 	popl	%ebp
> 	leal	-4(%ecx), %esp
> 	ret
> 	.size	main, .-main
> 	.ident	"GCC: (GNU) 4.2.1 20070719  [FreeBSD]"

Your example is invalid. The value of c is never initialized in this 
function, so what you see as the result is random garbage (for example, 
in a1.s the c * 2 (addl %eax, %eax) comes first and the cmpl after it 
tests the already doubled %eax). In fact, it would be perfectly legal for 
the compiler to always return 0, call abort(), or let demons fly out of 
your nose.
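
A minimal sketch of how the quoted test could be made well defined, 
assuming the value is passed in as a parameter instead of being read from 
an uninitialized local. The function names are made up for illustration 
and appear in neither a.c nor a1.c:

```c
/* Hypothetical well-defined variants of the quoted a.c and a1.c.
 * c is now a parameter, so the compiler must treat it as a real input. */

/* Mask form: nonzero if any bit above the low seven is set. */
int twice_if_ascii_mask(int c)
{
	return (c & ~0x7f) ? 0 : c * 2;
}

/* Range form: the explicit check the commit replaced. */
int twice_if_ascii_range(int c)
{
	return (c < 0 || c >= 128) ? 0 : c * 2;
}
```

With a parameter the compiler can no longer reorder the doubling ahead of 
the test without changing the result.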

Also, the example is still unrealistic: you usually don't multiply chars 
by two. Let's try something more realistic, an ASCII filter:

int filter_ascii0(int c)
{
         return c < 0 || c >= 128 ? '?' : c;
}

int filter_ascii1(int c)
{
         return c & ~0x7F ? '?' : c;
}

Note especially that c is not dead after the condition here. Even if your 
example had not used an undefined value, the value of c would be dead 
after the test, which is unlikely in typical string handling code.
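
As a sanity check, separate from the code generation question: the two 
conditions should select exactly the same values of c wherever int is 
two's complement. A small sketch that counts disagreements over a range 
(count_mismatches is a hypothetical helper, and the two filters are 
repeated so the snippet stands alone):

```c
int filter_ascii0(int c)
{
	return c < 0 || c >= 128 ? '?' : c;
}

int filter_ascii1(int c)
{
	return c & ~0x7F ? '?' : c;
}

/* Hypothetical helper: number of values in [lo, hi] for which the two
 * filters return different results. The loop counter is long long so
 * that hi == INT_MAX does not overflow the increment. */
long long count_mismatches(int lo, int hi)
{
	long long mismatches = 0;

	for (long long v = lo; v <= hi; v++) {
		int c = (int)v;

		if (filter_ascii0(c) != filter_ascii1(c))
			mismatches++;
	}
	return mismatches;
}
```

A call such as count_mismatches(-100000, 100000) is expected to return 0 
if the two forms really are interchangeable on the target.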

And now the compiled code (GCC 3.4.6 with -O2 -march=athlon-xp 
-fomit-frame-pointer; I used these switches to get more compact code, 
and they have no influence on the condition test):

00000000 <filter_ascii0>:
    0:   8b 54 24 04             mov    0x4(%esp),%edx
    4:   b8 3f 00 00 00          mov    $0x3f,%eax
    9:   83 fa 7f                cmp    $0x7f,%edx
    c:   0f 46 c2                cmovbe %edx,%eax
    f:   c3                      ret

00000010 <filter_ascii1>:
   10:   8b 54 24 04             mov    0x4(%esp),%edx
   14:   b8 3f 00 00 00          mov    $0x3f,%eax
   19:   f7 c2 80 ff ff ff       test   $0xffffff80,%edx
   1f:   0f 44 c2                cmove  %edx,%eax
   22:   c3                      ret

As you can see, filter_ascii1 needs a test instruction, because the value 
in %edx does not die at the test but is used again in the cmove. In this 
listing that test takes six bytes, where the cmp in filter_ascii0 takes 
three.
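
The cmp/cmovbe pair in filter_ascii0 is an unsigned comparison: a negative 
c, reinterpreted as unsigned, is larger than 0x7f, so a single compare 
covers both bounds. Written out by hand in C it would look roughly like 
this (filter_ascii2 is a name invented here, and this is only a reading of 
the listing above, not a statement about what GCC does in general):

```c
/* Hypothetical hand-written equivalent of the cmp/cmovbe sequence.
 * Converting a negative int to unsigned yields a value above 0x7F,
 * so one unsigned comparison rejects both c < 0 and c >= 128. */
int filter_ascii2(int c)
{
	return (unsigned)c > 0x7Fu ? '?' : c;
}
```

So the readable range check already compiles to a single compare, without 
any manual bit masking in the source.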

	Christoph


