Date:      Wed, 23 Jul 2003 00:56:34 +0200
From:      "Poul-Henning Kamp" <phk@phk.freebsd.dk>
To:        "Alan L. Cox" <alc@imimic.com>
Cc:        Marcel Moolenaar <marcel@xcllnt.net>
Subject:   Re: cvs commit: src/sys/kern init_main.c kern_malloc.c md5c.c subr_autoconf.c subr_mbuf.c subr_prf.c tty_subr.c vfs_cluster.c vfs_subr.c 
Message-ID:  <16119.1058914594@critter.freebsd.dk>
In-Reply-To: Your message of "Tue, 22 Jul 2003 17:39:01 CDT." <3F1DBD05.A4886D5E@imimic.com> 

In message <3F1DBD05.A4886D5E@imimic.com>, "Alan L. Cox" writes:
>Poul-Henning Kamp wrote:

>> Compiling the file with and without the inline, and forcing GCC
>> to respect the inline finds:
>> 
>>            text    data     bss     dec     hex filename
>> inlined:  17408      76     420   17904    45f0 vm_object.o
>> regular:  14944      76     420   15440    3c50 vm_object.o
>>           -----
>>            2464
>> 
>Again, in the case of vm_object_backing_scan(), code size is a
>bad predictor of inlining's effect.  Inlining is being used to achieve a
>form of code specialization that will actually reduce the size of the
>code that is *executed*.

	00001ef0 t vm_object_backing_scan
	000023e0 t vm_object_qcollapse

The non-inlined function is 1264 bytes, it is inlined 3 times, so
best case you have saved up to 1328 bytes from being (mostly)
branched over.

On the other hand, you have 2464 bytes to cache rather than 1264
bytes to cache.
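
The arithmetic behind these figures can be checked mechanically. What
follows is a minimal sketch in C, not part of the original exchange: the
helper names symbol_size() and bytes_specialized_away() are hypothetical,
and reading the 1328 figure as 3 * 1264 - 2464 is an inference from the
numbers quoted above rather than something stated outright.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Size of a function, estimated as the distance from its own address
 * to the address of the next symbol in the symbol listing.
 */
static size_t
symbol_size(size_t start, size_t next)
{
	assert(next >= start);
	return (next - start);
}

/*
 * If `copies` full copies of a function of `fn_size` bytes had been
 * emitted, but the object only grew by `growth` bytes, the difference
 * is the amount of code the compiler managed to drop from the copies.
 */
static size_t
bytes_specialized_away(size_t fn_size, size_t copies, size_t growth)
{
	assert(fn_size * copies >= growth);
	return (fn_size * copies - growth);
}
```

With the addresses above, symbol_size(0x1ef0, 0x23e0) gives 1264, the
text segments differ by 17408 - 14944 = 2464, and
bytes_specialized_away(1264, 3, 2464) gives 1328.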

If you had said "I ran a benchmark and it is in fact faster" that
would make me concede right away.

>In conclusion, my point is not that you should stop what you're doing. 
>It is rather that there are exceptional cases where gcc is doing the
>wrong thing and we should have an override to force inlining that can be
>applied.

I agree that there are exceptional cases, I agree that we need some
sort of __inline_damnit, but I still request that we only use it when
we know for a fact that there is an actual benefit.
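
For illustration only, such an override could be sketched as below. This
assumes GCC's always_inline function attribute; the macro name
FORCE_INLINE, the OP_* flags and the scan_bits() function are all
hypothetical and merely stand in for the kind of code specialization
under discussion, where each caller passes a constant `op`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical override macro (the name is illustrative only).
 * With GCC, always_inline asks for inlining regardless of the
 * compiler's own size heuristics.
 */
#if defined(__GNUC__)
#define	FORCE_INLINE	static __inline __attribute__((__always_inline__))
#else
#define	FORCE_INLINE	static inline
#endif

#define	OP_COUNT_SET	0x1	/* count bits that are 1 */
#define	OP_COUNT_CLEAR	0x2	/* count bits that are 0 */

/* One general routine; `op` selects which branches do any work. */
FORCE_INLINE uint32_t
scan_bits(uint32_t word, int op)
{
	uint32_t n = 0;
	int i;

	assert((op & ~(OP_COUNT_SET | OP_COUNT_CLEAR)) == 0);
	for (i = 0; i < 32; i++) {
		int set = (word >> i) & 1;

		if ((op & OP_COUNT_SET) && set)
			n++;
		if ((op & OP_COUNT_CLEAR) && !set)
			n++;
	}
	return (n);
}

/*
 * Each wrapper passes a constant `op`, so once scan_bits() is inlined
 * the compiler can drop the branch that does not apply.
 */
uint32_t
count_set(uint32_t word)
{
	return (scan_bits(word, OP_COUNT_SET));
}

uint32_t
count_clear(uint32_t word)
{
	return (scan_bits(word, OP_COUNT_CLEAR));
}
```

Whether the specialized copies are actually smaller or faster than one
shared function is exactly what would still have to be measured.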

And the only two criteria I think are trivial to use for proving an
actual benefit are:
	1. less code is generated.
	2. it runs faster in tests.
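
The first criterion is what the size(1) output quoted above already
measures. For the second, a rough timing harness might look like the
sketch below. It assumes a POSIX system with clock_gettime(); the names
time_ns(), bench_fn and workload() are hypothetical, and workload() is
only a placeholder for the code actually under test.

```c
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef uint32_t (*bench_fn)(uint32_t);

/* Hypothetical stand-in for the routine being compared. */
static uint32_t
workload(uint32_t x)
{
	return (x * 2654435761u + 1);
}

/*
 * Call fn `iters` times, feeding each result into the next call so the
 * loop cannot be optimized away, and return the elapsed time in
 * nanoseconds.  The final value is stored through `sink`.
 */
static int64_t
time_ns(bench_fn fn, uint32_t iters, uint32_t *sink)
{
	struct timespec t0, t1;
	uint32_t acc = 1;
	uint32_t i;

	assert(fn != NULL && sink != NULL);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++)
		acc = fn(acc);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	*sink = acc;
	return ((int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000 +
	    (int64_t)(t1.tv_nsec - t0.tv_nsec));
}
```

Running time_ns() over both builds of the same routine, several times
each, gives numbers that can be compared directly instead of argued
about.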

I am very reluctant to accept "a speculated benefit" when there is
as strong a counter-indication as 2k of extra code segment.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.


