Date: Sun, 21 Mar 2004 10:22:44 -1000
From: Clifton Royston
To: Garance A Drosihn
cc: hackers@FreeBSD.org
Subject: Re: Adventures with gcc: code vs object-code size
Message-ID: <20040321202243.GA3199@lava.net>
In-Reply-To: <20040321200044.C571716A4D0@hub.freebsd.org>
List-Id: Technical Discussions relating to FreeBSD

> Date: Sat, 20 Mar 2004 17:45:04 -0500
> From: Garance A Drosihn
> Subject: Adventures with gcc: code vs object-code size
> To: hackers@FreeBSD.org
>
> I have written a fairly major set of changes to the `ps' command,
> which is available as:
>     http://people.freebsd.org/~gad/ps-susv3.diff
>
> Debate/discussion about the actual changes should be going on in
> the freebsd-standards mailing list.  So for purposes of this
> mailing list, please ignore most of that.
>
> But while doing it, I was somewhat obsessed about making sure that
> my changes wouldn't cause a dramatic increase in the size of the
> executable for `ps'.  Due to that, I tripped over one example of
> "code" vs "object-code produced" which makes no sense to me.  So,
> consider just this section of the update (with some reformatting
> so it is easy to see the code):
>
>     char elemcopy[PATH_MAX];
>     ...do stuff...
> #if !defined(ADD_PS_LISTRESET)
>     inf->addelem(inf, elemcopy);
> #else
>     /*
>      * We now have a single element.  Add it to the
>      * list, unless the element is ":".  In that case,
>      * reset the list so previous entries are ignored.
>      */
>     if (strcmp(elemcopy, ":") == 0)
>         inf->count = 0;
>     else
>         inf->addelem(inf, elemcopy);
> #endif
>
> Now, here is what I noticed:
>
>  * XXX - Adding this check increases the total size of `ps' by
>  *       3940 bytes on i386!  That's 12% of the entire program!
>  *       { using gcc (GCC) 3.3.3 [FreeBSD] 20031106 }
>  *
>  *       When compiling for sparc, adding this option causes NO
>  *       change in the size of the `ps' executable.  And on alpha,
>  *       adding this option adds only 8 bytes to the executable.
>
> So, by adding one call to strcmp() to check for a ":" string, I end
> up with /bin/ps (the stripped-object-file) which has grown by 12.6% !!
> This is for a program which is almost 2500 lines spread out over
> 5 '.c'-files.  How is that possible?  What am I missing here?

  In my coding experience (especially back in embedded-land when I
cared a *lot* about code size) when this happens, the reason usually
has nothing to do with the compiler per se, but with the packaging of
library functions into modules.

  If it happens that strcmp was not previously being referenced at
all, absent this one line, then of course this change will pull in
the strcmp routine.  Now while strcmp itself is not likely to be 3940
bytes, if it is packed together in the library with a number of other
routines (e.g. a collection of hand-optimized assembler string
routines, or of other routines which it's assumed are likely to be
used in the same program) then the one module being pulled in can
bloat the executable surprisingly.

  This would be easy to test - extract the program's symbol table
before stripping, with and without this one line, diff the two, and
see whether other new symbols are showing up along with strcmp.

  This may also be true even if gcc is partially inlining it - gcc may
be pulling in a big clump of its own internal support routines in that
case, on the assumption that when inlining you care only about speed
and not about code size.  Again, differencing the symbol table should
give you some idea.

  -- Clifton

-- 
     Clifton Royston  --  cliftonr@tikitechnologies.com
       Tiki Technologies Lead Programmer/Software Architect
Did you ever fly a kite in bed?  Did you ever walk with ten cats on your head?
Did you ever milk this kind of cow?  Well we can do it.  We know how.
If you never did, you should.  These things are fun, and fun is good.
                                                         -- Dr. Seuss
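
  A complementary check from the C side, offered only as a sketch: make
the same ":" test without referencing strcmp at all, and see whether
the extra bytes go away.  The helper name is_list_reset below is
hypothetical (it is not in the ps diff), and this assumes the element
is an ordinary NUL-terminated string, as elemcopy is in the quoted
code.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of the ps diff: returns 1 when the
 * element is exactly the one-character string ":", 0 otherwise.
 * It compares the bytes directly, so it creates no reference to
 * strcmp() or to any other library routine.
 */
int
is_list_reset(const char *elem)
{
	return (elem != NULL && elem[0] == ':' && elem[1] == '\0');
}
```

  In the quoted snippet the test would then read
if (is_list_reset(elemcopy)) in place of the strcmp() call.  If the
i386 executable still grows by a similar amount with this version,
the library-packaging theory is probably wrong and the symbol table
diff becomes the more interesting experiment.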