From owner-freebsd-hackers@FreeBSD.ORG Tue Apr 28 12:32:02 2009 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 97FB410656E5; Tue, 28 Apr 2009 12:32:02 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from vlakno.cz (77-93-215-190.static.masterinter.net [77.93.215.190]) by mx1.freebsd.org (Postfix) with ESMTP id E449D8FC16; Tue, 28 Apr 2009 12:32:01 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from localhost (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 7ED629CB0A1; Tue, 28 Apr 2009 14:13:30 +0200 (CEST) X-Virus-Scanned: amavisd-new at vlakno.cz Received: from vlakno.cz ([127.0.0.1]) by localhost (lev.vlakno.cz [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id EOadVSl43OyX; Tue, 28 Apr 2009 14:13:28 +0200 (CEST) Received: from vlk.vlakno.cz (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id DDE539CB1B5; Tue, 28 Apr 2009 14:13:27 +0200 (CEST) Received: (from rdivacky@localhost) by vlk.vlakno.cz (8.14.3/8.14.3/Submit) id n3SCDROB043302; Tue, 28 Apr 2009 14:13:27 +0200 (CEST) (envelope-from rdivacky) Date: Tue, 28 Apr 2009 14:13:27 +0200 From: Roman Divacky To: Christoph Mallon Message-ID: <20090428121327.GA41168@freebsd.org> References: <49F4070C.2000108@gmx.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <49F4070C.2000108@gmx.de> User-Agent: Mutt/1.4.2.3i Cc: FreeBSD Hackers , Warner Losh , Ed Schouten , Maxim Sobolev Subject: Re: C99: Suggestions for style(9) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Apr 2009 12:32:03 -0000 I like the part about using as many variables as possible because of documentation and performance enhancements. I tend to like the other changes as well.. On Sun, Apr 26, 2009 at 09:02:36AM +0200, Christoph Mallon wrote: > Hi hackers@, > > as some of you may have noticed, several years ago a new millenium > started and a decade ago there was a new C standard. HEAD recently > switched to C99 as default (actually gnu99, but that's rather close). So > it's finally time to re-examine wether style(9) needs to be accomodated. > The diff with the proposed changes is attached. Below are the changes > with some further explanations. They are in the order as they appear in > style(9). > Maybe using all of these changes is a bit too big change at once, but > I'd like some of them applied to style(9). So, please consider the > proposed changes individually and not a as a all-or-nothing package. > > >-Do not put declarations > >-inside blocks unless the routine is unusually complicated. > >+Prefer declaring loop iterators in the for-statement to limit their scope. > > .Bd -literal > >- for (; cnt < 15; cnt++) { > >+ for (int cnt = 0; cnt < 15; cnt++) { > [...] > >-When declaring variables in functions declare them sorted by size, > >-then in alphabetical order; multiple ones per line are okay. > >+When declaring variables in functions declare them sorted in alphabetical > >order; > >+prefer one declaration per line, especially when pointer declarators or > >+initializations are involved. > > If a line overflows reuse the type keyword. > > .Pp > >-Be careful to not obfuscate the code by initializing variables in > >-the declarations. > >-Use this feature only thoughtfully. > >-DO NOT use function calls in initializers. > >+Prefer initializing variables right at their declaration. > >+When the value is not supposed to change in the function, make the > >variable > >+const. > >+This makes it easier for a reader to identify the where the value of a > >variable > >+originates. > >+Do not reuse the same variable in a different context, declare a new > >variable. > > .Bd -literal > >- struct foo one, *two; > >- double three; > >- int *four, five; > >- char *six, seven, eight, nine, ten, eleven, twelve; > >+ struct foo *bar; > >+ struct foo baz = { 42, "qux" }; > > > >- four = myfunction(); > >+ FILE *const f = fopen("name", "r"); > >+ if (f == NULL) > >+ err("Failed to open file"); > >+ /* We can safely assume that f is not changed anymore, even if the > >+ * function is a hundred lines long */ > > This change is to reduce the scope of local variables. For reasons why > this does not cost anything in terms of stack space, please see the last > change, which adds explanations about local variables. > Smaller scopes and distinct variables for distinct contexts make it > easier for readers of the code to identify the def-use-chains of > variables, because there are less places with assignments to a variable > and the scope is smaller. Also, when a variable is initialised right at > its declaration and is not supposed to change, you can emphasize this by > making it const. > I looked at older discussions regarding this topic and identified > several concerns, which were raised. I'll address them below: > > - It does not add anything to the language. > Well, yes, just like if, do, for, goto, the comma operator, compound > assignment (+=, ...), ++/-- and many other things. All you need to be > Turing complete is lambda calculus - there hardly can be less syntax. > But, like all mentioned constructs, it makes the code more concise. > > - It leads to sloppy code. You could reuse another variable or should > think again whether you really need this variable. > Reusing variables in different contexts is error prone and bad for > maintainability. You could reuse a variable, which is not as dead as you > thought. More local variables cost nothing, especially no stack space, > see the last proposed changed below. It is good to use more local > variables, because it gives the reader a name, which carries > information. Also it makes it easier for a reader (and the compiler!) to > identify, which expressions are the same. You could repeat > foo->long_name->even_longer_name[2 * i + 1] five times or just declare a > local variable and cache the value to make it easier to read. > It also enables the compiler to produce better warnings: If you declare > a new variable, it is not initialised and if you do not initialise it on > all paths before a use, the compiler will warn you. But if you reuse an > older variable, it is already initialised by its former uses and so you > get no warning, but operate on garbage values (the old value from the > previous use). > So it does not lead to slopyy code, but better code, which is easier to > comprehend, the compiler can better help you to prevent bugs, and as a > side effect the compiler can produce better code (aliasing is a major > problem for common subexpression elimination). > > - You can determine the stack usage with all the variable declarations > at the top. > This is not true. There is no relation between the declared variables > and the stack used. Especially, more stack than suggested by the > declarations can be used due to various optimisations - less space can > be used, too, of course. > > - It is harder to find the declaration of a variable. > Every editor, which is more advanced than ed, has facilities to jump to > the declaration of a variable (e.g. in vim it's "gd"). And most often > this is not necessary, because the declaration is just a few lines above > instead of all the way up at the top of the function, so no jumping > around is required at all. > > - You could accidently shadow a variable. > All modern compilers have warnings for this. FreeBSD uses -Wshadow for > GCC (and all relevant compilers understand this very option). > > > >-Values in > >-.Ic return > >-statements should be enclosed in parentheses. > >-.Pp > > return with parentheses: > Removed, because it does not improve maintainability in any way. There > is no source for confusion here, so the rule even contradicts the rule, > which states not to use redundant parentheses. Maybe, decades ago it was > just a workaround for a broken compiler, which does not exist anymore. > > > >-Old-style function declarations look like this: > >-.Bd -literal > >-static char * > >-function(a1, a2, fl, a4) > >- int a1, a2; /* Declare ints, too, don't default them. */ > >- float fl; /* Beware double vs. float prototype differences. */ > >- int a4; /* List in order declared. */ > >-{ > >-.Ed > >-.Pp > >-Use ANSI function declarations unless you explicitly need K&R > >compatibility. > >+Use ANSI function declarations. > > Old-style function declarations: > Removed, because there is no need to use them anymore. The C99 standard > considers them obsolescent[1], too. All headers use prototype > declarations, there's no reason to handle the function definitions > different. Also they are a source of bugs - or did you know that the > following is wrong: > void wrong(char /* sic! */); > void wrong(x) char x; {} > And this is correct[2]: > void right(int /* sic! */); > void right(x) char x; {} > (It's a bug in GCC that it does not complain about the former.) > > > >- > >-static void > >-usage() > >-{ > >- /* Insert an empty line if the function has no local variables. */ > > Empty line when there are no local variables: > Removed, because it does not improve readability and it actually hinders > readability for very short functions, e.g. static inline functions, > which just consist of one or two lines of code without any local > variables except for parameters. Further, it makes even less sense with > the changed recommendations for declarations. > > > >+.Sh LOCAL VARIABLES > >+Unaliased local variables are for free on modern compilers. > >+They do not unconditionally use stack space. > >+In fact there is no relation between their number and the used stack > >space. > >+A local variable is guaranteed to be unaliased, if its address has not > >been > >+taken (e.g. &i). > >+Do not use the same local variable in different context just because it > >happens > >+that the same type is needed. > >+It does not improve the generated code in any way, but it makes it harder > >for a > >+reader to determine all definitions and uses which belong together. > >+Further, a new variable can be given a meaningful name (even if it is very > >+short), which automatically serves as documentation and so improves > >+comprehensibility. > >+Especially avoid re-using variables, whose address has been taken: > >+.Bd -literal > >+ int i; > >+ foo(&i); > >+ printf("%d\\n", i); > >+ for (i = 0; i != 10; ++i) { > >+ /* BAD: i will be needlessly moved to and from memory in > >+ * the loop, because its address was taken before. This > >+ * results in larger and slower code. > >+ * Declare a new variable for the loop instead. */ > >+ printf("%d\\n", i); > >+ } > >+.Ed > >+ > > Last, but definitely not least, I added this paragraph about the use of > local variables. This is to clarify, how today's compilers handle > unaliased local variables. There is absolutely no need to reuse them in > different contexts. Trying to optimise stack usage in this way is > misguided and is a source of bugs, when a reused variable is not as dead > as thought. For more reasons, please read the quoted diff. > Maxim, you requested this paragraph: Does this addition suit you? > > > Hopefully at least some of these suggestions are considered improvements > and are applied to style(9). > > Regards > Christoph > > > [1] ISO/IEC 9899:1999 (E) ?6.11.6 and ?6.11.7 > [2] ISO/IEC 9899:1999 (E) ?6.7.5.3:15