From: Bruce Evans
Date: Mon, 26 Jul 2010 17:05:56 +1000 (EST)
To: Kostik Belousov
Cc: src-committers@FreeBSD.org, svn-src-all@FreeBSD.org, Stefan Farfeleder,
    Nathan Whitehorn, Bruce Evans, svn-src-head@FreeBSD.org
Subject: Re: svn commit: r210451 - head/sys/sys
Message-ID: <20100726153325.M12476@delplex.bde.org>
In-Reply-To: <20100725195926.GE22295@deviant.kiev.zoral.com.ua>

On Sun, 25 Jul 2010, Kostik Belousov wrote:

> On Sun, Jul 25, 2010 at 09:55:10PM +0200, Nathan Whitehorn wrote:
>> On 07/25/10 20:12, Kostik Belousov wrote:
>>> On
Mon, Jul 26, 2010 at 01:36:07AM +1000, Bruce Evans wrote:
>>>
>>>> On Sat, 24 Jul 2010, Stefan Farfeleder wrote:
>>>>
>>>>
>>>>> declaring enums like this is not standard C code (seems to be a GCC
>>>>> extension). I don't think we should use this feature in our headers.

Please fix your mail program to not add extra empty lines when quoting.
(I didn't write the 2 empty lines in the above, but only 1.)

>>>> This is unfortunate. This is because the size of an enum variable
>>>> depends on its complete declaration. This is an error unconditionally
>>>> with TenDRA. It takes -pedantic to get a warning from gcc.
>>>>
>>> I looked at the C99, and indeed, there is an explicit sentence
>>> "A type specifier of the form enum identifier without an enumerator list
>>> shall only appear after the type it specifies is complete."

In C90, even this is not allowed (with an empty enumerator list) according
to TenDRA (2004 version). This seems to be a bug in TenDRA:
- gcc -std=c89 -pedantic allows it
- the grammar for enums and enum type specifiers is the same in C99 as in
  C90, except C99 allows a trailing comma in enumerator lists.
TenDRA gives a reference to C90 6.5.2.3 to justify this, but that (at least
in the old draft n869.txt) only has a footnote saying that a bare enum tag
is unnecessary due to the requirement for a complete type that we're
talking about (footnote 62).

The key part of the grammar that allows the bare enum tag is C90 6.5 (C99
6.7):

	declaration:
		declaration-specifiers init-declarator-list-opt ;

	Constraints:
	A declaration shall declare at least a declarator, a tag, or the
	members of an enumeration.

The constraint is satisfied by a bare enum tag being a tag (C90 6.5.2.3).

>>> I fully agree with Bruce that this is unfortunate, or rather, makes
>>> enum tag declaration completely unuseful. gcc extension greatly simplifies
>>> dealing with the headers pollution.

But there is nothing similar for typedefs (except to not use them for
structs).
The problem with typedefs is partly handled by putting too many of them in
and/or and either polluting everything with these or requiring everything
to include them.

I normally avoid using enums since they provide few advantages except for
debuggers.

>>> On the other hand, I disagree with the statement that the size of the
>>> enum variable depends on the full declaration. It seems that C99
>>> defines values of the enum to be of type int, and both i386 and
>>> amd64 ABIs define enums as represented by 4-byte integers.
>>> Yes, I am aware that C++ allows the enum to be assigned
>>> the shortest arithmetic type that can represent all enum values.
>>>
>>

Now the broken quoting gives different indentations. Normally I don't want
to know the quoting level for blank lines, but I hope my mail program
preserves the brokenness by adding precisely 1 level to the above :-).

>> This is not actually true. Try adding a value that requires a 64-bit int
>> to your enum -- it will become 8 bytes. Also, the signedness of the type
>> depends on the values in the enumeration.

Such values are not allowed in C unless ints are 64 bits. (C99 6.4.4.3
[#2].)

Hmm, can an enum type be unsigned? I can't find any constraint except C99
6.7.2.2 [#4] which says that the type shall be compatible with an integer
type capable of representing all the enum values. Only the C
"value-preserving" promotion bug gives a chance of avoiding lots of sign
extension bugs if an enum type is unsigned. E.g.:

(1)
	enum foo { UCM = UCHAR_MAX, };

The enum type for this can be u_char provided UCHAR_MAX <= INT_MAX, which
is true except on exotic machines (e.g., ones with 32-bit u_chars and
32-bit ints), and machines where this is allowed are the same ones where
the default promotion of u_char is int, so comparison of an enum foo with
an int will not give sign extension bugs.

(2)
	enum silly { IM = INT_MAX, };

Now the enum type should be int.
u_int is also capable of representing all the enum values, but using it
would mainly give sign extension bugs and should not be allowed.

(3)
	enum sillier { ONE = 1, };

Like (2) except u_int is even less needed to represent the enum value 1.

(4)
	enum envalid { UIM = UINT_MAX, };

This could be represented by a u_int, but is not permitted by C99 6.4.4.3
[#2] (see also C99 6.7.2.2 [#2] -- this says that although the type of an
expression for an enum constant can be any integer type, the value of the
expression must be representable as an int (so that it can actually be
represented by the enum constant)).

(5)
	enum valid { UIM_O2_M1 = UINT_MAX / 2 - 1, };

Example for previous paragraph, assuming normal 2's complement ints and
u_ints.

> Well, the amd64 ABI has a note
> "C++ and some implementations of C permit enums larger than an int. The
> underlying type is bumped to an unsigned int, long int or unsigned long
> int, in that order."

The C standard allows enums to be gratuitously large or small, but only a
recalcitrant implementation would use anything except a plain signed int
or a signed or unsigned integer type smaller than an int. The smaller
types are allowed for optimizations, but on amd64 and i386 it is a
pessimization (except for space) to use the sub-integer possibilities, so
the sub-integer types are gratuitously small and it is good for the ABI to
not allow them. The C standard doesn't permit enum values that need the
underlying type to be bumped to be representable.

Unfortunately, portable code can't depend on the ABI, and handling the
problem with unsigned types that I just noticed is onerous. E.g., what is
the result of this code?:

	enum one { ONE = 1, } one;
	assert(one > -1);

If the enum type is gratuitously unsigned and no smaller than int, u_int
say, then the result of this is assertion failure, and portable code must
write the assertion as:

	assert((int)one > -1);

to ensure getting the normal result.
OTOH, `ONE' has type int, so there is no problem with

	assert(ONE > -1);

> The gcc extension is consistent in this regard, since it only allows
> to use pointers to enum without complete definition.

Allowing pointers to incomplete enum types might work even if the size of
the enum type depends on the enum. The corresponding thing for structs
depends on all struct pointers having the same representation. This might
be implementable for enum pointers too, even if all pointers to integer
types don't have the same representation. At worst, the pointers could be
given the representation of 'void *'.

Bruce