Date:      Mon, 26 Jul 2010 17:05:56 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Kostik Belousov <kostikbel@gmail.com>
Cc:        src-committers@FreeBSD.org, svn-src-all@FreeBSD.org, Stefan Farfeleder <stefanf@FreeBSD.org>, Nathan Whitehorn <nwhitehorn@FreeBSD.org>, Bruce Evans <brde@optusnet.com.au>, svn-src-head@FreeBSD.org
Subject:   Re: svn commit: r210451 - head/sys/sys
Message-ID:  <20100726153325.M12476@delplex.bde.org>
In-Reply-To: <20100725195926.GE22295@deviant.kiev.zoral.com.ua>
References:  <201007241814.o6OIEY4K099556@svn.freebsd.org> <20100724183732.GA1715@mole.fafoe.narf.at> <20100726013202.G11808@delplex.bde.org> <20100725181255.GB22295@deviant.kiev.zoral.com.ua> <4C4C969E.1060602@freebsd.org> <20100725195926.GE22295@deviant.kiev.zoral.com.ua>

On Sun, 25 Jul 2010, Kostik Belousov wrote:

> On Sun, Jul 25, 2010 at 09:55:10PM +0200, Nathan Whitehorn wrote:
>> On 07/25/10 20:12, Kostik Belousov wrote:
>>> On Mon, Jul 26, 2010 at 01:36:07AM +1000, Bruce Evans wrote:
>>>
>>>> On Sat, 24 Jul 2010, Stefan Farfeleder wrote:
>>>>
>>>>
>>>>> declaring enums like this is not standard C code (seems to be a GCC
>>>>> extension). I don't think we should use this feature in our headers.

Please fix your mail program to not add extra empty lines when quoting.
(I didn't write the 2 empty lines in the above, but only 1.)

>>>> This is unfortunate.  This is because the size of an enum variable
>>>> depends on its complete declaration.  This is an error unconditionally
>>>> with TenDRA.  It takes -pedantic to get a warning from gcc.
>>>>
>>> I looked at the C99, and indeed, there is an explicit sentence
>>> "A type specifier of the form enum identifier without an enumerator list
>>> shall only appear after the type it specifies is complete."

In C90, even this is not allowed (with no enumerator list) according
to TenDRA (2004 version).  This seems to be a bug in TenDRA:
- gcc -std=c89 -pedantic allows it
- the grammar for enums and enum type specifiers is the same in C99 as in
   C90, except C99 allows a trailing comma in enumerator lists.  TenDRA
   gives a reference to C90 6.5.2.3 to justify this, but that (at least in
   the old draft n869.txt) only has a footnote saying that a bare enum tag
   is unnecessary due to the requirement for a complete type that we're
   talking about (footnote 62).  The key part of the grammar that allows
   the bare enum tag is C90 6.5 (C99 6.7):

     declaration:
       declaration-specifiers init-declarator-list-opt ;
     Constraints:
       A declaration shall declare at least a declarator, a tag, or the
       members of an enumeration.

   The constraint is satisfied by a bare enum tag being a tag (C90 6.5.2.3).
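
A minimal sketch of the conforming case, not taken from the commit: the
enum is completed first, and only then does a bare tag (or a use of the
tag as a type specifier) appear.  The names `color` and `favorite_color`
are made up for illustration.  The forward-declaration form is left in a
comment because it is the gcc extension under discussion.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */

/*
 * enum color;     <- placed here, before the enumerator list, this would
 *                    be the forward declaration that needs the gcc
 *                    extension (gcc -pedantic warns about it).
 */

/* Complete the type: the enumerator list is present. */
enum color { RED, GREEN, BLUE };

/* A bare tag after completion: declares nothing new except the tag. */
enum color;

/* The tag used as a type specifier once the type is complete. */
enum color
favorite_color(void)
{
	return (GREEN);
}
```

Moving the bare `enum color;` above the definition is the only change
needed to turn this into the disputed form.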

>>> I fully agree with Bruce that this is unfortunate, or rather, makes
>>> enum tag declaration completely unuseful. gcc extension greatly simplifies
>>> dealing with the headers pollution.

But there is nothing similar for typedefs (except to not use them for
structs).  The problem with typedefs is partly handled by putting too
many of them in <sys/types.h> and/or <sys/_types.h> and either polluting
everything with these or requiring everything to include them.
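
A small sketch of why struct tags avoid the problem, with a hypothetical
`struct buf` that is never completed here.  A prototype taking a pointer
to it needs no header at all, whereas a typedef name would need its one
declaration to be visible first.

```c
#include <stddef.h>

/* Hypothetical type name, for illustration only. */
struct buf;			/* incomplete type; no header required */

/*
 * Pointers to the incomplete type can be declared, passed and compared.
 * With a typedef such as `buf_t' instead, the typedef itself would have
 * to be in scope, and C99 does not allow repeating it in each header
 * that needs it.
 */
int
buf_is_null(const struct buf *b)
{
	return (b == NULL);
}
```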

I normally avoid using enums since they provide few advantages except
for debuggers.

>>> On the other hand, I disagree with the statement that the size of the
>>> enum variable depends on the full declaration. It seems that C99
>>> defines values of the enum to be of type int, and both i386 and
>>> amd64 ABIs define enums as represented by 4-byte integers.
>>> Yes, I am aware that C++ allows the enum to be assigned
>>> the shortest arithmetic type that can represent all enum values.
>>>
>>

Now the broken quoting gives different indentations.  Normally I don't
want to know the quoting level for blank lines, but I hope my mail
program preserves the brokenness by adding precisely 1 level to the
above :-).

>> This is not actually true. Try adding a value that requires a 64-bit int
>> to your enum -- it will become 8 bytes. Also, the signedness of the type
>> depends on the values in the enumeration.

Such values are not allowed in C unless ints are 64 bits.  (C99 6.4.4.3
[#2].)  Hmm, can an enum type be unsigned?  I can't find any constraint
except C99 6.7.2.2 [#4] which says that the type shall be compatible
with an integer type capable of representing all the enum values.  Only
the C "value-preserving" promotion bug gives a chance of avoiding lots
of sign extension bugs if an enum type is unsigned.  E.g.:

(1)    enum foo { UCM = UCHAR_MAX, };

The enum type for this can be u_char provided UCHAR_MAX <= INT_MAX, which
is true except on exotic machines (e.g., ones with 32-bit u_chars and
32-bit ints).  The machines where this is allowed are the same ones where
the default promotion of u_char is int, so comparison of an enum foo with
an int will not give sign extension bugs.

(2)    enum silly { IM = INT_MAX, };

Now the enum type should be int.  u_int is also capable of representing
all the enum values, but using it would mainly give sign extension bugs
and should not be allowed.

(3)    enum sillier { ONE = 1, };

Like (2) except u_int is even less needed to represent the enum value 1.

(4)    enum envalid { UIM = UINT_MAX, };

This could be represented by a u_int, but is not permitted by C99
6.4.4.3 [#2] (see also C99 6.7.2.2 [#2] -- this says that although the
type of an expression for an enum constant can be any integer type,
the value of the expression must be representable as an int, so that
it can actually be represented by the enum constant).

(5)    enum valid { UIM_O2_M1 = UINT_MAX / 2 - 1, };

Example for previous paragraph, assuming normal 2's complement ints and
u_ints.
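
A sketch that checks the one property the standard does pin down for the
valid examples: each enumeration constant has type int, whatever type
the implementation picks for the enum itself.  It assumes UCHAR_MAX <=
INT_MAX and 2's complement ints.  The function names are made up for
illustration.

```c
#include <limits.h>
#include <stddef.h>

enum foo { UCM = UCHAR_MAX };
enum silly { IM = INT_MAX };
enum valid { UIM_O2_M1 = UINT_MAX / 2 - 1 };

/* Enumeration constants have type int, so they all have int's size. */
int
constants_have_int_size(void)
{
	return (sizeof(UCM) == sizeof(int) && sizeof(IM) == sizeof(int) &&
	    sizeof(UIM_O2_M1) == sizeof(int));
}

/* Implementation-defined; may differ between compilers and ABIs. */
size_t
enum_foo_size(void)
{
	return (sizeof(enum foo));
}
```

Only the first function has a portable answer.  The second is there to
be inspected on a given compiler, not relied on.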

> Well, the amd64 ABI has a note
> "C++ and some implementations of C permit enums larger than an int. The
> underlying type is bumped to an unsigned int, long int or unsigned long
> int, in that order."

The C standard allows enums to be gratuitously large or small, but only
a recalcitrant implementation would use anything except a plain signed
int or a signed or unsigned integer type smaller than an int.  The smaller
types are allowed for optimizations, but on amd64 and i386 it is a
pessimization (except for space) to use the sub-integer possibilities,
so the sub-integer types are gratuitously small and it is good for the ABI
to not allow them.

The C standard doesn't permit enum values that need the underlying type to
be bumped to be representable.
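
A sketch for looking at what a given implementation actually does with
the sizes.  The enum names and functions are hypothetical.  Nothing here
is guaranteed beyond the requirement that the chosen type can represent
every enumerator.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical enums: one with tiny values, one spanning all of int. */
enum small { SMALL_ZERO, SMALL_ONE };
enum wide { WIDE_LOW = INT_MIN, WIDE_HIGH = INT_MAX };

/* Implementation-defined sizes; print or inspect them per platform. */
size_t
size_of_small(void)
{
	return (sizeof(enum small));
}

size_t
size_of_wide(void)
{
	return (sizeof(enum wide));
}

/* The chosen type must be able to hold every enumerator's value. */
int
wide_round_trips(void)
{
	enum wide lo = WIDE_LOW, hi = WIDE_HIGH;

	return (lo == INT_MIN && hi == INT_MAX);
}
```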

Unfortunately, portable code can't depend on the ABI, and handling the
problem with unsigned types that I just noticed is onerous.  E.g., what
is the result of this code?:

 	enum one { ONE = 1, } one;

 	assert(one > -1);

If the enum type is gratuitously unsigned and no smaller than int, u_int
say, then the result of this is assertion failure, and portable code
must write the assertion as:

 	assert((int)one > -1);

to ensure getting the normal result.  OTOH, `ONE' has type int, so there
is no problem with

 	assert(ONE > -1);
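
The three forms can be put side by side in a sketch like the following.
The helper names are made up.  The result of the unadorned comparison is
worth checking rather than assuming: gcc's documentation says that it
normally chooses unsigned int for an enum with no negative enumerators.

```c
#include <assert.h>

enum one { ONE = 1 };

/* Nonzero if the implementation gave enum one a signed type. */
int
enum_one_is_signed(void)
{
	return ((enum one)-1 < 0);
}

/* The comparison as first written; its result depends on the enum type. */
int
naive_gt_minus1(enum one v)
{
	return (v > -1);
}

/* Portable form: force the comparison to be done in int. */
int
portable_gt_minus1(enum one v)
{
	return ((int)v > -1);
}
```

If enum_one_is_signed() returns 0 and the enum is no smaller than int,
naive_gt_minus1(ONE) is 0 while portable_gt_minus1(ONE) is still 1.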

> The gcc extension is consistent in this regard, since it only allows
> to use pointers to enum without complete definition.

Allowing pointers to incomplete enum types might work even if the size of
the enum type depends on the enum.  The corresponding thing for structs
depends on all struct pointers having the same representation.  This might
be implementable for enum pointers too, even if all pointers to integer
types don't have the same representation.  At worst, the pointers could
be given the representation of 'void *'.
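
The 'void *' fallback relies on a guarantee that standard C already
gives: a pointer to any object type converts to 'void *' and back
without loss.  A sketch with a hypothetical, complete `enum mode` (it
does not use or implement the gcc extension):

```c
#include <assert.h>

/* Hypothetical enum, for illustration only. */
enum mode { MODE_A, MODE_B };

/* Erase the pointee type; only the address is carried. */
void *
erase_mode_ptr(enum mode *p)
{
	return (p);
}

/* Restore the type where the complete enum is visible. */
enum mode *
restore_mode_ptr(void *p)
{
	return (p);
}
```

Code that only passes the erased pointer around never needs the enum's
size; only the code that dereferences it does.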

Bruce


