Date:      Thu, 14 Jul 2016 02:53:29 -0700
From:      Mark Millard <markmi@dsl-only.net>
To:        Andrey Chernov <ache@freebsd.org>
Cc:        svn-src-head@freebsd.org, FreeBSD Current <freebsd-current@freebsd.org>, freebsd-stable@freebsd.org, freebsd-arm <freebsd-arm@freebsd.org>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>, Bruce Evans <brde@optusnet.com.au>
Subject:   Re: svn commit: r302601 - in head/sys: arm/include arm64/include [clang 3.8.0: powerpc int instead of 32-bit SYSVR4's long and 64-bit ELF V2 long]
Message-ID:  <580A746B-3F02-44FA-AB2E-20CC71A1E9D2@dsl-only.net>
In-Reply-To: <3DFF1DC9-2AE6-498A-9FE0-4970E76F8AB5@dsl-only.net>
References:  <46153340-D2F4-48BD-B738-4792BC25FA3F@dsl-only.net> <b4d1b3d9-9577-3f89-c13e-8c46d1ddee95@freebsd.org> <38CF2C28-3BD1-4D09-939F-4DD0C2E8B58F@dsl-only.net> <a3f33812-1780-024e-4638-994c56e45c42@freebsd.org> <3DFF1DC9-2AE6-498A-9FE0-4970E76F8AB5@dsl-only.net>

[Top post of a history note for powerpc and wchar_t's type in FreeBSD. The history is from looking around in svn.]

[The below is not a complaint or a request for a change. It just looks like int for wchar_t for powerpc was a choice made long ago for simpler code given FreeBSD's pre-existing structure.]

int being used for powerpc wchar_t on FreeBSD goes back to at least 2001-Jan-1. [FYI: "27 February, 2008: FreeBSD 7.0 is the first release to officially support the FreeBSD/ppc port". So long before official support.]

wchar_t's type is one place where FreeBSD chose to override the powerpc (and powerpc64) ABI standards (which indicate long, not int). I'm not sure whether this happened implicitly or with explicit awareness of the ABI mismatch. [The SYSVR4 32-bit powerpc ABI goes back to 1995.]

I first traced the history back to 2002-Aug-23: -r102315 of sys/sys/_types.h standardized FreeBSD on the following until the ARM change:

typedef int             __ct_rune_t;
typedef __ct_rune_t     __rune_t;
typedef __ct_rune_t     __wchar_t;
typedef __ct_rune_t     __wint_t;

Prior to this there was 2002-Aug-21's -r102227 of sys/powerpc/include/_types.h, which used __int32_t.

Prior to that there were ansi.h and types.h instead of _types.h, and ansi.h had:

#define _BSD_WCHAR_T_   _BSD_CT_RUNE_T_         /* wchar_t (see below) */
. . .
#define _BSD_CT_RUNE_T_ int                     /* arg type for ctype funcs */

Going back to sys/powerpc/include/ansi.h's -r70571 (2001-Jan-1 creation in svn):

#define _BSD_WCHAR_T_   int                     /* wchar_t */

And the comments back then say:

. . . It is not
 * unsigned so that EOF (-1) can be naturally assigned to it and used.
. . . The reason an int was
 * chosen over a long is that the is*() and to*() routines take ints (says
 * ANSI C), but they use __ct_rune_t instead of int.

I've decided to not go any farther back in time (if there is prior history for wchar_t for powerpc).

Ignoring the temporary __int32_t use: FreeBSD has had its own powerpc wchar_t type (int) for at least the last 15 years, at least when viewed just relative to the powerpc ABI(s) FreeBSD is based on for powerpc.



Recent gcc versions even have the FreeBSD wchar_t type correct for the powerpc variants: int. Previously some notation (L-prefixed notation) used the wrong type for one of the powerpc variants (32-bit vs. 64-bit), causing lots of false-positive compiler notices. gcc had followed the ABI involved (long int) until the correction.

===
Mark Millard
markmi at dsl-only.net

On 2016-Jul-13, at 11:46 PM, Mark Millard <markmi at dsl-only.net> wrote:

> On 2016-Jul-13, at 6:00 PM, Andrey Chernov <ache at freebsd.org> wrote:
>
>> On 13.07.2016 11:53, Mark Millard wrote:
>>> [The below does note that TARGET=powerpc has a mix of signed wchar_t and unsigned char types and most architectures have both being signed types.]
>>
>> POSIX says nothing about wchar_t and char should be the same (un)signed.
>> It is arm ABI docs may say so only. They are different entities
>> differently encoded and cross assigning between wchar_t and char is not
>> recommended.
>
> [My "odd" would better have been the longer phrase "unusual for FreeBSD" for the signed type mismatch point.]
>
> C11 (9899:2011[2012]) and C++11 (14882:2011(E)) agree with your POSIX note: no constraint to have the same signed type status as char.
>
> But when I then looked at the "System V Application Binary Interface PowerPC Processor Supplement" (1995-Sept SunSoft document) that I believe FreeBSD uses for powerpc (32-bit only: TARGET_ARCH=powerpc) it has:
>
> typedef long wchar_t;
>
> as part of: Figure 6-39 <stddef.h> (page labeled 6-38).
>
> While agreeing about the signed-type status for wchar_t this does not agree with FreeBSD 11.0's use of int as the type:
>
> sys/powerpc/include/_types.h:typedef	int		___wchar_t;
> sys/powerpc/include/_types.h:#define	__WCHAR_MIN	__INT_MIN	/* min value for a wchar_t */
> sys/powerpc/include/_types.h:#define	__WCHAR_MAX	__INT_MAX	/* max value for a wchar_t */
>
> # clang --target=powerpc-freebsd11 -std=c99 -E -dM  - < /dev/null | more
> . . .
> #define __WCHAR_MAX__ 2147483647
> #define __WCHAR_TYPE__ int
> #define __WCHAR_WIDTH__ 32
> . . .
>
> I'm not as sure of which document is official for TARGET_ARCH=powerpc64 but using "Power Architecture 64-bit ELF V2 ABI Specification" (Open POWER ABI for Linux Supplement) as an example of what likely is common for that context: 5.1.3 Types Defined in Standard header lists:
>
> typedef long wchar_t;
>
> which again does not agree with FreeBSD 11.0's use of int as the type:
>
> # clang --target=powerpc64-freebsd11 -std=c99 -E -dM  - < /dev/null | more
> . . .
> #define __WCHAR_MAX__ 2147483647
> #define __WCHAR_TYPE__ int
> #define __WCHAR_WIDTH__ 32
> . . .
>
>
> ===
> Mark Millard
> markmi at dsl-only.net
>
>
>>
>> On 2016-Jul-11, at 8:57 PM, Andrey Chernov <ache at freebsd.org> wrote:
>>
>>> On 12.07.2016 5:44, Mark Millard wrote:
>>>> My understanding of the criteria for __WCHAR_MIN and __WCHAR_MAX:
>>>>
>>>> A) __WCHAR_MIN and __WCHAR_MAX: same type as the integer promotion of
>>>> ___wchar_t (if that is distinct).
>>>> B) __WCHAR_MIN is the low value for ___wchar_t as an integer type; not
>>>> necessarily a valid char value
>>>> C) __WCHAR_MAX is the high value for ___wchar_t as an integer type; not
>>>> necessarily a valid char value
>>>
>>> It seems you are right about "not a valid char value", I'll back this
>>> change out.
>>>
>>>> As far as I know arm FreeBSD uses unsigned character types (of whatever
>>>> width).
>>>
>>> Probably it should be unsigned for other architectures too, clang does
>>> not generate negative values with L'<char>' literals and locale use only
>>> positive values too.
>>
>> Looking around:
>>
>> # grep -i wchar sys/*/include/_types.h
>> sys/arm/include/_types.h:typedef	unsigned int	___wchar_t;
>> sys/arm/include/_types.h:#define	__WCHAR_MIN	0		/* min value for a wchar_t */
>> sys/arm/include/_types.h:#define	__WCHAR_MAX	__UINT_MAX	/* max value for a wchar_t */
>> sys/arm64/include/_types.h:typedef	unsigned int	___wchar_t;
>> sys/arm64/include/_types.h:#define	__WCHAR_MIN	0		/* min value for a wchar_t */
>> sys/arm64/include/_types.h:#define	__WCHAR_MAX	__UINT_MAX	/* max value for a wchar_t */
>> sys/mips/include/_types.h:typedef	int		___wchar_t;
>> sys/mips/include/_types.h:#define	__WCHAR_MIN	__INT_MIN	/* min value for a wchar_t */
>> sys/mips/include/_types.h:#define	__WCHAR_MAX	__INT_MAX	/* max value for a wchar_t */
>> sys/powerpc/include/_types.h:typedef	int		___wchar_t;
>> sys/powerpc/include/_types.h:#define	__WCHAR_MIN	__INT_MIN	/* min value for a wchar_t */
>> sys/powerpc/include/_types.h:#define	__WCHAR_MAX	__INT_MAX	/* max value for a wchar_t */
>> sys/riscv/include/_types.h:typedef	int		___wchar_t;
>> sys/riscv/include/_types.h:#define	__WCHAR_MIN	__INT_MIN	/* min value for a wchar_t */
>> sys/riscv/include/_types.h:#define	__WCHAR_MAX	__INT_MAX	/* max value for a wchar_t */
>> sys/sparc64/include/_types.h:typedef	int		___wchar_t;
>> sys/sparc64/include/_types.h:#define	__WCHAR_MIN	__INT_MIN	/* min value for a wchar_t */
>> sys/sparc64/include/_types.h:#define	__WCHAR_MAX	__INT_MAX	/* max value for a wchar_t */
>> sys/x86/include/_types.h:typedef	int		___wchar_t;
>> sys/x86/include/_types.h:#define	__WCHAR_MIN	__INT_MIN	/* min value for a wchar_t */
>> sys/x86/include/_types.h:#define	__WCHAR_MAX	__INT_MAX	/* max value for a wchar_t */
>>
>> So only arm and arm64 have unsigned wchar_t types.
>>
>> [NOTE: __CHAR16_TYPE__ and __CHAR32_TYPE__ are always unsigned: in C++11 terms char16_t is like std::uint_least16_t and char32_t is like std::uint_least32_t despite being distinct types. So __CHAR16_TYPE__ and __CHAR32_TYPE__ are ignored below.]
>>
>> The clang 3.8.0 compiler output has an odd mix for TARGET_ARCH=powerpc and TARGET_ARCH=powerpc64 . . .
>>
>> armv6 has unsigned types for both char and __WCHAR_TYPE__.
>> aarch64 has unsigned types for both char and __WCHAR_TYPE__.
>> powerpc has unsigned for char but signed for __WCHAR_TYPE__.
>> powerpc64 has unsigned for char but signed for __WCHAR_TYPE__.
>> amd64 has signed types for both char and __WCHAR_TYPE__.
>> i386 has signed types for both char and __WCHAR_TYPE__.
>> mips has signed types for both char and __WCHAR_TYPE__.
>> sparc64 has signed types for both char and __WCHAR_TYPE__.
>> (riscv is not covered by clang as I understand)
>>
>> The details via compiler #define's. . .
>>
>> # clang --target=armv6-freebsd11 -std=c99 -E -dM  - < /dev/null | more
>> . . .
>> #define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__
>> . . .
>> #define __CHAR_BIT__ 8
>> #define __CHAR_UNSIGNED__ 1
>> . . .
>> #define __WCHAR_MAX__ 4294967295U
>> #define __WCHAR_TYPE__ unsigned int
>> #define __WCHAR_UNSIGNED__ 1
>> #define __WCHAR_WIDTH__ 32
>> . . .
>>
>> # clang --target=aarch64-freebsd11 -std=c99 -E -dM  - < /dev/null | more
>> . . .
>> #define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__
>> . . .
>> #define __CHAR_BIT__ 8
>> #define __CHAR_UNSIGNED__ 1
>> . . .
>> #define __WCHAR_MAX__ 4294967295U
>> #define __WCHAR_TYPE__ unsigned int
>> #define __WCHAR_UNSIGNED__ 1
>> #define __WCHAR_WIDTH__ 32
>> . . .
>>
>> # clang --target=powerpc-freebsd11 -std=c99 -E -dM  - < /dev/null | more
>> . . .
>> #define __BYTE_ORDER__ __ORDER_BIG_ENDIAN__
>> . . .
>> #define __CHAR_BIT__ 8
>> #define __CHAR_UNSIGNED__ 1
>> . . .
>> #define __WCHAR_MAX__ 2147483647
>> #define __WCHAR_TYPE__ int
>> #define __WCHAR_WIDTH__ 32
>> . . . (note the lack of __WCHAR_UNSIGNED__) . . .
>>
>> Is powerpc wrong?
>>
>> # clang --target=powerpc64-freebsd11 -std=c99 -E -dM  - < /dev/null | more
>> . . .
>> #define __BYTE_ORDER__ __ORDER_BIG_ENDIAN__
>> . . .
>> #define __CHAR_BIT__ 8
>> #define __CHAR_UNSIGNED__ 1
>> . . .
>> #define __WCHAR_MAX__ 2147483647
>> #define __WCHAR_TYPE__ int
>> #define __WCHAR_WIDTH__ 32
>> . . . (note the lack of __WCHAR_UNSIGNED__) . . .
>>
>> Is powerpc64 wrong?
>>
>>
>> # clang --target=amd64-freebsd11 -std=c99 -E -dM  - < /dev/null | more
>> . . .
>> #define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__
>> . . .
>> #define __CHAR_BIT__ 8
>> . . . (note the lack of __CHAR_UNSIGNED__) . . .
>>
>> #define __WCHAR_MAX__ 2147483647
>> #define __WCHAR_TYPE__ int
>> #define __WCHAR_WIDTH__ 32
>> . . . (note the lack of __WCHAR_UNSIGNED__) . . .
>>
>> # clang --target=i386-freebsd11 -std=c99 -E -dM  - < /dev/null | more
>> . . .
>> #define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__
>> . . .
>> #define __CHAR_BIT__ 8
>> . . . (note the lack of __CHAR_UNSIGNED__) . . .
>>
>> #define __WCHAR_MAX__ 2147483647
>> #define __WCHAR_TYPE__ int
>> #define __WCHAR_WIDTH__ 32
>> . . . (note the lack of __WCHAR_UNSIGNED__) . . .
>>
>>
>> # clang --target=mips-freebsd11 -std=c99 -E -dM  - < /dev/null | more
>> . . .
>> #define __BYTE_ORDER__ __ORDER_BIG_ENDIAN__
>> . . .
>> #define __CHAR_BIT__ 8
>> . . . (note the lack of __CHAR_UNSIGNED__) . . .
>>
>> #define __WCHAR_MAX__ 2147483647
>> #define __WCHAR_TYPE__ int
>> #define __WCHAR_WIDTH__ 32
>> . . . (note the lack of __WCHAR_UNSIGNED__) . . .
>>
>> # clang --target=sparc64-freebsd11 -std=c99 -E -dM  - < /dev/null | more
>> . . .
>> #define __BYTE_ORDER__ __ORDER_BIG_ENDIAN__
>> . . .
>> #define __CHAR_BIT__ 8
>> . . . (note the lack of __CHAR_UNSIGNED__) . . .
>>
>> #define __WCHAR_MAX__ 2147483647
>> #define __WCHAR_TYPE__ int
>> #define __WCHAR_WIDTH__ 32
>> . . . (note the lack of __WCHAR_UNSIGNED__) . . .
>>
>>
>>
>> ===
>> Mark Millard
>> markmi at dsl-only.net






