Date:      Wed, 8 Jun 2016 17:54:20 -0400
From:      Jung-uk Kim <jkim@FreeBSD.org>
To:        Dimitry Andric <dim@FreeBSD.org>, Gerald Pfeifer <gerald@pfeifer.com>
Cc:        Andreas Tobler <andreast@FreeBSD.org>, freebsd-toolchain@freebsd.org
Subject:   Re: Re: Duplicate OPT_ entries in gcc/options.h
Message-ID:  <0610816e-2675-1abf-a4ee-274807317932@FreeBSD.org>
In-Reply-To: <75411813-0C9B-4CEF-BEE4-8B26DD8346F7@FreeBSD.org>
References:  <alpine.LSU.2.20.1606082038000.2798@anthias.pfeifer.com> <75411813-0C9B-4CEF-BEE4-8B26DD8346F7@FreeBSD.org>

On 06/ 8/16 05:15 PM, Dimitry Andric wrote:
> On 08 Jun 2016, at 21:11, Gerald Pfeifer <gerald@pfeifer.com> wrote:
>>
>> I got a user report, and could reproduce this, that building
>> GCC (lang/gcc, but also current HEAD, so probably pretty much
>> any version) with FreeBSD 11 and LANG = en_US.UTF-8 we get
>> conflicting entries in $BUILDDIR/gcc/options.h such as
>>
>>  OPT_d = 135,                               /* -d */
>>  OPT_D = 136,                               /* -D */
>>  OPT_d = 137,                               /* -d */
>>  OPT_D = 138,                               /* -D */
>>  OPT_d = 141,                               /* -d */
>>  OPT_D = 142,                               /* -D */
>>  OPT_d = 143,                               /* -d */
>>
>> Using LANG = en_US (without UTF-8), everything works fine.
>>
>> Any ideas what might be going on here?  (This is done via
>> AWK scripts from what I can tell, does this trigger any
>> ideas?)
>
> It is definitely something caused by our awk in base, in any case.
> First opt-gather.awk is run to generate a flat list of all options:
>
>   /usr/bin/awk -f /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/opt-gather.awk \
>     /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/ada/gcc-interface/lang.opt \
>     /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/fortran/lang.opt \
>     /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/go/lang.opt \
>     /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/java/lang.opt \
>     /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/lto/lang.opt \
>     /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/c-family/c.opt \
>     /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/common.opt \
>     /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/config/fused-madd.opt \
>     /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/config/i386/i386.opt \
>     /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/config/rpath.opt \
>     /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/config/freebsd.opt > tmp-optionlist
>
> Then opt-functions.awk is run to process optionlist into options.h:
>
>   /usr/bin/awk -f /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/opt-functions.awk \
>     -f /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/opt-read.awk \
>     -f /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/opth-gen.awk < optionlist > options.h
>
> If I run the first step using LANG=C, or without any LANG setting, both
> optionlist and options.h are as expected.  If I run the first step using
> LANG=en_US.UTF-8, the optionlist is sorted differently.  For example, the
> "good" optionlist has the uppercase D options first, and much later the
> lowercase d options:
>
>   D^\C ObjC C++ ObjC++ Joined Separate MissingArgError(macro name missing after %qs)^\-D<macro>[=<val>]   Define a <macro> with <val> as its value.  If just <macro> is given, <val> is taken to be 1
>   D^\Driver Joined Separate
>   D^\Fortran Joined Separate
>   ... much later in the file, after all options starting with an uppercase letter ...
>   d^\C ObjC C++ ObjC++ Joined
>   d^\Common Joined^\-d<letters>   Enable dumps from specific passes of the compiler
>   d^\Fortran Joined
>   d^\Java Separate SeparateAlias Alias(foutput-class-dir=)
>
> The "bad" optionlist has the upper and lower case d options sorted
> together:
>
>   d^\C ObjC C++ ObjC++ Joined
>   D^\C ObjC C++ ObjC++ Joined Separate MissingArgError(macro name missing after %qs)^\-D<macro>[=<val>]   Define a <macro> with <val> as its value.  If just <macro> is given, <val> is taken to be 1
>   d^\Common Joined^\-d<letters>   Enable dumps from specific passes of the compiler
>   D^\Driver Joined Separate
>   defsym=^\Driver JoinedOrMissing
>   defsym^\Driver Separate
>   d^\Fortran Joined
>   D^\Fortran Joined Separate
>   d^\Java Separate SeparateAlias Alias(foutput-class-dir=)
>
> Note that GNU awk does *not* produce a different optionlist file when
> used with either LANG=C or LANG=en_US.UTF-8.
>
> opt-gather.awk's sorting function looks like this:
>
>   function sort(ARRAY, ELEMENTS)
>   {
>           for (i = 2; i <= ELEMENTS; ++i) {
>                   for (j = i; ARRAY[j-1] > ARRAY[j]; --j) {
>                           temp = ARRAY[j]
>                           ARRAY[j] = ARRAY[j-1]
>                           ARRAY[j-1] = temp
>                   }
>           }
>           return
>   }
>
> So I am assuming that the ARRAY[j-1] > ARRAY[j] comparison works
> differently in our awk, depending on the LANG settings.  No idea when
> that changed, though, if it changed at all...

This behaviour has been known for a very long time:

https://svnweb.freebsd.org/changeset/base/173731

and it is not our fault:

https://www.gnu.org/software/gawk/manual/html_node/POSIX-String-Comparison.html

GNU awk produces the same output as our awk when run with the "--posix" option.
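
To make the effect concrete, here is a minimal Python sketch of the quoted insertion sort with the string comparison made pluggable. It is not taken from GCC or from our awk: the names insertion_sort, byte_greater, collation_key and collation_greater are made up for illustration, and collation_key is only a crude stand-in (ignore non-alphanumeric characters, compare case-insensitively, lowercase first on ties) for what en_US.UTF-8 collation appears to do in the "bad" optionlist above. The four records are shortened versions of the ones quoted earlier.

```python
# Sketch only: a Python re-creation of the insertion sort quoted above,
# with the string comparison made pluggable.

FS = "\x1c"  # field separator, shown as ^\ in the optionlist excerpts


def insertion_sort(items, greater):
    """Insertion sort; greater(a, b) plays the role of awk's `a > b`."""
    result = list(items)
    for i in range(1, len(result)):
        j = i
        # Move result[j] left while its left neighbour compares greater.
        while j > 0 and greater(result[j - 1], result[j]):
            result[j - 1], result[j] = result[j], result[j - 1]
            j -= 1
    return result


def byte_greater(a, b):
    # LANG=C behaviour: plain code point order, so "D" (0x44) < "d" (0x64).
    return a > b


def collation_key(s):
    # ASSUMPTION: crude stand-in for en_US.UTF-8 collation, not the real
    # rules. Non-alphanumerics are ignored and case only breaks ties
    # (lowercase first, via swapcase).
    letters = "".join(ch for ch in s if ch.isalnum())
    return (letters.lower(), s.swapcase())


def collation_greater(a, b):
    return collation_key(a) > collation_key(b)


# Shortened records in the style of the optionlist excerpts above.
records = [f"d{FS}Fortran", f"D{FS}Driver", f"d{FS}Common", f"D{FS}C"]

# Keep only the option name (first field) of each sorted record.
c_names = [r.split(FS)[0] for r in insertion_sort(records, byte_greater)]
utf8_names = [r.split(FS)[0] for r in insertion_sort(records, collation_greater)]

print(c_names)
print(utf8_names)
```

With the byte-order comparison both D records come first and both d records follow.  With the stand-in collation the names alternate (D, d, D, d), which matches the interleaving in the "bad" optionlist and would explain the repeated OPT_d/OPT_D entries if the later scripts expect records for the same option to be adjacent (that last part is my assumption).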

FYI...

Jung-uk Kim




