Date:      Mon, 4 Aug 2003 17:18:58 +0300
From:      Ruslan Ermilov <ru@freebsd.org>
To:        Andrey Chernov <ache@nagual.pp.ru>
Cc:        current@freebsd.org
Subject:   Re: buildworld broken after installworld
Message-ID:  <20030804141858.GB60105@sunbay.com>
In-Reply-To: <20030804140332.GA39367@nagual.pp.ru>
References:  <20030804195135.0562a9a2.yosimoto@waishi.jp> <20030804114723.GB39384@sunbay.com> <20030804223833.6c9a6718.yosimoto@waishi.jp> <20030804134636.GA39138@nagual.pp.ru> <20030804135713.GA39289@nagual.pp.ru> <20030804140332.GA39367@nagual.pp.ru>

[ standards@ Cc:ed ]

On Mon, Aug 04, 2003 at 06:03:32PM +0400, Andrey Chernov wrote:
> On Mon, Aug 04, 2003 at 17:57:13 +0400, Andrey Chernov wrote:
>
> > > There is
> > >
> > > tr '[a-z]' '[A-Z]'
> > >
> > > which can be different for different locales since use collate now as
> > > required by POSIX. Please tell which exact non-C locale you use and what
> > > happens? I miss start of this discussion.
> >
> > Well, I found error in the archives, so the question remains, what locale
> > you use?
>
> For example, this result is right and not the bug (but wrong tr usage):
>
> env LANG=de_DE.ISO8859-1 tr '[a-z]' '[A-Z]'
> vi_zero
> WI_]ERO
>
Clearly this is a useless construct then.

This is what the POSIX.1-2003 spec says about ranges in tr(1):

: c-c     In the POSIX locale, this construct shall
:         represent the range of collating elements between
:         the range endpoints (as long as neither endpoint
:         is an octal sequence of the form \octal),
:         inclusive, as defined by the collation sequence.
:         The characters or collating elements in the
:         range shall be placed in the array in ascending
:         collation sequence. If the second endpoint
:         precedes the starting endpoint in the collation
:         sequence, it is unspecified whether the range
:         of collating elements is empty, or this construct
:         is treated as invalid. In locales other than
                                 ^^^^^^^^^^^^^^^^^^^^^
:         the POSIX locale, this construct has unspecified
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
:         behavior.
          ^^^^^^^^

This is the same issue that came up with awk(1): the latest
snapshot of the One True AWK reverts to NOT using strcoll(3) to
handle character ranges in regular expressions, because different
locales, and even the same locale on different operating systems
(FreeBSD, Linux, and Solaris were compared), have different ideas
about the collating order.  On Linux, the German locale's collating
sequence is ``A a ... B b'', while on FreeBSD it is ``A B ... a b''.

So I'd prefer that we revert to the old behavior in tr(1).
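Until then, scripts that only want ASCII case conversion can avoid
the problem by pinning the locale for the tr(1) call.  A minimal
sketch, assuming the input is ASCII; the function name to_upper is
made up for illustration.  Note that the brackets are dropped: in
POSIX tr(1) they are ordinary characters, not part of the range.

```shell
# Hypothetical helper: upper-case ASCII letters on stdin, independent of
# the caller's LANG/LC_* settings, by forcing the C locale for tr(1).
to_upper() {
    LC_ALL=C tr 'a-z' 'A-Z'
}

printf 'vi_zero\n' | to_upper    # prints VI_ZERO
```

tr '[:lower:]' '[:upper:]' is the other portable spelling; it follows
the locale's own case mapping rather than a byte range, so it suits
text that is meant to be locale-aware.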


Cheers,
-- 
Ruslan Ermilov		Sysadmin and DBA,
ru@sunbay.com		Sunbay Software Ltd,
ru@FreeBSD.org		FreeBSD committer




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20030804141858.GB60105>