Date: Thu, 9 Mar 2017 11:03:27 +0100
From: Baptiste Daroussin
To: Matthias Apitz, Xin Li, freebsd-hackers@freebsd.org, d@delphij.net, theraven@freebsd.org
Subject: Re: Why en_US.UTF-8 locale consider a < A?
Message-ID: <20170309100326.h2dwsj43vbmujaeh@ivaldir.net>
In-Reply-To: <20170308155947.GA4129@c720-r292778-amd64>
List-Id: Technical Discussions relating to FreeBSD

On Wed, Mar 08, 2017 at 04:59:47PM +0100, Matthias Apitz wrote:
> On Wednesday, March 08, 2017 at 12:51:11AM -0800, Xin Li wrote:
>
> > On 3/8/17 00:40, Baptiste Daroussin wrote:
> > >> Is this result correct? It matches some Debian behavior but not macOS
> > >> behavior.
> > >
> > > Yes, the result is correct. macOS does not have Unicode collation; if you
> > > want to match the macOS behaviour you have to set LC_COLLATE=C
> >
> > Thanks, I also found this https://www.cl.cam.ac.uk/~mgk25/unicode.html
> > just for the record if someone else hits the same issue.
>
> I recently came across a related problem and have two questions
> (unresolved until now):
>
> 1.
> Using sort, reading the man page of it, it should be sufficient to
> set LC_COLLATE correctly. It seems that setting LANG (or unsetting it)
> changes the sort order, why?

This has been answered by someone else already.

> 2.
> Speaking about German Umlauts, should they be treated as their normal
> letters, i.e. 'ä' is like 'a', as one can read in Wiki, or how are they
> sorted exactly?

I don't know the details for this particular case, but we do take the data
from CLDR (http://cldr.unicode.org/), so if you check there you will have
your answer.

Best regards,
Bapt
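
For the record, a minimal Python sketch of the difference between LC_COLLATE=C ordering and locale-aware ordering, in case someone wants to check this on their own system. The helper names `sort_c_order` and `sort_with_locale` are made up for illustration. The locale-aware result depends on the collation data installed on the machine, so no output is claimed for it here.

```python
import locale

def sort_c_order(items):
    """Sort by code point. For UTF-8 text this matches the byte-wise
    comparison used when LC_COLLATE=C."""
    return sorted(items)

def sort_with_locale(items, name):
    """Sort with the collation rules of the named locale.
    Returns None if that locale is not available on this system."""
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, name)
    except locale.Error:
        return None
    try:
        return sorted(items, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore previous setting

letters = ["a", "A", "b", "B"]
umlauts = ["z", "ä", "a"]

# Code point order: all uppercase ASCII letters come before lowercase ones,
# and 'ä' (U+00E4) comes after 'z'.
print(sort_c_order(letters))   # ['A', 'B', 'a', 'b']
print(sort_c_order(umlauts))   # ['a', 'z', 'ä']

# Locale-aware order: result depends on the system's collation data.
print(sort_with_locale(letters, "en_US.UTF-8"))
print(sort_with_locale(umlauts, "de_DE.UTF-8"))
```

Comparing the last two printed lines with the first two on a given machine shows whether its locale data treats 'ä' near 'a' and how it orders 'a' against 'A'.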