From owner-freebsd-arch@FreeBSD.ORG Sun Jan 25 19:00:11 2015
Date: Mon, 26 Jan 2015 05:59:51 +1100
From: Peter Jeremy
To: Slawa Olhovchenkov
Cc: arch@FreeBSD.org, Baptiste Daroussin, Jordan Hubbard
Subject: Re: [RFC] Set the default locale to en_US.UTF-8
Message-ID: <20150125185951.GC23253@server.rulingia.com>
References: <20150124143357.GI81001@ivaldir.etoilebsd.net> <20150125143243.GB76051@zxy.spb.ru> <7B1D8345-248B-4C44-9568-079BA29614C2@ixsystems.com> <20150125155000.GD76051@zxy.spb.ru>
In-Reply-To: <20150125155000.GD76051@zxy.spb.ru>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="z4+8/lEcDcG5Ke9S"
Content-Disposition: inline
X-PGP-Key: http://www.rulingia.com/keys/peter.pgp
User-Agent: Mutt/1.5.23 (2014-03-12)
List-Id: Discussion related to FreeBSD architecture

--z4+8/lEcDcG5Ke9S
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On 2015-Jan-25 18:50:00 +0300, Slawa Olhovchenkov wrote:
>On Sun, Jan 25, 2015 at 06:58:13AM -0800, Jordan Hubbard wrote:
>> > On Jan 25, 2015, at 6:32 AM, Slawa Olhovchenkov wrote:
>> >
>> > NO! Please, NOT!
>> > Not all bytestring allowed in UTF-8, as result -- unpedicable failed
>> > execution of sed, grep, vi, ed and etc.

I switched to en_AU.UTF-8 about 5 years ago with relatively little pain
(though I had very little non-ASCII text).  The downside of UTF-8 is that
random non-ASCII bytestrings are unlikely to be valid UTF-8 and will
therefore get rejected.  About the only time I get bitten by this is that
my random password generator:
  dd if=/dev/random bs=32 count=1 | tr -cd '!-~'
will die with a "tr: Illegal byte sequence" and needs a "LC_ALL=C" to
placate it.

At least with emacs (and I think vi), you can override the default locale
on a file-by-file basis - and emacs is very good at coping with non-UTF-8
files in a UTF-8 locale, as well as translating between locales.

>> It's a good idea to change it. We have outgrown ISO-Latin1, and UTF-8
>> solves a host of ugly I18N interoperability problems when used
>> consistently.

Agreed.  IMHO, this is long overdue.

>I am years use ru_RU.KOI8-R. Now I try use ru_RU.UTF8 and got some
>issuse (on 10-STABLE). 9.x and OS may have dufferent version of
>software and don't touch this.

Once you've started using any 8-bit locale, switching to UTF-8 (or to any
other 8-bit locale) will be a PITA because you need to re-encode
everything.  And, since it's very difficult to run with multiple locales,
you need to do a complete sweep when you change locales.

If you are running into specific issues with incorrect handling of
ru_RU.UTF8, that is a bug and you need to report it.

Note that we're talking about changing the default - you already override
the default so it won't affect you.

>This is (change from one-byte tu multi-bytes locale) may be do
>individualy, after inspecting systems. This is may be OK for new
>install, but not [automatic] for update/upgrade.

Either an existing system has already overridden the default locale, so
changing the default will have no impact, or the treatment of non-ASCII
data is currently undefined, so changing the default turns undefined
behaviour into an explicit warning to the user that they have problems
with their data.

-- 
Peter Jeremy

--z4+8/lEcDcG5Ke9S
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQJ8BAEBCgBmBQJUxT0nXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRFRUIyOTg2QzMwNjcxRTc0RTY1QzIyN0Ux
NkE1OTdBMEU0QTIwQjM0AAoJEBall6Dkogs0gV4P/2usHL4uwjoIvhELGQkT5ZcV
+tuSvoQDlhWKUz6/3ThnBlGsHTZ5vCBTqWLtJ2twsC9C8u3EQBjN4YaFbT8NQ8aa
AilFc/MH5/Lu7D9hDh6mv8PxOx9/P4cC3uH6GVElzLgYfQXSWYx3vuw1MH90b1QS
JHF+a/PJCUCDaAtqLv2lejHBLqKSNJdchMbhiLH0XVrurWRLxb1uSaMAKTbNcp6v
XIQqhP6uENJhc/pHEK7yOwpAsuv/MdLGa1sUhIrbFow8yYeR7bA58/VpA/V6nVi1
NU0g8WL4VQJr/dt7xOqjWZ2kdl5ML7qj/8VpoxEjSCmkO4IRa05mAZhiD74RSsF9
pOGxqyahhWhBRhrSu6EU1JPS3aMwGFuQybBPVfYRn3mpqigMRDKE/5yP0y3qa5t/
AinsLLzm9+Ti6Ht0lsDYkx/Ys1J8tRZQiwY6EJe9PM+qR1P1ry2VOfl/+jT+SHIB
uGxNSFrNeI11EX62AUvx5oh7fTU5cMmOpJP+xvO2IHduzkegEt526K4g9yzRB2zR
hfa0sTHgj06V5PIq1q1x7rMwpDzhEqdEfhtKsyXfkpwiYeuisJYfqUaoLMLtpvNE
zo7UGLmF1LKydHBQAaYQBEwpqWbjoBQSVUTUORBdcaeXnrccvx1EXnIQ8ShF0HDj
DQvTf8YfLONDZRFw1/cy
=mtMF
-----END PGP SIGNATURE-----

--z4+8/lEcDcG5Ke9S--
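
For illustration, the two workarounds discussed in this message (forcing
the C locale for the password generator, and re-encoding 8-bit files when
moving to UTF-8) might be sketched roughly as below.  This is only a
sketch under assumptions: it expects a POSIX sh and an iconv(1) that knows
KOI8-R, it reads /dev/urandom where the one-liner in the message reads
/dev/random, and the function names gen_password, is_valid_utf8 and
koi8r_to_utf8 are made up for this example.

```shell
#!/bin/sh
# Illustrative sketch only; the function names are hypothetical.

# Print a password built from 32 random bytes, keeping only printable
# ASCII ('!' through '~').  LC_ALL=C makes tr treat its input as raw
# bytes, so it does not stop with "Illegal byte sequence" in a UTF-8 locale.
# Assumption: /dev/urandom stands in for /dev/random from the message.
gen_password() {
    dd if=/dev/urandom bs=32 count=1 2>/dev/null | LC_ALL=C tr -cd '!-~'
}

# Succeed (exit status 0) only if file $1 decodes cleanly as UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Write file $1, assumed to be KOI8-R encoded, to stdout as UTF-8.
koi8r_to_utf8() {
    iconv -f KOI8-R -t UTF-8 "$1"
}
```

A sweep over a directory could then look something like
  for f in *.txt; do is_valid_utf8 "$f" || koi8r_to_utf8 "$f" > "$f.utf8"; done
which leaves pure-ASCII and already-valid UTF-8 files alone.  It simply
assumes that every file failing the check is KOI8-R; iconv cannot confirm
that, so the results would still need checking by eye.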