From: Eivind Nicolay Evensen
To: Brandon Allbery
Cc: freebsd-stable
Date: Wed, 21 Feb 2018 13:31:13 +0100
Subject: Re: Locale problem updating 10.3 to 11.1
Message-ID: <20180221123112.GB75251@klump.hjerdalen.lokalnett>

On Wed, Feb 21, 2018 at 07:16:49AM -0500, Brandon Allbery wrote:
> A locale mapping is basically a lookup table (with complications for things
> like ß). A single-byte lookup table will be 256 entries, each holding one
> or more (because of combining characters) Unicode codepoints representing
> the mapping from the locale character set to the underlying common
> character set (Unicode). (There may also be a reverse lookup table for
> mapping Unicode codepoints to locale codepoints.)

That's fine. It doesn't make my life miserable the way working directly
with multibyte character sets would, as long as it doesn't hurt
performance.

> Without this, every program would have to deal directly with every possible
> character set.

Or only handle the ones it cares about.

> (Complications include things like: depending on encoding/locale details,
> German lowercase ß will uppercase to either SS or ẞ.

While German is not my main language, I've never seen a situation where
an uppercase variant of ß would make sense, though I understand the
example.

> And that's one of the
> simpler ones; for some locales, things can get *really* weird. Not to
> mention fun stuff like Arabic having 4 representations of every character:
> initial, medial, final, standalone.)

These are complications I neither want nor need, and they nicely
illustrate what I dislike about Unicode, although I can understand why
some operating systems want to support it in order to be useful in more
situations.

--
Eivind
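
To make the lookup table described in the quoted text concrete, here is a
minimal Python sketch. It assumes ISO-8859-1 as the locale character set
(the charset of this message) and lets Python's own codec stand in for the
locale data. The names build_table and build_reverse are made up for the
example and do not correspond to anything in FreeBSD's libc.

```python
import unicodedata


def build_table(encoding="iso-8859-1"):
    """Forward table: one entry per byte value, each entry a string of
    one or more Unicode codepoints (hypothetical helper, for illustration)."""
    return [bytes([b]).decode(encoding) for b in range(256)]


def build_reverse(table):
    """Reverse table: Unicode string back to the single-byte value."""
    return {chars: b for b, chars in enumerate(table)}


table = build_table()
reverse = build_reverse(table)

# Byte 0xDF is the sharp s in ISO-8859-1.
sharp_s = table[0xDF]

# Full case mapping turns one codepoint into two; the capital
# sharp s (U+1E9E) maps back down to the same lowercase letter.
upper_default = sharp_s.upper()
lower_capital = "\u1e9e".lower()

# The four Arabic presentation forms of beh (isolated, final, initial,
# medial) all fold to the base letter under compatibility normalization.
beh_forms = ["\ufe8f", "\ufe90", "\ufe91", "\ufe92"]
beh_folded = {unicodedata.normalize("NFKC", form) for form in beh_forms}

print(len(table), repr(sharp_s), upper_default, lower_capital)
print(len(beh_folded), [hex(ord(c)) for c in beh_folded])
```

The forward table is all a single-byte locale needs for decoding; the
uppercase result being two characters long is the kind of complication
that a plain 256-entry byte-to-byte case table cannot express.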