From: Terry Lambert
Subject: Re: internationalization
To: seggers@semyam.dinoco.de (Stefan Eggers)
Cc: freebsd-hackers@FreeBSD.ORG, seggers@semyam.dinoco.de
Date: Fri, 12 Jun 1998 22:05:21 +0000 (GMT)
Message-Id: <199806122205.PAA20631@usr01.primenet.com>
In-Reply-To: <199806122129.XAA25486@semyam.dinoco.de> from "Stefan Eggers" at Jun 12, 98 11:29:34 pm

> Anyway, as long as there are good and easy to use converters from the
> representation FreeBSD uses from and to Big5, GB, ISO 2022, Unicode
> and others in the base system and the complete system (including
> syscons/pcvt) gets converted I think I can live with the result.

Shift encoding, such as ISO 2022 uses, requires the use of a state
machine to convert.  Or even use in the first place.  This is one of
the major objections to it, since I could "shift in" ISO 10646 and be
done with it.

> For practical reasons I'd prefer a fixed length of a character.
> The software has to be written and modified by someone and for most of the
> FreeBSD system software and ports collections this is people who use
> ISO 646, ISO 8859 and KOI8-R.  If they have to take into account
> variable length characters it might scare some of them away and those
> not scared have to deal with additional complexity.

Actually, the ports maintainer is Japanese.  8-).

My other objection, to the use of a 32 bit instead of a 16 bit wchar_t,
is not based on memory and disk footprint.

My objection is based on the fact that Unicode supports byte order
determination for a two byte encoding, whereas ISO 10646 doesn't support
byte order and word order determination for a four byte encoding.

So while I can select an ISO 10646 character set using ISO 2022, I
can't write interoperable software using it.

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
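
The two technical points in this message, that a shift encoding needs a
stateful decoder and that a two byte encoding can carry its own byte
order, can be sketched roughly as follows.  This is only an illustration
and not code from the thread: the function names are hypothetical, the
shift scheme is a toy that handles nothing but the SO/SI controls, and
the rule that a stream without a leading U+FEFF signature is rejected is
an assumption made for the sketch.

```python
SO = 0x0E  # Shift Out: following bytes are read from the G1 set
SI = 0x0F  # Shift In: following bytes are read from the G0 set


def decode_shifted(data: bytes) -> list:
    """Toy shift-state decoder (hypothetical, SO/SI only).

    Returns (active_set, byte) pairs.  The meaning of a byte depends on
    every shift seen before it, so the stream cannot be decoded from an
    arbitrary offset without replaying the state machine.
    """
    state = "G0"  # initial shift state
    out = []
    for b in data:
        if b == SO:
            state = "G1"
        elif b == SI:
            state = "G0"
        else:
            out.append((state, b))
    return out


def decode_two_byte(data: bytes) -> list:
    """Decode 16 bit code units, using a leading U+FEFF to pick byte order.

    Assumption for this sketch: the signature must be present; a stream
    without it is rejected rather than guessed at.
    """
    if data[:2] == b"\xfe\xff":
        order = "big"
    elif data[:2] == b"\xff\xfe":
        order = "little"
    else:
        raise ValueError("no byte order signature")
    body = data[2:]
    if len(body) % 2:
        raise ValueError("truncated 16 bit code unit")
    return [int.from_bytes(body[i:i + 2], order)
            for i in range(0, len(body), 2)]
```

In the first function the same byte 0x41 comes out tagged "G0" before an
SO and "G1" after it, which is the state dependence being objected to.
In the second, b"\xfe\xff\x00A" and b"\xff\xfeA\x00" both decode to the
single code unit 0x41, because the signature tells the reader which
byte comes first.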