From owner-freebsd-hackers  Thu Jun 11 15:52:16 1998
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <199806112251.PAA29278@usr09.primenet.com>
Subject: Re: internationalization
To: mike@smith.net.au (Mike Smith)
Date: Thu, 11 Jun 1998 22:51:13 +0000 (GMT)
Cc: junker@jazz.snu.ac.kr, itojun@itojun.org, kline@tao.thought.org,
        tlambert@primenet.com, hackers@FreeBSD.ORG
In-Reply-To: <199806111325.GAA01739@antipodes.cdrom.com> from "Mike Smith" at Jun 11, 98 06:25:12 am

> If the GNU gettext is GPL'd, using it for i18n work on the base FreeBSD 
> system would seem to be a pretty bad idea.  If we're serious about 
> doing this "right", we need something that we can integrate entirely.

I agree completely.  We must not bind ourselves to that cross.


> Just for reference's sake, what's wrong with the XPG3 support we 
> currently have (catopen/catclose/catgets)?

XPG/3 does not support multibyte encodings, such as EUC.  It supports
only character set shift encoding (per ISO 2022), such as that used by
what is called "Shift-JIS".
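
For reference, the catalog interface in question looks like this in use.
This is a minimal sketch, not code from the tree: lookup_message() is a
name I made up for illustration, and the catalog name, set number and
message number are whatever the caller supplies.  Only catopen(),
catgets() and catclose() come from <nl_types.h>.

```c
#include <nl_types.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: fetch one message from an XPG3 message catalog
 * and copy it into the caller's buffer.  If the catalog cannot be
 * opened, or the message is missing, the fallback string is used.
 */
static void
lookup_message(const char *catalog, int set_id, int msg_id,
    const char *fallback, char *out, size_t outlen)
{
	const char *msg = fallback;
	nl_catd catd;

	if (outlen == 0)
		return;

	/* catopen() returns (nl_catd)-1 when the catalog is not found. */
	catd = catopen(catalog, 0);
	if (catd != (nl_catd)-1)
		msg = catgets(catd, set_id, msg_id, fallback);

	/*
	 * Copy before closing: the pointer returned by catgets() may
	 * refer to catalog storage that catclose() releases.
	 */
	snprintf(out, outlen, "%s", msg);

	if (catd != (nl_catd)-1)
		(void)catclose(catd);
}
```

Note that the result is a plain char *, an opaque run of bytes; nothing
in the interface says how many characters those bytes encode.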

My opinions:

I believe it is an error to use multibyte encodings, as they destroy
important information which programmers working with 8-bit alphabetic
character sets rely on to make user interaction and data storage tasks
drastically less complicated than they would otherwise be.
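
One concrete piece of that lost information: in an 8-bit character set
the byte count is the character count, so strlen() answers both
questions.  In a multibyte encoding it does not.  A minimal sketch,
assuming EUC-JP as the encoding and well-formed input;
eucjp_char_count() is a hypothetical name used only for illustration:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count characters (not bytes) in a
 * NUL-terminated EUC-JP string.  Assumed byte layout:
 *   0x00-0x7F          one byte  (ASCII)
 *   0x8E + 1 byte      two bytes (SS2, half-width kana)
 *   0x8F + 2 bytes     three bytes (SS3, JIS X 0212)
 *   0xA1-0xFE + 1 byte two bytes (JIS X 0208)
 * Any other byte is counted as a single character.
 */
static size_t
eucjp_char_count(const unsigned char *s)
{
	size_t count = 0;

	while (*s != '\0') {
		size_t len;

		if (*s == 0x8F)
			len = 3;
		else if (*s == 0x8E || (*s >= 0xA1 && *s <= 0xFE))
			len = 2;
		else
			len = 1;

		/* Stop at the terminator even if a sequence is truncated. */
		while (len > 0 && *s != '\0') {
			s++;
			len--;
		}
		count++;
	}
	return (count);
}
```

For the six-byte array { 'A', 0xC6, 0xFC, 0xCB, 0xDC, 0 }, strlen()
reports 5 while eucjp_char_count() reports 3.  Every routine that
truncates, pads, or indexes a string has to know the difference.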

The Japanese already have to deal with this problem because of their
failure to use the Kana (an alphabetic Japanese script that fits
comfortably in 8 bits), so these problems (apparently) don't chafe them
like they chafe us.

The information density of Kanji is drastically higher, and the Japanese
will, arguably, beat the heck out of us in storage density, once they
figure out how to solve the input problem for the roughly 20,000
ideogrammatic characters known to an average Japanese with a PhD.  One
of the main reasons Japan doesn't lead the world in software production
and sales right now is their reliance on Kanji.

The most damning argument I can put forth against ISO 2022 is that it
is inferior to SGML, even if SGML is only used as a font family markup
language (which is all that ISO 2022 is).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.