FreeBSD Mail Archives

Date:      Mon, 15 Dec 1997 06:53:13 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        perhaps@yes.no (Eivind Eklund)
Cc:        tlambert@primenet.com, chat@FreeBSD.ORG
Subject:   Re: blocksize on devfs entries (and related)
Message-ID:  <199712150653.XAA26495@usr09.primenet.com>
In-Reply-To: <8690toqdco.fsf@bitbox.follo.net> from "Eivind Eklund" at Dec 14, 97 02:11:03 pm

> > This becomes *very* important if you ever want to support Unicode,
> > which takes 2 characters per character, or EUC encoded "Big 5",
> > which may take up to 5 characters per character (one of the reasons
> > I am "for" Unicode and "against" EUC/ISO2022).
> 
> You lose.  Unicode isn't enough - the asians have introduced shifting
> _anyway_, as they couldn't fit all asian languages into the space
> available.
> 
> Now, if the asians hadn't voted down the original Unicode proposal
> which called for selection of Unicode charsets for different asian
> languages...
> 
> When you think about it, it is fairly seldom an average user needs to
> display multiple languages in the same document.
> 
> Eivind.
> 
> P.S. The above is based on 2nd hand "oral" information - I'm not a
> nationalisation/character encoding expert, but got it off one.  So no
> difficult questions now ;-)

Your oral information is a lie.

The real "Asian Unicode issue" is language bigotry.  You couldn't display
only the Japanese out of a mixed Chinese/Japanese document.

It's not really an issue of "the asians" voting things down.  It's an
issue of language bigotry.

One big problem (for the Japanese) is that Unicode selected Chinese
dictionary order.

Forget for the moment that the Chinese make up 1/5 of the world's population,
and forget that the Japanese dictionary order is not capable of providing
sequencing information for Chinese Kanji, while Chinese dictionary
order can accommodate both Chinese and Japanese Kanji (it being a
stroke-radical ordering).

The real problem for the Japanese (especially Ohto-san) is that you can't
get rid of all non-Japanese characters easily... in other words, the
standard is "impure".

The standard accomplished its intent: provide a mechanism for a round
trip to and from all existing character set standards.  That the Japanese
cannot easily distinguish Chinese characters which are Unicode encoded
is more the fault of the Japanese and Chinese never agreeing on an
encoding standard that had separate code points.
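
To make the round-trip point concrete, here is a rough Python sketch.  It
assumes CPython's standard codecs; the function name round_trip and the
choice of sample character are made up for illustration.  It encodes one
ideograph into a Japanese and two Chinese legacy character sets, decodes
each result, and reports the Unicode code point that comes back.

```python
def round_trip(ch, codecs=("shift_jis", "big5", "gb2312")):
    """Encode ch in each legacy codec, decode it again, and return
    {codec: (legacy_bytes, unicode_code_point)}."""
    results = {}
    for codec in codecs:
        raw = ch.encode(codec)      # Unicode -> legacy character set
        back = raw.decode(codec)    # legacy character set -> Unicode
        results[codec] = (raw, ord(back))
    return results


if __name__ == "__main__":
    # U+65E5 is present in all three legacy sets (sample chosen for illustration).
    for codec, (raw, cp) in round_trip("\u65e5").items():
        print(f"{codec:10s} bytes={raw.hex()} code point=U+{cp:04X}")
```

Every codec round-trips to the same code point, so the code point by
itself does not say whether the character came from Japanese or Chinese
text.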

The other implicit issue is that the collation sequence is not the same
as the Japanese collation sequence.

On the other hand, the Japanese have no problem with UTF-8 and UTF-7 and
EUC and Shift-JIS and ISO2022, and all those other "standards" that make
Western software using fixed field forms input and record storage
practically useless.
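
A rough Python sketch of the fixed-field problem, assuming CPython's
standard codecs (the function name byte_lengths and the sample strings
are made up for illustration).  It reports how many bytes the same
three-character string occupies in several encodings; a fixed-width
record layout only works when that number does not depend on the
content.

```python
def byte_lengths(text, codecs=("utf-16-be", "utf-8", "shift_jis",
                               "euc_jp", "iso2022_jp")):
    """Return {codec: number of bytes needed to store text}."""
    return {codec: len(text.encode(codec)) for codec in codecs}


if __name__ == "__main__":
    # Both samples are three characters long.
    for sample in ("abc", "\u65e5\u672c\u8a9e"):
        print(repr(sample), byte_lengths(sample))
```

With utf-16-be (used here as a stand-in for a fixed 16-bit encoding)
both samples take the same number of bytes; with the multibyte and
shifted encodings the byte count depends on which characters appear in
the field.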

This is on the order of expecting the French to use English words in
their contribution to the ESA, instead of delaying things while their
official standards body makes up new French words...

Unicode is a *character* encoding standard, not a *font* encoding
standard.  It has its faults (one of which is the bias against fixed
cell rendering technologies -- but how could you expect Taligent to not
favor Adobe technology over X, despite the licensing costs?), but the
use of multilingual documents is adequately covered by Compounding (as
described in the Unicode standard, volume 1).

The problem is that software is generally internationalized.  It is *not*
generally *multinationalized*.

The difference is subtle: the first only enables data-driven localization
into a single round-trip character set.

If the Japanese want *multinationalization* (a smoke screen for the
collation and language separation issues), then the onus is on them
to invent a round-trip character set that includes separate code points;
at that time, *then* (and *only* then) Unicode would have to separate
the code points.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199712150653.XAA26495>
