Date: Thu, 13 Mar 1997 10:51:34 -0500 (EST)
From: John Fieber <jfieber@indiana.edu>
To: Terry Lambert <terry@lambert.org>
Cc: pam@polynet.lviv.ua, chat@freebsd.org
Subject: Re: Q: Locale - is it possible to change on the fly?
Message-ID: <Pine.BSF.3.95q.970312143119.26807O-100000@fallout.campusview.indiana.edu>
In-Reply-To: <199703121800.LAA27652@phaeton.artisoft.com>
On Wed, 12 Mar 1997, Terry Lambert wrote:

> > How many times have you seen web pages with the telltale signs of
> > "smart quotes"?  Box drawing characters that are portable across
> > platforms?  Wheee!  Math symbols?  Lots of people could use a
> > richer set than + - / * and ^.
>
> You can't use Unicode for this... how can you attribute fonts on, for
> instance, a Japanese www page on Chinese poetry?  Any character sets
> which have mutually unified code points that have different glyphs
> can not be simultaneously represented without font attribution.  The
> Unicode standard is not a glyph encoding standard.

In the current world, numerous glyph encodings are used to represent
documents.  Correct?  These differing glyph encodings often share the
same code space, and thus it is essential that an encoding switch
signal, a font tag for example, be present.  In Unicode terms, these
font tags constitute a "higher level protocol".  If you need to convert
a document from one higher-level protocol to another, MS Word to HTML
for example, the all-important encoding information stands a good chance
of getting lost and/or mangled.  If you used dingbats, math symbols,
smart quotes, or any other such encodings, you are SOL.  Your document
has just become rubbish.

You suggest that Unicode has the same problem, and I'll agree, but only
to a limited extent.  A Unicode document should have language tags for
optimal rendering, processing, input method selection, and so on, but if
that information is lost in a higher-level protocol conversion, your
document is hardly turned to rubbish!  First, because of the unified
character encoding, your smart quotes (0x201C, 0x201D) will never be
mistaken for something else, as they would be in the multiple glyph
encoding schemes we have to use now.  Second, although Unicode makes a
clear distinction between character and glyph encoding, and Unicode is a
character encoding, it is also true that many of the Unicode character
blocks have direct, language-independent glyph mappings in practice.
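The smart-quote point above can be sketched concretely.  A minimal
illustration (the byte 0x93 and the specific code pages here are my
choices for the example, not from the original mail): the same legacy
byte decodes to different characters depending on which encoding the
reader assumes, while the Unicode code point U+201C is unambiguous.

```python
# Sketch of the ambiguity described above: in legacy byte encodings the
# byte 0x93 means different characters depending on the assumed code
# page, so losing the "font tag" (higher-level protocol) garbles the
# text.  In Unicode, U+201C is one unambiguous code point.
# (The encodings cp1252 and mac_roman are illustrative choices.)

raw = b"\x93smart quotes\x94"

# Same bytes, two different interpretations:
as_cp1252 = raw.decode("cp1252")        # Windows: curly double quotes
as_macroman = raw.decode("mac_roman")   # Mac: accented vowels instead

print(repr(as_cp1252))    # '\u201csmart quotes\u201d'
print(repr(as_macroman))  # 'ìsmart quotesî'

# With Unicode, the code point itself carries the identity; any correct
# UTF-8 decoder recovers exactly U+201C / U+201D, no font tag needed.
utf8 = as_cp1252.encode("utf-8")
assert utf8.decode("utf-8") == "\u201csmart quotes\u201d"
```

The round trip through UTF-8 is the crux: the character survives any
number of protocol conversions because its identity is in the code
point, not in an out-of-band encoding label.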
Certainly, other scripts do not have direct glyph mappings, and in some
cases glyph mapping is affected by the language, but I hardly think this
is grounds for declaring Unicode useless for multinational computing.
In the absence of higher-level protocols, you cannot handle all possible
languages simultaneously, but you can easily handle a heck of a lot more
than you can with the current collection of glyph encoding standards.
Is that not a contribution to multinational computing?

Let me also re-state that Unicode by itself is not a complete
multinational computing solution, any more than US-ASCII is a complete
solution for American English.  I never stated it as such, and certainly
never meant to imply it.

-john