Date:      Wed, 18 Oct 1995 20:52:36 EST
From:      "Kaleb S. KEITHLEY" <kaleb@x.org>
To:        Terry Lambert <terry@lambert.org>
Cc:        hackers@freefall.FreeBSD.org
Subject:   Re: xterm dumps core 
Message-ID:  <199510190052.UAA00286@exalt.x.org>
In-Reply-To: Your message of Wed, 18 Oct 1995 14:21:51 EST. <199510182121.OAA00987@phaeton.artisoft.com> 


> > The ANSI/POSIX/ISO locale model is inadequate for describing things 
> > like I/O in a graphical user interface. One of the deficiencies is the 
> > inability to describe a set of fonts to use for rendering text in an
> > arbitrary locale. Another deficiency is its failure to address input 
> > methods, without which keyboard input in Oriental and Arabic languages 
> > would be all but impossible.
> 
> With the implementation of Unicode as a character set standard, the
> real issue has moved to either:
> 
> 1)	The deficiency in the Unicode standard in the placement of
> 	"private use" areas such that there is a *very* strong bias
> 	against fixed cell rendering technologies, like X, that
> 	use BLIT copies of prerendered characters at the server
> 	level.
> 
> OR
> 
> 2)	The deficiency in X string drawing with regard to its choice
> 	of fixed cell as a rendering technology.

I don't know of anyone who claims X is perfect. :-)

> The main issue here is whether a single "Unicode font" is possible or not.

Possible or practical given the current technology base?

> For non-ligatured languages (ie: not Arabic, Hebrew, Tamil, Devanagari,
> or English script ["Cursive"]), the answer is "yes, it is possible,
> as long as we are talking about internationalization (enabling for
> soft localization to a single locale) ...

Possible or practical given the current technology base?

> ...instead of multinationalization
> (ability to simultaneously support multiple glyphs for a single
> character encoding, ie: the Han Unifications, etc.).

You're talking about a stateful encoding, where the glyph for a particular
character depends on the preceding or succeeding characters; conceptually 
similar to using compose sequences to enter characters from the right half 
of Latin-1 on a QWERTY keyboard, although that's probably an
oversimplification.
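
Purely as illustration, here is a toy sketch of that idea: a contextual
"shaping" pass that picks a different glyph form for a character depending
on whether its neighbors join to it, loosely like the isolated, initial,
medial, and final forms of Arabic letters. Everything here (the joining
set, the combining marks standing in for glyph forms) is invented for the
example; no real encoding or rendering library is involved.

```python
# Toy sketch of contextual glyph selection: the glyph chosen for a
# character depends on its neighbors.  All tables are invented for
# illustration; no real encoding is used.

JOINING = {"b", "y", "n"}          # pretend these letters join to neighbors

def forms(ch):
    # Invented form table: combining marks stand in for distinct glyphs.
    return {
        "isolated": ch,
        "initial": ch + "\u032D",
        "medial": ch + "\u0331",
        "final": ch + "\u0330",
    }

def shape(text):
    out = []
    for i, ch in enumerate(text):
        if ch not in JOINING:
            out.append(ch)
            continue
        joins_prev = i > 0 and text[i - 1] in JOINING
        joins_next = i + 1 < len(text) and text[i + 1] in JOINING
        if joins_prev and joins_next:
            form = "medial"
        elif joins_next:
            form = "initial"
        elif joins_prev:
            form = "final"
        else:
            form = "isolated"
        out.append(forms(ch)[form])
    return "".join(out)
```

The point is only that the mapping from characters to glyphs needs state
from the surrounding text, which a fixed per-character font lookup cannot
supply.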

> For ligatured languages, it's possible to either adopt a locale
> recognized block print font (Hebrew has one), or redefine the
> areas where the ligatured fonts lie as "private use" areas (in tacit
> violation of the standard), and respecify character encodings and
> round-trip tables for those languages.

I believe we have stateful encodings right now. You seem to be saying
that stateful encodings aren't possible in Unicode.

> Keyboard input methodology is an interpretational issue, and is only
> loosely bound to the fact that X (improperly) assigns keycode values
> based on internal knowledge of keycap legends.  This is loosely bound
> because of the ability to symbolically rebind these values with single
                 ^^^^^^^
??? The ability or the inability to rebind values?

> forward table references.

> The "support for locale-based characater set designation" argues on the
> basis of a choice of a character set that is a subset of Unicode, or
> is an artifact of coding technique (ie: "xtamil").
> 
> I believe this to be a largely specious argument.

I don't follow you. I'm confident that when vendors start supplying a
Unicode locale, the X locale mechanism will prove extensible and flexible
enough to follow suit.

> What the ANSI/POSIX/ISO standards *do* lack is the ability to specify
> locale-based input methods for distinct sub-character set based locales
> as part of the locale information.

Do you mean e.g. the ability to switch to an alternative character set/
encoding such as Arabic in a Latin-1 locale?

> This (and runic encoding at all) is why I believe that XPG/4 is itself
> bogus, though it is quite arguable that locale specificity of input
> is a problem entirely addressable by hardware alone.
> 
> Note that input method *could* be specified by locale specific hardware,
> as long as one was not interested in multinationalization and/or various
> multilingual applications without a single round-trip standard for use
> in conversion to/from Unicode.

You lost me again.

> > If you make changes like this without considering how it might affect
> > the things that have dependencies on them, you pretty much get what you 
> > deserve. I'm sure you wouldn't make a gratuitous change like moving 
> > printf out of libc would you? 
> 
> I agree with this summation.  One must consider the ramifications of
> changes that will cause unexpected behavior that is not of a documented
> type.
> 
> > If you're going to change your locale naming convention then you need 
> > to document the change where people can find it and preserve the old 
> > names (perhaps with symlinks) long enough that people can find either 
> > the changes or the documentation and make the changes necessary in
> their software to accommodate your changes.
> 
> I don't think anyone has suggested directly modifying the locale
> specification to anything other than the ISO standards.

No, but Andrey has said that he is going to give, or has already given, new
names to the FreeBSD locales. I consider it a serious mistake not to
maintain backwards compatibility with previous releases of FreeBSD. Even in
going to HPUX-10, HP has maintained the HPUX-9 locale names; in HP's case
the deprecated names will ultimately be deleted in an as-yet-unnamed
release. Given how trivial it is to do this, I fail to understand his
blatant disregard for backwards compatibility from one release to the next.
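
To show just how trivial it is, here is a minimal sketch of the kind of
compatibility shim I mean. The locale directory path and the rename table
are assumptions for illustration, not FreeBSD's actual layout or actual
renames; a real shim would use the system's locale directory and the real
list of renamed locales.

```python
# Sketch: keep old locale directory names working by symlinking them to
# the renamed directories.  The base path and the rename table below are
# hypothetical, for illustration only.
import os

ASSUMED_LOCALE_DIR = "/usr/share/locale"
ASSUMED_RENAMES = {
    # old name -> new name (illustrative entry only)
    "foo.ISO8859-1": "foo.ISO_8859-1",
}

def make_compat_links(base=ASSUMED_LOCALE_DIR, renames=ASSUMED_RENAMES):
    """Create old-name symlinks where the renamed directory exists."""
    for old, new in renames.items():
        old_path = os.path.join(base, old)
        # Only link when the new directory exists and the old name is free.
        if os.path.isdir(os.path.join(base, new)) and not os.path.lexists(old_path):
            os.symlink(new, old_path)   # relative symlink inside base
```

One loop like this at install time, and every program that asks for a
locale by its old name keeps working.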

> The X locale alias mechanism is
> indeed an artifact of local extensions (ie: AIX "DOSANSI", HP, etc.)
> rather than an artifact of the deficiencies in the well-defined naming
> conventions for locales which are not vendor private.

An artifact of local extensions? I wouldn't say that. I would say it's
an implementation detail to overcome the lack of consistency in naming
locales, e.g.: HP's american.iso88591, Digital's en_US.ISO8859-1, SVR4's 
en_US, SunOS's iso_8859_1 LC_CTYPE, and all the other variations the 
vendors use for their ISO locale names. The X Consortium release of R6 
makes no attempt to cover vendor-proprietary locales like HP's roman8 
locales, or the AIX and UnixWare Codepage 850 locales.
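
For concreteness, the alias table is just a flat two-column mapping from
each vendor spelling to the canonical X locale name. The entries below are
constructed from the names I listed above to show the shape of the thing;
the actual locale.alias file shipped with R6 is much longer and differs in
detail:

```
american.iso88591       en_US.ISO8859-1
en_US                   en_US.ISO8859-1
iso_8859_1              en_US.ISO8859-1
```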

As an aside I would say that I believe all these companies take their
standards compliance very seriously. Yet none of them have a problem with 
not following RFC 7000 in choosing names for their locales. The switch 
from foo.ISO8859-1 to foo.ISO_8859-1 seems completely gratuitous. The fact 
that he will compound it by failing to provide any sort of backwards 
compatibility is inexcusable. 

Andrey should think about the consequences of upsetting thousands of 
previously happy FreeBSD users when they discover that the X that they've
been using just fine for a year or more on FreeBSD 2.0/2.0.5 no longer 
works, with problems ranging from xterm dumping core to compose processing 
no longer working.

> On the other hand, I have no problem whatsoever orphaning vendor-private
> locale naming mechanisms if it buys an additional level of functionality
> at no other cost.

This is not a case of X orphaning vendor locale names. It is a case of
mapping as many vendor locale names as possible to the corresponding X 
locale name. It is an X-internal implementation detail. It is not, as 
Andrey claims, a bug that the X Consortium release of R6 does not support 
the locale names used in an as-yet-unreleased version of FreeBSD.

--

Kaleb


