Date:      Mon, 16 Oct 1995 15:33:03 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        ache@astral.msk.su (=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=)
Cc:        kaleb@x.org, hackers@freefall.freebsd.org
Subject:   Re: A couple problems in FreeBSD 2.1.0-950922-SNAP
Message-ID:  <199510162233.PAA25657@phaeton.artisoft.com>
In-Reply-To: <hYatTWmWE0@ache.dialup.demos.ru> from "=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=" at Oct 16, 95 07:21:56 am

> >>A nice suggestion. Too bad it doesn't work. ANSI/POSIX1 say that a
> >>program does the equivalent of setlocale(LC_ALL, "C") on startup. Given
> >>that ls, and I gather everything else, disregard my LANG, LC_ALL, and 
> >>LC_CTYPE environment variables, I'm left wondering how it is you think
> >>that using the "proper locale" will help. Are you assuming that I'm
> >>using the undocumented hack of setting the ENABLE_STARTUP_LOCALE 
> >>environment variable?
> 
> Briefly, I disagree with default table propagation to 8859-1
> (well, maybe agree with propagation to KOI8-R :-) because:
> 
> 1) It violates POSIX default "C" locale description.

But not the ISO/ANSI C description.  I can put in anything for "undefined"
that I damn well please.  Including 8859-1.
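For reference, the startup behaviour being argued about can be observed
with a few lines of standard C.  This is only a sketch: the helper names
current_locale_is_c() and adopt_environment_locale() are made up for
illustration, and the only library calls used are setlocale() and strcmp().

```c
#include <locale.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the program's current locale, queried
 * without changing it, is "C".  Per ISO C this is the state at startup,
 * as if setlocale(LC_ALL, "C") had been called.
 */
int current_locale_is_c(void)
{
    const char *name = setlocale(LC_ALL, NULL);   /* NULL means "query only" */
    return name != NULL && strcmp(name, "C") == 0;
}

/*
 * Hypothetical helper: opt in to whatever LANG / LC_ALL / LC_CTYPE say.
 * Returns 1 on success, 0 if the environment names an unusable locale.
 */
int adopt_environment_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;         /* "" means "use the environment" */
}
```

A program that never calls something like adopt_environment_locale()
stays in the "C" locale no matter what the environment variables say.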

> 2) It breaks all >=8bit charsets whose names != 8859-1.

This is patently false.  What results is a predominantly 8-bit clean
interface that has 0x00-0x1f,0x80-0x9f shown as controls, and everything
else shown as printable characters.

This is valid for all 8859-x display/input systems, since the code points
that the 8859-x sets reuse for different characters are passed through
untransformed (8859-x does not encode characters in the control
locations).
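The classification described above can be sketched in C roughly as
follows.  The function names are hypothetical, and the ranges are taken
literally from this message, so DEL (0x7f) is not treated specially here;
a real table would probably want to.

```c
#include <stddef.h>

/* Hypothetical helper: 1 if the byte falls in 0x00-0x1f or 0x80-0x9f. */
int is_control_byte(unsigned char c)
{
    return c <= 0x1f || (c >= 0x80 && c <= 0x9f);
}

/* Everything outside the two control ranges is passed through as printable. */
int is_printable_byte(unsigned char c)
{
    return !is_control_byte(c);
}

/* Count the bytes in a name that would be shown as controls. */
size_t count_control_bytes(const unsigned char *s, size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++)
        if (is_control_byte(s[i]))
            n++;
    return n;
}
```

Note that nothing here depends on which 8859-x (or KOI8-R) set the
display uses: bytes 0xa0-0xff are passed through as printable whatever
glyphs they map to.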

The only potentially incorrect behaviour is on blanks not being interpreted
as blanks.  If you want a blank, you shouldn't be using some wild code
point other than 0x20 anyway.  You get what you deserve.

At this point, that would cause the resulting behaviour to be "undefined".
This is acceptable according to the ISO interpretation of X3J11 anyway.  You
use an undefined character, you get weird crap on your screen.

The problems you will encounter in this circumstance are all *very*
specific to cases where a single file system is being used by multiple
nationalities of clients.

Since the locale mechanism is an *internationalization* mechanism, not
a *multinationalization* mechanism, this is in fact correct behaviour.
The difference is that "internationalization" is defined as enabling
for localization to a particular (read as *single*) language.

Thus this behaviour is acceptable and, in fact, expected.

If you want *multinationalization*, use ISO10646, code page 0.  That
is, 16 bit Unicode.  It won't buy you font selection, but unless you
are a language bigot, that shouldn't matter on file names.  If you
care, start lobbying for allocation of code pages other than 0 in
10646.  Good luck on your lobbying, you'll need it.  I hope you don't
mind if I lobby for RTF encoding for language bigots, since they are
in the minority.  8-).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


