Date:      Tue, 17 Oct 1995 02:23:38 +0300 (MSK)
From:      Андрей Чернов (aka Andrey A. Chernov, Black Mage) <ache@astral.msk.su>
To:        Terry Lambert <terry@lambert.org>
Cc:        hackers@freefall.freebsd.org, joerg_wunsch@uriah.heep.sax.de, kaleb@x.org
Subject:   Re: A couple problems in FreeBSD 2.1.0-950922-SNAP
Message-ID:  <YlwbkWmKE0@ache.dialup.demos.ru>
In-Reply-To: <199510162209.PAA25573@phaeton.artisoft.com>; from Terry Lambert at Mon, 16 Oct 1995 15:09:02 -0700 (MST)
References:  <199510162209.PAA25573@phaeton.artisoft.com>

In message <199510162209.PAA25573@phaeton.artisoft.com> Terry Lambert
    writes:

>I've only been working on user interface internationalization for more
>than 10 years, and OS internationalization for 6 years.

So have I, though my work has been restricted mostly to the Russian
language.

>The crt0 hack is a kludge that supposedly "fixes" non-internationalized
>programs that are otherwise 8 bit clean.

>The reason it is there is that the default C locale is not i18n clean in
>its undefined behaviour.

>It should not be there.  0xa3 will display the same for you no matter
>which 8859-x locale you pick, except the current C locale, which I think
>is wrong.

>The problem is that the current C locale renders some printable characters
>unprintable, etc. by virtue of the way the ctype.h macros operate.

>Well, fix the C locale's undefined behaviour to be the same as the defined
>8859-1 behaviour.  Problem solved.

It seems you are missing the point here. The most harmful are macros
such as isprint(), islower()/isupper(), isalpha(), ispunct(), etc.
All of them differ between 8-bit charsets, e.g.
isalpha(8859-1) != isalpha(KOI8-R).
If you stick with one particular version, e.g. 8859-1, the is*()
functions will return incorrect values for any other charset in use,
screwing up your screen output and keyboard input. For example,
8859-1 toupper() can produce a very strange char for KOI8-R input.
Or checking input with 8859-1 ispunct() can let very strange KOI8-R
chars sneak in. Or using 8859-1 isalpha() for output can print very
strange chars for KOI8-R, etc. Don't forget, I use KOI8-R only as an
example; you can substitute some other 8859-* charset for it.
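
To make the mismatch concrete, here is a minimal sketch with
hand-written classification rules for the two charsets. The helper
names (latin1_isupper, koi8r_isupper, latin1_toupper) are hypothetical
and exist only for this illustration; a real libc takes these tables
from its locale data rather than from hard-coded ranges.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustration only: hard-coded rules instead of real locale tables. */

/* ISO 8859-1: uppercase is A-Z plus 0xC0-0xDE, except 0xD7 (the
 * multiplication sign). */
bool latin1_isupper(int c)
{
    assert(c >= 0 && c <= 0xFF);
    if (c >= 'A' && c <= 'Z')
        return true;
    return c >= 0xC0 && c <= 0xDE && c != 0xD7;
}

/* KOI8-R: uppercase is A-Z plus Cyrillic capitals at 0xE0-0xFF and
 * capital IO at 0xB3. Cyrillic small letters sit at 0xC0-0xDF. */
bool koi8r_isupper(int c)
{
    assert(c >= 0 && c <= 0xFF);
    if (c >= 'A' && c <= 'Z')
        return true;
    return (c >= 0xE0 && c <= 0xFF) || c == 0xB3;
}

/* ISO 8859-1 toupper(): a-z and 0xE0-0xFE (except 0xF7, the division
 * sign) map to the code point 0x20 below; everything else is unchanged. */
int latin1_toupper(int c)
{
    assert(c >= 0 && c <= 0xFF);
    if (c >= 'a' && c <= 'z')
        return c - 0x20;
    if (c >= 0xE0 && c <= 0xFE && c != 0xF7)
        return c - 0x20;
    return c;
}
```

Byte 0xE1 is a capital Cyrillic A in KOI8-R. Under these rules
latin1_toupper(0xE1) yields 0xC1, which KOI8-R reads as the small
letter, so "uppercasing" with the wrong table actually lowercases
the text.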

>Fix the C locale, not the crt0.o.  Then, as time permits, fix the locale
>unaware code.

What exactly do you mean by fixing the C locale?

>As long as the characters are passed through unadulterated, there is
>no difference for n == 1 and n != 1 in the non-setlocale() called case,
>which is the issue.  If the damn thing wasn't being called and the
>C locale were correctly defined for "undefined" code points, then there
>would not be a problem.

What do you mean by unadulterated? The characters are passed through
unaltered, but they belong to different classes in different charsets;
what really separates them is the is*() functions.
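
A small sketch of that effect, assuming a program that never calls
setlocale() and filters its output through isprint(). The helper name
sanitize_for_display is made up for this example. On common libc
implementations the default "C" locale classifies bytes above 0x7F as
non-printable, so a perfectly good KOI8-R letter is replaced even
though nothing altered it on the way in.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: replace every byte that isprint() rejects in
 * the current locale with '?'. Returns the number of bytes replaced. */
size_t sanitize_for_display(char *s)
{
    size_t replaced = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* Cast to unsigned char: passing a negative value other than
         * EOF to isprint() is undefined behaviour. */
        if (!isprint((unsigned char)*s)) {
            *s = '?';
            replaced++;
        }
    }
    return replaced;
}
```

The byte arrives intact; it is the classification step inside the
program that decides whether the user ever sees it.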

>Calling "setlocale()" for an otherwise non-internationalized program is
>a big mistake, and just compounds the C locale mistake.  Correct the
>right code.

BTW, when a C program is known to be 8-bit clean, what my users and I
want from FreeBSD is proper interaction with the Russian language.

It means that
1) all is*() macros must be correct for the Russian charset (LC_CTYPE).
2) strftime() must return national data (LC_TIME).
3) National sorting must work (LC_COLLATE).
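
For reference, this is roughly what a locale-aware program would do by
hand to get those three things. It is only a sketch using standard C
calls; the wrapper names (use_environment_locale, weekday_name,
national_compare) are invented for the example, and which locale names
are installed depends on the system.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>
#include <time.h>

/* Take LC_CTYPE, LC_TIME and LC_COLLATE from the environment.
 * Returns 1 if all three were accepted, 0 otherwise. */
int use_environment_locale(void)
{
    return setlocale(LC_CTYPE, "") != NULL
        && setlocale(LC_TIME, "") != NULL
        && setlocale(LC_COLLATE, "") != NULL;
}

/* Weekday name of *tm, spelled according to the current LC_TIME.
 * Returns the number of bytes written, as strftime() does. */
size_t weekday_name(char *buf, size_t size, const struct tm *tm)
{
    assert(buf != NULL && tm != NULL);
    return strftime(buf, size, "%A", tm);
}

/* Compare two strings according to the current LC_COLLATE.
 * Negative, zero or positive, like strcmp(). */
int national_compare(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcoll(a, b);
}
```

Without the setlocale() calls both wrappers behave as in the "C"
locale: English day names and plain byte-order comparison.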

Now all those goals are reached with 'setenv ENABLE_STARTUP_LOCALE',
without any program modifications. This is especially essential when
a program isn't native to FreeBSD but comes from a 3rd party, i.e.
the ports area. Moreover, they can be reached on any remote system
too, including freefall, for example.
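
Conceptually the startup hook amounts to something like the sketch
below. This is an assumption about its shape, not a copy of the crt0
source: the function names are hypothetical, and the only facts taken
from this thread are the ENABLE_STARTUP_LOCALE variable and the
implicit setlocale() call.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical startup hook. enable_flag is the value of the switch
 * variable, or NULL if it is unset. Returns 1 if locale setup was
 * attempted, 0 if the program was left in the "C" locale. */
int startup_locale_hook(const char *enable_flag)
{
    /* Startup code runs before main(), so the locale is still "C". */
    assert(strcmp(setlocale(LC_ALL, NULL), "C") == 0);

    if (enable_flag == NULL)
        return 0;

    /* Same call a locale-aware program would make itself. If the
     * environment names an unknown locale, setlocale() returns NULL
     * and the program simply stays in the "C" locale. */
    (void)setlocale(LC_ALL, "");
    return 1;
}

/* What the startup code would call before handing control to main(). */
int startup_locale_from_env(void)
{
    return startup_locale_hook(getenv("ENABLE_STARTUP_LOCALE"));
}
```

The point is that the decision moves from the program to the user's
environment, so an unmodified binary picks up LC_CTYPE, LC_TIME and
LC_COLLATE only when the user asks for it.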

The same holds for 8859-1 users too, not only for KOI8-R users.

Maybe this functionality isn't kosher, but you can't even imagine how
useful it is.

If you know a "proper way" to do things that keeps these goals intact
too, I am all ears.

-- 
Andrey A. Chernov        : And I rest so composedly,  /Now, in my bed,
ache@astral.msk.su       : That any beholder  /Might fancy me dead -
FidoNet: 2:5020/230.3    : Might start at beholding me,  /Thinking me dead.
RELCOM Team,FreeBSD Team :         E.A.Poe         From "For Annie" 1849


