Date:      Mon, 16 Oct 1995 14:37:18 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        ache@astral.msk.su (=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=)
Cc:        terry@lambert.org, bde@zeta.org.au, hackers@freefall.freebsd.org, j@uriah.heep.sax.de, kaleb@x.org
Subject:   Re: A couple problems in FreeBSD 2.1.0-950922-SNAP
Message-ID:  <199510162137.OAA25492@phaeton.artisoft.com>
In-Reply-To: <UlSvj6lWW2@ache.dialup.demos.ru> from "=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=" at Jan 17, 95 00:00:12 am

> >> And what? Now too many pgms require proper locale support, even ls,
> >> so we can't avoid this thing. Code added regardless of
> >> ENABLE_STARTUP_LOCALE set or not, so 'hack' means this variable
> >> as I understand and not code added. As I already say,
> >> I can revert default case to pick ctype and use variable
> >> DISABLE_STARTUP_LOCALE to disable it for debugging purposes.
> 
> >aaaaaaaaaaaaaaauuuuuuuuuuuuuuuuuuuuuuuuuuggggggggggggggggggghhhhhhhhhhhhh!
> 
> >Why do we think ls requires this?
> 
> It is simple: to display native filenames.

Excuse me.  All you need is the correct matching keyboard/font, an 8 bit
clean code path (which the current limited C locale and automatically
calling setlocale() in crt0.o screws up), and the guarantee that your
character encodings don't stomp on control sequence reserved areas,
like 0x00-0x1f,0x80-0x9f.

Except for the bogus C locale (which I agree is bogus), and the fact
that KOI-8 disrespectfully stomps on control areas with its data, you
already have all that.

To get around the stomping, you'll have to define a locale and make the
programs locale aware.

Or get an encoding standard that respects 8859-x and ISO control encoding.

> >Because the default locale is 'C', doesn't mean that the default locale
> >should not be ISO 8 bit clean.
> 
> It is already 8bit clean. You can safely call ctype(>127).

Excuse me.  The C locale does not return the same values as 8859-1.
It is not ISO 8 bit encoding clean.

> >Also, programs whose output is limited in this fashion should be
> >explicitly calling setlocale(), or they are only half-assed in their
> >attempt to support internationalization.
> 
> Correct ctype != half-assed.
> Correct ctype != full i18n
> Correct ctype is what user expects at least.

Read the ISO standardization of the ANSI C standard with respect to the C
locale.  The specific wording is "is undefined".  You can make it return
whatever you want it to for that.  Including i18n.

> The majority of users use various 8bit charsets, and >8bit charsets
> aren't commonly used. Why not make life easier for all 8bit charset
> users, if this does not affect >8bit users at all?

Exactly.  Define the undefined portions of the C locale to act in an
implementation dependent fashion.  That happens to look exactly like
8859-1.

> >In the case that it is explicitly called (ie: programs supposedly using
> >these features), then the hack is unnecessary.
> 
> And what? The second call is a no-op.

First call should not be made at all in a non-internationalized program;
the default behaviour should be i18n.

> >Likewise, if the program is *not* using these features, then they
> >should not stick their ugly noses into the tent uninvited.
> 
> Users prefer to interact in their native language with all the programs
> they have. It is hard to explain to a user why tcsh reacts
> to LANG settings when ls does not.

Neither one should react, or at least the characters displayed should
not change.  The conversion of the high bit set characters into '?' in
ls is broken.

When a character in the 0x20-0x7f,0xa0-0xff range is put out, it should
not be translated or otherwise mutilated if you are in a C or i18n
locale.  Period.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


