Date:      Mon, 16 Oct 1995 15:09:02 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        ache@astral.msk.su (=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=)
Cc:        terry@lambert.org, hackers@freefall.freebsd.org, joerg_wunsch@uriah.heep.sax.de, kaleb@x.org
Subject:   Re: A couple problems in FreeBSD 2.1.0-950922-SNAP
Message-ID:  <199510162209.PAA25573@phaeton.artisoft.com>
In-Reply-To: <lP2bj6l0P1@ache.dialup.demos.ru> from "=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=" at Jan 16, 95 11:38:26 pm

> >> >IMHO, the base utilities that use <ctype.h> should properly initialize
> >> >the locale instead of relying on that hack.  (The hack is useful for
> >> >forcing a locale on programs that do not handle locales, but the base
> >> >utilities of the system are expected to do it right themselves.)
> >> 
> >> I have nothing against reverting this variable to
> >> DISABLE_STARTUP_LOCALE, for example.  If you remember, I planned to make
> >> the startup locale the default for all programs, but some people
> >> disagreed, so I introduced ENABLE_STARTUP_LOCALE.
> 
> >I also think that crt0 is the *wrong* place to do the locale work,
> >which really belongs as a call in main().
> 
> It seems that every new person who appears immediately starts saying
> the same wrong things as the others, instead of first reading the full
> discussion, where all of this has already been explained several times.

I've only been working on user interface internationalization for more
than 10 years, and OS internationalization for 6 years.

You may be right that I have insufficient context.  But I doubt it.

> It is very bad karma to call setlocale from main for ctype-oriented
> programs where the char size is assumed to be <= 8 bits.
> I have already tried to explain this to Joerg, and if you are really
> interested, you can find the answer in my previous messages.
> Only crt0 is the proper place for these things.

Then use XPG/3 localization techniques instead of XPG/4 multibyte
localization.  The two are not incompatible.

The problem you are facing here is that in that case, you have half the
designated locale definitions that you should have to support it.

It is a data problem and an XPG code compatibility problem, not a
problem in the code doing the calls.

The crt0 hack is a kludge that supposedly "fixes" non-internationalized
programs that are otherwise 8 bit clean.

The reason it seems necessary is that the default C locale is not i18n
clean in its undefined behaviour.

It should not be there.  0xa3 will display the same for you no matter
which 8859-x locale you pick, except the current C locale, which I think
is wrong.

The problem is that the current C locale renders some printable characters
unprintable, etc., by virtue of the way the ctype.h macros operate.

Well, fix the C locale's undefined behaviour to be the same as the defined
8859-1 behaviour.  Problem solved.
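
A rough way to see the effect described above is to count how many of the
byte values 0xa0..0xff isprint() accepts under a given locale.  This is
only a sketch: the function names are made up for illustration, and the
numbers it reports depend entirely on the C library and the locales
installed, so no particular result is claimed here.

```c
#include <ctype.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: count the byte values in 0xa0..0xff that
 * isprint() accepts in the currently active LC_CTYPE locale. */
int count_printable_high_bytes(void)
{
    int c, n = 0;

    for (c = 0xa0; c <= 0xff; c++)
        if (isprint(c))
            n++;
    return n;
}

/* Hypothetical helper: switch LC_CTYPE to locale_name, count, and switch
 * back.  Returns -1 if the C library rejects the requested locale. */
int printable_high_bytes_in(const char *locale_name)
{
    char saved[256];
    const char *cur = setlocale(LC_CTYPE, NULL);
    int n;

    /* Copy the name: the returned string may be overwritten by the
     * next setlocale() call. */
    snprintf(saved, sizeof saved, "%s", cur ? cur : "C");

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    n = count_printable_high_bytes();
    setlocale(LC_CTYPE, saved);
    return n;
}
```

Calling printable_high_bytes_in("C") and then the same function with the
name of an installed 8859-1 locale (the exact name varies by system) shows
whether the two classify the upper half of the code table differently.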



> >It is wrong to "fix" broken use of a programming model by causing
> >broken use of the startup model in its place.
> >
> >Making this broken startup code implicit rather than explicit (by changing
> >from a positive to a negative environment test) is just plain wrong.
> 
> Well, what you consider 'broken' is what most users expect to see as 'i18n'.

No.  You are misinterpreting what I have said.

The brokenness is in a setlocale() call in a non-internationalized
piece of code that happens to be (maybe) 8bit clean because the
default C locale happens to also be broken.

Fix the C locale, not the crt0.o.  Then, as time permits, fix the locale
unaware code.

> Why is it 'broken' to have the right ctype at startup? Try asking your
> customers: almost every user who sets "LANG" directly assumes that
> 'ls', for example, must be affected immediately and _not_ through the
> additional hidden magic of 'setenv ENABLE_STARTUP_LOCALE'. If you don't
> want any startup code, simply do not set your "LANG".

That makes ls broken for not explicitly calling setlocale().
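
For reference, the explicit form being argued for looks roughly like the
sketch below.  The helper names are hypothetical and nothing here is taken
from the actual ls source; the only library calls used are the standard
setlocale() and isprint().

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: adopt the user's LC_CTYPE setting explicitly.
 * A locale-aware utility would call this first thing in main(), e.g.
 *
 *     int main(void) { init_ctype_locale(); ... }
 *
 * Returns the name of the locale actually in effect afterwards. */
const char *init_ctype_locale(void)
{
    /* "" means "take the locale from the environment (LANG, LC_*)".
     * If that fails, the program simply stays in the "C" locale. */
    (void)setlocale(LC_CTYPE, "");

    /* A NULL second argument queries the current setting. */
    return setlocale(LC_CTYPE, NULL);
}

/* Hypothetical helper: classify one byte under the active locale.
 * Taking unsigned char keeps the isprint() argument in range. */
int is_printable_byte(unsigned char c)
{
    return isprint(c) != 0;
}
```

A program that never calls init_ctype_locale() keeps the C locale, and
pays nothing for locale setup it does not use.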

Not setting 'LANG' is not going to recover my 24k for a NULL program.

> Where were you when Kaleb suggested an even uglier hack with default
> code table propagation?

Hiding under a rock, apparently, or I would have called him on it.  I
probably *did* call him on it.

> My hack keeps the right ctype in all cases, while his hack works only
> for 8859-1 and does not work even for 8859-n, n != 1.

As long as the characters are passed through unadulterated, there is
no difference for n == 1 and n != 1 in the non-setlocale() called case,
which is the issue.  If the damn thing wasn't being called and the
C locale were correctly defined for "undefined" code points, then there
would not be a problem.
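
As a sketch of what "passed through unadulterated" means in practice,
consider a filter that never consults ctype.h for bytes with the high bit
set.  The function name and the choice of '?' as a replacement are
assumptions made for the example, as is the decision to leave 0x80..0x9f
alone.

```c
#include <stddef.h>

/* Hypothetical 8-bit-clean filter: copy len bytes from src to dst,
 * replacing ASCII control characters (other than tab and newline) with
 * '?'.  Bytes >= 0x80 are copied untouched, so the result is the same
 * whichever 8859-n code table the terminal happens to use. */
size_t copy_8bit_clean(unsigned char *dst, const unsigned char *src,
                       size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char c = src[i];

        if ((c < 0x20 && c != '\t' && c != '\n') || c == 0x7f)
            dst[i] = '?';   /* ASCII control character */
        else
            dst[i] = c;     /* printable ASCII or high-bit byte */
    }
    return len;
}
```

Because no locale-dependent classification is applied to the upper half
of the code table, 0xa3 leaves the filter as 0xa3 with or without a
setlocale() call.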

Calling "setlocale()" for an otherwise non-internationalized program is
a big mistake, and just compounds the C locale mistake.  Correct the
right code.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


