Date:      Sat, 19 May 2001 07:38:21 +0900 (JST)
From:      Noriyuki Soda <soda@sra.co.jp>
To:        ache@nagual.pp.ru, i18n@freebsd.org, audit@freebsd.org
Cc:        bsd-locale@hauN.org
Subject:   Re: CFR: ISO_* -> ISO-* locale renaming
Message-ID:  <200105182238.HAA29872@srapc342.sra.co.jp>
In-Reply-To: <20010519050946U.tshiozak@din.or.jp>
References:  <20010518203702.B79058@nagual.pp.ru> <20010519050946U.tshiozak@din.or.jp>

Andrey A. Chernov <ache@nagual.pp.ru> wrote:
> In the spirit of GNU locale (which use IANA charsets too) I plan to rename
> our ISO_* locales to ISO-* ones, because ISO-* is preferred name according
> to http://www.iana.org/assignments/character-sets and we have only one
> locale name, it should be preferred. GNU locale use preferred MIME names
> in the first place too.

I am very doubtful about this.
I still think that using the X11 codeset names is better than using
the IANA registry, because of the following problems.

I have 6 questions.

1. As I already wrote, Solaris, Tru64 and IRIX use "ISO8859-1".
  And X11's primary name for the Latin-1 codeset is also "ISO8859-1".
  I prefer to use the name compatible with Solaris, Tru64, IRIX
  and the X Window System, rather than a name compatible only with
  Linux.

  Note that Linux also supports "ISO8859-1" as a locale's codeset
  suffix. So, if we use "ISO8859-1", we are still compatible with
  Linux, as well as with Solaris, Tru64 and IRIX.

  If we use "ISO-8859-1", we are compatible only with Linux,
  and we become incompatible with Solaris, Tru64 and IRIX.

  Why do you think that it is better to be compatible only with
  Linux?

  To me, "ISO8859-1" is clearly better.

2. What codeset name will you use for codesets which are available
  on the X Window System, but not defined in the IANA registry?
  (Yes, there is already such a codeset among the locales supported
  by X11.)

  If we follow the convention of the X Window System, this problem
  never happens.

  Note that nl_langinfo(CODESET) of glibc-2 returns a *WRONG* result
  for such a locale.

3. The IANA registry (MIME charset name) is case insensitive.
  Will you support a case-insensitive codeset suffix for locale names?

  Yes, the codeset suffix in glibc is case insensitive,
  although the language part and territory part of a locale name are
  case sensitive.
  i.e. ja_JP.EUC-JP, ja_JP.eUc-jP, ja_JP.EuC-Jp are all correct
    locale names on glibc, although ja_jp.EUC-JP is incorrect.

  If we use the IANA registry for codeset names, we should support
  a case-insensitive codeset suffix as above.

  Will you really support this?

4. The IANA registry (MIME charset name) has many name variants for
  one codeset. For example, "Extended_UNIX_Code_Packed_Format_for_Japanese"
  and "csEUCPkdFmtJapanese" are the same codeset as "EUC-JP".
  Will you support all variants in locale names?

  Yes, glibc supports all variants.
  e.g. the following names are all valid locale names in glibc:
	"ja_JP.Extended_UNIX_Code_Packed_Format_for_Japanese"
	"ja_JP.csEUCPkdFmtJapanese"
	"ja_JP.EUC-JP"
	"ja_JP.eucJP"
	(note that because MIME charset names are case insensitive,
	 names which differ only in upper-case/lower-case
	 are also valid.
	 e.g. "ja_JP.eXTENDED_unix_cODE_pACKED_fORMAT_FOR_jAPANESE"
	  is valid, too.)

  Will you really support this?

5. Why do you think that it is better *NOT* to follow the OpenGroup
  standard?

  At least "eucJP" and "SJIS" seem to be OpenGroup standard names, as I
  already said. And those names are not compatible with the IANA
  registry.

6. Do you really think that the following name should be usable
  as a locale name?
	"ja_JP.Extended_UNIX_Code_Packed_Format_for_Japanese"
  (I don't think so.)
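
To make questions 3 and 4 concrete, here is a rough sketch in Python of
the matching rules described above: the language and territory part is
compared exactly, while the codeset suffix is compared without regard
to case and may be any registered alias. This is only an illustration
of the behaviour described in this message, not glibc's actual code.
The alias table, the set of known locales and the function name
resolve_locale are all hypothetical and exist only for this example.

```python
# Hypothetical alias table for this sketch: keys are lower-cased codeset
# spellings, values are the canonical name picked for illustration.
CODESET_ALIASES = {
    "euc-jp": "EUC-JP",
    "eucjp": "EUC-JP",
    "cseucpkdfmtjapanese": "EUC-JP",
    "extended_unix_code_packed_format_for_japanese": "EUC-JP",
}

# Hypothetical set of installed locales, as (language_territory, codeset).
KNOWN_LOCALES = {("ja_JP", "EUC-JP")}


def resolve_locale(name):
    """Return (language_territory, canonical_codeset), or None if unknown."""
    base, sep, codeset = name.partition(".")
    if not sep:
        return None  # no codeset suffix; not handled in this sketch
    # The codeset suffix is matched case-insensitively through the aliases.
    canonical = CODESET_ALIASES.get(codeset.lower())
    if canonical is None:
        return None
    # The language and territory part is matched exactly (case-sensitive).
    if (base, canonical) not in KNOWN_LOCALES:
        return None
    return base, canonical
```

With these tables, "ja_JP.eUc-jP" and
"ja_JP.eXTENDED_unix_cODE_pACKED_fORMAT_FOR_jAPANESE" both resolve to
the same locale, while "ja_jp.EUC-JP" is rejected. Supporting the IANA
names properly means maintaining such an alias table for every codeset.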
-- 
soda
