Date:      Wed, 19 Sep 2007 11:12:07 +0900
From:      Taku YAMAMOTO <taku@tackymt.homeip.net>
To:        Andrey Chernov <ache@nagual.pp.ru>
Cc:        i18n@FreeBSD.ORG, Petr Hroudný <petr.hroudny@gmail.com>, perky@FreeBSD.ORG, current@FreeBSD.ORG
Subject:   Re: Ctype patch for review
Message-ID:  <20070919111207.f37653fc.taku@tackymt.homeip.net>
In-Reply-To: <20070917171633.GA31179@nagual.pp.ru>
References:  <20070916192924.GA12678@nagual.pp.ru> <ab8fc7f50709170129p6f436069iffaf697e83a34e3c@mail.gmail.com> <20070917092130.GA24424@nagual.pp.ru> <20070918020100.d43beb0b.taku@tackymt.homeip.net> <20070917171633.GA31179@nagual.pp.ru>

On Mon, 17 Sep 2007 21:16:33 +0400
Andrey Chernov <ache@nagual.pp.ru> wrote:

> On Tue, Sep 18, 2007 at 02:01:00AM +0900, YAMAMOTO, Taku wrote:
> > Checking for __mb_cur_max is not enough for certain locales.
> > For example, SJIS has following range for JIS X0201 (a.k.a. HALFWIDTH KANA).
> > 
> > /*
> >  * JIS X201
> >  */
> > PUNCT           0xa1-0xa5
> > SPACE           0xa0
> > BLANK           0xa0
> > SPECIAL         0xa1-0xdf
> > PHONOGRAM       0xa6-0xdf
> > SWIDTH1         0xa0-0xdf
> 
> I don't understand your remark. MSKanji have __mb_cur_max = 2 and so those 
> ranges are wchar_t ranges. My patch restrict unsigned char ranges only.

These characters ARE single-byte, even though __mb_cur_max is 2.
The problem is that, in that locale, a byte >= 0x80 is not always part
of a multi-byte character.
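
To illustrate, here is a minimal sketch of the byte classes involved,
assuming the usual Shift_JIS layout (lead bytes 0x81-0x9f and 0xe0-0xfc,
single-byte kana 0xa1-0xdf). The function names are hypothetical and are
not taken from the patch or from libc.

```c
#include <stdbool.h>

/*
 * Assumed Shift_JIS layout; illustrative only, not code from the patch.
 */

/* True if b starts a two-byte Shift_JIS character. */
bool
sjis_is_lead_byte(unsigned char b)
{
	return (b >= 0x81 && b <= 0x9f) || (b >= 0xe0 && b <= 0xfc);
}

/* True if b is by itself a complete JIS X0201 kana character. */
bool
sjis_is_single_byte_kana(unsigned char b)
{
	return b >= 0xa1 && b <= 0xdf;
}
```

A byte such as 0xb1 is >= 0x80 yet is not a lead byte, so a check based
only on __mb_cur_max would treat it as having no single-byte class.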


-- 
-|-__   YAMAMOTO, Taku
 | __ <     <taku@tackymt.homeip.net>

      - A chicken is an egg's way of producing more eggs. -


