Date:      Tue, 18 Sep 2007 02:01:00 +0900
From:      "YAMAMOTO, Taku" <taku@tackymt.homeip.net>
To:        Andrey Chernov <ache@nagual.pp.ru>
Cc:        current@freebsd.org, i18n@freebsd.org, Petr Hroudný <petr.hroudny@gmail.com>, perky@freebsd.org
Subject:   Re: Ctype patch for review
Message-ID:  <20070918020100.d43beb0b.taku@tackymt.homeip.net>
In-Reply-To: <20070917092130.GA24424@nagual.pp.ru>
References:  <20070916192924.GA12678@nagual.pp.ru> <ab8fc7f50709170129p6f436069iffaf697e83a34e3c@mail.gmail.com> <20070917092130.GA24424@nagual.pp.ru>

On Mon, 17 Sep 2007 13:21:30 +0400
Andrey Chernov <ache@nagual.pp.ru> wrote:

> On Mon, Sep 17, 2007 at 10:29:21AM +0200, Petr Hroudný wrote:
> > 2007/9/16, Andrey Chernov <ache@nagual.pp.ru>:
> > > The problem is: currently our single byte ctype functions are broken for
> > > wide characters locales in the argument range >= 0x80 - they may return
> > > false positives.
> > >
> > > For example, for UTF-8 locale we currently have:
> > > iswspace(0xA0)==1 and isspace(0xA0)==1
> > > (because iswspace() and isspace() are the same code)
> > > but must have
> > > isspace(0xA0)==0
> > 
> > This is exactly what happens on other OSes and I agree this is the
> > right behaviour
> > for UTF-8. However, we must ensure, that:
> > 
> > for C locale:  isspace(0xA0)==0
> > for ISO8859-* locales: isspace(0xA0)==1
> > for UTF-8 locales: isspace(0xA0)==0
> 
> The patch test for wide char locale presence first (__mb_cur_max > 1), so 
> does not affect single byte locales like ISO8859-*
> 

Checking __mb_cur_max alone is not enough for certain locales.
For example, SJIS (a multibyte locale) has the following single-byte ranges
for JIS X0201 (a.k.a. HALFWIDTH KANA).

/*
 * JIS X201
 */
PUNCT           0xa1-0xa5
SPACE           0xa0
BLANK           0xa0
SPECIAL         0xa1-0xdf
PHONOGRAM       0xa6-0xdf
SWIDTH1         0xa0-0xdf
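
To illustrate the concern, here is a rough sketch in C. It is not the patch
under review: isspace_mbmax_only() is a hypothetical stand-in for a check
that looks only at the multibyte maximum, isspace_table() is a hypothetical
alternative that consults the locale table for single-byte values, and
sjis_table_isspace() hard-codes just the SPACE entry from the table above.
I assume mb_cur_max is 2 for SJIS.

```c
#include <stdbool.h>

/* SPACE class for bytes >= 0x80, copied from the SJIS table above. */
static bool sjis_table_isspace(int c)
{
    return c == 0xa0;
}

/* ASCII whitespace: space, \t, \n, \v, \f, \r. */
static bool ascii_isspace(int c)
{
    return c == ' ' || (c >= '\t' && c <= '\r');
}

/*
 * Hypothetical check based on the multibyte maximum alone: in any
 * locale with mb_cur_max > 1, every byte >= 0x80 is rejected.
 */
static bool isspace_mbmax_only(int c, int mb_cur_max)
{
    if (mb_cur_max > 1 && c >= 0x80)
        return false;
    return ascii_isspace(c) || sjis_table_isspace(c);
}

/*
 * Hypothetical alternative: trust the locale table for single-byte
 * values, regardless of mb_cur_max.
 */
static bool isspace_table(int c)
{
    return ascii_isspace(c) || sjis_table_isspace(c);
}
```

With mb_cur_max == 2, isspace_mbmax_only(0xa0, 2) is false although the
table classifies 0xa0 as SPACE, while isspace_table(0xa0) is true.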


-- 
-|-__   YAMAMOTO, Taku
 | __ <     <taku@tackymt.homeip.net>

      - A chicken is an egg's way of producing more eggs. -


