Date:      Wed, 18 Jun 2008 15:49:17 +0400
From:      Andrey Chernov <ache@nagual.pp.ru>
To:        Dag-Erling Smørgrav <des@des.no>
Cc:        Doug Barton <dougb@FreeBSD.org>, current@FreeBSD.org, Konrad Jankowski <konrad.jankowski@bluemedia.pl>, Diomidis Spinellis <dds@aueb.gr>, hackers@FreeBSD.org, Gabor Kovesdan <gabor@FreeBSD.org>, Max Khon <fjoe@samodelkin.net>, "Sean C. Farley" <scf@FreeBSD.org>, Kövesdán Gábor <gabor@t-hosting.hu>
Subject:   Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]
Message-ID:  <20080618114917.GB89383@nagual.pp.ru>
In-Reply-To: <86skvbc9gn.fsf@ds4.des.no>
References:  <48577510.4020007@aueb.gr> <48577BD2.4070205@bluemedia.pl> <20080617102900.GA46479@nagual.pp.ru> <485798C4.2050605@FreeBSD.org> <20080618055851.GA85018@nagual.pp.ru> <86zlpjduew.fsf@ds4.des.no> <20080618083739.GA87100@nagual.pp.ru> <867icndqv5.fsf@ds4.des.no> <4858DBF6.5070001@bluemedia.pl> <86skvbc9gn.fsf@ds4.des.no>

On Wed, Jun 18, 2008 at 12:40:24PM +0200, Dag-Erling Smørgrav wrote:
> For grep, I believe it should simply be a matter of calling setlocale(),
> using wide strings, and using a multibyte regex engine (for appropriate
> values of "simply").

See my previous reply for more details. Using wide strings is not so easy: 
e.g., every ctype call BSD grep now uses would have to be converted to its 
wctype equivalent, input conversion added, etc.
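
A minimal sketch of what that conversion involves, assuming a hypothetical
helper count_alpha() that is not taken from BSD grep: the input is first
converted with mbstowcs(), then classified with iswalpha() where byte-oriented
code would call isalpha(). Calling mbstowcs() with a NULL destination to get
the length is a POSIX extension, not plain ISO C.

```c
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (illustration only, not BSD grep code): count the
 * alphabetic characters in a multibyte string.  The string is converted
 * to wide characters and classified with iswalpha() instead of isalpha().
 * Returns -1 if the input is invalid in the current LC_CTYPE locale or
 * if memory cannot be allocated.
 */
long
count_alpha(const char *s)
{
	/* POSIX: a NULL destination returns the required length. */
	size_t n = mbstowcs(NULL, s, 0);
	if (n == (size_t)-1)
		return (-1);

	wchar_t *w = malloc((n + 1) * sizeof(*w));
	if (w == NULL)
		return (-1);
	mbstowcs(w, s, n + 1);		/* input conversion step */

	long count = 0;
	for (size_t i = 0; i < n; i++)
		if (iswalpha((wint_t)w[i]))	/* wctype, not ctype */
			count++;

	free(w);
	return (count);
}
```

The caller is expected to have run setlocale(LC_CTYPE, "") beforehand;
without it the program stays in the "C" locale and only ASCII input is
classified reliably.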

> Another thing I'm unsure about is the matter of input and output.  Do
> mbstowcs() / mbtowc() simply trust the input to conform to LC_CTYPE and
> convert accordingly?  When reading UTF, do they recognize and handle

They fail with EILSEQ on an invalid sequence: the call returns -1 (cast to 
size_t for mbstowcs()) and sets errno to EILSEQ.
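
As a rough sketch of how a caller could locate the offending bytes, the
hypothetical helper below (first_bad_offset() is an invented name) walks a
buffer with mbrtowc(), the restartable variant of mbtowc(), which is swapped
in here because it distinguishes an invalid sequence from a truncated one.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (illustration only): return the byte offset of the
 * first invalid or incomplete multibyte sequence in buf (len bytes) under
 * the current LC_CTYPE locale, or -1 if the whole buffer converts cleanly.
 */
long
first_bad_offset(const char *buf, size_t len)
{
	mbstate_t st;
	size_t off = 0;

	memset(&st, 0, sizeof(st));	/* start in the initial shift state */
	while (off < len) {
		wchar_t wc;
		size_t r = mbrtowc(&wc, buf + off, len - off, &st);

		/* (size_t)-1: invalid sequence, errno is set to EILSEQ.
		 * (size_t)-2: buffer ends in the middle of a character. */
		if (r == (size_t)-1 || r == (size_t)-2)
			return ((long)off);

		/* r == 0 means an embedded NUL byte was converted. */
		off += (r == 0) ? 1 : r;
	}
	return (-1);
}
```

Which bytes count as invalid depends on the locale in effect, so the result
for non-ASCII input differs between, say, a UTF-8 locale and the "C" locale.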

-- 
http://ache.pp.ru/


