From: Andrey Chernov
Date: Wed, 18 Jun 2008 15:49:17 +0400
To: Dag-Erling Smørgrav
Cc: Doug Barton, current@FreeBSD.org, Konrad Jankowski, Diomidis Spinellis, hackers@FreeBSD.org, Gabor Kovesdan, Max Khon, "Sean C. Farley", Kövesdán Gábor
Subject: Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]
Message-ID: <20080618114917.GB89383@nagual.pp.ru>
In-Reply-To: <86skvbc9gn.fsf@ds4.des.no>
List-Id: Technical Discussions relating to FreeBSD

On Wed, Jun 18, 2008 at 12:40:24PM +0200, Dag-Erling Smørgrav wrote:
> For grep, I believe it should simply be a matter of calling setlocale(),
> using wide strings, and using a multibyte regex engine (for appropriate
> values of "simply").

See my previous reply for more details. Using wide strings is not so easy:
e.g., every ctype function BSD grep currently uses would have to be converted
to its wctype counterpart, input conversion added, etc.

> Another thing I'm unsure about is the matter of input and output. Do
> mbstowcs() / mbtowc() simply trust the input to conform to LC_CTYPE and
> convert accordingly? When reading UTF, do they recognize and handle

No, they do not trust the input. On an invalid sequence they return -1
(mbstowcs() returns (size_t)-1) and set errno to EILSEQ.

-- 
http://ache.pp.ru/
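
For illustration only, here is a minimal sketch of the two points above: converting input to wide characters with an explicit check for EILSEQ, and classifying with a wctype function (iswalpha()) instead of its ctype counterpart. The function name count_alpha_mb() is hypothetical and is not taken from BSD grep. The sketch uses mbrtowc(), the restartable variant of the mbtowc() discussed above, so it is a swapped-in call rather than what grep necessarily does.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not from BSD grep): count alphabetic characters
 * in the first len bytes of s, decoded according to the current
 * LC_CTYPE locale.
 *
 * Returns the count, or -1 with errno set to EILSEQ if the input
 * contains an invalid or truncated multibyte sequence.
 */
long
count_alpha_mb(const char *s, size_t len)
{
    mbstate_t st;
    wchar_t wc;
    long count = 0;

    assert(s != NULL);
    memset(&st, 0, sizeof(st));     /* start in the initial shift state */

    while (len > 0) {
        size_t n = mbrtowc(&wc, s, len, &st);

        if (n == (size_t)-1)
            return -1;              /* invalid sequence: errno is EILSEQ */
        if (n == (size_t)-2) {
            errno = EILSEQ;         /* sequence cut off at end of buffer */
            return -1;
        }
        if (n == 0)
            n = 1;                  /* embedded NUL: step over one byte */

        if (iswalpha((wint_t)wc))   /* wctype replacement for isalpha() */
            count++;

        s += n;
        len -= n;
    }
    return count;
}
```

A caller would normally run setlocale(LC_CTYPE, "") once at startup so that the conversion follows the user's locale. Under a UTF-8 locale, a stray 0xFF byte in the input should then make the function return -1 with errno set to EILSEQ, since 0xFF never appears in valid UTF-8.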