Date:      Thu, 04 Sep 2008 13:11:19 +0200
From:      Gabor Kovesdan <gabor@kovesdan.org>
To:        Andrey Chernov <ache@nagual.pp.ru>, Gabor Kovesdan <gabor@kovesdan.org>, hackers@freebsd.org, Max Khon <fjoe@freebsd.org>, dougb@freebsd.org,  krion@freebsd.org, current@freebsd.org
Subject:   Re: CFT: BSD grep
Message-ID:  <48BFC257.2010000@kovesdan.org>
In-Reply-To: <20080827013221.GA82176@nagual.pp.ru>
References:  <48B44A7D.3070108@kovesdan.org> <20080827013221.GA82176@nagual.pp.ru>

Andrey Chernov wrote:
> Just from a quick look at the sources...
>
> This code looks suspicious:
>
> wend = sscanf(&l->dat[pmatch.rm_eo], "%lc", &wend);
>
> Perhaps it should be
>
> if (sscanf(&l->dat[pmatch.rm_eo], "%lc", &wend) != 1)
> 	r = REG_NOMATCH;
>
> The next thing is that perhaps each r = REG_NOMATCH; case should be 
> isolated from the others in this block (with "else if"?).
> E.g. a failing mbstowcs() can leave the buffer for sscanf() filled 
> with junk.
>
> wbegin = grep_malloc(mbstowcs(NULL, l->dat, pmatch.rm_so));
>
> grep_malloc() here could terminate the program when mbstowcs() hits an 
> invalid sequence, but it really must only set r = REG_NOMATCH;
>
> Think about files which, for various reasons, may contain more than 
> just valid MB sequences.
>
> fgrepcomp() uses toupper()/tolower() while it should use their 
> wide-character analogs (MB chars can be in the pattern too). There are 
> also many other places where the pattern is treated as a string of 
> single-byte chars: fastcomp() etc. grep_cmp() compares single chars 
> with toupper(data[]) too. There must be no plain ctype usage in the 
> whole data _and_ pattern handling code.
>   
Hello Andrey,

thanks for the detailed description of the current deficiencies; I'll 
fix them soon. I've been busy moving to another flat, which is why I 
haven't replied until now. Sorry about that.

Gábor
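
The two points raised in the quoted review (treat a failed multibyte
decode as "no match" rather than using an unset value, and fold case on
wide characters rather than on bytes) could be sketched roughly as
follows. This is only an illustration, not the code from BSD grep: the
helper names decode_wchar_at() and mb_caseeq() are hypothetical, and
mbrtowc() is used here in place of the sscanf("%lc") call from the
quoted source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: decode the multibyte character that starts at
 * s[off] in a buffer of len bytes.  Returns 0 and stores the character
 * in *out on success.  Returns -1 at the end of the buffer or on an
 * invalid or truncated sequence, so a caller can set r = REG_NOMATCH
 * instead of looking at an unset wide character.
 */
int
decode_wchar_at(const char *s, size_t len, size_t off, wchar_t *out)
{
	mbstate_t st;
	size_t n;

	assert(s != NULL && out != NULL);
	if (off >= len)
		return (-1);
	memset(&st, 0, sizeof(st));
	n = mbrtowc(out, s + off, len - off, &st);
	if (n == (size_t)-1 || n == (size_t)-2)
		return (-1);
	return (0);
}

/*
 * Hypothetical helper: compare two NUL-terminated multibyte strings
 * without regard to case, folding with towlower() on decoded wide
 * characters instead of tolower() on single bytes.  Returns 1 if they
 * are equal, 0 if they differ, and -1 if either one contains an
 * invalid sequence.
 */
int
mb_caseeq(const char *a, const char *b)
{
	mbstate_t sa, sb;
	wchar_t wa, wb;
	size_t na, nb;

	assert(a != NULL && b != NULL);
	memset(&sa, 0, sizeof(sa));
	memset(&sb, 0, sizeof(sb));
	for (;;) {
		na = mbrtowc(&wa, a, MB_CUR_MAX, &sa);
		nb = mbrtowc(&wb, b, MB_CUR_MAX, &sb);
		if (na == (size_t)-1 || na == (size_t)-2 ||
		    nb == (size_t)-1 || nb == (size_t)-2)
			return (-1);
		if (towlower((wint_t)wa) != towlower((wint_t)wb))
			return (0);
		if (na == 0)	/* both strings ended together */
			return (1);
		a += na;
		b += nb;
	}
}
```

Both helpers follow the current LC_CTYPE locale, so a program would
need to call setlocale(LC_CTYPE, "") first for non-ASCII input to be
decoded as anything other than the default "C" locale allows.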



