Date:      Tue, 17 Aug 2010 12:32:52 -0500
From:      Alan Cox <alan.l.cox@gmail.com>
To:        Kostik Belousov <kostikbel@gmail.com>
Cc:        Dimitry Andric <dimitry@andric.com>, current@freebsd.org
Subject:   Re: Official request: Please make GNU grep the default
Message-ID:  <AANLkTim6ogFvM-BLRXQcJOtKS4vK=ZH1H30=_t8KcqVM@mail.gmail.com>
In-Reply-To: <20100817154537.GM2396@deviant.kiev.zoral.com.ua>
References:  <20100813085235.GA16268@freebsd.org> <4C66C010.3040308@FreeBSD.org> <4C673F02.8000805@FreeBSD.org> <20100815013438.GA8958@troutmask.apl.washington.edu> <4C67492C.5020206@FreeBSD.org> <B7A05068-9578-4341-851B-86BD9BC7A2DA@gmail.com> <8639ufd78w.fsf@ds4.des.no> <4C6844D8.5070602@andric.com> <86sk2faqdl.fsf@ds4.des.no> <4C6AAA88.5080606@andric.com> <20100817154537.GM2396@deviant.kiev.zoral.com.ua>

On Tue, Aug 17, 2010 at 10:45 AM, Kostik Belousov <kostikbel@gmail.com> wrote:

> [Cc: list sanitized]
>
> On Tue, Aug 17, 2010 at 05:28:08PM +0200, Dimitry Andric wrote:
> > On 2010-08-16 10:55, Dag-Erling Smørgrav wrote:
> > > Dimitry Andric <dimitry@andric.com> writes:
> > >> - Uses plain file descriptors instead of struct FILE, since the
> > >>   buffering is done manually anyway, and it makes it easier to support
> > >>   gzip and bzip2.
> > > It might be worth a shot adding mmap(2) support as well, i.e. when
> > > processing an uncompressed regular file, try to mmap(2) it first, and if
> > > that fails, fall back to the plain buffered read(2) method.
> >
> > I added a simple mmap to grep, and time-trialed it, but the mmap version
> > was somewhat slower than the regular version.  I understood from Kostik
> > Belousov that readahead does not work properly with mmap, and it should
> > not be used for "one-time" reads.
> This is not exactly what I said. I argue that read-ahead implemented
> by vm_fault() is much less efficient than buffer clustering. Also,
> the cost of setting up a user mapping for a one-time read is non-trivial.
> The conclusion is right, it is better to use read(2) for one-time read.
>

The mapping (and unmapping) costs should be relatively small if the contents
of the file can be prefaulted using 2/4MB pages.  In such cases, we still
touch every struct vm_page in the 2/4MB region, but we only create and
destroy one PTE and PV entry, and perform a single INVLPG.

Alan
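
For readers following the thread, the "try mmap(2) first, fall back to
read(2)" approach discussed above can be sketched roughly as follows. This
is only an illustration under stated assumptions, not the code from the
grep patch: the helper names count_newlines() and count_lines(), the 64 KB
buffer size, and the choice of counting newlines as the "work" are all
hypothetical. It makes no attempt to request superpage prefaulting.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: count '\n' bytes in a buffer. */
static size_t count_newlines(const char *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\n')
            n++;
    return n;
}

/*
 * Hypothetical helper: count the lines in `path`.
 * For a non-empty regular file, try mmap(2) first; if the mapping
 * fails (or the file is not a regular file), fall back to a plain
 * buffered read(2) loop. Returns the count, or -1 on error.
 */
static long count_lines(const char *path)
{
    struct stat st;
    long total = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    if (fstat(fd, &st) == 0 && S_ISREG(st.st_mode) && st.st_size > 0) {
        size_t len = (size_t)st.st_size;
        void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            total = (long)count_newlines(p, len);
            munmap(p, len);
            close(fd);
            return total;
        }
        /* mmap failed: fall through to the read(2) path. */
    }

    char buf[65536];            /* assumed buffer size */
    ssize_t r;
    while ((r = read(fd, buf, sizeof buf)) > 0)
        total += (long)count_newlines(buf, (size_t)r);
    close(fd);
    return r < 0 ? -1 : total;
}
```

Whether the mmap path is actually faster depends on the points raised in
the thread: the cost of setting up and tearing down the mapping, and how
well fault-driven read-ahead compares with buffer clustering on the
read(2) path.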


