FreeBSD Mail Archives

Date:      Wed, 18 Aug 2010 23:25:49 +0200
From:      Dimitry Andric <dimitry@andric.com>
To:        mdf@FreeBSD.org
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Gabor Kovesdan <gabor@freebsd.org>
Subject:   Re: svn commit: r211463 - head/usr.bin/grep
Message-ID:  <4C6C4FDD.8080803@andric.com>
In-Reply-To: <AANLkTimjHt9NZa0-vU%2Bm2dkY2pTciUDLGd0Qut=uhFTq@mail.gmail.com>
References:  <201008181740.o7IHeA4c075984@svn.freebsd.org> <AANLkTimjHt9NZa0-vU%2Bm2dkY2pTciUDLGd0Qut=uhFTq@mail.gmail.com>

On 2010-08-18 22:48, mdf@FreeBSD.org wrote:
>>  - Refactor file reading code to use pure syscalls and an internal buffer
>>    instead of stdio.  This gives BSD grep a very big performance boost,
>>    its speed is now almost comparable to GNU grep.
> 
> I didn't read all of the details in the profiling mails in the thread,
> but does this mean that work on stdio would give a performance boost
> to many apps?  Or is there something specific about how grep(1) is
> using its input that makes it a horse of a different color?

Originally, it was reading files 1 character at a time, using fgetc(3),
the locking version even.  This is usually not the fastest way to read
a large file with stdio. :)
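
As a rough illustration of the per-character cost (a minimal sketch, not
the code that was in grep), the two loops below count newlines in a stream.
The helper names count_lines_locked and count_lines_unlocked are
hypothetical. In a thread-safe libc, fgetc(3) takes and releases the stream
lock on every call, while the second loop takes the lock once with
flockfile(3) and then uses getc_unlocked(3):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: count newlines one character at a time with
 * fgetc(3), the locking variant.
 */
long
count_lines_locked(FILE *fp)
{
	long n = 0;
	int c;

	assert(fp != NULL);
	while ((c = fgetc(fp)) != EOF)
		if (c == '\n')
			n++;
	return (n);
}

/*
 * Hypothetical helper: the same loop, but the stream lock is taken
 * once around the whole loop instead of once per character.
 */
long
count_lines_unlocked(FILE *fp)
{
	long n = 0;
	int c;

	assert(fp != NULL);
	flockfile(fp);
	while ((c = getc_unlocked(fp)) != EOF)
		if (c == '\n')
			n++;
	funlockfile(fp);
	return (n);
}
```

Both functions return the same count; the difference is only in how often
the lock is taken, and even the unlocked loop still does one call per
character.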

If grep did not have to support .gz or .bz2 files, we could just have
plugged in stdio's fgetln(3).  I tried this approach first on some
non-compressed files, and it performed much better than fgetc'ing.

The reading code that has now been committed uses basically the same
algorithm as fgetln() does internally, but it can handle gzip and bzip2
input too.
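
The general idea of a syscall-based line reader with an internal buffer can
be sketched as follows. This is an assumption-laden illustration, not the
committed grep code: the names struct linereader, lr_init, lr_getln, lr_free
and LR_BUFSIZ are hypothetical, the buffer size is arbitrary, and only plain
read(2) is shown. Supporting compressed input would presumably mean
replacing the read(2) call in the refill step with a decompressing read.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define LR_BUFSIZ 32768		/* hypothetical buffer size */

struct linereader {
	int	 fd;
	char	 buf[LR_BUFSIZ];
	size_t	 pos;		/* next unread byte in buf */
	size_t	 len;		/* number of valid bytes in buf */
	char	*line;		/* spill area for lines that span refills */
	size_t	 linecap;	/* allocated size of the spill area */
};

void
lr_init(struct linereader *lr, int fd)
{
	assert(fd >= 0);
	lr->fd = fd;
	lr->pos = lr->len = 0;
	lr->line = NULL;
	lr->linecap = 0;
}

void
lr_free(struct linereader *lr)
{
	free(lr->line);
	lr->line = NULL;
	lr->linecap = 0;
}

/* Append n bytes at offset `used` of the spill area, growing it if needed. */
static int
lr_spill(struct linereader *lr, size_t used, const char *src, size_t n)
{
	if (used + n > lr->linecap) {
		size_t ncap = lr->linecap ? lr->linecap : 256;
		char *p;

		while (ncap < used + n)
			ncap *= 2;
		if ((p = realloc(lr->line, ncap)) == NULL)
			return (-1);
		lr->line = p;
		lr->linecap = ncap;
	}
	memcpy(lr->line + used, src, n);
	return (0);
}

/*
 * Return a pointer to the next line (newline included, not NUL
 * terminated) and store its length in *lenp.  The pointer is only
 * valid until the next call.  Returns NULL at end of input; a read
 * error is treated like end of input in this sketch.
 */
char *
lr_getln(struct linereader *lr, size_t *lenp)
{
	size_t used = 0;	/* bytes already copied to the spill area */

	for (;;) {
		if (lr->pos == lr->len) {
			/* Buffer exhausted: refill it with one read(2). */
			ssize_t n = read(lr->fd, lr->buf, sizeof(lr->buf));

			if (n <= 0) {
				if (used == 0)
					return (NULL);
				*lenp = used;	/* last line, no newline */
				return (lr->line);
			}
			lr->pos = 0;
			lr->len = (size_t)n;
		}

		char *start = lr->buf + lr->pos;
		size_t avail = lr->len - lr->pos;
		char *nl = memchr(start, '\n', avail);
		size_t take = nl != NULL ? (size_t)(nl - start) + 1 : avail;

		lr->pos += take;
		if (nl != NULL && used == 0) {
			/* Fast path: the whole line is in buf, no copy. */
			*lenp = take;
			return (start);
		}
		if (lr_spill(lr, used, start, take) != 0)
			return (NULL);
		used += take;
		if (nl != NULL) {
			*lenp = used;
			return (lr->line);
		}
	}
}
```

The common case returns a pointer straight into the internal buffer, so a
line is copied only when it straddles a buffer refill. A caller would
typically loop with `while ((p = lr_getln(&lr, &n)) != NULL)` and call
lr_free() when done.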

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4C6C4FDD.8080803>
