Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 28 Jul 1999 13:42:43 -0400
From:      "Mark Dickey" <mark@suarez.bestweb.net>
To:        <freebsd-hackers@FreeBSD.ORG>
Subject:   Re: replacing grep(1)
Message-ID:  <012801bed920$99256c20$23645ed1@bestweb.net>
References:  <Pine.GSO.4.10.9907272344440.19477-100000@rac9.wam.umd.edu> <379F3701.F79E10DF@newsguy.com>

next in thread | previous in thread | raw e-mail | index | archive | help
I expect that there is a very good reason why this shouldn't be done,
but could it be possible to implement two different algorithms/code
dependant on the size of the file being grepped?

Mark Dickey
mark@bestweb.net

Daniel C. Sobral wrote:
> James Howard wrote:
> >
> > Due to the discussion of speed, I have been looking at it and it is
really
> > slow.  Even slower than I thought and I was thinking it was pretty slow.
> >
> > So using gprof, I have discovered that it seems to spend a whole mess of
> > time in grep_malloc() and free().  So I pulled all the references to
> > malloc inside the main loop (the copy for ln.dat and removed queueing).
> > This stills leaves us with a grep that is about ~6x slower than GNU.
> > Before that, it ran closer to 80x.  After this, gprof says it spends
> > around 53% of its time in procline().
>
> Sorry, but a simplistic analysis like that just won't cut for grep.
> The algorithmic complexity is highly relevant here. Try this:
> generate a 1 Mb file, and then generate 10 Mb and 50 Mb files by
> concatenating that first file. Benchmark yours and GNU grep a number
> of times to get the average for each file. Now compare the
> *proportions* between the different sized files. Are they the same?
>
> Next, try different sized patterns on the 50 Mb file on both yours
> and GNU grep. Again, compare the proportion.
>
> Next, compare patterns with different number of "wildcards",
> patterns with things like [acegikmoqsuvxz] vs
> [acegikmoqsuvxzACEGIKMOQSUVXZ], etc.
>
> Either that, or do a complexity analysis of the algorithms. :-)
>
> (In case anyone reading this discussion wants to know more about
> complexity of algorithms, I recommend Computer Algorithms,
> Introduction to Design and Analysis, by Sara Baase, Addison Wesley.)
>
> --
> Daniel C. Sobral (8-DCS)
> dcs@newsguy.com
> dcs@freebsd.org
>
> "Is it true that you're a millionaire's son who never worked a day
> in your life?"
> "Yeah, I guess so."
> "Lemme tell you, son, you ain't missed a thing."
>
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message
>



To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?012801bed920$99256c20$23645ed1>