Date: Wed, 28 Jul 1999 13:42:43 -0400 From: "Mark Dickey" <mark@suarez.bestweb.net> To: <freebsd-hackers@FreeBSD.ORG> Subject: Re: replacing grep(1) Message-ID: <012801bed920$99256c20$23645ed1@bestweb.net> References: <Pine.GSO.4.10.9907272344440.19477-100000@rac9.wam.umd.edu> <379F3701.F79E10DF@newsguy.com>
next in thread | previous in thread | raw e-mail | index | archive | help
I expect that there is a very good reason why this shouldn't be done, but could it be possible to implement two different algorithms/code dependant on the size of the file being grepped? Mark Dickey mark@bestweb.net Daniel C. Sobral wrote: > James Howard wrote: > > > > Due to the discussion of speed, I have been looking at it and it is really > > slow. Even slower than I thought and I was thinking it was pretty slow. > > > > So using gprof, I have discovered that it seems to spend a whole mess of > > time in grep_malloc() and free(). So I pulled all the references to > > malloc inside the main loop (the copy for ln.dat and removed queueing). > > This stills leaves us with a grep that is about ~6x slower than GNU. > > Before that, it ran closer to 80x. After this, gprof says it spends > > around 53% of its time in procline(). > > Sorry, but a simplistic analysis like that just won't cut for grep. > The algorithmic complexity is highly relevant here. Try this: > generate a 1 Mb file, and then generate 10 Mb and 50 Mb files by > concatenating that first file. Benchmark yours and GNU grep a number > of times to get the average for each file. Now compare the > *proportions* between the different sized files. Are they the same? > > Next, try different sized patterns on the 50 Mb file on both yours > and GNU grep. Again, compare the proportion. > > Next, compare patterns with different number of "wildcards", > patterns with things like [acegikmoqsuvxz] vs > [acegikmoqsuvxzACEGIKMOQSUVXZ], etc. > > Either that, or do a complexity analysis of the algorithms. :-) > > (In case anyone reading this discussion wants to know more about > complexity of algorithms, I recommend Computer Algorithms, > Introduction to Design and Analysis, by Sara Baase, Addison Wesley.) > > -- > Daniel C. Sobral (8-DCS) > dcs@newsguy.com > dcs@freebsd.org > > "Is it true that you're a millionaire's son who never worked a day > in your life?" > "Yeah, I guess so." > "Lemme tell you, son, you ain't missed a thing." > > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-hackers" in the body of the message > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?012801bed920$99256c20$23645ed1>