Date: Tue, 10 Jun 2008 01:44:36 +1000 (EST)
From: Ian Smith
To: Bill Campbell
Cc: Jos Chrispijn, Wojciech Puchar, Raphael Becker, freebsd-questions@freebsd.org
In-Reply-To: <20080608231711.053E310656E1@hub.freebsd.org>
Subject: Re: Grep Guru

On Sun, 8 Jun 2008 16:07:12 -0700 Bill Campbell wrote:
 > On Mon, Jun 09, 2008, Raphael Becker wrote:
 > >On Sun, Jun 08, 2008 at 10:15:50PM +0200, Wojciech Puchar wrote:
 > >> find . -type f -print0|xargs -0 grep
 > >
 > >There's no more need for find | xargs
 > >
 > >Try:
 > >
 > >find . -type -f -exec grep {} \+
 > >
 > >-exec foo {} \+    behaves like xargs foo
 > >-exec foo {} \;    exec foo for every file

Thanks for this kick; I'd missed or misunderstood using {} \+

 > The issue here is that grep execs grep for each file found while
 > xargs batches the files.

If find(1) is to be believed, so does -exec utility [argument ...] {} +

 > This is of particular importance if one wants to see the file
 > names in the output.  In relation to this, if one wants to be
 > sure that grep always generates the file name, insure that it
 > always gets at least two files as arguments:
 >
 >   find . -type f -print0 | xargs -0 grep pattern /dev/null

Another good clue.  Many ways to do anything; I've often used such as:

% find /sys/ -name "*.[chm]" -exec egrep -Hi 'CPUFREQ_[GS]ET' {} \;

which has grep print the filenames, rather than using -print with find,
but I've just now run the above find, then using \+ instead, twice each,
and am pleased to learn that the latter method runs ~4 times faster in
real time and is even lighter on the system:

% time find /sys/ -name "*.[chm]" -exec grep -Hi 'CPUFREQ_[GS]ET' {} \;
/sys/kern/kern_cpu.c:static int cpufreq_settings_sysctl(SYSCTL_HANDLER_ARGS);
[.. etc ..]
20.524u 46.205s 4:03.91 27.3% 79+201k 5698+0io 0pf+0w

% time find /sys/ -name "*.[chm]" -exec grep -Hi 'CPUFREQ_[GS]ET' {} \+
1.756u 3.058s 1:07.51 7.1% 81+290k 7148+0io 13pf+0w

% time find /sys/ -name "*.[chm]" -exec grep -Hi 'CPUFREQ_[GS]ET' {} \;
21.742u 44.382s 3:57.99 27.7% 79+200k 7144+0io 0pf+0w

% time find /sys/ -name "*.[chm]" -exec grep -Hi 'CPUFREQ_[GS]ET' {} \+
1.651u 3.134s 0:58.39 8.1% 75+267k 7149+0io 10pf+0w

(Ignore sloth; poor 300MHz Celeron already busy dumping /usr over nfs :)

 > FWIW, I have learned about gnu-grep's -r option reading this
 > thread, which I had not noticed previously.  I guess that just
 > goes to show that old habits die hard :-).

When you're on a good thing :) but always plenty new tricks to learn.

cheers, Ian
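
For anyone wanting to see the batching without a slow box and a stopwatch, here is a minimal sketch that counts how many times find actually runs the utility with each terminator, and checks that grep -H and the /dev/null trick print the same thing for a single file. The scratch directory, the file names f1.c .. f5.c and the variable names are made up for illustration; -H is a GNU/BSD grep extension rather than POSIX, so treat that part as an assumption about your grep. Note also that the quoted "find . -type -f ..." has a stray dash and no pattern; "-type f" followed by a pattern is presumably what was meant.

```sh
#!/bin/sh
# Illustrative sketch only: scratch files stand in for a real source tree.
tmp=$(mktemp -d)
for i in 1 2 3 4 5; do
    printf 'CPUFREQ_GET in file %s\n' "$i" > "$tmp/f$i.c"
done

# Terminator ";" : find starts the utility once per file found.
# Each "sh -c" prints one line, so the line count is the invocation count.
per_file=$(find "$tmp" -type f -name '*.c' \
    -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# Terminator "+" : find appends as many file names as fit, like xargs.
batched=$(find "$tmp" -type f -name '*.c' \
    -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

printf 'invocations with ";" : %s\n' "$per_file"   # 5, one per file
printf 'invocations with "+" : %s\n' "$batched"    # 1, all five batched

# Forcing the file name when grep is handed only one file:
# -H asks for it directly; an extra /dev/null argument makes grep
# see two files, so it prefixes names without needing -H.
with_H=$(find "$tmp" -name 'f1.c' -exec grep -H 'CPUFREQ_[GS]ET' {} +)
with_devnull=$(find "$tmp" -name 'f1.c' \
    -exec grep 'CPUFREQ_[GS]ET' /dev/null {} +)

printf '%s\n%s\n' "$with_H" "$with_devnull"

rm -rf "$tmp"
```

With only five small files the "+" form fits everything into one command line; on a large tree it would run the utility a handful of times rather than once, which is still far fewer than once per file and would account for the drop in user and system time in the figures above.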