From owner-freebsd-questions  Fri Mar 12 20:24:49 1999
Message-ID: <19990313042123.4443.qmail@alpha.comkey.com.au>
X-Posted-By: GBA-Post 1.04 06-Feb-1999
X-PGP-Fingerprint: 5A91 6942 8CEA 9DAB B95B  C249 1CE1 493B 2B5A CE30
Date: Sat, 13 Mar 1999 14:21:23 +1000
From: Greg Black <gjb@comkey.com.au>
To: Graeme Tait <graeme@echidna.com>
Cc: "questions@FreeBSD.ORG" <questions@FreeBSD.ORG>
Subject: Re: Use of pipe with gzip | more 
References: <36E94F09.C8A0DA32@echidna.com> 
In-reply-to: <36E94F09.C8A0DA32@echidna.com> 
    of Fri, 12 Mar 1999 09:29:45 PST
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> I was examining a large gzipped text file (about 6MB zipped) using
> 
> $ gzip -cd file.gz | more
> 
> and used "/[text]" to attempt to find a line that didn't exist. As indicated by
> top, "more" consumed all available CPU for a very long time, its PRI value
> rising to over 100 before it finally reported "Pattern not found". The elapsed
> time was a couple of orders of magnitude more than if I had unzipped the file
> first, and then run "more" on the unzipped file. However, some files I do this
> on are so large that unzipping first places a burden on available file space.
> 
> Is this a legitimate use of a pipe?

It's legitimate, but can be very inefficient if you have
insufficient memory and disk resources.  If you want to be able
to scroll back to the beginning of the file, more (or whatever
process is sitting in that part of the pipeline) has to buffer
the whole data set in memory, because it cannot seek backwards
on a pipe.  If you use it on real files, it doesn't need to do
that and its memory use can be much more restrained.

If this is something that you need to do and for which you need
better performance, you either need to develop better software
(which is non-trivial) or add more hardware (which is cheap and
simple).

If neither of these solutions appeals, consider using grep to
find the stuff you want.  Its options can show the matching
lines in context, and also line numbers or byte offsets, so you
can locate the parts you want and maybe extract larger clumps
from the input with dd.  This might take a bit of fiddling, but
it's a simple solution.
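
As a rough sketch of the grep and dd approach (the sample file,
the pattern "needle" and the byte count are made up for
illustration, and a real file would want a larger dd block size
than the bs=1 used here for simplicity):

```shell
# Build a small stand-in for the large gzipped file (hypothetical data).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf 'alpha\nbeta\nneedle here\ngamma\ndelta\n' > file
gzip file                     # leaves file.gz, removes file

# Matching lines with line numbers (-n) and one line of context (-C 1).
context=$(gzip -cd file.gz | grep -n -C 1 'needle')
echo "$context"

# Byte offset (-b) of the start of the first matching line.
offset=$(gzip -cd file.gz | grep -b 'needle' | head -n 1 | cut -d: -f1)
echo "offset: $offset"

# Extract a clump of the uncompressed stream starting at that offset.
# bs=1 makes skip and count plain byte counts; it is slow on big inputs.
clump=$(gzip -cd file.gz | dd bs=1 skip="$offset" count=11 2>/dev/null)
echo "$clump"

# Tidy up the scratch directory.
cd / && rm -rf "$tmpdir"
```

Nothing here ever writes the uncompressed data to disk, and none
of the processes has to hold more than a small piece of it at a
time.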

-- 
Greg Black <gjb@acm.org>

