Date:      Wed, 3 Jan 2007 21:09:35 +0200
From:      Giorgos Keramidas <keramida@ceid.upatras.gr>
To:        Kurt Buff <kurt.buff@gmail.com>
Cc:        James Long <list@museum.rain.com>, freebsd-questions@freebsd.org
Subject:   Re: Batch file question - average size of file in directory
Message-ID:  <20070103190935.GA7164@kobe.laptop>
In-Reply-To: <a9f4a3860701031042u45757b7ag897d55e1969f84b8@mail.gmail.com>
References:  <20070102200721.31D1C16A517@hub.freebsd.org> <20070103035000.GA99263@ns.umpquanet.com> <a9f4a3860701031042u45757b7ag897d55e1969f84b8@mail.gmail.com>

On 2007-01-03 10:42, Kurt Buff <kurt.buff@gmail.com> wrote:
> On 1/2/07, James Long <list@museum.rain.com> wrote:
> <snip my problem description>
> >Hi, Kurt.
> >
> >Can I make some assumptions that simplify things?  No kinky filenames,
> >just [a-zA-Z0-9.].  My approach specifically doesn't like colons or
> >spaces, I bet.  Also, you say gzipped, so I'm assuming it's ONLY gzip,
> >no bzip2, etc.
>
> Right, no other compression types - just .gz.
>
> Here's a small snippet of the directory listing:
>
> -rw-r-----  1 kurt  kurt   108208 Dec 21 06:15 dummy-zKLQEWrDDOZh
> -rw-r-----  1 kurt  kurt    24989 Dec 28 17:29 dummy-zfzaEjlURTU1
> -rw-r-----  1 kurt  kurt    30596 Jan  2 19:37 stuff-0+-OvVrXcEoq.gz
> -rw-r-----  1 kurt  kurt     2055 Dec 22 20:25 stuff-0+19OXqwpEdH.gz
> -rw-r-----  1 kurt  kurt    13781 Dec 30 03:53 stuff-0+1bMFK2XvlQ.gz
> -rw-r-----  1 kurt  kurt    11485 Dec 20 04:40 stuff-0+5jriDIt0jc.gz
>
>> Here's a first draft [...]
>
> Hmmm....
>
> That's the same basic approach that Giorgos took, to uncompress the
> file and count bytes with wc. I'm liking the 'zcat -l' construct, as
> it looks more flexible, but then I have to parse the output, probably
> with grep and cut.

Excellent.  I didn't know about the -l option of gzip(1) until today :)

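You can easily extract the uncompressed size, because it's always in
column 2 and it contains only numeric digits.  Keep in mind that when
gzip -l lists more than one file it also prints a final "(totals)"
line, whose second column is numeric too, so that line has to be
dropped along with the header:

    gzip -l *.gz *.Z *.z | awk '$2 ~ /^[0-9]+$/ && $NF != "(totals)" {print $2}'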

Then you can feed the resulting stream of uncompressed sizes to the awk
script I sent before :)
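
For reference, here is a minimal sketch of such an averaging step.  It
assumes one size per line on standard input and is not necessarily the
script from the earlier message; the function name average_sizes is
made up for illustration:

```shell
# Hypothetical helper (name chosen for this example only): read one size
# per line on stdin, then print how many were read and their mean.
average_sizes() {
    awk '
        /^[0-9]+$/ { sum += $1; n++ }   # count only lines made up of digits
        END {
            if (n > 0)
                printf "%d files, average %.1f bytes\n", n, sum / n
            else
                print "no sizes read"
        }'
}

# Possible usage (commented out so the sketch has no side effects); the
# awk filter skips the "(totals)" line gzip -l prints for several files:
# gzip -l *.gz | awk '$NF != "(totals)" {print $2}' | average_sizes
```

Non-numeric lines such as the gzip -l header are simply ignored, so the
helper does not depend on how the sizes were filtered upstream.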
