Date:      Tue, 13 Apr 2004 10:53:40 -0700
From:      Tim Kientzle <tim@kientzle.com>
To:        "Brian F. Feldman" <green@FreeBSD.org>
Cc:        current@FreeBSD.org
Subject:   Re: cvs commit: src/usr.bin/tar Makefile bsdtar.1 bsdtar.c bsdtar.h bsdtar_platform.h matching.c read.c util.c write.c
Message-ID:  <407C2924.2050503@kientzle.com>
In-Reply-To: <200404090121.i391LRlr096539@green.homeunix.org>
References:  <200404090121.i391LRlr096539@green.homeunix.org>

Brian F. Feldman wrote:
> 
> ... bsdtar is ALWAYS faster than the (multi-process) tar tfy/xfy!

Well, well, well.  That's a pleasant surprise.

> ... it should be possible to get more speed out of bsdtar by actually pulling in 
> the entire size of a block ... it's silly to save a little bit of space by 
> using a very small "file read" buffer.  For S_ISREG() use no KB instead of 
> 10KB by using mmap(2), maybe...

The 10KB default buffer size is actually dictated by
various standards that were designed for old tape
drives.  I didn't do that to save space.

In fact, libarchive is designed to handle a wide
variety of I/O strategies.  Bsdtar is just a particularly
conservative client at the moment, one that uses a
convenience routine that just reads fixed-size blocks.
This is suitable for tape I/O, but we can do better
for files.

Libarchive calls a client-provided routine to read
each "block" of input, but puts no restrictions on
the size of those blocks.  (For example, I've tested
libarchive with a routine that always returns a single
byte.  Slow, but it works.)  It is perfectly reasonable
for the client routine to mmap() the entire file and
return the whole thing as a single block.
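As a rough illustration of that last point, a client read routine that
maps a regular file and hands it back as one block might look like the
sketch below.  The three callbacks follow the open/read/close callback
shapes that archive_read_open() accepts; struct mmap_source and the
mmap_source_* names are hypothetical, and the plain 0 / -1 return values
stand in for the status constants a real client would take from
<archive.h>.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

struct archive;  /* opaque here; the real declaration lives in <archive.h> */

/* Hypothetical per-archive state handed to libarchive as client_data. */
struct mmap_source {
    const char *path;   /* archive file to map */
    int fd;
    void *map;          /* MAP_FAILED until the file is mapped */
    size_t size;
    int delivered;      /* nonzero once the single block has been returned */
};

/* Open callback: map the whole regular file read-only. */
int mmap_source_open(struct archive *a, void *client_data)
{
    struct mmap_source *src = client_data;
    struct stat st;

    (void)a;
    src->map = MAP_FAILED;
    src->size = 0;
    src->delivered = 0;
    src->fd = open(src->path, O_RDONLY);
    if (src->fd < 0)
        return -1;
    if (fstat(src->fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        close(src->fd);
        src->fd = -1;
        return -1;
    }
    src->size = (size_t)st.st_size;
    if (src->size > 0) {            /* mmap() rejects zero-length maps */
        src->map = mmap(NULL, src->size, PROT_READ, MAP_PRIVATE, src->fd, 0);
        if (src->map == MAP_FAILED) {
            close(src->fd);
            src->fd = -1;
            return -1;
        }
    }
    return 0;
}

/* Read callback: the first call returns the entire file, later calls EOF. */
ssize_t mmap_source_read(struct archive *a, void *client_data,
                         const void **buff)
{
    struct mmap_source *src = client_data;

    (void)a;
    if (src->delivered || src->size == 0) {
        *buff = NULL;
        return 0;                   /* zero bytes signals end of input */
    }
    *buff = src->map;
    src->delivered = 1;
    return (ssize_t)src->size;
}

/* Close callback: drop the mapping and the descriptor. */
int mmap_source_close(struct archive *a, void *client_data)
{
    struct mmap_source *src = client_data;

    (void)a;
    if (src->map != MAP_FAILED)
        munmap(src->map, src->size);
    if (src->fd >= 0)
        close(src->fd);
    src->map = MAP_FAILED;
    src->fd = -1;
    return 0;
}
```

A client built against libarchive would presumably register these with
something like archive_read_open(a, &src, mmap_source_open,
mmap_source_read, mmap_source_close) and then read headers and data as
usual.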

Libarchive is a bit more restrictive on write than on
read; it always writes fixed-size blocks, but you
can at least set the block size (and set whether
or not the last block is padded).  In particular,
you could set 100k blocks and pad out to the next
multiple of 10k, which should be considerably more
efficient for the S_ISREG() case.
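To make the arithmetic concrete, here is a small sketch that models the
blocking described above.  It does not call libarchive; the helper names
are hypothetical, and it assumes "100k" and "10k" mean 102400 and 10240
bytes.  In a real client the two sizes would presumably be set with
archive_write_set_bytes_per_block() and
archive_write_set_bytes_in_last_block().

```c
#include <stddef.h>

/* Round n up to the next multiple of m (m must be nonzero). */
size_t round_up(size_t n, size_t m)
{
    return ((n + m - 1) / m) * m;
}

/* Number of blocks written for `payload` bytes, counting a final
 * partial block as one write. */
size_t write_count(size_t payload, size_t block_size)
{
    return (payload + block_size - 1) / block_size;
}

/* Total bytes on disk when full blocks are `block_size` bytes and the
 * final partial block is padded to a multiple of `last_multiple`. */
size_t padded_archive_size(size_t payload, size_t block_size,
                           size_t last_multiple)
{
    size_t full = (payload / block_size) * block_size;
    size_t tail = payload - full;

    if (tail == 0)
        return full;
    return full + round_up(tail, last_multiple);
}
```

For a 250000-byte archive, write_count(250000, 10240) is 25 while
write_count(250000, 102400) is 3, and
padded_archive_size(250000, 102400, 10240) is 256000, the same total as
padding 10240-byte blocks.  Under these assumptions the larger block
size cuts the number of writes without adding any padding.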

Tim Kientzle


