Date:      Tue, 10 Jul 2012 21:47:13 -0700
From:      Tim Kientzle <tim@kientzle.com>
To:        Ryan Stone <rysto32@gmail.com>
Cc:        freebsd-arch@freebsd.org
Subject:   Re: Generating a tarball directly from make installworld
Message-ID:  <25149679-6B99-4FF0-AB8C-90D5A7880F00@kientzle.com>
In-Reply-To: <CAFMmRNwiZtbfuyT3tZ1udKk=VPJgwVuAD9gS=FY9rdGuoupqMw@mail.gmail.com>
References:  <CAFMmRNwiZtbfuyT3tZ1udKk=VPJgwVuAD9gS=FY9rdGuoupqMw@mail.gmail.com>

On Jul 10, 2012, at 8:17 PM, Ryan Stone wrote:
>
> The other problem that I have is performance.  When bsdtar appends to
> a tar file, it iterates over every entry in the tar to figure out
> where the end of it is.  I gather that this is to get rid of padding
> but I'm not entirely sure.

It is unfortunately necessary if you want to append
to an existing file.  In short:  The tar format wasn't
designed for appending, and the append option
of the standard tar command is a 30-year-old hack.

But, I wrote a better approach into libarchive and
bsdtar a few years ago.  See below.

>   Even if this isn't necessary I still have
> to iterate over the entire file in most cases.  The problem is in the
> sloppy semantics of ln and install: install foo bar means "install foo
> to path bar/foo" if bar is a directory, but "install foo to path bar"
> if bar is a regular file or it doesn't exist (symlinks add an extra
> layer of complexity).  In order to implement this correctly, I have to
> iterate over the tar to figure out what type of file bar is, every
> time that install or ln is invoked.

The sloppy semantics are indeed a problem and I
hadn't considered this before.  I fear the
only answer might be to fix the Makefiles so they
don't rely on this (fortunately, most of the install and
ln invocations are built from just a few places, so it might
not be necessary to change very many places to fix it).
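
To make the ambiguity concrete, here is a minimal sketch of the
destination rule quoted above, evaluated against a live filesystem.
The function name resolve_dest is made up for illustration; this is
not how install(1) or ln(1) is implemented.

```shell
# Hypothetical helper: print the path that "install src dest" would write to.
# Symlinks are handled only as far as [ -d ] follows them.
resolve_dest() {
    src=$1
    dest=$2
    if [ -d "$dest" ]; then
        # dest is an existing directory: the file lands inside it
        printf '%s/%s\n' "${dest%/}" "$(basename "$src")"
    else
        # dest is a regular file or does not exist: dest is the target itself
        printf '%s\n' "$dest"
    fi
}
```

The answer depends on what already exists at dest, which is why the
same question asked of a tar file needs a scan of the archive.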

> I know that a lot of people have suggested generating an mtree file
> and then converting the mtree file into a tarball, but I admit that
> it's not at all clear to me how to generate the mtree file.

I can definitely help with this, since I had this exact
use in mind when I originally built that part of libarchive.

First, there are actually two different variants of mtree format.
The one supported by FreeBSD's mtree is the older one.
It's very pretty with all that indentation but not particularly
amenable to this kind of task.

The interesting one is a newer variant supported
by NetBSD's mtree and also supported by libarchive.
In the newer mtree variant, each line is completely
self-contained, e.g.,

/bin/ls group=wheel user=root mode=0755

Such files can be easily combined (just append them
together), can be appended to via "echo spec >>file",
etc.
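
As a small illustration of that property, the following sketch appends
two entries to separate files and then concatenates them. The scratch
directory, the file names, and the /bin/cat entry are assumptions made
up for the example.

```shell
# Hypothetical scratch directory and file names, for illustration only.
tmp=$(mktemp -d)

# Each line is a complete entry, so a tool can append its own spec
# without knowing what is already in the file.
echo '/bin/ls group=wheel user=root mode=0755' >> "$tmp/ls.dist-mtree"
echo '/bin/cat group=wheel user=root mode=0755' >> "$tmp/cat.dist-mtree"

# Combining specs is plain concatenation; there is no header or
# trailer that needs to be repaired afterwards.
cat "$tmp"/*.dist-mtree > "$tmp/combined.mtree"
```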

Libarchive extends this further by adding a "contents"
keyword, e.g.,

/bin/ls user=root group=wheel mode=0755 contents=/usr/obj/usr/src/bin/ls/ls

When libarchive reads this line, it returns a file description
that has:
   * The specified name
   * The specified properties
   * The specified contents
   * (other properties --- including file size --- are taken from the contents file)

So, my idea was that 'install' or 'ln' could write a line
like the above to /usr/obj/usr/src/bin/ls/ls.dist-mtree and
at the end you could pull all those together and build a tarball
in a single fast pass:

find . -name '*.dist-mtree' | xargs cat | bsdtar czf distfile.tgz @-

I wrote "man 5 mtree" to attempt to develop a single consistent
description of both mtree variants.  Ignore the mention of a
signature; that was misguided wishful thinking on my part as
I wrestled with how to teach libarchive to automatically recognize
mtree files.  Michihiro fortunately figured out a better way to
do that.

The '@-' here is a bsdtar extension that reads an archive
and appends the entries from that archive to the archive
being created.
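
The recording side of this idea could look roughly like the sketch
below: a helper that writes one self-contained line per installed
file instead of copying it. The function name record_install and its
argument order are hypothetical, and the sketch assumes paths without
whitespace (mtree would need such characters escaped).

```shell
# Hypothetical helper: record one file as an mtree line next to the
# built object, using the "contents" keyword to point at the real data.
record_install() {
    src=$1      # built file, e.g. /usr/obj/usr/src/bin/ls/ls
    dest=$2     # full destination path, e.g. /bin/ls
    owner=$3
    group=$4
    mode=$5
    # Assumes no whitespace in $dest or $src; no escaping is done here.
    printf '%s user=%s group=%s mode=%s contents=%s\n' \
        "$dest" "$owner" "$group" "$mode" "$src" >> "$src.dist-mtree"
}
```

For example, "record_install /usr/obj/usr/src/bin/ls/ls /bin/ls root
wheel 0755" would append the line shown earlier to
/usr/obj/usr/src/bin/ls/ls.dist-mtree.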

For even more fun, you can install directly from the
mtree descriptions:

find . -name '*.dist-mtree' | xargs cat | bsdtar xf -

Another nice trick with this extended mtree format:
it's relatively easy to use tools like grep and sed
to filter the mtree description, so you could play with
having makewhatis or kldxref read from an archive
and then let libarchive unpack directly from mtree for
you, e.g.,

find . -name '*.dist-mtree' | xargs cat | grep '/man/' | makewhatis --read-from-stdin-archive
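
A minimal sketch of that kind of filtering, using only grep and sed
on a sample description. The scratch directory, the file names, and
the manual-page entry are invented sample data.

```shell
# Hypothetical sample input, for illustration only.
tmp=$(mktemp -d)
cat > "$tmp/all.mtree" <<'EOF'
/bin/ls user=root group=wheel mode=0755 contents=/usr/obj/usr/src/bin/ls/ls
/usr/share/man/man1/ls.1.gz user=root group=wheel mode=0444 contents=/usr/obj/usr/src/bin/ls/ls.1.gz
EOF

# Keep only the entries that are manual pages.
grep '/man/' "$tmp/all.mtree" > "$tmp/man.mtree"

# Rewrite only the leading path so entries become relative (./bin/ls);
# the contents= path later on the line is left untouched.
sed 's|^/|./|' "$tmp/all.mtree" > "$tmp/relative.mtree"
```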

Let me know if I can help.  As I said, I had this exact
application in mind when I built this support into
libarchive and bsdtar.  If there are additional tweaks
that would help, I'll see what I can do.

Tim



