From owner-freebsd-current@FreeBSD.ORG Tue Apr 6 11:15:22 2004
Message-ID: <4072F398.5040709@kientzle.com>
Date: Tue, 06 Apr 2004 11:14:48 -0700
From: Tim Kientzle
To: Ruslan Ermilov
References: <200404052132.i35LWIgJ009519@repoman.freebsd.org> <20040406082945.GD397@ip.net.ua>
In-Reply-To: <20040406082945.GD397@ip.net.ua>
cc: Tim Kientzle
cc: current@FreeBSD.org
Subject: Re: cvs commit: src/usr.bin/tar Makefile bsdtar.1 bsdtar.c bsdtar.h bsdtar_platform.h matching.c read.c util.c write.c
List-Id: Discussions about the use of FreeBSD-current

Ruslan Ermilov wrote:
> On Mon, Apr 05, 2004 at 02:32:18PM -0700, Tim Kientzle wrote:
>
>>kientzle    2004/04/05 14:32:18 PDT
>>
>>  FreeBSD src repository
>>
>>  Added files:
>>    usr.bin/tar          Makefile bsdtar.1 bsdtar.c bsdtar.h
>>                         bsdtar_platform.h matching.c read.c
>>                         util.c write.c
>>  Log:
>>  Initial commit for bsdtar.
>>
>
> Awesome!  Are there some benchmarking results available?

I haven't focused very closely on performance yet, to be honest,
though the internal architecture is pretty clean (minimal data
copying; reuse of internal buffers to avoid heap thrashing).
I did some quick tests early on and the performance (on dearchiving)
was roughly comparable to gnutar.  (Within about 5-10%.)  That will
improve some as I continue to work on it.  However, in general, I
expect it to be a little bit slower because the compression isn't
handled in a separate process (thus there's less overlapping of I/O
and computation).

But, there are a lot of nice new features:

 * Fully automatic format/compression detection.  In particular,
   the following commands all work:
      bsdtar -xf file.tgz
      bsdtar -xf file.tbz
      bsdtar -xf file.cpio
   or even
      fetch -o - http://...../file.tgz | bsdtar -xf -
   GNU tar can't do any of these; 'star' fails the last one.
   To be fair, "Heirloom tar" does support all of these.

 * Ability to interpolate an archive.  The following combines
   the contents of "foo1.tgz" and "foo2.cpio" into a single
   archive called "out.tbz":
      bsdtar -cjf out.tbz @foo1.tgz @foo2.cpio
   Yes, you can mix interpolations and regular files on the
   command line.  You can even interpolate from stdin:
      bsdtar -cjf - -F pax @-
   converts an archive read on stdin into a pax-format,
   bzip2-compressed archive on stdout.  Once I get mtree read
   support, you'll be able to convert an mtree file into a
   shell script, for example:
      bsdtar -cf tree.sh -F shar @tree.mtree

 * Compliance with SUSv2.  SUSv2 (POSIX.1-1997 ?) was the last
   official spec for tar.  GNU tar does not comply with the file
   format specified there, nor does it correctly implement the
   command-line options specified there.  By default, bsdtar will
   create standard ustar archives unless it finds a file attribute
   that is not supported by ustar (such as a very long filename
   or ACL), in which case it will use SUSv3 (POSIX.1-2001)
   extensions to carry the additional data.
   There are command-line options to force straight ustar format
   or permit SUSv3 ("pax") extensions even when not absolutely
   required.  (The default format won't use SUSv3 extensions just
   to store atime/ctime or sub-second timestamps; specifying "pax"
   format will.)

 * Support for SUSv3 extensions.  The "pax" format extensions
   eliminate essentially all of the historic limitations of tar
   in a way that is easily extensible and compatible with
   standard-compliant "pax" implementations on other platforms
   (as well as some modern tar implementations, notably
   Joerg Schilling's "star").

 * More complete archiving.  With the "pax" format, bsdtar will
   archive ino/dev/nlink, sub-second resolution mtime/ctime/atime,
   ACLs, file flags, etc.  Not all of this can currently be
   restored (ino/dev/nlink/ctime are currently ignored on extract),
   but it's all stored in the archive.

 * Broad format support.  bsdtar reads the usual bevy of tar
   formats, and some cpio archives (only the odc variant at the
   moment).  It writes standard tar formats, cpio, and shar.
   The underlying libarchive library is extensible and I have
   plans for reading mtree files, reading/writing more cpio
   formats, reading ZIP archives, etc.

 * Cleanly factored.  The archive format support is all in a
   separate library.  It should be fairly routine to build "cpio"
   or "pax" command-line interfaces to the same library or use the
   library for "pkg_install" or "pkg_create."  For comparison,
   right now "bsdtar" is ~2,000 lines of C, "libarchive" is closer
   to 10,000 lines of C.

There is some performance work to be done; I need to build a
uid/gid/uname/gname cache, for example.  Part of my recent
rewrite of the ACL support was to get to the point that there
was one place where all such lookups were handled, regardless
of whether it's a file owner or an ACL that needs the information.
There are still a few bugs to iron out and a couple of features
that are a bit incomplete, but it's getting better quickly.
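
To make the uid/uname cache idea concrete, here is a rough sketch
of one possible shape for it.  This is only an illustration, not
the bsdtar or libarchive code: the names (lookup_uname,
uname_cache, NAME_CACHE_SIZE) and the direct-mapped layout are
assumptions made for the example.  The only system call it relies
on is the standard getpwuid(3).

```c
#include <sys/types.h>
#include <pwd.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache size; any small prime would do. */
#define NAME_CACHE_SIZE 127

struct name_cache_slot {
	int	 valid;		/* has this slot been filled? */
	uid_t	 id;		/* uid this slot describes */
	char	*name;		/* NULL if the uid has no passwd entry */
};

static struct name_cache_slot uname_cache[NAME_CACHE_SIZE];
static unsigned long cache_hits, cache_misses;

/*
 * Return the user name for 'uid', or NULL if it is unknown (or if
 * memory for the copy could not be allocated).  The string is owned
 * by the cache and stays valid only until a later lookup evicts
 * the slot, so callers should copy it if they need to keep it.
 */
static const char *
lookup_uname(uid_t uid)
{
	struct name_cache_slot *slot;
	struct passwd *pw;

	slot = &uname_cache[(unsigned long)uid % NAME_CACHE_SIZE];
	if (slot->valid && slot->id == uid) {
		cache_hits++;
		return (slot->name);
	}

	/* Miss: evict the old occupant and ask the system once. */
	cache_misses++;
	free(slot->name);	/* free(NULL) is harmless */
	pw = getpwuid(uid);
	slot->name = (pw != NULL) ? strdup(pw->pw_name) : NULL;
	slot->id = uid;
	slot->valid = 1;
	return (slot->name);
}
```

Negative results are cached too, so a uid with no passwd entry
costs only one getpwuid() call.  The same structure would work
for gid-to-gname lookups with getgrgid(3), and a single lookup
point like this could serve both file owners and ACL entries.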

My hope is that a few adventurous souls will start using it
and giving me feedback so that I can grow it into the
system tar that FreeBSD deserves.

Tim
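
P.S.  On the automatic format/compression detection mentioned
above: the general technique is to look at the first bytes of the
input for well-known signatures.  The sketch below illustrates
that idea under stated assumptions.  It is not libarchive's
actual detection code, and the names (sniff_format, enum
sniff_result) are made up for the example.  The signatures
themselves are the standard ones: 0x1f 0x8b for gzip, "BZh" plus
a block-size digit for bzip2, "070707" for cpio odc, and "ustar"
at offset 257 of a tar header.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this illustration. */
enum sniff_result {
	SNIFF_UNKNOWN,
	SNIFF_GZIP,
	SNIFF_BZIP2,
	SNIFF_CPIO_ODC,
	SNIFF_USTAR
};

/*
 * Guess what 'buf' holds by checking magic signatures.
 * 'len' is the number of valid bytes in 'buf'.
 */
static enum sniff_result
sniff_format(const unsigned char *buf, size_t len)
{
	/* gzip: two magic bytes, 0x1f 0x8b. */
	if (len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b)
		return (SNIFF_GZIP);

	/* bzip2: "BZh" followed by a block-size digit '1'..'9'. */
	if (len >= 4 && memcmp(buf, "BZh", 3) == 0 &&
	    buf[3] >= '1' && buf[3] <= '9')
		return (SNIFF_BZIP2);

	/* cpio odc: header begins with the octal magic "070707". */
	if (len >= 6 && memcmp(buf, "070707", 6) == 0)
		return (SNIFF_CPIO_ODC);

	/* ustar (and GNU tar): "ustar" at offset 257 of the header. */
	if (len >= 262 && memcmp(buf + 257, "ustar", 5) == 0)
		return (SNIFF_USTAR);

	return (SNIFF_UNKNOWN);
}
```

For a compressed input, the same check would be run a second
time on the decompressed stream to identify the archive format
inside it.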