From: Tim Kientzle
Date: Sun, 13 Jan 2008 16:07:49 -0800
To: Steven Hartland, freebsd-stable@freebsd.org, kris@freebsd.org
Subject: Re: FreeBSD tar errors on valid empty tar.gz
Message-ID: <478AA7D5.6050707@freebsd.org>

> On Jan 10, 2008, at 4:41 PM, Kris Kennaway wrote:
>>> Not that I'm aware of.  gtar works but libarchive tar fails on
>>> the file it created.
>
> Indeed.
> Trying to create a tarball using a non-existent list of files
> returns an error and generates a 0-byte tgz; as previously shown, BSD
> tar in 6.3 treats that as an empty archive, which seems reasonable,
> whereas gtar feeds it to gzip which generates an error:
>
> 20% tar cvzf test.tar.gz --files-from empty
> tar: Couldn't open empty: No such file or directory
> 21% ls -l test.tar.gz
> -rw-r--r--  1 chuck  chuck  0 Jan 10 19:42 test.tar.gz
> 22% tar tvzf test.tar.gz
> 23% gtar tvzf test.tar.gz
> gzip: (stdin): unexpected end of file
> gtar: Child returned status 1
> gtar: Error exit delayed from previous errors

I don't normally follow freebsd-stable@, but Steven and Kris
were both kind enough to bring this discussion to my attention.

In short: yes, this is broken in 6.2 and fixed in 6.3 and 7.
If you wish, you can install the libarchive port to get a fix
for this bug.

I think the details are a little bit interesting:

It turns out that empty archives are a tricky case.
Libarchive always tastes files to determine their format,
and empty files have nothing in them, so libarchive used to
fall over when it tried to determine the format.  After all,
there was no data there to be tasted, so there's no way to
distinguish between an empty file pretending to be a cpio
archive and an empty file pretending to be a tar archive.
I eventually resolved this paradox by adding a new format
called "empty" that attaches itself to empty files.

In Chuck's example above, bsdtar simply ignores the 'z' when
reading the archive.  Instead, it tastes the file for
compression, sees that it's uncompressed, then tastes the
archive format and recognizes the archive as having format
"empty."  Any archive with format "empty" successfully returns
no entries.  So bsdtar can successfully list nothing from an
empty input file.
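
The tasting idea described above can be sketched roughly as follows. This is an illustration in Python, not libarchive's C code or API: the function names (bid_empty, bid_tar, bid_cpio, taste) and the numeric bid values are hypothetical. Only the magic strings are real: "ustar" at offset 257 of a tar header, and "070707" / "070701" at the start of ASCII cpio headers.

```python
from typing import Optional


def bid_empty(head: bytes) -> int:
    # Hypothetical "empty" format: it claims only zero-length input.
    return 1 if len(head) == 0 else 0


def bid_tar(head: bytes) -> int:
    # A ustar header carries the magic "ustar" at offset 257.
    if len(head) >= 262 and head[257:262] == b"ustar":
        return 40
    # A 512-byte block of zeros is tar's end-of-archive marker,
    # so all-zero input is still plausibly a tar archive.
    if len(head) >= 512 and head[:512] == bytes(512):
        return 10
    return 0


def bid_cpio(head: bytes) -> int:
    # ASCII cpio headers start with "070707" (odc) or "070701" (newc).
    return 48 if head[:6] in (b"070707", b"070701") else 0


# Bid values are arbitrary; only their relative order matters here.
BIDDERS = {"empty": bid_empty, "tar": bid_tar, "cpio": bid_cpio}


def taste(head: bytes) -> Optional[str]:
    """Return the name of the highest-bidding format, or None if nobody bids."""
    name, bid = max(((n, f(head)) for n, f in BIDDERS.items()),
                    key=lambda pair: pair[1])
    return name if bid > 0 else None
```

Without the bid_empty entry, taste(b"") would return None, which corresponds to the failure described for 6.2; with it, zero-length input is claimed by a format that simply has no entries.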

  $ tar tvvf /dev/null
  Archive Format: Empty file,  Compression: none
  $ tar tvvf /dev/zero
  Archive Format: tar,  Compression: none

Note:  In bsdtar, "vv" adds a final summary line that
isn't otherwise shown.

(The last line here only works after a commit I pushed in
about 2 minutes ago.  Thanks for helping me find this new bug. ;-)

In FreeBSD 6.2, libarchive lacked the "empty" format support
so it would choke when it tried to identify the format
of the empty file.

GNU tar, on the other hand, always invokes gzip if you tell
it to, and gzip complains loudly if its input is mis-formatted
in any way.  But GNU tar doesn't complain if you try to list
the contents of an empty file without the 'z' option.  Go figure.

Post-6.2, libarchive has also fixed a more serious bug writing
empty archives, so that it no longer creates an empty file when
you try to create a tar archive with no entries.  In particular,
even if there are no entries, a tar archive always gets an
end-of-file marker (1024 zero bytes) and correct padding.

Trivia:  bsdtar has special logic so that the "r" and "u"
modes work correctly with empty files; it asks libarchive
for the format and then silently converts "empty" to a format
that can actually be written before re-opening the archive
to append to it.  (Defaults to "pax restricted," though you
can specify --format to force the result.)  This is
surprisingly complicated.

Deep trivia:  Someone else asked about why they sometimes
saw 1024-byte empty archives and sometimes 10240-byte archives.
This is deliberate:  Tar archives are padded to a block
size---10240 bytes by default---except in certain circumstances.

Uncompressed archives are padded unless being written to
a file on disk.

  Size   Command
  10240  tar cvf - --files-from /dev/null | wc
  1024   tar cvf test.tar --files-from /dev/null
  1024   tar cvf - --files-from /dev/null > test.tar

Archives to be compressed are always padded:

  Size   Command
  10240  tar cvzf test.tgz --files-from /dev/null; gunzip test.tgz

Compressed archives are padded after compression
unless being written to a regular file on disk:

  Size   Command
  10240  tar cvzf - --files-from /dev/null | wc
  45     tar cvzf test.tgz --files-from /dev/null

Tim K
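
The 1024 and 10240 figures in the tables above follow from simple rounding, which the sketch below works through. It is an illustration only, not bsdtar's code: the helper names (padded_size, empty_archive_size, python_empty_tar) are hypothetical, and the last helper uses Python's standard tarfile module, which is a different implementation that happens to pad an empty archive to a full 10240-byte record.

```python
import io
import tarfile

END_MARKER = 2 * 512      # end-of-archive marker: two 512-byte zero blocks
DEFAULT_BLOCK = 10240     # default tar block size quoted in the post


def padded_size(nbytes: int, block: int = DEFAULT_BLOCK) -> int:
    """Round nbytes up to the next multiple of block."""
    return -(-nbytes // block) * block


def empty_archive_size(pad: bool) -> int:
    """Size of a tar archive with no entries, with or without block padding."""
    return padded_size(END_MARKER) if pad else END_MARKER


def python_empty_tar() -> bytes:
    """Build an empty archive in memory with Python's tarfile (mode "w")."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w"):
        pass  # add no members; closing writes the end marker and padding
    return buf.getvalue()
```

Here empty_archive_size(False) gives the 1024-byte case (end marker only) and empty_archive_size(True) gives the 10240-byte case (end marker rounded up to one block).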