Date:      Sat, 31 Mar 2007 22:36:28 GMT
From:      Patrick Lamaiziere<patpr@davenulle.org>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   bin/111079: [libarchive] problem with big file
Message-ID:  <200703312236.l2VMaSkJ003630@www.freebsd.org>
Resent-Message-ID: <200703312250.l2VMo7KR064576@freefall.freebsd.org>


>Number:         111079
>Category:       bin
>Synopsis:       [libarchive] problem with big file
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Mar 31 22:50:06 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator:     Patrick Lamaiziere
>Release:        6.2-RELEASE
>Organization:
>Environment:
FreeBSD roxette.lamaiziere.net 6.2-RELEASE-p3 FreeBSD 6.2-RELEASE-p3 #0: Sat Mar 24 14:08:07 CET 2007     patrick@roxette.lamaiziere.net:/usr/obj/usr/src/sys/GENERIC  i386

>Description:
With a .tar file that includes a big file (6 GB), bsdtar fails to extract or list the files in the archive because of a bug in libarchive:
 
$ tar tf samba.tar
[...]
samba/bigfile.rar
tar: (Empty error message)

$ truss tar tf samba.tar
write(1,"\n",1)                                  = 1 (0x1)
samba/bigfile.rarwrite(1,"samba/bigfile.rar"...,79) = 79 (0x4f)
lseek(3,0x0,SEEK_CUR)                            = 843239424 (0x3242d000)
lseek(3,0x846ca000,SEEK_CUR)                     = -1230016512 (0xb6af7000)

write(1,"\n",1)                                  = 1 (0x1)
tar: write(2,"tar: ",5)                          = 5 (0x5)
(Empty error message)write(2,"(Empty error message)",21) = 21 (0x15)

--------------------

I think the problem is in the "file_skip" functions, because this is the only place where there are two lseek() calls. I don't know which one: there is one function in "archive_read_open_fd.c" and another in "archive_read_open_file.c". Anyway, they are similar:

File archive_read_open_fd.c:

static ssize_t
file_skip(struct archive *a, void *client_data, size_t request)
{
        struct read_fd_data *mine = client_data;
        off_t old_offset, new_offset;

        /* Reduce request to the next smallest multiple of block_size */
        request = (request / mine->block_size) * mine->block_size;
        /*
         * Hurray for lazy evaluation: if the first lseek fails, the second
         * one will not be executed.
         */
        if (((old_offset = lseek(mine->fd, 0, SEEK_CUR)) < 0) ||
            ((new_offset = lseek(mine->fd, request, SEEK_CUR)) < 0))
        {
[...CUT...]
        return (new_offset - old_offset);
}

The result is returned as a ssize_t (32 bits on i386), while new_offset and old_offset are off_t (64 bits).
There is an overflow here.
From truss:
lseek(3,0x0,SEEK_CUR)                            = 843239424 (0x3242d000)
lseek(3,0x846ca000,SEEK_CUR)                     = -1230016512 (0xb6af7000)

So:
new_offset - old_offset
0xb6af7000 - 0x3242d000 = 0x846ca000 => this is larger than INT32_MAX (0x7fffffff), so it is a negative value as an int32.
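
The arithmetic above can be checked with a small sketch. It is not libarchive code: the helper names wrap_to_int32() and skipped_as_ssize32() are made up for illustration, and int32_t/int64_t stand in for a 32-bit ssize_t and a 64-bit off_t, which is an assumption about the i386 ABI.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of a 32-bit ssize_t return value: keep the low 32 bits of a
 * 64-bit value and reinterpret them as a two's complement signed
 * number, without relying on implementation-defined conversions.
 */
static int32_t
wrap_to_int32(int64_t v)
{
        uint32_t low = (uint32_t)(uint64_t)v;

        if (low >= 0x80000000u)
                return (int32_t)((int64_t)low - 4294967296LL);
        return (int32_t)low;
}

/*
 * What "return (new_offset - old_offset);" would yield if the return
 * type is only 32 bits wide (assumption: i386-style ssize_t).
 */
static int32_t
skipped_as_ssize32(int64_t old_offset, int64_t new_offset)
{
        assert(new_offset >= old_offset);
        return wrap_to_int32(new_offset - old_offset);
}
```

With the offsets from the truss output, skipped_as_ssize32(0x3242d000, 0xb6af7000) gives -2073255936, so a caller that treats a negative return as an error would see the skip fail even though both lseek() calls succeeded.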

I don't know how to solve this problem, sorry.

Also, maybe it would be better to compare lseek() == -1 instead of lseek() < 0?
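
A minimal sketch of that kind of check follows. POSIX documents (off_t)-1 as the error return of lseek(); the helper name current_offset() is hypothetical and does not exist in libarchive.

```c
#include <sys/types.h>
#include <assert.h>
#include <unistd.h>

/*
 * Store the current file offset of fd in *out.
 * Returns 0 on success, -1 if lseek() reports an error.
 */
static int
current_offset(int fd, off_t *out)
{
        off_t pos = lseek(fd, 0, SEEK_CUR);

        /* Compare against the documented error value, not "< 0". */
        if (pos == (off_t)-1)
                return (-1);
        *out = pos;
        return (0);
}
```

For regular files both tests behave the same, since a valid offset is never negative there.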

Best regards.
>How-To-Repeat:
List or extract big files (several GB) from a .tar with bsdtar.

$ tar tf tar_with_big_file.tar
>Fix:

>Release-Note:
>Audit-Trail:
>Unformatted:


