Date: Fri, 20 Apr 2012 20:26:36 +0200
From: "Thomas Schmitt"
To: freebsd-hackers@freebsd.org
Cc: wojtek@wojtek.tensor.gdynia.pl
Subject: Re: what's wrong with cd9660 fs

Hi,

> mkisofs -rJ --iso-level 3 -o /path_to/file.iso .
> and i see TWO 4 gigabyte files with the same name!

This also happens on my "8.0-STABLE Mar 23 14:55:20 CET 2010".
FreeBSD is probably not alone in this.

An example image can be found at
  https://dev.haiku-os.org/attachment/ticket/8473/file_of_4gb.iso.bz2
It is 25 KiB when compressed with bzip2 and expands to a 4.1 GiB
ISO 9660 image in which "file_of_4gb" has 2 extents.

The content of the image root directory should be:

  $ ls f*
  -rw-r--r-- 1 1000 1000 4297064448 Apr 16 13:00 file_of_4gb

But on FreeBSD I get this:

  $ bunzip2 file_of_4gb.iso.bz2
  # su
  Password:
  # mdconfig -a -t vnode -f file_of_4gb.iso
  md1
  # mount_cd9660 /dev/md1 /mnt
  # ls -l /mnt/f*
  -rw-r--r-- 1 1000 1000 4294965248 Apr 16 13:00 file_of_4gb
  -rw-r--r-- 1 1000 1000 4294965248 Apr 16 13:00 file_of_4gb

Omitting Rock Ridge interpretation changes the result:

  # umount /mnt
  # mount_cd9660 -r /dev/md1 /mnt
  # ls -l /mnt/f*
  -r-xr-xr-x 1 root wheel 2099200 Apr 16 13:00 file_of_4gb

This is a skewed reflection of the image internals:

The file content is stored in two file sections (see also ECMA-119
6.5.1). The directory records of both sections bear the same File
Identifier (struct iso_directory_record.name), but different Location
of Extent (struct iso_directory_record.extent) and Data Length
(struct iso_directory_record.size).
One extent begins at block 55 (= .extent) and has 4294965248 bytes
(= .size). The other begins at block 2097206 and has 2099200 bytes.
Together they add up to the expected 4297064448 bytes.

As far as I can see in my local copy of
  /usr/src/sys/fs/cd9660/cd9660_node.h
there is a 1:1 relation between struct iso_node and ECMA-119 extents:

  long iso_extent;        /* extent of file */
  unsigned long i_size;

In
  /usr/src/sys/fs/cd9660/cd9660_vfsops.c
I believe I see a 1:1 relation between struct vnode and
struct iso_node:

  vp->v_data = ip;
  ip->i_vnode = vp;

But a 1:N relation between inode and ECMA-119 extents would be
needed in order to represent large files.

I am currently trying to understand how fs/udf handles multiple
extents.
  /usr/src/sys/fs/udf/udf_vnops.c
has this comment in function udf_bmap_internal():

  * If the offset is beyond the current extent, look for the
  * next extent.


Have a nice day :)

Thomas
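
PS: To make the 1:N relation mentioned above more concrete, here is a
minimal userland sketch. It is not code from the FreeBSD tree. The
names struct dir_record, struct multi_extent_file, collect_file() and
map_offset() are hypothetical and chosen for illustration only. The
on-disk facts it relies on are the Multi-Extent bit (bit 7 of the File
Flags, ECMA-119 9.1.6) and the two extents of the example image. It
assumes that every file section except the last one fills whole
2048-byte blocks, as is the case in the example image.

```c
#include <stdint.h>
#include <string.h>

#define ISO_BLOCK_SIZE        2048u
#define ISO_FLAG_MULTI_EXTENT 0x80u /* ECMA-119 File Flags bit 7: more records follow */
#define MAX_SECTIONS          8     /* arbitrary limit for this sketch */

/* Hypothetical, simplified stand-in for an already parsed directory record. */
struct dir_record {
    const char *name;   /* File Identifier */
    uint32_t    extent; /* Location of Extent, counted in blocks */
    uint32_t    size;   /* Data Length, counted in bytes */
    uint8_t     flags;  /* File Flags */
};

struct file_section {
    uint32_t extent;
    uint32_t size;
};

/* One file that owns N sections: the 1:N relation. */
struct multi_extent_file {
    const char         *name;
    int                 nsections;
    struct file_section sections[MAX_SECTIONS];
    uint64_t            total_size; /* 64 bit, the sum may exceed 4 GiB */
};

/* The two directory records of the example image. */
const struct dir_record example_recs[2] = {
    { "file_of_4gb", 55u,      4294965248u, ISO_FLAG_MULTI_EXTENT },
    { "file_of_4gb", 2097206u, 2099200u,    0 },
};

/*
 * Merge consecutive records of one file into a single object.
 * Returns the number of records consumed, or -1 if the chain is
 * broken (name change, too many sections, missing final record).
 */
int collect_file(const struct dir_record *recs, int nrecs,
                 struct multi_extent_file *out)
{
    int i;

    memset(out, 0, sizeof(*out));
    if (nrecs <= 0)
        return -1;
    out->name = recs[0].name;
    for (i = 0; i < nrecs; i++) {
        if (strcmp(recs[i].name, out->name) != 0)
            return -1;
        if (out->nsections == MAX_SECTIONS)
            return -1;
        out->sections[out->nsections].extent = recs[i].extent;
        out->sections[out->nsections].size = recs[i].size;
        out->nsections++;
        out->total_size += recs[i].size;
        if (!(recs[i].flags & ISO_FLAG_MULTI_EXTENT))
            return i + 1; /* final record of this file */
    }
    return -1;
}

/*
 * Translate a byte offset within the file to an absolute block number.
 * If the offset is beyond the current section, try the next one.
 * Returns 0 on success, -1 if the offset is at or beyond end of file.
 */
int map_offset(const struct multi_extent_file *f, uint64_t offset,
               uint32_t *block)
{
    int i;

    for (i = 0; i < f->nsections; i++) {
        if (offset < f->sections[i].size) {
            *block = f->sections[i].extent +
                     (uint32_t)(offset / ISO_BLOCK_SIZE);
            return 0;
        }
        offset -= f->sections[i].size;
    }
    return -1;
}
```

Feeding example_recs to collect_file() yields one file with two
sections and a total size of 4297064448 bytes. map_offset() then
resolves byte offset 4294965248, the first byte of the second
section, to block 2097206. A real fix in cd9660 would additionally
have to deal with interleaved files and with the kernel's buffer
handling, which this sketch ignores.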