From owner-freebsd-fs@FreeBSD.ORG Sun Oct 30 06:22:23 2005
Date: Sun, 30 Oct 2005 09:22:20 +0300 (MSK)
From: Maxim Konovalov
To: Gal Ben-Haim
Cc: freebsd-fs@freebsd.org
Subject: Re: disklabel recovery
Message-ID: <20051030091959.A93720@mp2.macomnet.net>

On Sat, 29 Oct 2005, 20:59+0200, Gal Ben-Haim wrote:

> Hello,
> I'm running FreeBSD 5.4.
> I accidentally enabled swap on /dev/ad2s1c (raw label), this destroyed my
> disklabel, the data is still on the disk (after doing 'cat' on the /dev
> device).
> When I try to mount the filesystem, the mount is successful, but when
> I 'ls' the mount point I get bad file descriptor.
> I tried to create a new label with 'sysinstall', but the label which is
> created seems to have invalid values.
> I tried running scan_ffs; after it read all the blocks on the disk it
> output 'input/output error'.
>
> Is there any way to reconstruct the label properly or access at least some
> of the files?

src/tools/tools/find-sb/
ports/sysutils/scan_ffs/
ports/sysutils/ffsrecov/

--
Maxim Konovalov

From owner-freebsd-fs@FreeBSD.ORG Sun Oct 30 12:55:36 2005
Date: Sun, 30 Oct 2005 14:55:32 +0200
From: Gal Ben-Haim
To: freebsd-fs@freebsd.org
Subject: Re: disklabel recovery
In-Reply-To: <20051030091959.A93720@mp2.macomnet.net>
References: <20051030091959.A93720@mp2.macomnet.net>

Meanwhile I made a 'dd' image of the drive on a network drive and
formatted the hd; I needed the system up
again.

What can I do with that 'dd' image in order to restore at least some of the
data on that drive?

On 10/30/05, Maxim Konovalov wrote:
>
> On Sat, 29 Oct 2005, 20:59+0200, Gal Ben-Haim wrote:
>
> > Hello,
> > I'm running FreeBSD 5.4.
> > I accidentally enabled swap on /dev/ad2s1c (raw label), this destroyed my
> > disklabel, the data is still on the disk (after doing 'cat' on the /dev
> > device).
> > When I try to mount the filesystem, the mount is successful, but when
> > I 'ls' the mount point I get bad file descriptor.
> > I tried to create a new label with 'sysinstall', but the label which is
> > created seems to have invalid values.
> > I tried running scan_ffs; after it read all the blocks on the disk it
> > output 'input/output error'.
> >
> > Is there any way to reconstruct the label properly or access at least
> > some of the files?
>
> src/tools/tools/find-sb/
> ports/sysutils/scan_ffs/
> ports/sysutils/ffsrecov/
>
> --
> Maxim Konovalov
>

--
Gal Ben-Haim, gbenhaim@gmail.com

From owner-freebsd-fs@FreeBSD.ORG Sun Oct 30 13:06:06 2005
Date: Sun, 30 Oct 2005 05:06:05 -0800 (PST)
From: Arne "Wörner"
To: Gal Ben-Haim, freebsd-fs@freebsd.org
Subject: Re: disklabel recovery
Message-ID: <20051030130605.35720.qmail@web30302.mail.mud.yahoo.com>

--- Gal Ben-Haim wrote:
> meanwhile I did a 'dd' image of the drive to a network drive and
> formatted the hd, I needed the system up again.
>
> what can I do with that 'dd' image in order to restore at least
> some of the data on that drive ?
>
1. You could just dd the image back to the freshly sliced disc and
   then do a "fsck -f" (forced check even if it is marked clean).
2. You could try to fsck the image via
   # mdconfig -a -t vnode -f
   # fsck -f /dev/md
   # mount /dev/md /mnt
   # ls /mnt/.

-Arne

From owner-freebsd-fs@FreeBSD.ORG Sun Oct 30 13:53:50 2005
Date: Sun, 30 Oct 2005 08:53:46 -0500
From: Bill Vermillion
To: Arne Wörner
Cc: freebsd-fs@freebsd.org
Subject: Re: disklabel recovery
Message-ID: <20051030135346.GA78577@wjv.com>
In-Reply-To: <20051030130605.35720.qmail@web30302.mail.mud.yahoo.com>
References: <20051030130605.35720.qmail@web30302.mail.mud.yahoo.com>
Reply-To: bv@wjv.com

Arne Wörner, the prominent pundit, on Sun, Oct 30, 2005 at 05:06 while
half mumbling, half-witicized:

> --- Gal Ben-Haim wrote:
> > meanwhile I did a 'dd' image of the drive to a network drive and
> > formatted the hd, I needed the system up again.
> >
> > what can I do with that 'dd' image in order to restore at least
> > some of the data on that drive ?
> >
> 1. You could just dd the image back to the freshly sliced disc and
>    then do a "fsck -f" (forced check even if it is marked clean).
> 2. You could try to fsck the image via
>    # mdconfig -a -t vnode -f
>    # fsck -f /dev/md
>    # mount /dev/md /mnt
>    # ls /mnt/.

If the data is important he might try 'lazarus' from TCT, The Coroner's
Toolkit. But it can be tedious. It examines the drive, tries to
determine what is data, and then puts it in small files.

TCT also comes with a program called 'unrm', but that works on a drive
that has an intact FS, as 'unrm' is similar to 'dd' but only copies
blocks that aren't allocated.

You wind up with thousands of files. It has an HTML mode so that you can
examine the found data with a browser.

I'd try this only as a last resort and only if the files are that
important, because as I said above, it is tedious.

Bill

--
Bill Vermillion - bv @ wjv . com

From owner-freebsd-fs@FreeBSD.ORG Sun Oct 30 14:10:26 2005
Date: Sun, 30 Oct 2005 06:10:25 -0800 (PST)
From: Arne "Wörner"
To: bv@wjv.com
Cc: freebsd-fs@freebsd.org
Subject: Re: disklabel recovery
Message-ID: <20051030141025.46854.qmail@web30302.mail.mud.yahoo.com>
In-Reply-To: <20051030135346.GA78577@wjv.com>

--- Bill Vermillion wrote:
> Arne Wörner, the prominent pundit, on Sun, Oct 30, 2005 at 05:06
> while half mumbling, half-witicized:
>
Didn't know that they still know me here...
*grin* *blush*

> > Gal Ben-Haim wrote:
> > > what can I do with that 'dd' image in order to restore
> > > at least some of the data on that drive ?
> > >
It just came to me that a backup might be a good idea...

For example, I do a backup every hour from /usr/home to /opt/backup
(just the changed/new files), and after 10 days I write those backups
to a DVD...

Maybe Gal Ben-Haim did something similar, so that at least the state of
10 days ago would be restorable (I already had bad luck with that idea
in the case of my history teacher's missing Hitler LPs; I hope it works
better this time)?

-Arne

From owner-freebsd-fs@FreeBSD.ORG Sun Oct 30 14:14:50 2005
Date: Sun, 30 Oct 2005 22:14:36 +0800
From: Xin LI
To: rick@snowhite.cis.uoguelph.ca
Cc: fs@freebsd.org
Subject: Re: FreeBSD5.4 nfsv4 server patch
Message-ID: <20051030141436.GA83332@frontfree.net>
In-Reply-To: <200510272003.QAA12616@snowhite.cis.uoguelph.ca>
References: <200510272003.QAA12616@snowhite.cis.uoguelph.ca>
X-GPG-key-ID/Fingerprint: 0xCAEEB8C0 / 43B8 B703 B8DD 0231 B333 DC28 39FB 93A0 CAEE B8C0
X-GPG-Public-Key: http://www.delphij.net/delphij.asc

Hi,

On Thu, Oct 27, 2005 at 04:03:31PM -0400, rick@snowhite.cis.uoguelph.ca wrote:
> I've just put a patch on the ftp site at:
> ftp.cis.uoguelph.ca/pub/nfsv4/FreeBSD5.4-patch1.diffc
>
> that fixes the NFSVOPISLOCKED() macro to avoid panics and some fixes
> courtesy of Mr. Jeremy Mika, so that it will build on amd64. (Jeremy,
> the only part of this patch of interest to you is the NFSVOPISLOCKED()
> macro change.)
>
> Thanks Jeremy, rick
> ps: These changes will be in the FreeBSD6 tarball I'll create shortly
> after the FreeBSD6.0-Release1 is out.

That's nice! Would you mind sending the patch through send-pr(1) so we
can assign that to someone? Thanks in advance!

Cheers,
--
Xin LI http://www.delphij.net/
See complete headers for GPG key and other information.

From owner-freebsd-fs@FreeBSD.ORG Sun Oct 30 17:43:18 2005
Date: Sun, 30 Oct 2005 18:37:11 +0100 (CET)
From: Ivan Voras
To: freebsd-fs@freebsd.org
Cc: freebsd-hackers@freebsd.org
Subject: ext2 large_file
Message-ID: <20051030183340.B19470@geri.cc.fer.hr>

I recently tried to use ext2 on FreeBSD but decided not to when I
saw that support for large files is missing (and went with msdosfs
instead).

Now I accidentally noticed that large_file support is present in the
latest NetBSD (and maybe OpenBSD). Is anyone interested in porting the
support to FreeBSD? :)

From owner-freebsd-fs@FreeBSD.ORG Mon Oct 31 01:09:57 2005
Date: Sun, 30 Oct 2005 19:09:49 -0600
From: Brian Bergstrand
To: Ivan Voras
Cc: freebsd-fs@freebsd.org
Subject: Re: ext2 large_file
Message-Id: <46D894BD-16E0-4CBA-B40A-EEBAAC2547D2@classicalguitar.net>
In-Reply-To: <20051030183340.B19470@geri.cc.fer.hr>
References: <20051030183340.B19470@geri.cc.fer.hr>

I ported the FreeBSD driver to Mac OS X / Darwin a few years ago and added
large file support in the process. Here's the patch; as you can see, it's a
rather simple one (as long as the in-core i_size member is 64-bit, which I
think is true for FreeBSD).

I doubt this patch will apply cleanly, as my tree diverged from FreeBSD in
a non-compatible way quite a while ago -- but it should give you a start.

The first part of the patch (@@ -98,7 +128,9 @@) is to ext2_ei2i() and
the second part (@@ -182,6 +214,31 @@) is to ext2_i2ei().

$ cvs diff -r 1.6 -r 1.7 ext2_inode_cnv.c
Index: ext2_inode_cnv.c
===================================================================
RCS file: /cvsroot/ext2fsx/src/gnu/ext2fs/ext2_inode_cnv.c,v
retrieving revision 1.6
retrieving revision 1.7
diff -u -b -r1.6 -r1.7
--- ext2_inode_cnv.c 3 May 2003 23:54:39 -0000 1.6
+++ ext2_inode_cnv.c 9 Jul 2003 23:09:24 -0000 1.7
@@ -98,7 +128,9 @@
     ip->i_uid |= le16_to_cpu(ei->i_uid_high) << 16;
     ip->i_gid |= le16_to_cpu(ei->i_gid_high) << 16;
     /*}*/
-    /* XXX use memcpy */
+    if (S_ISREG(ip->i_mode))
+        ip->i_size |= ((u_int64_t)le32_to_cpu(ei->i_size_high)) << 32;
+    /* TBD: Otherwise, setup the dir acl */
 #if BYTE_ORDER == BIG_ENDIAN
     /* We don't want to swap the block addr's for a short symlink because
@@ -182,6 +214,31 @@
     raw_inode->i_uid_high = 0;
     raw_inode->i_gid_high = 0;
     }*/
+    if (S_ISREG(ip->i_mode)) {
+        ei->i_size_high = cpu_to_le32(ip->i_size >> 32);
+        if (ip->i_size > 0x7fffffffULL) {
+            struct ext2_sb_info *sb = ip->i_e2fs;
+            if (!EXT2_HAS_RO_COMPAT_FEATURE(sb,
+                EXT2_FEATURE_RO_COMPAT_LARGE_FILE) ||
+                EXT2_SB(sb)->s_es->s_rev_level == cpu_to_le32 (EXT2_GOOD_OLD_REV)) {
+                /* First large file, add the flag to the superblock. */
+                lock_super (VFSTOEXT2(ip->i_vnode->v_mount)->um_devvp);
+
+                if (EXT2_SB(sb)->s_es->s_rev_level == cpu_to_le32 (EXT2_GOOD_OLD_REV)) {
+                    log(LOG_WARNING,
+                    "ext2: updating to rev %d because of new feature flag, "
+                    "running e2fsck is recommended", EXT2_DYNAMIC_REV);
+                    sb->s_es->s_first_ino = cpu_to_le32 (EXT2_GOOD_OLD_FIRST_INO);
+                    sb->s_es->s_inode_size = cpu_to_le16 (EXT2_GOOD_OLD_INODE_SIZE);
+                    sb->s_es->s_rev_level = cpu_to_le32(EXT2_DYNAMIC_REV);
+                }
+
+                EXT2_SET_RO_COMPAT_FEATURE(sb, EXT2_FEATURE_RO_COMPAT_LARGE_FILE);
+                sb->s_dirt = 1;
+                unlock_super (VFSTOEXT2(ip->i_vnode->v_mount)->um_devvp);
+            }
+        }
+    }

On Oct 30, 2005, at 11:37 AM, Ivan Voras wrote:

> I recently tried to use ext2 on FreeBSD but have decided not to
> when I saw that the support for large files is missing (and went
> with msdosfs instead).
>
> Now I accidentally noticed that large_file support is present in
> latest NetBSD (and maybe OpenBSD). Is anyone interested in porting
> the support to FreeBSD? :)
>

Brian Bergstrand
PGP Key ID: 0xB6C7B6A2

From owner-freebsd-fs@FreeBSD.ORG Mon Oct 31 08:53:55 2005
Date: Mon, 31 Oct 2005 19:53:40 +1100 (EST)
From: Bruce Evans
To: Brian Bergstrand
Cc: freebsd-fs@freebsd.org, Ivan Voras
Subject: Re: ext2 large_file
Message-ID: <20051031191139.J38757@delplex.bde.org>
In-Reply-To: <46D894BD-16E0-4CBA-B40A-EEBAAC2547D2@classicalguitar.net>
References: <20051030183340.B19470@geri.cc.fer.hr> <46D894BD-16E0-4CBA-B40A-EEBAAC2547D2@classicalguitar.net>

On Sun, 30 Oct 2005, Brian Bergstrand wrote:

> I ported the FreeBSD driver to Mac OS X / Darwin a few years ago and added
> large file support in the process. Here's the patch, as you can see it's a
> rather simple patch (as long as the in core i_size member is 64bit which I
> think is true for FreeBSD):

tjr implemented it in FreeBSD almost 2 years ago:

% RCS file: /home/ncvs/src/sys/gnu/ext2fs/ext2_fs.h,v
% Working file: ext2_fs.h
% head: 1.13
% ...
% ----------------------------
% revision 1.13
% date: 2004/02/18 14:08:25; author: tjr; state: Exp; lines: +2 -1
% Add partial support for large (>4GB) files on ext2 filesystems. This
% support is partial in that it will refuse to create large files on
% filesystems that haven't been upgraded to EXT2_DYN_REV or that don't
% have the EXT2_FEATURE_RO_COMPAT_LARGE_FILE flag set in the superblock.
%
% MFC after: 2 weeks
% ----------------------------

I don't know if block allocation actually works for large files.

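Whether a given filesystem meets the two conditions in that commit message
can be read straight out of the superblock. A minimal sketch in Python,
assuming the usual ext2 on-disk layout (superblock at byte 1024,
little-endian fields, s_magic at offset 56, s_rev_level at 76,
s_feature_ro_compat at 100, large_file bit 0x0002). The function name and
the returned keys are made up for illustration; this is not how the kernel
or any FreeBSD tool does it:

```python
import struct

SUPERBLOCK_OFFSET = 1024        # ext2 superblock starts 1024 bytes in
EXT2_MAGIC = 0xEF53
EXT2_DYNAMIC_REV = 1            # rev 0 is the "good old" revision
RO_COMPAT_LARGE_FILE = 0x0002   # EXT2_FEATURE_RO_COMPAT_LARGE_FILE


def large_file_status(image: bytes) -> dict:
    """Report revision level and large_file flag from raw ext2 bytes.

    `image` must hold at least the first 2048 bytes of the filesystem.
    (Hypothetical helper, for illustration only.)
    """
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    if len(sb) < 104:
        raise ValueError("image too short to hold an ext2 superblock")
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2 filesystem (magic 0x%04x)" % magic)
    (rev_level,) = struct.unpack_from("<I", sb, 76)
    (ro_compat,) = struct.unpack_from("<I", sb, 100)
    has_flag = bool(ro_compat & RO_COMPAT_LARGE_FILE)
    return {
        "rev_level": rev_level,
        "large_file": has_flag,
        # Both conditions named in the commit message must hold.
        "large_files_allowed": rev_level >= EXT2_DYNAMIC_REV and has_flag,
    }
```

Feeding it the first 2048 bytes of the device or of an image file is
enough; it only reads and never touches the filesystem.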
> I doubt this patch will apply cleanly as my tree has diverged from FreeBSD in
> a non-compatible way quite a while ago -- but it should give you a start:

I prefer tjr's version.

> Index: ext2_inode_cnv.c
> ===================================================================
> RCS file: /cvsroot/ext2fsx/src/gnu/ext2fs/ext2_inode_cnv.c,v
> retrieving revision 1.6
> retrieving revision 1.7
> diff -u -b -r1.6 -r1.7
> --- ext2_inode_cnv.c 3 May 2003 23:54:39 -0000 1.6
> +++ ext2_inode_cnv.c 9 Jul 2003 23:09:24 -0000 1.7
> ...
> @@ -182,6 +214,31 @@
>      raw_inode->i_uid_high = 0;
>      raw_inode->i_gid_high = 0;
>      }*/
> +    if (S_ISREG(ip->i_mode)) {
> +        ei->i_size_high = cpu_to_le32(ip->i_size >> 32);
> +        if (ip->i_size > 0x7fffffffULL) {
> +            struct ext2_sb_info *sb = ip->i_e2fs;
> +            if (!EXT2_HAS_RO_COMPAT_FEATURE(sb,
> +                EXT2_FEATURE_RO_COMPAT_LARGE_FILE) ||
> +                EXT2_SB(sb)->s_es->s_rev_level == cpu_to_le32 (EXT2_GOOD_OLD_REV)) {
> +                /* First large file, add the flag to the superblock. */

Compatibility flags shouldn't be forced on IMO. Linux does it for this
flag, but this is a bug IMO. It breaks subsequent remounting r/w on
old or other kernels that don't support large files.

Changes are also needed to at least ext2_fs.h (to indicate that the
kernel supports large files), ext2_vfsops.c (to set fs_maxfilesize
according to the compatibility flag), and ext2_readwrite.c (to actually
use the ifdefed out code that checks fs_maxfilesize, and fix minor
bitrot in that code).

> On Oct 30, 2005, at 11:37 AM, Ivan Voras wrote:
>
>> I recently tried to use ext2 on FreeBSD but have decided not to when I saw
>> that the support for large files is missing (and went with msdosfs
>> instead).

msdosfs is physically incapable of supporting large files. Its maximum
file size is the constant 0xffffffff.

>> Now I accidentally noticed that large_file support is present in latest
>> NetBSD (and maybe OpenBSD). Is anyone interested in porting the support to
>> FreeBSD? :)

I prefer FreeBSD's version.

NetBSD got it 9 months ago, only a year after FreeBSD. It refuses to
create files larger than 2G-1 if the ext2fs rev number is old, and says
in a comment that Linux silently upgrades the rev number. It silently
clobbers the compat flag like Linux. Someone has an off-by-power-of-2
error -- the corresponding limit in FreeBSD is 4G-1.

Bruce

From owner-freebsd-fs@FreeBSD.ORG Mon Oct 31 15:24:06 2005
Date: Mon, 31 Oct 2005 16:17:47 +0100 (CET)
From: Ivan Voras
To: Bruce Evans
Cc: freebsd-fs@freebsd.org
Subject: Re: ext2 large_file
Message-ID: <20051031160354.G67271@geri.cc.fer.hr>
In-Reply-To: <20051031191139.J38757@delplex.bde.org>
References: <20051030183340.B19470@geri.cc.fer.hr> <46D894BD-16E0-4CBA-B40A-EEBAAC2547D2@classicalguitar.net> <20051031191139.J38757@delplex.bde.org>

On Mon, 31 Oct 2005, Bruce Evans wrote:

> tjr implemented it in FreeBSD almost 2 years ago:

It doesn't work, or something else is wrong. This happens on a freshly
created ext2fs:

> dd if=/dev/zero of=big_file bs=1m count=2500
dd: big_file: File too large
2048+0 records in
2047+0 records out
2146435072 bytes transferred in 100.387067 secs (21381590 bytes/sec)
> l
total 2098180
-rw-r--r-- 1 ivoras wheel 2146435072 Oct 31 16:09 big_file

(btw: the transfer rate is also somewhat bad: 50% of CPU time was taken,
~25% in sys, ~25% in interrupts. This is a UP machine. I think this is
because ext2 support seems to divide I/O requests into 4KB chunks :( )

Here's what dumpe2fs says:

Filesystem volume name:   walker_ext2
Last mounted on:
Filesystem UUID:          9a920c62-05f6-4631-8c90-30af2c63d5df
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      filetype sparse_super
Default mount options:    bsdgroups
Filesystem state:         not clean
Errors behavior:          Continue
Filesystem OS type:       FreeBSD
Inode count:              4889248
Block count:              9769520
Reserved block count:     488476
Free blocks:              9091534
Free inodes:              4889235
First block:              0
Block size:               4096
Fragment size:            4096
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         16352
Inode blocks per group:   511
Filesystem created:       Mon Oct 31 15:57:15 2005
Last mount time:          n/a
Last write time:          Mon Oct 31 16:09:16 2005
Mount count:              0
Maximum mount count:      26
Last checked:             Mon Oct 31 15:57:15 2005
Check interval:           15552000 (6 months)
Next check after:         Sat Apr 29 16:57:15 2006
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group wheel)
First inode:              11
Inode size:               128
Default directory hash:   tea
Directory Hash Seed:      8d6e7f0b-37f2-45bd-8d65-fc515046d7b6

(It's not clean because it's mounted. I set the bsdgroups option
myself with tune2fs.)

> Compatibility flags shouldn't be forced on IMO.
> Linux does it for this flag, but this is a bug IMO. It breaks subsequent
> remounting r/w on old or other kernels that don't support large files.

So, how do I set the flag? The man pages for tune2fs and mke2fs don't
mention the large_file option. Is there some other utility that does this?

>>> I recently tried to use ext2 on FreeBSD but have decided not to when I saw
>>> that the support for large files is missing (and went with msdosfs
>>> instead).
>
> msdosfs is physically incapable of supporting large files. Its maximum
> file size is the constant 0xffffffff.

Yes, I should have said "larger" files :) Current ext2 support in
FreeBSD is limited to 2GB files, while 4GB is enough for me for now.

> NetBSD got it 9 months ago, only a year after FreeBSD. It refuses to
> create files larger than 2G-1 if the ext2fs rev number is old, and says
> in a comment that Linux silently upgrades the rev number. It silently
> clobbers the compat flag like Linux. Someone has an off-by-power-of-2
> error -- the corresponding limit in FreeBSD is 4G-1.

I just tried it - the limit is 2GB on FreeBSD.

So it seems to boil down to this: FreeBSD's ext2 support still cannot
handle large files?

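The dd numbers reported earlier in this message fit a 2G-1 limit rather
than a 4G-1 one. A small sanity check in Python, plain arithmetic with
nothing FreeBSD-specific in it. The helper name is made up, and it assumes
a write that would cross the limit fails as a whole instead of being
truncated, which is what the "2047+0 records out" line suggests:

```python
MIB = 1 << 20  # dd's bs=1m


def full_blocks_written(max_file_size: int, block_size: int, count: int) -> int:
    """Whole blocks a sequential writer completes before the next block
    would push the file past max_file_size (hypothetical helper)."""
    return min(count, max_file_size // block_size)


LIMIT_2G = (1 << 31) - 1   # 2G-1
LIMIT_4G = (1 << 32) - 1   # 4G-1

blocks = full_blocks_written(LIMIT_2G, MIB, 2500)
print(blocks, blocks * MIB)                      # 2047 2146435072
print(full_blocks_written(LIMIT_4G, MIB, 2500))  # 2500
```

With a 2G-1 limit the writer stops after 2047 blocks, i.e. 2146435072
bytes, exactly what dd printed. With a 4G-1 limit all 2500 blocks would
have gone through.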
From owner-freebsd-fs@FreeBSD.ORG Mon Oct 31 15:57:08 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 98E3416A41F for ; Mon, 31 Oct 2005 15:57:08 +0000 (GMT) (envelope-from ivoras@fer.hr) Received: from geri.cc.fer.hr (geri.cc.fer.hr [161.53.72.121]) by mx1.FreeBSD.org (Postfix) with ESMTP id D9F7E43D49 for ; Mon, 31 Oct 2005 15:57:07 +0000 (GMT) (envelope-from ivoras@fer.hr) Received: from geri.cc.fer.hr (localhost.cc.fer.hr [127.0.0.1]) by geri.cc.fer.hr (8.13.4/8.13.1) with ESMTP id j9VFopIl067597; Mon, 31 Oct 2005 16:50:51 +0100 (CET) (envelope-from ivoras@fer.hr) Received: from localhost (ivoras@localhost) by geri.cc.fer.hr (8.13.4/8.13.1/Submit) with ESMTP id j9VFoox9067594; Mon, 31 Oct 2005 16:50:51 +0100 (CET) (envelope-from ivoras@fer.hr) X-Authentication-Warning: geri.cc.fer.hr: ivoras owned process doing -bs Date: Mon, 31 Oct 2005 16:50:50 +0100 (CET) From: Ivan Voras Sender: ivoras@geri.cc.fer.hr To: Bruce Evans In-Reply-To: <20051031191139.J38757@delplex.bde.org> Message-ID: <20051031164856.P67549@geri.cc.fer.hr> References: <20051030183340.B19470@geri.cc.fer.hr> <46D894BD-16E0-4CBA-B40A-EEBAAC2547D2@classicalguitar.net> <20051031191139.J38757@delplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: ext2 large_file X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 31 Oct 2005 15:57:08 -0000 Btw. I'm using 6.0-RC1 from a few days ago, and it didn't work at the time of BETA5 either. 
From owner-freebsd-fs@FreeBSD.ORG Mon Oct 31 18:01:41 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9488A16A41F for ; Mon, 31 Oct 2005 18:01:41 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailout2.pacific.net.au (mailout2.pacific.net.au [61.8.0.115]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9BA2643D46 for ; Mon, 31 Oct 2005 18:01:40 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy1.pacific.net.au (mailproxy1.pacific.net.au [61.8.0.86]) by mailout2.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id j9VI1Vsl004556; Tue, 1 Nov 2005 05:01:31 +1100 Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailproxy1.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id j9VI1RSl013227; Tue, 1 Nov 2005 05:01:28 +1100 Date: Tue, 1 Nov 2005 05:01:27 +1100 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Ivan Voras In-Reply-To: <20051031160354.G67271@geri.cc.fer.hr> Message-ID: <20051101042444.K40281@delplex.bde.org> References: <20051030183340.B19470@geri.cc.fer.hr> <46D894BD-16E0-4CBA-B40A-EEBAAC2547D2@classicalguitar.net> <20051031191139.J38757@delplex.bde.org> <20051031160354.G67271@geri.cc.fer.hr> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: ext2 large_file X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 31 Oct 2005 18:01:41 -0000 On Mon, 31 Oct 2005, Ivan Voras wrote: > On Mon, 31 Oct 2005, Bruce Evans wrote: > >> tjr implemented it in FreeBSD almost 2 years ago: > > It doesn't, or something other is wrong. 
This happens on a freshly created
> ext2fs:
>
>> dd if=/dev/zero of=big_file bs=1m count=2500
> dd: big_file: File too large
> 2048+0 records in
> 2047+0 records out
> 2146435072 bytes transferred in 100.387067 secs (21381590 bytes/sec)
>
>> l
> total 2098180
> -rw-r--r-- 1 ivoras wheel 2146435072 Oct 31 16:09 big_file
>
>
> (btw: the transfer rate is also somewhat bad: 50% of CPU time was taken, ~25%
> in sys, ~25% in interrupts. This is UP machine. I think this is because ext2
> support seems to divide I/O requests into 4KB chunks :( )

Ext2fs is less efficient than ffs, but in my experience only by a factor of 2. I see it having 10-15% overhead but this is with a transfer rate of only 8MB/sec.

> Here's what dumpe2fs says:
>
> Filesystem volume name: walker_ext2
> Last mounted on:
> Filesystem UUID: 9a920c62-05f6-4631-8c90-30af2c63d5df
> Filesystem magic number: 0xEF53
> Filesystem revision #: 1 (dynamic)
> Filesystem features: filetype sparse_super

I think the rev is high enough, but the feature list would say "large_file" if the file system had that feature.

>> Compatibility flags shouldn't be forced on IMO. Linux does it for this
>> flag, but this is a bug IMO. It breaks subsequent remounting r/w on
>> old or other kernels that don't support large files.
>
> So, how to set the flag? man pages for tune2fs and mke2fs don't mention the
> large_file option. Is there some other utility that does this?

Apparently tune2fs doesn't do it because it expects the kernel to. e2fsck seems to have some support for it, at least in e2fsprogs-1.27, but I think that support is limited to setting the flag if there is a large file and maybe unsetting the flag if there isn't.

>> msdosfs is physically incapable of supporting large files. Its maximum
>> file size is the constant 0xffffffff.
>
> Yes, I should have said "larger" files :) Current ext2 support in FreeBSD is
> limited to 2GB files, while 4GB is enough for me for now.
> >> NetBSD got it 9 months ago, only a year after FreeBSD. It refuses to >> create files larger than 2G-1 if the ext2fs rev number is old, and says >> in a comment that Linux silently upgrades the rev number. It silently >> clobbers the compat flag like Linux. Someone has an off-by-power-of-2 >> error -- the corresponding limit in FreeBSD is 4G-1. > > I just tried it - the limit is 2GB on FreeBSD. Yes. I misread the constant. > So, it seems that it boils down to that FreeBSD's ext2 support still cannot > handle large files? Unless the file system already has or had a large file. Possible workarounds: (1) Boot Linux and create a large file. Hopefully e2fsck only sets the flag so you only have to do this once. (2) Edit the limit in FreeBSD (grep for maxfilesize), create a large file, run e2fsck to set the flag, and change the limit back for safety. (3) Modify tune2fs to support setting the flag. (4) Use a disk editor or fs debugger to set the flag. debugfs in e2fsprogs should be able to do this. However, it doesn't seem to be in the FreeBSD package and my old version of it seems to have been broken by removal of block devices (most e2fs utilities want block devices but some barely work without them). 
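
For anyone wanting to check the flag without e2fsprogs, here is a minimal sketch in C of testing for large_file in a raw superblock image. It assumes the usual ext2 on-disk layout (superblock at byte 1024 of the device, little-endian fields, s_magic at offset 56, s_rev_level at 76, s_feature_ro_compat at 100, large_file bit 0x0002). The function and macro names are made up for illustration and are not taken from the FreeBSD sources; reading the 1024 bytes from the device is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed ext2 superblock layout; names are illustrative only. */
#define TOY_EXT2_SB_SIZE        1024
#define TOY_EXT2_MAGIC          0xEF53
#define TOY_OFF_MAGIC           56      /* s_magic, 16-bit little-endian */
#define TOY_OFF_REV_LEVEL       76      /* s_rev_level, 32-bit little-endian */
#define TOY_OFF_RO_COMPAT       100     /* s_feature_ro_compat, 32-bit LE */
#define TOY_RO_COMPAT_LARGEFILE 0x0002  /* the "large_file" feature bit */

static uint16_t
toy_le16(const uint8_t *p)
{
	return ((uint16_t)(p[0] | (p[1] << 8)));
}

static uint32_t
toy_le32(const uint8_t *p)
{
	return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/*
 * Returns 1 if the superblock image has large_file set, 0 if it does
 * not, and -1 if the buffer is too short or the magic number is wrong.
 * Revision 0 file systems have no feature fields, so they report 0.
 */
int
toy_ext2_has_large_file(const uint8_t *sb, size_t len)
{
	assert(sb != NULL);
	if (len < TOY_EXT2_SB_SIZE)
		return (-1);
	if (toy_le16(sb + TOY_OFF_MAGIC) != TOY_EXT2_MAGIC)
		return (-1);
	if (toy_le32(sb + TOY_OFF_REV_LEVEL) < 1)
		return (0);
	return ((toy_le32(sb + TOY_OFF_RO_COMPAT) &
	    TOY_RO_COMPAT_LARGEFILE) != 0);
}
```

This only reads the bit. Setting it is what workarounds (3) and (4) are about, and a real tool would also have to consider the backup superblocks.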
Bruce From owner-freebsd-fs@FreeBSD.ORG Mon Oct 31 19:59:08 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D500316A420 for ; Mon, 31 Oct 2005 19:59:08 +0000 (GMT) (envelope-from ivoras@fer.hr) Received: from geri.cc.fer.hr (geri.cc.fer.hr [161.53.72.121]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0549843D48 for ; Mon, 31 Oct 2005 19:59:07 +0000 (GMT) (envelope-from ivoras@fer.hr) Received: from geri.cc.fer.hr (localhost.cc.fer.hr [127.0.0.1]) by geri.cc.fer.hr (8.13.4/8.13.1) with ESMTP id j9VJqoDO069001; Mon, 31 Oct 2005 20:52:50 +0100 (CET) (envelope-from ivoras@fer.hr) Received: from localhost (ivoras@localhost) by geri.cc.fer.hr (8.13.4/8.13.1/Submit) with ESMTP id j9VJqmxu068998; Mon, 31 Oct 2005 20:52:49 +0100 (CET) (envelope-from ivoras@fer.hr) X-Authentication-Warning: geri.cc.fer.hr: ivoras owned process doing -bs Date: Mon, 31 Oct 2005 20:52:48 +0100 (CET) From: Ivan Voras Sender: ivoras@geri.cc.fer.hr To: Bruce Evans In-Reply-To: <20051101042444.K40281@delplex.bde.org> Message-ID: <20051031201719.S68800@geri.cc.fer.hr> References: <20051030183340.B19470@geri.cc.fer.hr> <46D894BD-16E0-4CBA-B40A-EEBAAC2547D2@classicalguitar.net> <20051031191139.J38757@delplex.bde.org> <20051031160354.G67271@geri.cc.fer.hr> <20051101042444.K40281@delplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: ext2 large_file X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 31 Oct 2005 19:59:08 -0000 On Tue, 1 Nov 2005, Bruce Evans wrote: > Unless the file system already has or had a large file. Possible > workarounds: > (1) Boot Linux and create a large file. 
Hopefully e2fsck only sets the
> flag so you only have to do this once.

I did this but e2fsck doesn't set the flag. Fortunately, I found out that e2fsprogs includes the "debugfs" utility, with which I manually set the flag.

It works now!

ext2 filesystem access is still a bit slower than with WindowsXP with ext2+ext3 IFS driver (~20.5MB/s vs ~25MB/s). The reason I brought up this subject is that I'm experimenting with using ext2 instead of msdosfs for exchanging data between the systems in a dual-boot configuration. Because ext2 large_file support works now, I think it's much safer and even somewhat faster (less fragmentation! FreeBSD's msdosfs looks like it's pessimized for fragmentation!) to use instead.

I propose this patch to the mount_ext2fs manual page:

--- mount_ext2fs.8_old Mon Oct 31 20:43:17 2005
+++ mount_ext2fs.8 Mon Oct 31 20:56:45 2005
@@ -60,6 +60,21 @@
 .Xr mount 8
 man page for possible options and their meanings.
 .El
+.Sh BUGS
+Unlike the original Linux implementation, the "large_file"
+flag is not set automatically when first file larger than
+2GB is created. Instead,
+.Xr debugfs 8
+utility from the e2fsprogs port must be used to manually
+set the flag with `feature large_file` command. Other than
+this, large files are fully supported.
+
+Support for journaling (ext3) is missing, and filesystems that
+have it enabled are treated as plain ext2 filesystems.
+This means that
+.Xr e2fsck 8
+will have to be used to repair the journal when the filesystem
+is to be used in Linux.
.Sh SEE ALSO .Xr mount 2 , .Xr unmount 2 , From owner-freebsd-fs@FreeBSD.ORG Tue Nov 1 04:31:45 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id AD92416A41F for ; Tue, 1 Nov 2005 04:31:45 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailout1.pacific.net.au (mailout1.pacific.net.au [61.8.0.84]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4468443D45 for ; Tue, 1 Nov 2005 04:31:44 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy1.pacific.net.au (mailproxy1.pacific.net.au [61.8.0.86]) by mailout1.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id jA14Vb4l028419; Tue, 1 Nov 2005 15:31:37 +1100 Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailproxy1.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id jA14VYk9020136; Tue, 1 Nov 2005 15:31:34 +1100 Date: Tue, 1 Nov 2005 15:31:33 +1100 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Ivan Voras In-Reply-To: <20051031201719.S68800@geri.cc.fer.hr> Message-ID: <20051101141726.W41623@delplex.bde.org> References: <20051030183340.B19470@geri.cc.fer.hr> <46D894BD-16E0-4CBA-B40A-EEBAAC2547D2@classicalguitar.net> <20051031191139.J38757@delplex.bde.org> <20051031160354.G67271@geri.cc.fer.hr> <20051101042444.K40281@delplex.bde.org> <20051031201719.S68800@geri.cc.fer.hr> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: ext2 large_file X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 Nov 2005 04:31:45 -0000 On Mon, 31 Oct 2005, Ivan Voras wrote: > On Tue, 1 Nov 2005, Bruce Evans wrote: > >> Unless the file system already has or had a large file. 
Possible
>> workarounds:
>> (1) Boot Linux and create a large file. Hopefully e2fsck only sets the
>> flag so you only have to do this once.
>
> I did this but e2fsck doesn't set the flag. Fortunately, I found out that
> e2fsprogs includes "debugfs" utility with which I manually set the flag.
>
> It works now!

Does e2fsck report the problem?

> ext2 filesystem access is still a bit slower than with WindowsXP with
> ext2+ext3 IFS driver (~20.5MB/s vs ~25MB/s). The reason I brought up this
> subject is that I'm experimenting with using ext2 instead of msdosfs for
> exchanging data between the systems in dual-boot configuration. Because ext2
> large_file support works now, I think it's much more safer and even somewhat
> faster (less fragmentation! FreeBSD's msdosfs looks like it's
> pessimized for fragmentation!) to use instead.

Strangely enough, I first got interested in ext2fs under FreeBSD because testing showed that it was faster than ffs in one configuration, and this turned out to be mostly because of fragmentation:

- ext2fs under FreeBSD has a primitive block allocator that will give lots
  of fragmentation over the long term but is almost optimal in simple tests.
  It doesn't really understand cylinder groups and just allocates the next
  free block, so in simple tests that create files in one process and never
  delete files, the layout is almost optimal. In particular, the layout is
  good after copying a large directory tree to a new file system. You can
  see evidence of this using dumpe2fs -- it shows the first few cylinder
  groups full and the rest unused, where Linux would use all the groups
  fairly evenly.
- ffs at the time had a not very good block allocator that optimized for
  fragmentation of directories (optimized for this == pessimized for
  performance), so it gave very poor performance for large directory trees
  with small files. My test was with the Linux src tree. The FreeBSD ports
  tree would be pessimized more. This has been fixed.
  Now the problems in ffs's block allocator are more local.
- my test drive at the time (1997?) didn't have much caching, and this
  interacted badly with ffs's block allocator. Even for sequentially
  created files, ffs likes to seek backwards to fill in fragments with
  small files, and the drive's cache size or caching algorithm apparently
  didn't like these backwards seeks although they are small. ffs still
  does this, but drives' caches are now large enough for another physical
  access to usually not be needed to get back to the small files.

ffs's other known remaining allocation problems involve not allocating indirect blocks sequentially; this problem, or something related, is especially large for soft updates -- soft updates takes advantage of its delayed block allocation to put indirect blocks further away. This used to cause a 10% performance penalty for a freshly laid out copy of /usr/src, but now with bigger drives and cache it is less noticeable.

I use the following to break the optimization for fragmentation in msdosfs:

% Index: msdosfs_fat.c
% ===================================================================
% RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_fat.c,v
% retrieving revision 1.35
% diff -u -2 -r1.35 msdosfs_fat.c
% --- msdosfs_fat.c 29 Dec 2003 11:59:05 -0000 1.35
% +++ msdosfs_fat.c 26 Apr 2004 05:03:55 -0000
% @@ -68,4 +68,6 @@
%  #include
%
% +static int fat_allocpolicy = 1;
% +
%  /*
%   * Fat cache stats.
% @@ -759,4 +761,14 @@
%      if (got)
%          *got = count;
% +
% +    /*
% +     * For internal use, cluster pmp->pm_nxtfree is not necessarily free
% +     * but is the next place to look for a free cluster. Perhaps this
% +     * is the correct thing to pass to the next mount too.
% +     */
% +    pmp->pm_nxtfree = start + count;
% +    if (pmp->pm_nxtfree > pmp->pm_maxcluster)
% +        pmp->pm_nxtfree = CLUST_FIRST;
% +
%      return (0);
%  }
% @@ -796,9 +808,30 @@
%      len = 0;
%
% -    /*
% -     * Start at a (pseudo) random place to maximize cluster runs
% -     * under multiple writers.
% -     */
% -    newst = random() % (pmp->pm_maxcluster + 1);
% +    switch (fat_allocpolicy) {
% +    case 0:
% +        newst = start;
% +        break;
% +    case 1:
% +        newst = pmp->pm_nxtfree;
% +        break;
% +    case 5:
% +        newst = (start == 0 ? pmp->pm_nxtfree : start);
% +        break;
% +    case 2:
% +        /* FALLTHROUGH */
% +    case 3:
% +        if (start != 0) {
% +            newst = fat_allocpolicy == 2 ? start : pmp->pm_nxtfree;
% +            break;
% +        }
% +        /* FALLTHROUGH */
% +    default:
% +        /*
% +         * Start at a (pseudo) random place to maximize cluster runs
% +         * under multiple writers.
% +         */
% +        newst = random() % (pmp->pm_maxcluster + 1);
% +    }
% +
%      foundl = 0;
%

Only fat_allocpolicy == 1 and its case in the switch statement are needed here. The other cases are for testing how bad alternative simple allocators are. Policy 1 gives the same primitive sequential allocation as in Linux -- this works well for copying but not so well when there is lots of file system activity with multiple concurrent processes. According to postmark, it is still much better than random allocation with multiple processes (but more like 2 to 4 times better than 10 times).

The fix for advancing pmp->pm_nxtfree might not be needed. IIRC, it is mostly part of a fix for passing pm_nxtfree across mounts.
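
To see why policy 1 gives a good layout for a simple copy, here is a small userland sketch in C. It is a toy model, not the msdosfs code: the structure, the function names, the 4096-cluster size and the absence of reserved clusters are all assumptions made for illustration. It allocates clusters by scanning forward from either a next-free hint (like policy 1) or a pseudo-random start (like the default case) and returns how many separate runs the file ended up in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_NCLUSTERS 4096u     /* arbitrary toy file system size */

/* Toy stand-in for the FAT: one byte per cluster plus a next-free hint. */
struct toy_fat {
	unsigned char used[TOY_NCLUSTERS];  /* 1 = cluster allocated */
	size_t nxtfree;                     /* plays the role of pm_nxtfree */
};

enum toy_policy { TOY_POLICY_NEXTFREE, TOY_POLICY_RANDOM };

static size_t
toy_count_free(const struct toy_fat *fat)
{
	size_t i, n = 0;

	for (i = 0; i < TOY_NCLUSTERS; i++)
		if (!fat->used[i])
			n++;
	return (n);
}

/*
 * Allocate `count` clusters, scanning forward (with wraparound) from a
 * start chosen by `policy`.  Returns the number of contiguous runs the
 * allocation was split into, or 0 if there is not enough free space.
 */
size_t
toy_alloc(struct toy_fat *fat, size_t count, enum toy_policy policy)
{
	size_t start, i, cn, prev = 0, got = 0, runs = 0;
	int have_prev = 0;

	assert(fat != NULL && count > 0);
	if (toy_count_free(fat) < count)
		return (0);
	start = (policy == TOY_POLICY_NEXTFREE) ?
	    fat->nxtfree : (size_t)rand() % TOY_NCLUSTERS;
	for (i = 0; i < TOY_NCLUSTERS && got < count; i++) {
		cn = (start + i) % TOY_NCLUSTERS;
		if (fat->used[cn])
			continue;
		fat->used[cn] = 1;
		if (!have_prev || cn != prev + 1)
			runs++;         /* a gap or wrap starts a new run */
		prev = cn;
		have_prev = 1;
		got++;
	}
	/* Advance the hint past what was just allocated. */
	fat->nxtfree = (prev + 1) % TOY_NCLUSTERS;
	return (runs);
}
```

On a zeroed toy_fat, a single writer using TOY_POLICY_NEXTFREE gets every file in one run, one after another. With TOY_POLICY_RANDOM each file is still usually contiguous at first, but the files are scattered over the whole device, and later files are split whenever they run into earlier ones.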
With these and some other optimization for msdosfs, and optimizations and pessimizations for ext2fs, I get access times for a fresh copy of 75% of /usr/src (all that will fit in VMIO on a system with 1GB -- source always fully cached):

% bde-current writing to IBM IC35L060AVV207-0 h: 24483060 57512700
% tar = tar
% srcs = "contrib crypto lib sys" in /i/src
% ---
%
% ffs-16384-02048-1:
% tarcp /f srcs:              50.93 real   0.22 user   6.68 sys
% tar cf /dev/zero srcs:      13.63 real   0.17 user   2.35 sys
% ffs-16384-02048-2:
% tarcp /f srcs:              45.15 real   0.27 user   6.71 sys
% tar cf /dev/zero srcs:      14.99 real   0.20 user   2.33 sys
% ffs-16384-02048-as-1:
% tarcp /f srcs:              21.91 real   0.38 user   4.54 sys
% tar cf /dev/zero srcs:      13.82 real   0.21 user   2.30 sys
% ffs-16384-02048-as-2:
% tarcp /f srcs:              21.08 real   0.34 user   4.64 sys
% tar cf /dev/zero srcs:      15.24 real   0.15 user   2.41 sys
% ffs-16384-02048-su-1:
% tarcp /f srcs:              42.25 real   0.37 user   4.87 sys
% tar cf /dev/zero srcs:      14.13 real   0.15 user   2.37 sys
% ffs-16384-02048-su-2:
% tarcp /f srcs:              47.76 real   0.34 user   4.93 sys
% tar cf /dev/zero srcs:      16.25 real   0.16 user   2.38 sys
%
% ext2fs-1024-1024:
% tarcp /f srcs:             108.68 real   0.30 user   7.99 sys
% tar cf /dev/zero srcs:      41.15 real   0.17 user   5.63 sys
% ext2fs-1024-1024-as:
% tarcp /f srcs:              81.10 real   0.29 user   7.03 sys
% tar cf /dev/zero srcs:      41.57 real   0.19 user   5.62 sys
% ext2fs-4096-4096:
% tarcp /f srcs:             107.48 real   0.32 user   6.75 sys
% tar cf /dev/zero srcs:      27.37 real   0.08 user   3.00 sys
% ext2fs-4096-4096-as:
% tarcp /f srcs:              61.87 real   0.34 user   5.72 sys
% tar cf /dev/zero srcs:      27.33 real   0.16 user   2.93 sys
%
% msdosfs-4096-4096:
% tarcp /f srcs:              41.53 real   0.48 user   8.69 sys
% tar cf /dev/zero srcs:      16.94 real   0.18 user   4.40 sys

Here the first 2 numbers attached to the fs name are the block and fragment size; "as" means async mount and "su" means soft updates; the final number for ffs is for ffs1 vs ffs2.
This shows the following points:

- soft updates (in this test) is now not much faster than ordinary
  (-nosync -noasync) mounts and is much slower than async mounts. It
  used to be only 1.5 times slower than async mounts. This test was
  run when bufdaemon was buggier than it is now and showed bufdaemon
  behaving badly under pressure, with only soft updates creating enough
  pressure to cause problems.
- soft updates is still about 5% slower for readback. My kernel has
  changes to allocate indirect blocks sequentially, but only for ffs1
  and I'm not sure if the fixes work for soft updates.
- msdosfs is competitive with non-async ffs provided it uses clustering
  and VMIO as in my version. However, it cheats to get this -- its most
  important metadata, namely its FAT, is updated using delayed writes
  unless you mount with -sync. -sync is thus needed to get near the
  same robustness as the default for ffs.
- ext2fs is about twice as slow as the other 2 (worse for non-async writes).
  For async writes, this is partly because -async is not fully implemented.
  It is mostly because the block size is very small, and although this
  only necessarily costs extra CPU to do clustering, FreeBSD is optimized
  for ffs's default block size and does pessimal things with ext2fs's
  smaller sizes.
- Both msdosfs and ffs are as efficient as can be hoped for read-back:
  13 to 16 seconds for reading 340MB of small files is 20-25MB/sec. This
  is on a drive with a max transfer rate of 45 or 55 MB/sec and not very
  fast (normal ATA 7200 rpm) seeks. On active (fragmented) file systems
  you have to be lucky to get half of that. On my active /usr, reading
  the same files takes 49 seconds. This is on a drive with a max transfer
  rate of 36MB/sec.

> I propose this patch to the mount_ext2fs manual page:

Someone else will have to look after this. You might have to file a PR so that it doesn't get lost.
Bruce From owner-freebsd-fs@FreeBSD.ORG Tue Nov 1 11:13:08 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 48BB716A41F for ; Tue, 1 Nov 2005 11:13:08 +0000 (GMT) (envelope-from ivoras@fer.hr) Received: from geri.cc.fer.hr (geri.cc.fer.hr [161.53.72.121]) by mx1.FreeBSD.org (Postfix) with ESMTP id 961E843D46 for ; Tue, 1 Nov 2005 11:13:07 +0000 (GMT) (envelope-from ivoras@fer.hr) Received: from geri.cc.fer.hr (localhost.cc.fer.hr [127.0.0.1]) by geri.cc.fer.hr (8.13.4/8.13.1) with ESMTP id jA1B6iRq090707; Tue, 1 Nov 2005 12:06:44 +0100 (CET) (envelope-from ivoras@fer.hr) Received: from localhost (ivoras@localhost) by geri.cc.fer.hr (8.13.4/8.13.1/Submit) with ESMTP id jA1B6gpY090704; Tue, 1 Nov 2005 12:06:44 +0100 (CET) (envelope-from ivoras@fer.hr) X-Authentication-Warning: geri.cc.fer.hr: ivoras owned process doing -bs Date: Tue, 1 Nov 2005 12:06:42 +0100 (CET) From: Ivan Voras Sender: ivoras@geri.cc.fer.hr To: Bruce Evans In-Reply-To: <20051101141726.W41623@delplex.bde.org> Message-ID: <20051101114150.X90580@geri.cc.fer.hr> References: <20051030183340.B19470@geri.cc.fer.hr> <46D894BD-16E0-4CBA-B40A-EEBAAC2547D2@classicalguitar.net> <20051031191139.J38757@delplex.bde.org> <20051031160354.G67271@geri.cc.fer.hr> <20051101042444.K40281@delplex.bde.org> <20051031201719.S68800@geri.cc.fer.hr> <20051101141726.W41623@delplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: ext2 large_file X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 Nov 2005 11:13:08 -0000 On Tue, 1 Nov 2005, Bruce Evans wrote: > On Mon, 31 Oct 2005, Ivan Voras wrote: >> I did this but e2fsck doesn't set the flag. 
Fortunately, I found out that
>> e2fsprogs includes "debugfs" utility with which I manually set the flag.
>
> Does e2fsck report the problem?

Surprisingly, no. It ran normally (as far as I can tell) without reporting anything. This was on an unmounted file system that contained one large (3GB) file created (actually extended from a 2G file) in the Windows ext2 driver, which also didn't set the flag (though its documentation said it would). Maybe there are undocumented "features" regarding extending already existing files.

> Strangely enough, I first got interested in ext2fs under FreeBSD because
> testing showed that it was faster than ffs in one configuration, and this
> turned out to be mostly because of fragmentation:

Very nice explanation, thanks!

> - ext2fs under FreeBSD has a primitive block allocator that will give lots
> and never delete files, the layout is almost optimal. In particular,
> the layout is good after copying a large directory tree to a new file

This is what I was doing with msdosfs, and I happened to look at the defragmenter - the display looked as if the files had been fragmented at random. Now I know why :)

> first few cylinder groups full and the rest unused, where Linux would
> use all the groups fairly evenly.

Not so good. I suppose this is not easily fixable?

> - soft updates (in this test) is now not much faster than ordinary
> (-nosync -noasync) mounts and is much slower than async mounts. It
> used to be only 1.5 times slower than async mounts. This test was

I've noticed this too.

> - ext2fs is about twice as slow as the other 2 (worse for non-async writes).
> It is mostly because the block size is very small, and although this
> only necessarily costs extra CPU to do clustering, FreeBSD is optimized
> for ffs's default block size and does pessimal things with ext2fs's
> smaller sizes.

These effects are also very noticeable with NTFS (default block size=4096 for all/most partition sizes) and FAT32 on smaller drives (where bs=4096 fits FAT in 8MB).
From owner-freebsd-fs@FreeBSD.ORG Wed Nov 2 14:18:00 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1E37716A41F for ; Wed, 2 Nov 2005 14:18:00 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailout2.pacific.net.au (mailout2.pacific.net.au [61.8.0.115]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6EE2843D45 for ; Wed, 2 Nov 2005 14:17:59 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.0.87]) by mailout2.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id jA2EHrnI016238; Thu, 3 Nov 2005 01:17:53 +1100 Received: from epsplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailproxy2.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id jA2EHpW4015639; Thu, 3 Nov 2005 01:17:51 +1100 Date: Thu, 3 Nov 2005 01:17:51 +1100 (EST) From: Bruce Evans X-X-Sender: bde@epsplex.bde.org To: Ivan Voras In-Reply-To: <20051101114150.X90580@geri.cc.fer.hr> Message-ID: <20051103011024.R7399@epsplex.bde.org> References: <20051030183340.B19470@geri.cc.fer.hr> <46D894BD-16E0-4CBA-B40A-EEBAAC2547D2@classicalguitar.net> <20051031191139.J38757@delplex.bde.org> <20051031160354.G67271@geri.cc.fer.hr> <20051101042444.K40281@delplex.bde.org> <20051031201719.S68800@geri.cc.fer.hr> <20051101141726.W41623@delplex.bde.org> <20051101114150.X90580@geri.cc.fer.hr> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: ext2 large_file X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 Nov 2005 14:18:00 -0000 On Tue, 1 Nov 2005, Ivan Voras wrote: > On Tue, 1 Nov 2005, Bruce Evans wrote: > >> On Mon, 31 Oct 2005, Ivan Voras wrote: > [for ext2fs under FreeBSD] >> 
first few cylinder groups full and the rest unused, where Linux would >> use all the groups fairly evenly. > > Not so good. I suppose this is not easaly fixable? Not very easily, but not very uneasily either. We can either obtain a block allocator from NetBSD (I think it is similar to ffs's, but specialized to ext2fs), or use the current Linux allocator (I think we have an old version already -- ISTR a comment saying that it is used for checking only). >> - ext2fs is about twice as slow as the other 2 (worse for non-async >> writes). >> It is mostly because the block size is very small, and although this >> only necessarily costs extra CPU to do clustering, FreeBSD is optimized >> for ffs's default block size and does pessimal things with ext2fs's >> smaller sizes. > > These effects are also very noticable with NTFS (default block size=4096 for > all/most partition sizes) and FAT32 on smaller drives (where bs=4096 fits FAT > in 8MB). 4K isn't too bad, except possibly on arches with a page size of 8K -- see my benchmark output for msdosfs. Clustering is certainly required to get good results for large files. Bruce From owner-freebsd-fs@FreeBSD.ORG Fri Nov 4 20:30:57 2005 Return-Path: X-Original-To: fs@freebsd.org Delivered-To: freebsd-fs@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3280916A41F for ; Fri, 4 Nov 2005 20:30:57 +0000 (GMT) (envelope-from cel@citi.umich.edu) Received: from citi.umich.edu (citi.umich.edu [141.211.133.111]) by mx1.FreeBSD.org (Postfix) with ESMTP id ABC4543D46 for ; Fri, 4 Nov 2005 20:30:56 +0000 (GMT) (envelope-from cel@citi.umich.edu) Received: from [141.211.133.33] (dexter.citi.umich.edu [141.211.133.33]) by citi.umich.edu (Postfix) with ESMTP id DAAA01BBAC; Fri, 4 Nov 2005 15:30:55 -0500 (EST) Message-ID: <436BC4FF.9090107@citi.umich.edu> Date: Fri, 04 Nov 2005 15:30:55 -0500 From: Chuck Lever Organization: Network Appliance, Inc. 
User-Agent: Mozilla Thunderbird 1.0.7-1.4.1 (X11/20050929) X-Accept-Language: en-us, en MIME-Version: 1.0 To: rick@snowhite.cis.uoguelph.ca Content-Type: multipart/mixed; boundary="------------070206020907030807060307" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: fs@freebsd.org Subject: NFS client reuses XID when server returns NFSERR_JUKEBOX X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: cel@citi.umich.edu List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Nov 2005 20:30:57 -0000 This is a multi-part message in MIME format. --------------070206020907030807060307 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit hi rick- (your name is in the source file, so i'm asking you first :^) i have an internal report that FreeBSD 5.3 (and by my examination, 6.0 too) has a bug around handling an EJUKEBOX from the server. RFC1813 states that the client should retransmit using a fresh XID if it receives NFSERR_JUKEBOX, but i see that the retry logic in sys/nfsclient/nfs_socket.c reuses the old XID. is the fix as easy as generating a new XID in the if {} clause that handles the NFSERR_TRYLATER error? 
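
To make the question concrete, here is a minimal sketch in C of what "generating a new XID" on a retry could look like. It assumes the request is held as a flat buffer whose first 32-bit big-endian word is the XID (as in an ONC RPC call message with no record mark in front), and it uses a simple counter as the XID source. All names here are hypothetical; the real code in nfs_socket.c works on mbuf chains and has its own XID generator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical XID source; the real client has its own generator. */
static uint32_t toy_xid_counter = 1;

static uint32_t
toy_new_xid(void)
{
	return (++toy_xid_counter);
}

/* Read the XID, assumed to be the first big-endian word of the message. */
uint32_t
toy_get_xid(const uint8_t *rpcmsg)
{
	return (((uint32_t)rpcmsg[0] << 24) | ((uint32_t)rpcmsg[1] << 16) |
	    ((uint32_t)rpcmsg[2] << 8) | (uint32_t)rpcmsg[3]);
}

/*
 * Called when the server answered with the "try later" (jukebox) error:
 * overwrite the XID in place so that the retransmission is a new RPC
 * transaction.  Returns the new XID so the caller can also update
 * whatever it uses to match replies to requests.
 */
uint32_t
toy_refresh_xid(uint8_t *rpcmsg, size_t len)
{
	uint32_t xid;

	assert(rpcmsg != NULL && len >= 4);
	xid = toy_new_xid();
	rpcmsg[0] = (uint8_t)(xid >> 24);
	rpcmsg[1] = (uint8_t)(xid >> 16);
	rpcmsg[2] = (uint8_t)(xid >> 8);
	rpcmsg[3] = (uint8_t)xid;
	return (xid);
}
```

The point of returning the new value is that the copy of the XID kept for reply matching has to change together with the one in the header, otherwise the reply to the retried request would never be matched.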
--------------070206020907030807060307-- From owner-freebsd-fs@FreeBSD.ORG Fri Nov 4 20:31:45 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 68C0916A41F for ; Fri, 4 Nov 2005 20:31:45 +0000 (GMT) (envelope-from user@dhp.com) Received: from shell.dhp.com (shell.dhp.com [199.245.105.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2D30743D45 for ; Fri, 4 Nov 2005 20:31:45 +0000 (GMT) (envelope-from user@dhp.com) Received: by shell.dhp.com (Postfix, from userid 896) id 646B43134B; Fri, 4 Nov 2005 15:31:44 -0500 (EST) Date: Fri, 4 Nov 2005 15:31:44 -0500 (EST) From: user To: freebsd-fs@freebsd.org Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Subject: UFS2 snapshots on large filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Nov 2005 20:31:45 -0000 Hello, Considering a PC server running FreeBSD with 4 400 GB hard drives attached to a hardware raid controller doing raid-5. So this will present itself to the OS as a 1.2TB filesystem. Any comments on taking one or multiple snapshots of a filesystem of this size ? Given current disk capacities, I would not exactly consider this 1.2TB filesystem a "large" one ... any comments on say ... a 6-8 TB filesystem and making one or more snapshots of it ? Assume they are marginally busy - perhaps a 5-10% data turnover per day... Thanks. 
From owner-freebsd-fs@FreeBSD.ORG Fri Nov 4 21:51:38 2005
Return-Path:
X-Original-To: fs@freebsd.org
Delivered-To: freebsd-fs@FreeBSD.ORG
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0877916A41F for ; Fri, 4 Nov 2005 21:51:38 +0000 (GMT) (envelope-from rick@snowhite.cis.uoguelph.ca)
Received: from dargo.cs.uoguelph.ca (dargo.cs.uoguelph.ca [131.104.96.159]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9991F43D45 for ; Fri, 4 Nov 2005 21:51:37 +0000 (GMT) (envelope-from rick@snowhite.cis.uoguelph.ca)
Received: from snowhite.cis.uoguelph.ca (snowhite.cis.uoguelph.ca [131.104.48.1]) by dargo.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id jA4LpV80000922; Fri, 4 Nov 2005 16:51:31 -0500
Received: (from rick@localhost) by snowhite.cis.uoguelph.ca (8.9.3/8.9.3) id QAA29932; Fri, 4 Nov 2005 16:52:32 -0500 (EST)
Date: Fri, 4 Nov 2005 16:52:32 -0500 (EST)
From: rick@snowhite.cis.uoguelph.ca
Message-Id: <200511042152.QAA29932@snowhite.cis.uoguelph.ca>
To: cel@citi.umich.edu
X-Scanned-By: MIMEDefang 2.52 on 131.104.96.159
Cc: fs@freebsd.org
Subject: nfs XID doesn't change after NFSERR_JUKEBOX
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 04 Nov 2005 21:51:38 -0000

Here is a patch that I think fixes the problem. It only got a 2 minute
mount test, so someone should test it more thoroughly before committing
it. (I think a similar patch is in NetBSD, but I can't seem to get onto
their site at this time.)

It is for FreeBSD5.4, so expect somewhat different source line #s.

My current client code recreates the RPC header on every retry, but
changing to that requires a lot more coding. (It's necessary for
RPCSEC_GSS, since a seq# in the RPC header has to change each retry and
then be re-checksummed.)
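[Editorial sketch] The in-place approach described above (keep the header that was already built and change only its XID word on a retry) can be illustrated in user space roughly as follows. This is not the kernel code: `fake_rpc_header`, `xid_counter`, `next_xid`, `build_header` and `refresh_xid` are hypothetical names invented for the illustration, and the header is reduced to its first four words.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the first words of an RPC call header. */
struct fake_rpc_header {
	uint32_t words[4];	/* words[0] holds the XID, network byte order */
};

/* Plays the role of the global XID counter (assumed name). */
static uint32_t xid_counter;

/* Advance the counter, skipping zero if it ever wraps around. */
uint32_t
next_xid(void)
{
	if (++xid_counter == 0)
		xid_counter++;
	return (xid_counter);
}

/* Build the header once and hand back a pointer to its XID word. */
void
build_header(struct fake_rpc_header *h, uint32_t **xidpp)
{
	*xidpp = &h->words[0];
	h->words[0] = htonl(next_xid());
	h->words[1] = htonl(0);		/* message type: call */
	h->words[2] = htonl(2);		/* RPC version */
	h->words[3] = htonl(100003);	/* NFS program number */
}

/* On a "try later" reply, overwrite only the XID word before resending. */
uint32_t
refresh_xid(uint32_t *xidp)
{
	assert(xidp != NULL);
	*xidp = htonl(next_xid());
	return (*xidp);
}
```

A caller would run `build_header()` once per request and `refresh_xid()` before each resend that follows a jukebox reply. Because only one word changes, nothing else in the header is rebuilt, which is also why this shortcut would not be enough for RPCSEC_GSS, where a sequence number in the header changes on every retry and has to be re-checksummed.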
Hopefully this is useful for you, rick
--- quick and dirty fix for xid not changing for NFSERR_JUKEBOX ---
*** nfs_socket.c.orig	Fri Nov 4 16:21:15 2005
--- nfs_socket.c	Fri Nov 4 16:29:48 2005
***************
*** 76,81 ****
--- 76,83 ----
  #define	TRUE	1
  #define	FALSE	0
  
+ extern u_int32_t nfs_xid;
+ 
  /*
   * Estimate rto for an nfs rpc sent via. an unreliable datagram.
   * Use the mean and mean deviation of rtt for the appropriate type of rpc
***************
*** 919,925 ****
  	int s, error = 0, mrest_len, auth_len, auth_type;
  	int trylater_delay = NQ_TRYLATERDEL, trylater_cnt = 0;
  	struct timeval now;
! 	u_int32_t xid;
  
  	/* Reject requests while attempting a forced unmount. */
  	if (vp->v_mount->mnt_kern_flag & MNTK_UNMOUNTF) {
--- 921,927 ----
  	int s, error = 0, mrest_len, auth_len, auth_type;
  	int trylater_delay = NQ_TRYLATERDEL, trylater_cnt = 0;
  	struct timeval now;
! 	u_int32_t *xidp;
  
  	/* Reject requests while attempting a forced unmount. */
  	if (vp->v_mount->mnt_kern_flag & MNTK_UNMOUNTF) {
***************
*** 950,956 ****
  		nmp->nm_numgrps : (cred->cr_ngroups - 1)) << 2) +
  		5 * NFSX_UNSIGNED;
  	m = nfsm_rpchead(cred, nmp->nm_flag, procnum, auth_type, auth_len,
! 	     mrest, mrest_len, &mheadend, &xid);
  
  	/*
  	 * For stream protocols, insert a Sun RPC Record Mark.
--- 952,958 ----
  		nmp->nm_numgrps : (cred->cr_ngroups - 1)) << 2) +
  		5 * NFSX_UNSIGNED;
  	m = nfsm_rpchead(cred, nmp->nm_flag, procnum, auth_type, auth_len,
! 	     mrest, mrest_len, &mheadend, &xidp);
  
  	/*
  	 * For stream protocols, insert a Sun RPC Record Mark.
***************
*** 961,967 ****
  			 (m->m_pkthdr.len - NFSX_UNSIGNED));
  	}
  	rep->r_mreq = m;
! 	rep->r_xid = xid;
  tryagain:
  	if (nmp->nm_flag & NFSMNT_SOFT)
  		rep->r_retry = nmp->nm_retry;
--- 963,969 ----
  			 (m->m_pkthdr.len - NFSX_UNSIGNED));
  	}
  	rep->r_mreq = m;
! 	rep->r_xid = *xidp;
  tryagain:
  	if (nmp->nm_flag & NFSMNT_SOFT)
  		rep->r_retry = nmp->nm_retry;
***************
*** 1088,1093 ****
--- 1090,1098 ----
  				trylater_delay *= nfs_backoff[trylater_cnt];
  				if (trylater_cnt < NFS_NBACKOFF - 1)
  					trylater_cnt++;
+ 				if (++nfs_xid == 0)
+ 					nfs_xid++;
+ 				rep->r_xid = *xidp = txdr_unsigned(nfs_xid);
  				goto tryagain;
  			}
  
*** nfs_subs.c.orig	Fri Nov 4 16:24:00 2005
--- nfs_subs.c	Fri Nov 4 16:29:04 2005
***************
*** 85,91 ****
  u_int32_t	nfs_true, nfs_false;
  
  /* And other global data */
! static u_int32_t nfs_xid = 0;
  static enum vtype nv2tov_type[8]= {
  	VNON, VREG, VDIR, VBLK, VCHR, VLNK, VNON, VNON
  };
--- 85,91 ----
  u_int32_t	nfs_true, nfs_false;
  
  /* And other global data */
! u_int32_t nfs_xid = 0;
  static enum vtype nv2tov_type[8]= {
  	VNON, VREG, VDIR, VBLK, VCHR, VLNK, VNON, VNON
  };
***************
*** 156,162 ****
  struct mbuf *
  nfsm_rpchead(struct ucred *cr, int nmflag, int procid, int auth_type,
      int auth_len, struct mbuf *mrest, int mrest_len, struct mbuf **mbp,
!     u_int32_t *xidp)
  {
  	struct mbuf *mb;
  	u_int32_t *tl;
--- 156,162 ----
  struct mbuf *
  nfsm_rpchead(struct ucred *cr, int nmflag, int procid, int auth_type,
      int auth_len, struct mbuf *mrest, int mrest_len, struct mbuf **mbp,
!     u_int32_t **xidpp)
  {
  	struct mbuf *mb;
  	u_int32_t *tl;
***************
*** 192,198 ****
  	if (++nfs_xid == 0)
  		nfs_xid++;
  
! 	*tl++ = *xidp = txdr_unsigned(nfs_xid);
  	*tl++ = rpc_call;
  	*tl++ = rpc_vers;
  	*tl++ = txdr_unsigned(NFS_PROG);
--- 192,199 ----
  	if (++nfs_xid == 0)
  		nfs_xid++;
  
! 	*xidpp = tl;
! 	*tl++ = txdr_unsigned(nfs_xid);
  	*tl++ = rpc_call;
  	*tl++ = rpc_vers;
  	*tl++ = txdr_unsigned(NFS_PROG);
*** nfsm_subs.h.orig	Fri Nov 4 16:27:54 2005
--- nfsm_subs.h	Fri Nov 4 16:28:11 2005
***************
*** 56,62 ****
  struct mbuf *nfsm_rpchead(struct ucred *cr, int nmflag, int procid,
  			  int auth_type, int auth_len,
  			  struct mbuf *mrest, int mrest_len,
! 			  struct mbuf **mbp, u_int32_t *xidp);
  
  #define	M_HASCL(m)	((m)->m_flags & M_EXT)
  #define	NFSMINOFF(m) \
--- 56,62 ----
  struct mbuf *nfsm_rpchead(struct ucred *cr, int nmflag, int procid,
  			  int auth_type, int auth_len,
  			  struct mbuf *mrest, int mrest_len,
! 			  struct mbuf **mbp, u_int32_t **xidpp);
  
  #define	M_HASCL(m)	((m)->m_flags & M_EXT)
  #define	NFSMINOFF(m) \

From owner-freebsd-fs@FreeBSD.ORG Fri Nov 4 22:07:27 2005
Return-Path:
X-Original-To: freebsd-fs@freebsd.org
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1DFED16A41F for ; Fri, 4 Nov 2005 22:07:27 +0000 (GMT) (envelope-from scottl@samsco.org)
Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8DA1C43D45 for ; Fri, 4 Nov 2005 22:07:26 +0000 (GMT) (envelope-from scottl@samsco.org)
Received: from [192.168.254.11] (junior.samsco.home [192.168.254.11]) (authenticated bits=0) by pooker.samsco.org (8.13.4/8.13.4) with ESMTP id jA4M7L5a081060; Fri, 4 Nov 2005 15:07:21 -0700 (MST) (envelope-from scottl@samsco.org)
Message-ID: <436BDB99.5060907@samsco.org>
Date: Fri, 04 Nov 2005 15:07:21 -0700
From: Scott Long
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.8) Gecko/20050615
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: user
References:
In-Reply-To:
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-1.4 required=3.8 tests=ALL_TRUSTED autolearn=failed version=3.1.0
X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on pooker.samsco.org
Cc: freebsd-fs@freebsd.org
Subject: Re: UFS2 snapshots on large filesystems
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 04 Nov 2005 22:07:27 -0000

user wrote:
> Hello,
>
> Considering a PC server
running FreeBSD with 4 400 GB hard drives attached
> to a hardware raid controller doing raid-5.
>
> So this will present itself to the OS as a 1.2TB filesystem.
>
> Any comments on taking one or multiple snapshots of a filesystem of this
> size ?
>
> Given current disk capacities, I would not exactly consider this 1.2TB
> filesystem a "large" one ... any comments on say ... a 6-8 TB filesystem
> and making one or more snapshots of it ?
>
> Assume they are marginally busy - perhaps a 5-10% data turnover per day...
>
> Thanks.
>

The UFS snapshot code was written at a time when disks were typically
around 4-9GB in size, not 400GB in size =-) Unfortunately, the amount
of time it takes to do the initial snapshot bookkeeping scales linearly
with the size of the drive, and many people have reported that it takes
a considerable amount of time (anywhere from several minutes to several
dozen minutes) on large drives/arrays like you describe. So, you should
test and plan accordingly if you are interested in using them.

Scott