From owner-freebsd-fs@FreeBSD.ORG Sun May 2 14:10:13 2004 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BF34A16A4CE for ; Sun, 2 May 2004 14:10:13 -0700 (PDT) Received: from afields.ca (afields.ca [216.194.67.132]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5519943D4C for ; Sun, 2 May 2004 14:10:13 -0700 (PDT) (envelope-from afields@afields.ca) Received: from afields.ca (localhost.afields.ca [127.0.0.1]) by afields.ca (8.12.6/8.12.9) with ESMTP id i42LABSd035258; Sun, 2 May 2004 17:10:11 -0400 (EDT) (envelope-from afields@afields.ca) Received: (from afields@localhost) by afields.ca (8.12.6/8.12.9/Submit) id i42LAA4n035257; Sun, 2 May 2004 17:10:10 -0400 (EDT) (envelope-from afields) Date: Sun, 2 May 2004 17:10:10 -0400 From: Allan Fields To: Siddharth Aggarwal , Poul-Henning Kamp Message-ID: <20040502211010.GA31553@afields.ca> References: <30551.1083480618@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <30551.1083480618@critter.freebsd.dk> User-Agent: Mutt/1.4i cc: freebsd-fs@freebsd.org Subject: Re: Debugging pseudo-disk driver on FreeBSD X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: Allan Fields List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 02 May 2004 21:10:13 -0000 On Sun, May 02, 2004 at 08:50:18AM +0200, Poul-Henning Kamp wrote: > In message , Siddharth Aggarwal writes: > > > >Hi, > > > >I am working on a Copy on Write disk driver on FreeBSD where I try to save > >the state of a filesystem (/dev/ad0s3) to another device (/dev/ad0s4) by > >making a virtual device that sits on top of these two (/dev/shd0). > > Are you doing this using GEOM under FreeBSD 5 ? If not you should start > doing that now. 
> > -- > Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 > phk@FreeBSD.ORG | TCP/IP since RFC 956 > FreeBSD committer | BSD since 4.3-tahoe > Never attribute to malice what can adequately be explained by incompetence. I agree that this should be a GEOM-based facility if it is targeted at FreeBSD-5 as a device-level driver. Additionally, there is the possibility of doing this at the vnode level using unions/overlays, where heterogeneous filesystem layouts are a possibility. Vnode and device-level solutions both have their own merits. (Also worth noting are the similarities between the two approaches.) For a copy-on-write scheme you might just as easily employ a union mount as outlined in [1]. A previous post was made to this list [2] on the topic of overlays and FreeBSD's union filesystem a while back. Discussed in the thread is the need for heterogeneous layouts. Vnode-stacking allows rich semantics and operates above the filesystem layer. I was attempting to make an argument for enhanced namespace semantics. While you may be able to find some interesting papers on vnode-stacking approaches to overlays at http://www.filesystems.org, the fistgen template code is still in need of some work on the FreeBSD side. [1] J. S. Pendry, M. K. McKusick. Union Mounts in 4.4BSD-Lite. USENIX Conference Proc. January 1995. [2] [freebsd-fs] Re: Overlayfs for FiST? 
(http://lists.freebsd.org/pipermail/freebsd-fs/2003-April/000090.html) -- Allan Fields Afields Research/AFRSL - http://afields.ca BSDCan: May 2004, Ottawa - http://www.bsdcan.org From owner-freebsd-fs@FreeBSD.ORG Sun May 2 15:26:02 2004 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 54B6616A4CF for ; Sun, 2 May 2004 15:26:02 -0700 (PDT) Received: from afields.ca (afields.ca [216.194.67.132]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0196A43D5A for ; Sun, 2 May 2004 15:26:02 -0700 (PDT) (envelope-from afields@afields.ca) Received: from afields.ca (localhost.afields.ca [127.0.0.1]) by afields.ca (8.12.6/8.12.9) with ESMTP id i42MPwSd035599; Sun, 2 May 2004 18:25:58 -0400 (EDT) (envelope-from afields@afields.ca) Received: (from afields@localhost) by afields.ca (8.12.6/8.12.9/Submit) id i42MPwIJ035598; Sun, 2 May 2004 18:25:58 -0400 (EDT) (envelope-from afields) Date: Sun, 2 May 2004 18:25:58 -0400 From: Allan Fields To: Siddharth Aggarwal Message-ID: <20040502222558.GB31553@afields.ca> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i cc: freebsd-fs@freebsd.org Subject: Re: Debugging pseudo-disk driver on FreeBSD X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 02 May 2004 22:26:02 -0000 On Sun, May 02, 2004 at 12:41:56AM -0600, Siddharth Aggarwal wrote: > > Hi, > > I am working on a Copy on Write disk driver on FreeBSD where I try to save > the state of a filesystem (/dev/ad0s3) to another device (/dev/ad0s4) by > making a virtual device that sits on top of these two (/dev/shd0). > > 1. So in the strategy routine, I get the block read/write calls to > (/dev/shd0) . > 2. 
For a write operation, I copy the previous contents of the block > (number corresponding to /dev/ad0s3) on to a free block on /dev/ad0s4 > 3. To restore previous contents of disk, I read the allocated free block > from /dev/ad0s4 and write it back to original block number /dev/ad0s3. > > The virtual device /dev/shd0 is mounted on /mnt > > So to test it out, my /dev/ad0s3 originally had a file "old1" of 13685 > bytes containing repeating string pattern (OLDOLD) > I then copied a file "new1" of 8211 bytes having the repeating pattern > (NEWNEW) to overwrite the old one > i.e. cp new1 /mnt/old1 > > A hexdump shows that a block of 8192 bytes containing "OLDOLD" was copied > over to /dev/ad0s4 and its place being taken by "NEWNEW" in /dev/ad0s3. > Also remaining bytes (beyond the 8192 bytes) still remain in /dev/ad0s3. > So this shows that the copy on write was done correctly. And I correctly > see 8211 bytes of "NEWNEW" in /mnt/old1 (ls -l /mnt/old1) On closer read, I see the advantage of your approach here: the originating device always has the latest changes but old data is still stored on another device. (But for how long? Until the next overwrite? Revisioning possibilities?) This means that the original disk is always consistent with the most recent changes but has a sort of log of old blocks? This is conceptually the opposite approach to the union filesystem, which traditionally keeps new changes to files on another filesystem (the overlay) and preserves the underlying filesystem contents. Your facility also allows devices containing arbitrary data, which could be for example raw data streams, as opposed to a filesystem which is accessible through the VFS. But this carries with it the implications of device-level block I/O. Restoring any given file would involve translating the inode to physical blocks and restoring only those portions which were changed by the operation. I'm unclear how this works. 
Take undeleting a file: Wouldn't you need to restore the inode, the direct blocks, any indirect blocks and dirents by referencing these blocks? How do you know how to do this (at file granularity) at the device-level in a filesystem-agnostic way? (Could writes be processed atomically?) Alternatively, you can implement this copy-on-write scheme at the vnode layer. > I then send an IOCTL to my driver to restore to the previous state > (expecting it to give me 13685 bytes of "OLDOLD" back in /mnt/old1) So this is like a snapshot of the original state of the filesystem on the device in its entirety (sort of like snapshots but at the device-level vs. file-system)? How do you ensure it's consistent, especially when the device backing the storage of old blocks becomes full? Which blocks do you turf first? (Problem is less significant if you have a 1:1 mapping of blocks like RAID mirror w/ same partition size.) > After unmounting and remounting, I see that the contents of /mnt/old1 have > become OLDOLD, but there are only 8211 bytes instead of 13685. A hexdump of > /dev/ad0s3 however, shows that there are indeed 13685 consecutive bytes of > OLDOLD lying there. > > This has led me to believe that the Inode of /mnt/old1 is not being > refreshed (or it was never saved off to the /dev/ad0s4 in the first place). Do Inode > read/writes go through the strategy routine in the first place? Can you reboot the machine and see the same effects? I know that sounds like an extreme measure, but that's a way to determine for sure if it's a caching issue. You could also try doing a few large dd's from another filesystem between unmount and remount. > Any idea what could be going wrong? No clue. 
;) -- Allan Fields AFRSL - http://afields.ca BSDCan: May 2004, Ottawa - http://www.bsdcan.org From owner-freebsd-fs@FreeBSD.ORG Sun May 2 16:13:54 2004 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1052D16A4CE for ; Sun, 2 May 2004 16:13:54 -0700 (PDT) Received: from mailspool.ops.uunet.co.za (mailspool.ops.uunet.co.za [196.7.0.140]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8DD2B43D3F for ; Sun, 2 May 2004 16:13:53 -0700 (PDT) (envelope-from root@raider.ataris.co.za) Received: from [196.7.86.78] (helo=raider.ataris.co.za) by mailspool.ops.uunet.co.za with esmtp (Exim 3.36 #1) id 1BKQ9f-000AQq-00 for freebsd-fs@freebsd.org; Mon, 03 May 2004 01:13:52 +0200 Received: from root by raider.ataris.co.za with local (Exim 4.30; FreeBSD) id 1BKQEq-00025n-59 for freebsd-fs@freebsd.org; Mon, 03 May 2004 01:19:12 +0200 Date: Mon, 3 May 2004 01:19:12 +0200 From: Jacques Marnweck To: freebsd-fs@freebsd.org Message-ID: <20040502231912.GA7750@raider.ataris.co.za> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i Subject: Issues using unionfs and vnode backed disks X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 02 May 2004 23:13:54 -0000 Hi, I'm experiencing some weird issues using unionfs with vnode file backed disks under FreeBSD 5.2.1-RELEASE-p5 on a Intel Celeron 2.4GHz box with 512Mb RAM. I have similar setups using FreeBSD 4.9-STABLE boxes which are in production using vnconfig etc. 
# df -H
Filesystem           Size    Used   Avail Capacity  Mounted on
/dev/ad0s1a          6.2G     57M    5.7G     1%    /
devfs                1.0K    1.0K      0B   100%    /dev
/dev/ad0s1g           52G    8.1G     39G    17%    /home
/dev/ad0s1f         1037M    934M     20M    98%    /tmp
/dev/ad0s1d          8.3G    2.1G    5.6G    27%    /usr
/dev/ad0s1e          8.3G     45M    7.6G     1%    /var
/dev/md0c            2.0G    142M    1.7G     8%    /home/jails/base
/dev/md1c            5.1G    173M    4.5G     4%    /home/jails/zzzz.co.za
devfs                1.0K    1.0K      0B   100%    /home/jails/zzzz.co.za/dev
:/home/jails/base    7.1G    2.2G    4.5G    33%    /home/jails/zzzz.co.za

Basically how I do jail()'ed virtual machines is that I first create a base disk image, say base.vn, which all the jails get, so I only have to maintain one base installation which can be shared amongst multiple jails.

Another bug I noticed which had similar side effects was where I had /usr/ports mounted below /home/jails/base/usr/ports, and when I typed 'make install clean' from the jail()'ed virtual machine to install a port, the box had a deadlock.

From reading various posts here, I'm assuming that unionfs under FreeBSD 5.2.1-RELEASE-p5 should have fewer issues, and should not cause the problems I'm experiencing?

Also I've tested this a couple of times today. Who is maintaining the unionfs code? I would like to chat to him/her.
Regards --jm From owner-freebsd-fs@FreeBSD.ORG Sun May 2 17:26:11 2004 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B5AD316A4CE for ; Sun, 2 May 2004 17:26:11 -0700 (PDT) Received: from VARK.homeunix.com (adsl-68-124-137-57.dsl.pltn13.pacbell.net [68.124.137.57]) by mx1.FreeBSD.org (Postfix) with ESMTP id 706BE43D53 for ; Sun, 2 May 2004 17:26:11 -0700 (PDT) (envelope-from das@FreeBSD.ORG) Received: from VARK.homeunix.com (localhost [127.0.0.1]) by VARK.homeunix.com (8.12.10/8.12.10) with ESMTP id i430Pqnx012341; Sun, 2 May 2004 17:25:52 -0700 (PDT) (envelope-from das@FreeBSD.ORG) Received: (from das@localhost) by VARK.homeunix.com (8.12.10/8.12.10/Submit) id i430PqMv012340; Sun, 2 May 2004 17:25:52 -0700 (PDT) (envelope-from das@FreeBSD.ORG) Date: Sun, 2 May 2004 17:25:52 -0700 From: David Schultz To: Jacques Marnweck Message-ID: <20040503002552.GA12216@VARK.homeunix.com> Mail-Followup-To: Jacques Marnweck , freebsd-fs@FreeBSD.ORG References: <20040502231912.GA7750@raider.ataris.co.za> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040502231912.GA7750@raider.ataris.co.za> cc: freebsd-fs@FreeBSD.ORG Subject: Re: Issues using unionfs and vnode backed disks X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 May 2004 00:26:11 -0000 On Mon, May 03, 2004, Jacques Marnweck wrote: > Basically how I do jail()'ed virtual machines is that I first create a > base disk image say base.vn which all the jails get so I only have to maintain > one base installation which can be shared amongst multiple jails. 
> > Another bug I noticed which had similar side effects was where I had /usr/ports > mounted below /home/jails/base/usr/ports and from the jail()'ed virtual > machine I typed 'make install clean' to install a port, the box had a > deadlock. I seem to recall that there are known issues with unionfs on memory-backed devices. See also PR kern/54534. > From reading various posts here, I'm assuming that unionfs under FreeBSD > 5.2.1-RELEASE-p5 should have fewer issues, and should not cause the > problems I'm experiencing? > > Also I've tested this a couple of times today. Who is maintaining the > unionfs code, as I would like to chat to him/her. I'm not sure what gave you that impression; unionfs has always been an unsupported use-at-your-own-risk feature *precisely* because it doesn't have a maintainer. (See the mount_unionfs(8) manpage.) Occasionally people fix problems with it, but it isn't held to the same standards as the rest of the system because nobody has the necessary combination of time, interest, and ability to work on it. You're welcome to report problems, and that's definitely beneficial when someone does have time to look at it, but please check the PR database to make sure you're not submitting a duplicate of an existing PR. 
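Archive note on the "Debugging pseudo-disk driver" thread: the copy-on-write scheme Siddharth describes (save a block's old contents to the shadow device before overwriting it, then copy the saved blocks back to restore) can be modelled in user space. The sketch below is a toy Python model, not the driver's code. CowDevice, block(), and the one-block "inode" are hypothetical stand-ins chosen for illustration. It also shows how the caching explanation raised in the thread could play out, assuming a still-mounted filesystem writes a stale in-core inode back through the same write path after the restore.

```python
BLOCK_SIZE = 8192  # matches the 8192-byte block seen in the hexdump


class CowDevice:
    """Toy model: 'origin' plays /dev/ad0s3, 'shadow' plays /dev/ad0s4."""

    def __init__(self, nblocks):
        self.origin = [bytes(BLOCK_SIZE) for _ in range(nblocks)]
        self.shadow = []  # append-only log of (block number, old contents)

    def read(self, blkno):
        return self.origin[blkno]

    def write(self, blkno, data):
        # Save the previous contents before overwriting (step 2 in the post).
        self.shadow.append((blkno, self.origin[blkno]))
        self.origin[blkno] = data

    def restore(self):
        # Undo in reverse order so the oldest saved copy wins (step 3).
        while self.shadow:
            blkno, old = self.shadow.pop()
            self.origin[blkno] = old


def block(text):
    """Hypothetical helper: pad a short string to one full block."""
    return text.encode().ljust(BLOCK_SIZE, b"\0")


dev = CowDevice(nblocks=4)
dev.origin[0] = block("size=13685")  # stand-in for the on-disk inode
dev.origin[1] = block("OLDOLD")      # stand-in for the file data

# The filesystem overwrites the file: data block first, then the inode.
dev.write(1, block("NEWNEW"))
dev.write(0, block("size=8211"))

cached_inode = dev.read(0)  # in-core copy held by the mounted filesystem

dev.restore()
assert dev.read(0) == block("size=13685")  # the restore itself worked

# A later sync from the still-mounted filesystem goes through the same
# write path and puts the stale inode back on the restored device.
dev.write(0, cached_inode)
print(dev.read(0).rstrip(b"\0"))  # b'size=8211'
```

In this model the restore itself is correct; it is the stale write-back afterwards that leaves the shorter size on disk next to the restored "OLDOLD" data, which is one possible reading of the symptom reported in the thread.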
From owner-freebsd-fs@FreeBSD.ORG Mon May 3 06:43:24 2004 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9409F16A4CE for ; Mon, 3 May 2004 06:43:24 -0700 (PDT) Received: from bsd3.cis.nctu.edu.tw (bsd3.cis.nctu.edu.tw [140.113.23.203]) by mx1.FreeBSD.org (Postfix) with ESMTP id DBC0D43D4C for ; Mon, 3 May 2004 06:43:23 -0700 (PDT) (envelope-from is86012@bsd3.cis.nctu.edu.tw) Received: from bsd3.cis.nctu.edu.tw (is86012@localhost.cis.nctu.edu.tw [127.0.0.1]) by bsd3.cis.nctu.edu.tw (8.12.9/8.12.6) with ESMTP id i43DhMVA044984 for ; Mon, 3 May 2004 21:43:22 +0800 (CST) (envelope-from is86012@bsd3.cis.nctu.edu.tw) Received: (from is86012@localhost) by bsd3.cis.nctu.edu.tw (8.12.9/8.12.9/Submit) id i43DhMAG044983 for freebsd-fs@freebsd.org; Mon, 3 May 2004 21:43:22 +0800 (CST) Date: Mon, 3 May 2004 21:43:22 +0800 From: 8623012 To: freebsd-fs@freebsd.org Message-ID: <20040503134322.GA44973@bsd3.cis.nctu.edu.tw> Mime-Version: 1.0 Content-Type: text/plain; charset=big5 Content-Disposition: inline User-Agent: Mutt/1.5.4i Subject: NFS File System Limit? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 May 2004 13:43:24 -0000 Hello all, How much storage can be managed on a FreeBSD server using the NFS file system? In other words, how large can an NFS partition mounted on FreeBSD be? And how many NFS partitions can be mounted on a FreeBSD system? 
Best Regards, From owner-freebsd-fs@FreeBSD.ORG Mon May 3 12:56:26 2004 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BB28116A4CF for ; Mon, 3 May 2004 12:56:26 -0700 (PDT) Received: from mail-svr1.cs.utah.edu (mail-svr1.cs.utah.edu [155.99.198.200]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1414E43D5E for ; Mon, 3 May 2004 12:56:26 -0700 (PDT) (envelope-from saggarwa@cs.utah.edu) Received: from faith.cs.utah.edu (faith.cs.utah.edu [155.99.198.108]) by mail-svr1.cs.utah.edu (Postfix) with ESMTP id 659EB346EB; Mon, 3 May 2004 13:56:27 -0600 (MDT) Received: by faith.cs.utah.edu (Postfix, from userid 4973) id 738482EC21; Mon, 3 May 2004 13:56:25 -0600 (MDT) Received: from localhost (localhost [127.0.0.1]) by faith.cs.utah.edu (Postfix) with ESMTP id 6775134406; Mon, 3 May 2004 19:56:25 +0000 (UTC) Date: Mon, 3 May 2004 13:56:25 -0600 (MDT) From: Siddharth Aggarwal To: Allan Fields In-Reply-To: <20040502222558.GB31553@afields.ca> Message-ID: References: <20040502222558.GB31553@afields.ca> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-fs@freebsd.org Subject: Re: Debugging pseudo-disk driver on FreeBSD X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 May 2004 19:56:26 -0000 On Sun, 2 May 2004, Allan Fields wrote: > On Sun, May 02, 2004 at 12:41:56AM -0600, Siddharth Aggarwal wrote: > > > > Hi, > > > > I am working on a Copy on Write disk driver on FreeBSD where I try to save > > the state of a filesystem (/dev/ad0s3) to another device (/dev/ad0s4) by > > making a virtual device that sits on top of these two (/dev/shd0). > > > > 1. So in the strategy routine, I get the block read/write calls to > > (/dev/shd0) . > > 2. 
For a write operation, I copy the previous contents of the block > > (number corresponding to /dev/ad0s3) on to a free block on /dev/ad0s4 > > 3. To restore previous contents of disk, I read the allocated free block > > from /dev/ad0s4 and write it back to original block number /dev/ad0s3. > > > > The virtual device /dev/shd0 is mounted on /mnt > > > > So to test it out, my /dev/ad0s3 originally had a file "old1" of 13685 > > bytes containing repeating string pattern (OLDOLD) > > I then copied a file "new1" of 8211 bytes having the repeating pattern > > (NEWNEW) to overwrite the old one > > i.e. cp new1 /mnt/old1 > > > > A hexdump shows that a block of 8192 bytes containing "OLDOLD" was copied > > over to /dev/ad0s4 and its place being taken by "NEWNEW" in /dev/ad0s3. > > Also remaining bytes (beyond the 8192 bytes) still remain in /dev/ad0s3. > > So this shows that the copy on write was done correctly. And I correctly > > see 8211 bytes of "NEWNEW" in /mnt/old1 (ls -l /mnt/old1) > > On closer read, I see the advantage of your approach here: the > originating device always has the latest changes but old data is > still stored on another device. (But for how long? Until the next > overwrite? Revisioning possibilities?) This means that the original Yes, I am doing some kind of versioning for these blocks, which are stored away on the shadow device. > disk is always consistent with the most recent changes but has a > sort of log of old blocks? > > This is conceptually the opposite approach to the union filesystem, > which traditionally keeps new changes to files on another filesystem > (the overlay) and preserves the underlying filesystem contents. > > Your facility also allows devices containing arbitrary data, which > could be for example raw data streams, as opposed to a filesystem > which is accessible through the VFS. But this carries with it the > implications of device-level block I/O. 
Restoring any given file > would involve translating the inode to physical blocks and restoring > only those portions which were changed by the operation. I'm unclear > how this works. Take undeleting a file: Wouldn't you need to > restore the inode, the direct blocks, any indirect blocks and > dirents by referencing these blocks? How do you know how to do > this (at file granularity) at the device-level in a > filesystem-agnostic way? (Could writes be processed atomically?) > Actually, the use case for this thing I am writing doesn't involve much rolling back to a previous state. Instead, it involves getting a fresh disk image on another machine and then applying these log entries to the new disk in chronological order to reach a similar state on the new machine. So some of the concerns you expressed above may not apply. > Alternatively, you can implement this copy-on-write scheme at the > vnode layer. > > > I then send an IOCTL to my driver to restore to the previous state > > (expecting it to give me 13685 bytes of "OLDOLD" back in /mnt/old1) > > So this is like a snapshot of the original state of the filesystem > on the device in its entirety (sort of like snapshots but at the > device-level vs. file-system)? How do you ensure it's consistent, > especially when the device backing the storage of old blocks becomes > full? Which blocks do you turf first? (Problem is less significant if you > have a 1:1 mapping of blocks like RAID mirror w/ same partition size.) > > > After unmounting and remounting, I see that the contents of /mnt/old1 have > > become OLDOLD, but there are only 8211 bytes instead of 13685. A hexdump of > > /dev/ad0s3 however, shows that there are indeed 13685 consecutive bytes of > > OLDOLD lying there. > > > > This has led me to believe that the Inode of /mnt/old1 is not being > > refreshed (or it was never saved off to the /dev/ad0s4 in the first place). Do Inode > > read/writes go through the strategy routine in the first place? 
> > Can you reboot the machine and see the same effects? I know that > sounds like an extreme measure, but that's a way to determine for > sure if it's a caching issue. You could also try doing a few large > dd's from another filesystem between unmount and remount. > I tried the reboot option too, but no success :(. One thing, though: if old1 and new1 are the same size, i.e. 8211 bytes, I do get the correct behavior :). But obviously that is too ideal a case, and I guess it works because filesystem metadata (particularly the inode) is not in question here. > > Any idea what could be going wrong? > > No clue. ;) > > -- > Allan Fields > AFRSL - http://afields.ca > BSDCan: May 2004, Ottawa - http://www.bsdcan.org > From owner-freebsd-fs@FreeBSD.ORG Mon May 3 19:04:00 2004 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 16A6516A4CE for ; Mon, 3 May 2004 19:04:00 -0700 (PDT) Received: from mail-svr1.cs.utah.edu (mail-svr1.cs.utah.edu [155.99.198.200]) by mx1.FreeBSD.org (Postfix) with ESMTP id C1CF643D41 for ; Mon, 3 May 2004 19:03:59 -0700 (PDT) (envelope-from saggarwa@cs.utah.edu) Received: from faith.cs.utah.edu (faith.cs.utah.edu [155.99.198.108]) by mail-svr1.cs.utah.edu (Postfix) with ESMTP id 4595534706 for ; Mon, 3 May 2004 20:04:01 -0600 (MDT) Received: by faith.cs.utah.edu (Postfix, from userid 4973) id 51A232EC21; Mon, 3 May 2004 20:03:59 -0600 (MDT) Received: from localhost (localhost [127.0.0.1]) by faith.cs.utah.edu (Postfix) with ESMTP id 4707F34406 for ; Tue, 4 May 2004 02:03:59 +0000 (UTC) Date: Mon, 3 May 2004 20:03:59 -0600 (MDT) From: Siddharth Aggarwal To: freebsd-fs@freebsd.org In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Subject: Re: Debugging pseudo-disk driver on FreeBSD X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 May 2004 02:04:00 -0000 On Sun, 2 May 2004, Siddharth Aggarwal wrote: > > Hi, > > I am working on a Copy on Write disk driver on FreeBSD where I try to save > the state of a filesystem (/dev/ad0s3) to another device (/dev/ad0s4) by > making a virtual device that sits on top of these two (/dev/shd0). > > 1. So in the strategy routine, I get the block read/write calls to > (/dev/shd0) . > 2. For a write operation, I copy the previous contents of the block > (number corresponding to /dev/ad0s3) on to a free block on /dev/ad0s4 > 3. To restore previous contents of disk, I read the allocated free block > from /dev/ad0s4 and write it back to original block number /dev/ad0s3. > > The virtual device /dev/shd0 is mounted on /mnt > > So to test it out, my /dev/ad0s3 originally had a file "old1" of 13685 > bytes containing repeating string pattern (OLDOLD) > I then copied a file "new1" of 8211 bytes having the repeating pattern > (NEWNEW) to overwrite the old one > i.e. cp new1 /mnt/old1 > > A hexdump shows that a block of 8192 bytes containing "OLDOLD" was copied > over to /dev/ad0s4 and its place being taken by "NEWNEW" in /dev/ad0s3. > Also remaining bytes (beyond the 8192 bytes) still remain in /dev/ad0s3. > So this shows that the copy on write was done correctly. And I correctly > see 8211 bytes of "NEWNEW" in /mnt/old1 (ls -l /mnt/old1) > > I then send an IOCTL to my driver to restore to the previous state > (expecting it to give me 13685 bytes of "OLDOLD" back in /mnt/old1) > After unmounting and remounting, I see that the contents of /mnt/old1 have > become OLDOLD, but there are only 8211 bytes instead of 13685. A hexdump of > /dev/ad0s3 however, shows that there are indeed 13685 consecutive bytes of > OLDOLD lying there. > I think I know what's going wrong here. 
The inode, it seems, is getting cached (probably the in-core inode), and that cached copy is overwriting the inode that I restore from the shadow device. I used cksum to get the CRC of the device /dev/ad0s3 (1) right after restoring to the previous state and (2) after restoring to the previous state and then doing ls -l /mnt/old1 followed by a sync. The values returned by the two cksums are different. So is there a way to invalidate in-core/cached inodes so that they don't get flushed to the disk and overwrite the inode contents that I want? > This has led me to believe that the Inode of /mnt/old1 is not being > refreshed (or it was never saved off to the /dev/ad0s4 in the first place). Do Inode > read/writes go through the strategy routine in the first place? > From owner-freebsd-fs@FreeBSD.ORG Wed May 5 07:19:04 2004 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DAA0016A4CE for ; Wed, 5 May 2004 07:19:04 -0700 (PDT) Received: from www5.pobox.sk (www5.pobox.sk [212.5.216.15]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0EC7E43D49 for ; Wed, 5 May 2004 07:19:04 -0700 (PDT) (envelope-from mferko@pobox.sk) Received: from www5.pobox.sk (www5.pobox.sk [212.5.216.15]) by www5.pobox.sk (8.12.9-20030917/8.12.9) with SMTP id i45EJ2u0025323 for ; Wed, 5 May 2004 16:19:02 +0200 Date: Wed, 5 May 2004 16:19:02 +0200 Message-Id: <200405051419.i45EJ2u0025323@www5.pobox.sk> From: Ferko To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Type: TEXT/plain; charset=iso-8859-2 Content-Transfer-Encoding: QUOTED-PRINTABLE X-Pobox-Id: 2696988062 X-User-Agent: Opera/7.23 (Windows NT 5.1; U) [en] X-Mailer: POBOX (mico's engine 5.0/4.0.1) Subject: X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: Ferko List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 May 2004 14:19:05 -0000 Hello. 
I have created a simple read-only UFS1 driver for Windows XP. If anyone needs it or wants to test it, the home page is: ffs.szm.sk/en Michal Racek From owner-freebsd-fs@FreeBSD.ORG Wed May 5 07:27:13 2004 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6CFC116A4D3 for ; Wed, 5 May 2004 07:27:13 -0700 (PDT) Received: from www4.pobox.sk (www4.pobox.sk [212.5.216.14]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5238F43D48 for ; Wed, 5 May 2004 07:27:12 -0700 (PDT) (envelope-from mferko@pobox.sk) Received: from www4.pobox.sk (www4.pobox.sk [212.5.216.14]) by www4.pobox.sk (8.12.10/8.12.10) with SMTP id i45ERA5o022760 for ; Wed, 5 May 2004 16:27:10 +0200 Date: Wed, 5 May 2004 16:27:10 +0200 Message-Id: <200405051427.i45ERA5o022760@www4.pobox.sk> From: Ferko To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Type: TEXT/plain; charset=iso-8859-2 Content-Transfer-Encoding: QUOTED-PRINTABLE X-Pobox-Id: 2696988062 X-User-Agent: Opera/7.23 (Windows NT 5.1; U) [en] X-Mailer: POBOX (mico's engine 5.0/4.0.1) Subject: UFS1 windows xp driver X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: Ferko List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 May 2004 14:27:13 -0000 Hello. I have created a simple read-only UFS1 driver for Windows XP. If anyone needs it or wants to test it, the home page is: ffs.szm.sk/en Michal Racek. PS: Sorry for double posting; I forgot the subject. 
From owner-freebsd-fs@FreeBSD.ORG Thu May 6 01:22:17 2004 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CBC8116A4CE for ; Thu, 6 May 2004 01:22:17 -0700 (PDT) Received: from ms-dienst.rz.rwth-aachen.de (ms-2.rz.RWTH-Aachen.DE [134.130.3.131]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7CD9F43D4C for ; Thu, 6 May 2004 01:22:17 -0700 (PDT) (envelope-from chris@unixpages.org) Received: from r220-1 (r220-1.rz.RWTH-Aachen.DE [134.130.3.31]) by ms-dienst.rz.rwth-aachen.de (iPlanet Messaging Server 5.2 HotFix 1.12 (built Feb 13 2003)) with ESMTP id <0HXA008IM9K382@ms-dienst.rz.rwth-aachen.de> for freebsd-fs@freebsd.org; Thu, 06 May 2004 10:14:27 +0200 (MEST) Received: from relay.RWTH-Aachen.DE ([134.130.3.1]) by r220-1 (MailMonitor for SMTP v1.2.2 ) ; Thu, 06 May 2004 10:14:26 +0200 (MEST) Received: from haakonia.hitnet.rwth-aachen.de (haakonia.hitnet.RWTH-Aachen.DE [137.226.181.92])i468EQYT001889; Thu, 06 May 2004 10:14:26 +0200 (MEST) Received: from gondor.middleearth (gondor.middleearth [192.168.1.42]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))(Postfix) with ESMTP id 3E05128418; Thu, 06 May 2004 10:14:26 +0200 (CEST) Received: by gondor.middleearth (Postfix, from userid 1001) id EDD8F6138; Thu, 06 May 2004 10:14:25 +0200 (CEST) Date: Thu, 06
May 2004 10:14:25 +0200 From: Christian Brueffer In-reply-to: <200405051427.i45ERA5o022760@www4.pobox.sk> To: Ferko Message-id: <20040506081425.GA84779@unixpages.org> MIME-version: 1.0 Content-type: multipart/signed; boundary="lrZ03NoBR/3+SXJZ"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-disposition: inline User-Agent: Mutt/1.5.5.1i X-Operating-System: FreeBSD 5.2-CURRENT X-PGP-Key: http://people.freebsd.org/~brueffer/brueffer.key.asc X-PGP-Fingerprint: A5C8 2099 19FF AACA F41B B29B 6C76 178C A0ED 982D References: <200405051427.i45ERA5o022760@www4.pobox.sk> cc: freebsd-fs@freebsd.org Subject: Re: UFS1 windows xp driver X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 May 2004 08:22:17 -0000 --lrZ03NoBR/3+SXJZ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, May 05, 2004 at 04:27:10PM +0200, Ferko wrote: > Hello. I have created simple UFS1 read only driver for Windows xp. > If someone need it, or want to test it, home page is: ffs.szm.sk/en > Michal Racek. > PS: Sorry for double posting- I forgot subject. > Very nice. I'll see if I can find a Windows box to test it on. 
- Christian --=20 Christian Brueffer chris@unixpages.org brueffer@FreeBSD.org GPG Key: http://people.freebsd.org/~brueffer/brueffer.key.asc GPG Fingerprint: A5C8 2099 19FF AACA F41B B29B 6C76 178C A0ED 982D --lrZ03NoBR/3+SXJZ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (FreeBSD) iD8DBQFAmfPhbHYXjKDtmC0RAsBfAKDAtwW7RhLIMoTT34oyDSYK0hVd+ACg33yx 1nH1PpkAPJkyRELYs8UgqbU= =rWsr -----END PGP SIGNATURE----- --lrZ03NoBR/3+SXJZ-- From owner-freebsd-fs@FreeBSD.ORG Sat May 8 20:43:16 2004 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A736516A4CE for ; Sat, 8 May 2004 20:43:16 -0700 (PDT) Received: from mta7.pltn13.pbi.net (mta7.pltn13.pbi.net [64.164.98.8]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4FE9643D64 for ; Sat, 8 May 2004 20:43:16 -0700 (PDT) (envelope-from kris@obsecurity.org) Received: from obsecurity.dyndns.org (bea66837f5e10e47abdb7c93a2f914de@adsl-67-115-73-128.dsl.lsan03.pacbell.net [67.115.73.128]) by mta7.pltn13.pbi.net (8.12.10/8.12.10) with ESMTP id i493hFNk021039 for ; Sat, 8 May 2004 20:43:15 -0700 (PDT) Received: by obsecurity.dyndns.org (Postfix, from userid 1000) id 4913A53549; Sat, 8 May 2004 19:51:15 -0700 (PDT) Date: Sat, 8 May 2004 19:51:15 -0700 From: Kris Kennaway To: fs@FreeBSD.org Message-ID: <20040509025115.GA79812@xor.obsecurity.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="RnlQjJ0d97Da+TV1" Content-Disposition: inline User-Agent: Mutt/1.4.2.1i Subject: NFS deadlock X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 09 May 2004 03:43:16 -0000 --RnlQjJ0d97Da+TV1 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline I 
just ran into this on bento: one of the NFS servers was not responding,
so I tried to unmount it. The unmount stuck in state nfsfsync, and
eventually everything else on the system got stuck in state ufs.

Kris

db> show lockedvnods
Locked vnodes
0xc6010e38: tag ufs, type VDIR, usecount 239, writecount 0, refcount 1, flags (VV_ROOT|VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc77b2690 (pid 64855) with 12 pending
        ino 2, on dev da0s1a (4, 27)
0xc60bc618: tag ufs, type VDIR, usecount 3, writecount 0, refcount 1, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc89983f0 (pid 62988) with 1 pending
        ino 47731, on dev da0s1a (4, 27)
0xc60bc514: tag ufs, type VDIR, usecount 3, writecount 0, refcount 0, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc73ad540 (pid 61015) with 1 pending
        ino 47733, on dev da0s1a (4, 27)
0xc68f9b2c: tag nfs, type VREG, usecount 0, writecount 0, refcount 1, flags (VI_XLOCK|VI_BWAIT|VV_OBJBUF), lock type nfs: EXCL (count 1) by thread 0xc712c690 (pid 60978)
        fileid 1080273 fsid 0x400ff02

db> trace 60978
sched_switch(c712c690,2,c06ab58e,13c,ea95e7d5) at sched_switch+0xf5
mi_switch(1,0,c06ad8f3,16e,c712c690) at mi_switch+0x298
sleepq_switch(c68f9b58,178,4d000001,1,1) at sleepq_switch+0x149
sleepq_wait_sig(c68f9b58,0,c06ab58e,f6,0) at sleepq_wait_sig+0x14
msleep(c68f9b58,c68f9b2c,14d,c06b67cc,0) at msleep+0x4de
nfs_flush(c68f9b2c,0,1,c712c690,1) at nfs_flush+0x961
nfs_fsync(e3ebdad4,0,c06b2d53,393,e3ebdad0) at nfs_fsync+0x31
vinvalbuf(c68f9b2c,1,0,c712c690,0) at vinvalbuf+0xff
vclean(c68f9b2c,8,c712c690,e3ebdb94,c0571b48) at vclean+0xad
vgonel(c68f9b2c,c712c690,c06b2d53,8a4,c5de6444) at vgonel+0x61
vflush(c5de6400,1,0,0,8000000) at vflush+0x32d
nfs_unmount(c5de6400,8000000,c712c690,c712c690,0) at nfs_unmount+0x50
dounmount(c5de6400,8000000,c712c690,41e,400ff02) at dounmount+0x224
unmount(c712c690,e3ebdd14,c06c1efb,3e2,2) at unmount+0x24c
syscall(2f,2f,2f,804a9f2,804de04) at syscall+0x2a0
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (22, FreeBSD ELF32, unmount), eip = 0x280c1a5f, esp = 0xbfbfe00c, ebp = 0xbfbfe0c8 ---

--RnlQjJ0d97Da+TV1
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (FreeBSD)

iD8DBQFAnZyiWry0BWjoQKURAlfiAJ4qO7gNDxu1isKtZgJziFNYAtNSCwCdForJ
tpoujpePaD2dMPfCRk42ilM=
=xgLb
-----END PGP SIGNATURE-----

--RnlQjJ0d97Da+TV1--
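
[Archive note] When reading a "show lockedvnods" dump like the one in the report above, it can help to tabulate which thread holds each vnode lock and how many waiters are queued behind it. The following is only a minimal sketch, not part of the original report: it assumes the record layout seen in this particular dump (other FreeBSD versions may print different fields), and the names parse_locked_vnodes and most_contended are hypothetical helpers made up for illustration. The sample data is copied from the report.

```python
import re

# Sample records copied from the `show lockedvnods` output in the report.
DDB_OUTPUT = """\
0xc6010e38: tag ufs, type VDIR, usecount 239, writecount 0, refcount 1, flags (VV_ROOT|VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc77b2690 (pid 64855) with 12 pending
        ino 2, on dev da0s1a (4, 27)
0xc60bc618: tag ufs, type VDIR, usecount 3, writecount 0, refcount 1, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc89983f0 (pid 62988) with 1 pending
        ino 47731, on dev da0s1a (4, 27)
0xc60bc514: tag ufs, type VDIR, usecount 3, writecount 0, refcount 0, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc73ad540 (pid 61015) with 1 pending
        ino 47733, on dev da0s1a (4, 27)
0xc68f9b2c: tag nfs, type VREG, usecount 0, writecount 0, refcount 1, flags (VI_XLOCK|VI_BWAIT|VV_OBJBUF), lock type nfs: EXCL (count 1) by thread 0xc712c690 (pid 60978)
        fileid 1080273 fsid 0x400ff02
"""

# Assumed record layout, inferred from this dump only. re.DOTALL lets the
# lazy `.*?` cross a newline, so records wrapped over several lines also match.
VNODE_RE = re.compile(
    r"(?P<vnode>0x[0-9a-f]+): tag (?P<tag>\w+), type (?P<type>\w+),"
    r".*?lock type \w+: (?P<mode>\w+) \(count \d+\) "
    r"by thread (?P<thread>0x[0-9a-f]+) \(pid (?P<pid>\d+)\)"
    r"(?: with (?P<pending>\d+) pending)?",
    re.DOTALL,
)


def parse_locked_vnodes(text):
    """Return one dict per locked vnode found in the dump text."""
    records = []
    for match in VNODE_RE.finditer(text):
        rec = match.groupdict()
        rec["pid"] = int(rec["pid"])
        # A record without a "with N pending" clause has no queued waiters.
        rec["pending"] = int(rec["pending"] or 0)
        records.append(rec)
    return records


def most_contended(records):
    """Return the record with the largest number of pending waiters."""
    return max(records, key=lambda rec: rec["pending"])


if __name__ == "__main__":
    for rec in parse_locked_vnodes(DDB_OUTPUT):
        print(rec["vnode"], rec["tag"], rec["type"],
              "held by pid", rec["pid"], "waiters:", rec["pending"])
```

Read against the report, such a table shows the NFS vnode held by pid 60978 (the unmount process whose trace is given) and the UFS root directory vnode carrying the most waiters, which is consistent with the description of everything else piling up in state ufs. It does not by itself explain why the UFS lock holders are blocked.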