From owner-freebsd-fs Mon Sep 11 14:55:45 2000 Delivered-To: freebsd-fs@freebsd.org Received: from smtp01.primenet.com (smtp01.primenet.com [206.165.6.131]) by hub.freebsd.org (Postfix) with ESMTP id 4B8B437B43C; Mon, 11 Sep 2000 14:55:40 -0700 (PDT) Received: (from daemon@localhost) by smtp01.primenet.com (8.9.3/8.9.3) id OAA16115; Mon, 11 Sep 2000 14:54:59 -0700 (MST) Received: from usr09.primenet.com(206.165.6.209) via SMTP by smtp01.primenet.com, id smtpdAAAoMaOCF; Mon Sep 11 14:54:54 2000 Received: (from tlambert@localhost) by usr09.primenet.com (8.8.5/8.8.5) id OAA18763; Mon, 11 Sep 2000 14:55:27 -0700 (MST) From: Terry Lambert Message-Id: <200009112155.OAA18763@usr09.primenet.com> Subject: Re: CFR: nullfs, vm_objects and locks... (patch) To: bp@butya.kz (Boris Popov) Date: Mon, 11 Sep 2000 21:55:27 +0000 (GMT) Cc: freebsd-fs@FreeBSD.ORG, dillon@FreeBSD.ORG, semenu@FreeBSD.ORG, tegge@FreeBSD.ORG In-Reply-To: from "Boris Popov" at Sep 05, 2000 06:02:19 PM X-Mailer: ELM [version 2.5 PL2] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org > Last few days I've spent trying make nullfs really functional and > stable. There are many issues with the current nullfs code, but below I'll > try to outline the most annoying ones. > > The first one, is an inability to handle mmap() operation. This > comes from the VM/vnode_pager design where each vm_object associated with > a single vnode and vise versa. Looking at the problem in general one may > note, that stackable filesystems may have either separated vm_object per > layer or don't have it at all. Since nullfs essentially maps its vnodes to > underlying filesystem, it is reasonable to map all operations to > underlying vnode. 
I had a similar approach, which uses only one additional call:

    struct vnode *VOP_FINALVP(struct vnode *vp);

When called on a vnode, it returns the real backing object, instead of a higher level shadow in a stack. Upper level vnodes do not have backing store associated with them.

My approach, and the one you have put forward, are both flawed, if you try to move beyond the simple case of a 1:1 correspondence between stacking layers and underlying objects. That is, if we have anything more complex than a page in the final disk image equalling a page in a process address space, then there is a need for intermediate backing object(s).

The most obvious case for this would be a compressing stacking layer, where the backing pages and the process address space pages are algorithmically related, but not identical. Similar cases to this one are metadata stuffing (say you take the first 1k of the file for an intermediate layer to enable access control lists, etc.), cryptographic stacks, and transformational stacks (example: an NFS client that maps 8859-1 files into 16 bit Unicode data, transparently).

It seems to me that a hybrid approach is required, with explicit coherency calls between layers, at least for the non-correspondence cases, and with something like your approach (or mine) as an optimization, for the simple case.

What this means is putting some of the pre-unified VM and buffer cache synchronization points back into the VFS consumer layers: the system call layer, and the NFS client layer.

The simplest approach to resolving this is to provide a pager that implements VOP_{GET|PUT}PAGES using the read and write primitives; this would be used in intermediate layers which have their own backing objects in buffer cache/swap, but no disk backing object in an on-disk file system.

> P.S.
Two hours ago Sheldon Hearn told me that Tor Egge and Semen Ustimenko > worked together on the nullfs problem, but since discussion were private I > didn't know anything about it and probably stepped on their to toes with > my recent cleanup commit :( The code which I have seen on this subject works using the explicit coherency synchronization between backing objects. Unlike the approach in your patches, there is a duplicate backing object. It was my understanding that there was a cache coherency issue for devices that may be mounted after having a null layer stacked on them; specifically, the devices are vnodes, and have their own vm_object_t associated with them, and thus their own pages. From playing around with the patches Tor Egge had provided, I was able to demonstrate coherency failures in a number of circumstances, and it was not at all clear to me that msync() and fsync() would operate as expected. I was able to cause a number of supposedly "synchronized" file systems to fail, one catastrophically (doing a shutdown of a system with a nullfs mounted over /dev, with an FS named /A mounted on a device visible through the nullfs) when it spammed my root partition (not the /A partition!). Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers. 
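
For readers following the VOP_FINALVP idea in the message above, here is a minimal user-space sketch in C. It is an illustration only, not kernel code and not the patch under discussion: the `struct vnode` below is a hypothetical three-field stand-in, and `v_lower`, `v_object`, and `finalvp()` are names assumed for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct vnode. */
struct vnode {
    const char   *v_name;   /* label used only for illustration */
    struct vnode *v_lower;  /* next vnode down the stack, NULL at the bottom */
    void         *v_object; /* backing object; only the bottom vnode has one */
};

/*
 * Model of the VOP_FINALVP idea: walk down the stack and return the vnode
 * that actually owns the backing object. Upper (shadow) vnodes carry no
 * backing store of their own in this model.
 */
static struct vnode *
finalvp(struct vnode *vp)
{
    while (vp != NULL && vp->v_lower != NULL)
        vp = vp->v_lower;
    return vp;
}
```

This only covers the simple 1:1 case. A compressing or cryptographic layer, as the message points out, would need its own backing object, so a single walk to the bottom of the stack would not be enough there.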
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Mon Sep 11 18:29:24 2000 Delivered-To: freebsd-fs@freebsd.org Received: from relay.butya.kz (butya-gw.butya.kz [212.154.129.94]) by hub.freebsd.org (Postfix) with ESMTP id 0D9CA37B422; Mon, 11 Sep 2000 18:29:18 -0700 (PDT) Received: by relay.butya.kz (Postfix, from userid 1000) id 190D22876C; Tue, 12 Sep 2000 08:29:15 +0700 (ALMST) Received: from localhost (localhost [127.0.0.1]) by relay.butya.kz (Postfix) with ESMTP id 0411728766; Tue, 12 Sep 2000 08:29:14 +0700 (ALMST) Date: Tue, 12 Sep 2000 08:29:14 +0700 (ALMST) From: Boris Popov To: Terry Lambert Cc: freebsd-fs@FreeBSD.ORG, dillon@FreeBSD.ORG, semenu@FreeBSD.ORG, tegge@FreeBSD.ORG Subject: Re: CFR: nullfs, vm_objects and locks... (patch) In-Reply-To: <200009112155.OAA18763@usr09.primenet.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Mon, 11 Sep 2000, Terry Lambert wrote: > > The first one, is an inability to handle mmap() operation. This > > comes from the VM/vnode_pager design where each vm_object associated with > > a single vnode and vise versa. Looking at the problem in general one may > > note, that stackable filesystems may have either separated vm_object per > > layer or don't have it at all. Since nullfs essentially maps its vnodes to > > underlying filesystem, it is reasonable to map all operations to > > underlying vnode. > > I had a similar approach, which uses only one additional call: > > struct vnode *VOP_FINALVP(struct vnode *vp); Three separate calls provide explicit and clear create/use/destroy paradigm (it is somewhat broken by VFS but can be solved in future). > My approach, and the one you have put forward, are both flawed, if > you try to move beyond the simple case of a 1:1 correspondance > between stacking layers and underlying objects. 
They're not flawed, but require complex layers to implement its own VOP_GETPAGES/PUTPAGES operations. IMO, there can be more than one VM backing object in the stack, so any layer which performs data conversion tasks or gather data from multiple underlying filesystems can provide its own VM object to keep coherency between mmap() and read/write operations. On some point support routines can be integrated into VFS code. > > P.S. Two hours ago Sheldon Hearn told me that Tor Egge and Semen Ustimenko > > worked together on the nullfs problem, but since discussion were private I > > didn't know anything about it and probably stepped on their to toes with > > my recent cleanup commit :( > > The code which I have seen on this subject works using the explicit > coherency synchronization between backing objects. Unlike the approach > in your patches, there is a duplicate backing object. It was my Yes, I've looked at original Semen's patches posted to -current and they provide only "one way" cache coherency. -- Boris Popov http://www.butya.kz/~bp/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Sep 12 8:54:29 2000 Delivered-To: freebsd-fs@freebsd.org Received: from wally.eecs.harvard.edu (wally.eecs.harvard.edu [140.247.60.30]) by hub.freebsd.org (Postfix) with ESMTP id 51F5D37B422 for ; Tue, 12 Sep 2000 08:54:23 -0700 (PDT) Received: from localhost (stein@localhost) by wally.eecs.harvard.edu (8.10.0/8.10.0) with ESMTP id e8CFru507417; Tue, 12 Sep 2000 11:53:56 -0400 (EDT) Date: Tue, 12 Sep 2000 11:53:56 -0400 (EDT) From: Christopher Stein X-Sender: stein@wally To: Marius Bendiksen Cc: freebsd-fs@FreeBSD.ORG Subject: Re: how mmap buffer writes handled? 
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Yes, it appears to be done in vfs_setdirty() of kern/vfs_bio.c

Like so:

    for (i = 0; i < bp->b_npages; i++) {
            vm_page_flag_clear(bp->b_pages[i], PG_ZERO);
            vm_page_test_dirty(bp->b_pages[i]);
    }

My concern is that, for mmapped workloads, statistics like nbuf and numdirtybuffers, which are used to set the buf_daemon flush rate, will be meaningless as will the clean and dirty buffer lists.

vfs_setdirty is called only from within bdwrite() (delayed write that writes from applications into mmapped buffers will not transit through) and vfs_busy_pages() (called before the physical device strategy routine).

How are these data structures and statistics kept meaningful under mmapped workloads?

thnx
-Chris

On Sat, 9 Sep 2000, Marius Bendiksen wrote:

> > dirtied by mmap moved onto the dirty queue? IS this done
> > synchronously by some kind of software intercept of the
> > page table operations or are the buffers moved from the
> > clean to dirty queues in the background?
>
> As I recall, a periodic scan of the "modified" bits of the various page
> table entries will be made, and the buffers will be dirtied accordingly
> as the scan completes.
> > Marius > > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Sep 12 18:27:32 2000 Delivered-To: freebsd-fs@freebsd.org Received: from mail-relay.eunet.no (mail-relay.eunet.no [193.71.71.242]) by hub.freebsd.org (Postfix) with ESMTP id C781D37B424 for ; Tue, 12 Sep 2000 18:27:29 -0700 (PDT) Received: from login-1.eunet.no (login-1.eunet.no [193.75.110.2]) by mail-relay.eunet.no (8.9.3/8.9.3/GN) with ESMTP id DAA56361; Wed, 13 Sep 2000 03:27:25 +0200 (CEST) (envelope-from mbendiks@eunet.no) Received: from localhost (mbendiks@localhost) by login-1.eunet.no (8.9.3/8.8.8) with ESMTP id DAA85567; Wed, 13 Sep 2000 03:27:25 +0200 (CEST) (envelope-from mbendiks@eunet.no) X-Authentication-Warning: login-1.eunet.no: mbendiks owned process doing -bs Date: Wed, 13 Sep 2000 03:27:25 +0200 (CEST) From: Marius Bendiksen To: Christopher Stein Cc: freebsd-fs@FreeBSD.ORG Subject: Re: how mmap buffer writes handled? In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org > How are these data structures and statistics kept meaningful > under mmapped workloads? Most architectures that have an MMU, such as the x86, have a bit in their page tables or equivalent that will indicate whether a page has been modified since the last time that bit was cleared. This can be sampled and cleared in one go. On architectures lacking an MMU, I think the logical approach would be to use some of the protection facilities or such to force an exception to be raised when accessing the page for write, and updating the statistics based on that. 
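
The protection-based approach described in the paragraph above can be tried from user space. The following C sketch is an assumption-laden illustration, not how the FreeBSD kernel does it: it maps a few pages read-only, catches the first write fault on each page, records the page as dirty, and then re-enables writes. The names `tracker_init`, `tracked_store`, `dirty`, and `NPAGES` are invented for the example, and calling mprotect() from a signal handler is common practice but not guaranteed by POSIX.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define NPAGES 4

static unsigned char *region;               /* the tracked mapping */
static size_t page_size;
static volatile sig_atomic_t dirty[NPAGES]; /* one "modified" flag per page */

/* First write to a clean (read-only) page lands here. */
static void
on_write_fault(int sig, siginfo_t *info, void *ctx)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)region;
    size_t idx;

    (void)sig;
    (void)ctx;
    if (region == NULL || addr < base || addr >= base + NPAGES * page_size)
        _exit(1);               /* a genuine crash, not a tracked page */
    idx = (addr - base) / page_size;
    dirty[idx] = 1;             /* "move the page to the dirty list" */
    /* Allow writes so the faulting store can be restarted. */
    if (mprotect(region + idx * page_size, page_size,
        PROT_READ | PROT_WRITE) != 0)
        _exit(1);
}

/* Map NPAGES pages, leave them all clean (read-only), install the handler. */
static int
tracker_init(void)
{
    struct sigaction sa;
    long ps = sysconf(_SC_PAGESIZE);
    void *p;

    if (ps <= 0)
        return -1;
    page_size = (size_t)ps;
    p = mmap(NULL, NPAGES * page_size, PROT_READ,
        MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    region = p;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_write_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    /* Some systems report the fault as SIGBUS rather than SIGSEGV. */
    if (sigaction(SIGSEGV, &sa, NULL) != 0 ||
        sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;
    return 0;
}

/* Store through a volatile pointer so the write stays in program order. */
static void
tracked_store(size_t off, unsigned char value)
{
    ((volatile unsigned char *)region)[off] = value;
}
```

After `tracker_init()` succeeds, a call such as `tracked_store(2 * page_size + 5, 42)` takes one fault and sets `dirty[2]`; later writes to the same page proceed without faulting, which is the cost trade-off compared with periodically sampling a hardware modified bit.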
Marius

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Tue Sep 12 18:52: 6 2000
Delivered-To: freebsd-fs@freebsd.org
Received: from wally.eecs.harvard.edu (wally.eecs.harvard.edu [140.247.60.30]) by hub.freebsd.org (Postfix) with ESMTP id 3F8E437B424 for ; Tue, 12 Sep 2000 18:52:03 -0700 (PDT)
Received: from localhost (stein@localhost) by wally.eecs.harvard.edu (8.10.0/8.10.0) with ESMTP id e8D1pxA27844; Tue, 12 Sep 2000 21:51:59 -0400 (EDT)
Date: Tue, 12 Sep 2000 21:51:59 -0400 (EDT)
From: Christopher Stein
X-Sender: stein@wally
To: Marius Bendiksen
Cc: freebsd-fs@FreeBSD.ORG
Subject: Re: how mmap buffer writes handled?
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

I am aware of MMUs and COW-driven software emulation bits for architectures lacking them. You misinterpreted my concern. Or, more likely, I did not articulate it well enough. Here's another go:

The modified bits in the page table entries will be set as an mmapped buffer is dirtied by the application. Suppose this buffer is on the clean buffer queue rather than the dirty queue. How will it be transferred to the dirty queue, where it belongs?

If this is done by a periodic scan, then code like the buf_daemon is heavily dependent on the period of this scan to be responsive under mmap-heavy workloads. That would be an interesting tuning issue. However, I can't find this comprehensive scan.

vfs_setdirty() is something close - scanning through a buffer's pages, setting its dirty interval, then setting the pte modified bits to clean so that pageout doesn't begin acting on this FS buffer's pages. vfs_setdirty() itself is only called from bdwrite() and vfs_busy_pages(). This on its own is insufficient to correct statistics like numdirtybuffers and move a buffer sitting on vp->v_cleanblkhd to vp->v_dirtyblkhd.
thnx -Chris On Wed, 13 Sep 2000, Marius Bendiksen wrote: > > How are these data structures and statistics kept meaningful > > under mmapped workloads? > > Most architectures that have an MMU, such as the x86, have a bit in their > page tables or equivalent that will indicate whether a page has been > modified since the last time that bit was cleared. This can be sampled and > cleared in one go. > > On architectures lacking an MMU, I think the logical approach would be to > use some of the protection facilities or such to force an exception to be > raised when accessing the page for write, and updating the statistics > based on that. > > Marius > > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-fs" in the body of the message > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Tue Sep 12 20:56: 3 2000 Delivered-To: freebsd-fs@freebsd.org Received: from cs.columbia.edu (cs.columbia.edu [128.59.16.20]) by hub.freebsd.org (Postfix) with ESMTP id A849F37B423; Tue, 12 Sep 2000 20:55:59 -0700 (PDT) Received: from shekel.mcl.cs.columbia.edu (shekel.mcl.cs.columbia.edu [128.59.18.15]) by cs.columbia.edu (8.9.3/8.9.3) with ESMTP id XAA27985; Tue, 12 Sep 2000 23:55:56 -0400 (EDT) Received: (from ezk@localhost) by shekel.mcl.cs.columbia.edu (8.9.3/8.9.3) id XAA23270; Tue, 12 Sep 2000 23:55:52 -0400 (EDT) Date: Tue, 12 Sep 2000 23:55:52 -0400 (EDT) Message-Id: <200009130355.XAA23270@shekel.mcl.cs.columbia.edu> From: Erez Zadok To: Boris Popov Cc: Terry Lambert , freebsd-fs@FreeBSD.ORG, dillon@FreeBSD.ORG, semenu@FreeBSD.ORG, tegge@FreeBSD.ORG Subject: Re: CFR: nullfs, vm_objects and locks... (patch) In-reply-to: Your message of "Tue, 12 Sep 2000 08:29:14 +0700." 
Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org In message , Boris Popov writes: > On Mon, 11 Sep 2000, Terry Lambert wrote: > > > > The first one, is an inability to handle mmap() operation. This > > > comes from the VM/vnode_pager design where each vm_object associated with > > > a single vnode and vise versa. Looking at the problem in general one may > > > note, that stackable filesystems may have either separated vm_object per > > > layer or don't have it at all. Since nullfs essentially maps its vnodes to > > > underlying filesystem, it is reasonable to map all operations to > > > underlying vnode. > > > > I had a similar approach, which uses only one additional call: > > > > struct vnode *VOP_FINALVP(struct vnode *vp); > > Three separate calls provide explicit and clear create/use/destroy > paradigm (it is somewhat broken by VFS but can be solved in future). > > > My approach, and the one you have put forward, are both flawed, if > > you try to move beyond the simple case of a 1:1 correspondance > > between stacking layers and underlying objects. > > They're not flawed, but require complex layers to implement its > own VOP_GETPAGES/PUTPAGES operations. IMO, there can be more than one VM > backing object in the stack, so any layer which performs data conversion > tasks or gather data from multiple underlying filesystems can provide > its own VM object to keep coherency between mmap() and read/write > operations. On some point support routines can be integrated into VFS > code. Cache coherency issues is just one angle. Do either proposals handle fan-out file systems, where may be more than one backing VP? IMHO if we're changing the VFS, might as well have support for custom page coherency schemes, fan-in, fan-out, etc. Erez. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Wed Sep 13 5:57:28 2000
Delivered-To: freebsd-fs@freebsd.org
Received: from urban.iinet.net.au (urban.iinet.net.au [203.59.24.231]) by hub.freebsd.org (Postfix) with ESMTP id 57BB437B422 for ; Wed, 13 Sep 2000 05:57:24 -0700 (PDT)
Received: from muzak.iinet.net.au (muzak.iinet.net.au [203.59.24.237]) by urban.iinet.net.au (8.8.7/8.8.7) with ESMTP id UAA18418; Wed, 13 Sep 2000 20:57:16 +0800
Received: from jules.elischer.org (reggae-12-112.nv.iinet.net.au [203.59.92.112]) by muzak.iinet.net.au (8.8.5/8.8.5) with SMTP id UAA11114; Wed, 13 Sep 2000 20:57:13 +0800
Message-ID: <39BF79A3.2781E494@elischer.org>
Date: Wed, 13 Sep 2000 05:57:07 -0700
From: Julian Elischer
X-Mailer: Mozilla 3.04Gold (X11; I; FreeBSD 5.0-CURRENT i386)
MIME-Version: 1.0
To: Christopher Stein
Cc: Marius Bendiksen , freebsd-fs@FreeBSD.ORG
Subject: Re: how mmap buffer writes handled?
References:
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Christopher Stein wrote:
>
> I am aware of MMUs and COW-driven software emulation bits for
> architectures lacking them. You misinterpreted my concern. Or, more
> likely, I did not articulate it well enough. Here's another go:
>
> The modified bits in the page table entries will be set as an
> mmapped buffer is dirtied by the application. Suppose this
> buffer is on the clean buffer queue rather than the dirty queue.
> How will it be transferred to the dirty queue, where it belongs?
>
> If this is done by a periodic scan, then code like the buf_daemon
> are heavily dependent on the period of this scan to be responsive
> under mmap heavy workloads. That would be an interesting
> tuning issue. However, I can't find this comprehensive scan.

while buffers are clean they are marked read-only in hardware.
on first write, a fault is taken, which transfers it to the dirty queue. > > vfs_setdirty() is something close - scanning through a buffers pages, > setting its dirty interval, then setting the pte modified bits to clean so > that pageout doesn't begin acting on this FS buffer's > pages. vfs_set_dirty() itself is only called from bdwrite() and > vfs_busy_pages(). This on its own is insufficient to correct statistics > like numdirtybuffers and move a buffer sitting on vp->v_cleanblkhd to > vp->v_dirtyblkhd. > > thnx > -Chris > > On Wed, 13 Sep 2000, Marius Bendiksen wrote: > > > > How are these data structures and statistics kept meaningful > > > under mmapped workloads? > > > > Most architectures that have an MMU, such as the x86, have a bit in their > > page tables or equivalent that will indicate whether a page has been > > modified since the last time that bit was cleared. This can be sampled and > > cleared in one go. > > > > On architectures lacking an MMU, I think the logical approach would be to > > use some of the protection facilities or such to force an exception to be > > raised when accessing the page for write, and updating the statistics > > based on that. 
> > > > Marius > > > > > > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > > with "unsubscribe freebsd-fs" in the body of the message > > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-fs" in the body of the message -- __--_|\ Julian Elischer / \ julian@elischer.org ( OZ ) World tour 2000 ---> X_.---._/ presently in: Perth v To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Sep 13 10: 1:48 2000 Delivered-To: freebsd-fs@freebsd.org Received: from mail.integratus.com (miami.integratus.com [63.209.2.83]) by hub.freebsd.org (Postfix) with SMTP id 7343F37B422 for ; Wed, 13 Sep 2000 10:01:46 -0700 (PDT) Received: (qmail 19282 invoked from network); 13 Sep 2000 17:01:46 -0000 Received: from kungfu.integratus.com (HELO integratus.com) (172.20.5.168) by tortuga1.integratus.com with SMTP; 13 Sep 2000 17:01:46 -0000 Message-ID: <39BFB2F9.AB6CEE34@integratus.com> Date: Wed, 13 Sep 2000 10:01:45 -0700 From: Jack Rusher Organization: Integratus X-Mailer: Mozilla 4.73 [en] (X11; I; Linux 2.2.12 i386) X-Accept-Language: en MIME-Version: 1.0 To: Erez Zadok Cc: Boris Popov , Terry Lambert , freebsd-fs@FreeBSD.ORG, dillon@FreeBSD.ORG, semenu@FreeBSD.ORG, tegge@FreeBSD.ORG Subject: Re: CFR: nullfs, vm_objects and locks... (patch) References: <200009130355.XAA23270@shekel.mcl.cs.columbia.edu> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org Erez Zadok wrote: > > changing the VFS, might as well have support for custom page coherency > schemes, fan-in, fan-out, etc. Anyone planning to tackle a major overhaul of the VFS/vnode behavior should probably take the half hour to read this paper: http://www.cs.princeton.edu/~rywang/berkeley/papers/spe98.html ...it is a set of observations concerning the creation of the xFS "serverless" network file system. 
In particular, there are some interesting points made concerning the difficulties of getting the UNIX kernel file system interface to work with designs for highly cache coherent cluster computing systems. -- Jack Rusher, Senior Engineer | mailto:jar@integratus.com Integratus, Inc. | http://www.integratus.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Wed Sep 13 12:28:39 2000 Delivered-To: freebsd-fs@freebsd.org Received: from mail.uni-bielefeld.de (mail2.uni-bielefeld.de [129.70.4.90]) by hub.freebsd.org (Postfix) with ESMTP id 7126B37B422; Wed, 13 Sep 2000 12:28:32 -0700 (PDT) Received: from frolic.no-support.loc (ppp36-214.hrz.uni-bielefeld.de [129.70.36.214]) by mail.uni-bielefeld.de (Sun Internet Mail Server sims.4.0.2000.05.17.04.13.p6) with ESMTP id <0G0U002D9BFBWS@mail.uni-bielefeld.de>; Wed, 13 Sep 2000 21:28:30 +0200 (MET DST) Received: (from bjoern@localhost) by frolic.no-support.loc (8.9.3/8.9.3) id TAA01958; Wed, 13 Sep 2000 19:05:53 +0200 (CEST envelope-from bjoern) Date: Wed, 13 Sep 2000 19:05:53 +0200 From: Bjoern Fischer Subject: Re: CFR: nullfs, vm_objects and locks... (patch) In-reply-to: <200009051942.MAA76219@earth.backplane.com>; from dillon@earth.backplane.com on Tue, Sep 05, 2000 at 12:42:19PM -0700 To: Matt Dillon Cc: Boris Popov , "Daniel C. 
Sobral" , freebsd-fs@FreeBSD.ORG, semenu@FreeBSD.ORG, tegge@FreeBSD.ORG Message-id: <20000913190552.B1450@frolic.no-support.loc> MIME-version: 1.0 Content-type: text/plain; charset=us-ascii Content-disposition: inline Content-transfer-encoding: 7BIT User-Agent: Mutt/1.2.5i References: <200009051942.MAA76219@earth.backplane.com> Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Tue, Sep 05, 2000 at 12:42:19PM -0700, Matt Dillon wrote: > I agree with all of Boris's points in regards to the two major changes: > Adding VOP's to access the VM object, and integrating the vnode lock > into the vnode directly. > > There is one issue which needs to be resolved, and that is with NFS. It > is not safe to lock vnodes related to NFS, which is why the NFS VOP locking > routines always force shared locks. This problem would have to be > resolved. Would this also apply to a possible NFSv4 implementation in future? There is an implementation under way for OpenBSD, how do they approach locking and stackable fs? Bjoern -- -----BEGIN GEEK CODE BLOCK----- GCS d--(+) s++: a- C+++(-) UB++++OSI++++$ P+++(-) L---(++) !E W- N+ o>+ K- !w !O !M !V PS++ PE- PGP++ t+++ !5 X++ tv- b+++ D++ G e+ h-- y+ ------END GEEK CODE BLOCK------ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Sep 14 2:58:17 2000 Delivered-To: freebsd-fs@freebsd.org Received: from gilberto.physik.rwth-aachen.de (gilberto.physik.rwth-aachen.de [137.226.30.2]) by hub.freebsd.org (Postfix) with ESMTP id 7DBBC37B422 for ; Thu, 14 Sep 2000 02:58:15 -0700 (PDT) Received: (from kuku@localhost) by gilberto.physik.rwth-aachen.de (8.9.3/8.9.3) id LAA77431 for freebsd-fs@freebsd.org; Thu, 14 Sep 2000 11:58:18 +0200 (CEST) (envelope-from kuku) Date: Thu, 14 Sep 2000 11:58:18 +0200 (CEST) From: Christoph Kukulies Message-Id: <200009140958.LAA77431@gilberto.physik.rwth-aachen.de> To: freebsd-fs@freebsd.org Subject: crypto fs? 
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Is there an implementation of the crypto filesystem for FreeBSD?

Such that a disk that falls into the hands of anyone not knowing the secret key cannot be deciphered in the duration of the universe?

--
Chris Christoph P. U. Kukulies kuku@gil.physik.rwth-aachen.de

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Thu Sep 14 3:16:59 2000
Delivered-To: freebsd-fs@freebsd.org
Received: from beamer.mchh.siemens.de (beamer.mchh.siemens.de [194.138.158.163]) by hub.freebsd.org (Postfix) with ESMTP id 31CF937B422 for ; Thu, 14 Sep 2000 03:16:57 -0700 (PDT)
Received: from moody.mchh.siemens.de (mail2.mchh.siemens.de [194.138.158.226]) by beamer.mchh.siemens.de (8.9.3/8.9.3) with ESMTP id MAA15914; Thu, 14 Sep 2000 12:16:24 +0200 (MET DST)
Received: from mchh247e.demchh201e.icn.siemens.de ([139.21.200.57]) by moody.mchh.siemens.de (8.9.1/8.9.1) with ESMTP id MAA12076; Thu, 14 Sep 2000 12:16:14 +0200 (MET DST)
Received: by MCHH247E with Internet Mail Service (5.5.2650.21) id ; Thu, 14 Sep 2000 12:16:49 +0200
Message-ID: <67E0BE167008D31185F60008C7289DA0E12F00@MCHH218E>
From: Reifenberger Michael
To: "'Christoph Kukulies'" , freebsd-fs@FreeBSD.ORG
Subject: AW: crypto fs?
Date: Thu, 14 Sep 2000 12:16:49 +0200
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Hi,
see /usr/ports/security/cfs.

Bye/2
------
Michael Reifenberger - IT, UNIX, R/3-Basis
Work: Michael.Reifenberger@plaut.de Proj: Michael.Reifenberger.gp@icn.siemens.de
Pers: Michael@Reifenberger.com Webspace: http://www.reifenberger.com

> -----Original Message-----
> From: Christoph Kukulies [SMTP:kuku@gilberto.physik.rwth-aachen.de]
> Sent: Thursday, 14
September 2000 11:58
> To: freebsd-fs@FreeBSD.ORG
> Subject: crypto fs?
>
>
> Is there an implementation of the crypto filesystem for FreeBSD?
>
> Such that a disk that falls into hands of anyone not knowing
> the secret key cannot be decyphered in the duration of the universe?
>
> --
> Chris Christoph P. U. Kukulies kuku@gil.physik.rwth-aachen.de
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-fs" in the body of the message

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message

From owner-freebsd-fs Thu Sep 14 7: 1:29 2000
Delivered-To: freebsd-fs@freebsd.org
Received: from h-209-91-79-2.gen.cadvision.com (h-209-91-79-2.gen.cadvision.com [209.91.79.2]) by hub.freebsd.org (Postfix) with ESMTP id E3A0537B43E for ; Thu, 14 Sep 2000 07:01:25 -0700 (PDT)
Received: from cirp.org (localhost [127.0.0.1]) by h-209-91-79-2.gen.cadvision.com (8.9.3/8.9.3) with ESMTP id IAA03781 for ; Thu, 14 Sep 2000 08:01:22 -0600 (MDT) (envelope-from gtf@cirp.org)
Message-Id: <200009141401.IAA03781@h-209-91-79-2.gen.cadvision.com>
Date: Thu, 14 Sep 2000 08:01:21 -0600 (MDT)
From: "Geoffrey T. Falk"
Subject: Re: AW: crypto fs?
To: freebsd-fs@freebsd.org
In-Reply-To: <67E0BE167008D31185F60008C7289DA0E12F00@MCHH218E>
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8BIT
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

I have played with CFS. It is unsatisfactory for a number of reasons.

It is implemented via a daemon that runs over an NFS connection. This is not clean. It is also slow.

CFS takes over a directory in a filesystem. It does not encrypt disk blocks. It leaves information about your directory topology and file sizes available to an attacker.

The CFS daemon also has a memory leak (at least in the current version). You will notice this if you copy several GB or if you leave it up and running for a while.
A proper crypto filesystem would encrypt the blocks in the strategy() routine. One could run a standard FFS directly on top of it. I have searched for such a project but did not find anything.

As an aside, in the process of investigating this, I discovered that documentation on BSD internals is severely underpublished. In contrast, I found an entire O'Reilly book on the Linux filesystem, complete with code samples.

Regards
g.

On 14 Sep, Reifenberger Michael wrote:
> Hi,
> see /usr/ports/security/cfs.
>
> Bye/2
> ------
> Michael Reifenberger - IT, UNIX, R/3-Basis
> Work: Michael.Reifenberger@plaut.de Proj: Michael.Reifenberger.gp@icn.siemens.de
> Pers: Michael@Reifenberger.com Webspace: http://www.reifenberger.com
>
>> -----Original Message-----
>> From: Christoph Kukulies [SMTP:kuku@gilberto.physik.rwth-aachen.de]
>> Sent: Thursday, 14 September 2000 11:58
>> To: freebsd-fs@FreeBSD.ORG
>> Subject: crypto fs?
>>
>>
>> Is there an implementation of the crypto filesystem for FreeBSD?
>>
>> Such that a disk that falls into hands of anyone not knowing
>> the secret key cannot be decyphered in the duration of the universe?
>>
>> --
>> Chris Christoph P. U.
Kukulies kuku@gil.physik.rwth-aachen.de >> >> >> To Unsubscribe: send mail to majordomo@FreeBSD.org >> with "unsubscribe freebsd-fs" in the body of the message > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-fs" in the body of the message > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message From owner-freebsd-fs Thu Sep 14 7:34:50 2000 Delivered-To: freebsd-fs@freebsd.org Received: from h-209-91-79-2.gen.cadvision.com (h-209-91-79-2.gen.cadvision.com [209.91.79.2]) by hub.freebsd.org (Postfix) with ESMTP id 41AAC37B422 for ; Thu, 14 Sep 2000 07:34:48 -0700 (PDT) Received: from cirp.org (localhost [127.0.0.1]) by h-209-91-79-2.gen.cadvision.com (8.9.3/8.9.3) with ESMTP id IAA03818 for ; Thu, 14 Sep 2000 08:34:40 -0600 (MDT) (envelope-from gtf@cirp.org) Message-Id: <200009141434.IAA03818@h-209-91-79-2.gen.cadvision.com> Date: Thu, 14 Sep 2000 08:34:39 -0600 (MDT) From: "Geoffrey T. Falk" Subject: Re: AW: crypto fs? To: freebsd-fs@FreeBSD.ORG In-Reply-To: <200009141401.IAA03781@h-209-91-79-2.gen.cadvision.com> MIME-Version: 1.0 Content-Type: TEXT/plain; CHARSET=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On 14 Sep, I wrote: > A proper crypto filesystem would encrypt the blocks in the strategy() > routine. One could run a standard FFS directly on top of it. To clarify, obviously, I was thinking of implementing an encrypted device as a pseudo- block device, that maps to an existing partition. The passphrase could be set using an ioctl(). A main concern with crypto FS is keeping plaintext blocks from being swapped out. If you are following this approach, you would also encrypt your swap devices. The whole issue of crypto services in the kernel is one I would like to see developing. To my knowledge not even OpenBSD has gone this far. g. 
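
The encrypted pseudo-device idea in the message above can be modelled in a few lines of user-space C. This is a sketch under loud assumptions: the "partition" is a RAM array, `cryptdev_setkey()` stands in for the ioctl() that would supply the passphrase, and the transform is a placeholder XOR stream that is NOT real cryptography. All names are hypothetical; a real driver would sit in the strategy() path and call a vetted cipher.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLKSIZE 512
#define NBLOCKS 8

/* Hypothetical lower device: a RAM array standing in for the partition. */
static uint8_t lower_dev[NBLOCKS][BLKSIZE];

/* Key material; in the proposal this would arrive through an ioctl(). */
static uint8_t key[16];
static size_t keylen;

static void
cryptdev_setkey(const uint8_t *k, size_t len)
{
    keylen = len < sizeof(key) ? len : sizeof(key);
    memcpy(key, k, keylen);
}

/*
 * PLACEHOLDER transform, not secure: XOR each byte with a value derived
 * from the key, the block number and the offset. XOR is its own inverse,
 * so the same routine serves the read and the write path.
 * Callers must ensure keylen != 0.
 */
static void
xform(uint8_t *buf, uint32_t blkno)
{
    size_t i;

    for (i = 0; i < BLKSIZE; i++)
        buf[i] ^= (uint8_t)(key[i % keylen] + blkno + i);
}

/* Write path: transform a copy, then hand it to the lower device. */
static int
cryptdev_write(uint32_t blkno, const uint8_t *plain)
{
    uint8_t tmp[BLKSIZE];

    if (blkno >= NBLOCKS || keylen == 0)
        return -1;
    memcpy(tmp, plain, BLKSIZE);
    xform(tmp, blkno);
    memcpy(lower_dev[blkno], tmp, BLKSIZE);
    return 0;
}

/* Read path: fetch from the lower device, then undo the transform. */
static int
cryptdev_read(uint32_t blkno, uint8_t *plain)
{
    if (blkno >= NBLOCKS || keylen == 0)
        return -1;
    memcpy(plain, lower_dev[blkno], BLKSIZE);
    xform(plain, blkno);
    return 0;
}
```

Because the transform happens below the file system, an FFS layered on top would see ordinary blocks and need no changes, which is the attraction of the approach. The sketch does nothing about the swap concern raised in the message: plaintext still lives in the caller's buffers.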

From owner-freebsd-fs Thu Sep 14 22:41:54 2000
Delivered-To: freebsd-fs@freebsd.org
Received: from vbook.express.ru (vbook.express.ru [212.24.37.106]) by hub.freebsd.org (Postfix) with ESMTP id EE68D37B423 for ; Thu, 14 Sep 2000 22:41:50 -0700 (PDT)
Received: (from vova@localhost) by vbook.express.ru (8.9.3/8.9.3) id JAA39002; Fri, 15 Sep 2000 09:18:05 +0400 (MSD) (envelope-from vova)
From: "Vladimir B. Grebenschikov"
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <14785.45324.164570.436002@vbook.express.ru>
Date: Fri, 15 Sep 2000 09:18:04 +0400 (MSD)
To: "Geoffrey T. Falk"
Cc: freebsd-fs@FreeBSD.ORG
Subject: Re: AW: crypto fs?
In-Reply-To: <200009141434.IAA03818@h-209-91-79-2.gen.cadvision.com>
References: <200009141401.IAA03781@h-209-91-79-2.gen.cadvision.com> <200009141434.IAA03818@h-209-91-79-2.gen.cadvision.com>
X-Mailer: VM 6.72 under 21.1 (patch 9) "Canyonlands" XEmacs Lucid
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Geoffrey T. Falk writes:
> On 14 Sep, I wrote:
> > A proper crypto filesystem would encrypt the blocks in the strategy()
> > routine. One could run a standard FFS directly on top of it.
>
> To clarify, obviously, I was thinking of implementing an encrypted
> device as a pseudo-block device, that maps to an existing partition.
> The passphrase could be set using an ioctl().

Maybe portalfs would help you? (man mount_portalfs)

> A main concern with crypto FS is keeping plaintext blocks from being
> swapped out. If you are following this approach, you would also encrypt
> your swap devices.
>
> The whole issue of crypto services in the kernel is one I would like to
> see developing. To my knowledge not even OpenBSD has gone this far.
>
> g.
>

--
TSB Russian Express, Moscow
Vladimir B.
Grebenschikov, vova@express.ru

From owner-freebsd-fs Fri Sep 15 7:54:38 2000
Delivered-To: freebsd-fs@freebsd.org
Received: from mail.over.ru (over.rinet.ru [195.54.192.99]) by hub.freebsd.org (Postfix) with SMTP id 1BCE237B423 for ; Fri, 15 Sep 2000 07:54:36 -0700 (PDT)
Received: (qmail 52094 invoked by uid 1001); 15 Sep 2000 14:54:59 -0000
Date: Fri, 15 Sep 2000 18:54:58 +0400
From: Alex Povolotsky
To: "Vladimir B. Grebenschikov"
Cc: freebsd-fs@freebsd.org
Subject: Re: AW: crypto fs?
Message-ID: <20000915185458.D47468@mail.over.ru>
References: <200009141401.IAA03781@h-209-91-79-2.gen.cadvision.com> <200009141434.IAA03818@h-209-91-79-2.gen.cadvision.com> <14785.45324.164570.436002@vbook.express.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
In-Reply-To: <14785.45324.164570.436002@vbook.express.ru>; from vova@express.ru on Fri, Sep 15, 2000 at 09:18:04AM +0400
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Fri, Sep 15, 2000 at 09:18:04AM +0400, Vladimir B. Grebenschikov wrote:
> May be portalfs helps you ? (man mount_portalfs)

Is it alive? As far as I remember, it has been broken for at least two
major versions.

Alex.

From owner-freebsd-fs Fri Sep 15 8:23:54 2000
Delivered-To: freebsd-fs@freebsd.org
Received: from mail.uni-bielefeld.de (mail2.uni-bielefeld.de [129.70.4.90]) by hub.freebsd.org (Postfix) with ESMTP id 6F79737B42C for ; Fri, 15 Sep 2000 08:23:49 -0700 (PDT)
Received: from frolic.no-support.loc (ppp36-174.hrz.uni-bielefeld.de [129.70.36.174]) by mail.uni-bielefeld.de (Sun Internet Mail Server sims.4.0.2000.05.17.04.13.p6) with ESMTP id <0G0X0005XPFJFC@mail.uni-bielefeld.de> for freebsd-fs@FreeBSD.ORG; Fri, 15 Sep 2000 17:23:47 +0200 (MET DST)
Received: (from bjoern@localhost) by frolic.no-support.loc (8.9.3/8.9.3) id CAA03193 for freebsd-fs@FreeBSD.ORG; Fri, 15 Sep 2000 02:51:02 +0200 (CEST envelope-from bjoern)
Date: Fri, 15 Sep 2000 02:43:01 +0200
From: Bjoern Fischer
Subject: Re: crypto fs?
In-reply-to: <200009140958.LAA77431@gilberto.physik.rwth-aachen.de>; from kuku@gilberto.physik.rwth-aachen.de on Thu, Sep 14, 2000 at 11:58:18AM +0200
To: Christoph Kukulies
Message-id: <20000915024301.A2859@frolic.no-support.loc>
MIME-version: 1.0
Content-type: text/plain; charset=us-ascii
Content-disposition: inline
Content-transfer-encoding: 7BIT
User-Agent: Mutt/1.2.5i
References: <200009140958.LAA77431@gilberto.physik.rwth-aachen.de>
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Thu, Sep 14, 2000 at 11:58:18AM +0200, Christoph Kukulies wrote:
>
> Is there an implementation of the crypto filesystem for FreeBSD?
>
> Such that a disk that falls into hands of anyone not knowing
> the secret key cannot be decyphered in the duration of the universe?

You might be interested in Rubberhose:

Quote:

Rubberhose transparently and deniably encrypts disk data, minimising the
effectiveness of warrants, coercive interrogations and other compulsive
mechanisms, such as U.K. RIP legislation.
Rubberhose differs from conventional disk encryption systems in that it
has an advanced modular architecture, self-test suite, is more secure,
portable, utilises information hiding (steganography / deniable
cryptography), works with any file system and has source freely
available. Currently supported ciphers are DES, 3DES, IDEA, RC5, RC6,
Blowfish, Twofish and CAST.

bjoern

--
-----BEGIN GEEK CODE BLOCK-----
GCS d--(+) s++: a- C+++(-) UB++++OSI++++$ P+++(-) L---(++) !E W- N+ o>+ K- !w !O !M !V PS++ PE- PGP++ t+++ !5 X++ tv- b+++ D++ G e+ h-- y+
------END GEEK CODE BLOCK------

From owner-freebsd-fs Sat Sep 16 7:14:21 2000
Delivered-To: freebsd-fs@freebsd.org
Received: from vbook.express.ru (vbook.express.ru [212.24.37.106]) by hub.freebsd.org (Postfix) with ESMTP id 9EF0837B424 for ; Sat, 16 Sep 2000 07:14:18 -0700 (PDT)
Received: (from vova@localhost) by vbook.express.ru (8.9.3/8.9.3) id KAA14494; Sat, 16 Sep 2000 10:47:20 +0400 (MSD) (envelope-from vova)
From: "Vladimir B. Grebenschikov"
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <14787.6008.423175.556735@vbook.express.ru>
Date: Sat, 16 Sep 2000 10:47:20 +0400 (MSD)
To: Alex Povolotsky
Cc: "Vladimir B. Grebenschikov" , freebsd-fs@freebsd.org
Subject: Re: AW: crypto fs?
In-Reply-To: <20000915185458.D47468@mail.over.ru>
References: <200009141401.IAA03781@h-209-91-79-2.gen.cadvision.com> <200009141434.IAA03818@h-209-91-79-2.gen.cadvision.com> <14785.45324.164570.436002@vbook.express.ru> <20000915185458.D47468@mail.over.ru>
X-Mailer: VM 6.72 under 21.1 (patch 9) "Canyonlands" XEmacs Lucid
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Alex Povolotsky writes:
> > May be portalfs helps you ? (man mount_portalfs)
> Is it alive? As far as I remember, it is broken for at least two major version.

To be honest, I haven't tried to use it; I only mentioned it as a
promising possibility.

> Alex.

--
TSB Russian Express, Moscow
Vladimir B. Grebenschikov, vova@express.ru

From owner-freebsd-fs Sat Sep 16 14:25:40 2000
Delivered-To: freebsd-fs@freebsd.org
Received: from Gloria.CAM.ORG (Gloria.CAM.ORG [205.151.116.34]) by hub.freebsd.org (Postfix) with ESMTP id 6003937B422; Sat, 16 Sep 2000 14:25:36 -0700 (PDT)
Received: from localhost (intmktg@localhost) by Gloria.CAM.ORG (8.9.3/8.9.3) with ESMTP id RAA08307; Sat, 16 Sep 2000 17:20:11 -0400
Date: Sat, 16 Sep 2000 17:20:11 -0400 (EDT)
From: Marc Tardif
To: freebsd-hackers@freebsd.org, freebsd-fs@freebsd.org
Subject: device naming convention
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

What is the FreeBSD naming convention for devices of disk slices and
labels? Considering my system is installed on the first partition of
/dev/wd0 (non-dedicated), these are the block-device interfaces I
have to my disk:

wd0     wd0c    wd0f    wd0s1   wd0s1c  wd0s1f  wd0s2
wd0a    wd0d    wd0g    wd0s1a  wd0s1d  wd0s1g  wd0s3
wd0b    wd0e    wd0h    wd0s1b  wd0s1e  wd0s1h  wd0s4

Questions:
1. What are wd0[a-h] used for?
2. If wd0s1 is my first slice, why isn't it named wd0s0?
3. If I format wd0s2 as any type (Xenix for example),
   will /dev now contain wd0s2[a-h]?

Assuming /dev/wd0s2 contains a few blocks, i.e. /dev/wd0s1
doesn't span to the end of the disk:
4. If I want to use /dev/wd0s2 as a raw slice for reading
   and writing, what are the steps to follow?
4a. Do I need to format the partition as any type? If so,
    is there a recommended type (perhaps one which won't
    be recognised by the bootloader would be preferable)?
4b. Should I then be using /dev/rwd0s2 or /dev/rwd0s2a
    for reading and writing (of course, this is assuming
    block i/o in multiples of 512 bytes)?

Lastly, where else could I have found this information other
than asking on the FreeBSD mailing list?

Thanks,
Marc Tardif

From owner-freebsd-fs Sat Sep 16 16:27:17 2000
Delivered-To: freebsd-fs@freebsd.org
Received: from aaz.links.ru (aaz.links.ru [193.125.152.37]) by hub.freebsd.org (Postfix) with ESMTP id A2B4A37B422; Sat, 16 Sep 2000 16:27:10 -0700 (PDT)
Received: (from babolo@localhost) by aaz.links.ru (8.9.3/8.9.3) id DAA17240; Sun, 17 Sep 2000 03:27:05 +0400 (MSD)
Message-Id: <200009162327.DAA17240@aaz.links.ru>
Subject: Re: device naming convention
In-Reply-To: from "Marc Tardif" at "Sep 16, 0 05:20:11 pm"
To: intmktg@CAM.ORG (Marc Tardif)
Date: Sun, 17 Sep 2000 03:27:05 +0400 (MSD)
Cc: freebsd-hackers@FreeBSD.ORG, freebsd-fs@FreeBSD.ORG
From: "Aleksandr A.Babaylov"
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Marc Tardif writes:
> What is the FreeBSD naming convention for devices of disk slices and
> labels? Considering my system is installed on the first partition of
> /dev/wd0 (non-dedicated), these are the block-device interfaces I
> have to my disk:
>
> wd0     wd0c    wd0f    wd0s1   wd0s1c  wd0s1f  wd0s2
> wd0a    wd0d    wd0g    wd0s1a  wd0s1d  wd0s1g  wd0s3
> wd0b    wd0e    wd0h    wd0s1b  wd0s1e  wd0s1h  wd0s4
>
> Questions:
> 1. What are wd0[a-h] used for?

They stand for wd0sN[a-h], where N is the number of the first slice
recognized as a FreeBSD slice.

> 2. If wd0s1 is my first slice, why isn't it named wd0s0?

wd0s0 == wd0
wd0s0a == wd0a

> 3. If I format wd0s2 as any type (Xenix for example),
> will /dev now contain wd0s2[a-h]?

The contents of /dev are completely independent of any hardware and
kernel conditions. Create the wd0s2[a-h] entries yourself with

    cd /dev ; ./MAKEDEV wd0s2h

> Assuming /dev/wd0s2 contains a few blocks, ie /dev/wd0s1
> doesn't span to the end of disk:
> 4.
If I want to use /dev/wd0s2 as a raw slice for reading
> and writing, what are the steps to follow?

You can't write to the several blocks at the beginning of /dev/wd0s2.
Use /dev/wd0 with the proper address instead.

> 4a. Do I need to format the partition as any type? If so
> is there a recommended type (perhaps one which won't
> be recognised by the bootloader would be preferable)?

It depends on usage. And remember: the kernel examines every slice
looking for a FreeBSD label, even if you mark the slice as type 0
(unused).

> 4b. Should I then be using /dev/rwd0s2 or /dev/rwd0s2a
> for reading and writing (of course, this is assuming
> block i/o of multiples of 512 bytes)?

You can do what you want, but remember that [a-h] can be used only if
the partition has a FreeBSD label. For example, with this label on wd0s2

    # some space not included in FreeBSD partition  514080        0
    a:                                              514080   514080
    b:                                              514080  1028160

you can use wd0s2a AND wd0s2 as different file systems IF you properly
initialise them with newfs.

> Lastly, where else could I have found this information other
> than asking on the FreeBSD mailing list?

Read the sources and experiment.
BE AWARE - FreeBSD 4.X, unlike FreeBSD 2.2.X, is highly unstable when
you experiment with labels.

--
@BABOLO http://links.ru/

From owner-freebsd-fs Sat Sep 16 18:17: 4 2000
Delivered-To: freebsd-fs@freebsd.org
Received: from Gloria.CAM.ORG (Gloria.CAM.ORG [205.151.116.34]) by hub.freebsd.org (Postfix) with ESMTP id EFD7237B423; Sat, 16 Sep 2000 18:16:56 -0700 (PDT)
Received: from localhost (intmktg@localhost) by Gloria.CAM.ORG (8.9.3/8.9.3) with ESMTP id VAA09153; Sat, 16 Sep 2000 21:11:27 -0400
Date: Sat, 16 Sep 2000 21:11:27 -0400 (EDT)
From: Marc Tardif
To: "Aleksandr A.Babaylov"
Cc: freebsd-hackers@FreeBSD.ORG, freebsd-fs@FreeBSD.ORG
Subject: Re: device naming convention
In-Reply-To: <200009162327.DAA17240@aaz.links.ru>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

[ snip ]

> > 1. What are wd0[a-h] used for?
> For wd0sN[a-h] where N is number of first slice recognized
> as FreeBSD slice
>
If I understand correctly, wd0[a-h] will be the same as wd0s3[a-h] in a
situation where DOS is on the first slice, Linux on the second and
FreeBSD on the third, right? But what if the fourth slice is also
FreeBSD? In such a case, I'll assume you meant "booted slice" instead of
"first slice", where the slice selected when booting will be referred to
by the OS as wd0[a-h], which would translate to "current slice".
Confirmation of my assumption would be appreciated.

> > 2. If wd0s1 is my first slice, why isn't it named wd0s0?
> wd0s0 == wd0
> wd0s0a == wd0a
>
I somehow doubt that. Considering wd0s* goes from 1 to 4 inclusive, I
would tend to believe the first slice is wd0s1.

[ snip ]

> > Assuming /dev/wd0s2 contains a few blocks, ie /dev/wd0s1
> > doesn't span to the end of disk:
> > 4.
If I want to use /dev/wd0s2 as a raw slice for reading
> > and writing, what are the steps to follow?
> You can't write several blocks near /dev/wd0s2 beginning.
> Use /dev/wd0 with proper address
>
That is rather risky. Wouldn't it be safer to have a device name I could
dedicate to some purpose? In such a case, I could chown the device to an
appropriate username and group. Furthermore, I could avoid the
unfortunate mistake of overwriting my current FreeBSD fs in case I get
the addresses wrong.

> > 4a. Do I need to format the partition as any type? If so
> > is there a recommended type (perhaps one which won't
> > be recognised by the bootloader would be preferable)?
> It depends on usage. And remember - kernel looks up every
> slice to find FreeBSD label - even if you mark it 0 (unused)
>
How does it depend on usage? Are some formats preferable for some
specific usage (consider I'll only be using the raw interface to the
device)?

[ snip ]

Thanks for the first message,
Marc Tardif

From owner-freebsd-fs Sat Sep 16 19:39:45 2000
Delivered-To: freebsd-fs@freebsd.org
Received: from aaz.links.ru (aaz.links.ru [193.125.152.37]) by hub.freebsd.org (Postfix) with ESMTP id 10FD437B506; Sat, 16 Sep 2000 19:39:39 -0700 (PDT)
Received: (from babolo@localhost) by aaz.links.ru (8.9.3/8.9.3) id GAA18264; Sun, 17 Sep 2000 06:39:34 +0400 (MSD)
Message-Id: <200009170239.GAA18264@aaz.links.ru>
Subject: Re: device naming convention
In-Reply-To: from "Marc Tardif" at "Sep 16, 0 09:11:27 pm"
To: intmktg@CAM.ORG (Marc Tardif)
Date: Sun, 17 Sep 2000 06:39:34 +0400 (MSD)
Cc: babolo@links.ru, freebsd-hackers@FreeBSD.ORG, freebsd-fs@FreeBSD.ORG
From: "Aleksandr A.Babaylov"
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Marc Tardif writes:
> [ snip ]
> > > 1.
What are wd0[a-h] used for?
> > For wd0sN[a-h] where N is number of first slice recognized
> > as FreeBSD slice
> >
> If I understand correctly, wd0[a-h] will be the same as wd0s3[a-h] in a
> situation where DOS is on first slice, Linux on second and FreeBSD on
> third, right? But what if the fourth slice is also FreeBSD? In such a
> case, I'll assume you meant "booted slice" instead of "first slice", where
> the slice selected when booting will be referred to by the OS as wd0[a-h]
> which would translate to "current slice". Confirmation of my assumption
> would be appreciated.

As far as I remember, it is not the booted slice but the first one.
Maybe it is version dependent?

> > > 2. If wd0s1 is my first slice, why isn't it named wd0s0?
> > wd0s0 == wd0
> > wd0s0a == wd0a
> I somehow doubt that. Considering wd0s* goes from 1 to 4 inclusively, I
> would tend to believe the first slice is wd0s1.

The slice-number field in the minor number is 5 bits wide, so it can
range from 0 to 31. 0 is for wd0s0 == wd0.
And look at /dev/MAKEDEV.

> [ snip ]
> > > Assuming /dev/wd0s2 contains a few blocks, ie /dev/wd0s1
> > > doesn't span to the end of disk:
> > > 4. If I want to use /dev/wd0s2 as a raw slice for reading
> > > and writing, what are the steps to follow?
> > You can't write several blocks near /dev/wd0s2 beginning.
> > Use /dev/wd0 with proper address
> That is rather risky. Wouldn't it be safer to have a device name I could
> dedicate to some purpose. In such a case, I could chown the device to an
> appropriate username and group. Furthermore, I could avoid the unfortunate
> mistake of overwriting my current FreeBSD fs in case I get the addresses
> wrong.

My tests in this area are rather old, maybe up to 3.1.
It depends on whether /dev/wd0s2 has a FreeBSD label or not.
If it is a pure MSDOS slice, it is not write protected; back when I had
some M$ slices, I restored them with dd.
Any slice that is recognized as having a label (not only a FreeBSD
label; FreeBSD does not care whose label it is) has some write-protected
blocks, so you can't restore, for example, a boot loader with a simple
dd. In such a case you need to use, for example,

    dd of=/dev/wd0 seek=(offset of the slice from the beginning of the disk)

> > > 4a. Do I need to format the partition as any type? If so
> > > is there a recommended type (perhaps one which won't
> > > be recognised by the bootloader would be preferable)?
> > It depends on usage. And remember - kernel looks up every
> > slice to find FreeBSD label - even if you mark it 0 (unused)
> How does it depend on usage? Are some formats preferable for some specific
> usage (consider I'll only be using the raw interface to the device)?

By "partition", do you mean an M$ partition (a slice in FBSD) or a
partition within an FBSD slice?

FBSD slices are NOT restricted to holding ufs. But if you use a slice
without ufs, beware of accidentally creating a pattern that FBSD takes
for a label. Maybe just don't use the beginning of the slice.

If you mean an FBSD partition, the first of them has write-protected
blocks. I have not tested the others. Same advice: just don't use the
beginning of the partition. IMHO 8K, but I am not sure about this.

--
@BABOLO http://links.ru/
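
The offset of a slice from the beginning of the disk, as needed for the
dd seek= argument mentioned above, can be read from the partition table
in sector 0. A minimal sketch, assuming a classic PC MBR (four 16-byte
entries starting at byte 446, the start sector stored little-endian in
bytes 8-11 of each entry, signature 0x55 0xAA in bytes 510-511) and
512-byte sectors. The function name slice_start_sector is hypothetical,
not something from this thread:

```python
import struct

SECTOR_SIZE = 512    # assumed sector size
TABLE_OFFSET = 446   # byte offset of the first MBR partition entry
ENTRY_SIZE = 16      # size of one partition entry


def slice_start_sector(mbr: bytes, slice_no: int) -> int:
    """Return the starting sector (LBA) of slice 1..4 from a 512-byte MBR."""
    if len(mbr) != SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    if not 1 <= slice_no <= 4:
        raise ValueError("MBR slices are numbered 1 to 4")
    begin = TABLE_OFFSET + (slice_no - 1) * ENTRY_SIZE
    entry = mbr[begin:begin + ENTRY_SIZE]
    # Entry layout: boot flag, CHS start (3 bytes), type, CHS end (3 bytes),
    # start sector (LBA), sector count.
    _flag, _chs_start, ptype, _chs_end, lba_start, nsectors = struct.unpack(
        "<B3sB3sII", entry)
    if ptype == 0 or nsectors == 0:
        raise ValueError("slice %d is unused" % slice_no)
    return lba_start
```

Since dd counts seek= in output blocks and its default block size is 512
bytes, the returned value could be passed to seek= directly as long as
no other bs= is given. Reading sector 0 itself (for instance from the
raw whole-disk device) is left out of the sketch.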