Date:      Wed, 10 Jul 1996 11:26:40 +0900 (JST)
From:      Michael Hancock <michaelh@cet.co.jp>
To:        freebsd-fs@FreeBSD.ORG
Cc:        freebsd-current@FreeBSD.ORG
Subject:   Fixing Union_mounts
Message-ID:  <Pine.SV4.3.93.960710105207.28386D-100000@parkplace.cet.co.jp>
In-Reply-To: <199606251931.MAA00496@phaeton.artisoft.com>

[Please trim off current and leave fs when replying]

Terry posted this reply to the "making in /usr/src" thread.  I'd like to
see all this stackable fs stuff made usable.

I have some questions on Terry's remedies items 2) and 4) below:

2) Moving vnode locking to the vnode from the per fs inode will help
fix the stacking problems, but what will it do for future advanced
file systems that need to have special locking requirements?

4) Moving the vnodes from the global pool to a per fs pool to improve
locality of reference.  Won't this make it hard to manage memory?  How
will efficient reclaim operations be implemented?
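
To make the reclaim question concrete, here is a toy user-space sketch
of what a per-mount pool with a per-FS free routine might look like.
This is only an assumption about the shape of such a design, not
anything from the tree: struct mount_pool, pool_release(),
pool_reclaim_idle() and reclaim_from_mounts() are all hypothetical
names, and a real kernel would also have to decide which mount to
squeeze first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy vnode: only the free-list link matters here. */
struct vnode {
    struct vnode *next_free;
};

/* Hypothetical per-mount pool with its own reclaim ("free") routine. */
struct mount_pool {
    struct vnode *free_list;
    size_t free_count;
    size_t (*reclaim)(struct mount_pool *mp, size_t want);
};

/* Put an idle vnode back on its own mount's free list. */
void pool_release(struct mount_pool *mp, struct vnode *vp)
{
    assert(mp != NULL && vp != NULL);
    vp->next_free = mp->free_list;
    mp->free_list = vp;
    mp->free_count++;
}

/* Simplest possible per-FS routine: give up idle vnodes, newest first.
 * Returns how many vnodes were detached from the pool. */
size_t pool_reclaim_idle(struct mount_pool *mp, size_t want)
{
    size_t done = 0;

    while (done < want && mp->free_list != NULL) {
        struct vnode *vp = mp->free_list;

        mp->free_list = vp->next_free;
        vp->next_free = NULL;
        mp->free_count--;
        done++;
    }
    return done;
}

/* System-wide pressure: ask each mount in turn until satisfied. */
size_t reclaim_from_mounts(struct mount_pool **pools, size_t npools,
                           size_t want)
{
    size_t done = 0;

    for (size_t i = 0; i < npools && done < want; i++)
        done += pools[i]->reclaim(pools[i], want - done);
    return done;
}
```

The point of the function pointer is that a file system with special
needs could supply its own routine in place of pool_reclaim_idle().
Whether walking the mounts like this is efficient enough is exactly
what I am asking.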

This stacked fs stuff is really cool.  You can implement a simple undelete
in the Union layer by making whiteout entries (see the 4.4 daemon book).
This would only work for the duration of the mount unlike Novell's
persistent transactional stuff, but still very useful.
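
A rough user-space model of the idea, to show why undelete falls out
almost for free: removing a name only records a whiteout in the upper
layer, so undelete just drops the whiteout again. Everything here
(struct union_dir, union_lookup(), union_remove(), union_undelete(),
the fixed-size table) is made up for illustration and is not the
kernel's union fs code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_WHITEOUTS 16
#define NAME_LEN      32

/* Hypothetical toy directory: a read-only lower layer plus the
 * whiteouts that the union layer keeps on top of it. */
struct union_dir {
    const char *const *lower;               /* names in the lower layer */
    size_t lower_count;
    char whiteout[MAX_WHITEOUTS][NAME_LEN]; /* names masked from view */
    size_t whiteout_count;
};

/* Index of name's whiteout, or whiteout_count if there is none. */
static size_t find_whiteout(const struct union_dir *d, const char *name)
{
    size_t i;

    for (i = 0; i < d->whiteout_count; i++)
        if (strcmp(d->whiteout[i], name) == 0)
            break;
    return i;
}

static bool in_lower(const struct union_dir *d, const char *name)
{
    for (size_t i = 0; i < d->lower_count; i++)
        if (strcmp(d->lower[i], name) == 0)
            return true;
    return false;
}

/* A name is visible if the lower layer has it and no whiteout hides it. */
bool union_lookup(const struct union_dir *d, const char *name)
{
    return in_lower(d, name) && find_whiteout(d, name) == d->whiteout_count;
}

/* "Delete": the lower layer is untouched, we only add a whiteout. */
int union_remove(struct union_dir *d, const char *name)
{
    assert(d != NULL && name != NULL);
    if (!union_lookup(d, name))
        return -1;
    if (d->whiteout_count == MAX_WHITEOUTS || strlen(name) >= NAME_LEN)
        return -1;
    strcpy(d->whiteout[d->whiteout_count++], name);
    return 0;
}

/* "Undelete": drop the whiteout so the lower entry shows through again. */
int union_undelete(struct union_dir *d, const char *name)
{
    size_t i = find_whiteout(d, name);

    if (i == d->whiteout_count)
        return -1;
    size_t last = --d->whiteout_count;
    if (i != last)
        memcpy(d->whiteout[i], d->whiteout[last], NAME_LEN);
    return 0;
}
```

In this model the whiteouts live only in memory, which matches the
"only for the duration of the mount" limitation.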

There are already crypto-fs implementations out there, but I'd like to see
more; especially non-ITAR-restricted ones that can be used world-wide.

Regards,


Mike Hancock

On Tue, 25 Jun 1996, Terry Lambert wrote:

> This is the intrinsic "union" option.
> 
> It does not work.
> 
> It does not work because VOP_ADVLOCK does not veto.
> 
> It does not work because VOP_LOCK can not be stacked because it is
> stupidly referencing flags specific to the underlying vnode for lock
> resolution instead of the union vnode.
> 
> It does not work because VOP_LOOKUP, VOP_RENAME, etc. can not
> be stacked because they actually deallocate path structures that
> were allocated by code in vfs_syscalls.c, instead of the buffers
> being deallocated in vfs_syscalls.c as well, as you would expect
> in a proper idempotent layering implementation.
> 
> VOP_LOCK stupidly references these flags because vclean needs them.
> 
> vclean is an abomination before God, and is a half-kludge to deal
> with not having both vnode/offset and dev/offset based cache
> references simultaneously.
> 
> Use of vnode/offset cache entries is a result of the unified cache
> implementation.  It saves a bmap call when moving data to/from
> user space.  It's why FreeBSD has faster I/O than most other systems.
> 
> The lack of a parallel dev/offset based caching allows us to be lazy,
> and enlarges the bit limit on FS storage, though it does not help
> the inherent limit on file size (due to mapping).
> 
> The lack of a parallel dev/offset results in the need for
> implementation of a "second chance cache" via ihash.  Still, we
> will discard perfectly good pages from cache as a side effect of
> having no way to reassociate them with a vnode.
> 
> The use of a global vnode pool instead of per FS mount instance vnode
> allocations damages cache locality.  Combined with vclean, it also
> damages cache coherency.
> 
> 
> To repair:
> 
> 1)	Fix the stackability issues with the VFS interface itself,
> 	which will incidentally cause the VFS to more closely conform
> 	to the Heidemann Thesis design on which it is based.  Currently
> 	it only implements a subset of the specified functionality.
> 
> 2)	Migrate the vnode locking to the vnode instead of the per FS
> 	inode; get rid of the second chance cache at the same time
> 	(the Lite2 code does some of this).  The pointer should have
> 	been in the vnode, not the inode, from the very beginning.
> 
> 3)	Move the directory name cache out of the per FS code and
> 	into the lookup code.
> 
> 4)	Move the vnodes from the global pool; establish a per-FS
> 	vnode free routine.
> 
> 5)	Establish VOP_GETPAGE/VOP_PUTPAGE, etc...
> 
> 6)	Union mounts will then work without kludges in lookup, locking,
> 	and other code.  They *could* be made to work with great, gross
> 	kludges and changes to at least 3 FS's (that I know of), but
> 	that's a kludge I won't do.
> 
> 
> 					Terry Lambert
> 					terry@lambert.org
> ---
> Any opinions in this posting are my own and not those of my present
> or previous employers.
> 
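
For anyone following along, Terry's point about the path buffers can
be modelled in a few lines of user-space C. This is my own sketch, not
kernel code: struct layer, leaf_lookup(), union_lookup() and
syscall_lookup() are hypothetical stand-ins for the VOP_LOOKUP layers
and the code in vfs_syscalls.c. The rule it shows is that the code
which allocates the path also frees it, and every layer only borrows
it.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one file system layer. */
struct layer {
    const char *const *names;   /* leaf layers: names this layer holds */
    size_t count;
    const struct layer *upper;  /* union layers: the two layers below */
    const struct layer *lower;
    int (*lookup)(const struct layer *self, const char *path);
};

/* Outstanding path buffers; must return to zero after every call. */
int live_path_buffers;

/* Leaf layer: resolves the name but never frees the borrowed path. */
int leaf_lookup(const struct layer *self, const char *path)
{
    for (size_t i = 0; i < self->count; i++)
        if (strcmp(self->names[i], path) == 0)
            return 0;
    return ENOENT;
}

/* Union layer: needs the same path twice. This is only safe because
 * the upper layer's lookup did not deallocate it. */
int union_lookup(const struct layer *self, const char *path)
{
    int error = self->upper->lookup(self->upper, path);

    if (error == ENOENT)
        error = self->lower->lookup(self->lower, path);
    return error;
}

/* Top level: allocates the path, calls down the stack, and frees the
 * path itself, whatever the layers returned. */
int syscall_lookup(const struct layer *top, const char *user_path)
{
    assert(top != NULL && user_path != NULL);

    size_t len = strlen(user_path) + 1;
    char *path = malloc(len);

    if (path == NULL)
        return ENOMEM;
    memcpy(path, user_path, len);
    live_path_buffers++;

    int error = top->lookup(top, path);

    free(path);
    live_path_buffers--;
    return error;
}
```

If leaf_lookup() freed the path on error, as the current lookup code
effectively does, the second call in union_lookup() would be reading
freed memory.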

--
michaelh@cet.co.jp                                http://www.cet.co.jp
CET Inc., Daiichi Kasuya BLDG 8F 2-5-12, Higashi Shinbashi, Minato-ku,
Tokyo 105 Japan              Tel: +81-3-3437-1761 Fax: +81-3-3437-1766



