Date:      Tue, 25 Jun 1996 12:31:11 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        lyndon@orthanc.com (Lyndon Nerenberg VE7TCP)
Cc:        freebsd-current@FreeBSD.ORG
Subject:   Re: Building inside of /usr/src?
Message-ID:  <199606251931.MAA00496@phaeton.artisoft.com>
In-Reply-To: <199606250203.TAA06487@multivac.orthanc.com> from "Lyndon Nerenberg VE7TCP" at Jun 24, 96 07:03:42 pm

> > I'm not sure that the safety trade-off here makes it anything I'd ever
> > want to deliberately encourage anyone to do.  I mean, if you still
> > *really* wanted to do it then you could just move the relevant obj
> > directory aside (and I'm sure a studly guy like Nate could make a
> > shell function out of it so that he never had to know the
> > corresponding obj dir location).
> 
> This is really hollering for an implementation of stackable mounts.
> (Or translucent mounts, or whatever the current buzzword for it is.)
> Stacked mounts are one of the joys of Plan 9, and we could do well
> by following their lead. (Wasn't there broken FS code in 4.4 that
> tried to implement something like this? It's been a while since I
> looked at a raw 4.4 source tree ...)
> 
> I'm sure Terry can comment on this :-)

This is the intrinsic "union" option.

It does not work.

It does not work because VOP_ADVLOCK does not veto.
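One reading of "veto" here, sketched in user-space C: before an advisory
lock is granted on the top of a stack, every layer underneath gets a chance
to refuse it, and the first refusal wins.  The names toy_layer,
toy_layer_advlock, toy_stack_advlock and the TOY_LOCK_* values are all
hypothetical; this is not the kernel's VOP_ADVLOCK interface, only an
illustration of the idea under that assumption.

```c
#include <stddef.h>

#define TOY_LOCK_OK     0
#define TOY_LOCK_DENIED 1   /* hypothetical "veto" return value */

/* Hypothetical stand-in for one layer in a stack of file systems. */
struct toy_layer {
    int (*advlock)(const struct toy_layer *self, int want_exclusive);
    struct toy_layer *lower;   /* next layer down, or NULL at the bottom */
    int held_exclusive;        /* non-zero if this layer already has an
                                  exclusive lock recorded */
    int held_shared;           /* non-zero if shared locks are recorded */
};

/* A layer's answer: veto if the request conflicts with what it holds. */
int toy_layer_advlock(const struct toy_layer *self, int want_exclusive)
{
    if (self->held_exclusive)
        return TOY_LOCK_DENIED;
    if (want_exclusive && self->held_shared)
        return TOY_LOCK_DENIED;
    return TOY_LOCK_OK;
}

/* Walk the stack top-down; the first layer that objects vetoes the lock. */
int toy_stack_advlock(const struct toy_layer *top, int want_exclusive)
{
    for (const struct toy_layer *l = top; l != NULL; l = l->lower) {
        int error = l->advlock(l, want_exclusive);
        if (error != TOY_LOCK_OK)
            return error;
    }
    return TOY_LOCK_OK;
}
```

With a union layer stacked on a lower layer that already holds an exclusive
lock, toy_stack_advlock() on the union layer is refused even though the
union layer itself has nothing recorded.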

It does not work because VOP_LOCK cannot be stacked because it is
stupidly referencing flags specific to the underlying vnode for lock
resolution instead of the union vnode.

It does not work because VOP_LOOKUP, VOP_RENAME, etc. cannot
be stacked because they actually deallocate path structures that
were allocated by code in vfs_syscalls.c, instead of the buffers
being deallocated in vfs_syscalls.c as well, as you would expect
in a proper idempotent layering implementation.
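The ownership rule being argued for can be sketched in user-space C:
whoever allocates the path buffer also frees it, and the per-layer
operations only borrow it.  Everything named here (toy_path_buf,
toy_lower_lookup, toy_union_lookup, toy_do_lookup) is hypothetical and
stands in for the real structures; it is a sketch of the layering
discipline, not of the actual vfs_syscalls.c code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the path state a system call sets up. */
struct toy_path_buf {
    char *name;   /* owned by the caller that allocated it */
};

/* A layer operation may read the buffer but must never free it. */
typedef int (*toy_lookup_fn)(const struct toy_path_buf *pb);

/* Bottom layer: 0 means "found", -1 means "not found" (toy rule). */
static int toy_lower_lookup(const struct toy_path_buf *pb)
{
    return pb->name[0] == '/' ? 0 : -1;
}

/* Stacking layer: since it owns nothing, it can pass the same buffer
 * down as many times as it needs to without it vanishing underneath. */
int toy_union_lookup(const struct toy_path_buf *pb)
{
    int error = toy_lower_lookup(pb);      /* first layer */
    if (error != 0)
        error = toy_lower_lookup(pb);      /* second try; buffer still valid */
    return error;
}

/* System call level: allocation and release live in the same function. */
int toy_do_lookup(const char *path, toy_lookup_fn op)
{
    struct toy_path_buf pb;
    size_t len = strlen(path) + 1;
    int error;

    pb.name = malloc(len);
    if (pb.name == NULL)
        return -1;
    memcpy(pb.name, path, len);

    error = op(&pb);      /* layers borrow the buffer */

    free(pb.name);        /* freed where it was allocated */
    return error;
}
```

If toy_lower_lookup() freed pb->name itself, the second call in
toy_union_lookup() would touch freed memory, which is the stacking
problem in miniature.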

VOP_LOCK stupidly references these flags because vclean needs them.

vclean is an abomination before God, and is a half-kludge to deal
with not having both vnode/offset and dev/offset based cache
references simultaneously.

Use of vnode/offset cache entries is a result of the unified cache
implementation.  It saves a bmap call when moving data to/from
user space.  It's why FreeBSD has faster I/O than most other systems.
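As a toy illustration of the saved translation: a cache keyed by
vnode/offset can be searched with what read() and write() already have
in hand, while a cache keyed by dev/offset needs a logical-to-physical
mapping first.  The arrays, toy_bmap() and its mapping formula are
invented for this sketch and bear no relation to the real cache code.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_NSLOTS 8

/* Hypothetical cache entries; the real cache is far more involved. */
struct toy_ventry { int used; int vnode_id; uint64_t file_off; int data; };
struct toy_dentry { int used; uint64_t dev_off; int data; };

static struct toy_ventry toy_vcache[TOY_NSLOTS];
static struct toy_dentry toy_dcache[TOY_NSLOTS];

int toy_bmap_calls;   /* counts logical-to-physical translations */

/* Hypothetical bmap: maps (vnode, file offset) to a device offset. */
uint64_t toy_bmap(int vnode_id, uint64_t file_off)
{
    toy_bmap_calls++;
    return (uint64_t)vnode_id * 1000000u + file_off;
}

/* Enter the same data into both toy caches. */
void toy_cache_fill(int slot, int vnode_id, uint64_t file_off, int data)
{
    toy_vcache[slot] = (struct toy_ventry){ 1, vnode_id, file_off, data };
    toy_dcache[slot] = (struct toy_dentry){ 1, toy_bmap(vnode_id, file_off), data };
}

/* vnode/offset lookup: no translation needed. Returns 1 on a hit. */
int toy_vlookup(int vnode_id, uint64_t file_off, int *data)
{
    for (int i = 0; i < TOY_NSLOTS; i++) {
        if (toy_vcache[i].used && toy_vcache[i].vnode_id == vnode_id &&
            toy_vcache[i].file_off == file_off) {
            *data = toy_vcache[i].data;
            return 1;
        }
    }
    return 0;
}

/* dev/offset lookup: must call bmap before it can even search. */
int toy_dlookup(int vnode_id, uint64_t file_off, int *data)
{
    uint64_t dev_off = toy_bmap(vnode_id, file_off);
    for (int i = 0; i < TOY_NSLOTS; i++) {
        if (toy_dcache[i].used && toy_dcache[i].dev_off == dev_off) {
            *data = toy_dcache[i].data;
            return 1;
        }
    }
    return 0;
}
```

After one toy_cache_fill(), a hit through toy_vlookup() leaves
toy_bmap_calls unchanged, while the same hit through toy_dlookup()
increments it by one.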

The lack of parallel dev/offset based caching allows us to be lazy,
and enlarges the bit limit on FS storage, though it does not help
the inherent limit on file size (due to mapping).

The lack of a parallel dev/offset cache results in the need for
implementation of a "second chance cache" via ihash.  Still, we
will discard perfectly good pages from cache as a side effect of
having no way to reassociate them with a vnode.

The use of a global vnode pool instead of per FS mount instance vnode
allocations damages cache locality.  Combined with vclean, it also
damages cache coherency.


To repair:

1)	Fix the stackability issues with the VFS interface itself,
	which will incidentally cause the VFS to more closely conform
	to the Heidemann Thesis design on which it is based.  Currently
	it only implements a subset of the specified functionality.

2)	Migrate the vnode locking to the vnode instead of the per FS
	inode; get rid of the second chance cache at the same time
	(the Lite2 code does some of this).  The pointer should have
	been in the vnode, not the inode, from the very beginning.

3)	Move the directory name cache out of the per FS code and
	into the lookup code.

4)	Move the vnodes from the global pool; establish a per-FS
	vnode free routine.

5)	Establish VOP_GETPAGE/VOP_PUTPAGE, etc...

6)	Union mounts will then work without kludges in lookup, locking,
	and other code.  They *could* be made to work with great, gross
	kludges and changes to at least 3 FS's (that I know of), but
	that's a kludge I won't do.
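Item 4 can be sketched roughly as follows: each mount instance keeps its
own free list and supplies its own free routine, so releasing a vnode
never hands it to some other file system.  The toy_mount and toy_vnode
structures and the toy_* functions are hypothetical names made up for
this sketch, not existing kernel interfaces.

```c
#include <stddef.h>

struct toy_mount;

/* Hypothetical vnode: remembers which mount instance it belongs to. */
struct toy_vnode {
    struct toy_mount *mount;
    struct toy_vnode *next_free;
};

/* Hypothetical mount instance with its own pool and free routine. */
struct toy_mount {
    struct toy_vnode *free_list;
    int nfree;
    void (*vnode_free)(struct toy_mount *mp, struct toy_vnode *vp);
};

/* A simple per-FS free routine: push onto this mount's own list. */
void toy_default_vnode_free(struct toy_mount *mp, struct toy_vnode *vp)
{
    vp->next_free = mp->free_list;
    mp->free_list = vp;
    mp->nfree++;
}

/* Releasing a vnode dispatches to the owning mount's routine. */
void toy_vnode_release(struct toy_vnode *vp)
{
    vp->mount->vnode_free(vp->mount, vp);
}

/* Allocation only ever draws from the requesting mount's pool. */
struct toy_vnode *toy_vnode_get(struct toy_mount *mp)
{
    struct toy_vnode *vp = mp->free_list;
    if (vp != NULL) {
        mp->free_list = vp->next_free;
        vp->next_free = NULL;
        mp->nfree--;
    }
    return vp;   /* NULL when this mount's pool is empty */
}
```

A vnode released from one mount reappears only when that same mount asks
for a vnode; a second mount with an empty pool gets NULL rather than
someone else's vnode.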


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


