From owner-freebsd-hackers  Fri Jan  8 11:29:11 1999
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <199901081927.MAA22492@usr09.primenet.com>
Subject: Re: questions/problems with vm_fault() in Stable
To: dillon@apollo.backplane.com (Matthew Dillon)
Date: Fri, 8 Jan 1999 19:27:57 +0000 (GMT)
Cc: tlambert@primenet.com, pfgiffun@bachue.usc.unal.edu.co,
        freebsd-hackers@FreeBSD.ORG
In-Reply-To: <199901080347.TAA37034@apollo.backplane.com> from "Matthew Dillon" at Jan 7, 99 07:47:28 pm

>     The answer is that you need a general purpose cache coherency protocol
>     that you can count on to propagate throughout the VFS layering.  *NOT*
>     an ad-hoc implementation in one VFS layer and another ad-hoc implementation
>     in another.  What the VFS layer does on its backend -- if implementing
>     a network protocol such as AFS, NFS, CODA, or something like that, is
>     up to it.  Those protocols may have their own cache coherency protocols
>     for their network interfaces, but it has very little to do with the
>     cache coherency protocol that needs to be implemented between VFS 
>     layers.

This is what "Choices" from the University of Kentucky does.  It
supports object inheritance between FS implementations.  In one
demonstration in 1994, I saw quotas implemented in about 8 minutes.

I think that object modelling is going too far.  Basically, "If all
you have are ``design patterns'', then everything looks like an
object".  This seems to be an illness in modern CS education, one
that came in with VMS and the idea that memory and CPU cycles are
free.

I'm willing to be convinced the other way, but if the underlying goal
is generalized object inheritance, then there's no reason to reinvent
it, since examples abound in FS research.

I really think that maybe it's time to contact John directly about
his design.  Have you read the thesis?  It's at:

	ftp://ftp.cs.ucla.edu/pub/ficus/heidemann_thesis.ps.gz

This is basically the design document for the VFS stacking
architecture.

It'd be nice to identify the design goals you have in addition to
the design goals in the architecture document, and for everyone to
have an understanding of how the current implementation falls
short of the architecture document.  FWIW, I'm not happy with the
code in FreeBSD either, but I'm unhappy with it from the perspective
that it falls far short of the intended design for no good reason,
not dissatisfaction with the design itself.  I think that's a
rather important distinction.


> :(2)	MNFS manages distributed cache coherency across a network
> :	within the context of the existing framework.
> 
>     MNFS is an externalized protocol, just as CODA and standard NFS 
>     are.   What they implement in their network layer is very different
>     from what they have to deal with in their VFS layer.
> 
>     For example, let's say you export a UFS filesystem via MNFS.  Ok, fine...
>     now let's say you import an MNFS filesystem and then re-export it to
>     another machine, and that machine imports it and then re-exports
>     it to yet another machine.  Will cache coherency be maintained across 
>     the chain with MNFS?  Without a general cache-coherency protocol for
>     inter-layer VFS it can't unless MNFS short-circuits the protocol.
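
The re-export chain described above can be sketched as a toy model.
This is only an illustration under stated assumptions: the Layer class,
its fields and the upward "invalidate" call are all hypothetical, not
any real VFS, NFS or MNFS interface.  Each layer keeps a private cache;
only when invalidations are propagated up the stack does the top of the
chain see a change made at the bottom.

```python
class Layer:
    """One hypothetical caching layer stacked on a lower layer."""

    def __init__(self, lower=None, coherent=False):
        self.lower = lower        # next layer down; None for the bottom
        self.upper = None         # next layer up, set when something stacks on us
        self.coherent = coherent  # propagate invalidations upward?
        self.store = {}           # backing data, used by the bottom layer only
        self.cache = {}           # this layer's private cache
        if lower is not None:
            lower.upper = self

    def read(self, key):
        if self.lower is None:
            return self.store.get(key)
        if key not in self.cache:
            self.cache[key] = self.lower.read(key)
        return self.cache[key]

    def _write_down(self, key, value):
        # Write-through toward the bottom of the stack.
        if self.lower is None:
            self.store[key] = value
        else:
            self.cache[key] = value
            self.lower._write_down(key, value)

    def write(self, key, value):
        self._write_down(key, value)
        # The inter-layer protocol: tell everything stacked above us.
        if self.coherent and self.upper is not None:
            self.upper.invalidate(key)

    def invalidate(self, key):
        self.cache.pop(key, None)
        if self.upper is not None:
            self.upper.invalidate(key)


def read_after_change(coherent):
    """Export -> import/re-export -> import, then change the bottom."""
    bottom = Layer(coherent=coherent)           # the exported local FS
    middle = Layer(bottom, coherent=coherent)   # imports and re-exports
    top = Layer(middle, coherent=coherent)      # final client
    bottom.write("f", "v1")
    top.read("f")                               # warms both upper caches
    bottom.write("f", "v2")                     # local change at the source
    return top.read("f")
```

Without the upward invalidation the final client keeps returning the
cached value; with it, the next read goes back down the chain.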

This is a long-standing argument.

Basically, this is the argument between hosted FS's, or more generally,
hosted OS's, vs. native applications competing with the hosting software.

This is the Linux argument about user space NFS services or the SAMBA
argument about user space CIFS services or the NetWare argument about
user space NCP services.

It's also the argument about whether I should be able to reexport
something that I import, or whether the final client should be "forced"
to do its import from the original source, and avoid the coherency
problem entirely.

The general NFS answer (except on Linux) is to force the client to
go to the original source.

The general NetWare answer is to allow only local FS's to be exported.

The general SAMBA answer is to allow the export, with the caveat
that the coherency battle has already been lost because the OS's
hosting the server don't export an opportunity locking mechanism (on
4.4BSD, this could be implemented by externalizing the VOP_LEASE
mechanism to user space), nor do most host environments support
mandatory locking.


I think the biggest problem here is mandatory locking, and that's not
addressed.


>     Fine, so now mix protocols... you have a combination of MNFS, AFS, and
>     CODA mounts done in a chain.  Let's say each one of these has cache
>     coherency.  Will the coherency be maintained across the chain?  Nope,
>     it won't.  Not unless the VFS layer implements a general cache 
>     coherency protocol that these filesystems use.

My answer would be "don't do this".  You already have a problem in
the SAMBA and AFS server cases that's unresolvable: how do I ensure
that the semantics of my server (e.g., mandatory file range locks)
are observed by other programs running in the hosting environment?

The answer is "I can't".

There are some things you just can't do if you allow user space
programs access to the network at all, and assert all of the
preconditions that have been asserted.


If I provide a cooperating layer in the kernel for mandatory lock
enforcement of file region access (for example), then in order for
it to be effective, I *can't* (not "shouldn't") expose the layers
stacked underneath it, and expect to be able to interoperate my
client machine's applications with local applications in the host
environment that aren't the server itself.
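
A minimal sketch of why the lower layers can't be exposed, assuming a
hypothetical lock-enforcing layer stacked on a plain file object (none
of these names are real kernel interfaces).  The layer refuses writes
that overlap another owner's locked range, but a local program that
reaches the lower object directly is never checked.

```python
class LockDenied(Exception):
    """Raised when a write overlaps a range locked by another owner."""


class LowerFile:
    """Stand-in for the layer underneath; it knows nothing about locks."""

    def __init__(self, size):
        self.data = bytearray(size)

    def write(self, offset, payload):
        self.data[offset:offset + len(payload)] = payload


class MandatoryLockLayer:
    """Hypothetical stacked layer enforcing byte-range locks on writes."""

    def __init__(self, lower):
        self.lower = lower
        self.locks = []  # (owner, start, end), end exclusive

    def _conflicts(self, owner, start, end):
        return any(other != owner and start < e and s < end
                   for other, s, e in self.locks)

    def lock(self, owner, start, length):
        if self._conflicts(owner, start, start + length):
            raise LockDenied("range already locked by another owner")
        self.locks.append((owner, start, start + length))

    def write(self, owner, offset, payload):
        if self._conflicts(owner, offset, offset + len(payload)):
            raise LockDenied("write overlaps a locked range")
        self.lower.write(offset, payload)


def demo_bypass():
    lower = LowerFile(16)
    layer = MandatoryLockLayer(lower)
    layer.lock("client_a", 0, 8)
    try:
        layer.write("client_b", 4, b"XX")   # goes through the layer
        denied = False
    except LockDenied:
        denied = True
    lower.write(4, b"XX")                   # local program, layer bypassed
    return denied, bytes(lower.data[4:6])
```

The write through the layer is refused, while the direct write to the
lower object lands inside the range that was supposedly locked.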


>     The real power of building a cache coherency protocol into the VFS
>     layering is that it allows you to do things you may not have thought of 
>     yet.

I agree.

My preference for such a protocol, however, would be to not limit
it to only cached VM information.

One could consider the soft updates mechanism as a cache coherency
protocol that is more general.

When an FS is instanced, you would instantiate a dependency graph
for the VFS layer, thinking of the VFS in terms of events.  The
nodes on the graph are inter-object dependency resolution functions.

You can see the existing soft updates framework fitting in as a small
subset of this mechanism.  But this mechanism would be able to resolve
dependencies between stacking layers as well, by specifying inter-VOP
dependency relationships, and conflict resolver code for that.

Consider: Say I want to implement a stacking layer that supports
transactioning.  How do I do this without engaging in a two stage
object/index commit protocol, using VOP's, which destroys the
actual utility of having Soft Updates in the underlying FS on
which the transactioning layer is stacked?

The only way I can do this reasonably (since otherwise transaction
updates must be ordered, and the only ordering that operates between
VFS layers is synchronous VOP completion) is to implement a hook
mechanism whereby the dependency graph can be dynamically extended,
and conflicts between adjacent edges (each describing an ordering
dependency) have conflict resolution handler code provided.

*This* is a general cache coherency mechanism, without tying the idea
of a cached object arbitrarily to merely "VM".
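
A rough sketch of the hook mechanism described above, with everything
in it an assumption made for illustration: the OrderingGraph class, the
operation names, and the idea that a "conflict" is a new edge closing a
cycle are mine, not the soft updates code or any existing VFS
interface.  A stacked layer extends the graph at run time by adding
ordering edges; when a new edge conflicts with existing ones, the
handler supplied with it chooses which existing dependency to give up.

```python
from graphlib import TopologicalSorter  # Python 3.9 or later


class OrderingGraph:
    """Edges (before, after) mean 'before must reach disk first'."""

    def __init__(self):
        self.after = {}  # node -> set of nodes that must commit after it

    def _path(self, src, dst):
        """Return a list of nodes from src to dst, or None if unreachable."""
        stack, seen = [(src, [src])], set()
        while stack:
            node, path = stack.pop()
            if node == dst:
                return path
            if node in seen:
                continue
            seen.add(node)
            for nxt in self.after.get(node, ()):
                stack.append((nxt, path + [nxt]))
        return None

    def add_edge(self, before, after, resolver=None):
        """Extend the graph; on a cycle, ask resolver which edge to drop."""
        if before == after:
            raise ValueError("an operation cannot depend on itself")
        while (path := self._path(after, before)) is not None:
            if resolver is None:
                raise ValueError("ordering conflict and no resolver given")
            drop = resolver(path)  # must return an edge lying on the path
            if drop not in zip(path, path[1:]):
                raise ValueError("resolver did not break the conflict")
            self.after[drop[0]].discard(drop[1])
        self.after.setdefault(before, set()).add(after)
        self.after.setdefault(after, set())

    def commit_order(self):
        """One ordering that satisfies every remaining dependency."""
        preds = {node: set() for node in self.after}
        for before, afters in self.after.items():
            for node in afters:
                preds[node].add(before)
        return list(TopologicalSorter(preds).static_order())


graph = OrderingGraph()
graph.add_edge("inode_init", "dir_entry")   # underlying FS dependency
graph.add_edge("txn_begin", "inode_init")   # added by a stacked layer
graph.add_edge("dir_entry", "txn_commit")   # added by a stacked layer
```

The point of the design is that the stacked layer never waits on
synchronous VOP completion: it only registers orderings, and the
writes can still be scheduled lazily as long as the order holds.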

The only problem with this is... any group of 2 to 4 people looking
for a PhD?


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
