Date:      Sat, 09 Jun 2001 10:14:49 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Peter Wemm <peter@wemm.org>
Cc:        "Patrick W. Penzias Dirks" <pwd@apple.com>, FreeBSD-FS@FreeBSD.ORG, FreeBSD-Arch@FreeBSD.ORG
Subject:   Re: Support for pivot_root-like system call?
Message-ID:  <3B225989.6E2110@mindspring.com>
References:  <20010608153011.D7AF1380C@overcee.netplex.com.au>

Peter Wemm wrote:
> Terry, the 'cache coherency' bugs have been fixed in -current for ~8
> months now (September 2000).  The infrastructure changes for this are
> subject to a call-for-review right now for a merge to 4.x.

Peter, the 'cache coherency' bugs have only been fixed for
the trivial case where the pages at the top are identical
to the pages at the bottom of a vnode stack.

In the case of a transforming stack, the "final VP" that
should be returned is actually an intermediate VP, and
you need to take a write fault in the putpages in order
to do the correct layer boundary transition.

As a simple case, consider an FS that at the top presents
one page, but at the bottom presents two pages from which
that one page is derived.  This could be a cryptographic
FS, or it could be an FS that converts from ISO-8859-1 to
ISO-10646, etc.  The point is that the page contents will
undergo a transformation.
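
A minimal sketch of the one-page-to-two-pages case, assuming a
4096-byte page and assuming the ISO-10646 side is stored as
two bytes per character (UTF-16BE here).  The function names
are hypothetical and only illustrate the page arithmetic; this
is not kernel code.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def bottom_pages_for(top_page_index):
    """Top page N is derived from bottom pages 2N and 2N+1."""
    return (2 * top_page_index, 2 * top_page_index + 1)


def push_down(top_page):
    """Transform one top page (ISO-8859-1) into two bottom pages."""
    if len(top_page) != PAGE_SIZE:
        raise ValueError("expected exactly one page")
    # Every ISO-8859-1 byte becomes one 2-byte code unit.
    wide = top_page.decode("latin-1").encode("utf-16-be")
    return wide[:PAGE_SIZE], wide[PAGE_SIZE:]


def pull_up(first, second):
    """Rebuild the single top page from its two bottom pages."""
    return (first + second).decode("utf-16-be").encode("latin-1")
```

The two layers disagree about both page count and page
contents, so a page handed back from the bottom cannot simply
be passed through to the top.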

This is the same problem that the historical "nullfs"
workaround tried to solve with explicit cache coherency
code: it resolved getpages/putpages in one layer by calling
the read/write function in the underlying layer.  That was
a kludge.

When FreeBSD moved to a unified VM and buffer cache, it
erroneously removed the "hint points" at which an explicit
coherency call would occur to synchronize the VM and buffer
cache views of an object.  This is precisely the code that
is needed to synchronize a vm_object_t with the backing
vm_object_t after a transformation.
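
As a toy model of why an explicit coherency point matters,
assume each layer keeps its own cached pages and the
transformation between them is a simple byte reversal.  The
classes and the sync_down() hook are hypothetical stand-ins,
not vm_object_t or any FreeBSD interface.

```python
def to_lower(data):
    """Assumed transformation applied on the way down the stack."""
    return data[::-1]


def to_upper(data):
    """Inverse of to_lower(), applied on the way up."""
    return data[::-1]


class Layer:
    """Toy stand-in for one layer's cached pages."""

    def __init__(self):
        self.pages = {}


class TransformingStack:
    def __init__(self):
        self.upper = Layer()
        self.lower = Layer()

    def write_upper(self, index, data):
        # Only the upper layer's view changes here.
        self.upper.pages[index] = data

    def sync_down(self, index):
        # Explicit coherency point: transform and push the page down.
        self.lower.pages[index] = to_lower(self.upper.pages[index])

    def read_lower(self, index):
        return self.lower.pages.get(index)
```

Until sync_down() runs, the lower layer has no idea the upper
page changed; with identical pages the two views could share
one page, but with a transformation they cannot.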

What FreeBSD has now will work for about 1/4 of the proposed
uses of stacking FS layers.  It will _NOT_ work for most
of the interesting uses of a stacking architecture, which
involve MUX'es (e.g. "translucent FS" for "writing" to a
CDROM) or for "proxy FS" for debugging your FS code in
user space, etc. -- I have around 16 examples where the
current code still fails.
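
For the translucent case, a rough sketch of the MUX behaviour,
assuming a read-only lower layer (the CDROM) and a writable
upper layer.  The class name and the whiteout set used to hide
removed lower files are assumptions made for illustration, not
a description of any existing FreeBSD code.

```python
class TranslucentFS:
    """Toy MUX: writable upper layer over a read-only lower layer."""

    def __init__(self, lower):
        self.lower = dict(lower)  # read-only contents, never modified
        self.upper = {}           # all writes land here
        self.whiteouts = set()    # names hidden from the lower layer

    def read(self, name):
        if name in self.whiteouts:
            raise FileNotFoundError(name)
        if name in self.upper:
            return self.upper[name]
        if name in self.lower:
            return self.lower[name]
        raise FileNotFoundError(name)

    def write(self, name, data):
        self.whiteouts.discard(name)
        self.upper[name] = data

    def remove(self, name):
        self.read(name)  # raises FileNotFoundError if not visible
        self.upper.pop(name, None)
        self.whiteouts.add(name)
```

Each operation has to pick which underlying layer to talk to,
which is why a MUX has more than one vnode below it and cannot
just hand back "the" bottom VP.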

Really, you want to define the actual device I/O in terms
of a "disk block FS".

The thing that most people apparently fail to "get" is that
there is a significant difference between a stacking layer
and a local media FS.  A local media FS has to interact
with the VM system (or buffer cache, or whatever).  We need
to couch UFS in terms of talking to a local media FS on top
of which it is stacked.

Unfortunately, it seems that most people are unwilling to
actually email John Heidemann, the architect of the stacking
code in 4.4BSD (and therefore FreeBSD).

Frankly, it really pisses me off that this stuff doesn't
work in FreeBSD.

-- Terry




