Date:      Thu, 7 Jan 1999 15:06:21 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Terry Lambert <tlambert@primenet.com>
Cc:        tlambert@primenet.com, dyson@iquest.net, pfgiffun@bachue.usc.unal.edu.co, freebsd-hackers@FreeBSD.ORG
Subject:   Re: questions/problems with vm_fault() in Stable
Message-ID:  <199901072306.PAA35328@apollo.backplane.com>

:>     on a soft block.  For example, UFS/FFS was never designed to terminate
:>     on memory, much less swap-backed memory.  Then came along MFS and
:>     suddenly there were (and still are) all sorts of problems.
:
:I'd argue that MFS is an inappropriate use of the UFS code, since the
:UFS code doesn't acknowledge the idea of shrinking block-backing
:objects, and barely (with severe fragmentation based degradation)
:recognizes growing block-backing objects.

    I have no idea what you are talking about here.  MFS operates on top of
    its idea of a fixed block device, just like UFS.  A VOP mechanism already
    exists to handle block freeing and, in fact, after the 15th, MFS will start
    to use it.... file fragments and meta-data will still get unnecessary swap
    assignments, but full file blocks will not.  This is a major improvement
    and it took me all of 40 minutes to add the support for it.

    Running MFS on top of a file rather than swap is another story, but running
    MFS with swap-backed memory for backing store works pretty well now and
    will work very well after the 15th.
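
    To make the block-freeing idea concrete, here is a rough sketch of the
    sort of hook I mean; the names and types are made up for illustration
    and are not the actual MFS or VOP interfaces:

/*
 * Illustrative only: when the filesystem frees a full block, notify the
 * swap-backed store so the corresponding swap assignment can be dropped
 * instead of lingering.  Fragments and metadata keep their assignments.
 */
#include <sys/types.h>

#define MFS_BSIZE   8192    /* assumed filesystem block size */

struct mfs_backing {
    /* release 'len' bytes of backing store starting at byte offset 'off' */
    void (*release_range)(struct mfs_backing *bs, off_t off, size_t len);
};

static void
mfs_blkfree_hook(struct mfs_backing *bs, off_t byteoff, long size)
{
    if (size == MFS_BSIZE)      /* only whole blocks are cheap to drop */
        bs->release_range(bs, byteoff, (size_t)size);
}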
    
:So the idea that there are "suddenly" problems as a result of an
:"unexpected but legitimate use" is predicated on the false
:assumption that such use is legitimate.

    'Legitimate use' is anything that more than a handful of people start
    using a non-feature for.  If the non-feature (aka MFS) was not meant to do
    something in a certain way, then it should not have tried to do it.

    Since more than a handful of people are using MFS in ways that it was
    never designed to handle, but tries to handle anyway, the solution has
    got to be "fix MFS and any other brokenness that is preventing it from
    operating the way people want it to operate".

    You *cannot* just take the cop-out and say that since it was not originally
    designed for the use it is being put to now, we should explicitly disallow
    the use and break half the people using it!

:I also believe that the canonically correct way to deal with MFS
:issues (if you *insist* on using an inappropriate FS architecture)
:is by providing a device interface to anonymous memory, and creating
:the FS on that "device", instead.  Clean pages that haven't been
:written at all don't need to be instanced, and you will have the
:same (effectively) solution as the current MFS solution, but
:without the current MFS problems.

    This does not solve any of the current MFS problems, because this
    is *already* how MFS works!
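
    In case that isn't clear, the guts of the idea look roughly like this
    (a user-level sketch with illustrative names only, not the real
    mount_mfs code): the mount process maps a big anonymous, swap-backed
    region and then services block I/O for the "device" by copying in and
    out of it.

#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>

struct mfs_dev {
    char   *base;       /* anonymous, swap-backed region */
    size_t  size;       /* size of the filesystem image  */
};

static int
mfs_dev_create(struct mfs_dev *dev, size_t size)
{
    dev->base = mmap(NULL, size, PROT_READ | PROT_WRITE,
        MAP_ANON | MAP_PRIVATE, -1, 0);
    if (dev->base == MAP_FAILED)
        return (-1);
    dev->size = size;
    return (0);
}

/* The block "strategy" routine is then just a bounded copy. */
static void
mfs_dev_rw(struct mfs_dev *dev, off_t off, void *buf, size_t len, int iswrite)
{
    if (iswrite)
        memcpy(dev->base + off, buf, len);
    else
        memcpy(buf, dev->base + off, len);
}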

:>     There's nothing wrong with block access - it is, after all, what most
:>     VFS layers expect.  But to implement block access only through 
:>     VOP_GETPAGES/PUTPAGES is insane.
:
:OK.  Define another interface with a couple functions or smaller
:footprint, and call the functions something else.  The point is that
:the interaction footprint for UFS/FFS right now is over 150 kernel
:functions.

    I have described it three times already.  Go back and read some of my
    previous email.

:>     multiple VFS layers and to maintain cache coherency between VFS
:>     layers, and in order to get the efficiency that cache coherency gives
:>     you - much less memory waste.
:
:There's *no* memory waste, if you don't instance incoherent copies
:of pages in the first place.

    You are ignoring the point.  I mmap() a file.  I mmap() the block device
    underlying the filesystem the file resides on.  I start accessing pages.
    Poof... no coherency, plus lots of wasted memory.

    You are assuming that VFS devices can be collapsed together such that the
    inner layers are not independently accessible, and thus cannot be
    independently accessed without going through the upper VFS layers.
    This is an extremely restrictive view which fails utterly in a number
    of already existing cases and fails even worse when you try to extend
    the model across a network.
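
    To put it in concrete terms, something like the following (the paths and
    the on-disk offset are placeholders -- finding the file's real block
    offset, e.g. via bmap, is left out) ends up with two separate VM objects
    caching the same disk blocks:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int
main(void)
{
    int   ffd = open("/var/tmp/somefile", O_RDWR);  /* through the filesystem   */
    int   dfd = open("/dev/da0s1a", O_RDONLY);      /* through the block device */
    off_t blkoff = 0x123000;    /* placeholder: file's data offset on disk */

    char *fmap = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, ffd, 0);
    char *dmap = mmap(NULL, 4096, PROT_READ, MAP_SHARED, dfd, blkoff);

    fmap[0] = 'X';              /* store through the file mapping...           */
    printf("%c\n", dmap[0]);    /* ...not seen here: two objects, two copies   */
    return (0);
}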

:The ability to "shortcut multiple VFS layers" is an artifact of
:the non-collapse of stacks.  The UFS/FFS interaction is an example
:of a correct collapse.  If the interfaces weren't skewed, NULLFS

    You must be joking.  UFS/FFS are like twins - there is no way in
    hell the UFS/FFS layering model could be applied to joe random
    VFS layer.

:would be another example, where multiple NULLFS instances collapsed
:to *no* local vnode definitions, and one call boundary.  Instead,
:you are suggesting that we instance vnodes in each NULLFS layer,

    You are assuming that these things are collapsible, but very *few*
    VFS layers are actually collapsible.  For example, there is no way
    you could possibly collapse a RAID or encryption layer or a mirroring
    mid-layer.  You can't collapse an MFS layer that is file-backed.
    You can't collapse a mirror.  You can't collapse a VN device due
    to partition translations.  In all cases the intermediate layers
    can be independently accessed and, in fact, it is *desirable* to
    have the ability to independently access them.

:and that we complicate this by associating VM object aliases with
:each layer instance to deal with the coherency issues that come
:from adding VM object aliases in the first place, and *then* we
:"shortcut" page references (and *only* page references, as pigs which
:are more equal than other references) by referencing through the
:alias.
:
:This is rather an insane amount of useless complexity to get around
:the coherency problems which wouldn't exist had you not introduced
:vnodes in the null stacking layer case as placeholders for your
:coherency mechanism, don't you think?

    Introducing vnodes to the null stacking layer does not change the
    coherency problems associated with the current VFS layering one
    iota.  You are, again, assuming that the coherency issue will be
    magically solved by collapsing VFS layers, and ignoring the fact
    that (A) most VFS layers can't be collapsed, and (B) your coherency
    solution fails utterly the moment you take a network hop.

:>     The GETPAGES/PUTPAGES model *cannot* maintain cache coherency across
:>     VFS layers.  It doesn't work.  It has never worked.  That's the fraggin
:>     problem!
:
:Works on SunOS.  Works on Solaris.  If you have a source license,
:or sign non-disclosure, John Heidemann will show you the code.

    Explain to me how it works rather than point me at three hours' worth of
    research that I have to 'understand' to understand your point.

:> 
:>     Uh, I think you missed the point.  What you are basically saying is:
:>     "I don't want cache coherency".... because that is what you get.  That
:>     is, in fact, what we have now, and it means that things like MFS
:>     waste a whole lot of memory double-caching pages and that it is not
:>     possible to span VFS layers across a network in any meaningful way.
:
:No.  What I'm saying is that I don't want to allow things that
:result in coherency tracking problems in the first place.

    And I'm saying that that is a pipe dream.  There are already a number
    of situations where coherency tracking is desirable.  Extending the
    model across a network tops the list.  Being able to use a coherent
    mmap() on a common NFS-served partition from N different machines,
    for example.  Filesystem-parallel support processes such as defraggers.
    Filesystems mounted over shared remote block devices.
    Machine clusters -- and I mean *real* clusters, not the Linux clustering
    junk -- do not work at all without cross-network cache coherency.

    You can't just dispose of the issue by saying that you will magically
    arrange the layering such that there is no cache coherency problem.
    That doesn't solve any of the problems that need solving.
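
    The NFS mmap() case is easy to state as code, too.  Run something like
    this (the path is a placeholder) on two machines against the same
    NFS-mounted file: without cross-network coherency each client sits on
    its own cached pages, so the reader may never see the writer's store.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    int fd = open("/net/shared/counter", O_RDWR);   /* placeholder NFS path */
    volatile char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
        MAP_SHARED, fd, 0);

    if (argc > 1) {             /* writer role, machine A */
        p[0]++;
        msync((void *)p, 4096, MS_SYNC);
    } else {                    /* reader role, machine B */
        for (;;) {              /* may never observe A's increment */
            printf("%d\n", p[0]);
            sleep(1);
        }
    }
    return (0);
}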

:It's like the Aluminum plaques in an Alzheimer's sufferer: there's
:no proven connection between Aluminum consumption and these plaques,
:but it's unlikely that the human body is capable of transmuting
:Potassium into Aluminum, through some magical process, in the
:absence of dietary Aluminum that could be used by the body in
:their construction.
:
:If you don't introduce the building blocks for the problem, then the
:problem can't be built.

    Precisely.  Now consider the massive restrictions you have described
    in order to 'solve' the coherency problem.  Try to solve just one case
    with your methodology - try to solve the case of doing a coherent mmap()
    on 5 different machines on the same NFS-based file, something that does
    not work at all right now.


:>     This doesn't work if the VFS layering traverses a network.  Furthermore,
:>     it is *extremely* inefficient.
:
:So, traversing a network.  The issue of latency is not going to go away
:if you make something that's 1/1,000th of the latency into 1/10,000th
:of the latency.  You're optimizing the wrong thing if network transport
:is your concern.

    You are missing the point!!!!!  You are *doubly* missing the point!

    FIRST, the issue when you traverse the network is NOT latency, it's
    CACHE COHERENCY.  When you have 50 machines talking to a single file 
    server.... situations like that.

    SECOND, the whole point of having the cache coherency mechanism in the
    first place is to AVOID having to go over the network for every operation.

    The whole idea is that if you have cache coherency, you can actually
    realistically *cache* things on the local machine and use them WITHOUT
    going over the network every time you need to do something!!
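
    Put another way (purely a sketch of the concept, not any interface we
    actually have): with a lease- or callback-style coherency mechanism the
    common-case read never leaves the machine, and the network is only
    touched when the server actually yanks the lease.

#include <stdbool.h>
#include <sys/types.h>

struct cached_page {
    bool lease_valid;   /* cleared when the server sends a callback */
    char data[4096];
};

/* hypothetical transport call: fetch the page and (re)acquire a lease */
extern void fetch_from_server(struct cached_page *pg, off_t off);

static const char *
read_page(struct cached_page *pg, off_t off)
{
    if (!pg->lease_valid)       /* go remote only when the lease is gone */
        fetch_from_server(pg, off);
    return (pg->data);          /* common case: purely local access */
}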

:>     Hell, it doesn't even maintain cache
:>     coherency even if you DO use VOP_GETPAGES/PUTPAGES.
:
:Not true.  Add the address of the memory to be filled as an argument
:(it's there already), and the data from the remote end can be
:marshalled via the argument descriptor.

    Huh?  Where's the cache?  What?  No cache?  Are you nuts?  Or are you
    assuming the client side caches the object locally?  But then, where's
    the cache coherency?  Right out the window.

:>     Now, Terry, if you are arguing that we don't need cache coherency, then
:>     ok ... but if you are arguing that we should have cache coherency, you
:>     need to reexamine the plate.
:
:If there's only one object, there's not a coherency problem.  If you
:make more than one object, then you need explicit instead of implicit
:coherency.  If everything you need is in the descriptor, however,
:then it's marshallable over *any* interface.  Network, or user/kernel
:proxy for a user-space development environment.

    There ISN'T only one object.  I will repeat that a thousand times.
    Your entire model assumes that there is only one object or that all the
    VFS layers can be collapsed.  Neither assumption works.

:					Terry Lambert
:					terry@lambert.org

    Matthew Dillon  Engineering, HiWay Technologies, Inc. & BEST Internet 
                    Communications & God knows what else.
    <dillon@backplane.com> (Please include original email in any response)    
