Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 27 Dec 1999 19:47:20 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        David Greenman <dg@root.com>
Cc:        freebsd-current@FreeBSD.ORG
Subject:   Re: Proposed patch to fix VN device (again) 
Message-ID:  <199912280347.TAA36239@apollo.backplane.com>
References:   <199912280258.SAA12518@implode.root.com>

next in thread | previous in thread | raw e-mail | index | archive | help

:
:   I've heard from both of you that you think the other is wrong. This isn't
:very helpful, however, in finding the correct solution. What I'd like to hear
:from both of you is the reasons why swap is better as a device, or not. There
:seems to be some unstated architectural philosophy that needs to be stated
:before any informed decision can be made about what is the right direction to
:go in.
:
:-DG
:
:David Greenman
:Co-founder/Principal Architect, The FreeBSD Project - http://www.freebsd.org

    Ok, I'll buy that.  Though I will note that VN has been broken for over
    a month - I found that out about half a month ago and posted the fact,
    and if Poul is so interested in fixing it he could easily have done so
    then.  I'm the one who has spent all the time tracking it down, patching,
    and testing it, and frankly that should count for a hellofalot more then
    nil.  Poul frankly does not have a leg to stand on to demand that I work
    on a 'better' solution when I didn't cause the problem in the first place,
    nor does he have any right to pollute the VM code to end-run around the
    issue.  It would be tragic if Poul's behavior were allowed to conclude
    the matter.

    That said, here is my position, split into a 'history' and my 'reasoning':

    Previously we had /dev/drum, which didn't work, and the
    SWAP device was layered around a device and swap I/O was vectored through
    the device subsystem.  

    One month ago, after discussions, Peter Wemm removed the userland frontend
    for the swap device, /dev/drum, and at the same time thought it would be
    a good idea to collapse the VOP_STRATEGY routines (that ran through the
    SWAP dev_t) into direct calls to swstrategy().  At the time I was
    ambivalent but since the change seemed to clear out a bunch of cruft I
    went with it.

    However, this broke the VN device.  I found this out about half a month
    later (around half a month ago) but at the time was hip deep in NFS work
    for the 3.4 release and could not address the issue.  Now that the NFS
    work is finished I revisited the now broken VN device to determine what
    went wrong.  It turns out that the VN device indirectly runs through the
    swap device using several generic filesystem buffer related subsystems.
    When the swap dev_t was removed, the result was a panic.

    -

    The VN device uses vm_pager_strategy() to implement swap backing store.
    vm_pager_strategy() is a generic mechanism that I introduced into the VM
    system a number of months ago for two reasons:

	(1) As a sounding board for giving the vm_pager_*() functions a
	    strategy entry point, to see how well it would work.

	(2) To implement the swap backing store feature for the VN device.

    vm_pager_strategy() in turn looks at the pager ID and does a table lookup
    call through to the swap_pager_strategy() function.  This function takes
    a VM object and a *generic* bp.  While the function itself is
    swap-specific, the design of the code is supposed to be as generic as
    possible.

    vm_pager_strategy() may need to cut up the bp into subrequests to deal
    with clustering issues.  It uses the generic 'chainbuf' routines to
    manage the asynchronous I/O.

    The 'chainbuf' routines are designed to be as generic as possible.. they
    take BP's.  They do *NOT* and are not intended to be special cased in
    any way.  This is my design for these routines, and even though they
    are currently used only by the swap_pager_strategy() code I believe the
    generic nature of the routines must be maintained.

    I/O sequencing for the swap-backed VN device is thus:  
    VN->vm_pager_strategy->chainbuf->STRATEGY.

    I advocate that the best solution is to simply undo part of the original
    commit and restore SWAP's dev_t which was removed a month ago.  Poul's
    solution is considerably more complicated (the patch he offered will not
    work in all cases, and I've already corrected all his other misstatements
    that I do not really feel it appropriate to continue to correct his
    misstatements when all he does with those corrections is use them as
    a poolboard to make his next set of incorrect statements!).

    I will tell you why Poul's patch will not work, but I am NOT going to
    tell Poul because, damn it, he has no right to spend 5 minutes working
    up a shoddy patch and then try to dictate to me that it should be used
    over my well tested 8-hours-of-work patch!

    Poul's patch will not work because flushchainbuf() is not the only chain
    routine that flushes the buffers.  getchainbuf() and waitchainbuf() may 
    *also* flush the buffers if there are already too many chain buffers
    built up in the linked list.  Poul only addressed flushchainbuf().  
    He obviously didn't bother to read the code carefully and he just
    as obviously doesn't understand it.  And not understanding it I can't
    imagine how he can possibly believes he understasnds the concepts
    behind it.

    Poul's patch also pollutes the chainbuf code which is supposed to be
    *GENERIC*.

    -

    Poul has demonstrated over and over again (if you read the thread) that
    he understands not a single issue regarding this whole chain of events.
    He advocates polluting my carefully designed and implemented VM code,
    both the vm_pager_strategy() design and the chainbuf design, in order
    to avoid creating a dev_t for the swap device.

    I strongly believe that the VN/PAGER/CHAINBUFIO code sequence must be 
    maintained without such pollution.  Since swap really is a device - it has
    a fully functional strategy entry point and has logical separation from 
    the VM swap subsystem by taking on the responsibility for managing the 
    block devices mounted as swap - I see no reason why we can't implement 
    a simple dev_t for it.  Giving it a dev_t allows us to use all the 
    standard filesystem buffer cache I/O routines with it, and we've 
    already demonstrated that it is useful to do so in the VN swap-backing
    device.  There could easily be other situations that come up in the
    future that may also take advantage of this.

    I will also point out that only a month ago the SWAP device *WAS* a 
    dev_t.  It was removed.  That removal was a mistake because nobody
    (including me) realized that we just blew up VN.  It's as simple
    as that.  Do we back out a small portion of the patch to make VN work
    again, or do we spend hours or days rewriting an unrelated subsystem
    to work around the loss of the dev_t? 

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>








To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199912280347.TAA36239>