Date:      Sat, 26 Dec 1998 17:10:47 -0800
From:      David Greenman <dg@root.com>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        cvs-all@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Subject:   Re: new swap system work 
Message-ID:  <199812270110.RAA03548@implode.root.com>
In-Reply-To: Your message of "Sat, 26 Dec 1998 16:07:29 PST." <199812270007.QAA33903@apollo.backplane.com> 

>    The first part I'm working on now and expect to commit sometime POST 3.0.1.
>    We'll see how long it takes me to get it solid.

   You will not commit anything like this without careful review by at least
myself and perhaps others.

>    Basically, fixing the swap system requires moving the allocation of the
>    swap metadata structures out of the pageout code.  To accomplish this,
>    vm_page_t will get a new field, called 'swapblk'.  All swap-backed 
>    memory-resident pages will have their swap blocks stored in the vm_page_t
>    rather than the swap-metadata structure.   Swap blocks assigned to
>    resident pages do not have to be moved into the object swap metadata
>    structures until the page is actually freed (at which point there is
>    free memory available to allocate the swap metadata structure, hence
>    the ability to operate in a zero-free-page environment).

   This seems to assume that all pages are backed by swap, which is definitely
not the case. On many systems, it is not even 'most'. I could almost swallow
this if it were abstracted to a pager-private struct.
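
   As a rough illustration of what "abstracted to a pager-private struct"
could mean, the sketch below contrasts a page that embeds the swap block
directly with a page that carries one opaque slot owned by its pager. Every
name here (page_embedded, page_abstract, pager_ops, swap_priv) is
hypothetical and is not the FreeBSD vm_page layout; it is only a minimal
sketch under those assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not FreeBSD's struct vm_page. */

typedef int64_t blk_t;
#define BLK_NONE ((blk_t)-1)

/* Layout A: the swap block is a field of every page, swap-backed or not. */
struct page_embedded {
    uint32_t flags;
    blk_t    swapblk;   /* meaningful only for swap-backed pages */
};

/* Layout B: the page holds one opaque slot that only its pager interprets. */
struct pager_ops;
struct page_abstract {
    uint32_t                flags;
    const struct pager_ops *ops;        /* assumption: a real kernel would more
                                           likely reach the pager through the
                                           page's owning object */
    void                   *pager_priv; /* pager-private data, may be NULL */
};

struct pager_ops {
    const char *name;
    blk_t     (*backing_block)(const struct page_abstract *pg);
};

/* Private data of the (hypothetical) swap pager. */
struct swap_priv {
    blk_t swapblk;
};

static blk_t swap_backing_block(const struct page_abstract *pg)
{
    const struct swap_priv *sp = pg->pager_priv;
    return sp != NULL ? sp->swapblk : BLK_NONE;
}

static blk_t vnode_backing_block(const struct page_abstract *pg)
{
    (void)pg;
    return BLK_NONE;    /* vnode-backed pages have no swap block */
}

static const struct pager_ops swap_pager_ops  = { "swap",  swap_backing_block };
static const struct pager_ops vnode_pager_ops = { "vnode", vnode_backing_block };

/* Generic VM code asks the pager; it never touches swap details itself. */
static blk_t page_backing_block(const struct page_abstract *pg)
{
    assert(pg != NULL && pg->ops != NULL);
    return pg->ops->backing_block(pg);
}
```

   In layout B the generic code never names a swap block, so a vnode-backed
page and a swap-backed page look the same to it; the cost is an extra
indirection on each lookup.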

>    The side effects of doing this are all beneficial.

   I don't agree. I can think of at least two negatives: It bloats the vm_page
struct and it makes a mess out of the layering.
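
   To put a rough number on the bloat argument, the helpers below compute
what a per-page field costs across all of memory. The figures used in the
note after the code (128 MB of RAM, 4 KB pages, a 4-byte field, a quarter of
pages swap-backed) are assumptions chosen for illustration, not values taken
from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Number of page structs needed to describe ram_bytes of memory. */
static size_t page_count(size_t ram_bytes, size_t page_size)
{
    assert(page_size > 0);
    return ram_bytes / page_size;
}

/* Total bytes added when every page struct grows by field_size. */
static size_t field_overhead(size_t npages, size_t field_size)
{
    return npages * field_size;
}

/* Portion of that overhead carried by pages that are not swap-backed. */
static size_t field_unused(size_t npages, size_t swap_backed, size_t field_size)
{
    assert(swap_backed <= npages);
    return (npages - swap_backed) * field_size;
}
```

   Under those assumptions there are 32768 pages, so the field costs 128 KB
in total, and 96 KB of that sits in pages that never use it.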

>  The VM system becomes
>    more swap-aware and doesn't have to worry about free memory as much.

   I don't think this is a significant advantage. Most of the problems we've
seen in the past are actually on the vnode pager side and not the swap pager
side.

>    A great deal of simplification can be done all over the place.

   I'm not convinced of this. I'm sure the code will be different, but I doubt
it will be much simpler.

>  These 
>    simplifications will take longer to accomplish since my goal is to get
>    the thing working first, but I think the long term prospects are very 
>    good.  Eventually we should be able to page out swap metadata associated
>    with active processes (but that's a long ways off).  The raw swap
>    allocation / deallocation code (the rlist stuff) will also eventually be 
>    rewritten to remove the memory blocking constraints that rlist_free 
>    currently has and to make it possible to remove swap.

   It is possible to remove swap with the current framework. No one has
bothered to write the code to do it, however. It seems to me that it will
be much more difficult to remove swap in the future if you put pager-related
storage data in each struct vm_page.

>    I'll start work on the second part after I finish the first part.  Fixing
>    VOP_STRATEGY basically involves giving each device or filesystem its own
>    guaranteed pool of N private pages (e.g. like 5 or so per active device
>    or mount).

   Yuck. One of the benefits of 4.4BSD (and further work by us) was getting
rid of private pools of memory. In some cases we reverted for performance
reasons, but private pools almost always get in the way of dynamically scaled
systems.

>    Fixing VOP_STRATEGY() and the swapper will together allow reliable
>    paging to files and remove memory deadlock issues related to VFS
>    layering (e.g. like mounting a vn partition on top of NFS and then
>    mounting a filesystem through that) - though even so there are still a
>    number of deadlock issues remaining in the VFS layering department.

   I think the deadlock issues are a bit overrated. The main problem that
I know about has to do with allocating really large swap block arrays for
large objects. There are ways of solving this at the swap pager level
without moving it into the struct vm_page.
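
   One way to avoid a single large swap block array, sketched here only as
an illustration and not necessarily the approach meant above, is to keep the
per-object metadata in small fixed-size chunks that are allocated lazily and
found through a hash on the page index. All names and sizes (swmeta,
swchunk, CHUNK_PAGES, HASH_BUCKETS) are hypothetical, and malloc stands in
for whatever allocator a kernel would use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef int64_t blk_t;
#define BLK_NONE     ((blk_t)-1)
#define CHUNK_PAGES  16u   /* pages covered by one chunk (assumed value) */
#define HASH_BUCKETS 64u   /* hash table size (assumed value) */

/* One small chunk of swap block numbers for CHUNK_PAGES consecutive pages. */
struct swchunk {
    struct swchunk *next;
    uint64_t        base;              /* first page index, chunk-aligned */
    blk_t           blk[CHUNK_PAGES];
};

/* Per-object swap metadata: chunks exist only where blocks are assigned. */
struct swmeta {
    struct swchunk *bucket[HASH_BUCKETS];
};

static struct swchunk *swmeta_find(const struct swmeta *m, uint64_t base)
{
    struct swchunk *c = m->bucket[(base / CHUNK_PAGES) % HASH_BUCKETS];
    while (c != NULL && c->base != base)
        c = c->next;
    return c;
}

/* Return the swap block for page index pindex, or BLK_NONE if unassigned. */
static blk_t swmeta_lookup(const struct swmeta *m, uint64_t pindex)
{
    const struct swchunk *c = swmeta_find(m, pindex - pindex % CHUNK_PAGES);
    return c != NULL ? c->blk[pindex % CHUNK_PAGES] : BLK_NONE;
}

/* Record blk for pindex; allocates at most one small chunk. 0 on success. */
static int swmeta_assign(struct swmeta *m, uint64_t pindex, blk_t blk)
{
    uint64_t base = pindex - pindex % CHUNK_PAGES;
    struct swchunk *c;

    assert(m != NULL);
    c = swmeta_find(m, base);
    if (c == NULL) {
        struct swchunk **head = &m->bucket[(base / CHUNK_PAGES) % HASH_BUCKETS];
        c = malloc(sizeof(*c));
        if (c == NULL)
            return -1;
        c->base = base;
        for (unsigned i = 0; i < CHUNK_PAGES; i++)
            c->blk[i] = BLK_NONE;
        c->next = *head;
        *head = c;
    }
    c->blk[pindex % CHUNK_PAGES] = blk;
    return 0;
}

/* Release every chunk and leave the table empty. */
static void swmeta_free(struct swmeta *m)
{
    for (unsigned i = 0; i < HASH_BUCKETS; i++) {
        struct swchunk *c = m->bucket[i];
        while (c != NULL) {
            struct swchunk *next = c->next;
            free(c);
            c = next;
        }
        m->bucket[i] = NULL;
    }
}
```

   The largest single allocation is one chunk regardless of how big the
object is, and a sparse object pays only for the ranges that actually have
swap assigned.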

-DG

David Greenman
Co-founder/Principal Architect, The FreeBSD Project

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe cvs-all" in the body of the message


