Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 28 Feb 1998 22:57:42 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        dima@tejblum.dnttm.rssi.ru (Dmitrij Tejblum)
Cc:        tlambert@primenet.com, FreeBSD-current@FreeBSD.ORG
Subject:   Re: VM: Process hangs sleeping on vmpfw
Message-ID:  <199802282257.PAA08480@usr08.primenet.com>
In-Reply-To: <199802281813.VAA02290@tejblum.dnttm.rssi.ru> from "Dmitrij Tejblum" at Feb 28, 98 09:13:09 pm

next in thread | previous in thread | raw e-mail | index | archive | help
Actually, the warning is emitted in the getpages.  If the getpages
fails, you are supposed to be trying a putpages.

> In fact, this mean that you are trying to _page in from_ FS that was 
> recently broken (i.e. local media FS that does not support 
> VOP_GETPAGES). vnode_pager_putpages does not check for EOPNOTSUPP.
> I thought it is job of the submitter/committer to find all places 
> broken by the changes and fix them.

There are counter examples, but you're right, that's not good
practice.  The thing here is that it really wants testing by
people who use the FS's.  The warning was Mike's (good) addition;
I personally wanted to panic when it happened.

Before this gets blown out of proportion, though, there are two
ways to approach a fix.  I would like to use the first:

(1)	Fix the FS's on a case-by-case basis.

(2)	Hack the get/put routines to see the EOPNOTSUPP, issue
	a warning, and then fallback to the old behaviour.

I'm against #2 because if it's not considered a serious warning,
the FS's won't find themselves fixed like they should because
people will blow off reporting the problem.

> Here is a (partial?) list of broken filesystems: NFS, CD9660,
> EXT2FS, MSDOSFS.

Local media FS's for the most part, yes.  The NFS case is rather
problematic (find out why, below).  The CD9660 needs a generic
getpages, but a read-only (ie: release-only) putpages.


> > The FS needs corrected, so if you can identify
> > which one it is, I can do a patch for you (it's pretty easy to make
> > a VOP_{GET|PUT}PAGES to use the legacy code, but it must be explicitly
> > used; doing this will [later] enable user space FS module developement
> > to be stacked on top.  
> 
> Well, I don't know what do you need for user space FS module 
> development, but I still believe that you introduced lot of unnecessary 
> complexity.

Stacking FS's will have similar problems until they correctly
implement the "bypass" mechanisms.  The reson for the change is
not purely for user space FS developement.  The original
rationale was posted when I posted the URL for the patches.

> First, why default/standard/generic getpages/putpages routines does not 
> have interface of VOP_GETPAGES/VOP_PUTPAGES vnode operations? It would be 
> easier for a filesystem to just add some entries to their operations 
> tables than also cut&paste implementation (even trivial) of these 
> operations from ffs. 

Look at /sys/ufs/ufs/ufs_readwrite.c:

==========================================================================
/*
 * put page routine
 *
 * XXX By default, wimp out... note that a_offset is ignored (and always
 * XXX has been).
 */
int
ffs_putpages(ap)
	struct vop_putpages_args *ap;
{
	return vnode_pager_generic_putpages(ap->a_vp, ap->a_m, ap->a_count,
		ap->a_sync, ap->a_rtvals);
}
==========================================================================

This is what I do in the case of a local media FS without a specific
get/put implementation.  This is exactly the fix that needs to go into
the local media FS's (with the 'put' replaced by 'get' and the argument
list adjusted).

The CD9660 code should have a variant putpages (obviously).


> Second, why don't put the operations to default_vnodeop_entries? It is 
> used exactly by local media filesystems. Stacking layers use bypass 
> routines instead (unionfs is an exception). So, filesystems even would 
> not notice this change, until they really want their own implementation 
> of getpages/putpages
> 
> What is wrong in the above?

Unionfs.  Specifically, the point is that the unionfs implementation
should fan out to the correct underlying implementation.  This means
it shouldn't go into the default.

Secondly, you can't make FS-specific optimizations.  One example here
is the preterbation of the putpages by the soft updates code... an
FFS-specific thing that should be reflected only in the FFS putpages.
Another example would be that the generic putpage doesn't know about
clustering (except what VOP_WRITE can do) for partial cluster writes,
etc..


> I can send a patch for you... It is indeed pretty easy...

See above for the patch.  It's the identification problem and the
stacking problem that I wanted to handle on a case-by-case basis.

As I said, it's possible to make this change go transparent (#2,
above); I would actually prefer an implementation that does *not*
define the generic code as default ops, so if it has to go
transparent, it should go transparent that way, not the default
ops way, so at least there are warning messages emitted.

Have you looked at John's comments about intended complexity in the
/sys/vm/vnode_pager.c?  Right now, the generic mechanism seperates
(but does not yet remove) the complexity he's talking about in
those comments.  This is the intended direction in the VM code; I'd
like to follow it through if we can, instead of crippling it.

You should also follow the NFS case in vnode_pager_generic_getpages
and look at all the other vp->v_mount references.  Look what the
change means to VOP_BMAP, specifically with regard to the assumptions
comment in vnode_pager_generic_getpages and vnode_pager_input_smlfs's
being called -- with the same assumptions but without the test.  It's
pretty obvious that the VOP_BMAP return test is equal to the NFS
test.  This code is at the heart of a lot of problems, and I'd like
to take it slow...

Any help would, of course, be appreciated.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199802282257.PAA08480>