Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 10 Jul 2001 23:43:29 -0700 (PDT)
From:      Matt Dillon <dillon@earth.backplane.com>
To:        Thomas Moestl <tmoestl@gmx.net>
Cc:        hackers@FreeBSD.ORG, Boris Popov <bp@FreeBSD.ORG>
Subject:   more on Re: Please review: bugfix for vinvalbuf()
Message-ID:  <200107110643.f6B6hTB24707@earth.backplane.com>
References:   <20010711003926.B8799@crow.dom2ip.de>

next in thread | previous in thread | raw e-mail | index | archive | help

:Hi,
:
:I've just tripped over an obviously long-standing (since about
:Jan. 1998) bug in vinvalbuf while looking into PR kern/26224. The
:problematic code looks like (on -CURRENT):
:
:	/*
:	 * Destroy the copy in the VM cache, too.
:	 */
:	mtx_lock(&vp->v_interlock);
:	if (VOP_GETVOBJECT(vp, &object) == 0) {
:		vm_object_page_remove(object, 0, 0,
:			(flags & V_SAVE) ? TRUE : FALSE);
:	}
:	mtx_unlock(&vp->v_interlock);
:
:The locks seem to be needed for file systems that don't perform real
:locking (on -STABLE, they are simplelocks).
:This, however, is incorrect because vm_object_page_remove may sleep.
:I've attached a patch that passes the interlock to
:vm_object_page_remove, which in turn passes it to a modified version
:of vm_page_sleep, which unlocks it around the sleep.
:I think that this is correct, because the object should be in a valid
:state when we sleep (and all checks are reexecuted in that case).
:
:Since I'm not very experienced with vfs and vm stuff, I'd be glad if
:this patch could get some review. In particular, is the lock/unlock
:pair really still needed, and are there notable differeces in -STABLE
:(because the fix would need the be MFC'ed)?
:
:Thanks,
:	- thomas

    Ok, I've looked at this more.  What is supposed to happen is that
    the 'while (vp->v_numoutput) ...' code just above the section you
    cited is supposed to prevent the deadlock.  The while code looks like
    this:

        while (vp->v_numoutput > 0) {
                vp->v_flag |= VBWAIT;
                tsleep(&vp->v_numoutput, PVM, "vnvlbv", 0);
        }

    However, as Ian points out in his followup, it doesn't work:

:...
:I've seen a related deadlock here in 4.x with NFS; vm_fault locks
:a page and then calls vput which aquires the v_interlock. This code
:in vinvalbuf locks the v_interlock first, and then vm_object_page_remove()
:locks the page. That's a simple lock-order reversal deadlock which I
:guess would still exist even with this patch.

    It doesn't work for the simple reason that vm_fault isn't busying the
    page for writing, it's busying it for reading, so v_numoutput will be
    0.

    I think the best solution is to change vinvalbuf's wait loop to
    wait on vm_object->paging_in_progress instead of vp->v_numoutput,
    or perhaps to wait for both conditions to be satisfied.

    vm_object->paging_in_progress applies to reads and writes, while
    vp->v_numoutput only applies to writes.

    I know this isn't the most correct solution... there is still the
    lock reversal if vm_object_remove_pages() were ever to sleep.  The
    idea is that it won't sleep if there is no paging in progress because
    nobody will have any of the object's pages locked.  I think it is
    the best we can do for the moment.  There are several places in
    vm/vm_object.c where the same assumption is made (testing against
    vm_object->paging_in_progress, though, which works, not vp->v_numoutput).

    -

    Thomas, if you are interested this could be a little project for you.
    See if you can code-up another while() loop to also wait for
    the object's paging_in_progress count to become 0 in the vinvalbuf()
    code.  Look in vm/vm_object.c for examples of how to construct such
    a loop.  I will be happy to review and test whatever you come up with
    and I'm sure Ian will too!

						-Matt



To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200107110643.f6B6hTB24707>