Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 22 Sep 1998 06:23:16 -0700
From:      Don Lewis <Don.Lewis@tsc.tdk.com>
To:        Luoqi Chen <luoqi@watermarkgroup.com>, current@FreeBSD.ORG
Subject:   Re: Yet another patch to try for softupdates panic
Message-ID:  <199809221323.GAA17424@salsa.gv.tsc.tdk.com>
In-Reply-To: Luoqi Chen <luoqi@watermarkgroup.com> "Re: Yet another patch to try for softupdates panic" (Sep 21,  7:27pm)

next in thread | previous in thread | raw e-mail | index | archive | help
On Sep 21,  7:27pm, Luoqi Chen wrote:
} Subject: Re: Yet another patch to try for softupdates panic
} > It looks like this bogus unlocking could also affect vinvalbuf() which
} > expects the vnode to be locked.
} > 
} The unlock is not bogus at all. You have to maintain a strict locking order
} to avoid deadlocks. For vnodes, this locking order is the directory tree
} itself, i.e., you have to lock the parent first, then the child. So if you
} hold the lock on the child and intend to lock the parent, you have to
} release the lock on child first, then acquire the lock on the parent, and
} then reacquire lock the the child. That's also the reason why you can't have
} hardlinks to a directory.

Yes, I read the comments in the code and understand why the locking must
be done that way, and I also understand why this new code is needed when
a user process calls fsync() for a file that exists on a filesystem using
softupdates.  What's bogus is that this new code changes the semantics of
VOP_FSYNC().  The fact that VOP_FSYNC() may now unlock the vnode for a while
then lock it again may break other parts of the kernel that expect the
vnode to remain locked across a call VOP_FSYNC().  This change in semantics
caused the directory truncation race that your patch fixed, although
admittedly this particular call to VOP_FSYNC() was added to support
softupdates.  For another example, take a look at vinvalbuf(), which
has the following code:

/*                          
 * Flush out and invalidate all buffers associated with a vnode.
 * Called with the underlying object locked.
 */             
int     
vinvalbuf(vp, flags, cred, p, slpflag, slptimeo)
[ snip ]
                if (vp->v_dirtyblkhd.lh_first != NULL) {
                        splx(s);
                        if ((error = VOP_FSYNC(vp, cred, MNT_WAIT, p)) != 0)
                                return (error);
                        s = splbio();
                        if (vp->v_numoutput > 0 ||
                            vp->v_dirtyblkhd.lh_first != NULL)
                                panic("vinvalbuf: dirty bufs"); 
                }
                splx(s);

It sure looks to me like if VOP_FSYNC() called ffs_fsync(), ffs_fsync()
would first write all the dirty buffers and then call softdep_fsync().
Softdep_fsync() would unlock the vnode in order to sync the parent
directories, and while the vnode was unlocked, another process could
grab the vnode and dirty its buffers.  Softdep_fsync() would relock the
vnode and return.  When VOP_FSYNC() returns, we get a nice panic ...

You could tweak this call to VOP_FSYNC() to get it to avoid  the
call to softdep_fsync(), but how many other places in the kernel
also need to be fixed?  It may be better to only call softdep_fsync()
from within the fsync() syscall handler.  I don't know that any other
users of VOP_FSYNC() need to ensure that the parent directories are
pushed to disk.


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199809221323.GAA17424>