FreeBSD Mail Archives

Date:      Sat, 30 Nov 2002 15:15:43 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Robert Watson <rwatson@freebsd.org>
Cc:        Michal Mertl <mime@traveller.cz>, current@freebsd.org
Subject:   Re: system locks with vnode backed md(4)
Message-ID:  <3DE9469F.CAC6CB82@mindspring.com>
References:  <Pine.NEB.3.96L.1021130130417.77446C-100000@fledge.watson.org>

Robert Watson wrote:
> On Sat, 30 Nov 2002, Michal Mertl wrote:
> > I'm now unable to make it dead-lock again. Yet it happened quite easily.
> > I had more md backing files in the same directory at the beginning (to
> > test Terry's suspicion mentioned in thread 'jail' on hackers@).
> 
> I've noticed that chroot() environments tend to make existing deadlock
> opportunities more likely.  I'm not quite sure why that is.  :-)

Lock to parent.  It's the same reason you can lock up if you
use automount, with all the automount mount points happening
in the same subdirectory.

> There are a fair number of vnode locking deadlock scenarios that are
> unavoidable where we rely on grabbing vnode locks out of the directory
> structure lock order.  This occurs for vnode-backed md devices, quotas,
> and UFS1 extended attributes, and probably some other situations.  I
> suspect that Terry is correct that operations on the vnode backing file
> storage directory are triggering the problem, since that increases the
> chances that a vnode lock "race to root" will occur from both the file
> system backed into the md device, and for the md backing vnodes during
> blocking I/O.

See other postings.  The "race to root" is the one I was
originally commenting on.  I'm not sure that it applies in
this case; I think this case might be the "out of memory to
create new soft dependencies" case, where you can end up
holding a lock on a buffer that needs to be flushed to recover
memory, until you can satisfy the request to create a dependency
(starvation deadlock).  The "race to root" is a "deadly embrace"
deadlock.
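
To make the "deadly embrace" case concrete, here is a minimal
user-space sketch in Python, not kernel code.  Two threads each
need the same pair of locks and name them in opposite order, which
is the classic setup for an embrace.  The sketch avoids it by always
acquiring in one fixed rank order.  RankedLock, acquire_in_order and
the lock names are hypothetical, invented for this illustration;
this is not how the VFS takes vnode locks, and the quoted text above
notes that some vnode locks are necessarily taken out of directory
order, so this discipline is not always available there.

```python
import threading


class RankedLock:
    """A lock with a fixed rank (hypothetical helper for illustration)."""

    def __init__(self, name, rank):
        self.name = name
        self.rank = rank
        self._lock = threading.Lock()


def acquire_in_order(*locks):
    """Acquire all locks lowest rank first; return them in acquired order."""
    ordered = sorted(locks, key=lambda l: l.rank)
    for l in ordered:
        l._lock.acquire()
    return ordered


def release_all(ordered):
    """Release in the reverse of the acquisition order."""
    for l in reversed(ordered):
        l._lock.release()


def worker(first, second, log, rounds=1000):
    # The caller may name the locks in either order; the helper
    # normalizes the order, so two workers can never each hold one
    # lock while waiting for the other.
    for _ in range(rounds):
        held = acquire_in_order(first, second)
        try:
            log.append(tuple(l.name for l in held))
        finally:
            release_all(held)


def run_demo():
    parent = RankedLock("parent_dir", 0)
    child = RankedLock("backing_file", 1)
    log = []
    t1 = threading.Thread(target=worker, args=(parent, child, log))
    t2 = threading.Thread(target=worker, args=(child, parent, log))
    t1.start()
    t2.start()
    t1.join(timeout=30)
    t2.join(timeout=30)
    finished = not t1.is_alive() and not t2.is_alive()
    return finished, log
```

If worker() instead acquired first and then second directly, the two
threads could each take one lock and wait forever on the other.  The
starvation case is different in kind: nothing is taken in the wrong
order, but the resource needed to make progress can only be freed by
releasing a lock that is already held.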

> If you can avoid directory operations on the md backing
> directory, that would probably be one way to avoid triggering the bug.

Yes.  By placing each vnconfiged device in its own subdirectory,
you avoid them.  There's still a window on your host OS doing
its own traversal, but that's (effectively) a "whole FS lock",
so it doesn't trigger a problem.
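
A minimal sketch of that layout change, in Python, assuming the
backing files all sit in one directory and are not currently
attached to any md device.  The function name and the ".d" suffix
are hypothetical choices for this illustration; attaching the
devices afterwards is not shown.

```python
import os
import shutil
import tempfile


def isolate_backing_files(directory):
    """Move each regular file in `directory` into its own subdirectory.

    Returns a mapping of original file name to new path.  The
    subdirectory is named "<file name>.d" (an arbitrary convention).
    """
    moved = {}
    # Take the listing once, up front, so the subdirectories created
    # below are never revisited.
    for name in sorted(os.listdir(directory)):
        src = os.path.join(directory, name)
        if not os.path.isfile(src):
            continue
        subdir = os.path.join(directory, name + ".d")
        os.mkdir(subdir)
        dst = os.path.join(subdir, name)
        shutil.move(src, dst)
        moved[name] = dst
    return moved


def demo():
    """Run the move on two dummy files in a throwaway directory."""
    with tempfile.TemporaryDirectory() as top:
        for name in ("jail1.img", "jail2.img"):
            with open(os.path.join(top, name), "wb") as f:
                f.write(b"\0" * 512)
        moved = isolate_backing_files(top)
        return {n: os.path.relpath(p, top) for n, p in moved.items()}
```

After the move, each backing file has a parent directory of its own,
so directory operations on one backing file's parent no longer touch
the parent of any other.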

> Seeing it reproduced would probably confirm that this is the case.

It's a pain.  I wasted a couple of days trying to reproduce it,
without a box I could wipe and make into a scratch box, with
little luck.  I think that it requires reproducing the failing
box in detail, which I wasn't willing to do (hence the workaround).

> On the
> other hand, there may be other deadlocks in the vnode/ufs/md code that can
> be more easily corrected than this general VFS problem, so details there
> would be very useful.

There are a number of them; they are all a pain.  It's really
tempting to just refactor the code so that all locking occurs
at the same logical layer, without being held across function
calls.  That'd be a heck of a lot of work, though... probably
worth it, in the end.

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3DE9469F.CAC6CB82>
