Date:      Mon, 21 Sep 1998 05:51:22 -0700
From:      Don Lewis <Don.Lewis@tsc.tdk.com>
To:        Luoqi Chen <luoqi@watermarkgroup.com>, current@FreeBSD.ORG
Subject:   Re: Yet another patch to try for softupdates panic
Message-ID:  <199809211251.FAA14043@salsa.gv.tsc.tdk.com>
In-Reply-To: Luoqi Chen <luoqi@watermarkgroup.com> "Yet another patch to try for softupdates panic" (Sep 18,  3:41pm)

On Sep 18,  3:41pm, Luoqi Chen wrote:
} Subject: Yet another patch to try for softupdates panic
} This patch could be the real cure for the `initiate_write_filepage' panic
} people were seeing during make -j# world. I have posted another patch
} about a week ago (in fact, I have committed it), but it turned out to be
} no more than a no-op (thanks to Bruce for pointing it out, it was an
} embarrassing silly mistake of mine). I certainly hope this patch will do
} its work: this patch should fix a race condition between directory truncation
} and file creation that could lead to the `initiate_write_filepage' panic.

Yeah, it looks like it might fix the problem.  I tracked down the
brokenness that I have been seeing to what looks like concurrent
directory access, even though directories are supposed to be locked
while they are being fiddled with.  It looks like the
initiate_write_filepage panic is caused by two processes trying to
store directory entries in the same slot.  I've seen one process start
a ufs_lookup() in a directory while another process was doing a
ufs_direnter() on that directory.  This shouldn't be possible because
ufs_direnter() should only happen if the directory is locked, and
ufs_lookup() shouldn't be called until its caller can lock the directory.

If ufs_direnter() decides to compact the directory, it calls
UFS_TRUNCATE(), which ends up calling softdep_fsync() if softupdates
are enabled.  Softdep_fsync() will unlock the directory, which is evil,
and your patch should prevent this.  What bothers me is that the
directory truncation doesn't happen until after ufs_direnter() has
stored the new directory entry, so I don't see how the softdep_fsync()
unlocking bug causes the symptoms.  It looks to me like the directory
is somehow getting unlocked before the new directory entry is
installed.  It seems like the first process which wants to create a
directory entry finds a free directory slot and calls ufs_direnter()
which somehow unlocks the directory and goes to sleep for a while.
Meanwhile another process finds the same directory slot and fills it.
The first process then wakes up and overwrites the directory slot used
by the second process.  BOOM!  If the directory slot is written before
the lock is released, then the lookup() in the second process shouldn't
find that slot free and there shouldn't be a collision.
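
To make the suspected interleaving concrete, here is a toy model in C.
It is only a sketch of the scenario described above, not kernel code:
struct toy_dir, toy_find_free(), toy_store() and the two scenario
functions are hypothetical names invented for illustration, the
"directory" is just an in-memory array of names, and the two processes
are simulated by running their steps in a fixed order rather than with
real concurrency or real locks.

```c
#include <stdbool.h>
#include <string.h>

#define NSLOTS  4
#define NAMELEN 16

/* Toy directory: an empty name marks a free slot (hypothetical layout,
 * not the UFS on-disk format). */
struct toy_dir {
    char slot[NSLOTS][NAMELEN];
};

/* Stand-in for the lookup step: return the first free slot, or -1. */
static int toy_find_free(const struct toy_dir *d)
{
    for (int i = 0; i < NSLOTS; i++)
        if (d->slot[i][0] == '\0')
            return i;
    return -1;
}

/* Stand-in for the store step of a direnter-like routine.
 * Returns true if it overwrote a slot that was already in use. */
static bool toy_store(struct toy_dir *d, int i, const char *name)
{
    bool clobbered = d->slot[i][0] != '\0';
    strncpy(d->slot[i], name, NAMELEN - 1);
    d->slot[i][NAMELEN - 1] = '\0';
    return clobbered;
}

/* Suspected bad ordering: process A picks a slot, the directory is
 * unlocked before A stores its entry, and B runs in the gap. */
bool toy_race_unlock_before_store(void)
{
    struct toy_dir d = {0};
    int a = toy_find_free(&d);          /* A finds slot 0 */
    /* (directory unlocked here; A goes to sleep) */
    int b = toy_find_free(&d);          /* B finds the same slot 0 */
    toy_store(&d, b, "from_B");         /* B fills it */
    return toy_store(&d, a, "from_A");  /* A wakes up and overwrites B */
}

/* Expected ordering: A stores its entry before the lock is released,
 * so B's lookup no longer sees that slot as free. */
bool toy_store_before_unlock(void)
{
    struct toy_dir d = {0};
    int a = toy_find_free(&d);          /* A finds slot 0 */
    bool c1 = toy_store(&d, a, "from_A");
    /* (directory unlocked only now) */
    int b = toy_find_free(&d);          /* B skips slot 0, gets slot 1 */
    bool c2 = toy_store(&d, b, "from_B");
    return c1 || c2;
}
```

Calling toy_race_unlock_before_store() reports a clobbered slot, while
toy_store_before_unlock() does not.  That is the distinction I'm trying
to draw: a collision needs the unlock to happen before the store, not
after it.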




