Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 10 May 2013 08:46:54 -0700
From:      Marcel Moolenaar <marcel@xcllnt.net>
To:        John Baldwin <jhb@freebsd.org>
Cc:        attilio@freebsd.org, svn-src-head@freebsd.org, svn-src-all@freebsd.org, Marcel Moolenaar <marcel@freebsd.org>, src-committers@freebsd.org
Subject:   Re: svn commit: r250411 - in head/sys: conf kern sys
Message-ID:  <405C7C78-A626-4836-BD90-16FD08DD3196@xcllnt.net>
In-Reply-To: <201305100952.45101.jhb@freebsd.org>
References:  <201305091628.r49GSI33039873@svn.freebsd.org> <CAJ-FndBY%2ByuUdvO4zP3kf2W4gDvB-uih19bqdmkFW3E4NcrHtw@mail.gmail.com> <CC06FD75-868C-40B3-9C10-D66B56327803@xcllnt.net> <201305100952.45101.jhb@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help

On May 10, 2013, at 6:52 AM, John Baldwin <jhb@freebsd.org> wrote:
>>>=20
>>> The way to fix this is to implement LK_NOWITNESS on a per-lock basis
>>> into lockmgr, propagate the same concept to the vn_lock() (which
>>> should be basically done automatically) and finally identify the
>>> false-positive case and commit for them explicitely LK_NOWITNESS on =
a
>>> per-call basis, explaining in detail why the single case reported is =
a
>>> false-positive.
>>=20
>> This is worse. You want witness involved.
>=20
> Well, I disagree with both of you a bit.  I mostly agree with Attilio =
in
> that the committed change is a really large sledgehammer.

I do not disagree that it isn't a fix. I wasn't aiming to fix it.

>  However, I think LK_NOWITNESS is also too large
> of a sledgehammer for this as well.

I agree. I think it's worse, by virtue of it being presented as both
better as well as a fix.

>=20
> 2) vnode locks from a local filesystem that report a LOR with a =
"devfs"
>   vnode.  Typical reports are either "ufs" -> "devfs" or in some cases
>   "ufs" -> "devfs" -> "ufs".  As with 1), I would much rather tag the
>   offending location than to disable all WITNESS checking on vnode =
locks.

With more file system types in use, this will get mixed up with the
other file systems and noise you get is rather severe. It is a big
problem for us at Juniper.


> I had originally thought of doing this by passing a flag in the =
relevant
> lock call, but that requires having a "flags" variant of all lock =
calls
> and passing the flag down, etc.  While writing this I thought of an
> alternative which would be to have a WITNESS_SUSPEND/WITNESS_RESUME =
pair
> of macros that either set a per-thread flag that we can invoke around
> known cases where a reversal is ok.  When witness is suspended it =
would
> still remember that the lock was acquired, but it would not report a
> LOR due to that acquisition (we may have to set a flag in the lock =
instance
> so that future LORs related to that acquisition are also ignored) and =
it
> would not learn any new orders from that acquisition.  So in code it
> would look like:
>=20
> 	WITNESS_SUSPEND();
> 	vn_lock(...);
> 	WITNESS_RESTORE();
>=20
> Does this sound agreeable?  If so, I definitely think Marcel's change =
should
> be reverted in favor of this (more general yet more fine-grained) =
approach=20
> instead.

This is a similar approach to what Attilio suggested (IMO) and I'm
very concerned on 2 grounds:
1.  Disable/suspending witness and making it easy to do so has known
    side-effects: people will use it simple to get rid of witness
    warnings and not even bother to understand the root cause of the
    witness complaints and thus fix the root cause problem. Fixing
    the root cause could be improving witness, or fixing the locks.
    If we add means and ways to have locks not checked by witness
    then witness stops being the useful tool it mostly is now.
2.  This, to me, also doesn't come close to fixing the root cause
    problem. It's a just a different bandaid. What I would like to
    see is a statement as to why witness reports LORs (in this case)
    and why they are not, so that we can improve witness.
    For example, a statement could go like this: witness operates on
    lock names. All vnode structures have a lock and they bare the
    name of the file system. Consequently, we have many instances of
    a particular lock type that share the same name. This causes,
    ... (please complete this statement appropriately).
    If our root cause problem is fundamentally that witness cannot
    distinguish 1 UFS vnode lock from another UFS vnode lock then
    we should fix that. Either:
    1.  Construct unique lock names when witness is enabled, or
    2.  Create the lock with a flag to tell witness that there are
	multiple locks of the same name and as such does not have
	enough the knowledge to report LORs adequately. How we
	change witness in the face of such a flag is another story
	and a good subject for ongoing discussion.

I'm all for fixing this. My change should be backed out if we have
a real fix. But I must say that I'm disappointed by the responses
of both you and Attilio. I do not get a sense that either of you
knows what the problem is. I definitely do not have had time to
understand the problem, let alone work up a real fix, so I claim
ignorance without prejudice and I'm the first to admit that this is
not a fix. But the best suggestion I get is to exclude the locks
from any consideration by witness and that's worse than what I did.
I merely added a kernel option to supressing the printing. Not even
the checking.

And all I did is to allow someone (=3D Juniper) to not print the LOR
for this well-known and mostly ignored case that is impacting our
ability to keep witness enabled. And the reason I had to do that is
that this is a long-standing LOR that isn't being addressed. The
FreeBSD community apparently has settled on just ignoring it.

Let's do better. Let's understand the root cause and work up a real
fix that doesn't involve disabling witness or excluding them from
witness. I have no problem doing the grunt work on that in due time
and rip out my hack at the same time.

--=20
Marcel Moolenaar
marcel@xcllnt.net





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?405C7C78-A626-4836-BD90-16FD08DD3196>