Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 14 Dec 2009 18:02:54 +0100
From:      Attilio Rao <attilio@freebsd.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r200447 - in head: share/man/man9 sys/kern sys/sys
Message-ID:  <3bbf2fe10912140902m407fa766q3a5e5bb6993723f9@mail.gmail.com>
In-Reply-To: <200912141013.32839.jhb@freebsd.org>
References:  <200912122131.nBCLV71f064304@svn.freebsd.org> <200912141013.32839.jhb@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
2009/12/14 John Baldwin <jhb@freebsd.org>:
> On Saturday 12 December 2009 4:31:07 pm Attilio Rao wrote:
>> Author: attilio
>> Date: Sat Dec 12 21:31:07 2009
>> New Revision: 200447
>> URL: http://svn.freebsd.org/changeset/base/200447
>>
>> Log:
>> =C2=A0 In current code, threads performing an interruptible sleep (on bo=
th
>> =C2=A0 sxlock, via the sx_{s, x}lock_sig() interface, or plain lockmgr),=
 will
>> =C2=A0 leave the waiters flag on forcing the owner to do a wakeup even w=
hen if
>> =C2=A0 the waiter queue is empty.
>> =C2=A0 That operation may lead to a deadlock in the case of doing a fake=
 wakeup
>> =C2=A0 on the "preferred" (based on the wakeup algorithm) queue while th=
e other
>> =C2=A0 queue has real waiters on it, because nobody is going to wakeup t=
he 2nd
>> =C2=A0 queue waiters and they will sleep indefinitively.
>>
>> =C2=A0 A similar bug, is present, for lockmgr in the case the waiters ar=
e
>> =C2=A0 sleeping with LK_SLEEPFAIL on. =C2=A0In this case, even if the wa=
iters queue
>> =C2=A0 is not empty, the waiters won't progress after being awake but th=
ey will
>> =C2=A0 just fail, still not taking care of the 2nd queue waiters (as ins=
tead the
>> =C2=A0 lock owned doing the wakeup would expect).
>>
>> =C2=A0 In order to fix this bug in a cheap way (without adding too much =
locking
>> =C2=A0 and complicating too much the semantic) add a sleepqueue interfac=
e which
>> =C2=A0 does report the actual number of waiters on a specified queue of =
a
>> =C2=A0 waitchannel (sleepq_sleepcnt()) and use it in order to determine =
if the
>> =C2=A0 exclusive waiters (or shared waiters) are actually present on the=
 lockmgr
>> =C2=A0 (or sx) before to give them precedence in the wakeup algorithm.
>> =C2=A0 This fix alone, however doesn't solve the LK_SLEEPFAIL bug. In or=
der to
>> =C2=A0 cope with it, add the tracking of how many exclusive LK_SLEEPFAIL=
 waiters
>> =C2=A0 a lockmgr has and if all the waiters on the exclusive waiters que=
ue are
>> =C2=A0 LK_SLEEPFAIL just wake both queues.
>>
>> =C2=A0 The sleepq_sleepcnt() introduction and ABI breakage require
>> =C2=A0 __FreeBSD_version bumping.
>
> Hmm, do you need an actual count of waiters or would a 'sleepq_empty()'
> (similar to turnstile_empty()) method be sufficient?

I need the count in order to fix properly LK_SLEEPFAIL case (the idea
is: track exclusive waiters with LK_SLEEPFAIL on; if the number is
equal to the actual sleepers on the queue then wake up both queues,
otherwise nobody is going to take care of the shared waiters queue).

Attilio


--=20
Peace can only be achieved by understanding - A. Einstein



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3bbf2fe10912140902m407fa766q3a5e5bb6993723f9>