Date:      Mon, 14 Dec 2009 10:13:32 -0500
From:      John Baldwin <jhb@freebsd.org>
To:        Attilio Rao <attilio@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r200447 - in head: share/man/man9 sys/kern sys/sys
Message-ID:  <200912141013.32839.jhb@freebsd.org>
In-Reply-To: <200912122131.nBCLV71f064304@svn.freebsd.org>
References:  <200912122131.nBCLV71f064304@svn.freebsd.org>

On Saturday 12 December 2009 4:31:07 pm Attilio Rao wrote:
> Author: attilio
> Date: Sat Dec 12 21:31:07 2009
> New Revision: 200447
> URL: http://svn.freebsd.org/changeset/base/200447
> 
> Log:
>   In current code, threads performing an interruptible sleep (on either
>   an sxlock, via the sx_{s, x}lock_sig() interface, or plain lockmgr) will
>   leave the waiters flag on, forcing the owner to do a wakeup even if
>   the waiter queue is empty.
>   That operation may lead to a deadlock in the case of doing a fake wakeup
>   on the "preferred" (based on the wakeup algorithm) queue while the other
>   queue has real waiters on it, because nobody is going to wake up the 2nd
>   queue waiters and they will sleep indefinitely.
>   
>   A similar bug is present for lockmgr when the waiters are
>   sleeping with LK_SLEEPFAIL on.  In this case, even if the waiters queue
>   is not empty, the waiters won't progress after being awakened but will
>   just fail, still not taking care of the 2nd queue waiters (as the lock
>   owner doing the wakeup would instead expect).
>   
>   In order to fix this bug in a cheap way (without adding too much locking
>   or complicating the semantics too much), add a sleepqueue interface which
>   reports the actual number of waiters on a specified queue of a
>   waitchannel (sleepq_sleepcnt()) and use it to determine whether the
>   exclusive waiters (or shared waiters) are actually present on the lockmgr
>   (or sx) before giving them precedence in the wakeup algorithm.
>   This fix alone, however, doesn't solve the LK_SLEEPFAIL bug. In order to
>   cope with it, add tracking of how many exclusive LK_SLEEPFAIL waiters
>   a lockmgr has and, if all the waiters on the exclusive waiters queue are
>   LK_SLEEPFAIL, just wake both queues.
>   
>   The sleepq_sleepcnt() introduction and ABI breakage require
>   __FreeBSD_version bumping.

Hmm, do you need an actual count of waiters or would a 'sleepq_empty()'
(similar to turnstile_empty()) method be sufficient?
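
To make the question concrete, here is a minimal user-space sketch of the
wakeup choice as the log describes it. It is a toy model, not the kernel
code: every toy_* name, the struct layout and the ">=" comparison are
assumptions made for illustration. The stale-flag case only needs an
empty/non-empty test, while the LK_SLEEPFAIL case, as I read the log, has
to compare the number of real sleepers against the tracked sleepfail count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a lock with two wait queues. */
enum { TOY_SHARED = 0, TOY_EXCL = 1 };

#define TOY_WAKE_SHARED (1u << TOY_SHARED)
#define TOY_WAKE_EXCL   (1u << TOY_EXCL)

struct toy_lock {
	unsigned waiters[2];      /* threads really asleep on each queue */
	unsigned excl_sleepfail;  /* exclusive waiters that fail on wakeup */
};

/* Stand-in for a "how many sleepers" query on one queue. */
static unsigned
toy_sleepcnt(const struct toy_lock *lk, int queue)
{
	return (lk->waiters[queue]);
}

/* An "is it empty" query can always be derived from the count. */
static bool
toy_empty(const struct toy_lock *lk, int queue)
{
	return (toy_sleepcnt(lk, queue) == 0);
}

/* Return a bitmask of the queues the releasing owner should wake. */
unsigned
toy_pick_queues(const struct toy_lock *lk)
{
	assert(lk != NULL);

	/* Stale waiters flag: nobody is on the preferred queue. */
	if (toy_empty(lk, TOY_EXCL))
		return (TOY_WAKE_SHARED);

	/*
	 * Every exclusive sleeper would just fail after waking, so none
	 * of them will take care of the shared queue: wake both.
	 */
	if (lk->excl_sleepfail >= toy_sleepcnt(lk, TOY_EXCL))
		return (TOY_WAKE_SHARED | TOY_WAKE_EXCL);

	/* At least one exclusive sleeper can make progress. */
	return (TOY_WAKE_EXCL);
}
```

In this model the first branch would work equally well with only an
"empty" predicate; it is the second branch that uses the count itself.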

-- 
John Baldwin


