Date: Tue, 2 Sep 2008 16:08:56 -0400 From: John Baldwin <jhb@freebsd.org> To: Tor Egge <Tor.Egge@cvsup.no.freebsd.org> Cc: attilio@freebsd.org, kevinxlinuz@163.com, freebsd-current@freebsd.org, Benjamin.Close@clearchain.com, kib@freebsd.org Subject: Re: [BUG] I think sleepqueue need to be protected in sleepq_broadcast Message-ID: <200809021608.57542.jhb@freebsd.org> In-Reply-To: <20080901.013117.74700691.Tor.Egge@cvsup.no.freebsd.org> References: <200808230003.44081.jhb@freebsd.org> <48B6BC81.5060300@clearchain.com> <20080901.013117.74700691.Tor.Egge@cvsup.no.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sunday 31 August 2008 09:31:17 pm Tor Egge wrote: > > sleepq_resume_thread() contains an ownership handover of sq if the resumed > thread is the last one blocked on the wait channel. After the handover, sq is > no longer protected by the sleep queue chain lock and should no longer be > accessed by sleepq_broadcast(). > > Normally, when sleepq_broadcast() incorrectly accesses sq after the handover, > it will find the sq->sq_blocked queue to be empty, and the code appears to > work. > > If the last correctly woken thread manages to go to sleep again very quickly on > another wait channel, sleepq_broadcast() might incorrectly determine that the > sq->sq_blocked queue isn't empty, and start doing the wrong thing. So disregard my earlier e-mail. Here is a simple fix for the sleepq case: Index: subr_sleepqueue.c =================================================================== --- subr_sleepqueue.c (revision 182679) +++ subr_sleepqueue.c (working copy) @@ -779,7 +779,7 @@ sleepq_broadcast(void *wchan, int flags, int pri, int queue) { struct sleepqueue *sq; - struct thread *td; + struct thread *td, *tdn; int wakeup_swapper; CTR2(KTR_PROC, "sleepq_broadcast(%p, %d)", wchan, flags); @@ -793,8 +793,7 @@ /* Resume all blocked threads on the sleep queue. */ wakeup_swapper = 0; - while (!TAILQ_EMPTY(&sq->sq_blocked[queue])) { - td = TAILQ_FIRST(&sq->sq_blocked[queue]); + TAILQ_FOREACH_SAFE(td, &sq->sq_blocked[queue], td_slpq, tdn) { thread_lock(td); if (sleepq_resume_thread(sq, td, pri)) wakeup_swapper = 1; This only uses 'sq' to fetch the head of the queue once up front. It won't use it again once it has started waking up threads. > A similar (but probably much more difficult to trigger) issue is present with > regards to thread_lock() and turnstiles. > > The caller of thread_lock() might have performed sufficient locking to ensure > that the thread to be locked doesn't go away, but any turnstile spin lock > pointed to by td->td_lock isn't protected. Making turnstiles type stable > (setting UMA_ZONE_NOFREE flag for turnstile_zone) should fix that issue. Note that unlike the sleepq case, turnstiles are not made runnable until all of them are dequeued from the turnstile and assigned a new turnstile. Only after all that is settled are the threads made runnable in turnstile_unpend(). However, that doesn't fix this specific race (though it means the turnstile code is not subject to the same exact race as the sleepq code above). Making turnstiles type-stable is indeed probably the only fix for this. :-/ -- John Baldwin
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200809021608.57542.jhb>