Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 21 May 2015 11:33:02 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        Alexander Kabaev <kabaev@gmail.com>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r282971 - in head/sys: kern sys
Message-ID:  <2114957.Lbz1KNQfZU@ralph.baldwin.cx>
In-Reply-To: <20150520120046.268dde86@kan>
References:  <201505151350.t4FDocQT054144@svn.freebsd.org> <20150520120046.268dde86@kan>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wednesday, May 20, 2015 12:00:46 PM Alexander Kabaev wrote:
> On Fri, 15 May 2015 13:50:38 +0000 (UTC)
> John Baldwin <jhb@FreeBSD.org> wrote:
> 
> > Author: jhb
> > Date: Fri May 15 13:50:37 2015
> > New Revision: 282971
> > URL: https://svnweb.freebsd.org/changeset/base/282971
> > 
> > Log:
> >   Previously, cv_waiters was only updated by cv_signal or cv_wait. If
> > a thread awakened due to a time out, then cv_waiters was not
> > decremented. If INT_MAX threads timed out on a cv without an
> > intervening cv_broadcast, then cv_waiters could overflow. To fix
> > this, have each sleeping thread decrement cv_waiters when it resumes.
> >   
> >   Note that previously cv_waiters was protected by the sleepq chain
> > lock. However, that lock is not held when threads resume from sleep.
> > In addition, the interlock is also not always reacquired after
> > resuming (cv_wait_unlock), nor is it always held by callers of
> > cv_signal() or cv_broadcast(). Instead, use atomic ops to update
> > cv_waiters. Since the sleepq chain lock is still held on every
> > increment, it should still be safe to compare cv_waiters against zero
> > while holding the lock in the wakeup routines as the only way the
> > race should be lost would result in extra calls to sleepq_signal() or
> > sleepq_broadcast(). 
> >   Differential Revision:	https://reviews.freebsd.org/D2427
> >   Reviewed by:	benno
> >   Reported by:	benno (wrap of cv_waiters in the field)
> >   MFC after:	2 weeks
> > 
> > Modified:
> >   head/sys/kern/kern_condvar.c
> >   head/sys/sys/condvar.h
> > 
> 
> This breaks ZFS range locking code, which expects to be able to wakeup
> everyone on the condition variable and then free the structure that
> contains it. Having woken up threads modify cv_waiters results in a
> race that leads to already freed memory to be accessed.
> 
> It is debatable just how correct ZFS code in its expectations, but I
> think this commit should probably be reverted until either ZFS is
> changed not to expect cv modifiable by waking threads or until
> alternative solution is found to the cv_waiters overflow issue fixed by
> this commit.

Yes, this is fine to revert for now.  I'll have to think about which way
to fix the bug in question.  The simplest route is probably to remove
cv_waiters entirely, though hacking up sleepq_timeout to recognize cv's
and decrement the count is another option (but hackier).

-- 
John Baldwin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?2114957.Lbz1KNQfZU>