From owner-freebsd-bugs@FreeBSD.ORG Mon Oct 13 11:50:25 2003 Return-Path: Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 851C516A4B3 for ; Mon, 13 Oct 2003 11:50:25 -0700 (PDT) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5825D43F93 for ; Mon, 13 Oct 2003 11:50:24 -0700 (PDT) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.12.9/8.12.9) with ESMTP id h9DIoNFY052187 for ; Mon, 13 Oct 2003 11:50:24 -0700 (PDT) (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.12.9/8.12.9/Submit) id h9DIoN5t052183; Mon, 13 Oct 2003 11:50:23 -0700 (PDT) (envelope-from gnats) Date: Mon, 13 Oct 2003 11:50:23 -0700 (PDT) Message-Id: <200310131850.h9DIoN5t052183@freefall.freebsd.org> To: freebsd-bugs@FreeBSD.org From: Stefan Farfeleder Subject: Re: kern/57945: [patch] Add locking to kqueue to make it MP-safe X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: Stefan Farfeleder List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Oct 2003 18:50:25 -0000 The following reply was made to PR kern/57945; it has been noted by GNATS. From: Stefan Farfeleder To: Bruce Evans Cc: bug-followup@FreeBSD.org Subject: Re: kern/57945: [patch] Add locking to kqueue to make it MP-safe Date: Mon, 13 Oct 2003 20:47:38 +0200 On Tue, Oct 14, 2003 at 03:56:19AM +1000, Bruce Evans wrote: > Spin mutexes are actually less efficient, since in the usual uncontested > case they do the same things plus extra calls to critical_enter() and > critical exit. Ok, thanks. > > --- kqueue_lock.diff begins here --- > > Index: src/sys/kern/kern_event.c > > =================================================================== > > RCS file: /usr/home/ncvs/src/sys/kern/kern_event.c,v > > retrieving revision 1.60 > > diff -u -r1.60 kern_event.c > > --- src/sys/kern/kern_event.c 18 Jun 2003 18:16:39 -0000 1.60 > > +++ src/sys/kern/kern_event.c 13 Oct 2003 00:35:42 -0000 > > ... > > @@ -704,15 +705,16 @@ > > > > start: > > kevp = kq->kq_kev; > > - s = splhigh(); > > + mtx_lock(&kq->kq_mtx); > > if (kq->kq_count == 0) { > > if (timeout < 0) { > > error = EWOULDBLOCK; > > + mtx_unlock(&kq->kq_mtx); > > } else { > > kq->kq_state |= KQ_SLEEP; > > - error = tsleep(kq, PSOCK | PCATCH, "kqread", timeout); > > + error = msleep(kq, &kq->kq_mtx, PSOCK | PCATCH | PDROP, > > + "kqread", timeout); > > } > > - splx(s); > > Not copying the spl locking seems to give a bug here. msleep() returns with > the mutex held. The PDROP flag should prevent msleep() from reacquiring the lock. > > if (error == 0) > > goto retry; > > /* don't restart after signals... */ > > @@ -728,20 +730,22 @@ > > kn = TAILQ_FIRST(&kq->kq_head); > > TAILQ_REMOVE(&kq->kq_head, kn, kn_tqe); > > if (kn == &marker) { > > - splx(s); > > + mtx_unlock(&kq->kq_mtx); > > if (count == maxevents) > > goto retry; > > goto done; > > } > > + kq->kq_count--; > > + mtx_unlock(&kq->kq_mtx); > > if (kn->kn_status & KN_DISABLED) { > > kn->kn_status &= ~KN_QUEUED; > > - kq->kq_count--; > > + mtx_lock(&kq->kq_mtx); > > continue; > > } > > I don't see any reason to unlock/lock here. There were no splx()/splhigh()'s. > splhigh() is very global but kq_mtx is very local so keeping the > critical sections short should be much less needed than before. Ah, right. So it should look like this? + kq->kq_count--; if (kn->kn_status & KN_DISABLED) { kn->kn_status &= ~KN_QUEUED; - kq->kq_count--; continue; } + mtx_unlock(&kq->kq_mtx); > > if ((kn->kn_flags & EV_ONESHOT) == 0 && > > kn->kn_fop->f_event(kn, 0) == 0) { > > kn->kn_status &= ~(KN_QUEUED | KN_ACTIVE); > > - kq->kq_count--; > > + mtx_lock(&kq->kq_mtx); > > continue; > > } > > *kevp = kn->kn_kevent; > > Here dropping the mutex seems to be be necessary for calling the > function, but the function call was locked by splhigh() before. Yes, not dropping kq_mtx causes LORs between at least the kqueue lock and the pipe lock. > BTW, what locks kn? Nothing at the moment, I hoped it wouldn't be necessary. Do you have a specific race in mind? Stefan