Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 22 Aug 2007 10:16:33 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        Pawel Jakub Dawidek <pjd@freebsd.org>
Cc:        Alfred Perlstein <alfred@freebsd.org>, freebsd-arch@freebsd.org
Subject:   Re: Lockless uidinfo.
Message-ID:  <200708221016.34107.jhb@freebsd.org>
In-Reply-To: <20070822063552.GC4187@garage.freebsd.pl>
References:  <20070818120056.GA6498@garage.freebsd.pl> <200708211753.34697.jhb@freebsd.org> <20070822063552.GC4187@garage.freebsd.pl>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wednesday 22 August 2007 02:35:52 am Pawel Jakub Dawidek wrote:
> On Tue, Aug 21, 2007 at 05:53:34PM -0400, John Baldwin wrote:
> > On Tuesday 21 August 2007 03:19:02 pm Pawel Jakub Dawidek wrote:
> > > > Memory barriers on another CPU don't mean anything about the CPU 
thread 2 
> > is 
> > > > on.  Memory barriers do not flush caches on other CPUs, etc.  Normally 
> > when 
> > > > objects are refcounted in a table, the table holds a reference on the 
> > object, 
> > > > but that doesn't seem to be the case here. [...]
> > > 
> > > But the memory barrier from 'mtx_lock(&uihashtbl_mtx)' above
> > > 'if (uip->ui_ref > 0)' would do the trick and I can safely avoid using
> > > atomic read in this if statement, right?
> > 
> > Yes, though that may be rather non-obvious to other people, or to yourself 
6 
> > months down the road.  You should probably document it.
> 
> Agreed, I added a comment that should explain it.
> 
> > > > I wouldn't use a more complex algo in uifree() unless the simple one 
is 
> > shown 
> > > > to perform badly.  Needless complexity is a hindrance to future 
> > maintenance.
> > > 
> > > Of coure we could do that, but I was trying really hard to remove
> > > contention in the common case. Before we used UIDINFO_LOCK() in the
> > > common case, now you suggesting using global lock here, and I'd really,
> > > really prefer using one atomic only.
> > 
> > *sigh*  You ignored my last sentence.  You need to think about other 
people 
> > who come and read this later or yourself 12 months from now when you've 
paged 
> > out all the uidinfo stuff from your head.
> > 
> > Hmm, what happens if you get this:
> > 
> > thread A
> > --------
> > 
> >  refcount_release() - drops to 0
> > 
> >   preempted
> > 
> >                                  thread B
> >                                  --------
> >                                  uihold() - refcount goes to 1
> >                                  ...
> >                                  uifree() - refcount goes to 0
> >                                  removes object from table and frees it
> > 
> >   resumes
> > 
> >   mtx_lock(&uihashtbl);
> > 
> >   sees refcount of 0 so tries to destroy object again
> > 
> >  *BOOM* (corruption, panic, etc.)
> 
> Grr, good catch.
> 
> > This is the race that the current code handles by taking a reference
> > on the uid while it drops the uidinfo lock to acquire the table lock.
> > 
> > Probably you need to not drop the last reference unless you hold the 
> > uihashtbl_mtx, which means not using refcount_release() and manually use 
an 
> > atomic_cmpset_int() if the refcount is > 1 to drop a ref to optimize the 
> > common case:
> > 
> > 	old = uip->ui_refs;
> > 	if (old > 1) {
> > 		if (atomic_cmpset_int(&uip->ui_refs, old, old - 1))
> > 			return;
> > 	}
> > 
> > 	mtx_lock(&uihashtbl_mtx);
> > 	if (refcount_release(&uip->ui_refs)) {
> > 		/* remove from table and free */
> > 	}
> > 	mtx_unlock(&uihashtbl_mtx);
> 
> How about we relookup it after acquiring the uihashtbl_mtx lock?
> Something like this (comments removed):
> 
> 	uid_t uid;
> 
> 	uid = uip->ui_uid;
> 	if (!refcount_release(&uip->ui_ref))
> 		return;
> 	mtx_lock(&uihashtbl_mtx);
> 	uip = uilookup(uid);
> 	if (uip != NULL && uip->ui_ref == 0) {
> 		LIST_REMOVE(uip, ui_hash);
> 		mtx_unlock(&uihashtbl_mtx);
> 		FREE(uip, M_UIDINFO);
> 		return;
> 	}
> 	mtx_unlock(&uihashtbl_mtx);
> 
> I updated the patch at:
> 
> 	http://people.freebsd.org/~pjd/patches/uidinfo_waitfree.patch

That actually adds more overhead than what I suggested above to the case where 
you are going to free it.  Also, I'm leery of having an object hang around 
with a zero ref count while it is in the table with the lock not held.

-- 
John Baldwin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200708221016.34107.jhb>