Date: Tue, 12 Dec 2000 18:34:14 -0800 From: Jason Evans <jasone@canonware.com> To: John Baldwin <jhb@FreeBSD.org> Cc: arch@FreeBSD.org Subject: Re: An opaque refcount type Message-ID: <20001212183414.K2312@canonware.com> In-Reply-To: <XFMail.001212122450.jhb@FreeBSD.org>; from jhb@FreeBSD.org on Tue, Dec 12, 2000 at 12:24:50PM -0800 References: <XFMail.001212122450.jhb@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Dec 12, 2000 at 12:24:50PM -0800, John Baldwin wrote: > Here's another bikeshed war for everyone to get in on: I've implemented a > relatively light weight and very simple opaque reference counter. It defines > an opaque refcount_t type. In the INVARIANTS case, this maps to a structure > that contains an integer and a mutex. The mutex is used to protect the > integer count as well as several KASSERT()'s. In the non-INVARIANTS case, it > maps to a single integer on all of our current platforms (x86, alpha, ia64) and > is managed via atomic operations. It defines a rather simple API: > > void refcount_init(refcount_t *count, u_int value) > - Initializes the reference counter and sets its inital count to 'value'. > void refcount_destroy(refcount_t *count) > - Cleans up any internals used in a refcount, frees resources, etc. > void refcount_acquire(refcount_t *count) > - Atomically bump the reference count by one. > int refcount_release(refcount_t *count) > - Atomically decerement the reference count by one and return true if the > count drops to zero. As at least John and Alfred know, I'm opposed to an atomic refcount type, for at least the following reasons: 1) The number of bits of accuracy varies from platform to platform. It seems that the least common denominator we'd need to settle on is 24 bits (sparc64), which is enough for most, but not all uses. This is the kernel we're talking about here; I don't like the uncertainty of the accuracy. Maybe 16.7 million is enough accuracy, but... 2) It is hard to say at this point since we still have a lot of conversion work to do, but I suspect that the number of places where a refcount is ideally manipulated outside of a mutex is small. 3) The refcount_init() and refcount_destroy() calls are mandatory, but are only really necessary for platforms that have to rely on mutexes for the internal implementation (or if INVARIANTS is defined). For the most part, this just amounts to extra API complexity. 4) For platforms that can't support an atomic refcount with atomic operations, we have to use mutexes. Now, instead of having using normal atomic operations for refcounts, we have a mutex structure in place of an integer. So, in some cases, this API is actually a pessimization rather than an optimization, and any structures that use a refcount_t are suddenly much larger. Naturally, this means that any structures with a refcount_t that are exposed to userland will potentially vary in size, depending on INVARIANTS. 5) At the moment we have mtx_*(), atomic_*(), [mt]sleep(), and the lockmgr. In the foreseeable future (maybe not by 5.0, but eventually) we will probably have mtx_*(), cv_*() (condition variables, replaces [mt]sleep()), and enhanced reader/writer (actually SIX) locks (replaces lockmgr). I'd like to see the number of synchronization-related interfaces get smaller and simpler, not larger. In another email thread, I demonstrated how to safely manipulate refcounts with atomic operations. The main problem to solve is how to avoid a cleanup race when decrementing the refcount to 0. The following pseudo-code demonstrates a way to avoid this race. atomic_decrement(count) if (compare_and_set(count, 0, 1)) { /* iff (count == 0) {count = 1}. */ /* Clean up. */ } This requires two atomic operations for refcount decrement, which, as John has pointed out, is the same number of locked instructions as would be required for a mutex acquire/release. However, refcount increment still only requires one locked instruction. Therefore, this method requires 50% more locked instructions than a refcount_t implementation would require on a cooperative platform, and 33% fewer locked instructions than a refcount_t implementation on an uncooperative platform. My knowledge of memory bus locking isn't fresh, but I suspect that using two locked instructions on the same address in short succession only costs significantly more than one locked instruction if there is contention. Of course, if there is contention, then an implementation that only uses one locked instruction is going to get hurt too, so I suspect that the actual performance difference between this and a streamlined refcount_t implementation is rather small. (Someone please correct me if this is incorrect.) Even if performance is significantly worse for this method, point 2 above could make it irrelevant anyway. In summary, I have a number of specific and general issues with introducing an atomic refcount type with associated functions. Jason To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20001212183414.K2312>