From owner-freebsd-arch@FreeBSD.ORG Tue Jan 24 22:38:51 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1E05D16A41F; Tue, 24 Jan 2006 22:38:51 +0000 (GMT) (envelope-from jhb@freebsd.org) Received: from speedfactory.net (mail6.speedfactory.net [66.23.216.219]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6E4B843D49; Tue, 24 Jan 2006 22:38:50 +0000 (GMT) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.5b3) with ESMTP id 6883515 for multiple; Tue, 24 Jan 2006 17:38:37 -0500 Received: from localhost (john@localhost [127.0.0.1]) by server.baldwin.cx (8.13.4/8.13.4) with ESMTP id k0OMcm6c016149; Tue, 24 Jan 2006 17:38:48 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: arch@freebsd.org Date: Tue, 24 Jan 2006 17:39:39 -0500 User-Agent: KMail/1.9.1 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200601241739.41027.jhb@freebsd.org> X-Virus-Scanned: ClamAV 0.87.1/1248/Tue Jan 24 05:54:38 2006 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-1.4 required=4.2 tests=ALL_TRUSTED autolearn=failed version=3.1.0 X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on server.baldwin.cx X-Server: High Performance Mail Server - http://surgemail.com r=1653887525 Cc: current@freebsd.org Subject: [PATCH] Initial (working!) version of rwlocks X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 24 Jan 2006 22:38:51 -0000 If you've been following p4 submits to //depot/user/jhb/lock/... recently you've noticed that I've been hacking on an implementation of reader/writer locks. I've just finished doing an initial set of tests on a 4-cpu box and am confident that at least the really big races should all be handled. First, rwlocks are basically read/write mutexes. They cannot be held while sleeping similar to mutexes (and thus different from sx and lockmgr), but this means that they can be used in ithreads, etc. Also, they can do some form of priority propagation. To achieve this latter part, I've patched the turnstile code to grow the notion of having multiple queues of waiters on a given turnstile: a queue of shared (read) waiters, and a queue of exclusive (write) waiters. The modified turnstile code (with suitable updates to the mutex code) ran without incident for several days on alpha, amd64, i386, and sparc64 machines running buildworld -j XX in an infinite loop. Now, as far as limitations, etc. of this reader/writer lock implementation. This implementation is not meant to be the end-all be-all holy grail of reader/writer lock implementations. The goals of this version are to have a _stable_, _working_ implementation that can be used in the tree as well as to provide a base implementation that people can hack on to try out other algorithms and ideas. This means that folks like the networking stack guys can go ahead and have working rwlocks now (though perhaps not 100% optimal or perfect, but allowing more parallelism than mutexes) and that independently other folks can play with other ideas for rwlocks. In other words, I don't want to bikeshed forever about missing features or theoretical changes to the wakeup algorithm, etc. I'd rather commit this now as a starting point. :) That said, here are the known limitations, etc.: - Currently no attempt is made to do propogate priority to threads that hold read locks. Not even a Solaris-style "owner of record" type deal. For one, it would require some more hacking on the turnstile code. Secondly, it would add some sort of expense to read-lock operations and I'm unsure if the extra atomic ops (and thus penalty on the more common read lock operations) would be worth it. - Currently we allow read locks to recurse. For one thing, w/o some way of tracking which threads hold read locks (which would be expensive) you can't verify that code doesn't break this rule unless you use WITNESS. Most of our other simple lock assertions can be verified w/o needing WITNESS, which is why I allowed this. - Because of the previous, readers don't block if the lock is read-locked but there are writers waiting. Otherwise you end up in a trivial deadlock as expounded upon further in the comments. - Because of the previous, the read unlock algorithm is quite simple: it wakes up all write waiters because that's all the waiters there are to wake up. - The algorithm for write unlock is simplistic and matches sx for now in that it prefers readers to writers. - There is no explicit lock handoffs, but our mutexes don't do that either. - Read lock operations are not currently inlined. I think this might could be done now though. At first I was worried that read lock operations would be too complex to inline and so I've only inlined write lock operations. Now that the implementation is in a working state though, it seems that there are some simple easy cases that could possibly be inlined. - Probably more I haven't thought of, etc. The changes are available as a patch and the two new files (sys/rwlock.h and kern/kern_rwlock.c) here: http://www.FreeBSD.org/~jhb/patches/rwlock/ One suggestion I have had and haven't acted on yet is to change the turnstile code to track an internal state (share locked, exclusive locked, unlocked) to make some of the assertions possibly saner. Also, there is some debugging stuff in kern_rwlock.c to map KTR_LOCK to KTR_SUBSYS so I could get ktr traces of just rwlocks w/o the traces muddied by mutex and sx lock traces. I would appreciate people looking at the turnstile changes and the implementation of the rwlocks to see if there are races I've missed, etc. -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve" = http://www.FreeBSD.org