From owner-cvs-src@FreeBSD.ORG Sat Jul 31 11:33:28 2004 Return-Path: Delivered-To: cvs-src@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DC31816A4CE; Sat, 31 Jul 2004 11:33:28 +0000 (GMT) Received: from mailout2.pacific.net.au (mailout2.pacific.net.au [61.8.0.85]) by mx1.FreeBSD.org (Postfix) with ESMTP id 53B2743D39; Sat, 31 Jul 2004 11:33:28 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.0.87])i6VBWnje022396; Sat, 31 Jul 2004 21:32:49 +1000 Received: from epsplex.bde.org (katana.zip.com.au [61.8.7.246]) i6VBWkic011941; Sat, 31 Jul 2004 21:32:47 +1000 Date: Sat, 31 Jul 2004 21:32:46 +1000 (EST) From: Bruce Evans X-X-Sender: bde@epsplex.bde.org To: Robert Watson In-Reply-To: Message-ID: <20040731202525.H1082@epsplex.bde.org> References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: src-committers@FreeBSD.org cc: Pawel Jakub Dawidek cc: jhb@FreeBSD.org cc: Alan Cox cc: cvs-src@FreeBSD.org cc: Scott Long cc: cvs-all@FreeBSD.org Subject: Re: cvs commit: src/sys/alpha/alpha pmap.c src/sys/amd64/amd64pmap.c src/sys/i386/i386 pmap.c src/sys/vm vm_page.c X-BeenThere: cvs-src@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: CVS commit messages for the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 31 Jul 2004 11:33:29 -0000 On Thu, 29 Jul 2004, Robert Watson wrote: > On Thu, 29 Jul 2004, Pawel Jakub Dawidek wrote: > > > Hmm, why? Code seems to be very simple, I would even risk: simpler than > > when lock is obtained at first time (but still one atomic operation...). > > I'm not saying here that you don't know what you're talking about, I'm > > just looking for an explanation. > > An uncontended mutex acquire is one atomic cmpset and one dereference to > curthread in an inline/macro. > > A recursed mutex operation performs the atomic operation, but finds the > lock is already owned, so falls back into _mtx_lock_sleep(), a function. > That function then re-dereferences curthread (optimization opportunity), > calls mtx_owned(), which re-dereferences curthread again (optimization > opportunity), increment the recursion counter as a regular integer > operation, and use an atomic operation to set the MTX_RECURSED flag. Er, there are no extra atomic operations for recursive mutexes. E.g., there isn't one to set the MTX_RECURSED flag, since that flag is only set during initialization to indicate that recursion is permitted; recursion consists mainly of unlocked (but implicitly atomic) increments and decrements of the recursion counter. Recursive mutexes aren't really expensive. We just don't optimize for them. They basically just take one extra comparison operation and an increment or decrement, as in the spin mutex macros. Normal use of recursive mutexes produces lots of self-contention on the lock, so some of the less optimized code paths are taken more often than for non-recursive mutexes. Not optimizing for recursive mutexes is correct since recursive mutexes shouldn't exist. > I.e., don't recurse except in unusual cases. It looks like we can > optimize out two uses of curthread though, which might lighten the cost of > recursion. While most of our locking doesn't do recursion, Giant can and > does recurse and it might help to do that optimization. > > I found that eliminating an extra curthread dereference from > critical_enter()/critical_exit() made a noticeable difference for UP > kernel-intensive benchmarks, FWIW. My version has spin mutexes uninlined with critical_*() inlined instead. This reduces curthread references as a side effect. The corresponding uninlining for sleep mutexes doesn't work as well since it gives extra function calls. curthread should be declared __pure or something so that gcc can automatically avoid reloading it. I think it is sufficiently invariant for this in most or all places. It is more invariant than most global variables. Bruce