From owner-cvs-src@FreeBSD.ORG  Sat Jul 31 11:33:28 2004
Return-Path: <owner-cvs-src@FreeBSD.ORG>
Delivered-To: cvs-src@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id DC31816A4CE; Sat, 31 Jul 2004 11:33:28 +0000 (GMT)
Received: from mailout2.pacific.net.au (mailout2.pacific.net.au [61.8.0.85])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 53B2743D39; Sat, 31 Jul 2004 11:33:28 +0000 (GMT)
	(envelope-from bde@zeta.org.au)
Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au
	[61.8.0.87])i6VBWnje022396;	Sat, 31 Jul 2004 21:32:49 +1000
Received: from epsplex.bde.org (katana.zip.com.au [61.8.7.246])
	i6VBWkic011941;	Sat, 31 Jul 2004 21:32:47 +1000
Date: Sat, 31 Jul 2004 21:32:46 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
X-X-Sender: bde@epsplex.bde.org
To: Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <Pine.NEB.3.96L.1040729155850.46374L-100000@fledge.watson.org>
Message-ID: <20040731202525.H1082@epsplex.bde.org>
References: <Pine.NEB.3.96L.1040729155850.46374L-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: src-committers@FreeBSD.org
cc: Pawel Jakub Dawidek <pjd@FreeBSD.org>
cc: jhb@FreeBSD.org
cc: Alan Cox <alc@FreeBSD.org>
cc: cvs-src@FreeBSD.org
cc: Scott Long <scottl@FreeBSD.org>
cc: cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/alpha/alpha pmap.c src/sys/amd64/amd64pmap.c
 src/sys/i386/i386 pmap.c src/sys/vm vm_page.c
X-BeenThere: cvs-src@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: CVS commit messages for the src tree <cvs-src.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/cvs-src>
List-Post: <mailto:cvs-src@freebsd.org>
List-Help: <mailto:cvs-src-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 31 Jul 2004 11:33:29 -0000

On Thu, 29 Jul 2004, Robert Watson wrote:

> On Thu, 29 Jul 2004, Pawel Jakub Dawidek wrote:
>
> > Hmm, why? Code seems to be very simple, I would even risk: simpler than
> > when lock is obtained at first time (but still one atomic operation...).
> > I'm not saying here that you don't know what you're talking about, I'm
> > just looking for an explanation.
>
> An uncontended mutex acquire is one atomic cmpset and one dereference to
> curthread in an inline/macro.
>
> A recursed mutex operation performs the atomic operation, but finds the
> lock is already owned, so falls back into _mtx_lock_sleep(), a function.
> That function then re-dereferences curthread (optimization opportunity),
> calls mtx_owned(), which re-dereferences curthread again (optimization
> opportunity), increment the recursion counter as a regular integer
> operation, and use an atomic operation to set the MTX_RECURSED flag.

Er, there are no extra atomic operations for recursive mutexes.  E.g.,
there isn't one to set the MTX_RECURSED flag, since that flag is only
set during initialization to indicate that recursion is permitted;
recursion consists mainly of unlocked (but implicitly atomic) increments
and decrements of the recursion counter.

Recursive mutexes aren't really expensive.  We just don't optimize for
them.  They basically just take one extra comparison operation and an
increment or decrement, as in the spin mutex macros.  Normal use of
recursive mutexes produces lots of self-contention on the lock, so
some of the less optimized code paths are taken more often than for
non-recursive mutexes.  Not optimizing for recursive mutexes is correct
since recursive mutexes shouldn't exist.

> I.e., don't recurse except in unusual cases.  It looks like we can
> optimize out two uses of curthread though, which might lighten the cost of
> recursion.  While most of our locking doesn't do recursion, Giant can and
> does recurse and it might help to do that optimization.
>
> I found that eliminating an extra curthread dereference from
> critical_enter()/critical_exit() made a noticeable difference for UP
> kernel-intensive benchmarks, FWIW.

My version has spin mutexes uninlined with critical_*() inlined instead.
This reduces curthread references as a side effect.  The corresponding
uninlining for sleep mutexes doesn't work as well since it gives extra
function calls.

curthread should be declared __pure or something so that gcc can
automatically avoid reloading it.  I think it is sufficiently invariant
for this in most or all places.  It is more invariant than most global
variables.

Bruce