From owner-freebsd-current@FreeBSD.ORG  Wed May  5 14:23:44 2004
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 62D8916A4CE
	for <freebsd-current@freebsd.org>;
	Wed,  5 May 2004 14:23:44 -0700 (PDT)
Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1])
	by mx1.FreeBSD.org (Postfix) with ESMTP id C82C943D48
	for <freebsd-current@freebsd.org>;
	Wed,  5 May 2004 14:23:41 -0700 (PDT)
	(envelope-from gallatin@cs.duke.edu)
Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu
	[152.3.145.30])
	by duke.cs.duke.edu (8.12.10/8.12.10) with ESMTP id i45LNfxZ019919
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 5 May 2004 17:23:41 -0400 (EDT)
Received: (from gallatin@localhost)
	by grasshopper.cs.duke.edu (8.12.9p2/8.12.9/Submit) id i45LNUft001054;
	Wed, 5 May 2004 17:23:30 -0400 (EDT)
	(envelope-from gallatin)
From: Andrew Gallatin <gallatin@cs.duke.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <16537.23378.375946.857908@grasshopper.cs.duke.edu>
Date: Wed, 5 May 2004 17:23:30 -0400 (EDT)
To: Bruce Evans <bde@zeta.org.au>
In-Reply-To: <20040505222636.H15444@gamplex.bde.org>
References: <FE045D4D9F7AED4CBFF1B3B813C85337021AB377@mail.sandvine.com>
	<20040505222636.H15444@gamplex.bde.org>
X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid
cc: freebsd-current@freebsd.org
cc: Gerrit Nagelhout <gnagelhout@sandvine.com>
Subject: RE: 4.7 vs 5.2.1 SMP/UP bridging performance
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 05 May 2004 21:23:44 -0000


Bruce Evans writes:

 > 
 > Athlon XP2600 UP system:  !SMP case: 22 cycles   SMP case: 37 cycles
 > Celeron 366 SMP system:              35                    48
 > 
 > The extra cycles for the SMP case are just the extra cost of a one lock
 > instruction.  Note that SMP should cost twice as much extra, but the
 > non-SMP atomic_store_rel_int(&slock, 0) is pessimized by using xchgl
 > which always locks the bus.  After fixing this:
 > 
 > Athlon XP2600 UP system:  !SMP case:  6 cycles   SMP case: 37 cycles
 > Celeron 366 SMP system:              10                    48
 > 
 > Mutexes take longer than simple locks, but not much longer unless the
 > lock is contested.  In particular, they don't lock the bus any more
 > and the extra cycles for locking dominate (even in the !SMP case due
 > to the pessimization).
 > 
 > So there seems to be something wrong with your benchmark.  Locking the
 > bus for the SMP case always costs about 20+ cycles, but this hasn't
 > changed since RELENG_4 and mutexes can't be made much faster in the
 > uncontested case since their overhead is dominated by the bus lock
 > time.
 > 

Actually, I think his tests are accurate and bus locked instructions
take an eternity on P4.  See
http://www.uwsg.iu.edu/hypermail/linux/kernel/0109.3/0687.html 

For example, with your test above, I see 212 cycles for the UP case on
a 2.53GHz P4.  Replacing the atomic_store_rel_int(&slock, 0) with a
simple slock = 0; reduces that count to 18 cycles.

If its really safe to remove the xchg* from non-SMP atomic_store_rel*,
then I think you should do it.  Of course, that still leaves mutexes
as very expensive on SMP (253 cycles on the 2.53GHz from above).

Drew