From owner-freebsd-smp  Tue Jan  2 11:36:52 2001
From owner-freebsd-smp@FreeBSD.ORG  Tue Jan  2 11:36:46 2001
Return-Path: <owner-freebsd-smp@FreeBSD.ORG>
Delivered-To: freebsd-smp@freebsd.org
Received: from meow.osd.bsdi.com (meow.osd.bsdi.com [204.216.28.88])
	by hub.freebsd.org (Postfix) with ESMTP id E2E5437B715
	for <smp@FreeBSD.org>; Tue,  2 Jan 2001 11:36:43 -0800 (PST)
Received: from laptop.baldwin.cx (john@jhb-laptop.osd.bsdi.com [204.216.28.241])
	by meow.osd.bsdi.com (8.11.1/8.9.3) with ESMTP id f02JZtG01040;
	Tue, 2 Jan 2001 11:35:55 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Message-ID: <XFMail.010102113619.jhb@FreeBSD.org>
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <200101020510.WAA13199@fast.cs.utah.edu>
Date: Tue, 02 Jan 2001 11:36:19 -0800 (PST)
From: John Baldwin <jhb@FreeBSD.org>
To: Kevin Van Maren <vanmaren@fast.cs.utah.edu>
Subject: Re: atomic increment?
Cc: smp@FreeBSD.org, cp@bsdi.com
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


On 02-Jan-01 Kevin Van Maren wrote:
> I didn't see the Jason Evans flame thread on -arch.  Does
> anyone have a pointer to it in the mail archive?
> 
> In the interim, I think atomic_{increment,decrement}, even if
> they are just syntactic sugar to atomic_{add,subtract}, should be
> provided.  After all, we use "++" as syntactic sugar to "+=1".
> [The fact that gcc uses an intermediate register to add an immediate
> constant is bogus, and not sufficient reason by itself to use
> atomic_increment.]  On x86 the code is slightly better for "++"
> vs. "+=1", while it really is just syntactic sugar on other systems.

I think there is also a desire to keep the atomic API from growing too large.
Atomic operations are _expensive_.  One thing you are forgetting is that on the
x86 an atomic op on an SMP system requires a 'lock' prefix.  The cost of
locking the bus drowns out whatever you save by using one or two fewer
instructions.

> However, I also have another thought.  Often times I need to
> modify a value and also (atomically) determine its (old|new)
> value.  Primitives that use "xadd" instead of "add" or "sub"
> provide atomicity and eliminate an extra read.  Yes, trashing
> the added value may cause a register to spill if the value is
> needed again later, but often times it is never used again anyway.
> [pre-processor can negate the "sub"; worst case we need an extra
> "neg" instruction for atomic_subtract, but the subtraction will
> still be atomic.]  Essentially, the (old) value is available for
> free, so why not provide it?  It might even make sense to always
> provide the old value for add/subtract, and have gcc throw away
> the unused output (unless the input value is reused, when we'd
> lose a whole register to the xadd).

This might very well be a good idea.  Each of the atomic ops that currently
returns void could return the new value instead (that is probably the easiest
way to do it).  The ia64, for example, has a 'fetchadd' instruction.
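
To make the idea concrete, here is a minimal sketch of add/subtract ops that
return the new value.  It assumes C11 <stdatomic.h> rather than our own
machine/atomic.h, and the names atomic_add_long_return() and
atomic_subtract_long_return() are hypothetical, not part of any existing API.

```c
#include <stdatomic.h>

/*
 * Hypothetical helper (not an existing kernel primitive): atomically add
 * v to *p and return the new value.  atomic_fetch_add() hands back the
 * old value, so the new value is old + v.
 */
static inline long
atomic_add_long_return(atomic_long *p, long v)
{
	return (atomic_fetch_add(p, v) + v);
}

/*
 * Hypothetical helper: atomically subtract v from *p and return the new
 * value.
 */
static inline long
atomic_subtract_long_return(atomic_long *p, long v)
{
	return (atomic_fetch_sub(p, v) - v);
}
```

A caller that ignores the return value gets the same behaviour as the void
versions, so nothing is lost by always returning it.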

> Even if the processor does not support xadd-like operations, it
> can be emulated (more expensively) using load, add, cmpxchg, loop-
> if-failed [similar to the code frag below, but with a while() loop.]
> But by providing a primitive, it can be optimized much further
> than just C code using an atomic cmpxchg operation (and is
> "negative cost" on x86 -- faster than the non-atomic version).

Yes, an atomic_cmpset() loop is how the atomic ops are performed on the ia64.
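
Roughly, such a loop looks like the sketch below.  It is written against C11
<stdatomic.h> so that it stands alone; cmpset_long() and fetchadd_long() are
hypothetical names standing in for the real primitives, and this is not the
actual ia64 code.

```c
#include <stdatomic.h>

/*
 * Hypothetical stand-in for atomic_cmpset: if *p equals oldval, store
 * newval and return nonzero; otherwise leave *p alone and return 0.
 */
static inline int
cmpset_long(atomic_long *p, long oldval, long newval)
{
	return (atomic_compare_exchange_strong(p, &oldval, newval));
}

/*
 * Emulated fetch-and-add: load, add, cmpset, and retry if another CPU
 * changed *p in between.  Returns the value *p held before the add.
 */
static inline long
fetchadd_long(atomic_long *p, long v)
{
	long old;

	do {
		old = atomic_load(p);
	} while (!cmpset_long(p, old, old + v));
	return (old);
}
```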

> In Julian's acquire_writer, we need to do an atomic compare-and-swap
> operation, instead of assuming two operations are atomic (because the
> above acquire_reader code could be executed between the two following
> statements):
> 
>  => if ((ngq->q_flags & (~SINGLE_THREAD_ONLY)) == 0) {
>  =>   atomic_add_long(&ngq->q_flags, WRITER_ACTIVE);
> 
> Here is a possible code sequence to "just get it working" (at least
> I *think* this fixes the alleged problem):
>     [ register int flags; ]
>  => flags = ngq->q_flags;
>  => if (((flags & (~SINGLE_THREAD_ONLY)) == 0) &&
>  =>     atomic_cmpset(&ngq->q_flags, flags, flags + WRITER_ACTIVE)) {

Yes, this does look correct.
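
For reference, a self-contained sketch of the same check-then-cmpset pattern
follows.  It uses C11 <stdatomic.h> in place of atomic_cmpset(), and the
struct layout, the flag values, and the name try_acquire_writer() are all
assumptions made up for illustration; they are not taken from Julian's code.

```c
#include <stdatomic.h>

/* Placeholder flag values, chosen only for this sketch. */
#define SINGLE_THREAD_ONLY	0x1L
#define WRITER_ACTIVE		0x2L

/* Hypothetical stand-in for the queue structure. */
struct ng_queue_sketch {
	atomic_long q_flags;
};

/*
 * Try to become the writer.  Snapshot the flags once, test the
 * snapshot, and only install WRITER_ACTIVE if the flags still match
 * the snapshot.  Returns 1 on success and 0 if the queue was busy or
 * another CPU changed the flags between the test and the update.
 */
static int
try_acquire_writer(struct ng_queue_sketch *ngq)
{
	long flags;

	flags = atomic_load(&ngq->q_flags);
	if ((flags & ~SINGLE_THREAD_ONLY) != 0)
		return (0);
	return (atomic_compare_exchange_strong(&ngq->q_flags, &flags,
	    flags + WRITER_ACTIVE));
}
```

The point is that the test and the update both refer to the same snapshot, so
a reader sneaking in between them makes the cmpset fail instead of going
unnoticed.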

> One more thought on atomic operations: If we don't assume assignments
> are atomic, and always use atomic_load and atomic_store, then we a) can
> easily provide atomic 64-bit operations on x86 (quick hack would be
> to use a single mutex for all 64-bit operations), and b) we can port
> to platforms where atomic_add requires a mutex to protect the atomic_add
> or atomic_cmpset sequence.  [Slow as molasses]  On x86, the load/store
> macros are NOPs, but the use also (c) makes it clear that we are
> manipulating a variable we perform atomic operations on.

Note that the only atomic_load and atomic_store primitives are those that
include memory barriers (and for that matter I think they are broken on the
x86; I think they need to use a lock'd cmpxchgl in the load case and a lock'd
xchgl in the store case).
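
As an illustration of what "load and store with barriers" means, here is a
minimal sketch in terms of C11 <stdatomic.h>.  The names load_acq_long() and
store_rel_long() are hypothetical, and the sketch leaves the choice of
instructions to the compiler; it says nothing about which x86 instructions the
kernel versions ought to use.

```c
#include <stdatomic.h>

/*
 * Hypothetical helper: load *p with acquire semantics, so memory
 * accesses after the load in program order are not reordered ahead
 * of it.
 */
static inline long
load_acq_long(atomic_long *p)
{
	return (atomic_load_explicit(p, memory_order_acquire));
}

/*
 * Hypothetical helper: store v to *p with release semantics, so
 * memory accesses before the store in program order are visible
 * before the store itself.
 */
static inline void
store_rel_long(atomic_long *p, long v)
{
	atomic_store_explicit(p, v, memory_order_release);
}
```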

> Kevin

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

