Date: Tue, 02 Jan 2001 11:36:19 -0800 (PST)
From: John Baldwin
To: Kevin Van Maren
Cc: smp@FreeBSD.org, cp@bsdi.com
Subject: Re: atomic increment?
In-Reply-To: <200101020510.WAA13199@fast.cs.utah.edu>

On 02-Jan-01 Kevin Van Maren wrote:
> I didn't see the Jason Evans flame thread on -arch.  Does
> anyone have a pointer to it in the mail archive?
>
> In the interim, I think atomic_{increment,decrement}, even if
> they are just syntactic sugar to atomic_{add,subtract}, should be
> provided.  After all, we use "++" as syntactic sugar to "+=1".
> [The fact that gcc uses an intermediate register to add an immediate
> constant is bogus, and not sufficient reason by itself to use
> atomic_increment.]  On x86 the code is slightly better for "++"
> vs. "+=1", while it really is just syntactic sugar on other systems.

There is also a desire to keep the atomic API from getting too huge, I
think.  Atomic operations are _expensive_.  One thing you are forgetting
about on the x86 is that an atomic op on an SMP system requires a 'lock'
prefix.  The cost of locking the bus drowns out whatever you save by
shaving off one or two instructions.

> However, I also have another thought.
> Often times I need to
> modify a value and also (atomically) determine its (old|new)
> value.  Primitives that use "xadd" instead of "add" or "sub"
> provide atomicity and eliminate an extra read.  Yes, trashing
> the added value may cause a register to spill if the value is
> needed again later, but often times it is never used again anyway.
> [pre-processor can negate the "sub"; worst case we need an extra
> "neg" instruction for atomic_subtract, but the subtraction will
> still be atomic.]  Essentially, the (old) value is available for
> free, so why not provide it?  It might even make sense to always
> provide the old value for add/subtract, and have gcc throw away
> the unused output (unless the input value is reused, when we'd
> lose a whole register to the xadd).

This might very well be a good idea: have each of the atomic ops that
currently return void return the new value instead (that is probably the
easiest way to do it).  The ia64 has a 'fetchadd' instruction, for example.

> Even if the processor does not support xadd-like operations, it
> can be emulated (more expensively) using load, add, cmpxchg,
> loop-if-failed [similar to the code frag below, but with a while() loop.]
> But by providing a primitive, it can be optimized much further
> than just C code using an atomic cmpxchg operation (and is
> "negative cost" on x86 -- faster than the non-atomic version).

Yes, an atomic_cmpset() loop is how the atomic ops are performed on the ia64.
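
For illustration only, here is a minimal sketch of the load, add,
compare-and-swap, retry-on-failure loop described above.  It is not the
kernel's code: it assumes a C11 compiler and uses <stdatomic.h> as a
stand-in for atomic_cmpset(), and the name fetch_add_via_cas is
hypothetical.

```c
#include <stdatomic.h>

/*
 * Hypothetical helper, not a FreeBSD API: emulate an xadd-style
 * fetch-and-add using only compare-and-swap.  Returns the value
 * that was stored at *p before the addition (the "old" value).
 */
static unsigned int
fetch_add_via_cas(atomic_uint *p, unsigned int delta)
{
	unsigned int old = atomic_load(p);

	/*
	 * If *p no longer equals 'old', the compare-exchange fails and
	 * refreshes 'old' with the current value, so we simply retry.
	 */
	while (!atomic_compare_exchange_weak(p, &old, old + delta))
		;
	return (old);
}
```

A caller that wants the new value instead can add delta to the returned
value itself; either way the update is a single atomic step.
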
> In Julian's acquire_writer, we need to do an atomic compare-and-swap
> operation, instead of assuming two operations are atomic (because the
> above acquire_reader code could be executed between the two following
> statements):
>
> =>    if ((ngq->q_flags & (~SINGLE_THREAD_ONLY)) == 0) {
> =>            atomic_add_long(&ngq->q_flags, WRITER_ACTIVE);
>
> Here is a possible code sequence to "just get it working" (at least
> I *think* this fixes the alleged problem):
> [ register int flags; ]
> =>    flags = ngq->q_flags;
> =>    if (((flags & (~SINGLE_THREAD_ONLY)) == 0) &&
> =>        atomic_cmpset(&ngq->q_flags, flags, flags + WRITER_ACTIVE)) {

Yes, this does look correct.

> One more thought on atomic operations: If we don't assume assignments
> are atomic, and always use atomic_load and atomic_store, then we a) can
> easily provide atomic 64-bit operations on x86 (quick hack would be
> to use a single mutex for all 64-bit operations), and b) we can port
> to platforms where atomic_add requires a mutex to protect the atomic_add
> or atomic_cmpset sequence.  [Slow as molasses]  On x86, the load/store
> macros are NOPs, but the use also (c) makes it clear that we are
> manipulating a variable we perform atomic operations on.

Note that the only atomic_load and atomic_store primitives are those that
include memory barriers (and I think they are broken on the x86 for that
matter; they need to use a lock'd cmpxchgl in the load case and a lock'd
xchgl in the store case).

> Kevin

-- 

John Baldwin -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/