Date:      Wed, 3 Jan 2001 11:39:29 -0700 (MST)
From:      Kevin Van Maren <vanmaren@fast.cs.utah.edu>
To:        jhb@freebsd.org, vanmaren@fast.cs.utah.edu
Cc:        smp@freebsd.org
Subject:   Re: atomic increment?
Message-ID:  <200101031839.LAA08694@fast.cs.utah.edu>

> > load/cmpxchg code shouldn't be too hard: you just need a scratch
> > register, set it equal to eax (and ANY garbage value), and LOCK
> > cmpxchg it with the address.  The read value is placed in (part of,
> > for 8/16 bit ops) eax.  One register-register mov and one atomic
> > memory RMW cycle (which works because Intel always does the RMW
> > cycle; it writes back the original value if the cmp fails, and
> > eax will always contain the value that was in memory).  Should
> > be a pretty efficient inline asm.
> > 
> > Do you want me to send in a patch?
> 
> If you'd like, sure.  I was planning to use a lock'd xchgl for the store, but
> wasn't sure if that would work properly.  Trying to use a cmpxchgl for the
> store would get ugly as you would have to do a loop until it actually did a
> store.  Yuck.

For the store you'd want to do a straight xchgl, which will even
return the old value (see my previous message about adding variants
that return the old values...).
We can even save a byte in the code by not using the LOCK prefix, since
LOCK is always asserted for an xchg with a memory operand.
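
A rough sketch of what that store could look like as GCC-style inline asm.
The name swap_32 is made up for illustration and is not the name used in
the tree, and the non-x86 branch is only there so the sketch compiles
elsewhere:

```c
#include <stdint.h>

/*
 * Hypothetical helper: atomically store `val` into *p and return the
 * value that was there before.
 */
static inline uint32_t
swap_32(volatile uint32_t *p, uint32_t val)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__(
	    "xchgl %0, %1"	/* no LOCK prefix: xchg with memory locks implicitly */
	    : "+r" (val),	/* in: new value, out: old value */
	      "+m" (*p)
	    :
	    : "memory");
	return (val);
#else
	/* Portable fallback (assumption: GCC/Clang __atomic builtins). */
	return (__atomic_exchange_n(p, val, __ATOMIC_SEQ_CST));
#endif
}
```

A caller that only wants the store can simply ignore the return value.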

Hmmm.  Actually, with writes being visible in-order on IA32, even on
SMP, the store's release semantics may be okay as is?  Okay, now I'M
confusing myself.  It won't provide memory fence semantics or strong
ordering, but that isn't required by a store_rel.  If the PPro retires
instructions in order, and doesn't "execute" writes until they are
retired, then the write can't complete before a previous read, so the
current store code already has release semantics.  Right?
On IA64, a release doesn't have to occur before a following acquire;
so we can allow a subsequent read to "pass" this store, as long as
the CPU executes the store after all previous instructions.
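
If that reasoning holds, a release store on IA32 needs nothing more than a
plain store that the compiler is not allowed to reorder. A minimal sketch
under that assumption; store_rel_32 is a hypothetical name, and the
non-x86 branch is just a portable fallback:

```c
#include <stdint.h>

/*
 * Hypothetical helper: store with release semantics.  On x86 this
 * relies on the hardware making stores visible in program order, so
 * only the compiler has to be kept from moving earlier accesses
 * below the store.
 */
static inline void
store_rel_32(volatile uint32_t *p, uint32_t val)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("" : : : "memory");	/* compiler barrier only */
	*p = val;					/* plain store */
#else
	/* Portable fallback (assumption: GCC/Clang __atomic builtins). */
	__atomic_store_n(p, val, __ATOMIC_RELEASE);
#endif
}
```

Note this gives no fence: a later load may still pass the store, which is
the same freedom described above for IA64.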

Certainly the read needs to be changed to provide acquire semantics.
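
Roughly what the load/cmpxchg idea quoted above could look like. This is a
sketch, not the patch itself: load_acq_32 is a hypothetical name, and the
scratch value is passed as a separate operand so the compiler picks the
register:

```c
#include <stdint.h>

/*
 * Hypothetical helper: load with acquire semantics via LOCK cmpxchg.
 * eax and the scratch register hold the same arbitrary value.  If
 * memory happens to match, the same value is written back; if not,
 * memory is loaded into eax.  Either way eax ends up holding the
 * memory contents and memory is unchanged.
 */
static inline uint32_t
load_acq_32(volatile uint32_t *p)
{
#if defined(__i386__) || defined(__x86_64__)
	uint32_t res = 0;		/* any value works */
	uint32_t scratch = res;		/* must equal the value in eax */

	__asm__ __volatile__(
	    "lock; cmpxchgl %2, %1"
	    : "+a" (res),		/* eax: compare value in, result out */
	      "+m" (*p)
	    : "r" (scratch)
	    : "memory", "cc");
	return (res);
#else
	/* Portable fallback (assumption: GCC/Clang __atomic builtins). */
	return (__atomic_load_n(p, __ATOMIC_ACQUIRE));
#endif
}
```

Since the instruction always performs the write half of the RMW cycle, the
location has to be writable even though the value never changes.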

I'll look at writing a patch when I get off work tonight...

Maybe I'll look at writing 64-bit operations using cmpxchg8b too (but
that requires a Pentium -- and there were some dual 486 PCs built, not
that we'd ever support the second CPU on them anyway).  On uni-procs,
we can disable interrupts and be okay.
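
A sketch of how the 64-bit versions might go with cmpxchg8b on a Pentium
or later. The names load_64 and store_64 are made up here. The load uses
the same trick as the 32-bit case; the store has to loop, since there is
no 64-bit xchg to fall back on:

```c
#include <stdint.h>

/* Hypothetical helper: atomic 64-bit load using LOCK cmpxchg8b. */
static inline uint64_t
load_64(volatile uint64_t *p)
{
#if defined(__i386__) || defined(__x86_64__)
	uint32_t lo = 0, hi = 0;	/* edx:eax guess, any value works */

	__asm__ __volatile__(
	    "lock; cmpxchg8b %2"
	    : "+a" (lo), "+d" (hi), "+m" (*p)
	    : "b" (lo), "c" (hi)	/* ecx:ebx == edx:eax, so memory is unchanged */
	    : "memory", "cc");
	return (((uint64_t)hi << 32) | lo);
#else
	/* Portable fallback (assumption: GCC/Clang __atomic builtins). */
	return (__atomic_load_n(p, __ATOMIC_ACQUIRE));
#endif
}

/* Hypothetical helper: atomic 64-bit store, retrying until it lands. */
static inline void
store_64(volatile uint64_t *p, uint64_t val)
{
#if defined(__i386__) || defined(__x86_64__)
	uint32_t lo = 0, hi = 0;	/* first guess; refreshed on each failure */

	__asm__ __volatile__(
	    "1:\tlock; cmpxchg8b %2\n\t"
	    "jne 1b"			/* on mismatch edx:eax now holds memory; retry */
	    : "+a" (lo), "+d" (hi), "+m" (*p)
	    : "b" ((uint32_t)val), "c" ((uint32_t)(val >> 32))
	    : "memory", "cc");
#else
	__atomic_store_n(p, val, __ATOMIC_RELEASE);
#endif
}
```

The disable-interrupts variant for uni-processor kernels is not shown,
since it only makes sense inside the kernel.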

Kevin

