Date:      Tue, 17 May 2011 16:46:34 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        John Baldwin <jhb@FreeBSD.org>
Cc:        Max Laier <max@love2party.net>, FreeBSD current <freebsd-current@FreeBSD.org>, neel@FreeBSD.org, Peter Grehan <grehan@FreeBSD.org>
Subject:   Re: proposed smp_rendezvous change
Message-ID:  <4DD27C3A.3040509@FreeBSD.org>
In-Reply-To: <4DD26256.2070008@FreeBSD.org>
References:  <4DCD357D.6000109@FreeBSD.org> <201105161421.27665.jhb@freebsd.org> <4DD17AB3.1070606@FreeBSD.org> <201105161609.21898.jhb@freebsd.org> <4DD22BD9.6070504@FreeBSD.org> <4DD26256.2070008@FreeBSD.org>

on 17/05/2011 14:56 John Baldwin said the following:
> On 5/17/11 4:03 AM, Andriy Gapon wrote:
>> Couldn't [Shouldn't] the whole:
>>
>>>>>       /* Ensure we have up-to-date values. */
>>>>>       atomic_add_acq_int(&smp_rv_waiters[0], 1);
>>>>>       while (smp_rv_waiters[0] < smp_rv_ncpus)
>>>>>           cpu_spinwait();
>>
>> be just replaced with:
>>
>> rmb();
>>
>> Or a proper MI function that does just a read memory barrier, if rmb() is not that.
> 
> No, you could replace it with:
> 
>     atomic_add_acq_int(&smp_rv_waiters[0], 1);

What about
(void)atomic_load_acq_int(&smp_rv_waiters[0]);

In my opinion that should ensure that the hardware posts the latest value of
smp_rv_waiters[0] from the master CPU to memory and that a slave CPU reads it from
there.  Also, because of the memory barriers inserted by store_rel on the master
CPU and load_acq on the slave CPU, the latest values of all other smp_rv_* fields
should become visible to the slave CPU.
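
To make the store_rel/load_acq pairing concrete, here is a minimal userland
sketch using C11 atomics and pthreads rather than the kernel's atomic(9) calls.
The names rv_payload, rv_ready and publish_and_observe are hypothetical
stand-ins, not the real smp_rv_* definitions.  Note that the sketch spins on
the flag, because in userland nothing else guarantees that the store has
happened before the load, so it illustrates the ordering guarantee only and
does not settle whether a single load is enough in smp_rendezvous.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the smp_rv_* fields; not the kernel's code. */
static int rv_payload;          /* plain data published by the "master" */
static atomic_int rv_ready;     /* flag: release store, acquire load */

static void *slave(void *arg)
{
    int *seen = arg;

    /* Acquire load: spin until the master's release store is visible. */
    while (atomic_load_explicit(&rv_ready, memory_order_acquire) == 0)
        ;
    /* Ordered after the acquire load, so the payload write is visible. */
    *seen = rv_payload;
    return NULL;
}

/* Publishes `value` from the caller and returns what the slave observed. */
int publish_and_observe(int value)
{
    pthread_t t;
    int seen = -1;
    int rc;

    atomic_store_explicit(&rv_ready, 0, memory_order_relaxed);
    rc = pthread_create(&t, NULL, slave, &seen);
    assert(rc == 0);

    rv_payload = value;         /* plain store to the payload */
    /* Release store: everything written above is visible to an acquirer. */
    atomic_store_explicit(&rv_ready, 1, memory_order_release);

    pthread_join(t, NULL);
    return seen;
}
```

The slave can only read rv_payload after it has seen rv_ready become 1, and
the release/acquire pair is what makes the earlier plain store visible at that
point.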

> The key being that atomic_add_acq_int() will block (either in hardware or
> software) until it can safely perform the atomic operation.  That means waiting
> until the write to set smp_rv_waiters[0] to 0 by the rendezvous initiator is
> visible to the current CPU.
> 
> On some platforms a write by one CPU may not post instantly to other CPUs (e.g. it
> may sit in a store buffer).  That is fine so long as an attempt to update that
> value atomically (using cas or a conditional-store, etc.) fails.  For those
> platforms, the atomic(9) API is required to spin until it succeeds.
> 
> This is why the mtx code spins if it can't set MTX_CONTESTED for example.
> 

Thank you for the great explanation!
Taking sparc64 as an example, I think that atomic_load_acq uses a degenerate cas
call, which should take care of hardware synchronization.

-- 
Andriy Gapon