Date:      Thu, 14 Mar 2019 14:51:19 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Mark Millard <marklmi@yahoo.com>
Cc:        Bruce Evans <brde@optusnet.com.au>, freebsd-hackers Hackers <freebsd-hackers@freebsd.org>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: TSC "skew" (was: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed])
Message-ID:  <20190314125119.GG2492@kib.kiev.ua>
In-Reply-To: <3C3486AE-DA3A-49DF-BAA5-139D4E99FADB@yahoo.com>
References:  <20190303041441.V4781@besplex.bde.org> <20190303111931.GI68879@kib.kiev.ua> <20190303223100.B3572@besplex.bde.org> <20190303161635.GJ68879@kib.kiev.ua> <20190304043416.V5640@besplex.bde.org> <20190304114150.GM68879@kib.kiev.ua> <20190305031010.I4610@besplex.bde.org> <20190305223415.U1563@besplex.bde.org> <20190313190558.GB2492@kib.kiev.ua> <3C3486AE-DA3A-49DF-BAA5-139D4E99FADB@yahoo.com>

On Wed, Mar 13, 2019 at 02:47:35PM -0700, Mark Millard wrote:
> 
> 
> On 2019-Mar-13, at 12:05, Konstantin Belousov <kostikbel@gmail.com> wrote:
> 
> >> . . .
> > I am not sure I follow. MFENCE is documented by wording that implies,
> > without any doubt, that store buffers are flushed before the
> > instruction is retired.  It is not so obvious for SFENCE, which
> > sounds like a real fence instead of a full flush, at least for normal
> > write-back memory where it is a NOP as far as the ISA is concerned.
> > 
> > It is known and documented in optimization manuals that locked
> > operations are much more efficient, but locked ops are only documented
> > to ensure ordering and not flush.  So SFENCE is not suitable as our
> > barrier.
> 
> What I've seen in papers for the C++ Load/Store Seq Cst mappings to 
> processors is:
> 
> For write-fencing style:
> 
> Load Seq Cst:                MOV from memory
> Store Seq Cst alternative 0: XCHG (which has an implicit lock prefix)
> Store Seq Cst alternative 1: MOV into memory; MFENCE
> 
> For read-fencing style:
> 
> Load Seq Cst alternative 0: LOCK XADD(0)
> Load Seq Cst alternative 1: MFENCE; MOV from memory
> Store Seq Cst:              MOV into memory
> 
> There is also:
> 
> Seq Cst Fence: MFENCE
> 
> I've never seen SFENCE (or LFENCE) suggested for any of the above.
I am not discussing the implementation of the C++11 memory model
primitives.  FWIW, FreeBSD atomic(9) uses a more optimal variant of
what you call the read-fencing style on x86.  I did not look (or
rather, do not remember what I saw) at how clang and gcc implement
the C11 memory model equivalents of load_acq and store_rel.
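
For readers mapping the quoted table onto source code, here is a minimal
C11 sketch of the two flavours mentioned: seq_cst store/load and
release/acquire.  It uses <stdatomic.h> rather than atomic(9), the
variable and function names are made up for illustration, and nothing is
claimed here about which instructions a given compiler emits for each.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared variable, for illustration only. */
static atomic_int shared_flag;

/* Sequentially consistent store and load (the "Seq Cst" rows). */
static void
publish_seq_cst(int v)
{
	atomic_store_explicit(&shared_flag, v, memory_order_seq_cst);
}

static int
observe_seq_cst(void)
{
	return (atomic_load_explicit(&shared_flag, memory_order_seq_cst));
}

/* Release store and acquire load, the C11 analogue of store_rel/load_acq. */
static void
publish_release(int v)
{
	atomic_store_explicit(&shared_flag, v, memory_order_release);
}

static int
observe_acquire(void)
{
	return (atomic_load_explicit(&shared_flag, memory_order_acquire));
}

/* Single-threaded sanity check: a load returns the last value stored. */
static void
atomics_roundtrip_check(void)
{
	publish_seq_cst(1);
	assert(observe_seq_cst() == 1);
	publish_release(2);
	assert(observe_acquire() == 2);
}
```

The check is single-threaded on purpose; it only shows the API shape,
not any ordering guarantee between threads.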

My text above is about
1. ensuring that RDTSC is not executed earlier than the previous
instructions in program order, and
2. making stores from the server thread visible to the subordinate
thread as soon as possible, so that store buffer latency is not
counted as part of the RDTSC inter-core communication latency.
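
A minimal sketch of these two points, assuming GCC/clang inline asm on
amd64.  The helper names are hypothetical and this is not the code in
the tree; it only illustrates LFENCE before RDTSC for point 1 and a
store followed by MFENCE for point 2, relying on the reading of the
MFENCE documentation quoted above.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__)
/*
 * Hypothetical helper for point 1: LFENCE keeps RDTSC from starting
 * before the preceding instructions in program order.
 */
static inline uint64_t
tsc_read_after_lfence(void)
{
	uint32_t lo, hi;

	__asm__ __volatile__("lfence; rdtsc"
	    : "=a" (lo), "=d" (hi) : : "memory");
	return ((uint64_t)hi << 32 | lo);
}

/*
 * Hypothetical helper for point 2: store, then MFENCE, so the store
 * does not linger in the store buffer once the function returns.
 */
static inline void
store_then_mfence(volatile uint32_t *p, uint32_t v)
{
	*p = v;
	__asm__ __volatile__("mfence" : : : "memory");
}

/* Single-threaded smoke test; it does not measure skew or latency. */
static inline void
tsc_sketch_check(void)
{
	volatile uint32_t slot = 0;

	store_then_mfence(&slot, 5);
	assert(slot == 5);
	(void)tsc_read_after_lfence();
}
#endif
```

In a skew test the server thread would call something like
store_then_mfence() to hand off a generation number, and each side
would take its timestamp with tsc_read_after_lfence().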

> 
> I would expect for C++ Seq Cst that the XCHG and the LOCK XADD(0)
> would need to flush store buffers in order for those alternatives
> to be valid for C++ Seq Cst.
> 
> I've seen a reference to a "locked identity operation to the top of
> stack" as another form of locked style of Seq Cst fencing.
> 
> (write-fencing and read-fencing can not be generally mixed for Seq
> Cst: they do not inter-operate.)
> 
> > And, the second point, LFENCE there does not work as barrier for IPC.
> > It only ensures that RDTSC is not started earlier than the previous
> > instructions.  No store buffer flushing is done.
> 
> 
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)


