Date: Thu, 14 Mar 2019 14:51:19 +0200
From: Konstantin Belousov
To: Mark Millard
Cc: Bruce Evans, freebsd-hackers Hackers, FreeBSD PowerPC ML
Subject: Re: TSC "skew" (was: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed])
Message-ID: <20190314125119.GG2492@kib.kiev.ua>
In-Reply-To: <3C3486AE-DA3A-49DF-BAA5-139D4E99FADB@yahoo.com>

On Wed, Mar 13, 2019 at 02:47:35PM -0700, Mark Millard wrote:
>
>
> On 2019-Mar-13, at 12:05, Konstantin Belousov wrote:
>
> >> . . .
> > I am not sure I follow. MFENCE is documented by wording that implies,
> > without any doubts, that store buffers are flushed before the
> > instruction is retired. It is not so obvious for SFENCE, which
> > sounds like a real fence instead of full flush, at least for normal
> > write-back memory where it is NOP as far as ISA is considered.
> >
> > It is known and documented in optimization manuals that locked
> > operations are much more efficient, but locked ops are only documented
> > to ensure ordering and not flush. So SFENCE is not suitable as our
> > barrier.
>
> What I've seen in papers for the C++ Load/Store Seq Cst mappings to
> processors is:
>
> For write-fencing style:
>
> Load Seq Cst: MOV from memory
> Store Seq Cst alternative 0: XCHG (which as an implicit lock prefix)
> Store Seq Cst alternative 1: MOV into memory; MFENCE
>
> For read-fencing style:
>
> Load Seq Cst alternative 0: LOCK XADD(0)
> Load Seq Cst alternative 1: MFENCE; MOV from memory
> Store Seq Cst: MOV into memory
>
> There is also:
>
> Seq Cst Fence: MFENCE
>
> I've never seen SFENCE (or LFENCE) suggested for any of the above.

I am not discussing the implementation of the C++11 memory model
primitives. FWIW, FreeBSD atomic(9) uses a more optimal variant of what
you call the read-fencing style on x86. I did not look (or rather, do
not remember what I saw) at the implementation of the C1x memory model
load_acq and store_rel in clang and gcc.

My text above is about
1. ensuring that RDTSC is executed not earlier than the previous
   instructions in the program order, and
2. making stores from the server thread visible to the subordinate
   thread as soon as possible, so that the store buffer latency is not
   counted as part of the RDTSC inter-core communication latency.

>
> I would expect for C++ Seq Cst that the XCHG and the LOCK XADD(0)
> would need to flush store buffers in order for those alternatives
> to be valid for C++ Seq Cst.
>
> I've seen a reference to a "locked identity operation to the top of
> stack" as another form of locked style of Seq Cst fencing.
>
> (write-fencing and read-fencing can not be generally mixed for Seq
> Cst: they do not inter-operate.)
>
> > And, the second point, LFENCE there does not work as barrier for IPC.
> > It only ensures that RDTSC is not started earlier than the previous
> > instructions. No store buffer flushing is done.
>
>
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
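
For illustration only, the two goals listed in the reply (keeping RDTSC
from starting before earlier instructions, and pushing a store out of the
store buffer promptly) could be sketched in C roughly as follows. This is
not the code under discussion in the thread: the helper names
ordered_rdtsc() and publish_sample() are hypothetical, the compiler
intrinsics stand in for whatever the kernel actually uses, and the
non-x86 branch is only a placeholder so that the sketch compiles
elsewhere. The use of MFENCE follows the reading of the documentation
given in this message and is an assumption, not a measured result.

```c
#include <stdatomic.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__SSE2__)
#include <x86intrin.h>		/* __rdtsc(), _mm_lfence(), _mm_mfence() */
#define	SKETCH_HAVE_TSC	1
#else
#include <time.h>		/* timespec_get() for the placeholder path */
#define	SKETCH_HAVE_TSC	0
#endif

/*
 * Hypothetical helper for goal 1: LFENCE before RDTSC, so that RDTSC is
 * not started earlier than the preceding instructions.  LFENCE is not
 * expected to flush the store buffer.
 */
static inline uint64_t
ordered_rdtsc(void)
{
#if SKETCH_HAVE_TSC
	_mm_lfence();
	return (__rdtsc());
#else
	/* Placeholder only: wall-clock nanoseconds, not a cycle counter. */
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return ((uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec);
#endif
}

/*
 * Hypothetical helper for goal 2: store the sample, then issue MFENCE so
 * that the store does not linger in the store buffer (assumption: MFENCE
 * behaves as described in the message above).
 */
static inline void
publish_sample(_Atomic uint64_t *slot, uint64_t value)
{
	atomic_store_explicit(slot, value, memory_order_relaxed);
#if SKETCH_HAVE_TSC
	_mm_mfence();
#else
	/* Portable stand-in: a sequentially consistent fence. */
	atomic_thread_fence(memory_order_seq_cst);
#endif
}
```

In such a sketch the server thread would call
publish_sample(&slot, ordered_rdtsc()) and the subordinate thread would
read the slot and then take its own ordered_rdtsc() reading; the
difference between the two readings then includes the inter-core
communication latency, which is why the store is fenced.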