From owner-freebsd-arch@FreeBSD.ORG Tue Aug 19 19:24:17 2014
Reply-To: alc@freebsd.org
In-Reply-To: <20140817012646.GA21025@dft-labs.eu>
References: <1408064112-573-1-git-send-email-mjguzik@gmail.com> <1408064112-573-2-git-send-email-mjguzik@gmail.com> <20140816093811.GX2737@kib.kiev.ua> <20140816185406.GD2737@kib.kiev.ua> <20140817012646.GA21025@dft-labs.eu>
Date: Tue, 19 Aug 2014 14:24:16 -0500
Subject: Re: [PATCH 1/2] Implement simple sequence
 counters with memory barriers.
From: Alan Cox
To: Mateusz Guzik
Cc: Konstantin Belousov, Johan Schuijt, "freebsd-arch@freebsd.org"

On Sat, Aug 16, 2014 at 8:26 PM, Mateusz Guzik wrote:

> On Sat, Aug 16, 2014 at 09:54:06PM +0300, Konstantin Belousov wrote:
> > On Sat, Aug 16, 2014 at 12:38:11PM +0300, Konstantin Belousov wrote:
> > > On Fri, Aug 15, 2014 at 02:55:11AM +0200, Mateusz Guzik wrote:
> > > > ---
> > > > sys/sys/seq.h | 126 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > 1 file changed, 126 insertions(+)
> > > > create mode 100644 sys/sys/seq.h
> > > >
> > > > diff --git a/sys/sys/seq.h b/sys/sys/seq.h
> > > > new file mode 100644
> > > > index 0000000..0971aef
> > > > --- /dev/null
> > > > +++ b/sys/sys/seq.h
> [..]
> > > > +#ifndef _SYS_SEQ_H_
> > > > +#define _SYS_SEQ_H_
> > > > +
> > > > +#ifdef _KERNEL
> > > > +
> > > > +/*
> > > > + * Typical usage:
> > > > + *
> > > > + * writers:
> > > > + *     lock_exclusive(&obj->lock);
> > > > + *     seq_write_begin(&obj->seq);
> > > > + *     .....
> > > > + *     seq_write_end(&obj->seq);
> > > > + *     unlock_exclusive(&obj->unlock);
> > > > + *
> > > > + * readers:
> > > > + *     obj_t lobj;
> > > > + *     seq_t seq;
> > > > + *
> > > > + *     for (;;) {
> > > > + *             seq = seq_read(&gobj->seq);
> > > > + *             lobj = gobj;
> > > > + *             if (seq_consistent(&gobj->seq, seq))
> > > > + *                     break;
> > > > + *             cpu_spinwait();
> > > > + *     }
> > > > + *     foo(lobj);
> > > > + */
> > > > +
> > > > +typedef uint32_t seq_t;
> > > > +
> > > > +/* A hack to get MPASS macro */
> > > > +#include
> > > > +#include
> > > > +
> > > > +#include
> > > > +
> > > > +static __inline bool
> > > > +seq_in_modify(seq_t seqp)
> > > > +{
> > > > +
> > > > +	return (seqp & 1);
> > > > +}
> > > > +
> > > > +static __inline void
> > > > +seq_write_begin(seq_t *seqp)
> > > > +{
> > > > +
> > > > +	MPASS(!seq_in_modify(*seqp));
> > > > +	(*seqp)++;
> > > > +	wmb();
> > > This probably ought to be written as atomic_add_rel_int(seqp, 1);
> > Alan Cox rightfully pointed out that better expression is
> > v = *seqp + 1;
> > atomic_store_rel_int(seqp, v);
> > which also takes care of TSO on x86.
> >
>
> Well, my memory-barrier-and-so-on-fu is rather weak.
>
> I had another look at the issue. At least on amd64, it looks like only
> compiler barrier is required for both reads and writes.
>
> According to AMD64 Architecture Programmer’s Manual Volume 2: System
> Programming, 7.2 Multiprocessor Memory Access Ordering states:
>
> "Loads do not pass previous loads (loads are not reordered). Stores do
> not pass previous stores (stores are not reordered)"
>
> Since the code modifying stuff only performs a series of writes and we
> expect exclusive writers, I find it applicable to this scenario.
>
> I checked linux sources and generated assembly, they indeed issue only
> a compiler barrier on amd64 (and for intel processors as well).
>
> atomic_store_rel_int on amd64 seems fine in this regard, but the only
> function for loads issues lock cmpxchg which kills performance
> (median 55693659 -> 12789232 ops in a microbenchmark) for no gain.
>
> Additionally release and acquire semantics seems to be a stronger than
> needed guarantee.
>

This statement left me puzzled and got me to look at our x86 atomic.h for
the first time in years. It appears that our implementation of
atomic_load_acq_int() on x86 is, umm ..., unconventional. That is, it is
enforcing a constraint that simple acquire loads don't normally enforce.
For example, the C11 stdatomic.h simple acquire load doesn't enforce this
constraint. Moreover, our own implementation of atomic_load_acq_int() on
ia64, where the mapping from atomic_load_acq_int() to machine instructions
is straightforward, doesn't enforce this constraint either.

Give us a chance to sort this out before you do anything further. As
Kostik said, but in different words, we've always written our
machine-independent layer code using acquires and releases to express the
required ordering constraints and not {r,w}mb() primitives.

> As far as sequence counters go, we should be able to get away with
> making the following:
> - all relevant reads are performed between given points
> - all relevant writes are performed between given points
>
> As such, I propose introducing another atomic_* function variants
> (or stealing smp_{w,r,}mb idea from linux) which provide just that.
>
> So for amd64 reading guarantee and writing guarantee could be provided
> in the same way with a compiler barrier.
>
> > > Same note for all other linux-style barriers. In fact, on x86
> > > wmb() is sfence and it serves no useful purpose in seq_write*.
> > >
> > > Overall, it feels too alien and linux-ish for my taste.
> > > Since we have sequence bound to some lock anyway, could we introduce
> > > some sort of generation-aware locks variants, which extend existing
> > > locks, and where lock/unlock bump generation number ?
> > Still, merging it to the guts of lock implementation is right
> > approach, IMO.
> >
>
> Current usage would be along with filedesc (sx) lock. The lock protects
> writes to entire fd table (and lock holders can block in malloc), while
> each file descriptor has its own counter. Also areas covered by seq are
> short and cannot block.
>
> As such, I don't really see any way to merge the lock with the counter.
>
> I agree it would be useful, provided area protected by the lock would be
> the same as the one protected by the counter. If this code hits the tree
> and one day turns out someone needs such functionality, there should not
> be any problems (apart from time effort) in implementing this.
>
> --
> Mateusz Guzik
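
The acquire/release formulation discussed in this message can be
illustrated outside the kernel with C11 <stdatomic.h>. What follows is
only a minimal sketch under stated assumptions. It is not the code from
the patch and it does not use FreeBSD's atomic(9) API. The names
sketch_seq_t, struct sketch_obj, sketch_write() and sketch_read() are
hypothetical, writers are assumed to be serialized by some external
exclusive lock, and the payload fields are made atomic with relaxed
accesses so that the racy reads are well defined in the C11 model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical names; this is not sys/sys/seq.h from the patch. */
typedef _Atomic uint32_t sketch_seq_t;

struct sketch_obj {
	sketch_seq_t seq;	/* odd while a writer is mid-update */
	_Atomic int a;		/* payload; atomic so racy reads are defined */
	_Atomic int b;
};

/* Assumption: callers hold an external exclusive lock around this. */
static inline void
sketch_write(struct sketch_obj *o, int a, int b)
{
	uint32_t s = atomic_load_explicit(&o->seq, memory_order_relaxed);

	assert((s & 1) == 0);	/* no modification already in progress */
	atomic_store_explicit(&o->seq, s + 1, memory_order_relaxed);
	/* Order the odd counter store before the payload stores below. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&o->a, a, memory_order_relaxed);
	atomic_store_explicit(&o->b, b, memory_order_relaxed);
	/* Release store: payload stores are ordered before the even counter. */
	atomic_store_explicit(&o->seq, s + 2, memory_order_release);
}

static inline void
sketch_read(struct sketch_obj *o, int *a, int *b)
{
	uint32_t s0, s1;

	for (;;) {
		/* Acquire load: the payload loads cannot move above it. */
		s0 = atomic_load_explicit(&o->seq, memory_order_acquire);
		*a = atomic_load_explicit(&o->a, memory_order_relaxed);
		*b = atomic_load_explicit(&o->b, memory_order_relaxed);
		/* Order the payload loads before the counter re-check. */
		atomic_thread_fence(memory_order_acquire);
		s1 = atomic_load_explicit(&o->seq, memory_order_relaxed);
		if ((s0 & 1) == 0 && s0 == s1)
			break;	/* even and unchanged: consistent snapshot */
	}
}
```

The explicit fences follow the usual C11 seqlock pattern. On amd64 they
should compile to nothing more than compiler barriers, which would match
the observation quoted earlier in the thread, though that is worth
confirming in the generated assembly for a given compiler.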