Date:      Sun, 17 Aug 2014 03:26:47 +0200
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        Johan Schuijt <johan@transip.nl>, freebsd-arch@freebsd.org
Subject:   Re: [PATCH 1/2] Implement simple sequence counters with memory barriers.
Message-ID:  <20140817012646.GA21025@dft-labs.eu>
In-Reply-To: <20140816185406.GD2737@kib.kiev.ua>
References:  <1408064112-573-1-git-send-email-mjguzik@gmail.com> <1408064112-573-2-git-send-email-mjguzik@gmail.com> <20140816093811.GX2737@kib.kiev.ua> <20140816185406.GD2737@kib.kiev.ua>

On Sat, Aug 16, 2014 at 09:54:06PM +0300, Konstantin Belousov wrote:
> On Sat, Aug 16, 2014 at 12:38:11PM +0300, Konstantin Belousov wrote:
> > On Fri, Aug 15, 2014 at 02:55:11AM +0200, Mateusz Guzik wrote:
> > > ---
> > >  sys/sys/seq.h | 126 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 126 insertions(+)
> > >  create mode 100644 sys/sys/seq.h
> > > 
> > > diff --git a/sys/sys/seq.h b/sys/sys/seq.h
> > > new file mode 100644
> > > index 0000000..0971aef
> > > --- /dev/null
> > > +++ b/sys/sys/seq.h
[..]
> > > +#ifndef _SYS_SEQ_H_
> > > +#define _SYS_SEQ_H_
> > > +
> > > +#ifdef _KERNEL
> > > +
> > > +/*
> > > + * Typical usage:
> > > + *
> > > + * writers:
> > > + * 	lock_exclusive(&obj->lock);
> > > + * 	seq_write_begin(&obj->seq);
> > > + * 	.....
> > > + * 	seq_write_end(&obj->seq);
> > > + * 	unlock_exclusive(&obj->lock);
> > > + *
> > > + * readers:
> > > + * 	obj_t lobj;
> > > + * 	seq_t seq;
> > > + *
> > > + * 	for (;;) {
> > > + * 		seq = seq_read(&gobj->seq);
> > > + * 		lobj = gobj;
> > > + * 		if (seq_consistent(&gobj->seq, seq))
> > > + * 			break;
> > > + * 		cpu_spinwait();
> > > + * 	}
> > > + * 	foo(lobj);
> > > + */		
> > > +
> > > +typedef uint32_t seq_t;
> > > +
> > > +/* A hack to get MPASS macro */
> > > +#include <sys/systm.h>
> > > +#include <sys/lock.h>
> > > +
> > > +#include <machine/cpu.h>
> > > +
> > > +static __inline bool
> > > +seq_in_modify(seq_t seqp)
> > > +{
> > > +
> > > +	return (seqp & 1);
> > > +}
> > > +
> > > +static __inline void
> > > +seq_write_begin(seq_t *seqp)
> > > +{
> > > +
> > > +	MPASS(!seq_in_modify(*seqp));
> > > +	(*seqp)++;
> > > +	wmb();
> > This probably ought to be written as atomic_add_rel_int(seqp, 1);
> Alan Cox rightfully pointed out that better expression is
> v = *seqp + 1;                                                                  
> atomic_store_rel_int(seqp, v);
> which also takes care of TSO on x86.
> 
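(For concreteness: applied to seq_write_end(), where release semantics are
the natural fit, the suggestion would read roughly as the sketch below;
names as in the patch, not committed code.)

static __inline void
seq_write_end(seq_t *seqp)
{
	seq_t v;

	/* Make all protected stores visible before the counter turns even. */
	v = *seqp + 1;
	atomic_store_rel_int(seqp, v);
	MPASS(!seq_in_modify(*seqp));
}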

Well, my memory-barrier-and-so-on-fu is rather weak.

I had another look at the issue. At least on amd64, it looks like only a
compiler barrier is required for both reads and writes.

The AMD64 Architecture Programmer’s Manual Volume 2: System Programming,
section 7.2 "Multiprocessor Memory Access Ordering", states:

"Loads do not pass previous loads (loads are not reordered). Stores do
not pass previous stores (stores are not reordered)"

Since the modifying code only performs a series of writes and we expect
writers to be exclusive, I find this guarantee applicable to our scenario.
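
To illustrate: under that assumption, the write side on amd64 could be
reduced to the following sketch, using the kernel's __compiler_membar()
(a sketch of the idea only, not what the patch currently does):

static __inline void
seq_write_begin(seq_t *seqp)
{

	MPASS(!seq_in_modify(*seqp));
	(*seqp)++;		/* counter becomes odd: modification begins */
	__compiler_membar();	/* keep protected stores after the increment */
}

static __inline void
seq_write_end(seq_t *seqp)
{

	__compiler_membar();	/* keep protected stores before the increment */
	(*seqp)++;		/* counter becomes even: modification done */
	MPASS(!seq_in_modify(*seqp));
}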

I checked the Linux sources and the generated assembly; they indeed issue
only a compiler barrier on amd64 (and for Intel processors as well).
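
(From memory, the relevant x86 Linux definitions boil down to something
like this; treat the exact spelling as approximate:)

#define barrier()	__asm__ __volatile__("" : : : "memory")
#define smp_wmb()	barrier()	/* stores are not reordered on x86 */
#define smp_rmb()	barrier()	/* loads are not reordered on x86 */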

atomic_store_rel_int on amd64 seems fine in this regard, but the only
function for loads issues a lock cmpxchg, which kills performance
(median 55693659 -> 12789232 ops in a microbenchmark) for no gain.

Additionally, release and acquire semantics seem to be a stronger
guarantee than we need.

As far as sequence counters go, we should be able to get away with
guaranteeing only the following:
- all relevant reads are performed between given points
- all relevant writes are performed between given points

As such, I propose introducing additional atomic_* function variants
(or stealing the smp_{w,r,}mb idea from Linux) which provide just that.

So on amd64 both the read guarantee and the write guarantee could be
provided in the same way, with just a compiler barrier; see the sketch
below.
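
A sketch of what this could look like (the barrier names are hypothetical,
not existing FreeBSD primitives; amd64-only reasoning as above):

/* On amd64 both guarantees degrade to a compiler barrier. */
#define	atomic_load_barrier()	__compiler_membar()
#define	atomic_store_barrier()	__compiler_membar()

static __inline seq_t
seq_read(const seq_t *seqp)
{
	seq_t ret;

	/* Spin until the counter is even, i.e. no modification in flight. */
	for (;;) {
		ret = *(volatile const seq_t *)seqp;
		if (!seq_in_modify(ret))
			break;
		cpu_spinwait();
	}
	atomic_load_barrier();	/* protected loads must not start earlier */
	return (ret);
}

static __inline bool
seq_consistent(const seq_t *seqp, seq_t oldseq)
{

	atomic_load_barrier();	/* protected loads must finish first */
	return (*(volatile const seq_t *)seqp == oldseq);
}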

> > Same note for all other linux-style barriers.  In fact, on x86
> > wmb() is sfence and it serves no useful purpose in seq_write*.
> > 
> > Overall, it feels too alien and linux-ish for my taste.
> > Since we have sequence bound to some lock anyway, could we introduce
> > some sort of generation-aware locks variants, which extend existing
> > locks, and where lock/unlock bump generation number ?
> Still, merging it to the guts of lock implementation is right
> approach, IMO.
> 

Current usage would be alongside the filedesc (sx) lock. The lock protects
writes to the entire fd table (and lock holders can block in malloc), while
each file descriptor has its own counter. Also, the areas covered by seq
are short and cannot block.
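
Roughly, the intent is the following (field names are made up for the
example):

struct filedescent {
	struct file	*fde_file;	/* protected by fde_seq */
	seq_t		 fde_seq;	/* per-descriptor sequence counter */
};

/* Writer, holding the filedesc sx lock exclusively: */
seq_write_begin(&fde->fde_seq);
fde->fde_file = fp;		/* short, non-blocking update */
seq_write_end(&fde->fde_seq);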

As such, I don't really see any way to merge the lock with the counter.

I agree it would be useful, provided the area protected by the lock were
the same as the one protected by the counter. If this code hits the tree
and it one day turns out that someone needs such functionality, there
should be no problem (apart from the time investment) in implementing it.

-- 
Mateusz Guzik <mjguzik gmail.com>


