Date:      Wed, 26 Jul 2006 20:27:01 +0200
From:      "Attilio Rao" <attilio@freebsd.org>
To:        "John Baldwin" <jhb@freebsd.org>
Cc:        freebsd-arch@freebsd.org
Subject:   Re: [PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs
Message-ID:  <3bbf2fe10607261127p3f01a6c3w80027754f7d4e594@mail.gmail.com>
In-Reply-To: <3bbf2fe10607251004wf94e238xb5ea7a31c973817f@mail.gmail.com>
References:  <3bbf2fe10607250813w8ff9e34pc505bf290e71758@mail.gmail.com> <3bbf2fe10607250814m1a476f09p2d962dedc0c99be1@mail.gmail.com> <200607251232.51230.jhb@freebsd.org> <3bbf2fe10607251004wf94e238xb5ea7a31c973817f@mail.gmail.com>

2006/7/25, Attilio Rao <attilio@freebsd.org>:
> 2006/7/25, John Baldwin <jhb@freebsd.org>:
> > On Tuesday 25 July 2006 11:14, Attilio Rao wrote:
> > > 2006/7/25, Attilio Rao <attilio@freebsd.org>:
> > > > Hi,
> > > > Intel documentation points out that having a 128-byte aligned
> > > > synchronizing primitive (which fits in a cache line) will minimize
> > > > cache bus traffic, so this patch implements that alignment for
> > > > turnstiles on i386.
> > > >
> > > > Any comments or feedback?
> > >
> > > Oh, sorry, I forgot the diff.
> > >
> > > Attilio
> >
> > I think a better approach would be to stick turnstiles (and sleepqueues) in a
> > UMA zone and specify cache-size alignment to the zone.  However, turnstiles
> > aren't really synchronization primitives in that you don't spin on a variable
> > inside the structure, and I think it's the spinning and avoiding bouncing
> > cache lines around that Intel's documentation is really about.  In that case,
> > the things you want aligned are things like mutexes, rwlocks, etc.
>
> Well, I think that this refers in particular to the latter issue
> you mentioned.
> Spinning is not really a cache bus issue (it is more a matter of
> datapath latency).
> From this point of view, turnstiles (like sleepqueues) are passed
> around between CPUs more than a mutex/rwlock (or a cv), I guess, so I
> was thinking that it's better to optimize the turnstile than the
> actual synchronizing primitive itself.

Here is a patch that makes turnstiles and sleepqueues use a UMA zone.

I've tried it on my 6.1R branch and it works fine, so this HEAD
version should be all right (I haven't tried it yet, so please test):
http://users.gufi.org/~rookie/works/patches/uma_sync.diff

It sets the default alignment for i386 to 128 bytes.
Any comments, feedback, or ideas are welcome.
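
To make the alignment idea concrete, here is a rough userspace sketch.
It is not the patch itself: in the kernel the alignment would be given
to the UMA zone, while this example only uses C11 aligned_alloc() to
show objects starting on 128-byte boundaries. The names SYNC_ALIGN,
struct fake_turnstile, padded_size(), fake_turnstile_alloc(),
fake_turnstile_free() and is_aligned() are all hypothetical and exist
only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed alignment, taken from the 128-byte figure discussed above. */
#define SYNC_ALIGN 128

/* Hypothetical stand-in; the real struct turnstile is kernel-private. */
struct fake_turnstile {
	void	*ts_lockobj;	/* lock this object would be attached to */
	int	 ts_waiters;	/* number of blocked threads */
};

/* Round size up to a multiple of SYNC_ALIGN so no two objects share a line. */
static size_t
padded_size(size_t size)
{
	return ((size + SYNC_ALIGN - 1) & ~(size_t)(SYNC_ALIGN - 1));
}

/* Allocate one object whose address is a multiple of SYNC_ALIGN. */
static struct fake_turnstile *
fake_turnstile_alloc(void)
{
	struct fake_turnstile *ts;

	/* C11 aligned_alloc() wants the size to be a multiple of the alignment. */
	ts = aligned_alloc(SYNC_ALIGN, padded_size(sizeof(*ts)));
	if (ts != NULL) {
		ts->ts_lockobj = NULL;
		ts->ts_waiters = 0;
	}
	return (ts);
}

static void
fake_turnstile_free(struct fake_turnstile *ts)
{
	free(ts);
}

/* Return nonzero if p sits on a SYNC_ALIGN boundary. */
static int
is_aligned(const void *p)
{
	return (((uintptr_t)p & (SYNC_ALIGN - 1)) == 0);
}
```

A caller would obtain an object with fake_turnstile_alloc(), could
verify it with is_aligned(), and would release it with
fake_turnstile_free(). Padding the size costs memory for small
structures, which is the usual trade-off for keeping each object on
its own cache lines.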

Attilio

PS: I know that I could simplify the *_alloc() and *_free() routines
by implementing init/fini, but keeping things this way is simpler and
better optimized.


-- 
Peace can only be achieved by understanding - A. Einstein


