Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 13 Aug 2014 13:31:17 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Mateusz Guzik <mjguzik@gmail.com>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, freebsd-arch@freebsd.org
Subject:   Re: Getting rid of atomic_load_acq_int(&fdp->fd_nfiles)) from fget_unlocked
Message-ID:  <20140813124428.K927@besplex.bde.org>
In-Reply-To: <20140813010046.GB17869@dft-labs.eu>
References:  <20140713035500.GC16884@dft-labs.eu> <20140713132521.GY93733@kib.kiev.ua> <20140713133421.GA93733@kib.kiev.ua> <20140813010046.GB17869@dft-labs.eu>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 13 Aug 2014, Mateusz Guzik wrote:

> On Sun, Jul 13, 2014 at 04:34:21PM +0300, Konstantin Belousov wrote:
>> On Sun, Jul 13, 2014 at 04:25:21PM +0300, Konstantin Belousov wrote:
>>> On Sun, Jul 13, 2014 at 05:55:00AM +0200, Mateusz Guzik wrote:
>>>> Currently:
>>>>         /*
>>>>          * Avoid reads reordering and then a first access to the
>>>>          * fdp->fd_ofiles table which could result in OOB operation.
>>>>          */
>>>>         if (fd < 0 || fd >= atomic_load_acq_int(&fdp->fd_nfiles))
>>>>                 return (EBADF);
>>>>
>>>> However, if we put fd_nfiles and fd_otable into one atomically replaced
>>>> structure the only need to:
>>>> 1. make sure the pointer is read once
>>>> 2. issue a data dependency barrier - this is a noop on all supported
>>>> architectures and we don't even have approprate macro, so doing nothing
>>>> seems fine
>>>>
>>>> The motivation is to boost performance to amortize for seqlock cost, in
>>>> case it hits the tree.
>>>>
>>>> This has no impact on races with capability lookup.
>>>>
>>>> In a microbenchmark of 16 threads reading from the same pipe fd
>>>> immediately returning EAGAIN the numbers are:
>>>> x vanilla-readpipe-run-sum
>>>> + noacq-readpipe-run-sum
>>>> [..]
>>>>     N           Min           Max        Median           Avg        Stddev
>>>> x  20      13133671      14900364      13893331      13827075     471500.82
>>>> +  20      59479718      59527286      59496714      59499504     13752.968
>>>> Difference at 95.0% confidence
>>>> 	4.56724e+07 +/- 213483
>>>> 	330.312% +/- 1.54395%
>>>>
>>>> There are 3 steps:
>>>> 1. tidy up capsicum to accept fde:
>>>> http://people.freebsd.org/~mjg/patches/single-fdtable-read-capsicum.patch
>>>> 2. add __READ_ONCE:
>>>> http://people.freebsd.org/~mjg/patches/read-once.patch
>>>> 3. put stuff into one structure:
>>>> http://people.freebsd.org/~mjg/patches/filedescenttable.patch
>>>>
>>>> Comments?
>>>
>>> We use 4-space indent for the continuation lines.  Look at the malloc(9)
>>> call in the patch 3.
>>>
>>> The filedescenttable is really long name.  Could it be, for instance,
>>> fdescenttbl ?
>>>
>>> Other than that, I think that the patches 2 and 3 are fine.  I did not
>>> looked at the patch 1.
>>
>> As an afterthought, you do not need __READ_ONCE(), the __DEVOLATILE() alone
>> would do what you need as well.
>
> Turns out patch 2 was quite bad.

You pressed one of my hot keys (flame thrower).  Anything that uses
__DEVOLATILE() is bad.

> Reading http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf
> (pdf page 77) reveals:
> A cast of a value to a qualified type has no effect; the qualification
> (volatile, say) can have no effect on the access since it has occurred
> prior to the cast. If it is necessary to access a non-volatile object
> using volatile semantics, the technique is to cast the address of the
> object to the appropriate pointer-to-qualified type, then dereference
> that pointer.

I wonder why it spells that out.  It can go without saying that qualifiers
only effect lvalues.

> So how about we just follow the recomandation and also get the type
> automagically like linux folks do (added to sys/param.h):
> #define READ_ONCE(var)  (*(volatile __typeof(var) *)&(var))

Casting this wat is indeed better.  You have a variables that are
usually locked by mutexes or another method.  They are non-volatile
then.  But for some other accesses, they become volatile.

READ_ONCE() is not a good name.  I would have guessed it meant an
atomic read.  I think the ONCE in it just means "at least" once
(instead of possibly zero times).  The semantics of volatile are
unclear except that they force the compiler to generate code that
acts as if it forces the read.  The comment in the patch says that
it is to read the variable "only" once.  Volatile semantics probably
imply that too.  But most uses don't require that.  Reading volatile
hardware status registers requires that, but for software you just
want a non-cached value.

> http://people.freebsd.org/~mjg/patches/read-once.patch

Macros like this are broken for general use, since if the variable
type already has a qualifier in it then adding the same qualifier
gives too many qualifiers -- a constraint error.  __DEVOLATILE() has
similar bugs.  It doesn't work on const-qualified types since it
casts away the const in the first cast that it applies.  That is
just the first bug in it.  I didn't fix this since I burned
__DEVOLATILE() before it reached my tree.  It is easier to fix that
the above since it doesn't dereference anything.  __typeof() is not
very well designed since it doesn't allow breaking up a type into
its components.

Style bugs in the patch:
- space instead of tab after #define.  The previous macro visible in the
   patch has the same bug.  The file has many other instances of this bug.
   That macro also spells __typeof() in gnu style (__typeof__()), and is
   missing parentheses for 'offset'.
- sentence break of 1 space in the comment.  Sentence breaks are 2
   spaces in KNF.  The file has only 2 other instances of this bug.
- grammar and punctuation errors in "Useful e.g.".  The sentence is
   missing an object, and I think commas are needd on each side of
   "e.g.", but that would be too formal so it would be better to
   simplify the wording a bit so as not to need so many commas.

More bugs in the patch:
- further undocumented namespace pollution.  The previous macro visible in
   the patch has 2 underscores in its name so it doesn't have that bug.
   Both of thes macros may belong in <sys/cdefs.h> where the underscores
   are more important.  <sys/param.h> didn't have any with the underscores
   before the previous macro was added.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140813124428.K927>