Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 23 Jun 2011 05:31:25 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Warner Losh <imp@bsdimp.com>
Cc:        src-committers@FreeBSD.org, Alan Cox <alc@rice.edu>, Alan Cox <alc@FreeBSD.org>, svn-src-all@FreeBSD.org, Attilio Rao <attilio@FreeBSD.org>, "Bjoern A. Zeeb" <bz@FreeBSD.org>, Bruce Evans <brde@optusnet.com.au>, svn-src-head@FreeBSD.org
Subject:   Re: svn commit: r223307 - head/sys/vm
Message-ID:  <20110623050025.O1587@besplex.bde.org>
In-Reply-To: <82E2B828-97C9-4C35-A619-ACDB5C40E99B@bsdimp.com>
References:  <201106191913.p5JJDOqJ006272@svn.freebsd.org> <BBC34F79-FFA7-4A05-83B3-DE17E0AB14D0@FreeBSD.org> <20110622063258.D2275@besplex.bde.org> <BANLkTi=7WnYPQRwE4Hi472DuJz91d1sK=g@mail.gmail.com> <4E0128FF.6020804@rice.edu> <BEDB4F71-8151-4799-9272-8CE79CDA6C17@bsdimp.com> <BANLkTinZofib5HSjOfeb9qx5QXuUzGonCw@mail.gmail.com> <82E2B828-97C9-4C35-A619-ACDB5C40E99B@bsdimp.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 22 Jun 2011, Warner Losh wrote:

> On Jun 22, 2011, at 3:26 AM, Attilio Rao wrote:
>
>> 2011/6/22 Warner Losh <imp@bsdimp.com>:
>>>
>>> On Jun 21, 2011, at 5:27 PM, Alan Cox wrote:
>>>
>>> On 06/21/2011 16:09, Attilio Rao wrote:
>>>
>>> 2011/6/21 Bruce Evans<brde@optusnet.com.au>:
>>>
>>> On Tue, 21 Jun 2011, Bjoern A. Zeeb wrote:
>>>
>>> On Jun 19, 2011, at 7:13 PM, Alan Cox wrote:
>>>
>>> Hi Alan,
>>>
>>> Author: alc
>>>
>>> Date: Sun Jun 19 19:13:24 2011
>>>
>>> New Revision: 223307
>>>
>>> URL: http://svn.freebsd.org/changeset/base/223307
>>>
>>> Log:
>>>
>>>  Precisely document the synchronization rules for the page's dirty field.
>>>
>>>  (Saying that the lock on the object that the page belongs to must be
>>>
>>> held
>>>
>>>  only represents one aspect of the rules.)
>>>
>>>  Eliminate the use of the page queues lock for atomically performing
>>>
>>> read-
>>>
>>>  modify-write operations on the dirty field when the underlying
>>>
>>> architecture
>>>
>>>  supports atomic operations on char and short types.
>>>
>>>  Document the fact that 32KB pages aren't really supported.
>>>
>>> contrary to the tinderbox I'd like to point out that all mips kernels
>>>
>>> built by universe are broken with a SVN HEAD from earlier today.  Could you
>>>
>>> please check and see if you can fix it?  The errors I get are:
>>>
>>> vm_page.o: In function `vm_page_clear_dirty':
>>>
>>> /sys/vm/vm_page.c:(.text+0x18d0): undefined reference to `atomic_clear_8'
>>>
>>> /sys/vm/vm_page.c:(.text+0x18d0): relocation truncated to fit: R_MIPS_26
>>>
>>> against `atomic_clear_8'
>>>
>>> vm_page.o: In function `vm_page_set_validclean':
>>>
>>> /sys/vm/vm_page.c:(.text+0x38f0): undefined reference to `atomic_clear_8'
>>>
>>> /sys/vm/vm_page.c:(.text+0x38f0): relocation truncated to fit: R_MIPS_26
>>>
>>> against `atomic_clear_8'
>>>
>>> Atomic types shorter than int cannot be used in MI code, since they might
>>>
>>> not exist.  Apparently they don't exist on mips.  jake@ fixed all their
>>>
>>> old uses for sparc4 in ~Y2K.
>>>
>>> I'm sure they do, they exist in support.S though and may not have the
>>>
>>> _8 form (they may just have the _char version). I may look at the code
>>>
>>> again to be sure.
>>>
>>>
>>> It appears that while mips/include/atomic.h declares the existence of
>>> atomic_clear_8, mips/mips/support.S doesn't implement it.  In other words,
>>> only support for int's and short's is currently implemented, not char's:
>>>
>>> # grep atomic_clear mips/mips/support.S
>>> * atomic_clear_32(u_int32_t *a, u_int32_t b)
>>> LEAF(atomic_clear_32)
>>> END(atomic_clear_32)
>>> * atomic_clear_16(u_int16_t *a, u_int16_t b)
>>> LEAF(atomic_clear_16)
>>> END(atomic_clear_16)
>>>
>>> The current crop of atomic*16 and atomic*8 functions have the restriction
>>> that the address must be 32-bit aligned (and it forces this by aligning to
>>> 32-bits silently and then operates on the low 8 or 16 bits in that word!)
>>> I'm guessing that this is likely just wrong.  Comments?
>>> Warner
>>
>> That is wrong, of course, and my personal opinion is that one should
>> not implement atomic operations if they cannot be done efficiently
>> (example: if you need to disable interrupts or similar expensive
>> operation just to assure atomicity of operation, just don't support
>> it) as long as not having _8/_char is perfectly fine.
>
> I think it can be efficient, for some reasonable definition of efficient.  The masking and shifting operations aren't that onerous to write and the instructions cycles would be dwarfed by the ll/sc pair, which are required to implement atomic.
>
> The issue is that the code today is clearly wrong and needs to be fixed.

I think it is still good to discourage use of non-bus-width atomic ops
by not supporting at the MI level them for any arch.

Suppose only bus-width accesses are atomic, as seems to be the case
on mips.  Then you can still do atomic accesses on bytes by doing an
atomic op on the neigbouring 3 or 7 bytes.  But this may increase
memory contention by a factor of 3 or 7.  The contention can be
reduced either by putting only rarely-accessed bytes near the bytes
that need the atomic ops, or by putting unused padding there, or
by not using byte-sized objects to begin with (this gives the same
effect and even the same memory accesses as unused padding bytes --
there are unused bits in the objects instead of external unused bytes).
The former can be arranged easily enough in the vm_page struct.  But
MI code shouldn't know about this.

Suppose byte accesses are atomic at the instruction level.  Then the
hardware probably still needs to expand to the bus width.  Depending
on how well it virtualizes this, there may be the same factor of 3, 7
or more in memory contention.  The relevant bus width may be the cache
line size which on modern CPUs is much more than 4 bytes and probably
more like 32.  So the CPU must already being doing good virtualization
to make 4-byte atomic accesses not too slow (non-atomic accesses are
already virtualized by the caches).  For efficiency, we really should
pay more attention to packing related int variables into cache lines
(either the same line to increase cache hits or separate lines, to
reduce contention), but the best way to do this is very MD and unclear.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110623050025.O1587>