Date:      Thu, 4 Jan 2018 10:03:32 +0000
From:      David Chisnall <theraven@FreeBSD.org>
To:        Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: Programmatically cache line
Message-ID:  <71E8D6E7-F833-4B7E-B1F1-AD07A49CAF98@FreeBSD.org>
In-Reply-To: <35d2d373-92f1-499f-f470-e4528b08b937@freebsd.org>
References:  <CALM2mEmWYz5nyqvxMJwMWoFOXnDTvWFrEug7UUha6xe7Um6ODw@mail.gmail.com> <20171230082812.GL1684@kib.kiev.ua> <CAJ-VmomxGJsn8eOtWoqevdW-spUPgcSGKEc7eR4xuXLP-E1XRA@mail.gmail.com> <08038E36-9679-4286-9083-FCEDD637ADCC@FreeBSD.org> <20180101103655.GF1684@kib.kiev.ua> <CABh_MK=2uvPoNCg7qL14yVuxo_%2BHVSvccLTBAnRAHNzqor--0g@mail.gmail.com> <35d2d373-92f1-499f-f470-e4528b08b937@freebsd.org>

On 3 Jan 2018, at 22:12, Nathan Whitehorn <nwhitehorn@freebsd.org> wrote:
>
> On 01/03/18 13:37, Ed Schouten wrote:
>> 2018-01-01 11:36 GMT+01:00 Konstantin Belousov <kostikbel@gmail.com>:
>>>>>> On x86, the CPUID instruction leaf 0x1 returns the information in
>>>>>> %ebx register.
>>>>> Hm, weird. Why don't we extend sysctl to include this info?
>>> For the same reason we do not provide a sysctl to add two integers.
>> I strongly agree with Kostik on this one. Why add stuff to the kernel,
>> if userspace is already capable of extracting this? Adding that stuff
>> to sysctl has the downside that it will effectively introduce yet
>> another FreeBSDism, whereas something generic already exists.
>>
>
> Well, kind of. The userspace version is platform-dependent and not
> always available: for example, on PPC, you can't do this from userland
> and we provide a sysctl machdep.cacheline_size to userland. It would be
> nice to have an MI API.

On ARMv8, similarly, the kernel sometimes needs to advertise a size other
than the one the current core reports.  A few big.LITTLE systems have
64-byte cache lines on one cluster and 32-byte lines on the other.  If
you query the size from userspace while running on the 64-byte cluster,
then issue the zero-cache-line instruction after being migrated to the
32-byte cluster, you clear only half of the region you expected.  Linux
works around this by trapping and emulating the instruction that queries
the cache line size and always reporting the size of the smallest cache
line.  ARM tells people not to build systems like this, but it doesn’t
always stop them.  Trapping and emulating is much slower than just
providing the information in a shared page, the ELF aux vector, or even
(often) a system call.
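
To make the failure mode concrete, here is a small user-space simulation
in plain C.  It is only a sketch: zero_line() stands in for the real
zero-cache-line instruction, and BUF_LEN, zero_line() and
bytes_left_dirty() are names invented for this illustration, not
anything the kernel or the architecture provides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_LEN 128  /* hypothetical buffer size, a multiple of both line sizes */

/* Stand-in for a zero-cache-line instruction: clears the whole
 * hw_line-sized, hw_line-aligned block that contains buf[offset]. */
static void zero_line(unsigned char *buf, size_t offset, size_t hw_line)
{
    size_t start = offset - (offset % hw_line);
    memset(buf + start, 0, hw_line);
}

/* Zero a BUF_LEN-byte buffer by stepping in units of the advertised line
 * size while the simulated hardware clears hw_line bytes per call.
 * Returns how many bytes were left non-zero. */
size_t bytes_left_dirty(size_t advertised, size_t hw_line)
{
    unsigned char buf[BUF_LEN];
    size_t dirty = 0;

    assert(advertised > 0 && hw_line > 0 && BUF_LEN % hw_line == 0);
    memset(buf, 0xff, sizeof(buf));

    for (size_t off = 0; off < BUF_LEN; off += advertised)
        zero_line(buf, off, hw_line);

    for (size_t i = 0; i < BUF_LEN; i++)
        dirty += (buf[i] != 0);
    return dirty;
}
```

Stepping by 64 on a core that clears 32 bytes leaves half the buffer
untouched, whereas stepping by the smallest size (32) is correct on
either cluster, at the cost of some redundant clears on the 64-byte one.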

To give another example, Linux provides a very cheap way for a userspace
process to enquire which core it’s running on.  Some more recent
high-performance mallocs use this to add a second-layer per-core cache
for free blocks behind the per-thread cache.  Unlike the per-thread
cache, the per-core cache does need a lock, but it’s very unlikely to be
contended: that happens only if a thread is migrated between checking
the CPU and locking, and so acquires the wrong CPU’s lock, or if a thread
is preempted in the middle of the very brief fill operation.  The author
of the SuperMalloc paper tried doing this with CPUID and found that it
was slower by a sufficient margin to almost entirely offset the benefits
of the extra layer of caching.
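
A rough sketch of such a per-core layer, in C11, might look like the
following.  This is not SuperMalloc's code: NCPU, SLOTS, percore_put()
and percore_get() are made up for illustration, and the CPU number is
passed in by the caller (on Linux it could come from sched_getcpu()).
The point is that the CPU number is only a hint: because every slot has
its own lock, a stale hint after a migration costs contention, not
correctness.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPU  4   /* hypothetical number of per-core caches */
#define SLOTS 8   /* hypothetical capacity of each cache */

struct percore_cache {
    atomic_bool locked;      /* tiny spinlock; rarely contended */
    size_t      count;       /* number of cached blocks */
    void       *blocks[SLOTS];
};

/* Static storage is zero-initialized: unlocked and empty. */
static struct percore_cache caches[NCPU];

static struct percore_cache *cache_lock(int cpu_hint)
{
    struct percore_cache *c = &caches[(unsigned)cpu_hint % NCPU];
    while (atomic_exchange_explicit(&c->locked, true, memory_order_acquire))
        ;  /* spin: only hit after a migration or a preempted holder */
    return c;
}

static void cache_unlock(struct percore_cache *c)
{
    atomic_store_explicit(&c->locked, false, memory_order_release);
}

/* Stash a free block in the cache for cpu_hint; false if that cache is full. */
bool percore_put(int cpu_hint, void *block)
{
    assert(block != NULL);
    struct percore_cache *c = cache_lock(cpu_hint);
    bool stored = c->count < SLOTS;
    if (stored)
        c->blocks[c->count++] = block;
    cache_unlock(c);
    return stored;
}

/* Take a block from the cache for cpu_hint; NULL if that cache is empty. */
void *percore_get(int cpu_hint)
{
    struct percore_cache *c = cache_lock(cpu_hint);
    void *block = (c->count > 0) ? c->blocks[--c->count] : NULL;
    cache_unlock(c);
    return block;
}
```

The critical section is a couple of loads and stores, which is why the
cost of finding out the CPU number dominates and has to be cheap.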

Just because userspace can get at the information directly from the
hardware doesn’t mean that this is the most efficient or best way for
userspace to get at it.

Oh, and some of these things are useful in portable code, so having to
write some assembly for every target to get information that the kernel
already knows is wasteful.
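
For comparison, this is roughly what portable code can do today without
assembly, as a sketch only.  cache_line_size_or() is a name invented
here; _SC_LEVEL1_DCACHE_LINESIZE is a glibc extension rather than
anything POSIX guarantees, so it is guarded with #ifdef and the caller
has to supply a fallback for every platform where it is missing or
reports nothing useful.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Return the L1 data cache line size if the C library can report it,
 * otherwise the caller's fallback (which must be a power of two). */
size_t cache_line_size_or(size_t fallback)
{
    assert(fallback > 0 && (fallback & (fallback - 1)) == 0);
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    /* glibc extension; may return 0 or -1 when the value is unknown. */
    long n = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    if (n > 0)
        return (size_t)n;
#endif
    return fallback;
}
```

The need for a guessed fallback is exactly the gap an MI interface would
close.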

David




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?71E8D6E7-F833-4B7E-B1F1-AD07A49CAF98>