Date:      Fri, 16 Dec 2016 12:10:01 -0800
From:      Adrian Chadd <adrian.chadd@gmail.com>
To:        Luigi Rizzo <rizzo@iet.unipi.it>
Cc:        David Chisnall <David.Chisnall@cl.cam.ac.uk>, Alan Somers <asomers@freebsd.org>,  "current@freebsd.org" <current@freebsd.org>
Subject:   Re: best approximation of getcpu() ?
Message-ID:  <CAJ-VmonnwTR%2Bh%2BmmXhjMFN%2B=0m02BKZqVvb99Jg=xYSUvzZONA@mail.gmail.com>
In-Reply-To: <20161216194519.GA71398@onelab2.iet.unipi.it>
References:  <20161216021719.GA63374@onelab2.iet.unipi.it> <CAOtMX2hdkCk3ho%2Byedpv7iPPi97be4eFViYm4%2Bmi8EC-iR2Uvg@mail.gmail.com> <D9F98972-ED18-4B59-AB3A-73B89F3C220D@cl.cam.ac.uk> <20161216194519.GA71398@onelab2.iet.unipi.it>

On 16 December 2016 at 11:45, Luigi Rizzo <rizzo@iet.unipi.it> wrote:
> On Fri, Dec 16, 2016 at 09:29:15AM +0000, David Chisnall wrote:
>> On 16 Dec 2016, at 03:10, Alan Somers <asomers@FreeBSD.org> wrote:
>> >
>> > What about pthread_setaffinity_np(3) and friends?  You can use it to pin
>> > a thread to a single CPU, and know that it will never migrate.
>>
>> This is not a useable solution for anything that needs to live in a
>> library and also doesn't solve the problem.
>>
>> The Linux get_cpu() call is used for caches that are somewhere between
>> global and thread-local.  Accessing them still requires a lock, but it's
>> very likely to be uncontended (contention only happens when you're
>> context switched at exactly the wrong time, or if a thread is migrated
>> between cores in between the get_cpu() call and usage) and so you can use
>> the userspace fast path for the lock and not suffer from cache contention
>> effects.
>>
>> On x86, you can use cpuid from userspace and get the current core ID.  I
>> have some code that does this and re-checks every few hundred accesses,
>> storing the current CPU ID in a thread-local variable.  Using the per-CPU
>> caches is a lot faster than using the global cache (and reduces contention
>> on the global cache).  It would be great if we could have a syscall to do
>> this on FreeBSD (it would be even better if we could specify a TLS
>> variable that the kernel automatically updates for the userspace thread
>> when the scheduler migrates the thread between cores).
>
> indeed the following line seems to do the job for x86
>         asm volatile("cpuid" : "=d"(curcpu), "=a"(tmp), "=b"(tmp), "=c"(tmp) : "a"(0xb) );
> (there must be a better way to tell the compiler that eax, ebx, ecx, edx are
> all clobbered).
>
> 0xb is the CPUID function that returns the current APIC id for the
> core (not necessarily matching the OS core-id)
>
> The only problem is that this instruction is serialising and slow,
> seems to take some 70-100ns on several of my machines so you
> cannot afford to call it at all times but need the value cached
> somewhere. Exposing it as thread local storage, or a VDSO syscall,
> would be nicer because the kernel knows when it is actually changing
> value.
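
For reference, the caching scheme described in the quoted text (read the
ID with cpuid, keep it in a thread-local variable, re-check every few
hundred accesses) might look roughly like the sketch below. This is an
illustration under assumptions, not the code either poster has: the names
approx_cpu_id, read_apic_id and RECHECK_INTERVAL are made up here, and the
interval of 256 is arbitrary. It uses __get_cpuid_count() from the
GCC/Clang <cpuid.h> header, which also sidesteps the hand-written clobber
list. The value returned is the APIC ID, not the OS core ID, and it can be
stale, so it is only a hint for picking a cache; the lock is still needed.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>              /* GCC/Clang helper: __get_cpuid_count() */
#endif

/* Hypothetical tuning knob: lookups served from the cache per re-read. */
#define RECHECK_INTERVAL 256

static _Thread_local uint32_t cached_id;     /* last APIC ID read */
static _Thread_local unsigned lookups_left;  /* 0 forces a re-read */

/* Slow path: CPUID leaf 0xb returns the x2APIC ID in EDX. */
static uint32_t
read_apic_id(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (__get_cpuid_count(0xb, 0, &eax, &ebx, &ecx, &edx))
        return (edx);
#endif
    /* Assumption: with no leaf 0xb, treat the machine as a single CPU. */
    return (0);
}

/* Fast path: return the cached ID, refreshing it every RECHECK_INTERVAL calls. */
static uint32_t
approx_cpu_id(void)
{
    if (lookups_left == 0) {
        cached_id = read_apic_id();
        lookups_left = RECHECK_INTERVAL;
    }
    lookups_left--;
    return (cached_id);
}
```

With a 70-100ns cpuid, an interval of 256 amortises the cost to well
under a nanosecond per lookup, at the price of using the wrong per-CPU
cache for up to 255 accesses after a migration.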

The problem is your CPU ID can change in the middle of packet handling.

So if you want it to be accurate, you need to bind your worker thread to a CPU.
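
A minimal sketch of that binding, assuming pthread_setaffinity_np(3) is
available. The helper name pin_self_to_cpu and the cpu_mask_t typedef are
invented for this example. On FreeBSD the mask type is cpuset_t and the
prototype lives in <pthread_np.h>; the other branch covers glibc, where
the type is cpu_set_t and _GNU_SOURCE is required.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc: expose pthread_setaffinity_np() */
#endif
#include <assert.h>
#include <pthread.h>
#ifdef __FreeBSD__
#include <sys/param.h>
#include <sys/cpuset.h>
#include <pthread_np.h>
typedef cpuset_t cpu_mask_t;    /* hypothetical alias, for portability only */
#else
#include <sched.h>
typedef cpu_set_t cpu_mask_t;
#endif

/*
 * Restrict the calling thread to a single CPU.
 * Returns 0 on success or an errno value on failure.
 */
static int
pin_self_to_cpu(int cpu)
{
    cpu_mask_t mask;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    return (pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask));
}
```

Each worker would call pin_self_to_cpu() once at startup with its own CPU
number; after that the CPU it runs on cannot change underneath it.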



-adrian


