Date:      Tue, 5 Jun 2007 11:51:18 -0700 (PDT)
From:      Jeff Roberson <jroberson@chesapeake.net>
To:        John Baldwin <jhb@freebsd.org>
Cc:        marcel@freebsd.org, kmacy@freebsd.org, benno@freebsd.org, marius@freebsd.org, arch@freebsd.org, jake@freebsd.org, freebsd-arch@freebsd.org, tmm@freebsd.org, cognet@freebsd.org, grehan@freebsd.org
Subject:   Re: New cpu_switch() and cpu_throw().
Message-ID:  <20070605114745.I606@10.0.0.1>
In-Reply-To: <200706051012.18864.jhb@freebsd.org>
References:  <20070604220649.E606@10.0.0.1> <200706051012.18864.jhb@freebsd.org>

On Tue, 5 Jun 2007, John Baldwin wrote:

> On Tuesday 05 June 2007 01:32:46 am Jeff Roberson wrote:
>> For every architecture we need to support new features in cpu_switch()
>> and cpu_throw() before they can support per-cpu schedlock.  I'll describe
>> those below.  I'm soliciting help or advice in implementing these on
>> platforms other than x86 and amd64, especially on ia64, where things are
>> implemented in C!
>>
>> I checked in the new version of cpu_switch() for amd64 today after
>> threadlock went in.  Basically, we have to release a thread's lock when
>> it's switched out and acquire a lock when it's switched in.
>>
>> The release must happen after we're totally done with the stack and
>> vmspace of the thread to be switched out.  On amd64 this meant after we
>> clear the active bits for tlb shootdown.  The release actually makes use
>> of a new 'mtx' argument to cpu_switch() and sets the td_lock pointer to
>> this argument rather than unlocking a real lock.  td_lock has previously
>> been set to the blocked lock, which is always blocked.  Threads
>> spinning in thread_lock() will notice the td_lock pointer change and
>> acquire the new lock.  So this is simple, just a non-atomic store with a
>> pointer passed as an argument.  On amd64:
>>
>>  	movq	%rdx, TD_LOCK(%rdi)		/* Release the old thread */
>>
>> The acquire part is slightly more complicated and involves a little loop.
>> We don't actually have to spin trying to lock the thread.  We just spin
>> until it's no longer set to the blocked lock.  The switching thread
>> already owns the per-cpu scheduler lock for the current cpu.  If we're
>> switching into a thread that is set to the blocked_lock another cpu is
>> about to set it to our current cpu's lock via the mtx argument mentioned
>> above.  On amd64 we have:
>>
>>  	/* Wait for the new thread to become unblocked */
>>  	movq	$blocked_lock, %rdx
>> 1:
>>  	movq	TD_LOCK(%rsi),%rcx
>>  	cmpq	%rcx, %rdx
>>  	je	1b
>
> If this is to handle a thread migrating from one CPU to the next (and there's
> no interlock to control migration, otherwise you wouldn't have to spin here),
> then you will need memory barriers on the first write (i.e. the first write
> above should be an atomic_store_rel()) and the equivalent of an _acq barrier
> here.

So, thanks for pointing this out.  Attilio also mentions that on x86 and 
amd64 we need a pause in the wait loop.  As we discussed, we can just use 
sfence rather than atomics on amd64; x86, however, will need atomics since 
you can't rely on the presence of *fence.  Other architectures will have 
to ensure memory ordering as appropriate.
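To pin down what each port has to provide, here is a rough C sketch of the
release and acquire sides using the atomic(9) _rel/_acq pointer ops.  This is
not the committed code; the helper names (sched_release_td(),
sched_wait_unblocked()) and the parameter names are made up for illustration,
and the real amd64 path of course stays hand-written asm.

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/proc.h>
#include <machine/atomic.h>
#include <machine/cpu.h>

extern struct mtx blocked_lock;		/* the always-blocked lock */

/*
 * Release the old thread: hand its td_lock over to the mtx passed into
 * cpu_switch().  The _rel barrier ensures everything done with the old
 * thread's stack and vmspace (e.g. clearing the TLB shootdown active
 * bits) is visible before other CPUs can observe the new lock pointer.
 */
static __inline void
sched_release_td(struct thread *oldtd, struct mtx *mtx)
{
	atomic_store_rel_ptr((volatile uintptr_t *)&oldtd->td_lock,
	    (uintptr_t)mtx);
}

/*
 * Acquire side: spin until the thread we are switching into is no
 * longer pointed at blocked_lock.  cpu_spinwait() is the MI wrapper
 * for "pause" on x86/amd64, and the _acq load gives the barrier John
 * asked for before we start running on the new thread's state.
 */
static __inline void
sched_wait_unblocked(struct thread *newtd)
{
	while (atomic_load_acq_ptr((volatile uintptr_t *)&newtd->td_lock) ==
	    (uintptr_t)&blocked_lock)
		cpu_spinwait();
}

In other words, a port would do the equivalent of sched_release_td() where the
movq into TD_LOCK(%rdi) is now, and the equivalent of sched_wait_unblocked()
in place of the 1b loop, with whatever ordering instructions that architecture
needs.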

Jeff

>
> -- 
> John Baldwin
>


