Date:      Tue, 26 Sep 2000 09:32:32 +0100 (BST)
From:      Doug Rabson <dfr@nlsystems.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        smp@freebsd.org, cp@bsdi.com, alpha@freebsd.org
Subject:   Re: Status update
Message-ID:  <Pine.BSF.4.21.0009260922050.35016-100000@salmon.nlsystems.com>
In-Reply-To: <XFMail.000923003422.jhb@FreeBSD.org>

On Sat, 23 Sep 2000, John Baldwin wrote:

> Ok, the alpha seems to be rather stable now without the need for obscene hacks
> to the mutex code to dink with mtx_saveipl.  To summarize, here are the changes
> thus far:
> 
> - The interrupt state of the sched_lock is now saved in a process's PCB during
>   cpu_switch().  This way, code before and after a call to either mi_switch()
>   or cpu_switch() is guaranteed to be run at the same interrupt state.  Without
>   this I was having problems on the alpha where the idle loop was running at
>   ALPHA_PSL_IPL_SOFT (1) and as a result init's child process was never run,
>   among other things.
> 
> This last change is something I'd like some feedback on.  I've checked
> the BSD/OS x86 code, and it only saves the recursion count of the
> sched_lock in the pcb.  However, after the problems with the alpha and
> some discussion with Peter Wemm on IRC, I decided that we should be
> doing this.  However, I'm not completely certain, and any thoughts
> that anyone has would be appreciated.

I think this is probably unnecessary. After implementing swi threads, the
only place in the kernel where ipl will be non-zero when sched_lock is
taken should be in the interrupt code and I don't think a context switch
is possible there until the AST after returning.


> 
> There are also a few more weirdisms on the alpha.  In a few places in
> sys/kern, we call spl0() instead of splx().  I've added some debugging code to
> do a printf() if we aren't actually at IPL_0 (what spl0 used to do) after the
> mtx_exit().  It does trigger in several cases during /etc/rc at least, but the
> machine seems to be running stable regardless (I'll be running a buildworld -j
> 8 tonight to stress test it).  My question is: is it ok for the code to run
> with some interrupts disabled or do we need to replace the calls to spl0()
> with enable_intr()?

I'm testing this now and I'm seeing a flood of diagnostic messages like:

	../../kern/kern_fork.c:537:fork1() spl0 needs fixing 

I think these are all due to the fact that sched_lock is held at that
point. The preceding mtx_exit() just decrements the recursion count.
I haven't worked out exactly why sched_lock is held here because it
shouldn't be. Possibly it's because we don't clear pcb_schednest in
cpu_fork().
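
A minimal user-space sketch of that hypothesis, not the kernel code: every
name below (toy_cpu, toy_mtx, toy_child_first_exit and the TOY_IPL_*
values) is made up for illustration, and the only assumption carried over
from the discussion is that a recursed exit merely decrements a count and
leaves the interrupt level alone. If a new process starts life with a
stale nest count of 1, its first exit never restores the IPL, which is
the situation the spl0 diagnostic would report.

```c
#include <stdio.h>

/* Toy model only: hypothetical names, not the FreeBSD mutex API. */
enum { TOY_IPL_0 = 0, TOY_IPL_HIGH = 7 };

struct toy_cpu {
    int ipl;        /* current interrupt priority level */
};

struct toy_mtx {
    int held;       /* nonzero while the lock is owned */
    int recurse;    /* extra acquisitions beyond the first */
    int saved_ipl;  /* level to restore on the final release */
};

static void toy_mtx_enter(struct toy_cpu *cpu, struct toy_mtx *m)
{
    if (m->held) {
        m->recurse++;           /* recursive acquire: just count it */
        return;
    }
    m->saved_ipl = cpu->ipl;    /* remember the level we came from */
    cpu->ipl = TOY_IPL_HIGH;    /* block interrupts while held */
    m->held = 1;
}

static void toy_mtx_exit(struct toy_cpu *cpu, struct toy_mtx *m)
{
    if (m->recurse > 0) {
        m->recurse--;           /* still held: IPL is left untouched */
        return;
    }
    m->held = 0;
    cpu->ipl = m->saved_ipl;    /* final release restores the level */
}

/*
 * Model a new process doing its first exit of the scheduler lock.
 * inherited_nest stands in for a recursion count copied from the
 * parent and never cleared. Returns the IPL after that one exit.
 */
int toy_child_first_exit(int inherited_nest)
{
    struct toy_cpu cpu = { TOY_IPL_0 };
    struct toy_mtx sched = { 0, 0, TOY_IPL_0 };

    toy_mtx_enter(&cpu, &sched);      /* lock held across the switch */
    sched.recurse = inherited_nest;   /* stale count, if not cleared */
    toy_mtx_exit(&cpu, &sched);

    if (cpu.ipl != TOY_IPL_0)
        printf("toy: spl0 needs fixing (ipl=%d)\n", cpu.ipl);
    return cpu.ipl;
}
```

In this model toy_child_first_exit(0) ends at level 0, while
toy_child_first_exit(1) stays at the raised level and prints the
diagnostic, because the exit only drops the count back to zero.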

-- 
Doug Rabson				Mail:  dfr@nlsystems.com
					Phone: +44 20 8348 6160







