From owner-freebsd-alpha  Tue Sep 26  1:32:55 2000
Date: Tue, 26 Sep 2000 09:32:32 +0100 (BST)
From: Doug Rabson <dfr@nlsystems.com>
To: John Baldwin <jhb@freebsd.org>
Cc: smp@freebsd.org, cp@bsdi.com, alpha@freebsd.org
Subject: Re: Status update
In-Reply-To: <XFMail.000923003422.jhb@FreeBSD.org>
Message-ID: <Pine.BSF.4.21.0009260922050.35016-100000@salmon.nlsystems.com>

On Sat, 23 Sep 2000, John Baldwin wrote:

> Ok, the alpha seems to be rather stable now without the need for obscene hacks
> to the mutex code to dink with mtx_saveipl.  To summarize, here are the changes
> thus far:
> 
> - The interrupt state of the sched_lock is now saved in a process's PCB during
>   cpu_switch().  This way, code before and after a call to either mi_switch()
>   or cpu_switch() is guaranteed to be run at the same interrupt state.  Without
>   this I was having problems on the alpha where the idle loop was running at
>   ALPHA_PSL_IPL_SOFT (1) and as a result init's child process was never run,
>   among other things.
> 
> This last change is something I'd like some feedback on.  I've checked
> the BSD/OS x86 code, and it only saves the recursion count of the
> sched_lock in the pcb.  However, after the problems with the alpha and
> some discussion with Peter Wemm on IRC, I decided that we should be
> doing this.  However, I'm not completely certain, and any thoughts
> that anyone has would be appreciated.

I think this is probably unnecessary. After implementing swi threads, the
only place in the kernel where ipl will be non-zero when sched_lock is
taken should be in the interrupt code and I don't think a context switch
is possible there until the AST after returning.
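
For readers following the thread, the change under discussion can be
sketched as a toy model. This is only an illustration under stated
assumptions: toy_pcb, toy_cpu, toy_cpu_switch() and toy_round_trip() are
hypothetical names, the IPL values are made up, and none of this is the
actual FreeBSD code. It shows the claimed property that a thread sees the
same saved interrupt state before and after a switch, even when the thread
it switched to (here, the idle loop) held sched_lock at a different level.

```c
#include <assert.h>

/* Hypothetical stand-ins; these are not the real FreeBSD definitions. */
enum { IPL_0 = 0, IPL_SOFT = 1 };

struct toy_pcb {
    int saved_ipl;      /* interrupt state saved with sched_lock at switch time */
};

struct toy_cpu {
    int sched_saveipl;  /* IPL to restore when sched_lock is finally released */
};

/* Park the outgoing thread's saved state in its PCB, load the incoming one's. */
static void
toy_cpu_switch(struct toy_cpu *cpu, struct toy_pcb *from, struct toy_pcb *to)
{
    from->saved_ipl = cpu->sched_saveipl;
    cpu->sched_saveipl = to->saved_ipl;
}

/*
 * Thread "a" switches away to the idle thread and later back again.
 * Returns the saved IPL that "a" sees once it is running again.
 */
static int
toy_round_trip(int a_ipl, int idle_ipl)
{
    struct toy_cpu cpu = { .sched_saveipl = a_ipl };
    struct toy_pcb a = { 0 };
    struct toy_pcb idle = { .saved_ipl = idle_ipl };

    toy_cpu_switch(&cpu, &a, &idle);        /* a -> idle */
    assert(cpu.sched_saveipl == idle_ipl);  /* idle runs with its own state */
    toy_cpu_switch(&cpu, &idle, &a);        /* idle -> a */
    return cpu.sched_saveipl;
}
```

In this model toy_round_trip(IPL_0, IPL_SOFT) hands back IPL_0, so the idle
loop's level does not leak into the resumed thread. The objection above is
that, once swi threads exist, the two levels should not differ at a switch
in the first place, which would make the extra save redundant.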


> 
> There are also a few more weirdisms on the alpha.  In a few places in
> sys/kern, we call spl0() instead of splx().  I've added some debugging code to
> do a printf() if we aren't actually at IPL_0 (what spl0 used to do) after the
> mtx_exit().  It does trigger in several cases during /etc/rc at least, but the
> machine seems to be running stable regardless (I'll be running a buildworld -j
> 8 tonight to stress test it).  My question is: is it ok for the code to run
> with some interrupts disabled or do we need to replace the calls to spl0()
> with enable_intr()?

I'm testing this now and I'm seeing a flood of diagnostic messages like:

	../../kern/kern_fork.c:537:fork1() spl0 needs fixing 

I think these are all due to the fact that sched_lock is held at that
point. The preceding mtx_exit() just decrements the recursion count.
I haven't worked out exactly why sched_lock is held here, because it
shouldn't be. Possibly it's because we don't clear pcb_schednest in
cpu_fork().
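
The mechanism described above can be modelled in a few lines. This is a
rough sketch under stated assumptions, not the kernel's mutex code: toy_mtx,
toy_mtx_enter(), toy_mtx_exit(), toy_check_spl0() and toy_fork_scenario()
are hypothetical names, and the stale nesting count is simulated with extra
enter calls, standing in for the unconfirmed guess about pcb_schednest. The
point is only that an exit on a recursed lock decrements the count and
leaves the IPL raised, so a check for IPL 0 afterwards fires.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are not the real FreeBSD definitions. */
enum { IPL_0 = 0, IPL_HIGH = 7 };

static int toy_cur_ipl = IPL_0;     /* models the CPU's current IPL */

struct toy_mtx {
    int held;       /* nonzero while the lock is owned */
    int recurse;    /* number of extra (nested) acquisitions */
    int saveipl;    /* IPL to restore on the final release */
};

static void
toy_mtx_enter(struct toy_mtx *m)
{
    if (m->held) {
        m->recurse++;           /* nested acquire: only bump the count */
        return;
    }
    m->saveipl = toy_cur_ipl;   /* remember the caller's IPL */
    toy_cur_ipl = IPL_HIGH;     /* a spin lock blocks interrupts */
    m->held = 1;
}

static void
toy_mtx_exit(struct toy_mtx *m)
{
    assert(m->held);
    if (m->recurse > 0) {
        m->recurse--;           /* still held: the IPL is NOT restored */
        return;
    }
    m->held = 0;
    toy_cur_ipl = m->saveipl;   /* final release restores the IPL */
}

/* Debug check in the spirit of the one quoted above: 1 if not at IPL 0. */
static int
toy_check_spl0(const char *file, int line, const char *func)
{
    if (toy_cur_ipl == IPL_0)
        return 0;
    printf("%s:%d:%s() spl0 needs fixing\n", file, line, func);
    return 1;
}

/* Enter and exit once, starting with stale_nest leftover acquisitions. */
static int
toy_fork_scenario(int stale_nest)
{
    struct toy_mtx sched_lock = { 0 };
    int i;

    toy_cur_ipl = IPL_0;
    for (i = 0; i < stale_nest; i++)
        toy_mtx_enter(&sched_lock);     /* simulated leftover nesting */
    toy_mtx_enter(&sched_lock);
    toy_mtx_exit(&sched_lock);
    return toy_check_spl0("toy_fork.c", 0, "toy_fork1");
}
```

With no leftover nesting the check stays quiet; with one stale level the
exit merely decrements the count and the diagnostic is printed.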

-- 
Doug Rabson				Mail:  dfr@nlsystems.com
					Phone: +44 20 8348 6160

