From owner-freebsd-smp  Tue Sep 26  1:49:43 2000
Date: Tue, 26 Sep 2000 09:49:30 +0100 (BST)
From: Doug Rabson <dfr@nlsystems.com>
To: John Baldwin <jhb@freebsd.org>
Cc: smp@freebsd.org, cp@bsdi.com, alpha@freebsd.org
Subject: Re: Status update
In-Reply-To: <Pine.BSF.4.21.0009260922050.35016-100000@salmon.nlsystems.com>
Message-ID: <Pine.BSF.4.21.0009260948210.35016-100000@salmon.nlsystems.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Tue, 26 Sep 2000, Doug Rabson wrote:

> On Sat, 23 Sep 2000, John Baldwin wrote:
> > 
> > There are also a few more weirdisms on the alpha.  In a few places in
> > sys/kern, we call spl0() instead of splx().  I've added some debugging code to
> > do a printf() if we aren't actually at IPL_0 (what spl0 used to do) after the
> > mtx_exit().  It does trigger in several cases during /etc/rc at least, but the
> > machine seems to be running stable regardless (I'll be running a buildworld -j
> > 8 tonight to stress test it).  My question is: is it ok for the code to run
> > with some interrupts disabled or do we need to replace the calls to spl0()
> > with enable_intr()?
> 
> I'm testing this now and I'm seeing a flood of diagnostic messages like:
> 
> 	../../kern/kern_fork.c:537:fork1() spl0 needs fixing 
> 
> I think these are all due to the fact that sched_lock is held at that
> point. The preceding mtx_exit() just decrements the recursion count.
> I haven't worked out exactly why sched_lock is held here because it
> shouldn't be. Possibly it's because we don't clear pcb_schednest in
> cpu_fork().

As I suspected, clearing pcb_schednest makes this lot go away. Try this
version of vm_machdep.c:

Index: vm_machdep.c
===================================================================
RCS file: /home/ncvs/src/sys/alpha/alpha/vm_machdep.c,v
retrieving revision 1.33
diff -u -r1.33 vm_machdep.c
--- vm_machdep.c	2000/09/07 01:32:39	1.33
+++ vm_machdep.c	2000/09/26 08:38:55
@@ -210,6 +210,13 @@
 		up->u_pcb.pcb_context[2] = (u_long) p2;	/* s2: a0 */
 		up->u_pcb.pcb_context[7] =
 		    (u_int64_t)switch_trampoline;	/* ra: assembly magic */
+
+		/*
+		 * Clear the saved recursion count for sched_lock
+		 * since the child needs only one count which is
+		 * released in switch_trampoline.
+		 */
+		up->u_pcb.pcb_schednest = 0;
 	}
 }
 

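For anyone following along, here is a toy model of why a stale
pcb_schednest keeps sched_lock held in the child. It is a sketch, not
kernel code: toy_mtx, toy_mtx_exit() and child_still_holds_lock() are
made-up names, and it assumes the context-switch code reloads the
lock's recursion count from the value saved in the pcb.

```c
#include <assert.h>

/* Toy recursive lock, NOT the kernel's struct mtx. */
struct toy_mtx {
	int held;	/* 1 while some context owns the lock */
	int recurse;	/* extra acquisitions beyond the first */
};

/* Release one count; the lock is dropped only when no recursion remains. */
void
toy_mtx_exit(struct toy_mtx *m)
{
	if (m->recurse > 0)
		m->recurse--;
	else
		m->held = 0;
}

/*
 * Model the child's first run: the switch code holds the lock once and
 * (by assumption) reloads the recursion count saved in the child's pcb,
 * then the trampoline releases exactly one count.
 * Returns 1 if the lock is still held afterwards, 0 otherwise.
 */
int
child_still_holds_lock(int saved_schednest)
{
	struct toy_mtx sched = { 1, saved_schednest };

	toy_mtx_exit(&sched);	/* stands in for switch_trampoline */
	return (sched.held);
}
```

With a count inherited from the parent, child_still_holds_lock(1)
returns 1: the single release only decrements the recursion count.
With the count cleared, child_still_holds_lock(0) returns 0, which is
the effect the patch above is after.
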
-- 
Doug Rabson				Mail:  dfr@nlsystems.com
					Phone: +44 20 8348 6160

