Date: Tue, 26 Sep 2000 09:49:30 +0100 (BST)
From: Doug Rabson
To: John Baldwin
Cc: smp@freebsd.org, cp@bsdi.com, alpha@freebsd.org
Subject: Re: Status update

On Tue, 26 Sep 2000, Doug Rabson wrote:

> On Sat, 23 Sep 2000, John Baldwin wrote:
>
> > There are also a few more weirdisms on the alpha. In a few places in
> > sys/kern, we call spl0() instead of splx(). I've added some debugging code to
> > do a printf() if we aren't actually at IPL_0 (what spl0 used to do) after the
> > mtx_exit(). It does trigger in several cases during /etc/rc at least, but the
> > machine seems to be running stable regardless (I'll be running a buildworld -j
> > 8 tonight to stress test it). My question is: is it ok for the code to run
> > with some interrupts disabled or do we need to replace the calls to spl0()
> > with enable_intr()?
>
> I'm testing this now and I'm seeing a flood of diagnostic messages like:
>
> ../../kern/kern_fork.c:537:fork1() spl0 needs fixing
>
> I think these are all due to the fact that sched_lock is held at that
> point. The preceding mtx_exit() just decrements the recursion count.
> I haven't worked out exactly why sched_lock is held here because it
> shouldn't be. Possibly it's because we don't clear pcb_schednest in
> cpu_fork().

As I suspected, clearing pcb_schednest makes this lot go away. Try this
version of vm_machdep.c:

Index: vm_machdep.c
===================================================================
RCS file: /home/ncvs/src/sys/alpha/alpha/vm_machdep.c,v
retrieving revision 1.33
diff -u -r1.33 vm_machdep.c
--- vm_machdep.c	2000/09/07 01:32:39	1.33
+++ vm_machdep.c	2000/09/26 08:38:55
@@ -210,6 +210,13 @@
 		up->u_pcb.pcb_context[2] = (u_long) p2;	/* s2: a0 */
 		up->u_pcb.pcb_context[7] =
 		    (u_int64_t)switch_trampoline;	/* ra: assembly magic */
+
+		/*
+		 * Clear the saved recursion count for sched_lock
+		 * since the child needs only one count which is
+		 * released in switch_trampoline.
+		 */
+		up->u_pcb.pcb_schednest = 0;
 	}
 }

-- 
Doug Rabson				Mail:  dfr@nlsystems.com
					Phone: +44 20 8348 6160