From owner-freebsd-smp Tue Jun 24 12:18:39 1997
Return-Path:
Received: (from root@localhost)
          by hub.freebsd.org (8.8.5/8.8.5) id MAA01172
          for smp-outgoing; Tue, 24 Jun 1997 12:18:39 -0700 (PDT)
Received: from Kitten.mcs.com (Kitten.mcs.com [192.160.127.90])
          by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id MAA01136;
          Tue, 24 Jun 1997 12:18:20 -0700 (PDT)
Received: from Mercury.mcs.net (fredriks@Mercury.mcs.net [192.160.127.80])
          by Kitten.mcs.com (8.8.5/8.8.2) with ESMTP id OAA18884;
          Tue, 24 Jun 1997 14:18:18 -0500 (CDT)
Received: (from fredriks@localhost)
          by Mercury.mcs.net (8.8.5/8.8.2) id OAA19076;
          Tue, 24 Jun 1997 14:18:17 -0500 (CDT)
From: Lars Fredriksen
Message-Id: <199706241918.OAA19076@Mercury.mcs.net>
Subject: Re: Recent (last two days) smp kernel is hanging for me
To: smp@csn.net (Steve Passe)
Date: Tue, 24 Jun 1997 14:18:16 -0500 (CDT)
Cc: peter@spinner.dialix.com.au, bob@luke.pmr.com,
    lars@fredriks-1.pr.mcs.net, dyson@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG
In-Reply-To: <199706241807.MAA11971@Ilsa.StevesCafe.com> from "Steve Passe" at Jun 24, 97 12:07:01 pm
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Steve Passe writes:
>
> ---
> So to summarize the failure on Lar's system:
>
>  I just booted the SMP kernel and broke into the debugger when
>  it got hung (after saying that all 2 cpus are online). Well it isn't
>  actually the kernel per se that is hung. It is in the smp_idleloop.
>
>  The problem is that one of the rc scripts is running swapon and it
>  is hung - on wait channel 0xf0b60a00 (i'll try to find out what this
>  is)
>
>  I have 2 scsi controllers in my system; a 2940 and a 2944.
>

This morning, when I booted the SMP kernel with the lock fixes, I let it
sit for about 10 minutes after it hung. I did get error messages from the
SCSI driver saying that it timed out ("Timeout SCB handled by another
timeout") on the controller that has the swap devices on it. To me, that
indicates that the kernel does not see the interrupt coming back from the
controller board.

Now, I am making one hell of an assumption here: that the firmware
download and initialization of the board succeeded. The dmesg output did
indicate that.

I'll do one more test, which is to comment out the swapon in /etc/rc. I
expect that if I do that, the machine will just fail to mount any of the
file systems that are connected through the second (ahc0) controller.

From what I can see, ahc0 is sharing its interrupt with the VGA and
Ethernet controllers, even though they seem to be steered to different
pins(?) by means of pci0.19.0, pci.18.0, and so forth.

As for the request to get the kernel messages through the serial port,
that is going to take some time. I first need to get the other box up and
running :-)

Lars
-- 
-------------------------------------------------------------------
Lars Fredriksen    fredriks@mcs.com            (home)
                   lars@fredriks-2.pr.mcs.net  (home-home)