From owner-freebsd-smp Tue Jun 24 11:09:38 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id LAA28251 for smp-outgoing; Tue, 24 Jun 1997 11:09:38 -0700 (PDT)
Received: from Ilsa.StevesCafe.com (Ilsa.StevesCafe.com [205.168.119.129]) by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id LAA28232; Tue, 24 Jun 1997 11:09:15 -0700 (PDT)
Received: from Ilsa.StevesCafe.com (localhost [127.0.0.1]) by Ilsa.StevesCafe.com (8.8.5/8.8.5) with ESMTP id MAA11971; Tue, 24 Jun 1997 12:07:01 -0600 (MDT)
Message-Id: <199706241807.MAA11971@Ilsa.StevesCafe.com>
X-Mailer: exmh version 2.0gamma 1/27/96
From: Steve Passe
To: Peter Wemm
cc: Bob Willcox, lars@fredriks-1.pr.mcs.net, dyson@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG
Subject: Re: Recent (last two days) smp kernel is hanging for me
In-reply-to: Your message of "Tue, 24 Jun 1997 20:53:00 +0800." <199706241253.UAA10065@spinner.dialix.com.au>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Tue, 24 Jun 1997 12:07:01 -0600
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

> Bob Willcox wrote:
> > On Tue, Jun 24, 1997 at 01:32:10AM -0600, Steve Passe wrote:
> > >
> > > I found something that would explain the new code working on the intels,
> > > but failing on both the ASUS and the P6DNF.
> > > ...
> > Well, that seemed to do the trick on my system!! :-)
>
> Well, pass me the pointy hat! *-<8-)
>
> Cheers,
> -Peter

Sorry, Peter, but if that's the only thing you missed in that massive set
of changes, you don't get to wear the hat!

John Dyson also reports that SMP is working again for him.

Lars reports that this fix gets him further along, but his kernel still
hangs in swapon.

---
So to summarize the failure on Lars's system:

I just booted the SMP kernel and broke into the debugger when it got hung
(after saying that all 2 cpus are online). Well, it isn't actually the
kernel per se that is hung. It is in the smp_idleloop.
The problem is that one of the rc scripts is running swapon and it is hung
on wait channel 0xf0b60a00 (I'll try to find out what this is).

I have 2 scsi controllers in my system: a 2940 and a 2944.

- config:

# ahc1 is the controller for sd0, swap devices on ahc0
controller ahc0
controller ahc1
controller scbus0 at ahc0
controller scbus1 at ahc1
disk sd0 at scbus1 target 0 unit 0
#disk sd1 at scbus0 target 1
tape st0 at scbus1 target 2
tape st1 at scbus1 target 5
device sd0 #Only need one of these, the code dynamically grows
device st0
device cd0

- dmesg:

ahc0: rev 0x03 int a irq 11 on pci0.19.0 using shared irq11.
ahc0: Reading SEEPROM...done.
low byte termination enabled, high byte termination enabled
ahc0: aic7870 Wide Channel, SCSI Id=7, 16/255 SCBs
ahc0: Resetting Channel A
ahc0: Downloading Sequencer Program...ahc0: 411 instructions downloaded Done
ahc0: Probing channel A
Choosing drivers for scbus configured at 0
ahc0: waiting for scsi devices to settle
scbus0 at ahc0 bus 0
ahc0: target 0 synchronous at 8.0MHz, offset = 0xf
ahc0: target 0 Tagged Queuing Device
scbus0 target 0 lun 0: type 0 fixed SCSI 2
sd1 at scbus0 target 0 lun 0
sd1: Direct-Access 2030MB (4159462 512 byte sectors)
sd1: with 2626 cyls, 19 heads, and an average 83 sectors/track
ahc0: target 1 synchronous at 4.4MHz, offset = 0xf
scbus0 target 1 lun 0: type 0 fixed SCSI 1
sd2 at scbus0 target 1 lun 0
sd2: Direct-Access 955MB (1956864 512 byte sectors)
sd2: with 1931 cyls, 15 heads, and an average 67 sectors/track
ahc0: target 2 synchronous at 10.0MHz, offset = 0x8
ahc0: target 2 Tagged Queuing Device
scbus0 target 2 lun 0: type 0 fixed SCSI 2
sd3 at scbus0 target 2 lun 0
sd3: Direct-Access 1003MB (2054864 512 byte sectors)
sd3: with 2051 cyls, 13 heads, and an average 77 sectors/track
ahc0: target 3 using asynchronous transfers
ahc0:A:3: Warning - unknown message received from target (0x1). SEQ_FLAGS == 0x6. Rejecting
scbus0 target 3 lun 0: type 1 removable SCSI 2
st2 at scbus0 target 3 lun 0
st2: Sequential-Access density code 0x0, drive empty
ahc0: target 4 synchronous at 4.4MHz, offset = 0xf
scbus0 target 4 lun 0: type 0 fixed SCSI 1
sd4 at scbus0 target 4 lun 0
sd4: Direct-Access 955MB (1956864 512 byte sectors)
sd4: with 1931 cyls, 15 heads, and an average 67 sectors/track
ahc0: target 5 synchronous at 4.4MHz, offset = 0xf
scbus0 target 5 lun 0: type 0 fixed SCSI 1
sd5 at scbus0 target 5 lun 0
sd5: Direct-Access 955MB (1956864 512 byte sectors)
sd5: with 1931 cyls, 15 heads, and an average 67 sectors/track
ahc0: target 6 synchronous at 4.4MHz, offset = 0xf
scbus0 target 6 lun 0: type 0 fixed SCSI 1
sd6 at scbus0 target 6 lun 0
sd6: Direct-Access 955MB (1956864 512 byte sectors)
sd6: with 1931 cyls, 15 heads, and an average 67 sectors/track
probe0(ahc0:9:0): scsi_cmd
probe0(ahc0:9:0): scsi_done
scbus0 target 9 lun 0: command: 0,0,0,0,0,0-[0 bytes]
probe0(ahc0:9:0): scsi_cmd
probe0(ahc0:9:0): scsi_done
scbus0 target 9 lun 0: command: 12,0,0,0,2c,0-[44 bytes]
------------------------------
000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
016: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
032: 00 00 00 00 00 00 00 00 00 00 00 00
------------------------------
found-> vendor=0x9004, dev=0x8178, revid=0x00
    class=01-00-00, hdrtype=0x00, mfdev=0
    intpin=a, irq=14
    map[0]: type 4, range 32, base 0000e800, size 8
    map[1]: type 1, range 32, base fe3e7000, size 12
ahc1: rev 0x00 int a irq 14 on pci0.20.0
ahc1: Reading SEEPROM...done.
internal50 cable is present
internal68 cable not present
brdctl == 0xac
external cable is present
eprom is present
brdctl == 0xac
low byte termination disabled, high byte termination enabled
ahc1: aic7880 Wide Channel, SCSI Id=7, 16/255 SCBs
ahc1: Resetting Channel A
ahc1: Downloading Sequencer Program...ahc1: 418 instructions downloaded Done
ahc1: Probing channel A
Choosing drivers for scbus configured at 1
ahc1: waiting for scsi devices to settle
scbus1 at ahc1 bus 0
ahc1: target 0 synchronous at 4.4MHz, offset = 0xf
scbus1 target 0 lun 0: type 0 fixed SCSI 1
sd is configured at 0
sd0 at scbus1 target 0 lun 0
sd0: Direct-Access 955MB (1956864 512 byte sectors)
sd0: with 1931 cyls, 15 heads, and an average 67 sectors/track
ahc1:A:3: refuses synchronous negotiation. Using asynchronous transfers
scbus1 target 3 lun 0: type 5 removable SCSI 2
cd0 at scbus1 target 3 lun 0
cd0: CD-ROM cd present [78823 x 2048 byte records]
scbus1 target 3 lun 1: type 5 removable SCSI 2
cd1 at scbus1 target 3 lun 1
cd1: CD-ROM can't get the size
scbus1 target 3 lun 2: type 5 removable SCSI 2
cd2 at scbus1 target 3 lun 2
cd2: CD-ROM cd present [196535 x 2048 byte records]
scbus1 target 3 lun 3: type 5 removable SCSI 2
cd3 at scbus1 target 3 lun 3
cd3: CD-ROM cd present [227432 x 2048 byte records]
scbus1 target 3 lun 4: type 5 removable SCSI 2
cd4 at scbus1 target 3 lun 4
cd4: CD-ROM cd present [167490 x 2048 byte records]
scbus1 target 3 lun 5: type 5 removable SCSI 2
cd5 at scbus1 target 3 lun 5
cd5: CD-ROM cd present [87066 x 2048 byte records]
scbus1 target 3 lun 6: type 5 removable SCSI 2
cd6 at scbus1 target 3 lun 6
cd6: CD-ROM cd present [303991 x 2048 byte records]
ahc1: target 5 synchronous at 5.0MHz, offset = 0x8
scbus1 target 5 lun 0: type 1 removable SCSI 2
st is configured at 1
st1 at scbus1 target 5 lun 0
st1: Sequential-Access density code 0x13, drive empty
probe0(ahc1:9:0): scsi_cmd
probe0(ahc1:9:0): scsi_done
scbus1 target 9 lun 0: command: 0,0,0,0,0,0-[0 bytes]
probe0(ahc1:9:0): scsi_cmd
probe0(ahc1:9:0): scsi_done
scbus1 target 9 lun 0: command: 12,0,0,0,2c,0-[44 bytes]
------------------------------
000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
016: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
032: 00 00 00 00 00 00 00 00 00 00 00 00
------------------------------

-------------------------------------------------------------------------------

I noticed long ago that a line in the config file like:

config kernel root on sd0 swaps on sd0
                          ^^^^^^^^^^^^

would hang SMP. Since these might be related, I am going to try adding
that here and hopefully attack the problem locally.

If anyone has any clues/theories, please speak up...

--
Steve Passe | powered by
smp@csn.net | Symmetric MultiProcessor FreeBSD