Date: Thu, 9 Dec 1999 11:56:25 -0400 (AST)
From: The Hermit Hacker <scrappy@hub.org>
To: Ben Speirs <igiveup@ix.netcom.com>
Cc: freebsd-scsi@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: SCSI problem ... OS or just bus?
Message-ID: <Pine.BSF.4.21.9912091154270.500-100000@thelab.hub.org>
In-Reply-To: <384F3A52.23868C19@ix.netcom.com>

As an update, so far...without turning news back on again, and after
upgrading the kernel and doing a make world 12hrs ago, things have been
stable *so far*...the first time this happened after we added the drives,
it took about 17hrs or so...subsequent ones generally took 2-4hrs...

I'm going to re-enable news this afternoon and see if adding that extra
thrashing to the system causes a repeat of the problem or not...

On Wed, 8 Dec 1999, Ben Speirs wrote:

> The Hermit Hacker wrote:
> >
> > I recently did two upgrades in the course of a few days...upgraded my
> > 3.3-STABLE to a more recent version, and added hard drives onto the
> > system...now I'm getting SCSI problems that make no sense :(
> >
> > The machine just hung once more, which it's doing every few hours...I can
> > get down to the debugger, but a 'trace' doesn't appear to show anything, so
> > I panic...
> >
> > ==========
> > (da4:ahc0:0:8:0): Other SCB Timeout
> > (da4:ahc0:0:8:0): SCB 0xeb - timed out in dataout phase, SEQADDR == 0x10f
> > (da4:ahc0:0:8:0): Other SCB Timeout
> > (da2:ahc0:0:5:0): SCB 0x24 - timed out in dataout phase, SEQADDR == 0x10f
> > (da2:ahc0:0:5:0): BDR message in message buffer
> > (da2:ahc0:0:5:0): SCB 0x92 - timed out in dataout phase, SEQADDR == 0x10f
> > (da2:ahc0:0:5:0): no longer in timeout, status = 34b
> > ahc0: Issued Channel A Bus Reset. 98 SCBs aborted
>
> Just another data point - A similar thing happened to me. I rebuilt the
> kernel and world back in September and my previously happy SCSI system
> started issuing the same type of messages. I saved the output of the
> system log. Portions of it are listed below:
>
> Copyright (c) 1992-1999 FreeBSD Inc.
> Copyright (c) 1982, 1986, 1989, 1991, 1993
> The Regents of the University of California. All rights reserved.
> FreeBSD 3.3-STABLE #3: Fri Sep 24 21:00:39 PDT 1999
> root@sloth:/usr/src/sys/compile/SLOTH
> [...trim...]
> ahc0: <Adaptec 2940 Ultra SCSI adapter> rev 0x00 int a irq 9 on pci0.9.0
> ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
> [...trim...]
> Waiting 8 seconds for SCSI devices to settle
> changing root device to da0s3a
> da0 at ahc0 bus 0 target 15 lun 0
> da0: <FUJITSU M2954Q-512 0142> Fixed Direct Access SCSI-2 device
> da0: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
> da0: 4149MB (8498506 512 byte sectors: 255H 63S/T 529C)
> cd0 at ahc0 bus 0 target 0 lun 0
> cd0: <TOSHIBA CD-ROM XM-5701TA 3136> Removable CD-ROM SCSI-2 device
> cd0: 10.000MB/s transfers (10.000MHz, offset 8)
> cd0: Attempt to query device size failed: NOT READY, Medium not present
> cd1 at ahc0 bus 0 target 1 lun 0
> cd1: <NEC CD-ROM DRIVE:500 2.5> Removable CD-ROM SCSI-2 device
> cd1: 3.300MB/s transfers
> cd1: Attempt to query device size failed: NOT READY, Medium not present
>
>
> Unexpected busfree. LASTPHASE == 0x1
> SEQADDR == 0x153
> ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
> SAVED_TCL == 0x0, ARG_1 == 0xff, SEQ_FLAGS == 0x0
> (cd0:ahc0:0:0:0): SCB 0x16 - timed out in datain phase, SEQADDR == 0x153
> (cd0:ahc0:0:0:0): Other SCB Timeout
> (da0:ahc0:0:15:0): SCB 0x3 - timed out in datain phase, SEQADDR == 0x153
> (da0:ahc0:0:15:0): BDR message in message buffer
> (da0:ahc0:0:15:0): SCB 0x3 - timed out in datain phase, SEQADDR == 0x153
> (da0:ahc0:0:15:0): no longer in timeout, status = 34b
> ahc0: Issued Channel A Bus Reset. 2 SCBs aborted
> fd0c: hard error reading fsbn 0 (No status)
>
>
> The problem occurred while accessing the da0 device and cd0 device at
> the same time. I could reproduce it at will, and almost instantly by
> copying a file from the CD-ROM to the hard drive. I could not reproduce
> the error with the older, slower NEC cd1 CD-ROM device. I rechecked all
> my termination and unplugged one device after another without any
> success. My guess was that the cd0 drive had gone goofy on me. The
> only thing I have not tried is replacing the cables. Since I had the
> other CD available my fix was to yank out the suspect device. It has
> been near the bottom of my 'things to do' list.
>
> Maybe we both got bit by the same "fix" that uncovered hidden hardware
> problems. Maybe not, it looks like you have problems with only Wide
> channel devices.
>
> --
> -Ben Speirs
>

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message