From owner-freebsd-stable Wed Dec 8 21:13:14 1999
Delivered-To: freebsd-stable@freebsd.org
Received: from smtp10.atl.mindspring.net (smtp10.atl.mindspring.net [207.69.200.246])
        by hub.freebsd.org (Postfix) with ESMTP
        id 2C5171556D; Wed, 8 Dec 1999 21:12:59 -0800 (PST)
        (envelope-from igiveup@ix.netcom.com)
Received: from ix.netcom.com (user-2ini8pe.dialup.mindspring.com [165.121.35.46])
        by smtp10.atl.mindspring.net (8.9.3/8.8.5) with ESMTP id AAA28198;
        Thu, 9 Dec 1999 00:12:51 -0500 (EST)
Message-ID: <384F3A52.23868C19@ix.netcom.com>
Date: Wed, 08 Dec 1999 21:12:50 -0800
From: Ben Speirs
X-Mailer: Mozilla 4.7 [en] (X11; U; FreeBSD 3.3-STABLE i386)
X-Accept-Language: en
MIME-Version: 1.0
To: The Hermit Hacker
Cc: freebsd-scsi@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: SCSI problem ... OS or just bus?
References:
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

The Hermit Hacker wrote:
>
> I recently did two upgrades in the course of a few days...upgraded my
> 3.3-STABLE to a more recent version, and added hard drives onto the
> system...now I'm getting SCSI problems that make no sense :(
>
> The machine just hung once more, which it's doing every few hours...I can
> get down to the debugger, but a 'trace' doesn't appear to show anything, so
> I panic...
>
> ==========
> (da4:ahc0:0:8:0): Other SCB Timeout
> (da4:ahc0:0:8:0): SCB 0xeb - timed out in dataout phase, SEQADDR == 0x10f
> (da4:ahc0:0:8:0): Other SCB Timeout
> (da2:ahc0:0:5:0): SCB 0x24 - timed out in dataout phase, SEQADDR == 0x10f
> (da2:ahc0:0:5:0): BDR message in message buffer
> (da2:ahc0:0:5:0): SCB 0x92 - timed out in dataout phase, SEQADDR == 0x10f
> (da2:ahc0:0:5:0): no longer in timeout, status = 34b
> ahc0: Issued Channel A Bus Reset. 98 SCBs aborted

Just another data point: a similar thing happened to me.
I rebuilt the kernel and world back in September and my previously happy
SCSI system started issuing the same type of messages. I saved the output
of the system log. Portions of it are listed below:

Copyright (c) 1992-1999 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 3.3-STABLE #3: Fri Sep 24 21:00:39 PDT 1999
    root@sloth:/usr/src/sys/compile/SLOTH
[...trim...]
ahc0: rev 0x00 int a irq 9 on pci0.9.0
ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
[...trim...]
Waiting 8 seconds for SCSI devices to settle
changing root device to da0s3a
da0 at ahc0 bus 0 target 15 lun 0
da0: Fixed Direct Access SCSI-2 device
da0: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da0: 4149MB (8498506 512 byte sectors: 255H 63S/T 529C)
cd0 at ahc0 bus 0 target 0 lun 0
cd0: Removable CD-ROM SCSI-2 device
cd0: 10.000MB/s transfers (10.000MHz, offset 8)
cd0: Attempt to query device size failed: NOT READY, Medium not present
cd1 at ahc0 bus 0 target 1 lun 0
cd1: Removable CD-ROM SCSI-2 device
cd1: 3.300MB/s transfers
cd1: Attempt to query device size failed: NOT READY, Medium not present
Unexpected busfree. LASTPHASE == 0x1 SEQADDR == 0x153
ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
SAVED_TCL == 0x0, ARG_1 == 0xff, SEQ_FLAGS == 0x0
(cd0:ahc0:0:0:0): SCB 0x16 - timed out in datain phase, SEQADDR == 0x153
(cd0:ahc0:0:0:0): Other SCB Timeout
(da0:ahc0:0:15:0): SCB 0x3 - timed out in datain phase, SEQADDR == 0x153
(da0:ahc0:0:15:0): BDR message in message buffer
(da0:ahc0:0:15:0): SCB 0x3 - timed out in datain phase, SEQADDR == 0x153
(da0:ahc0:0:15:0): no longer in timeout, status = 34b
ahc0: Issued Channel A Bus Reset. 2 SCBs aborted
fd0c: hard error reading fsbn 0 (No status)

The problem occurred while accessing the da0 device and cd0 device at the
same time.
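
For anyone comparing logs like the one above, here is a rough sketch of
how the timeout lines could be tallied per device and bus phase. It is
only an illustration: the name count_timeouts and the regular expression
are assumptions based on the message format quoted in this post, not
anything taken from the ahc driver itself.

```python
import re
from collections import Counter

# Assumed pattern, derived only from the log lines quoted in this message:
#   (da0:ahc0:0:15:0): SCB 0x3 - timed out in datain phase, SEQADDR == 0x153
TIMEOUT_RE = re.compile(
    r"\((?P<dev>\w+):(?P<ctrl>\w+):\d+:(?P<target>\d+):\d+\): "
    r"SCB 0x[0-9a-fA-F]+ - timed out in (?P<phase>\w+) phase"
)


def count_timeouts(log_text):
    """Return a Counter keyed by (device, phase) for each SCB timeout line."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("dev"), match.group("phase"))] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the log in this message.
    sample = """\
(cd0:ahc0:0:0:0): SCB 0x16 - timed out in datain phase, SEQADDR == 0x153
(cd0:ahc0:0:0:0): Other SCB Timeout
(da0:ahc0:0:15:0): SCB 0x3 - timed out in datain phase, SEQADDR == 0x153
(da0:ahc0:0:15:0): BDR message in message buffer
(da0:ahc0:0:15:0): SCB 0x3 - timed out in datain phase, SEQADDR == 0x153
"""
    for (dev, phase), n in sorted(count_timeouts(sample).items()):
        print(f"{dev}: {n} timeout(s) in {phase} phase")
```

Run against a saved system log, a tally like this makes it easier to see
whether the timeouts cluster on one device or are spread across the bus.
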
I could reproduce it at will, almost instantly, by copying a file from
the CD-ROM to the hard drive. I could not reproduce the error with the
older, slower NEC cd1 CD-ROM device.

I rechecked all my termination and unplugged one device after another
without any success. My guess was that the cd0 drive had gone goofy on me.
The only thing I have not tried is replacing the cables. Since I had the
other CD-ROM drive available, my fix was to yank out the suspect device.
Looking into it further has been near the bottom of my 'things to do' list.

Maybe we both got bitten by the same "fix" that uncovered hidden hardware
problems. Maybe not; it looks like you have problems only with Wide
channel devices.

-- 
-Ben Speirs

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message