Date: Thu, 11 Jun 1998 13:23:09 +0000
From: "Greg Rowe" <greg@uswest.net>
To: Travis Mikalson <bofh@terranova.net>
Cc: freebsd-stable@FreeBSD.ORG
Subject: Re: Anyone know what this SCSI error is about?
Message-ID: <9806111323.ZM22628@psv.oss.uswest.net>
In-Reply-To: Travis Mikalson <bofh@terranova.net> "Re: Anyone know what this SCSI error is about?" (Jun 10, 12:41pm)
References: <00aa01bd93ec$a118bd80$02dd71d1@fargo.os.com> <9806101333.ZM16262@psv.oss.uswest.net> <357EB726.2F3F@terranova.net>

We have about 300 or so C's and D's in production with no problems at
various release levels. We still have a couple E's in boxes with light
loads and they run OK except for the occasional errors. We found that
high user load or a couple bonnie or iozone runs can cause the systems
to crash. Also, performance numbers under bonnie are terrible on the E
cards. The E cards will also fail on our CAM test systems. All other
Adaptec problems at the later FreeBSD releases can be traced to the
usual drive/cable/termination problems.

Greg

> Hmm how about this one (this machine is my playbox running -CURRENT so
> I'm guessing the rev output is a bit different than 2.2.x):
> ahc0: <Adaptec 2940 Ultra SCSI host adapter> rev 0x01 int a irq 19 on pci0.11.0
> ahc0: aic7880 Wide Channel, SCSI Id=7, 16/255 SCBs
>
> That gives me those problems ONLY when I stress the drive to the
> absolute max with dd and silly benchmarks. Looks like rev 0x01 means
> I should exchange the card for a D. No mystery there.
>
> I've never ever seen the aborts and timeouts in the 155 days this one
> (also a -CURRENT box from.. well.. 156 days ago) has been up which is a
> lot more than I can say for the next box :(
> ahc0: <Adaptec 2940 Ultra SCSI host adapter> rev 0x00 int a irq 10 on pci0.10.0
> ahc0: aic7880 Wide Channel, SCSI Id=7, 16/255 SCBs
>
>
> this is the last one with a 2940UW in it where there have been so many
> kernel messages (most of them from the SCSI timeouts, aborts, retries)
> that I can no longer see the beginning of the dmesg...
> I see it's a Rev C, not an E, but I still have the problems with this
> one that have been described earlier:
> sd1(ahc0:2:0): SCB 0x6 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
> SEQADDR = 0x5 SCSISEQ = 0x12 SSTAT0 = 0x5 SSTAT1 = 0xa
> Ordered Tag queued
> sd1(ahc0:2:0): SCB 0x6 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
> SEQADDR = 0x4 SCSISEQ = 0x12 SSTAT0 = 0x5 SSTAT1 = 0xa
> sd1(ahc0:2:0): Queueing an Abort SCB
> sd1(ahc0:2:0): Abort Message Sent
> sd1(ahc0:2:0): SCB 6 - Abort Tag Completed.
> sd1(ahc0:2:0): no longer in timeout
> Ordered Tag sent
>
> This one's been up for 85 days and is running 2.2.6-BETA from mid-March
> but has had this problem since upgrading from 2.1.5 to 2.2.[1 I think]
> Same symptoms with three different cables and without the AHC_ options.
>
> All of these machines have all the AHC_ options enabled right now and no
> CAM.

--
Greg Rowe <greg@uswest.net>
US WEST - !NTERACT Internet Services

"To err is human, to really foul up requires the root password."

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message