Date: Thu, 26 Jan 2006 16:50:41 -0600 From: Guy Helmer <ghelmer@palisadesys.com> To: Joshua Coombs <jcoombs@gwi.net> Cc: freebsd-stable@freebsd.org Subject: Re: Adaptec SCSI going nuts Message-ID: <43D95241.6010703@palisadesys.com> In-Reply-To: <drbehj$dl$1@sea.gmane.org> References: <drbehj$dl$1@sea.gmane.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Joshua Coombs wrote: > Just got a dual Opteron system to play with amd64 builds of FreeBSD. > Poking around to see how it behaived, I initially tossed FreeBSD 4.11 > on it, and was surprised to see a dump of the scsi controller state at > the end of the dmesg. Tried 6.0-Rel, both x86 and amd64, same > behavior. Bumped up to 6-stable from yesterday, same behavior. After > the dump, the system appears solid, no other errors, so I'm guessing > it's just a problem with how the card is initialized, FreeBSD corrects > it and moves on. > > Should I be more concerned, or consider it a quirk of the machine? Oh, you have your Adaptec controller hooked up to one of those Cheetah 10K.7 drives like the ones that are driving us a bit batty at work (except I'm using ST373207LC [73GB] drives). IIRC, we've been seeing this SCSI state dump at startup on some of our 5.4-based kernels but not always. Until recently, I thought the drives were working OK but now I'm not so sure. We're trying to track down an infrequent lockup of the SCSI system with Adaptec AIC7902 Ultra320 controller built-into the motherboard and a single Seagate ST373207LC drive while running FreeBSD 5.4 on a dual-Xeon system. We've been running bonnie++ to stress the drive, and we're having a hard time pinning the failure down to a particular component (controller or disk) because it's taken bonnie++ running continuously as many as 6 days straight for the failure to occur. We've tried updating the firmware on the Seagate drive to ST373207LC from version 3 to 4 and turning off Packetized transfers in the Adaptec BIOS but we still seem to be able to make it fail after a while. The next thing I'm tempted to try changing is the SCSI cable if we still don't have a stable system. So I'd suggest that you run something like bonnie++ that can stress the disk & controller for a while, and see whether or not you encounter any problems... Good luck! Guy Helmer > > Joshua Coombs > > Copyright (c) 1992-2005 The FreeBSD Project. > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 > The Regents of the University of California. All rights reserved. > FreeBSD 6.0-RELEASE-p4 #0: Thu Jan 26 15:55:58 EST 2006 > root@testbed.gwi.net:/usr/obj/usr/src/sys/SMP > Timecounter "i8254" frequency 1193182 Hz quality 0 > CPU: AMD Opteron(tm) Processor 246 (1994.67-MHz K8-class CPU) > Origin = "AuthenticAMD" Id = 0xf5a Stepping = 10 > Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2> > > AMD Features=0xe0500800<SYSCALL,NX,MMX+,LM,3DNow+,3DNow> > real memory = 1073676288 (1023 MB) > avail memory = 1024929792 (977 MB) > ACPI APIC Table: <A M I OEMAPIC > > FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs > cpu0 (BSP): APIC ID: 0 > cpu1 (AP): APIC ID: 1 > MADT: Forcing active-low polarity and level trigger for SCI > ioapic0 <Version 1.1> irqs 0-23 on motherboard > ioapic1 <Version 1.1> irqs 24-27 on motherboard > ioapic2 <Version 1.1> irqs 28-31 on motherboard > acpi0: <A M I OEMXSDT> on motherboard > acpi0: Power Button (fixed) > pci_link0: <ACPI PCI Link LNKA> irq 9 on acpi0 > pci_link1: <ACPI PCI Link LNKB> irq 10 on acpi0 > pci_link2: <ACPI PCI Link LNKC> irq 11 on acpi0 > pci_link3: <ACPI PCI Link LNKD> irq 15 on acpi0 > Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 > acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0 > cpu0: <ACPI CPU> on acpi0 > acpi_throttle0: <ACPI CPU Throttling> on cpu0 > cpu1: <ACPI CPU> on acpi0 > pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0 > pci0: <ACPI PCI bus> on pcib0 > pcib1: <ACPI PCI-PCI bridge> at device 6.0 on pci0 > pci3: <ACPI PCI bus> on pcib1 > pci3: <display, VGA> at device 6.0 (no driver attached) > fxp0: <Intel 82551 Pro/100 Ethernet> port 0xbc00-0xbc3f mem > 0xfeafb000-0xfeafbfff,0xfeaa0000-0xfeabffff irq 18 at device 8.0 on pci3 > miibus0: <MII bus> on fxp0 > inphy0: <i82555 10/100 media interface> on miibus0 > inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto > fxp0: Ethernet address: 00:e0:81:34:64:da > isab0: <PCI-ISA bridge> at device 7.0 on pci0 > isa0: <ISA bus> on isab0 > atapci0: <AMD 8111 UDMA133 controller> port > 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 7.1 on pci0 > ata0: <ATA channel 0> on atapci0 > ata1: <ATA channel 1> on atapci0 > pci0: <serial bus, SMBus> at device 7.2 (no driver attached) > pci0: <bridge> at device 7.3 (no driver attached) > pcib2: <ACPI PCI-PCI bridge> at device 10.0 on pci0 > pci2: <ACPI PCI bus> on pcib2 > ahd0: <Adaptec AIC7902 Ultra320 SCSI adapter> port > 0x8000-0x80ff,0x7800-0x78ff mem 0xfc8fc000-0xfc8fdfff irq 24 at device > 6.0 on pci2 > ahd0: [GIANT-LOCKED] > aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs > ahd1: <Adaptec AIC7902 Ultra320 SCSI adapter> port > 0x8800-0x88ff,0x8400-0x84ff mem 0xfc8fe000-0xfc8fffff irq 25 at device > 6.1 on pci2 > ahd1: [GIANT-LOCKED] > aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs > bge0: <Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev. 0x2003> mem > 0xfc8b0000-0xfc8bffff,0xfc8a0000-0xfc8affff irq 24 at device 9.0 on pci2 > miibus1: <MII bus> on bge0 > brgphy0: <BCM5704 10/100/1000baseTX PHY> on miibus1 > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, > 1000baseTX-FDX, auto > bge0: Ethernet address: 00:e0:81:34:65:28 > bge1: <Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev. 0x2003> mem > 0xfc8e0000-0xfc8effff,0xfc8d0000-0xfc8dffff irq 25 at device 9.1 on pci2 > miibus2: <MII bus> on bge1 > brgphy1: <BCM5704 10/100/1000baseTX PHY> on miibus2 > brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, > 1000baseTX-FDX, auto > bge1: Ethernet address: 00:e0:81:34:65:29 > pci0: <base peripheral, interrupt controller> at device 10.1 (no > driver attached) > pcib3: <ACPI PCI-PCI bridge> at device 11.0 on pci0 > pci1: <ACPI PCI bus> on pcib3 > pci0: <base peripheral, interrupt controller> at device 11.1 (no > driver attached) > acpi_button0: <Power Button> on acpi0 > atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0 > atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0 > kbd0 at atkbd0 > atkbd0: [GIANT-LOCKED] > sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 > on acpi0 > sio0: type 16550A > fdc0: <floppy drive controller (FDE)> port 0x3f0-0x3f5,0x3f7 irq 6 drq > 2 on acpi0 > fdc0: [FAST] > fd0: <1440-KB 3.5" drive> on fdc0 drive 0 > orm0: <ISA Option ROM> at iomem 0xc0000-0xc7fff on isa0 > ppc0: cannot reserve I/O port range > sc0: <System console> at flags 0x100 on isa0 > sc0: VGA <16 virtual consoles, flags=0x300> > sio1: configured irq 3 not in bitmap of probed irqs 0 > sio1: port may not be enabled > vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 > Timecounters tick every 1.000 msec > Waiting 5 seconds for SCSI devices to settle > acd0: CDROM <GCR-8525B/1.02> at ata0-slave PIO4 > ahd0: Invalid Sequencer interrupt occurred. >>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<< > ahd0: Dumping Card State at program address 0x23c Mode 0x0 > Card was paused > INTSTAT[0x0] SELOID[0x1] SELID[0x0] HS_MAILBOX[0x0] > INTCTL[0x80]:(SWTMINTMASK) SEQINTSTAT[0x0] SAVED_MODE[0x11] > DFFSTAT[0x33]:(CURRFIFO_NONE|FIFO0FREE|FIFO1FREE) > SCSISIGI[0x0]:(P_DATAOUT) SCSIPHASE[0x0] SCSIBUS[0x0] > LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE) SCSISEQ0[0x0] > SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI) SEQCTL0[0x0] > SEQINTCTL[0x6]:(INTMASK1|INTMASK2) > SEQ_FLAGS[0x0] SEQ_FLAGS2[0x0] QFREEZE_COUNT[0x2] > KERNEL_QFREEZE_COUNT[0x2] MK_MESSAGE_SCB[0xff00] MK_MESSAGE_SCSIID[0xff] > SSTAT0[0x0] SSTAT1[0x0] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0] > SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) LQISTAT0[0x0] > LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0] LQOSTAT1[0x0] > LQOSTAT2[0x0] > > SCB Count = 16 CMDS_PENDING = 0 LASTSCB 0xffff CURRSCB 0xe NEXTSCB 0xff40 > qinstart = 28 qinfifonext = 28 > QINFIFO: > WAITING_TID_QUEUES: > Pending list: > Total 0 > Kernel Free SCB list: 15 14 1 2 3 4 5 6 7 8 9 10 11 12 13 0 > Sequencer Complete DMA-inprog list: > Sequencer Complete list: > Sequencer DMA-Up and Complete list: > Sequencer On QFreeze and Complete list: > > > ahd0: FIFO0 Free, LONGJMP == 0x8000, SCB 0xf > SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS) > > SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) > SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] > SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 > HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) > > ahd0: FIFO1 Free, LONGJMP == 0x8063, SCB 0xe > SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS) > > SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL) > SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0] > SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0 > HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL) > LQIN: 0x8 0x0 0x0 0xf 0x0 0x1 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 > 0x0 0x0 0x0 0x0 > ahd0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42 > ahd0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x1 > ahd0: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0 > > SIMODE0[0xc]:(ENOVERRUN|ENIOERR) > CCSCBCTL[0x4]:(CCSCBDIR) > ahd0: REG0 == 0x8f60, SINDEX = 0x10e, DINDEX = 0x104 > ahd0: SCBPTR == 0xf, SCB_NEXT == 0xff40, SCB_NEXT2 == 0xe > CDB 12 20 0 80 88 e6 > STACK: 0x237 0x2 0x0 0x0 0x0 0x0 0x0 0x0 > <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>> > Copied 18 bytes of sense data offset 12: 0x70 0x0 0x6 0x0 0x0 0x0 0x0 > 0xa 0x0 0x0 0x0 0x0 0x29 0x2 0x2 0x0 0x0 0x0 > Copied 18 bytes of sense data offset 12: 0x70 0x0 0x6 0x0 0x0 0x0 0x0 > 0xa 0x0 0x0 0x0 0x0 0x29 0x2 0x2 0x0 0x0 0x0 > SMP: AP CPU #1 Launched! > da0 at ahd0 bus 0 target 0 lun 0 > da0: <SEAGATE ST336807LC 0C01> Fixed Direct Access SCSI-3 device > da0: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged > Queueing Enabled > da0: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C) > da1 at ahd0 bus 0 target 1 lun 0 > da1: <SEAGATE ST336807LC 0C01> Fixed Direct Access SCSI-3 device > da1: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged > Queueing Enabled > da1: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C) > Trying to mount root from ufs:/dev/da0s1a > > > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" -- Guy Helmer, Ph.D. Principal System Architect Palisade Systems, Inc.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?43D95241.6010703>