Date:      Tue, 14 Nov 2006 13:18:47 -0800
From:      Mark Dotson <mark@dmglobal.net>
To:        Atanas <atanas@asd.aplus.net>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: twa: Passthru request timed out! Resetting controller...
Message-ID:  <455A32B7.9080304@dmglobal.net>
In-Reply-To: <455A1DEA.20304@asd.aplus.net>
References:  <455A1DEA.20304@asd.aplus.net>

I've had ongoing problems with 3ware SATA cards on Tyan boards.  
Specifically, I have a "Tyan S5360-1U" and two 3ware cards: a 
9500S-4LP and an 8506-series card.

In my case the first error is different, but the repeated 'resetting' 
is VERY familiar.  A simple file copy from one part of a container to 
another could trigger it, degrading the unit and starting the reset 
loop.  Note that the drives are fine; I tested that first thing.

Sep  8 11:59:23 localhost kernel: 3w-9xxx: scsi0: WARNING: 
(0x06:0x002C): Unit #1: Command (0x2a) timed out, resetting card.
Sep  8 11:59:41 localhost kernel: 3w-9xxx: scsi0: AEN: INFO (0x04:0x005E):
Cache synchronized after power fail:unit=0.
Sep  8 11:59:41 localhost kernel: 3w-9xxx: scsi0: AEN: INFO (0x04:0x005E):
Cache synchronized after power fail:unit=1.

I also found this problem to exist across platforms, not just FreeBSD. 
For example, the excerpt above is from a CentOS box.
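
To tally how often each event shows up in a log like the excerpt above, 
something along these lines may help.  This is only a rough sketch in 
Python: the function name count_event_codes and the sample list are made 
up for illustration, and it assumes nothing more than that each event 
carries a "(0xNN:0xNNNN)" pair as in the excerpt.  It does not try to 
interpret what the two fields mean.

```python
import re
from collections import Counter

# Matches the "(0xNN:0xNNNN)" pair printed in the log excerpts in this
# thread. Optional whitespace after the colon covers the FreeBSD twa
# style "(0x05: 0x2018)" as well as the Linux style "(0x06:0x002C)".
CODE_RE = re.compile(r"\((0x[0-9A-Fa-f]+):\s*(0x[0-9A-Fa-f]+)\)")


def count_event_codes(lines):
    """Count each (first_field, second_field) pair seen in the log lines.

    Hypothetical helper for illustration; both fields are lowercased so
    that 0x002C and 0x002c are counted together.
    """
    counts = Counter()
    for line in lines:
        for first, second in CODE_RE.findall(line):
            counts[(first.lower(), second.lower())] += 1
    return counts


# Sample input copied from the excerpt above, line wrapping included.
sample = [
    "Sep  8 11:59:23 localhost kernel: 3w-9xxx: scsi0: WARNING: ",
    "(0x06:0x002C): Unit #1: Command (0x2a) timed out, resetting card.",
    "Sep  8 11:59:41 localhost kernel: 3w-9xxx: scsi0: AEN: INFO (0x04:0x005E):",
    "Cache synchronized after power fail:unit=0.",
    "Sep  8 11:59:41 localhost kernel: 3w-9xxx: scsi0: AEN: INFO (0x04:0x005E):",
    "Cache synchronized after power fail:unit=1.",
]

for (first, second), n in count_event_codes(sample).most_common():
    print(first, second, n)
```

Because the pattern is matched per line, it does not matter that the mail 
client wrapped the log entries.  A bare "(0x2a)" is ignored since it has 
no second field.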

All tests were done with the newest firmware for both the card and the 
motherboard, and with the newest drivers provided by 3ware.

Once I removed the card and drives from the Tyan system and stuck them 
in pretty much ANY other system, they worked fantastically.

I don't have an answer for the "resetting problem" yet.  3ware and 
Tyan (and my system vendor, Appro) are still trying to find my 
specific problem and solve it.  I believe they are currently using the 
"replace everything" method of troubleshooting.

-Mark

Atanas wrote:
> Has anyone experienced this:
> 
> twa0: ERROR: (0x05: 0x2018): Passthru request timed out!: request = 
> 0xca839d20
> twa0: INFO: (0x16: 0x1108): Resetting controller...:
> twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0
> ...
> twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=7
> twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1
> twa0: INFO: (0x16: 0x1107): Controller reset done!:
> 
> This happens on 6.2-PRERELEASE i386 (and on 6.1 since its release) on a 
> number of machines with the following hardware configuration:
> 
> - Tyan K8SE 2892, 2 AMD Opteron 270 CPUs, 4GB RAM
> - 3ware 9550SX-8LP, 8 500GB Seagate ST3500641AS SATA drives
>   (configured as 8 SINGLE DISK units, aka JBOD)
> 
> All hardware components, including the server chassis, are listed in the 
> 3ware hardware compatibility lists. It doesn't seem to be a cabling or 
> power issue. The controller and hard drives are already flashed to the 
> latest firmware revisions. I tried turning off NCQ, but it didn't make 
> any difference. I tried also switching the kernel from PAE to non-PAE 
> (reducing the usable memory to 3GB), but it didn't help either.
> 
> I have other machines with similar I/O configurations (3ware), but 
> with Intel motherboards and running FreeBSD-5.5, and these have been 
> running fine for about a year. Now I'm thinking about swapping the 
> drives between a working Intel box and an AMD-based box, to see which 
> machine the controller timeouts follow.
> 
> The problem happens sporadically, about once a month, and is very hard 
> to reproduce. Sometimes it takes several weeks until the next crash 
> happens, sometimes it crashes again in just a few hours.
> 
> When it happens, the kernel sometimes panics (most likely due to 
> the inconsistent filesystem state caused by the controller reset) and 
> sometimes just hangs. It can be interrupted (I have a serial console), 
> but the only usable thing after that seems to be "call cpu_reset()", 
> followed by a full (and sometimes painfully long) filesystem check.
> 
> Here are the diffs against the default GENERIC and PAE kernel 
> configurations:
> 
> < cpu       I486_CPU
> < ident     GENERIC
> < options   INET6               # IPv6 communications protocols
> < options   SCSI_DELAY=5000     # Delay (in ms) before probing SCSI
> 
>  > options   QUOTA
>  > options   SMP                 # Symmetric MultiProcessor Kernel
>  > options   BREAK_TO_DEBUGGER
>  > options   DDB
>  > options   KDB
>  > options   KDB_UNATTENDED
> 
>  > options   IPFIREWALL
>  > options   DUMMYNET
> 
> I'm attaching the dmesg.boot following the latest crash.
> 
> Regards,
> Atanas
> 
> 
> ------------------------------------------------------------------------
> 
> Copyright (c) 1992-2006 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> 	The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 6.2-PRERELEASE #0: Mon Nov 13 17:47:40 PST 2006
>     root@xyz:/var/obj/usr/src/sys/XYZ-PAE
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Dual Core AMD Opteron(tm) Processor 270 (2009.27-MHz 686-class CPU)
>   Origin = "AuthenticAMD"  Id = 0x20f12  Stepping = 2
>   Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>   Features2=0x1<SSE3>
>   AMD Features=0xe2500800<SYSCALL,NX,MMX+,FFXSR,LM,3DNow+,3DNow>
>   AMD Features2=0x3<LAHF,CMP>
>   Cores per package: 2
> real memory  = 5368709120 (5120 MB)
> avail memory = 4182241280 (3988 MB)
> ACPI APIC Table: <PTLTD  	 APIC  >
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>  cpu0 (BSP): APIC ID:  0
>  cpu1 (AP): APIC ID:  1
>  cpu2 (AP): APIC ID:  2
>  cpu3 (AP): APIC ID:  3
> ioapic0 <Version 1.1> irqs 0-23 on motherboard
> ioapic1 <Version 1.1> irqs 24-27 on motherboard
> ioapic2 <Version 1.1> irqs 28-31 on motherboard
> kbd1 at kbdmux0
> acpi0: <PTLTD   RSDT> on motherboard
> acpi0: Power Button (fixed)
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x8008-0x800b on acpi0
> cpu0: <ACPI CPU> on acpi0
> cpu1: <ACPI CPU> on acpi0
> cpu2: <ACPI CPU> on acpi0
> cpu3: <ACPI CPU> on acpi0
> acpi_button0: <Power Button> on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> pci0: <memory> at device 0.0 (no driver attached)
> isab0: <PCI-ISA bridge> at device 1.0 on pci0
> isa0: <ISA bus> on isab0
> pci0: <serial bus, SMBus> at device 1.1 (no driver attached)
> pci0: <serial bus, USB> at device 2.0 (no driver attached)
> atapci0: <nVidia nForce CK804 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1400-0x140f at device 6.0 on pci0
> ata0: <ATA channel 0> on atapci0
> ata1: <ATA channel 1> on atapci0
> pcib1: <ACPI PCI-PCI bridge> at device 9.0 on pci0
> pci1: <ACPI PCI bus> on pcib1
> pci1: <display, VGA> at device 6.0 (no driver attached)
> fxp0: <Intel 82551 Pro/100 Ethernet> port 0x2400-0x243f mem 0xda101000-0xda101fff,0xda120000-0xda13ffff irq 16 at device 8.0 on pci1
> miibus0: <MII bus> on fxp0
> inphy0: <i82555 10/100 media interface> on miibus0
> inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> fxp0: Ethernet address: 00:e0:81:33:b5:f1
> pcib2: <ACPI PCI-PCI bridge> at device 13.0 on pci0
> pci2: <ACPI PCI bus> on pcib2
> pcib3: <ACPI PCI-PCI bridge> at device 14.0 on pci0
> pci3: <ACPI PCI bus> on pcib3
> pcib4: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci24: <ACPI PCI bus> on pcib4
> pcib5: <ACPI PCI-PCI bridge> at device 10.0 on pci24
> pci25: <ACPI PCI bus> on pcib5
> 3ware device driver for 9000 series storage controllers, version: 3.60.02.012
> twa0: <3ware 9000 series Storage Controller> port 0x3000-0x303f mem 0xde000000-0xdfffffff,0xdc300000-0xdc300fff irq 27 at device 3.0 on pci25
> twa0: [FAST]
> twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SX-8LP, 8 ports, Firmware FE9X 3.04.01.011, BIOS BE9X 3.04.00.002
> pci24: <base peripheral, interrupt controller> at device 10.1 (no driver attached)
> pcib6: <ACPI PCI-PCI bridge> at device 11.0 on pci24
> pci26: <ACPI PCI bus> on pcib6
> bge0: <Broadcom BCM5704 A3, ASIC rev. 0x2003> mem 0xdc410000-0xdc41ffff,0xdc400000-0xdc40ffff irq 28 at device 9.0 on pci26
> miibus1: <MII bus> on bge0
> brgphy0: <BCM5704 10/100/1000baseTX PHY> on miibus1
> brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
> bge0: Ethernet address: 00:e0:81:33:b6:f4
> bge1: <Broadcom BCM5704 A3, ASIC rev. 0x2003> mem 0xdc430000-0xdc43ffff,0xdc420000-0xdc42ffff irq 29 at device 9.1 on pci26
> miibus2: <MII bus> on bge1
> brgphy1: <BCM5704 10/100/1000baseTX PHY> on miibus2
> brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
> bge1: Ethernet address: 00:e0:81:33:b6:f5
> pci24: <base peripheral, interrupt controller> at device 11.1 (no driver attached)
> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> sio0: type 16550A, console
> fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
> fdc0: [FAST]
> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
> pmtimer0 on isa0
> orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc97ff on isa0
> ppc0: parallel port not found.
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> sio1: configured irq 3 not in bitmap of probed irqs 0
> sio1: port may not be enabled
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> Timecounters tick every 1.000 msec
> ipfw2 (+ipv6) initialized, divert loadable, rule-based forwarding disabled, default to deny, logging disabled
> da0 at twa0 bus 0 target 0 lun 0
> da0: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device 
> da0: 100.000MB/s transfers
> da0: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da1 at twa0 bus 0 target 1 lun 0
> da1: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device 
> da1: 100.000MB/s transfers
> da1: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da2 at twa0 bus 0 target 2 lun 0
> da2: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device 
> da2: 100.000MB/s transfers
> da2: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da3 at twa0 bus 0 target 3 lun 0
> da3: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device 
> da3: 100.000MB/s transfers
> da3: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da4 at twa0 bus 0 target 4 lun 0
> da4: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device 
> da4: 100.000MB/s transfers
> da4: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da5 at twa0 bus 0 target 5 lun 0
> da5: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device 
> da5: 100.000MB/s transfers
> da5: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da6 at twa0 bus 0 target 6 lun 0
> da6: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device 
> da6: 100.000MB/s transfers
> da6: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da7 at twa0 bus 0 target 7 lun 0
> da7: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device 
> da7: 100.000MB/s transfers
> da7: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> SMP: AP CPU #1 Launched!
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #3 Launched!
> Trying to mount root from ufs:/dev/da0s1a
> WARNING: / was not properly dismounted
> /: mount pending error: blocks 208 files 5
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
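
As a sanity check on the quoted dmesg, the sizes it reports are 
consistent with each other.  A small Python sketch follows; the helper 
name bytes_to_mib is mine, and it assumes the MB figures are truncated 
whole units of 1024*1024 bytes, which matches the numbers shown:

```python
MIB = 1024 * 1024      # bytes per "MB" as the dmesg figures appear to use it
SECTOR_BYTES = 512     # sector size stated on the daN lines


def bytes_to_mib(n_bytes):
    """Convert a byte count to whole MiB, truncating any remainder."""
    return n_bytes // MIB


# Per-disk capacity: 976541696 sectors of 512 bytes each.
print(bytes_to_mib(976541696 * SECTOR_BYTES))  # 476827, as on the da0 line

# Memory lines from the same dmesg.
print(bytes_to_mib(5368709120))  # 5120, the "real memory" figure
print(bytes_to_mib(4182241280))  # 3988, the "avail memory" figure
```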


