From owner-freebsd-current Sat Dec 18 16:56:59 1999
Delivered-To: freebsd-current@freebsd.org
Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1])
	by hub.freebsd.org (Postfix) with ESMTP id EDE3B14CB7
	for ; Sat, 18 Dec 1999 16:56:47 -0800 (PST)
	(envelope-from gallatin@cs.duke.edu)
Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30])
	by duke.cs.duke.edu (8.9.1/8.9.1) with ESMTP id TAA22644;
	Sat, 18 Dec 1999 19:56:46 -0500 (EST)
Received: (from gallatin@localhost)
	by grasshopper.cs.duke.edu (8.9.3/8.9.1) id TAA10284;
	Sat, 18 Dec 1999 19:56:16 -0500 (EST)
	(envelope-from gallatin@cs.duke.edu)
From: Andrew Gallatin
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
Date: Sat, 18 Dec 1999 19:56:15 -0500 (EST)
To: Soren Schmidt
Cc: freebsd-current@freebsd.org
Subject: ATA: more Promise Ultra wedges
X-Mailer: VM 6.43 under 20.4 "Emerald" XEmacs Lucid
Message-ID: <14428.10180.553556.127603@grasshopper.cs.duke.edu>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Søren,

It looks like I spoke too soon when I said the world was safe for
Promise Ultra users:

ad3: ad_timeout: lost disk contact - resetting
ata4: resetting devices .. ad3: HARD WRITE ERROR blk# 6594768ad3: DMA problem encountered, fallback to PIO mode
ad3: DMA problem encountered, fallback to PIO mode
done
ad1: UDMA CRC READ ERROR blk# 10522095 retrying
ad3: ad_timeout: lost disk contact - resetting
ata4: resetting devices .. done

At this point the machine is unpingable & will not respond to a break
on the console.

This is with a ccd stripe set, striped across 4 Maxtor "Diamondmax"
drives attached one per channel to 2 Promise Ultra cards. The kernel
sources are dated slightly before the build time in the boot messages
below. (I'd have given you verbose messages, but this is a transcript
from the serial console logs & the machine is wedged solid right now.)

I'm running with a timeout of 30 seconds, as I was hoping to avoid a
'lost contact - resetting' situation; all hell breaks loose when those
appear.

BTW, I'd really like a tunable or some way to prevent a permanent
fallback to PIO. I'm more than willing to tolerate one hard error per
week or so on a disk which sees 10s of gigabytes of data read & written
between errors.

The driver was much more stable back in July when (I guess) you just
ignored errors. Using a July kernel, this machine will stay up for
months with nothing but the occasional:

ad3: status=51 error=84
ad_interrupt: hard error

It never loses contact, never wedges. Oh for the good old days..

Cheers,

Drew

------------------------------------------------------------------------------
Andrew Gallatin, Sr Systems Programmer  http://www.cs.duke.edu/~gallatin
Duke University                         Email: gallatin@cs.duke.edu
Department of Computer Science          Phone: (919) 660-6590

Copyright (c) 1992-1999 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 4.0-CURRENT #2: Wed Dec 15 20:59:01 EST 1999
    gallatin@grasshopper.cs.duke.edu:/a/muffin/export/ari_scratch2/gallatin/src/sys/compile/SLICEX86
Timecounter "i8254" frequency 1193182 Hz
Timecounter "TSC" frequency 451024869 Hz
CPU: Pentium II/Xeon/Celeron (451.02-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x653  Stepping = 3
  Features=0x183fbff
real memory  = 536870912 (524288K bytes)
avail memory = 517353472 (505228K bytes)
Preloaded elf kernel "kernel" at 0xc0305000.
ccd0-3: Concatenated disk drivers
Pentium Pro MTRR support enabled
devclass_alloc_unit: pcib0 already exists, using next available unit number
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
pci0: on pcib0
pcib2: at device 1.0 on pci0
pci1: on pcib2
isab0: at device 7.0 on pci0
isa0: on isab0
ata-pci0: at device 7.1 on pci0
ata-pci0: Busmastering DMA supported
ata0 at 0x01f0 irq 14 on ata-pci0
pci0: Intel 82371AB/EB (PIIX4) USB controller (vendor=0x8086, dev=0x7112) at 7.2
intpm0: at device 7.3 on pci0
intpm0: I/O mapped 440
intpm0: intr IRQ 9 enabled revision 0
smbus0: on intsmb0
smb0: on smbus0
intpm0: PM I/O mapped 400
ata-pci1: irq 10 at device 15.0 on pci0
ata-pci1: Busmastering DMA supported
ata2 at 0xeff0 irq 10 on ata-pci1
ata3 at 0xefa8 irq 10 on ata-pci1
ata-pci2: irq 11 at device 18.0 on pci0
ata-pci2: Busmastering DMA supported
ata4 at 0xefa0 irq 11 on ata-pci2
ata5 at 0xef68 irq 11 on ata-pci2
fxp0: irq 10 at device 19.0 on pci0
fxp0: Ethernet address 00:a0:c9:e7:9d:f6
devclass_alloc_unit: pci1 already exists, using next available unit number
pcib1: on motherboard
pci2: on pcib1
fdc0: at port 0x3f0-0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
atkbdc0: at port 0x60-0x6f on isa0
atkbd0: irq 1 on atkbdc0
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
RTC BIOS diagnostic error 20
ad0: ATA-4 disk at ata0 as master
ad0: 4892MB (10018890 sectors), 10602 cyls, 15 heads, 63 S/T, 512 B/S
ad0: 16 secs/int, 1 depth queue, UDMA33
ad1: ATA-4 disk at ata2 as master
ad1: 16479MB (33750864 sectors), 33483 cyls, 16 heads, 63 S/T, 512 B/S
ad1: 16 secs/int, 1 depth queue, UDMA33
ad2: ATA-4 disk at ata3 as master
ad2: 16479MB (33750864 sectors), 33483 cyls, 16 heads, 63 S/T, 512 B/S
ad2: 16 secs/int, 1 depth queue, UDMA33
ad3: ATA-4 disk at ata4 as master
ad3: 16479MB (33750864 sectors), 33483 cyls, 16 heads, 63 S/T, 512 B/S
ad3: 16 secs/int, 1 depth queue, UDMA33
ad4: ATA-4 disk at ata5 as master
ad4: 16479MB (33750864 sectors), 33483 cyls, 16 heads, 63 S/T, 512 B/S
ad4: 16 secs/int, 1 depth queue, UDMA33
Mounting root from ufs:/dev/ad0s1a

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message