Date: Fri, 8 Oct 2004 12:55:36 -0400 (EDT)
From: Mikhail Teterin <mi@aldan.algebra.com>
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: kern/72451: Continuing problems with Silicon Image SATA controllers
Message-ID: <200410081650.i98Go6O5015987@harik.murex.com>
Resent-Message-ID: <200410081700.i98H0qRA049950@freefall.freebsd.org>
>Number:         72451
>Category:       kern
>Synopsis:       Continuing problems with Silicon Image SATA controllers
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Oct 08 17:00:51 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator:     Mikhail Teterin
>Release:        FreeBSD 5.3-BETA5 amd64
>Organization:
Virtual Estates, Inc.
>Environment:
System: FreeBSD pandora 5.3-BETA5 FreeBSD 5.3-BETA5 #4: Mon Sep 20 16:45:55 EDT 2004 mteterin@pandora:/backup/obj/usr/src/sys/DIOSCURI amd64

Relevant dmesg.boot entries:

	atapci0: <SiI 3114 SATA150 controller> port 0x9c00-0x9c0f,0xa000-0xa003,0xa400-0xa407,0xa800-0xa803,0xac00-0xac07 mem 0xff3ff400-0xff3ff7ff irq 17 at device 11.0 on pci3
	ad6: 190782MB <ST3200822AS/3.01> [387621/16/63] at ata3-master SATA150

Ident information from the running kernel:

	$FreeBSD: src/sys/dev/ata/ata-all.c,v 1.227 2004/09/16 09:35:01 sos Exp $
	$FreeBSD: src/sys/dev/ata/ata-queue.c,v 1.34 2004/08/27 14:48:32 sos Exp $
	$FreeBSD: src/sys/dev/ata/ata-lowlevel.c,v 1.47 2004/09/03 12:10:44 sos Exp $
	$FreeBSD: src/sys/dev/ata/ata-isa.c,v 1.22 2004/04/30 16:21:34 sos Exp $
	$FreeBSD: src/sys/dev/ata/ata-pci.c,v 1.88 2004/08/20 06:19:25 sos Exp $
	$FreeBSD: src/sys/dev/ata/ata-chipset.c,v 1.88 2004/09/10 10:31:37 sos Exp $
	$FreeBSD: src/sys/dev/ata/ata-dma.c,v 1.131 2004/09/10 10:31:37 sos Exp $
	$FreeBSD: src/sys/dev/ata/ata-disk.c,v 1.177 2004/09/01 12:15:44 sos Exp $
	$FreeBSD: src/sys/dev/ata/atapi-cd.c,v 1.171 2004/08/24 10:39:00 sos Exp $
	$FreeBSD: src/sys/dev/ata/atapi-fd.c,v 1.97 2004/08/05 21:11:33 sos Exp $
>Description:
Under _combined_ disk and CPU load, the following errors start popping up:

	ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=53404031
	ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=54910687
	ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=56806527
	ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=61715903
	ad6: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=62103999
	ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=176444927
	ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=311594591
	ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=196040671
	ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=306623743

After a while, all disk I/O starts hanging and even a graceful reboot
becomes impossible -- the machine hangs after saying:
"some processes would not die..."

We have already replaced the disk and the cables twice.

Under disk load alone, the problem does not appear -- the box survives
a full run of `iozone -a' without a hitch, for example.

But the errors do appear when we, for example, dump databases onto the
disk (over NFS) and, at the same time, gzip the dump for archiving. They
also appear when a big file is being uploaded with scp over a fast link
with ssh compression enabled.

So it looks like something inside the ata driver is not attended to
fast enough...
>How-To-Repeat:
Run `iozone -a' on a disk while gzip-ing a big file off of the same drive.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
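For anyone triaging similar logs, the following is a rough sketch of how the FAILURE lines quoted in the description could be decoded. It assumes the status and error values are printed in hexadecimal, which is consistent with the log itself (0x51 = READY | DSC | ERROR). The bit names that do not appear in the log are taken from the usual ATA register layout and are an assumption, not necessarily the exact strings the driver prints. The function names `decode_bits` and `parse_failure` are made up for this illustration.

```python
import re

# ATA status register bits. READY, DSC and ERROR match the names in the
# log; the remaining names are assumed from the standard ATA layout.
STATUS_BITS = {
    0x80: "BUSY", 0x40: "READY", 0x20: "DF", 0x10: "DSC",
    0x08: "DRQ", 0x04: "CORR", 0x02: "INDEX", 0x01: "ERROR",
}

# ATA error register bits. ABORTED matches the log; the others are assumed.
ERROR_BITS = {
    0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
    0x08: "MCR", 0x04: "ABORTED", 0x02: "NM", 0x01: "ILI",
}

# Matches lines such as:
# ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=53404031
FAILURE_RE = re.compile(
    r"(\w+): FAILURE - (\w+) "
    r"status=([0-9a-fA-F]+)<[^>]*> error=([0-9a-fA-F]+)<[^>]*> LBA=(\d+)"
)


def decode_bits(value, names):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(names.items(), reverse=True)
            if value & bit]


def parse_failure(line):
    """Parse one FAILURE line; return a dict, or None if it does not match."""
    m = FAILURE_RE.search(line)
    if m is None:
        return None
    device, command, status, error, lba = m.groups()
    return {
        "device": device,
        "command": command,
        "status_flags": decode_bits(int(status, 16), STATUS_BITS),
        "error_flags": decode_bits(int(error, 16), ERROR_BITS),
        "lba": int(lba),
    }


if __name__ == "__main__":
    sample = ("ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> "
              "error=4<ABORTED> LBA=53404031")
    print(parse_failure(sample))
```

TIMEOUT lines have a different shape and are deliberately left unparsed (the function returns None for them); a real triage script would likely want to count those separately.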