From owner-freebsd-bugs@FreeBSD.ORG Mon Feb 28 22:50:08 2005
Return-Path:
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 27C0116A4CE
	for ; Mon, 28 Feb 2005 22:50:08 +0000 (GMT)
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
	by mx1.FreeBSD.org (Postfix) with ESMTP id BA31E43D1D
	for ; Mon, 28 Feb 2005 22:50:07 +0000 (GMT)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.13.1/8.13.1) with ESMTP id j1SMo70p004138
	for ; Mon, 28 Feb 2005 22:50:07 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.13.1/8.13.1/Submit) id j1SMo7ca004137;
	Mon, 28 Feb 2005 22:50:07 GMT (envelope-from gnats)
Resent-Date: Mon, 28 Feb 2005 22:50:07 GMT
Resent-Message-Id: <200502282250.j1SMo7ca004137@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Jason Hitt
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id E26CD16A4CE
	for ; Mon, 28 Feb 2005 22:45:18 +0000 (GMT)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 92E0443D39
	for ; Mon, 28 Feb 2005 22:45:18 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id j1SMjInk073950
	for ; Mon, 28 Feb 2005 22:45:18 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id j1SMjIJB073949;
	Mon, 28 Feb 2005 22:45:18 GMT (envelope-from nobody)
Message-Id: <200502282245.j1SMjIJB073949@www.freebsd.org>
Date: Mon, 28 Feb 2005 22:45:18 GMT
From: Jason Hitt
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-2.3
Subject: kern/78216: WRITE_DMA UDMA ICRC errors while copying data to a disk
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 28 Feb 2005 22:50:08 -0000

>Number:         78216
>Category:       kern
>Synopsis:       WRITE_DMA UDMA ICRC errors while copying data to a disk
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Feb 28 22:50:07 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator:     Jason Hitt
>Release:        FreeBSD 5.3-STABLE i386
>Organization:
>Environment:
FreeBSD calandor 5.3-STABLE FreeBSD 5.3-STABLE #0: Sun Feb 13 22:01:06 CST 2005 root@calandor:/usr/obj/usr/src/sys/FILESERVER_5 i386
>Description:
My system was configured with 4.10 using vinum with a simple mirroring
setup. I upgraded to 5.3 and attempted to convert to gmirror. I removed
/dev/ad2 from my vinum volume and created a gmirror volume on it instead
(on /dev/ad2s1). I then successfully copied all my data from the mounts
residing on /dev/ad0 to the mounts residing on /dev/ad2 without a single
error. I rebooted using /dev/ad2 and reset /dev/ad0. Upon adding
/dev/ad0s1 to the gmirror volume, I immediately began receiving errors
of the form:

WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=########

I then removed /dev/ad0s1 from the gmirror volume, attempted to create a
new volume on it, and simply copy data from the volume on /dev/ad2s1 to
this new volume. Again, the exact same results occurred. I disabled dma
via hw.ata.ata_dma in /boot/loader.conf, and everything immediately
began working without error.
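
For reference, the setting described above would look roughly like the fragment below. This is a minimal sketch, assuming the tunable takes 0 to force PIO transfers and 1 to allow DMA, as documented in ata(4); the report does not quote the exact line that was used, and the comments are illustrative.

```
# /boot/loader.conf
# Workaround: force PIO transfers on ATA disks (1 re-enables DMA).
hw.ata.ata_dma="0"
```

The tunable is read at boot, so the change takes effect after a restart. The ad0/ad2 probe lines reporting PIO4 in the startup messages are consistent with this setting being active.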

Two interesting points about the above process:

1) I had no problems whatsoever copying nearly 100 gigs of data from
   /dev/ad0 to /dev/ad2 (repeatedly; I re-did the copy three times
   before I decided my new setup met my desires).

2) After attempting to add /dev/ad0 to the gmirror volume and seeing
   errors, I rebooted my PC to use a hard disk diagnostic tool. When the
   machine rebooted, the BIOS reported the first drive in CHS mode, not
   LBA mode. Zeroing out the drive and re-fdisking corrected this.
   Attempting to copy data to the drive caused the problem to recur
   (with the associated WRITE_DMA errors popping up as well).

The only customizations I have made to the config file were to disable
drivers I do not use (various network cards, some drive controllers;
basically just hardware I will never own).

I have two hard disks, each on its own 80-conductor IDE cable.

Below is my startup dump.

FreeBSD 5.3-STABLE #0: Sun Feb 13 22:01:06 CST 2005
    root@calandor:/usr/obj/usr/src/sys/FILESERVER_5
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Duron(tm) processor (798.64-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x631  Stepping = 1
  Features=0x183f9ff
  AMD Features=0xc0440000
real memory  = 536805376 (511 MB)
avail memory = 515620864 (491 MB)
npx0: [FAST]
npx0: on motherboard
npx0: INT 16 interface
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
cpu0: on acpi0
acpi_tz0: on acpi0
acpi_button0: on acpi0
acpi_button1: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
agp0: mem 0xe0000000-0xe1ffffff at device 0.0 on pci0
pcib1: at device 1.0 on pci0
pci1: on pcib1
pci1: at device 0.0 (no driver attached)
uhci0: port 0xd000-0xd01f irq 11 at device 16.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0xd400-0xd41f irq 3 at device 16.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0xd800-0xd81f irq 10 at device 16.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: on uhci2
usb2: USB revision 1.0
uhub2: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
pci0: at device 16.3 (no driver attached)
isab0: at device 17.0 on pci0
isa0: on isab0
atapci0: port 0xdc00-0xdc0f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 17.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
pci0: at device 17.5 (no driver attached)
vr0: port 0xe800-0xe8ff mem 0xe8001000-0xe80010ff irq 11 at device 18.0 on pci0
miibus0: on vr0
ukphy0: on miibus0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
vr0: Ethernet address: 00:0d:87:b0:00:55
fdc0: port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: [FAST]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
ppc0: port 0x778-0x77b,0x378-0x37f irq 7 drq 3 on acpi0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
ppbus0: on ppc0
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
atkbdc0: port 0x64,0x60 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model Generic PS/2 mouse, device ID 0
orm0: at iomem 0xc0000-0xc9fff on isa0
pmtimer0 on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 798642802 Hz quality 800
Timecounters tick every 10.000 msec
ad0: 114473MB [232581/16/63] at ata0-master PIO4
ad2: 114473MB [232581/16/63] at ata1-master PIO4
GEOM_MIRROR: Device m0s1 created (id=1279703646).
GEOM_MIRROR: Device m0s1: provider ad0s1 detected.
GEOM_MIRROR: Device m0s1: provider ad2s1 detected.
GEOM_MIRROR: Device m0s1: provider ad2s1 activated.
GEOM_MIRROR: Device m0s1: provider ad0s1 activated.
GEOM_MIRROR: Device m0s1: provider mirror/m0s1 launched.
Mounting root from ufs:/dev/mirror/m0s1a
Accounting enabled
>How-To-Repeat:
Unknown if this is repeatable on any random system. It appears to be an
issue for many people; however, I did not see any reports of multiple
drive configurations such as mine. The fact that my second drive had no
DMA issues while my first drive did may be revealing.
>Fix:
Workaround: disable dma access via hw.ata.ata_dma in /boot/loader.conf

I have not yet tested various DMA modes other than UDMA100, but PIO4
works flawlessly (albeit quite slowly).
>Release-Note:
>Audit-Trail:
>Unformatted:
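
When comparing DMA modes or drives, it can help to tally the ICRC retry warnings per device from the kernel messages. The following is a minimal sketch, not part of the original report. It assumes each warning line starts with a device name such as `ad0:` (that prefix does not appear in the message quoted in the description and is an assumption here), and the function name `count_icrc_errors` is hypothetical.

```python
import re
import sys
from collections import Counter

# Assumed line shape: "<device>: WARNING - <command> UDMA ICRC error ...".
# The "<device>:" prefix (e.g. "ad0:") is an assumption; the report quotes
# the warning without it.
ICRC_RE = re.compile(r"^(?P<dev>\w+): WARNING - (?P<cmd>\w+) UDMA ICRC error")


def count_icrc_errors(lines):
    """Return a Counter mapping device name to number of ICRC warnings."""
    counts = Counter()
    for line in lines:
        match = ICRC_RE.match(line.strip())
        if match:
            counts[match.group("dev")] += 1
    return counts


if __name__ == "__main__":
    # Read kernel messages from standard input and print one line per device.
    for dev, n in sorted(count_icrc_errors(sys.stdin).items()):
        print(f"{dev}: {n} ICRC warning(s)")
```

One possible way to use it, assuming the script is saved under the hypothetical name count_icrc.py, is to pipe the kernel message buffer into it with `dmesg | python3 count_icrc.py` after each copy attempt and compare the per-device counts.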