Date: Sun, 12 Dec 2004 15:01:33 GMT
From: Peter Risdon <peter@circlesquared.com>
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: i386/74988: dma errors with large maxtor hard drives
Message-ID: <200412121501.iBCF1X5S064747@lorna.circlesquared.com>
Resent-Message-ID: <200412121510.iBCFAR0x001332@freefall.freebsd.org>
>Number: 74988
>Category: i386
>Synopsis: dma errors with large maxtor hard drives
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-i386
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sun Dec 12 15:10:27 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator: Peter Risdon
>Release: FreeBSD 5.3-STABLE i386
>Organization: the circle squared
>Environment:
System: FreeBSD lorna.circlesquared.com 5.3-STABLE FreeBSD 5.3-STABLE #0: Fri Dec 3 11:32:19 GMT 2004 peter@lorna.circlesquared.com:/usr/obj/usr/src/sys/LORNA5 i386
>Description:
5.3 generates dma errors with some Maxtor hard drives when the drive is subject to heavy use. This has affected four 250GB drives I have tested, all of which work without problem under 4.10.

Errors in /var/log/messages are as follows:

Dec 12 20:06:54 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=185165575
Dec 12 20:06:56 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=185165576

and so on.

When fsck is run on the drive, the following output is obtained:

#fsck /dev/ad0s1d
** /dev/ad0s1d
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes

CANNOT READ BLK: 185165408
UNEXPECTED SOFT UPDATE INCONSISTENCY

CONTINUE? [yn] y

THE FOLLOWING DISK SECTORS COULD NOT BE READ: 185165504, 185165505, 185165506, 185165507, 185165508, 185165509, 185165512, 185165513, 185165521,

CANNOT READ BLK: 194197856
UNEXPECTED SOFT UPDATE INCONSISTENCY

CONTINUE? [yn] y

THE FOLLOWING DISK SECTORS COULD NOT BE READ: 194197921, 194197923, 194197926, 194197928, 194197931,

CANNOT READ BLK: 237478336
UNEXPECTED SOFT UPDATE INCONSISTENCY

CONTINUE? [yn] n

and so on. This is entirely repeatable.
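For readers unfamiliar with the log format, the hexadecimal status= and error= values are bit masks, and the names in angle brackets are the labels of the set bits. The following is a minimal sketch, not part of the original report, that decodes one of the lines above. The bit tables are deliberately partial: they name only the bits that appear in this report. The helper names (decode, parse_failure) are made up for illustration.

```python
import re

# Partial bit tables (bit position -> label). Only the bits that actually
# appear in this report are named; any other set bit is shown as "bitN".
STATUS_BITS = {6: "READY", 4: "DSC", 0: "ERROR"}
ERROR_BITS = {6: "UNCORRECTABLE"}

# Matches the tail of an ata(4) failure line as quoted in the report.
LINE_RE = re.compile(
    r"status=([0-9a-fA-F]+)<[^>]*> error=([0-9a-fA-F]+)<[^>]*> LBA=(\d+)"
)


def decode(value, names):
    """Return labels for the set bits of an 8-bit register, highest bit first."""
    return [
        names.get(bit, f"bit{bit}")
        for bit in range(7, -1, -1)
        if value & (1 << bit)
    ]


def parse_failure(line):
    """Return (status labels, error labels, LBA), or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    status = int(m.group(1), 16)  # the kernel prints these registers in hex
    error = int(m.group(2), 16)
    return decode(status, STATUS_BITS), decode(error, ERROR_BITS), int(m.group(3))


sample = (
    "Dec 12 20:06:54 goodparley kernel: ad0: FAILURE - READ_DMA "
    "status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=185165575"
)
print(parse_failure(sample))
# (['READY', 'DSC', 'ERROR'], ['UNCORRECTABLE'], 185165575)
```

Here 0x51 has bits 6, 4 and 0 set and 0x40 has bit 6 set, which matches the three status labels and the single error label the kernel printed.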
I have tested this on a machine as follows:

# uname -a
FreeBSD goodparley.circlesquared.com 5.3-STABLE FreeBSD 5.3-STABLE #0: Sat Dec 4 23:19:42 GMT 2004 peter@goodparley.circlesquared.com:/usr/obj/usr/src/sys/GENERIC i386

# df
Filesystem  1K-blocks    Used    Avail Capacity  Mounted on
/dev/da0s1a    495726   66874   389194    15%    /
devfs               1       1        0   100%    /dev
/dev/da0s1d  13668160 2130142 10444566    17%    /usr

# less /var/run/dmesg.boot
Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.3-STABLE #0: Sat Dec 4 23:19:42 GMT 2004
    peter@goodparley.circlesquared.com:/usr/obj/usr/src/sys/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Celeron(R) CPU 2.40GHz (2394.01-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0xf29 Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
real memory = 1056899072 (1007 MB)
avail memory = 1024679936 (977 MB)
ACPI APIC Table: <IntelR AWRDACPI>
ioapic0 <Version 2.0> irqs 0-23 on motherboard
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <IntelR AWRDACPI> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_tz0: <Thermal Zone> on acpi0
acpi_button0: <Power Button> on acpi0
acpi_button1: <Sleep Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <Intel 82865G (865G GMCH) SVGA controller> port 0xb000-0xb007 mem 0xfa000000-0xfa07ffff,0xf0000000-0xf7ffffff irq 16 at device 2.0 on pci0
agp0: detected 16252k stolen memory
agp0: aperture size is 128M
uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xa000-0xa01f irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xa400-0xa41f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xa800-0xa81f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: <Intel 82801EB (ICH5) USB controller USB-D> port 0xac00-0xac1f irq 16 at device 29.3 on pci0
uhci3: [GIANT-LOCKED]
usb3: <Intel 82801EB (ICH5) USB controller USB-D> on uhci3
usb3: USB revision 1.0
uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
pci0: <serial bus, USB> at device 29.7 (no driver attached)
pcib1: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci1: <ACPI PCI bus> on pcib1
ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0x9000-0x90ff mem 0xf9000000-0xf9000fff irq 17 at device 1.0 on pci1
ahc0: [GIANT-LOCKED]
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
rl0: <RealTek 8139 10/100BaseTX> port 0x9400-0x94ff mem 0xf9001000-0xf90010ff irq 19 at device 3.0 on pci1
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus0
rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl0: Ethernet address: 00:e0:4c:b8:c2:e4
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH5 UDMA100 controller> port 0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
pci0: <multimedia, audio> at device 31.5 (no driver attached)
fdc0: <floppy drive controller> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: [FAST]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
ppc0: <Standard parallel printer port> port 0x378-0x37f irq 7 on acpi0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
orm0: <ISA Option ROM> at iomem 0xd2000-0xd47ff on isa0
pmtimer0 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2394011200 Hz quality 800
Timecounters tick every 10.000 msec
ad0: 239372MB <Maxtor 7Y250P0/YAR41BW0> [486344/16/63] at ata0-master UDMA100
Waiting 15 seconds for SCSI devices to settle
da0 at ahc0 bus 0 target 0 lun 0
da0: <SEAGATE ST318406LW 8A03> Fixed Direct Access SCSI-3 device
da0: 40.000MB/s transfers (20.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 17366MB (35566478 512 byte sectors: 255H 63S/T 2213C)
Mounting root from ufs:/dev/da0s1a
WARNING: / was not properly dismounted
WARNING: /usr was not properly dismounted
WARNING: /var was not properly dismounted
/var: mount pending error: blocks 20 files 5
/var: superblock summary recomputed

#####################################################################
No dma errors yet. But try doing something intensive with the disk...
#####################################################################

#fsck /dev/ad0s1d
** /dev/ad0s1d
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes

CANNOT READ BLK: 185165408
UNEXPECTED SOFT UPDATE INCONSISTENCY

CONTINUE? [yn] y

THE FOLLOWING DISK SECTORS COULD NOT BE READ: 185165504, 185165505, 185165506, 185165507, 185165508, 185165509, 185165512, 185165513, 185165521,

CANNOT READ BLK: 194197856
UNEXPECTED SOFT UPDATE INCONSISTENCY

CONTINUE? [yn] y

THE FOLLOWING DISK SECTORS COULD NOT BE READ: 194197921, 194197923, 194197926, 194197928, 194197931,

CANNOT READ BLK: 237478336
UNEXPECTED SOFT UPDATE INCONSISTENCY

CONTINUE? [yn] n

#tail /var/log/messages
Dec 12 20:06:54 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=185165575
Dec 12 20:06:56 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=185165576
Dec 12 20:06:59 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=185165584
Dec 12 20:07:00 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=194197919
Dec 12 20:07:07 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=194197984
Dec 12 20:07:08 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=194197986
Dec 12 20:07:10 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=194197989
Dec 12 20:07:11 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=194197991
Dec 12 20:07:13 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=194197994
Dec 12 20:07:15 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=237478399

######################################

Then I reinstalled the same machine using 4.10 and could use the disk without problem.
>How-To-Repeat:
Install 5.3 on any machine with an Intel 865 or 845 chipset, install Maxtor 250GB hard drive. It has been entirely consistent.
>Fix:
None.
>Release-Note:
>Audit-Trail:
>Unformatted:
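The fsck output and the kernel messages in the description use different numbering: fsck reports block and sector numbers relative to /dev/ad0s1d, while the kernel reports LBAs relative to the whole disk. The sketch below, which is not part of the original report, checks whether the two lists describe the same sectors. It ASSUMES the partition starts 63 sectors into the disk. The report does not show the slice or partition layout, so that offset is a guess, although the numbers quoted are consistent with it. The helper names are made up for illustration.

```python
# Block and sector numbers printed by fsck for /dev/ad0s1d (partition-relative),
# copied from the report.
FSCK_BLOCKS = [185165408, 194197856, 237478336]
FSCK_SECTORS = [
    185165504, 185165505, 185165506, 185165507, 185165508, 185165509,
    185165512, 185165513, 185165521,
    194197921, 194197923, 194197926, 194197928, 194197931,
]

# LBAs from the "tail /var/log/messages" output (whole-disk addresses).
KERNEL_LBAS = [
    185165575, 185165576, 185165584, 194197919, 194197984,
    194197986, 194197989, 194197991, 194197994, 237478399,
]

# ASSUMPTION: ad0s1d begins 63 sectors into the disk. The report does not
# state the slice/partition layout; this value is inferred, not quoted.
PARTITION_START = 63


def to_disk_lba(number, start=PARTITION_START):
    """Convert a partition-relative sector number to a whole-disk LBA."""
    return number + start


def unexplained_lbas(kernel_lbas, fsck_numbers, start=PARTITION_START):
    """Return kernel LBAs that do not correspond to any shifted fsck number."""
    shifted = {to_disk_lba(n, start) for n in fsck_numbers}
    return sorted(set(kernel_lbas) - shifted)


print(unexplained_lbas(KERNEL_LBAS, FSCK_BLOCKS + FSCK_SECTORS))
# []
```

Under that assumption every LBA in the kernel log maps back to a block or sector that fsck reported (for example 185165512 + 63 = 185165575), so the two outputs appear to describe the same read failures and not separate faults.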