Date:      Sun, 12 Dec 2004 15:01:33 GMT
From:      Peter Risdon <peter@circlesquared.com>
To:        FreeBSD-gnats-submit@FreeBSD.org
Subject:   i386/74988: dma errors with large maxtor hard drives	
Message-ID:  <200412121501.iBCF1X5S064747@lorna.circlesquared.com>
Resent-Message-ID: <200412121510.iBCFAR0x001332@freefall.freebsd.org>

>Number:         74988
>Category:       i386
>Synopsis:       dma errors with large maxtor hard drives
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-i386
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Dec 12 15:10:27 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator:     Peter Risdon
>Release:        FreeBSD 5.3-STABLE i386
>Organization:
the circle squared	
>Environment:
System: FreeBSD lorna.circlesquared.com 5.3-STABLE FreeBSD 5.3-STABLE #0: Fri Dec 3 11:32:19 GMT 2004 peter@lorna.circlesquared.com:/usr/obj/usr/src/sys/LORNA5 i386


>Description:
FreeBSD 5.3 generates DMA errors with some Maxtor hard drives when the drive is under heavy use.
This has affected four 250GB drives I have tested, all of which work without problems under 4.10.

Errors in /var/log/messages are as follows:
Dec 12 20:06:54 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=185165575
Dec 12 20:06:56 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=185165576

and so on. When fsck is run on the drive, the following output is obtained:

#fsck /dev/ad0s1d
** /dev/ad0s1d
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes

CANNOT READ BLK: 185165408
UNEXPECTED SOFT UPDATE INCONSISTENCY

CONTINUE? [yn] y

THE FOLLOWING DISK SECTORS COULD NOT BE READ: 185165504, 185165505, 185165506, 185165507, 185165508, 185165509, 185165512, 185165513, 185165521,

CANNOT READ BLK: 194197856
UNEXPECTED SOFT UPDATE INCONSISTENCY

CONTINUE? [yn] y

THE FOLLOWING DISK SECTORS COULD NOT BE READ: 194197921, 194197923, 194197926, 194197928, 194197931,

CANNOT READ BLK: 237478336
UNEXPECTED SOFT UPDATE INCONSISTENCY

CONTINUE? [yn] n

and so on. This is entirely repeatable.
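
The status and error values in the kernel messages above are hexadecimal ATA register contents, and the names in angle brackets are the bits that are set. As a rough sketch of how to read them, the following Python decodes one such line. Only READY, DSC, ERROR and UNCORRECTABLE appear in this report; the other bit names are labels assumed from the usual ATA register layout, and the helper names are hypothetical.

```python
import re

# Bit names: READY, DSC, ERROR and UNCORRECTABLE are taken from the log lines
# in this report. All other names are assumptions based on the ATA register
# layout and may not match what the driver prints.
STATUS_BITS = {
    0x80: "BUSY",
    0x40: "READY",
    0x20: "DEVICE_FAULT",
    0x10: "DSC",
    0x08: "DRQ",
    0x04: "CORRECTED",
    0x02: "INDEX",
    0x01: "ERROR",
}
ERROR_BITS = {
    0x80: "ICRC",
    0x40: "UNCORRECTABLE",
    0x20: "MEDIA_CHANGED",
    0x10: "ID_NOT_FOUND",
    0x08: "MEDIA_CHANGE_REQUEST",
    0x04: "ABORTED",
    0x02: "NO_MEDIA",
    0x01: "ADDRESS_MARK_NOT_FOUND",
}

LINE_RE = re.compile(
    r"status=([0-9a-fA-F]+)<[^>]*> error=([0-9a-fA-F]+)<[^>]*> LBA=(\d+)"
)


def decode(value, names):
    """Return the names of the bits set in value, highest bit first."""
    return [name for bit, name in sorted(names.items(), reverse=True)
            if value & bit]


def parse_failure(line):
    """Return (status names, error names, LBA), or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    status = int(m.group(1), 16)
    error = int(m.group(2), 16)
    return decode(status, STATUS_BITS), decode(error, ERROR_BITS), int(m.group(3))


sample = ("Dec 12 20:06:54 goodparley kernel: ad0: FAILURE - READ_DMA "
          "status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=185165575")
print(parse_failure(sample))
# prints (['READY', 'DSC', 'ERROR'], ['UNCORRECTABLE'], 185165575)
```

In other words, status 0x51 is 0x40 + 0x10 + 0x01, and error 0x40 is the single "uncorrectable" bit, which matches the names the kernel prints.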

I have tested this on a machine as follows:

# uname -a
FreeBSD goodparley.circlesquared.com 5.3-STABLE FreeBSD 5.3-STABLE #0: Sat Dec  4 23:19:42 GMT 2004     peter@goodparley.circlesquared.com:/usr/obj/usr/src/sys/GENERIC  i386


# df
Filesystem  1K-blocks    Used    Avail Capacity  Mounted on
/dev/da0s1a    495726   66874   389194    15%    /
devfs               1       1        0   100%    /dev
/dev/da0s1d  13668160 2130142 10444566    17%    /usr

# less /var/run/dmesg.boot

Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.3-STABLE #0: Sat Dec  4 23:19:42 GMT 2004
    peter@goodparley.circlesquared.com:/usr/obj/usr/src/sys/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Celeron(R) CPU 2.40GHz (2394.01-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
real memory  = 1056899072 (1007 MB)
avail memory = 1024679936 (977 MB)
ACPI APIC Table: <IntelR AWRDACPI>
ioapic0 <Version 2.0> irqs 0-23 on motherboard
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <IntelR AWRDACPI> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_tz0: <Thermal Zone> on acpi0
acpi_button0: <Power Button> on acpi0
acpi_button1: <Sleep Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <Intel 82865G (865G GMCH) SVGA controller> port 0xb000-0xb007 mem 0xfa000000-0xfa07ffff,0xf0000000-0xf7ffffff irq 16 at device 2.0 on pci0
agp0: detected 16252k stolen memory
agp0: aperture size is 128M
uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xa000-0xa01f irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xa400-0xa41f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xa800-0xa81f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: <Intel 82801EB (ICH5) USB controller USB-D> port 0xac00-0xac1f irq 16 at device 29.3 on pci0
uhci3: [GIANT-LOCKED]
usb3: <Intel 82801EB (ICH5) USB controller USB-D> on uhci3
usb3: USB revision 1.0
uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
pci0: <serial bus, USB> at device 29.7 (no driver attached)
pcib1: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci1: <ACPI PCI bus> on pcib1
ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0x9000-0x90ff mem 0xf9000000-0xf9000fff irq 17 at device 1.0 on pci1
ahc0: [GIANT-LOCKED]
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
rl0: <RealTek 8139 10/100BaseTX> port 0x9400-0x94ff mem 0xf9001000-0xf90010ff irq 19 at device 3.0 on pci1
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl0: Ethernet address: 00:e0:4c:b8:c2:e4
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH5 UDMA100 controller> port 0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
pci0: <multimedia, audio> at device 31.5 (no driver attached)
fdc0: <floppy drive controller> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: [FAST]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
ppc0: <Standard parallel printer port> port 0x378-0x37f irq 7 on acpi0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
orm0: <ISA Option ROM> at iomem 0xd2000-0xd47ff on isa0
pmtimer0 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2394011200 Hz quality 800
Timecounters tick every 10.000 msec
ad0: 239372MB <Maxtor 7Y250P0/YAR41BW0> [486344/16/63] at ata0-master UDMA100
Waiting 15 seconds for SCSI devices to settle
da0 at ahc0 bus 0 target 0 lun 0
da0: <SEAGATE ST318406LW 8A03> Fixed Direct Access SCSI-3 device
da0: 40.000MB/s transfers (20.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 17366MB (35566478 512 byte sectors: 255H 63S/T 2213C)
Mounting root from ufs:/dev/da0s1a
WARNING: / was not properly dismounted
WARNING: /usr was not properly dismounted
WARNING: /var was not properly dismounted
/var: mount pending error: blocks 20 files 5
/var: superblock summary recomputed

#####################################################################
No dma errors yet. But try doing something intensive with the disk...
#####################################################################

#fsck /dev/ad0s1d
** /dev/ad0s1d
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes

CANNOT READ BLK: 185165408
UNEXPECTED SOFT UPDATE INCONSISTENCY

CONTINUE? [yn] y

THE FOLLOWING DISK SECTORS COULD NOT BE READ: 185165504, 185165505, 185165506, 185165507, 185165508, 185165509, 185165512, 185165513, 185165521,

CANNOT READ BLK: 194197856
UNEXPECTED SOFT UPDATE INCONSISTENCY

CONTINUE? [yn] y

THE FOLLOWING DISK SECTORS COULD NOT BE READ: 194197921, 194197923, 194197926, 194197928, 194197931,

CANNOT READ BLK: 237478336
UNEXPECTED SOFT UPDATE INCONSISTENCY

CONTINUE? [yn] n


#tail /var/log/messages
Dec 12 20:06:54 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=185165575
Dec 12 20:06:56 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=185165576
Dec 12 20:06:59 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=185165584
Dec 12 20:07:00 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=194197919
Dec 12 20:07:07 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=194197984
Dec 12 20:07:08 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=194197986
Dec 12 20:07:10 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=194197989
Dec 12 20:07:11 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=194197991
Dec 12 20:07:13 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=194197994
Dec 12 20:07:15 goodparley kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=237478399
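
One thing the numbers above suggest, although the report does not state it: every LBA in this tail output is exactly 63 higher than a block or sector number printed by fsck. That would be consistent with the filesystem starting 63 sectors into the disk, but the disklabel is not shown here, so treat that reading as an assumption. A small Python check over the figures quoted in this report (the function name is hypothetical):

```python
# LBAs from the "tail /var/log/messages" output in this report.
KERNEL_LBAS = [
    185165575, 185165576, 185165584, 194197919, 194197984,
    194197986, 194197989, 194197991, 194197994, 237478399,
]

# "CANNOT READ BLK" numbers and unreadable sectors from the fsck output.
FSCK_NUMBERS = {
    185165408, 185165504, 185165505, 185165506, 185165507, 185165508,
    185165509, 185165512, 185165513, 185165521,
    194197856, 194197921, 194197923, 194197926, 194197928, 194197931,
    237478336,
}


def consistent_offsets(lbas, fsck_numbers):
    """Return every offset d for which each lba - d is a number fsck reported."""
    # Any valid offset must map the first LBA onto some fsck number,
    # so that gives the full list of candidates to test.
    candidates = {lbas[0] - n for n in fsck_numbers}
    return sorted(d for d in candidates
                  if all(lba - d in fsck_numbers for lba in lbas))


print(consistent_offsets(KERNEL_LBAS, FSCK_NUMBERS))
# prints [63]
```

So the kernel and fsck appear to be complaining about the same sectors, seen from the start of the disk and from the start of the filesystem respectively.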

######################################
Then I reinstalled the same machine using 4.10 and could use the disk without problem.
>How-To-Repeat:
Install 5.3 on any machine with an Intel 865 or 845 chipset and fit a Maxtor 250GB hard drive. The failure has been entirely consistent.
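
The description above used fsck to put the disk under load. As an alternative that does not depend on the filesystem, a plain sequential read should exercise the drive in a similar way. The Python below is only a sketch of that idea and was not part of the original testing: the function name, the chunk size and the failure limit are all assumptions, and reading a raw device such as /dev/ad0 would normally need root.

```python
import os


def scan(path, chunk_size=64 * 1024, max_failures=1000):
    """Read path sequentially.

    Returns (bytes_read, offsets) where offsets lists the start of every
    chunk whose read raised an error. Stops at end of file, or after
    max_failures failed chunks so a dead device cannot loop forever.
    """
    failed = []
    total = 0
    offset = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while len(failed) < max_failures:
            try:
                data = os.pread(fd, chunk_size, offset)
            except OSError:
                # Record the failing chunk and skip past it.
                failed.append(offset)
                offset += chunk_size
                continue
            if not data:
                break  # end of file or device
            total += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return total, failed
```

Called as scan("/dev/ad0"), any offsets it returns can be divided by 512 and compared with the LBA values in /var/log/messages.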
>Fix:

None.

>Release-Note:
>Audit-Trail:
>Unformatted:


