Date:      Wed, 25 Jun 2008 13:27:52 -0500
From:      Reid Linnemann <lreid@cs.okstate.edu>
To:        "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject:   READ_DMA timeouts, etc. on FreeBSD 7-STABLE SATA
Message-ID:  <48628E28.7080004@cs.okstate.edu>

Hi guys,

I'm running 7-STABLE, last synced in early June (June 7, I think). I
have two identical 160 GB Western Digital WD1600AAJS SATA disks on a
SiS 180 SATA controller. The disks are gmirrored, and the mirror provides
all of my filesystems. After I built the mirror in single user mode and
rebooted, I started getting DMA errors such as:

Jun 21 11:56:28 hautlos kernel: ad6: TIMEOUT - READ_DMA retrying (1 retry left) LBA=2830976
Jun 21 11:56:28 hautlos kernel: ad6: TIMEOUT - READ_DMA retrying (1 retry left) LBA=2901888
Jun 21 11:56:28 hautlos kernel: ad6: TIMEOUT - READ_DMA retrying (1 retry left) LBA=2995328

The LBA is apparently random. Most of the time this just makes the
machine crawl and is annoying, but if, say, a filesystem was not
unmounted cleanly because of a power failure, the combined activity of
the mirror rebuilding and the fsck causes much more disconcerting
errors, e.g.:

Jun 21 11:48:46 hautlos kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Jun 21 11:49:02 hautlos kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Jun 21 11:49:02 hautlos kernel: ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Jun 21 11:49:02 hautlos kernel: ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Jun 21 11:49:02 hautlos kernel: ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
Jun 21 11:49:02 hautlos kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=196200751
Jun 21 11:49:02 hautlos kernel: ad6: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=196442127

But, now for the weird part...

I tried booting in single user mode, disabling DMA, and disabling ACPI,
to no avail. Soft boot, hard boot, doesn't matter. But if I power the
machine down, cut power to the power supply, drain the residual charge
by pressing the ATX power button, and then boot up, the DMA errors
vanish completely or nearly completely. Since I did this on Jun 21 I
have logged only 2 READ_DMA timeouts:

messages:Jun 22 03:02:15 hautlos kernel: ad4: TIMEOUT - READ_DMA retrying (1 retry left) LBA=56884207
messages:Jun 24 10:52:41 hautlos kernel: ad4: TIMEOUT - READ_DMA retrying (1 retry left) LBA=243514511
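
To compare how often each disk times out before and after a cold power
cycle, the TIMEOUT lines can be tallied per device and command. Below is
a minimal Python sketch, assuming the syslog format shown above with each
entry on a single line. The function name count_timeouts and the regular
expression are illustrative only and are not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Matches ata(4) retry lines in the format quoted in this message, e.g.
#   ... kernel: ad6: TIMEOUT - READ_DMA retrying (1 retry left) LBA=2830976
# The pattern is an assumption based only on those quoted lines.
TIMEOUT_RE = re.compile(
    r"kernel: (?P<dev>ad\d+): TIMEOUT - (?P<cmd>\w+) retrying .*LBA=(?P<lba>\d+)"
)


def count_timeouts(lines):
    """Return a Counter keyed by (device, command) for each TIMEOUT line.

    WARNING lines (such as the SETFEATURES taskqueue timeouts) do not
    match the pattern and are therefore ignored.
    """
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("dev"), match.group("cmd"))] += 1
    return counts
```

Passing an open log file, for example count_timeouts(open("/var/log/messages")),
would give per-disk totals that can be compared across reboots.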

Does anyone have any ideas? I've googled but can't find any solutions.
I'm not currently subscribed to stable@, so please cc: me in responses.
My uname -a and dmesg follow.

FreeBSD hautlos 7.0-STABLE FreeBSD 7.0-STABLE #7: Sat Jun  7 10:46:48
CDT 2008     root@:/usr/obj/usr/src/sys/HAUTLOS  i386


Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.0-STABLE #7: Sat Jun  7 10:46:48 CDT 2008
    root@:/usr/obj/usr/src/sys/HAUTLOS
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 Processor 3000+ (1999.44-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x20fc2  Stepping = 2
  Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
  Features2=0x1<SSE3>
  AMD Features=0xe2500800<SYSCALL,NX,MMX+,FFXSR,LM,3DNow!+,3DNow!>
  AMD Features2=0x1<LAHF>
real memory  = 1073676288 (1023 MB)
avail memory = 1037291520 (989 MB)
ACPI APIC Table: <AWARD  AWRDACPI>
ioapic0 <Version 1.4> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <AWARD AWRDACPI> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, 3fef0000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
acpi_button1: <Sleep Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port
0xcf8-0xcff,0x480-0x48f,0x1000-0x10df,0x10e0-0x10ff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <SiS 755 host to AGP bridge> on hostb0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
vgapci0: <VGA-compatible display> port 0xd000-0xd0ff mem
0xd0000000-0xd7ffffff,0xe8020000-0xe802ffff irq 16 at device 0.0 on pci1
vgapci1: <VGA-compatible display> mem
0xd8000000-0xdfffffff,0xe8030000-0xe803ffff at device 0.1 on pci1
isab0: <PCI-ISA bridge> at device 2.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <SiS 964 UDMA133 controller> port
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x4000-0x400f at device 2.5 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
pcm0: <SiS 7012> port 0xe000-0xe0ff,0xe100-0xe17f irq 18 at device 2.7
on pci0
pcm0: [ITHREAD]
pcm0: <Avance Logic ALC655 AC97 Codec>
ohci0: <SiS 5571 USB controller> mem 0xe8124000-0xe8124fff irq 20 at
device 3.0 on pci0
ohci0: [GIANT-LOCKED]
ohci0: [ITHREAD]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: <SiS 5571 USB controller> on ohci0
usb0: USB revision 1.0
uhub0: <SiS OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 3 ports with 3 removable, self powered
ohci1: <SiS 5571 USB controller> mem 0xe8120000-0xe8120fff irq 21 at
device 3.1 on pci0
ohci1: [GIANT-LOCKED]
ohci1: [ITHREAD]
usb1: OHCI version 1.0, legacy support
usb1: SMM does not respond, resetting
usb1: <SiS 5571 USB controller> on ohci1
usb1: USB revision 1.0
uhub1: <SiS OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 3 ports with 3 removable, self powered
ohci2: <SiS 5571 USB controller> mem 0xe8121000-0xe8121fff irq 22 at
device 3.2 on pci0
ohci2: [GIANT-LOCKED]
ohci2: [ITHREAD]
usb2: OHCI version 1.0, legacy support
usb2: SMM does not respond, resetting
usb2: <SiS 5571 USB controller> on ohci2
usb2: USB revision 1.0
uhub2: <SiS OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb2
uhub2: 2 ports with 2 removable, self powered
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xe8122000-0xe8122fff irq
23 at device 3.3 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb3: EHCI version 1.0
usb3: companion controllers, 3 ports each: usb0 usb1 usb2
usb3: <EHCI (generic) USB 2.0 controller> on ehci0
usb3: USB revision 2.0
uhub3: <SiS EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb3
uhub3: 8 ports with 8 removable, self powered
umass0: <Apple iPod, class 0/0, rev 2.00/0.02, addr 2> on uhub3
sis0: <SiS 900 10/100BaseTX> port 0xe200-0xe2ff mem
0xe8123000-0xe8123fff irq 19 at device 4.0 on pci0
miibus0: <MII bus> on sis0
rlphy0: <RTL8201L 10/100 media interface> PHY 1 on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
sis0: Ethernet address: 00:14:2a:68:cf:ff
sis0: [ITHREAD]
atapci1: <SiS 180 SATA150 controller> port
0xe300-0xe307,0xe400-0xe403,0xe500-0xe507,0xe600-0xe603,0xe700-0xe70f
irq 17 at device 5.0 on pci0
atapci1: [ITHREAD]
ata2: <ATA channel 0> on atapci1
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci1
ata3: [ITHREAD]
ahc0: <Adaptec 2930CU SCSI adapter> port 0xe800-0xe8ff mem
0xe8125000-0xe8125fff irq 16 at device 12.0 on pci0
ahc0: [ITHREAD]
aic7860: Ultra Single Channel A, SCSI Id=7, 3/253 SCBs
acpi_tz0: <Thermal Zone> on acpi0
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on
acpi0
sio0: type 16550A
sio0: [FILTER]
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem
0xc0000-0xccfff,0xd0000-0xd7fff,0xd8000-0xd87ff pnpid ORM0000 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model IntelliMouse, device ID 3
ppc0: <Parallel port> at port 0x378-0x37f on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Polled port
ppi0: <Parallel I/O> on ppbus0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ukbd0: <vendor 0x05af USB Keyboard, class 0/0, rev 1.10/1.30, addr 2> on
uhub0
kbd2 at ukbd0
uhid0: <vendor 0x05af USB Keyboard, class 0/0, rev 1.10/1.30, addr 2> on
uhub0
ugen0: <American Power Conversion Back-UPS XS  900 FW:830.E6 .D USB
FW:E6, class 0/0, rev 1.10/1.06, addr 2> on uhub2
Timecounter "TSC" frequency 1999440640 Hz quality 800
Timecounters tick every 1.000 msec
acd0: CDRW <Memorex 52MAX 325216AJv2/RW$5> at ata0-master UDMA33
ad4: 152627MB <WDC WD1600AAJS-00PSA0 05.06H05> at ata2-master SATA150
ad6: 152627MB <WDC WD1600AAJS-00PSA0 05.06H05> at ata3-master SATA150
GEOM_MIRROR: Device mirror/gm0 launched (2/2).
Waiting 2 seconds for SCSI devices to settle
da0 at umass-sim0 bus 0 target 0 lun 0
da0: <Apple iPod 1.62> Removable Direct Access SCSI-0 device
da0: 40.000MB/s transfers
da0: 1936MB (991232 2048 byte sectors: 255H 63S/T 61C)
GEOM_LABEL: Label for provider da0 is label/ipod.
GEOM_LABEL: Label for provider da0s2 is msdosfs/IPOD.
Trying to mount root from ufs:/dev/mirror/gm0s1a
drm0: <ATI Radeon AD 9500> on vgapci0
info: [drm] AGP at 0xe0000000 128MB
info: [drm] Initialized radeon 1.25.0 20060524
info: [drm] Setting GART location based on new memory map
info: [drm] Loading R300 Microcode
info: [drm] writeback test succeeded in 1 usecs
drm0: [ITHREAD]
powernow0: <Cool`n'Quiet K8> on cpu0


