Date:      Thu, 24 Jan 2013 14:19:38 +0200
From:      "Vladislav Prodan" <universite@ukr.net>
To:        fs@freebsd.org
Cc:        current@freebsd.org
Subject:   AHCI timeout when using ZFS + AIO + NCQ
Message-ID:  <13391.1359029978.3957795939058384896@ffe16.ukr.net>

I have the following server:

FreeBSD 9.1-PRERELEASE #0: Wed Jul 25 01:40:56 EEST 2012

Jan 24 12:53:01 vesuvius kernel: atapci0: <JMicron ATA controller> port 0xc040-0xc047,0xc030-0xc033,0xc020-0xc027,0xc010-0xc013,0xc000-0xc00f mem 0xfe210000-0xfe2101ff irq 51 at device 0.0 on pci3
...
Jan 24 12:53:01 vesuvius kernel: ahci0: <ATI IXP700 AHCI SATA controller> port 0xf040-0xf047,0xf030-0xf033,0xf020-0xf027,0xf010-0xf013,0xf000-0xf00f mem 0xfe307000-0xfe3073ff irq 19 at device 17.0 on pci0
Jan 24 12:53:01 vesuvius kernel: ahci0: AHCI v1.20 with 6 6Gbps ports, Port Multiplier supported
...
Jan 24 12:53:01 vesuvius kernel: ada2 at ahcich2 bus 0 scbus4 target 0 lun 0
Jan 24 12:53:01 vesuvius kernel: ada2: <ST3000DM001-9YN166 CC4C> ATA-8 SATA 3.x device
Jan 24 12:53:01 vesuvius kernel: ada2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
Jan 24 12:53:01 vesuvius kernel: ada2: Command Queueing enabled
Jan 24 12:53:01 vesuvius kernel: ada2: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)
Jan 24 12:53:01 vesuvius kernel: ada2: Previously was known as ad12
...
I use 4 HDDs in RAID10 via ZFS.

At very irregular intervals, HDDs drop off. As a result, the server stops.

Jan 24 06:48:06 vesuvius kernel: ahcich2: Timeout on slot 6 port 0
Jan 24 06:48:06 vesuvius kernel: ahcich2: is 00000000 cs 00000000 ss 000000c0 rs 000000c0 tfd 40 serr 00000000 cmd 0000e817
Jan 24 06:48:06 vesuvius kernel: (ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 4c 4e 1e 40 68 00 00 01 00 00
Jan 24 06:48:06 vesuvius kernel: (ada2:ahcich2:0:0:0): CAM status: Command timeout
Jan 24 06:48:06 vesuvius kernel: (ada2:ahcich2:0:0:0): Retrying command
Jan 24 06:51:11 vesuvius kernel: ahcich2: AHCI reset: device not ready after 31000ms (tfd = 00000080)
Jan 24 06:51:11 vesuvius kernel: ahcich2: Timeout on slot 8 port 0
Jan 24 06:51:11 vesuvius kernel: ahcich2: is 00000000 cs 00000100 ss 00000000 rs 00000100 tfd 00 serr 00000000 cmd 0000e817
Jan 24 06:51:11 vesuvius kernel: (aprobe0:ahcich2:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Jan 24 06:51:11 vesuvius kernel: (aprobe0:ahcich2:0:0:0): CAM status: Command timeout
Jan 24 06:51:11 vesuvius kernel: (aprobe0:ahcich2:0:0:0): Error 5, Retry was blocked
Jan 24 06:51:11 vesuvius kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 4227133, size: 8192
Jan 24 06:51:11 vesuvius kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 4227133, size: 8192
Jan 24 06:51:11 vesuvius kernel: ahcich2: AHCI reset: device not ready after 31000ms (tfd = 00000080)
Jan 24 06:51:11 vesuvius kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 4227133, size: 8192
Jan 24 06:51:11 vesuvius kernel: ahcich2: Timeout on slot 8 port 0
Jan 24 06:51:11 vesuvius kernel: ahcich2: is 00000000 cs 00000100 ss 00000000 rs 00000100 tfd 00 serr 00000000 cmd 0000e817
Jan 24 06:51:11 vesuvius kernel: (aprobe0:ahcich2:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Jan 24 06:51:11 vesuvius kernel: (aprobe0:ahcich2:0:0:0): CAM status: Command timeout
Jan 24 06:51:11 vesuvius kernel: (aprobe0:ahcich2:0:0:0): Error 5, Retry was blocked
Jan 24 06:51:11 vesuvius kernel: swap_pager: I/O error - pagein failed; blkno 4227133,size 8192, error 6
Jan 24 06:51:11 vesuvius kernel: (ada2:(pass2:vm_fault: pager read error, pid 1943 (named)
Jan 24 06:51:11 vesuvius kernel: ahcich2:0:ahcich2:0:0:0:0): lost device
Jan 24 06:51:11 vesuvius kernel: 0): passdevgonecb: devfs entry is gone
Jan 24 06:51:11 vesuvius kernel: pid 1943 (named), uid 53: exited on signal 11
...
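
To make the register dumps above easier to read, here is a minimal Python sketch that turns the hex masks into slot numbers. It rests on assumptions that are not stated in this report: that "cs", "ss" and "rs" are per-slot bitmasks (commands issued, NCQ commands active, slots the driver considers running) and that bit 7 of "tfd" is the ATA BSY status bit. The function names are hypothetical.

```python
import re

# Matches the "is ... cs ... ss ... rs ... tfd ... serr ... cmd ..." dump
# line. The meaning given to each field below is an assumption.
DUMP_RE = re.compile(
    r"is (?P<istat>[0-9a-f]{8}) cs (?P<cs>[0-9a-f]{8}) "
    r"ss (?P<ss>[0-9a-f]{8}) rs (?P<rs>[0-9a-f]{8}) "
    r"tfd (?P<tfd>[0-9a-f]{2}) serr (?P<serr>[0-9a-f]{8}) "
    r"cmd (?P<cmd>[0-9a-f]{8})"
)


def slots(mask):
    """Return the bit positions (slot numbers) set in a 32-bit mask."""
    return [bit for bit in range(32) if mask & (1 << bit)]


def decode(line):
    """Decode one dump line; return None if the line does not match."""
    m = DUMP_RE.search(line)
    if m is None:
        return None
    regs = {name: int(value, 16) for name, value in m.groupdict().items()}
    return {
        "issued_slots": slots(regs["cs"]),      # assumed: command issue mask
        "active_ncq_slots": slots(regs["ss"]),  # assumed: SATA active mask
        "running_slots": slots(regs["rs"]),     # assumed: driver's running slots
        "busy": bool(regs["tfd"] & 0x80),       # assumed: ATA BSY status bit
        "serr": regs["serr"],
    }


first = decode("ahcich2: is 00000000 cs 00000000 ss 000000c0 rs 000000c0 "
               "tfd 40 serr 00000000 cmd 0000e817")
second = decode("ahcich2: is 00000000 cs 00000100 ss 00000000 rs 00000100 "
                "tfd 00 serr 00000000 cmd 0000e817")
print(first["active_ncq_slots"], second["issued_slots"])
```

Under these assumptions, the first dump has slots 6 and 7 outstanding as queued commands, which fits "Timeout on slot 6", and the second has only slot 8 issued, which fits the ATA_IDENTIFY timeout on slot 8.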

Only a restart via the Power button helps.
Judging by the SMART data, the HDDs have no problems. I have already replaced the SATA data cable.


I found similar problems:

http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/055374.html
PR: amd64/165547: NVIDIA MCP67 AHCI SATA controller timeout 

-- 
Vladislav V. Prodan            
System & Network Administrator 
http://support.od.ua           
+380 67 4584408, +380 99 4060508
VVP88-RIPE
