Date: Tue, 2 Mar 2010 23:52:54 -0800
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: freebsd-stable@freebsd.org
Subject: Re: ahcich timeouts, only with ahci, not with ataahci
Message-ID: <20100303075254.GA47119@icarus.home.lan>
In-Reply-To: <4B8E1489.2070306@omnilan.de>
References: <1266934981.00222684.1266922202@10.7.7.3> <4B83EFD4.8050403@FreeBSD.org> <4B8E1489.2070306@omnilan.de>
On Wed, Mar 03, 2010 at 08:49:29AM +0100, Harald Schmalzbauer wrote:
> Alexander Motin wrote on 23.02.2010 16:10 (localtime):
> >Harald Schmalzbauer wrote:
> >>I'm frequently getting my machine locked with ahcichX timeouts:
> >>ahcich2: Timeout on slot 0
> >>ahcich2: is 00000000 cs 00000001 ss 00000000 rs 00000001 tfd c0 serr 00000000
> >>ahcich2: Timeout on slot 8
> >>ahcich2: is 00000000 cs 00000100 ss 00000000 rs 00000100 tfd c0 serr 00000000
> >>ahcich2: Timeout on slot 8
> >>ahcich2: is 00000000 cs fffff07f ss ffffff7f rs ffffff7f tfd c0 serr 00000000
> >>...
> >
> >Looking that is (Interrupt status) is zero and `rs == cs | ss` (running
> >command bitmasks in driver and hardware), controller doesn't report
> >command completion. Looking on TFD status 0xc0 with BUSY bit set, I
> >would suppose that either disk stuck in command processing for some
> >reason, or controller missed command completion status.
> >
> >Have you noticed 30 second (default ATA timeout) pause before timeout
> >message printed? Just want to be sure that driver waited enough before
> >give up.
> >
> >>This happens when backup over GbE overloads ZFS/HDD capabilities.
> >>I reduced vfs.zfs.txg.timeout to 1 to prevent the machine from locking
> >>up almost immediately, but from it still happens.
> >>When I don't use ahci but ataahci (the old driver if I understand things
> >>correct) I also see the ZFS burst write congestion, but this doesn't
> >>lead to controller timeouts, thus blocking the machine.
> >>
> >>Sometimes the machine recovers from the disk lock, but most often I have
> >>to reboot.
> >
> >How it looks when it doesn't? Can you send me full log messages?
>
> Hello, this morning I had a stall, but the machine recovered after
> about one Minute.
> Here's what I got from the kernel:
> ahcich2: Timeout on slot 29
> ahcich2: is 00000000 cs 00000003 ss e0000003 rs e0000003 tfd c0 serr 00000000
> em1: watchdog timeout -- resetting
> em1: watchdog timeout -- resetting

Please provide the following output:

pciconf -lv
vmstat -i

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
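
The register reading Alexander Motin describes in the quoted text (is is zero, rs equals cs | ss, and the BUSY bit is set in tfd) can be checked mechanically against each "ahcichN: is ... serr ..." line. The following is only a minimal sketch under stated assumptions: the function name decode_timeout and the result keys are made up for illustration and are not part of the ahci driver or any FreeBSD tool, and the field order is simply copied from the log lines in this thread. The BSY mask 0x80 is bit 7 of the ATA status byte.

```python
import re

ATA_STATUS_BSY = 0x80  # bit 7 of the ATA status byte reported in "tfd"

# Field order taken from the log lines quoted in this thread.
# The group name "intr" stands for the "is" field of the log line.
LINE_RE = re.compile(
    r"is (?P<intr>[0-9a-f]+) cs (?P<cs>[0-9a-f]+) ss (?P<ss>[0-9a-f]+) "
    r"rs (?P<rs>[0-9a-f]+) tfd (?P<tfd>[0-9a-f]+) serr (?P<serr>[0-9a-f]+)"
)


def decode_timeout(line):
    """Hypothetical helper: apply the checks from the thread to one log line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError("not an ahcich register dump: %r" % line)
    # All fields are printed as hexadecimal without a 0x prefix.
    regs = {name: int(value, 16) for name, value in match.groupdict().items()}
    return {
        # "is" == 0: no interrupt status bits are set.
        "no_interrupt_pending": regs["intr"] == 0,
        # rs == cs | ss: the slot bitmasks of driver and hardware agree.
        "driver_matches_hardware": regs["rs"] == (regs["cs"] | regs["ss"]),
        # BSY set in tfd: the device still reports itself busy.
        "device_busy": bool(regs["tfd"] & ATA_STATUS_BSY),
        # serr == 0: no SATA error bits are recorded.
        "no_sata_error": regs["serr"] == 0,
    }


if __name__ == "__main__":
    sample = ("ahcich2: is 00000000 cs 00000003 ss e0000003 "
              "rs e0000003 tfd c0 serr 00000000")
    for key, value in decode_timeout(sample).items():
        print(key, value)
```

When all four checks are true for a line, it matches the pattern discussed above: the driver and the hardware agree on which slots are outstanding, no completion was signalled, and the device still reports itself busy. The check cannot tell a stuck disk from a missed completion; it only confirms the line fits that reading.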