From owner-freebsd-stable@FreeBSD.ORG Tue Feb 23 15:10:19 2010
Sender: Alexander Motin
Message-ID: <4B83EFD4.8050403@FreeBSD.org>
Date: Tue, 23 Feb 2010 17:10:12 +0200
From: Alexander Motin
User-Agent: Thunderbird 2.0.0.23 (X11/20091212)
MIME-Version: 1.0
To: Harald Schmalzbauer
References:
 <1266934981.00222684.1266922202@10.7.7.3>
In-Reply-To: <1266934981.00222684.1266922202@10.7.7.3>
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
Cc: freebsd-stable@freebsd.org
Subject: Re: ahcich timeouts, only with ahci, not with ataahci
List-Id: Production branch of FreeBSD source code

Harald Schmalzbauer wrote:
> I'm frequently getting my machine locked with ahcichX timeouts:
> ahcich2: Timeout on slot 0
> ahcich2: is 00000000 cs 00000001 ss 00000000 rs 00000001 tfd c0 serr 00000000
> ahcich2: Timeout on slot 8
> ahcich2: is 00000000 cs 00000100 ss 00000000 rs 00000100 tfd c0 serr 00000000
> ahcich2: Timeout on slot 8
> ahcich2: is 00000000 cs fffff07f ss ffffff7f rs ffffff7f tfd c0 serr 00000000
> ...

Given that `is` (interrupt status) is zero and `rs == cs | ss` (the
bitmasks of running commands in the driver and in the hardware), the
controller is not reporting command completion. Given the TFD status of
0xc0, with the BUSY bit set, I would suppose that either the disk is
stuck processing a command for some reason, or the controller missed the
command completion status.

Did you notice a 30-second pause (the default ATA timeout) before the
timeout message was printed? I just want to be sure that the driver
waited long enough before giving up.

> This happens when backup over GbE overloads ZFS/HDD capabilities.
> I reduced vfs.zfs.txg.timeout to 1 to prevent the machine from locking
> up almost immediately, but from it still happens.
> When I don't use ahci but ataahci (the old driver if I understand things
> correct) I also see the ZFS burst write congestion, but this doesn't
> lead to controller timeouts, thus blocking the machine.
>
> Sometimes the machine recovers from the disk lock, but most often I have
> to reboot.

What does it look like when it doesn't recover? Can you send me the full
log messages?
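
The register check described above (`is` zero, `rs == cs | ss`, BUSY set in
TFD) can be sketched as a small Python helper. This is only an illustration
and not part of the ahci driver: it assumes the register dump format seen in
the quoted log lines, it assumes each dump is on a single line, and the
function names `parse_dump` and `diagnose` are hypothetical. The BUSY mask
0x80 is the standard BSY bit of the ATA status byte.

```python
import re

# BSY bit of the ATA status byte (the low byte of the TFD value).
ATA_STATUS_BSY = 0x80

# Field names as they appear in the quoted "ahcichN: is ... serr ..." lines.
FIELDS = ("is", "cs", "ss", "rs", "tfd", "serr")
_PAIR_RE = re.compile(r"\b(is|cs|ss|rs|tfd|serr) ([0-9a-fA-F]+)\b")


def parse_dump(line):
    """Parse one register dump line into a dict of integers."""
    regs = {name: int(value, 16) for name, value in _PAIR_RE.findall(line)}
    missing = [name for name in FIELDS if name not in regs]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return regs


def diagnose(line):
    """Apply the three checks from the reply to one dump line."""
    regs = parse_dump(line)
    return {
        # is == 0: the controller has not signalled any interrupt.
        "no_interrupt_pending": regs["is"] == 0,
        # rs == cs | ss: driver and hardware agree on what is running.
        "masks_agree": regs["rs"] == (regs["cs"] | regs["ss"]),
        # BSY set in the status byte: the device still reports busy.
        "device_busy": bool(regs["tfd"] & ATA_STATUS_BSY),
    }


if __name__ == "__main__":
    sample = ("ahcich2: is 00000000 cs fffff07f ss ffffff7f "
              "rs ffffff7f tfd c0 serr 00000000")
    print(diagnose(sample))
```

For all three dumps quoted above, every check comes out true, which matches
the reading that the controller never reported completion while the device
stayed busy.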

> Kernel is from Feb. 19, so recent ahci improvements are active.
> Controller is ICH9R with 3 Samsung F3 SpinPoints.
>
> Any ideas how to work around the hangs other than using the old ahci
> driver?

The old ataahci driver did not use NCQ. NCQ may trigger bugs in the
drive firmware or expose protocol inconsistencies. I would recommend
searching for errata for your drive and possibly a firmware update.

-- 
Alexander Motin