Message-ID: <4978E9CA.8040908@samsco.org>
Date: Thu, 22 Jan 2009 14:48:58 -0700
From: Scott Long
To: Steve Polyack
Cc: Mike Tancsa, freebsd-hardware@freebsd.org
Subject: Re: amr driver issues in 7.1-RELEASE
In-Reply-To: <4978E4AA.8050601@comcast.net>
References: <4978CDE3.4040700@comcast.net> <4978E202.70408@samsco.org> <4978E4AA.8050601@comcast.net>
List-Id: Production branch of FreeBSD source code

Steve Polyack wrote:
> Scott Long wrote:
>> The fix for this that I was thinking of is already in 7.1.  There
>> might still be a driver bug, but I'm leaning more towards the
>> controller simply being busy.  Do you have a reproducible test case
>> that I could try?
>>
>> Scott
>>
> We saw this one while backups wrote from an array on the PERC4/DC to a
> tape drive (on a separate controller).
> amr1: Too many retries on command 0xffffffff80a6d060.  Controller is
> likely dead
>
> The other four which I noted came during writes to the array attached to
> the PERC4/DC (external Dell PowerVault).  I want to say they showed up
> while writing a 30G junkfile (/dev/random) to the array which we were
> using to test the tape access; either that, or while we wrote that file
> out to the tape drive.
>
> If it matters, we also use ports/sysutils/linux-megacli2 to periodically
> check the status of our arrays.  It's possible that this happened during
> one of these long writes/reads.  I'm not having any luck reproducing at
> the moment, but if I come across a reproducible test, I will let you know.
>

I don't know too much about the internals of the AMR firmware, but I
imagine that it could be possible that a management command from megacli
could stall the firmware and make this warning pop up.  I'll see if I
can reproduce it.  The warning is harmless, though, even if it is
strongly worded.

Scott