Date: Thu, 13 Jan 2011 13:37:50 -0700
From: "Kenneth D. Merry"
To: Joachim Tingvold
Cc: freebsd-scsi@freebsd.org, Alexander Motin
Subject: Re: mps0-troubles
Message-ID: <20110113203750.GA39494@nargothrond.kdm.org>
In-Reply-To: <41C64262-4300-4187-B5FD-04A5EFB7F87C@tingvold.com>

On Thu, Jan 13, 2011 at 01:14:50 +0100, Joachim Tingvold wrote:
> On Wed, Jan 12, 2011, at 23:29:53PM GMT+01:00, Joachim Tingvold wrote:
> >If I were copying from the AHCI-attached disk to the mps controller,
> >and the AHCI-attached disk timeouts, wouldn't this cause the disks
> >on the mps controller to timeout as well?
>
> Now it happened again (while copying from 'zroot' to 'storage'). This
> time only mps0 produced errors;
> . As the
> timeout seem to be over quickly, I find it strange that whatever process
> that accessed the disks (in my case, 'mv'), doesn't continue once the
> disks are available -- or is this some kind of safeguard against corrupted
> data?

Did the system recover this time?

The 'out of chain frames' messages are somewhat worrisome.

From looking at the logic in the driver (mpssas_action_scsiio() and
mps_data_cb()), it looks like if it runs out of chain frames, it won't
cancel the timeout on the command. So you'll wind up getting timeouts,
and the driver will then try to abort the command. But sending an abort
for a command that hasn't gone down to the chip is rather pointless.

Did you see any other messages before the 'out of chain frames' messages
popped up?

Try editing sys/dev/mps/mpsvar.h, and change MPS_CHAIN_FRAMES from 1024 to
2048 and see if that helps things any. That won't fix the underlying
problem, but it may help you avoid running out of that resource.

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG
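
The failure mode described above (a command whose timeout stays armed
after it fails to get chain frames) can be illustrated with a toy model.
This is only a sketch and not the mps(4) driver code: every name in it
(toy_softc, toy_cmd, toy_start_io) is hypothetical, and it assumes the
timeout is armed before the data is mapped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names are hypothetical, none come from mps(4). */
struct toy_softc {
	int chain_free;		/* chain frames left in the pool */
};

struct toy_cmd {
	int  chains_needed;	/* chain frames this I/O would need */
	bool timeout_armed;	/* a timeout is pending for this command */
	bool sent_to_chip;	/* the request actually reached the chip */
};

/*
 * Assumed ordering: arm the timeout first, then try to reserve chain
 * frames.  On exhaustion the function returns without disarming the
 * timeout, which is the behavior the message suspects.
 */
static bool
toy_start_io(struct toy_softc *sc, struct toy_cmd *cm)
{
	cm->timeout_armed = true;
	cm->sent_to_chip = false;

	if (cm->chains_needed > sc->chain_free) {
		/* Out of chain frames: command never goes to the chip. */
		return (false);
	}

	sc->chain_free -= cm->chains_needed;
	cm->sent_to_chip = true;
	return (true);
}
```

With a pool of 2 frames and a command needing 3, toy_start_io() fails and
leaves timeout_armed set while sent_to_chip is false, so a later timeout
would try to abort something the chip never saw. Doubling the pool to 4
lets the same command through, which is the toy equivalent of raising
MPS_CHAIN_FRAMES from 1024 to 2048: it avoids the exhaustion without
changing the failure path.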