From: "Justin T. Gibbs"
To: "Marc G. Fournier", freebsd-stable@freebsd.org
cc: freebsd-scsi@freebsd.org
Date: Mon, 28 Jul 2003 10:32:22 -0600
Subject: Re: Dump Card State Begins ...
Message-ID: <834690000.1059409942@aslan.btc.adaptec.com>
In-Reply-To: <20030726115857.M37284@hub.org>

> Hi ...
>
> Can someone tell me whether or not this is indicative of a hardware, or
> software, problem?
> It happened a few times today, on two different
> drives, and it seem to "self-recover", since the server is still purring
> along without any noticeable problems:
>
> neptune# grep "timed out" /var/log/messages
> Jul 25 03:52:51 neptune /kernel: (da2:ahd1:0:2:0): SCB 0x40 - timed out
> Jul 25 03:57:22 neptune /kernel: (da2:ahd1:0:2:0): SCB 0x18 - timed out
> Jul 25 03:58:53 neptune /kernel: (da1:ahd1:0:1:0): SCB 0x1e - timed out
> Jul 26 10:55:46 neptune /kernel: (da2:ahd1:0:2:0): SCB 0x39 - timed out
>
> The drives are all U320 Seagate Cheetah 70G ... no RAID involved, its
> just straight drives using the motherboard's onboard SCSI controller ...
> the motherboard is the Intel SE7501, in the SR2300 chassis ...

I need the exact model number and firmware for these drives. There are
at least three different Cheetah 70G U320 drives.

>
> It did it back on the 19th as well:
>
> Jul 19 19:37:16 neptune /kernel: (da2:ahd1:0:2:0): SCB 0x46 - timed out
> Jul 19 19:38:46 neptune /kernel: (da1:ahd1:0:1:0): SCB 0x2d - timed out
>
> But again, appears to have recovered with no ill effects ...
>
>
> Jul 25 03:52:51 neptune /kernel: (da2:ahd1:0:2:0): SCB 0x40 - timed out
> Jul 25 03:53:06 neptune /kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<

The Dump Card state output is much easier to parse if you enable
register pretty printing. From GENERIC:

options         AHD_REG_PRETTY_PRINT    # Print register bitfields in debug
                                        # output. Adds ~215k to driver.

From what I can tell here, your drives are sitting on some commands
instead of completing them. This was one of the problems in early
revisions of Seagate's U320 drive firmware. Without knowing more
details of the system though, I can't comment definitively.

--
Justin
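
For anyone comparing timeout frequency across drives, here is a minimal
sketch that tallies the "timed out" messages per device and controller.
It assumes the log lines have exactly the form quoted in this thread;
the regular expression and the name count_timeouts are illustrative
choices, not part of the ahd driver or any FreeBSD tool.

```python
import re
from collections import Counter

# Assumed line format, taken from the messages quoted in this thread:
#   "... /kernel: (da2:ahd1:0:2:0): SCB 0x40 - timed out"
TIMEOUT_RE = re.compile(
    r"\((?P<dev>\w+):(?P<ctrl>\w+):\d+:\d+:\d+\): "
    r"SCB (?P<scb>0x[0-9a-fA-F]+) - timed out"
)


def count_timeouts(lines):
    """Return a Counter mapping (device, controller) to timeout count."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("dev"), match.group("ctrl"))] += 1
    return counts


# Sample input: the four Jul 25/26 lines from the original report.
sample = [
    "Jul 25 03:52:51 neptune /kernel: (da2:ahd1:0:2:0): SCB 0x40 - timed out",
    "Jul 25 03:57:22 neptune /kernel: (da2:ahd1:0:2:0): SCB 0x18 - timed out",
    "Jul 25 03:58:53 neptune /kernel: (da1:ahd1:0:1:0): SCB 0x1e - timed out",
    "Jul 26 10:55:46 neptune /kernel: (da2:ahd1:0:2:0): SCB 0x39 - timed out",
]

for (dev, ctrl), n in sorted(count_timeouts(sample).items()):
    print(f"{dev} on {ctrl}: {n} timeout(s)")
```

To run it against a real log, pass an open file object for
/var/log/messages to count_timeouts in place of the sample list.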