From: Marius Strobl
To: Riccardo Veraldi
Cc: fddi, freebsd-sparc64
Date: Sun, 23 Jun 2013 17:30:03 +0200
Subject: Re: watchdog timeout
Message-ID: <20130623153003.GC940@alchemy.franken.de>

On Sun, Jun 23, 2013 at 01:24:19PM +0200, Riccardo Veraldi wrote:
>
> it is 9.1-RELEASE
>
> I have no way to reproduce it, because it appeared suddenly without any
> specific cause.
> The system was not under heavy load, I was not compiling anything...
>
> But I had other errors recently which are really not related I think,
> mainly parity SCSI errors on isp0
>
> Jun 12 04:48:38 blade kernel: (da3:isp0:0:1:0): CAM status: SCSI Status Error
> Jun 12 04:48:38 blade kernel: (da3:isp0:0:1:0): SCSI sense: ABORTED COMMAND asc:47,0 (SCSI parity error)

In fact, there's one open issue with isp(4) that causes timeouts of the
kind you have seen and which never quite got resolved. Investigating
that as a potential culprit for the case you have encountered would
require updating to stable/9 and possibly giving a patch a try, though.
Besides, we'd need a way to reliably detect whether the problem is gone.

On the other hand, the parity errors indicate that this system is
suffering from hardware problems. Possibilities include a defective
disk, a disk with broken firmware, or some combination of components
not getting along for some reason.

Marius