From: Soeren Schmidt
To: Martin Blapp
cc: freebsd-stable@FreeBSD.ORG
cc: sos@FreeBSD.ORG
Date: Wed, 23 Apr 2003 09:21:17 +0200 (CEST)
Subject: Re: Deadlock with ATA disk on FreeBSD 4.8 Stable
Message-Id: <200304230721.h3N7LIoi037105@spider.deepcore.dk>
In-Reply-To: <20030422200449.C95995@cvs.imp.ch>

It seems Martin Blapp wrote:
> We encounter here a deadlock with a quite new ATA 120GB disk.
> The disk worked well for about 3 weeks, but now we have a strange
> problem.
>
> There seems to be one defective file on the disk. fsck doesn't find
> it, and if I do a
>
> cat file > /dev/null
>
> The machine locks up completely. Serial console is dead, no remote DDB
> via ALTBREAK possible anymore, no panic message, just frozen.
>
> The evil began after we installed this ATA disk. It only happens for
> one file on this disk, the other 10000 files are fine.
>
> Do you have an idea why breaking into the debugger doesn't work anymore?
> Even if we have a hard error on the disk, this should still work,
> shouldn't it?

One thing that could explain this is that the disk stops working in the
midst of a busmaster DMA transfer; that will effectively lock up the
system in most cases.

I suspect bad hardware here, as it makes no sense that you can access the
rest of the disk without problems.

If you can spare the data, I'd do a

	dd if=/dev/zero of=/dev/adN bs=1m

and try to get the disk to remap all bad sectors...

-Søren
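
Because the dd command above destroys everything on the target disk, it may be worth rehearsing it on a scratch file first. The following is only a sketch under stated assumptions: TARGET is a hypothetical temporary file standing in for /dev/adN, the block size is written as 1048576 bytes (equivalent to bs=1m, but also accepted by dd implementations that reject the lowercase suffix), and count= is added only because a regular file, unlike a disk, has no fixed end. It shows that the zero-fill touches every block; it says nothing about whether a real drive will actually remap its bad sectors.

```sh
#!/bin/sh
# Rehearsal of the zero-fill on a scratch file, NOT a real disk.
# TARGET is a hypothetical stand-in for /dev/adN; pointing this at a
# real device would erase all data on it.
TARGET=$(mktemp)
BLOCKS=4                      # number of 1 MiB blocks in the scratch "disk"

# Fill the scratch file with non-zero data so the overwrite is observable.
dd if=/dev/urandom of="$TARGET" bs=1048576 count="$BLOCKS" 2>/dev/null

# Overwrite every block with zeros. conv=notrunc writes in place, as a
# write to a disk device would, instead of truncating the file first.
dd if=/dev/zero of="$TARGET" bs=1048576 count="$BLOCKS" conv=notrunc 2>/dev/null

# Verify: total size, and number of bytes left after deleting all NULs.
SIZE=$(wc -c < "$TARGET" | tr -d ' ')
NONZERO=$(tr -d '\000' < "$TARGET" | wc -c | tr -d ' ')
echo "size=$SIZE nonzero=$NONZERO"

rm -f "$TARGET"
```

With the values above this should report size=4194304 nonzero=0. On the real disk, count= would be dropped so that dd runs until it reaches the end of the device.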