Date: Tue, 15 Jul 2003 17:20:06 +0100
From: David Malone
To: Sumit Shah
Cc: freebsd-hackers@freebsd.org
Subject: Re: RAID and NFS exports (Possible Data Corruption)
Message-ID: <20030715162006.GA47687@walton.maths.tcd.ie>

On Tue, Jul 15, 2003 at 06:26:24AM -0700, Sumit Shah wrote:
> Here is a message I sent to freebsd-questions and I was hoping I could
> get some help debugging this.

It seems very unlikely that restarting mountd could cause an error like:

> ad4: hard error reading fsbn 242727552

The error means that the disk reported an error when trying to read this block. You say that when you rebooted, the controller said a disk had gone bad, so this would sort of confirm it. (I could believe that restarting mountd might upset the RAID stuff if there were a kernel bug, but it seems very unlikely it could cause a disk to go bad.)

My best guess would be that you have a bad batch of disks that happen to have failed in similar ways.

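As a rough sketch of how one might look for latent read errors like the one quoted above, the following Python reads a file or device from start to end and records the byte offsets where a read fails. The function name `scan_for_read_errors`, the 64 KiB block size, and the `max_errors` cutoff are all illustrative assumptions, not anything from the original report, and it only shows the idea of forcing every block to be read.

```python
import os


def scan_for_read_errors(path, block_size=64 * 1024, max_errors=100):
    """Read `path` front to back.

    Returns (bytes_read, bad_offsets), where bad_offsets lists the byte
    offsets at which the OS reported a read error.
    """
    bad_offsets = []
    bytes_read = 0
    offset = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Give up after max_errors failures so a dead device cannot
        # keep the loop running forever (an assumed safety limit).
        while len(bad_offsets) < max_errors:
            try:
                chunk = os.pread(fd, block_size, offset)
            except OSError:
                # The read failed: note where, then skip one block
                # ahead instead of stopping at the first bad spot.
                bad_offsets.append(offset)
                offset += block_size
                continue
            if not chunk:
                break  # end of file or device
            bytes_read += len(chunk)
            offset += len(chunk)
    finally:
        os.close(fd)
    return bytes_read, bad_offsets
```

Something like `scan_for_read_errors("/dev/ad4")` would need root and assumes the device accepts reads of this size; the offsets it returns are byte offsets, not the fsbn values the kernel prints.
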
It is possible that restarting mountd uncovered the errors, 'cos I think mountd internally does a remount of the filesystem in question and that might cause a chunk of stuff to be flushed out on to the disk, highlighting an error.

(I had a bunch of the IBM "deathstar" disks fail on me within the space of a week or so, after they'd been in use for about six months.)

David.