From owner-freebsd-questions Sat Oct 28 16: 9:55 2000
Delivered-To: freebsd-questions@freebsd.org
Received: from wantadilla.lemis.com (wantadilla.lemis.com [192.109.197.80])
    by hub.freebsd.org (Postfix) with ESMTP id 2ABE437B479
    for ; Sat, 28 Oct 2000 16:09:50 -0700 (PDT)
Received: (from grog@localhost)
    by wantadilla.lemis.com (8.11.0/8.9.3) id e9SN9Vd28619;
    Sun, 29 Oct 2000 09:39:31 +1030 (CST)
    (envelope-from grog)
Date: Sun, 29 Oct 2000 09:39:31 +1030
From: Greg Lehey
To: Jesse
Cc: freebsd-questions@FreeBSD.ORG
Subject: Re: handling disk failures with vinum
Message-ID: <20001029093931.H22174@wantadilla.lemis.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: ; from j@lumiere.net on Sat, Oct 28, 2000 at 06:01:34AM -0700
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF 13 24 52 F8 6D A4 95 EF
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

[Format recovered--see http://www.lemis.com/email/email-format.html]

Please don't wrap log output.

On Saturday, 28 October 2000 at  6:01:34 -0700, Jesse wrote:
>
> Hi,
>
> I've setup two 30GB IDE drives for RAID 1 mirroring.
> It works during normal conditions, but I'd like to test some failure
> modes.
>
> I tried disconnecting the power to one of the drives. I got a bunch of
> consoles messages -- access to the filesystems on the mirror blocked. Once
> I powered up the second drive again, accesses completed. Here's the logs:
>
> Oct 28 05:21:40 leaf /kernel: ata1-master: no status, reselecting device
> Oct 28 05:21:40 leaf last message repeated 757 times
> Oct 28 05:21:40 leaf /kernel: ata1-master: timeout waiting to give command=c8 s=ff e=ff
> Oct 28 05:21:40 leaf /kernel: ad2: error executing command - resetting
> Oct 28 05:21:40 leaf /kernel: ata1: resetting devices .. done
> Oct 28 05:21:50 leaf /kernel: ad2: READ command timeout tag=0 serv=0 - resetting
> Oct 28 05:21:50 leaf /kernel: ata1: resetting devices .. done
>
> So.. is vinum capable of continuing to operate when a drive fails,
> or will the system always die block on accesses and require a
> reboot?

I don't understand this comment; it contradicts your previous
statement above.  But this is a disk subsystem issue, not a Vinum
issue.  The only way Vinum can tell if a disk is dead is when the
driver tells it so.  From your output above, you only waited 10
seconds; the drivers should take a reasonable amount of time to retry
before they give up on a drive, but it's possible that ata is waiting
too long.  If so, please enter a PR against the ata driver; I know
from my own experience that the CAM drivers (SCSI) don't have
problems.

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply.
For more information, see http://www.lemis.com/questions.html
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
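
The 10-second figure in the reply can be checked against the timestamps in the quoted log. The following is a minimal sketch in Python, assuming standard syslog timestamps in the first 15 characters of each line and taking the year 2000 from the message date (syslog lines carry no year). The function names are illustrative only and are not part of any FreeBSD or Vinum tool.

```python
from datetime import datetime

# Log excerpt copied from the quoted message above.
LOG = """\
Oct 28 05:21:40 leaf /kernel: ata1-master: no status, reselecting device
Oct 28 05:21:40 leaf last message repeated 757 times
Oct 28 05:21:40 leaf /kernel: ata1-master: timeout waiting to give command=c8 s=ff e=ff
Oct 28 05:21:40 leaf /kernel: ad2: error executing command - resetting
Oct 28 05:21:40 leaf /kernel: ata1: resetting devices .. done
Oct 28 05:21:50 leaf /kernel: ad2: READ command timeout tag=0 serv=0 - resetting
Oct 28 05:21:50 leaf /kernel: ata1: resetting devices .. done
"""


def parse_stamp(line, year=2000):
    """Parse the leading syslog timestamp ("Oct 28 05:21:40") of one line.

    The year is an assumption supplied by the caller.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def logged_span_seconds(log, year=2000):
    """Return the seconds between the earliest and latest timestamp in log."""
    stamps = [parse_stamp(line, year) for line in log.splitlines() if line.strip()]
    return (max(stamps) - min(stamps)).total_seconds()


if __name__ == "__main__":
    print(logged_span_seconds(LOG))
```

This measures only the interval covered by the posted excerpt. It says nothing about how long the ata driver would keep retrying before reporting the drive as failed, which is the open question in the reply.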