Date:      Sun, 29 Oct 2000 09:39:31 +1030
From:      Greg Lehey <grog@lemis.com>
To:        Jesse <j@lumiere.net>
Cc:        freebsd-questions@FreeBSD.ORG
Subject:   Re: handling disk failures with vinum
Message-ID:  <20001029093931.H22174@wantadilla.lemis.com>
In-Reply-To: <Pine.BSF.4.21.0010280523380.1282-100000@localhost>; from j@lumiere.net on Sat, Oct 28, 2000 at 06:01:34AM -0700
References:  <Pine.BSF.4.21.0010280523380.1282-100000@localhost>

[Format recovered--see http://www.lemis.com/email/email-format.html]

Please don't wrap log output.

On Saturday, 28 October 2000 at  6:01:34 -0700, Jesse wrote:
>
> Hi,
>
> I've set up two 30GB IDE drives for RAID 1 mirroring.
> It works during normal conditions, but I'd like to test some failure
> modes.
>
> I tried disconnecting the power to one of the drives. I got a bunch of
> console messages, and access to the filesystems on the mirror blocked.
> Once I powered up the second drive again, accesses completed. Here are
> the logs:
>
> Oct 28 05:21:40 leaf /kernel: ata1-master: no status, reselecting device
> Oct 28 05:21:40 leaf last message repeated 757 times
> Oct 28 05:21:40 leaf /kernel: ata1-master: timeout waiting to give command=c8 s=ff e=ff
> Oct 28 05:21:40 leaf /kernel: ad2: error executing command - resetting
> Oct 28 05:21:40 leaf /kernel: ata1: resetting devices .. done
> Oct 28 05:21:50 leaf /kernel: ad2: READ command timeout tag=0 serv=0 - resetting
> Oct 28 05:21:50 leaf /kernel: ata1: resetting devices .. done
>
> So, is vinum capable of continuing to operate when a drive fails,
> or will the system always block on accesses and require a
> reboot?

I don't understand this question; it contradicts what you said
above, that accesses completed once the drive was powered up again.
In any case, this is a disk subsystem issue, not a Vinum issue.  The
only way Vinum can tell that a disk is dead is for the driver to tell
it so.  From your output above, you only waited 10 seconds.  The
drivers should retry for a reasonable amount of time before they give
up on a drive, but it's possible that ata is waiting too long.  If
so, please enter a PR against the ata driver; I know from my own
experience that the CAM (SCSI) drivers don't have this problem.
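
For reference, the 10-second figure can be read straight off the
syslog timestamps quoted above.  The following is only a rough
sketch in Python: the function name elapsed_seconds and the year
argument are illustrative assumptions (syslog lines carry no year),
and it says nothing about how long the ata driver actually retries
before giving up.

```python
from datetime import datetime

# Log lines copied from the quoted message above.
LOG = """\
Oct 28 05:21:40 leaf /kernel: ata1-master: no status, reselecting device
Oct 28 05:21:40 leaf last message repeated 757 times
Oct 28 05:21:40 leaf /kernel: ata1-master: timeout waiting to give command=c8 s=ff e=ff
Oct 28 05:21:40 leaf /kernel: ad2: error executing command - resetting
Oct 28 05:21:40 leaf /kernel: ata1: resetting devices .. done
Oct 28 05:21:50 leaf /kernel: ad2: READ command timeout tag=0 serv=0 - resetting
Oct 28 05:21:50 leaf /kernel: ata1: resetting devices .. done
"""


def elapsed_seconds(log_text, year=2000):
    """Return the seconds between the earliest and latest log entries.

    The year is an assumption supplied by the caller, because the
    traditional syslog timestamp does not include one.
    """
    stamps = []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        # The first 15 characters hold the timestamp, e.g. "Oct 28 05:21:40".
        stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        stamps.append(stamp)
    return (max(stamps) - min(stamps)).total_seconds()


print(elapsed_seconds(LOG))  # prints 10.0
```

All of the messages fall between 05:21:40 and 05:21:50, so the log
covers only the first 10 seconds after the drive lost power.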

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply.
For more information, see http://www.lemis.com/questions.html
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers





