Date:      Thu, 21 Sep 2006 13:46:52 -0400
From:      John Nielsen <lists@jnielsen.net>
To:        freebsd-questions@freebsd.org
Cc:        Robin Becker <robin@reportlab.com>, Alex Zbyslaw <xfb52@dial.pipex.com>
Subject:   Re: gmirror HD failure detection
Message-ID:  <200609211346.53159.lists@jnielsen.net>
In-Reply-To: <4512664A.1090606@dial.pipex.com>
References:  <45116E76.6020009@chamonix.reportlab.co.uk> <45117BA6.2040700@chamonix.reportlab.co.uk> <4512664A.1090606@dial.pipex.com>

On Thursday 21 September 2006 06:15, Alex Zbyslaw wrote:
> Robin Becker wrote:
> > Dave wrote:
> >> Hi,
> >>    I've got smartd going on a gmirror system, however when smartd
> >> starts up it says it can't find the various drives. I've tried both
> >> the autodetection line as well as specifying the individual drives.
> >> If this does work I'd like to know about it, as I believe I might have
> >> one failing drive, but am not sure which one.
> >> Thanks.
> >> Dave.
> >
> > Well, as root I can certainly run smartctl -a /dev/ad4 (or /dev/ad6), so
> > I assume smartd could too.
> >
> > I like the idea of using gmirror status -s, but I don't know what the
> > results would be if one of the disks were going bad. Would it change
> > from COMPLETE to DEGRADED suddenly?
>
> I would expect gmirror to report a problem when a disk had *gone* bad.
> Going bad from a SMART point of view can mean, for example, too high a
> rate of read retries or too many bad sectors remapped.  At that point
> the drive is technically working, so there is nothing technically wrong
> with the array status.  In such a case SMART would just be telling you
> that the disk is likely to go kablooey soon; time for backups, new drive
> etc. etc.
>
> You can presumably run something like gmirror status -s from cron, even
> every five minutes; if you weed out the good results you'll only get
> email if something does go wrong.
>
> Use both approaches since they tell you different things which just
> happen some of the time to coincide.

If you happen to be one of the smart admins who actually reviews the output of 
the periodic scripts, then simply adding
	daily_status_gmirror_enable="YES"
to /etc/periodic.conf will give you a daily health check. If you want more 
granularity than a single day, you could use the contents of the periodic 
script as a starting point for rolling your own.
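
As a rough sketch of what such a home-grown check might look like (this is not
the contents of the periodic script), the following filters the output of
gmirror status -s so that only unhealthy mirrors produce output. It assumes the
script-friendly output has one line per component with the mirror state in the
second whitespace-separated field; the function name filter_unhealthy is made
up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: pass through only lines whose state is not COMPLETE.
# Assumption: `gmirror status -s` prints "name state component" per line,
# so the state is the second field. Blank lines are ignored.
filter_unhealthy() {
    awk 'NF && $2 != "COMPLETE"'
}

# cron mails whatever a job prints, so stay silent when all mirrors are
# healthy. The guard lets the script exit quietly where gmirror is absent.
if command -v gmirror >/dev/null 2>&1; then
    gmirror status -s | filter_unhealthy
fi
```

Saved under a path of your choosing (say /usr/local/sbin/gmirror-check, a
hypothetical name), it could be run from root's crontab with an entry such as
	*/5 * * * * /usr/local/sbin/gmirror-check
so that mail only arrives when a mirror reports something other than COMPLETE.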

JN


