Date:      Sun, 5 Oct 2014 18:50:57 +0400 (MSK)
From:      Dmitry Morozovsky <marck@rinet.ru>
To:        Mikolaj Golub <to.my.trociny@gmail.com>
Cc:        "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>, Matt Churchyard <matt.churchyard@userve.net>
Subject:   Re: HAST with broken HDD
Message-ID:  <alpine.BSF.2.00.1410051846480.72273@woozle.rinet.ru>
In-Reply-To: <20141003175439.GA7664@gmail.com>
References:  <542BC135.1070906@Skynet.be> <542BDDB3.8080805@internetx.com> <CA%2BdUSypO8xTR3sh_KSL9c9FLxbGH%2BbTR9-gPdcCVd%2Bt0UgUF-g@mail.gmail.com> <542BF853.3040604@internetx.com> <CA%2BdUSyp4vMB_qUeqHgXNz2FiQbWzh8MjOEFYw%2BURcN4gUq69nw@mail.gmail.com> <542C019E.2080702@internetx.com> <CA%2BdUSyoEcPdJ1hdR3k1vNROFG7p1kN0HB5S2a_0gYhiV75OLAw@mail.gmail.com> <542C0710.3020402@internetx.com> <CA%2BdUSyr9OK9SvN3wX-O4DeriLBP-EEuAA8TTSYwdGfcR1asdtQ@mail.gmail.com> <97aab72e19d640ebb65c754c858043cc@SERVER.ad.usd-group.com> <20141003175439.GA7664@gmail.com>

On Fri, 3 Oct 2014, Mikolaj Golub wrote:

> Disk errors are recorded to syslog. Also, error counters are displayed
> in `hastctl list' output. There is snmp_hast(3) in base -- a module
> for bsnmp to retrieve these statistics via the SNMP protocol (traps are
> not supported though).
> 
> For notifications, the hastd can be configured to execute an arbitrary
> command on various HAST events (see description for `exec' in
> hast.conf(5)). Unfortunately, it does not have hooks for I/O error
> events currently. It might be worth adding though. The problem with
> this is that it may generate too many events, so some throttling is
> needed.
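
To make the throttling idea concrete, here is a minimal sketch of a
fixed-window limiter that such a hook could sit behind. It is not hastd
code: the class name, the `burst` and `interval` parameters and their
defaults are all hypothetical, chosen only for illustration.

```python
import time


class EventThrottle:
    """Let at most `burst` events through per `interval` seconds.

    Hypothetical helper, not part of hastd; names and defaults are
    assumptions made for this sketch.
    """

    def __init__(self, burst=1, interval=60.0, clock=time.monotonic):
        self.burst = burst
        self.interval = interval
        self.clock = clock          # injectable so the logic can be tested
        self.window_start = None    # start time of the current window
        self.sent = 0               # events let through in this window
        self.suppressed = 0         # events dropped since creation

    def allow(self):
        """Return True if a notification may be sent right now."""
        now = self.clock()
        # Open a new window on the first event or once the old one expired.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.sent = 0
        if self.sent < self.burst:
            self.sent += 1
            return True
        self.suppressed += 1
        return False
```

A caller would run the notification command only when allow() returns
True, and could report the suppressed count with the next notification
so that no information is silently lost.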

And, I think it should be noted, there should be some kind of error coalescing
or similar before going from the "warning" state (there are some read errors,
but otherwise the disk is usable, and it would be too much hassle to switch to
the remote component completely) to the "error" state (the component is
unusable and needs to be replaced ASAP; drop it from the HAST pair, and switch
over if needed).
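
The warning/error distinction above could be sketched roughly as follows.
This is only an illustration of the idea, not anything hastd does today:
the class, the state names, the 300-second window and the threshold of 10
are all assumptions picked for the example.

```python
from collections import deque

OK, WARNING, ERROR = "ok", "warning", "error"


class ErrorCoalescer:
    """Classify a component from its recent I/O errors.

    Hypothetical sketch: a few errors inside the window mean "warning",
    `error_threshold` or more mean "error", and a fatal error (such as
    "device lost") means "error" immediately.
    """

    def __init__(self, window=300.0, error_threshold=10):
        self.window = window                  # seconds of history to keep
        self.error_threshold = error_threshold
        self.stamps = deque()                 # timestamps of recent errors
        self.fatal = False                    # sticky once set

    def record(self, now, fatal=False):
        """Register one I/O error seen at time `now`; return the new state."""
        if fatal:
            self.fatal = True
        self.stamps.append(now)
        return self.state(now)

    def state(self, now):
        """Return the component state as of time `now`."""
        if self.fatal:
            return ERROR
        # Forget errors that have fallen out of the window.
        while self.stamps and now - self.stamps[0] > self.window:
            self.stamps.popleft()
        if len(self.stamps) >= self.error_threshold:
            return ERROR
        if self.stamps:
            return WARNING
        return OK
```

In this sketch only the fatal case is sticky; a burst of non-fatal errors
ages out of the window and the component drifts back to "ok". Whether that
is the right policy is a design choice the sketch does not settle.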

An error such as "device lost" is, of course, fatal from the very beginning;
but how should we interpret, say, sporadic controller resets where the disk
comes back and catches up on syncing again?


-- 
Sincerely,
D.Marck                                     [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                                 marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------


