Date:      Thu, 26 Mar 2020 16:37:58 -0400 (EDT)
From:      Daniel Feenberg <feenberg@nber.org>
To:        Bob Proulx <bob@proulx.com>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: drive selection for disk arrays
Message-ID:  <alpine.BSF.2.21.9999.2003261630030.47777@mail2.nber.org>
In-Reply-To: <20200326124648725158537@bob.proulx.com>
References:  <20200325081814.GK35528@mithril.foucry.net> <713db821-8f69-b41a-75b7-a412a0824c43@holgerdanske.com> <20200326124648725158537@bob.proulx.com>

The disturbing frequency of multiple drives going offline in quick
succession is, in my view, largely a result of defects being discovered in
quick succession, rather than occurring in quick succession. If a defect
occurs in a sector that is rarely visited, it can remain hidden for a long
time. During a resilver that defect will be noticed and the drive failed
out. I do think that is an overly aggressive action by the resilvering
process, for three reasons: that may be the only bad sector on the drive;
it may still be possible to recover all the data from the remaining drives
(if the first failing drive can read the appropriate sector); and that
sector may not even be in an active file.

This issue makes scrubbing particularly important, especially in this era 
of very large filesystems that can take days or weeks to restore.
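
To put rough numbers on the argument above, here is a minimal sketch in
Python. It rests on assumptions that are mine, not measurements: hidden
defects appear on each drive independently at a constant rate (a Poisson
process), a scrub finds and repairs every defect present, and the first
drive fails at a uniformly random moment within the scrub interval. The
function names and the rate used in the usage note are hypothetical.

```python
import math


def p_hidden_defect(rate_per_drive_day, surviving_drives, days_unchecked):
    """Chance that at least one surviving drive holds an undiscovered
    defect after going `days_unchecked` days without a full read.

    Assumes defects arrive independently on each drive at a constant
    rate (an illustrative assumption, not a measured figure).
    """
    expected_defects = rate_per_drive_day * surviving_drives * days_unchecked
    # 1 - exp(-x), computed with expm1 for accuracy when x is small
    return -math.expm1(-expected_defects)


def p_hidden_defect_with_scrub(rate_per_drive_day, surviving_drives,
                               scrub_interval_days):
    """Same chance when every scrub clears all hidden defects and the
    resilver starts at a uniformly random time within the scrub interval.

    This is the average of 1 - exp(-rate * n * t) over t in [0, interval].
    """
    x = rate_per_drive_day * surviving_drives * scrub_interval_days
    if x == 0:
        return 0.0
    return 1.0 - (-math.expm1(-x)) / x


if __name__ == "__main__":
    rate = 0.001  # hypothetical: one hidden defect per drive per 1000 days
    print(p_hidden_defect(rate, 5, 730))            # two years, never scrubbed
    print(p_hidden_defect_with_scrub(rate, 5, 30))  # scrubbed every 30 days
```

With that made-up rate and five surviving drives, an array left unscrubbed
for two years will almost certainly expose a hidden defect during the
resilver, while a monthly scrub brings the chance down to well under one
in ten. The exact figures depend entirely on the assumed rate; the point
is only how strongly the time since the last full read drives the result.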
