Date:      Thu, 19 Jun 2008 11:06:25 -0400 (EDT)
From:      Tuc at T-B-O-H <ml@t-b-o-h.net>
To:        daniel_k_eriksson@telia.com (Daniel Eriksson)
Cc:        freebsd-questions@freebsd.org, ryan.coleman@cwis.biz
Subject:   Re: "Fixing" a RAID
Message-ID:  <200806191506.m5JF6PSm021061@vjofn.tucs-beachin-obx-house.com>
In-Reply-To: <4F9C9299A10AE74E89EA580D14AA10A61A1947@royal64.emp.zapto.org> from "Daniel Eriksson" at Jun 19, 2008 11:02:14 AM

> 
> I recently had this happen to me on an 8 x 1 TB RAID-5 array on a
> Highpoint RocketRAID 2340 controller. For some unknown reason two drives
> developed unreadable sectors within hours of each other. To make a long
> story short, the way I "fixed" this was to:
> 
	Not FreeBSD related, so you can delete now if not interested...

	We had a 1.5TB NetApp filer at my previous place. It was originally
backed up by another 1.5TB filer taking snapshots every few hours. After
a few years, the customer decided it was "too safe" so they used the 2nd
filer for something else. A month later, we had a double disk failure in the
same volume.

	The NetApp freaked out and rebooted, but when it did it marked one
disk dead, and the other as fine. Since there was a hot spare, it started
to attempt a rebuild. It took 9 hours for a 72G disk, and the half-failed
drive sounded like it was putting the head through the media with lead shot
in it. The filer performed at about half speed during that time. The SECOND
that it finished, and the software claimed that the array was in optimal
mode, we immediately pulled the bad disk out and replaced it with a fresh
disk. That rebuild went fine. We then pulled the failed disk, and put another
disk in as the hot spare.

	Not sure if it's a testament to NetApp, or to our and the customer's
luck. They had specifically not wanted backups, and rebuilding the data
would have taken months and many man-hours, and cost the site revenue.

	Ever since then, I try to get disks made at different times and
from different batches. You figure that if they were MADE around the same
time, they will most likely DIE around the same time. :)

			Tuc


