Date:      Sun, 24 May 2009 19:50:35 -0500
From:      "James R. Van Artsdalen" <james-freebsd-current@jrv.org>
Cc:        freebsd-current@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject:   Re: ZFS panic under extreme circumstances (2/3 disks corrupted)
Message-ID:  <4A19EB5B.1000806@jrv.org>
In-Reply-To: <gvckuv$u9l$1@ger.gmane.org>
References:  <4E6E325D-BB18-4478-BCFD-633D6F4CFD88@exscape.org>	<4FE794E9-075D-4563-B395-BD5E459937DF@exscape.org> <gvckuv$u9l$1@ger.gmane.org>

Ivan Voras wrote:
> Thomas Backman wrote:
>   
>> On May 24, 2009, at 09:02 PM, Thomas Backman wrote:
>>
>>     
>>> 5) Check if the md5 of file: everything OK, zpool status shows a
>>> degraded pool.
>>> 6) Repeat step #4, but with disk 3.
>>> 7) zpool scrub test
>>> 8) Panic!
>>>
>>>       
> Did you account for the time factor? Between your steps 5 and 6,
> wouldn't ZFS automatically begin data repair?
>   


ZFS probably repairs only the errors it actually sees in step 5: if he
reads a corrupted sector, that sector might be fixed, but ZFS does not
start a scrub to look for other corruption.

His test probably clobbered pool metadata or something similar:
something not touched by the md5(1) in step 5.  That error might not
have been seen until step 7, by which point step 6 had rendered the pool
unrepairable.

The original test might need to actually read the disk blocks before
overwriting them, to make sure they hold file data and not something
else; otherwise it probably isn't a valid test of automatic self-repair.
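
A rough sketch of that pre-check, in Python.  It hashes every
sector-sized chunk of the test file, then scans a disk (or a file
backing a vdev) and reports the sector offsets whose contents match a
chunk of the file.  Everything here is an assumption for illustration:
the function names are made up, the 512-byte sector size is a guess,
compression is assumed to be off, and the test file is assumed to hold
distinctive (e.g. random) data so that a match is not accidental.

```python
import hashlib

SECTOR = 512  # assumed sector size; adjust to the real device


def file_sector_digests(file_path, sector=SECTOR):
    """Return the set of SHA-256 digests of each full sector-sized chunk of the file."""
    digests = set()
    with open(file_path, "rb") as f:
        while True:
            chunk = f.read(sector)
            if len(chunk) < sector:  # ignore a trailing partial chunk
                break
            digests.add(hashlib.sha256(chunk).digest())
    return digests


def sectors_holding_file_data(device_path, digests, sector=SECTOR):
    """Return byte offsets of sectors on the device whose contents match a file chunk."""
    offsets = []
    with open(device_path, "rb") as dev:
        offset = 0
        while True:
            chunk = dev.read(sector)
            if len(chunk) < sector:
                break
            if hashlib.sha256(chunk).digest() in digests:
                offsets.append(offset)
            offset += sector
    return offsets
```

Only offsets returned by sectors_holding_file_data() would then be
overwritten in the corruption step; a sector that matches nothing in the
file may be metadata or free space and is left alone.  This does not
prove a sector belongs to the file, only that its contents match.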


