Date:      Sun, 04 Oct 2009 15:04:57 -0500
From:      Aaron Hurt <aaron@goflexitllc.com>
To:        Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: Help needed! ZFS I/O error recovery?
Message-ID:  <4AC8FFE9.90606@goflexitllc.com>
In-Reply-To: <20091004174746.GF1660@garage.freebsd.pl>
References:  <683849754.20091001110503@pyro.de> <20091004174746.GF1660@garage.freebsd.pl>


Pawel Jakub Dawidek wrote:
> On Thu, Oct 01, 2009 at 11:05:03AM +0200, Solon Lutz wrote:
>   
>> Hi everybody,
>>
>> I'm faced with a 10TB ZFS pool on a 12TB RAID6 Areca controller.
>> And yes, I know, you shouldn't put a zpool on a RAID-device... =(
>>     
>
> Just to be sure: you have no redundancy at the ZFS level at all? That's
> a very, very bad idea for important data (you know that already, but to
> warn others)...
>
>   
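To make that warning concrete for anyone reading the archive: when the
controller exports one big RAID6 volume, ZFS sees a single vdev, so its
checksums can detect corruption but there is no second copy to heal from.
The difference in setup is roughly this (device names are placeholders,
and the disks would have to be exported individually as JBOD/pass-through
volumes first):

    # single vdev on top of hardware RAID: ZFS detects errors, cannot repair
    zpool create temp da0

    # ZFS-level redundancy: checksum errors can be self-healed from parity
    zpool create temp raidz2 da0 da1 da2 da3 da4 da5
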
>> The cable was replaced, a parity check was run on the RAID-Volume and
>> showed no errors; the zfs scrub, however, showed some 'defective' files.
>> After copying these files with 'dd conv=noerror ...' and comparing them
>> to the originals, they were error-free.
>>
>> Yesterday, however, three more defective cables forced the controller
>> to take the RAID6 volume offline. All cables were then replaced and a parity
>> check was run on the RAID-Volume -> data integrity OK.
>>     
>
> This means absolutely nothing. It just means that the parity matches the
> actual data; it doesn't mean the data is fine from a file system or
> application perspective.
>
>   
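For reference, the salvage-and-compare step Solon describes above would
look something like this (paths are made up; conv=sync would zero-pad
short reads, so it is omitted here where a byte-for-byte comparison
matters):

    # copy a damaged file, skipping unreadable blocks instead of aborting
    dd if=/temp/space1/file.dat of=/backup/file.dat bs=64k conv=noerror

    # compare the salvaged copy against a known-good original
    sha256 /backup/file.dat /mnt/originals/file.dat
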
>> But now ZFS refuses to mount all volumes:
>>
>> Solaris: WARNING: can't process intent log for temp/space1
>> Solaris: WARNING: can't process intent log for temp/space2
>> Solaris: WARNING: can't process intent log for temp/space3
>> Solaris: WARNING: can't process intent log for temp/space4
>>
>> A scrub revealed the following:
>>
>> errors: Permanent errors have been detected in the following files:
>>
>>         temp:<0x0>
>>         temp/space1:<0x0>
>>         temp/space2:<0x0>
>>         temp/space3:<0x0>
>>         temp/space4:<0x0>
>>
>>
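A side note on those entries: as I understand it, object 0 of a dataset
is its meta-dnode, so 'temp/space1:<0x0>' means the dataset's own
metadata is damaged rather than any particular file, which fits the
unmountable state. It can be inspected without mounting anything via zdb,
e.g. (assuming the zdb shipped with that release accepts this form):

    # dump object headers for the dataset, read-only
    zdb -d temp/space1
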
>> I tried to switch off checksums for this pool, but that didn't help in any
>> way. I also mounted the pool by hand and was faced with 'empty' volumes
>> and 'I/O errors' when trying to list their contents...
>>
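The attempt described there was presumably something along these lines;
note that checksum=off only applies to blocks written from that point on,
which is why it cannot paper over metadata that is already damaged:

    zfs set checksum=off temp

    # mount a dataset by hand and try to list it
    zfs mount temp/space1
    ls /temp/space1        # -> Input/output error
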
>> Any suggestions? I'm offering some homemade blackberry jam and raspberry brandy
>> to the person who can help restore or back up the data.
>>
>> Tech specs:
>>
>> FreeBSD 7.2-STABLE #21: Tue May  5 18:44:10 CEST 2009 (AMD64)
>> da0 at arcmsr0 bus 0 target 0 lun 0
>> da0: <Areca ARC-1280-VOL#00 R001> Fixed Direct Access SCSI-5 device
>> da0: 166.666MB/s transfers (83.333MHz DT, offset 32, 16bit)
>> da0: Command Queueing Enabled
>> da0: 10490414MB (21484367872 512 byte sectors: 255H 63S/T 1337340C)
>> ZFS filesystem version 6
>> ZFS storage pool version 6
>>     
>
> If you are able to back up your disks, do it before we go further. I have
> some ideas, but they could mess up your data even further.
>
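In practice that means imaging the raw device before touching the pool
again, along these lines (the destination is a stand-in and must be at
least as large as the source):

    # raw image of the whole RAID volume; do not abort on read errors
    dd if=/dev/da0 of=/scratch/da0.img bs=1m conv=noerror,sync
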
> First of all, I'd start with upgrading the system to stable/8; it may have
> better error recovery.
>
> Do not write anything new to the pool; in fact, do not even read from it,
> as reading may trigger writes as well.
>
>   
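Worth adding for the archive: ZFS versions well past the v6 pool in
question grew a read-only import that honors this advice directly. On
code that has it, the sketch would be:

    # import without allowing any writes (newer ZFS only)
    zpool import -o readonly=on temp
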
I am experiencing a similar issue with a small box here at the house. It
is not on a RAID controller, just a standard 4-port non-RAID controller,
and it is also failing with an 'I/O error, unable to import' message.
This started after a situation much like the one above: a drive went bad,
began throwing DMA read/write errors, and caused the machine to lock
up... not a panic or crash, just a freeze... so I turned the machine off
until I had time to back up the data and move it to a new array. When I
finally went to do that, this particular raidz1 pool showed FAULTED with
a status message about corrupt metadata. I hoped an export/import cycle
would get it back into a readable state. It didn't: the pool exported
fine without error, but it now refuses to import, saying 'I/O error,
unable to import'. Long story short, I would also be very appreciative
of any ZFS-related data recovery information or procedures.
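
For the record, the failing sequence on my box is roughly this (pool
name is a stand-in):

    zpool export tank
    zpool import tank      # -> cannot import 'tank': I/O error
    zpool import -f tank   # forcing makes no difference

Later ZFS code also gained a recovery mode ('zpool import -F') that
rewinds the pool by discarding the last few transactions, but it is not
in the versions discussed in this thread.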

-- 
Aaron Hurt
Managing Partner
Flex I.T., LLC
611 Commerce Street
Suite 3117
Nashville, TN  37203
Phone: 615.438.7101
E-mail: aaron@goflexitllc.com

