Date:      Wed, 09 Feb 2011 15:44:45 -0500
From:      Michael Powell <nightrecon@hotmail.com>
To:        freebsd-questions@freebsd.org
Subject:   Re: Bad hard driver
Message-ID:  <iiuu9u$3n9$1@dough.gmane.org>
References:  <AANLkTi=1oBAT1Xr7SkUBg5pShpc0+7XiDpQvrT0dwHwd@mail.gmail.com> <EEB2B3DA-81F2-44A6-BDDD-F7D5418BB693@mac.com>

Chuck Swiger wrote:

> On Feb 9, 2011, at 11:15 AM, Daniel Zhelev wrote:
>> The following warning/error was logged by the smartd daemon:
>> 
>> Device: /dev/ad7, 3 Offline uncorrectable sectors
> 
> It means that the drive has detected errors in three sectors, and is
> attempting to recover them without data loss to spare sectors, so far
> without success.  It could also indicate that the drive has exhausted the
> spare sectors, in which case all future errors will cause additional data
> loss.

As long as the remap region is not full, the next write to these sectors
will clear the error. The shotgun approach is to dd zeros over the entire
drive or to reformat it, but that entails a complete backup and restore
cycle. That is a little extreme, as this particular error is actually
rather benign and eventually self-correcting as long as there is space in
the remap area.

Early in a drive's life this may be tolerable until the remap area fills.
Even if the remap area has space available and these errors get cleared by
the next write to the defective sectors, I would still watch for more of
them. If you get these errors cleared only to see new ones appear, it
indicates media failure spreading across the platters. At that point in a
drive's life it only makes sense to replace it, because eventually the remap
region fills and you will lose data.
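
One way to watch for this is to log the relevant SMART counters over time.
The following is only a rough sketch: the helper name smart_remap_counts is
made up for illustration, and it assumes the usual "smartctl -A" table
layout, with the attribute name in column 2 and the raw value in column 10.

```shell
# Print the raw values of the SMART attributes that track bad and remapped
# sectors. Reads "smartctl -A" style output on stdin.
# Assumption: attribute name is field 2 and the raw value is field 10.
smart_remap_counts() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable" { print $2, $10 }'
}

# Typical use (needs root and smartmontools; /dev/ad7 is the drive from
# this thread):
#   smartctl -A /dev/ad7 | smart_remap_counts
```

If the pending and uncorrectable counts drop back to zero after a rewrite
and stay there, the drive is probably fine for now. If the reallocated
count keeps climbing from one check to the next, that is the spreading
failure described above.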
 
> From the "SMART Self-test log", it seems like you are running short
> self-tests every 24 hours, and periodically running extended tests on some
> interval as well.  The smartctl FAQ recommends doing so at weekly
> intervals; doing it daily is putting significant testing load onto the
> drive.
> 
>> I know about the how to -
>> http://smartmontools.sourceforge.net/badblockhowto.html
>> 
>> But how can I get the LBA?
>> And is there some diagnostic tool for WD in ports?
> 
> Doing a "dd if=/dev/ad7 of=/dev/null bs=64k" will read-scan the entire
> drive, and ought to produce a warning in the logs indicating the LBA of
> the bad sectors.  As for diagnostic tools, WD makes utilities for DOS and
> Windows, not FreeBSD.  See:
> 
>   http://support.wdc.com/product/download.asp?groupid=613&lang=en
> 
> ...for something which you can run off of a boot floppy, USB pendrive,
> etc.

The quick test will tell you about bad sectors and then direct you to run
the full surface scan, which destroys data. But it will "fix" the drive, so
it is back to the dump/restore cycle. I run this on any used drive about to
be recycled back into use. It almost always finds something and "fixes" it.

Excellent idea on how to find the LBA. If the exact sector addresses can be
located, a dd write to the specific sector(s) will clear the error condition
and should (in theory) be doable without losing data. I would still
recommend that a complete dump backup be done before trying anything.
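
A single-sector write could look roughly like the sketch below. The function
name zero_sector and the LBA in the usage comment are placeholders, not
values from this thread, and the 512-byte sector size is an assumption to
check against the drive. Whatever was stored in that sector is overwritten,
so rehearse on an image file first.

```shell
# Overwrite exactly one sector with zeros, leaving everything else alone.
# $1 = target (the device, or an image file when rehearsing)
# $2 = LBA of the sector to overwrite
# SECTOR_SIZE defaults to 512 bytes (an assumption; verify for the drive).
zero_sector() {
    target=$1
    lba=$2
    # seek= skips $lba output blocks; conv=notrunc keeps a regular file
    # from being truncated after the written block.
    dd if=/dev/zero of="$target" bs="${SECTOR_SIZE:-512}" \
       seek="$lba" count=1 conv=notrunc 2>/dev/null
}

# Hypothetical example; 123456 is a placeholder LBA:
#   zero_sector /dev/ad7 123456
```

Afterwards, rerun a read of that sector or a short self-test to confirm the
error has actually cleared.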

-Mike





