Date:      Thu, 18 Dec 2003 21:38:47 -0800 (PST)
From:      Dan Strick <strick@covad.net>
To:        peo@intersonic.se
Cc:        dan@mist.nodomain
Subject:   Re: Hard drive stress test
Message-ID:  <200312190538.hBJ5clEZ013377@mist.nodomain>

>>
> Could somebody please recommend a utility or script suitable for 
> stressing a hard disk to check for possible errors?
>>

You might look on your hard disk manufacturer's web site for testing
and diagnostic software.  For example, Maxtor has something called
"MaxBlast" and Seagate has "SeaTools".  I don't find these programs
very useful for uncovering and fixing surface defects.  You might
as well just try dd'ing your hard drive into /dev/null.
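
For anyone who wants a little more detail than dd gives, here is a minimal
sketch of the same idea in Python: read the device from start to finish,
throw the data away, and note the offsets where a read fails.  The function
name, the block size, and the error limit are my own illustrative choices,
and os.pread is available on Unix-like systems only.

```python
import os

def scan_readable(path, block_size=1024 * 1024, max_errors=64):
    """Read `path` sequentially, discarding the data.

    Returns (good_bytes, bad_offsets): the number of bytes read without
    error and the starting offsets of blocks whose read raised OSError.
    Scanning stops at end of file or after `max_errors` failures, so a
    device that fails every read cannot loop forever.
    """
    good_bytes = 0
    bad_offsets = []
    offset = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while len(bad_offsets) < max_errors:
            try:
                chunk = os.pread(fd, block_size, offset)
            except OSError:
                # Record the failing block and skip past it.
                bad_offsets.append(offset)
                offset += block_size
                continue
            if not chunk:
                break  # end of file or device
            good_bytes += len(chunk)
            offset += len(chunk)
    finally:
        os.close(fd)
    return good_bytes, bad_offsets
```

Calling scan_readable() on a raw disk device normally needs root privileges.
Like dd, it only finds errors that the drive actually reports to the host.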

I have a program that began life twenty+ years ago as a hack for
finding bad sectors on SMD disk drives so they could be mapped out.
It worked quite well for that and also had the very useful (annoying?)
habit of uncovering controller/data-path problems (that one bit in
a zillion that got flipped for no good reason).  I recently dusted
this program off and hacked it to automagically determine, on FreeBSD,
the size of the special file (disk, slice, or partition) that it tests
and to avoid the effects of disk caching in the kernel and the drive.
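
This message does not say how the program detects flipped bits.  Purely as
an illustration, and not as a description of that program, one simple
non-destructive check is to read every block twice and compare checksums.
The function names and block size below are assumptions.  Note that the
check is only meaningful if the second pass really comes from the medium
and not from a cache, which this sketch does nothing to guarantee.

```python
import hashlib

def block_digests(path, block_size=64 * 1024):
    """Return one SHA-256 digest per block of `path`, in order."""
    digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            digests.append(hashlib.sha256(chunk).digest())
    return digests

def unstable_blocks(path, block_size=64 * 1024):
    """Read `path` twice; return indexes of blocks whose data differed."""
    first = block_digests(path, block_size)
    second = block_digests(path, block_size)
    return [i for i, (a, b) in enumerate(zip(first, second)) if a != b]
```

An empty list means both passes returned identical data.  Any index in the
list points at a block that read differently the second time, which
suggests a data-path problem and not a surface defect.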

I could give you a copy of the program, but I must warn you that there
is virtually no external documentation and that it seems ineffective
on modern SCSI/ATA disk drives.  I don't know about ATA drives, but
virtually all SCSI drives come configured to hide their mistakes.
There is a bit, called "PER", in a SCSI disk drive's "error recovery"
mode parameter page that one can set to enable the reporting of soft
errors, but I don't trust it because during the thousands of hours I have
worked over SCSI disk drives I have yet to see a single soft error reported.
I can't explain it.  I don't believe even modern disk systems are
that reliable.  Possibly SCSI host adapter drivers don't like to report
soft errors either (seems unlikely).

Another problem is what to do with a bad spot once you find one.
I don't think I ever wrote any code to map out flaws on SCSI drives.
Perhaps it was not necessary.  Most SCSI drives have another bit,
called "ARRE", in their "error recovery" mode parameter pages that
enables automatic "reallocation" of sectors on read errors.
Unfortunately, I have never seen this work either, perhaps because
sectors cannot be automatically reallocated after hard errors (because
the data with which to initialize the mapped sector is not available).
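
For reference, PER and ARRE both live in byte 2 of the read-write error
recovery mode page (page code 01h).  The sketch below decodes that byte
using the bit layout as I understand it from the SCSI specifications.
Check your drive's product manual before relying on it.  The function
name is hypothetical, and fetching the page from the drive is left out.

```python
# Bit masks for byte 2 of the read-write error recovery mode page
# (page code 0x01).  Layout assumed from the SCSI specifications.
ERROR_RECOVERY_BITS = (
    ("AWRE", 0x80),  # automatic write reallocation enabled
    ("ARRE", 0x40),  # automatic read reallocation enabled
    ("TB",   0x20),  # transfer block
    ("RC",   0x10),  # read continuous
    ("EER",  0x08),  # enable early recovery
    ("PER",  0x04),  # post error (report recovered errors)
    ("DTE",  0x02),  # disable transfer on error
    ("DCR",  0x01),  # disable correction
)

def decode_error_recovery_flags(byte2):
    """Map each flag name to True or False for the given byte value."""
    if not 0 <= byte2 <= 0xFF:
        raise ValueError("byte2 must be in the range 0..255")
    return {name: bool(byte2 & mask) for name, mask in ERROR_RECOVERY_BITS}
```

For example, decode_error_recovery_flags(0xC4) reports AWRE, ARRE and PER
as set and everything else as clear.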

You might as well forget about testing/diagnostic programs.  These
things just don't seem to work anymore and the hardware manufacturers
don't seem to have much interest in producing them, perhaps because
the primary effect of such software is to increase the number of
requests for technical assistance and warranty repairs.  You don't
want to know.  Just cross your fingers and pray.  You'll be happier
that way.  :-)

Dan Strick
strick@covad.net


