Date:      Wed, 8 Aug 2007 16:46:21 -0700
From:      "Steve Franks" <stevefranks@ieee.org>
To:        freebsd@sopwith.solgatos.com
Cc:        freebsd-questions@freebsd.org
Subject:   Re: 6.2 not compatible with new sata drives ?!
Message-ID:  <539c60b90708081646g3ad88b57gbdc80deab8870bce@mail.gmail.com>
In-Reply-To: <200708051633.QAA17703@sopwith.solgatos.com>
References:  <200708051633.QAA17703@sopwith.solgatos.com>

On 8/5/07, Dieter <freebsd@sopwith.solgatos.com> wrote:
> >> I got 2 new 160GB drives last month, and my system has been
> >> unstable ever since.  I have swapped cables, purchased a
> >> brand-new sata150 controller (as opposed to the year old
> >> sataII), and the results are always the same.
> >
> > What make & model controllers?  What make & model drives?
> > Some combinations of controller and drive do not play well
> > together.
>
> I just found your other posting "ad8: FAILURE - device detached".
>
> I assume that the new failing disks are
>
> >>> ad4: 157066MB <HDT722516DLA380 V43OA9BA> at ata2-master SATA150
> >>> ad8: 157066MB <HDT722516DLA380 V43OA9BA> at ata4-master SATA150
>
> and that they are Hitachi?
>
> I still don't know what controller you are using, but I read
> that nforce4 plus Maxtor or Hitachi disks gives data corruption:
>
>         http://forums.nvidia.com/lofiversion/index.php?t8171.html
>
> I have been using nforce4-ultra with Seagate disks with no
> data corruption problems.
>
> It is not immediately obvious how data corruption would cause
> your device detached problem, but there could be more than
> one bug.
>
> If your controller works well with your Samsung drives, you
> could return the Hitachis and get more Samsungs.

Actually, the problem started when I added the Samsungs.

Here are the facts: I'm having the most mysterious drive issue I've ever seen.

1. I have a Promise SATA300 controller with 4 drives.  I replaced 2
with new drives.
2. It appeared to work for a week or so.  To my knowledge, I did no
maintenance or installs that week.
3. Over the next couple weeks, things devolved until all 4 disks
"device failed detached" in dmesg as soon as you hit them.  You also
get plenty of "set features settransfermode taskqueue timeout"s in
dmesg.
4. I was sure this was a hardware problem.  Bought new cables, new
drives, new controller board (that apparently has a driver but is
flaky (si chipset?)).  Very carefully swapped pieces to isolate the
problem.  Every possible configuration failed, with new and old disks,
cables, etc.
5. I decided that since I had pretty much cycled all the hardware, it
must be software, so I put in the 7.0 iso I was going to try on my
laptop.  Dropped into fixit, looked at the drives, no problems!
6. Decided to try my original 6.2 amd64 iso again, on a whim.  Perhaps
not surprisingly, the disks looked fine on it too.
7. Did an 'upgrade' from the 6.2 iso, which reported success but did
not fix the issue, so I suspect it's not a kernel or driver issue.  It
appears to me the 'upgrade' process replaces everything but /etc, no?
8. I'm a total novice, so I've only messed with inetd, rc.conf,
loader.conf, and crontab (so far as I know).  What in there could
fubar the disks?
9. When I say the disks are 'good' from fixit on the iso, what I mean
is this: you can fsck_ffs each disk with zero errors, and zero errors
appear in dmesg.  Then (since the disks were mirrored, then the mirror
was broken while diagnosing, and some files were added), you can diff
-r the drives (which takes about 4 hours, they are 95% full 160GB
drives), and everything goes as expected: again, no errors, no debug
info in dmesg, etc.
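
For anyone who wants to repeat the comparison in step 9 without
reading through diff -r output, here is a rough Python sketch of the
same idea.  It is not what I ran; the function name tree_differences
and the mount points in the usage note are made up for illustration.
It compares file contents byte by byte, so expect it to be slow on
nearly full disks.

```python
import filecmp
import os


def tree_differences(left, right):
    """Return sorted relative paths that differ between two directory trees."""
    differences = []

    def walk(cmp, prefix):
        # Entries that exist on only one side.
        for name in cmp.left_only + cmp.right_only:
            differences.append(os.path.join(prefix, name))
        # Files on both sides: force a content comparison, because the
        # default dircmp check only looks at os.stat() data.
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        for name in mismatch + errors:
            differences.append(os.path.join(prefix, name))
        # Names whose type differs between sides, or that could not be stat'ed.
        for name in cmp.common_funny:
            differences.append(os.path.join(prefix, name))
        # Recurse into directories present on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    # ignore=[] so that nothing (e.g. CVS or RCS directories) is skipped.
    walk(filecmp.dircmp(left, right, ignore=[]), "")
    return sorted(differences)
```

Usage would be something like tree_differences("/mnt/disk_a",
"/mnt/disk_b") (hypothetical mount points); an empty list means the
two trees matched.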

So, the obvious question: what is different between my installed,
running system and the 'fixit' shell on the same iso I installed from
that could make my sata hardware appear bad on the running system,
but not on the iso?

UPDATE: At the moment, I think things are running ok (past 24 hrs)
with smartd disabled, so maybe the Samsungs have some SMART issue?
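
To check whether the errors really stop with smartd off, rather than
going by feel, one could count the failure messages per disk in saved
dmesg output before and after.  A minimal Python sketch, assuming the
messages contain "device detached" or "taskqueue timeout" and start
with a device name like ad8; the patterns and the function name
count_disk_errors are my guesses, not an exact list of what the ata
driver prints.

```python
import re
from collections import Counter

# Assumed message fragments; adjust to whatever dmesg really shows.
ERROR_PATTERN = re.compile(r"device detached|taskqueue timeout", re.IGNORECASE)
# Device name such as "ad4:"; search() rather than match() so lines with
# a syslog timestamp prefix are handled too.
DEVICE_PATTERN = re.compile(r"\b(ad\d+):")


def count_disk_errors(lines):
    """Count error lines per disk device in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        device = DEVICE_PATTERN.search(line)
        if device and ERROR_PATTERN.search(line):
            counts[device.group(1)] += 1
    return counts
```

Feed it the lines of a saved dmesg dump from each period and compare
the two results; if the counts drop to zero only while smartd is
disabled, that would support the SMART theory.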

My newest issue is that the df command reports identical usage for
the two Hitachi drives, even though I've added several tens of
megabytes to one, but not the other.  Maybe it's time to donate this
system to my local charity for parts ;)
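
One thing that might be worth ruling out for the df oddity is that
both mount points actually resolve to the same filesystem.  A small
Python sketch of that check; the name compare_mounts and the paths in
the usage note are hypothetical, and this is only a guess at a
diagnostic, not a known cause.

```python
import os
import shutil


def compare_mounts(path_a, path_b):
    """Report whether two paths share a filesystem and how their usage differs."""
    usage_a = shutil.disk_usage(path_a)
    usage_b = shutil.disk_usage(path_b)
    return {
        # Equal st_dev values mean both paths live on the same filesystem.
        "same_filesystem": os.stat(path_a).st_dev == os.stat(path_b).st_dev,
        "used_a": usage_a.used,
        "used_b": usage_b.used,
        "used_difference": usage_a.used - usage_b.used,
    }
```

For example, compare_mounts("/mnt/disk_a", "/mnt/disk_b") with the
two Hitachi mount points substituted in.  If same_filesystem comes
back True, df is reporting on one filesystem twice.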

Steve


