Date:      Wed, 19 Jan 2005 15:06:04 -0800
From:      Jon Simola <jsimola@gmail.com>
To:        freebsd-stable@freebsd.org
Subject:   Re: Bad disk or kernel (ATA Driver) problem?
Message-ID:  <8eea04080501191506237fc762@mail.gmail.com>
In-Reply-To: <8eea040805011913334b140af6@mail.gmail.com>
References:  <20050119151301.A22310@Denninger.Net> <8eea040805011913334b140af6@mail.gmail.com>

On Wed, 19 Jan 2005 13:33:12 -0800, Jon Simola <jsimola@gmail.com> wrote:

> I've got a few 1U Supermicro boxes running dual SATA drives:
> I've run into all sorts of problems with every one, and changing the
> IDE channel settings in the BIOS always fixes it. Which really annoys
> me, because I setup a new box, run it for a couple weeks, then the
> drives start getting flaky under load. Then I go change the setting in
> the BIOS (that I always forget to do on initial setup) and it's dead
> stable for months at a time.

I was politely asked to actually dig up the settings, which cut
through my lack of sleep. I should have done this earlier :)

On this one box (Supermicro SuperServer 5013C-T, P4SCE BIOS v1.2c):
5.2.1-RELEASE-p4
atapci0: <Intel ICH5 SATA150 controller> port 0xf000-0xf00f,0-0x3,0-0x7,0-0x3,0-0x7 irq 16 at device 31.2 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
GEOM: create disk ad0 dp=0xc671a560
ad0: 70911MB <WDC WD740GD-00FLA0> [144073/16/63] at ata0-master UDMA100
GEOM: create disk ad1 dp=0xc671a460
ad1: 70911MB <WDC WD740GD-00FLA0> [144073/16/63] at ata0-slave UDMA100
acd0: CDROM <CD-224E> at ata1-master PIO4

That's a pair of SATA 74GB WD Raptors. The BIOS IDE setting is
"Combined": the SATA drives appear on the Primary IDE channel.


On a different box (Supermicro SuperServer 5013C-T, P4SCE BIOS v1.2c):
5.3-STABLE-20050107
atapci0: <Intel ICH5 UDMA100 controller> port 0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
atapci1: <Intel ICH5 SATA150 controller> port 0xd000-0xd00f,0xcc00-0xcc03,0xc800-0xc807,0xc400-0xc403,0xc000-0xc007 irq 18 at device 31.2 on pci0
ata2: channel #0 on atapci1
ata3: channel #1 on atapci1
acd0: CDROM <CD-224E/1.9A> at ata1-master UDMA33
ad4: 78167MB <Maxtor 6Y080M0/YAR51HW0> [158816/16/63] at ata2-master SATA150
ad6: 78167MB <Maxtor 6Y080M0/YAR51HW0> [158816/16/63] at ata3-master SATA150

That's a pair of Maxtor 80GB drives. The BIOS is set to "Enhanced",
which allows up to 6 drives (4 IDE + 2 SATA).
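
For anyone wanting to check which way their own box came up, here is a
rough sketch in Python that pulls the disk attach lines out of dmesg
output like the two listings above. The names parse_disks and
looks_combined are made up for this example, and the "Combined" check is
only a heuristic I am assuming: it is meaningful only if you already
know the disks are SATA, since a real PATA disk also reports a UDMA mode.

```python
import re

# Matches ata(4) disk attach lines in the format shown above, e.g.:
#   ad0: 70911MB <WDC WD740GD-00FLA0> [144073/16/63] at ata0-master UDMA100
DISK_RE = re.compile(
    r"^(?P<dev>ad\d+): (?P<size>\d+)MB <(?P<model>[^>]+)> "
    r"\[[\d/]+\] at (?P<chan>ata\d+)-(?P<role>master|slave) (?P<mode>\S+)$"
)


def parse_disks(dmesg_text):
    """Return one dict per disk attach line found in dmesg_text."""
    disks = []
    for line in dmesg_text.splitlines():
        m = DISK_RE.match(line.strip())
        if m:
            disks.append(m.groupdict())
    return disks


def looks_combined(disks):
    """Heuristic (assumption): a known-SATA disk that reports a UDMA
    transfer mode is being presented as a legacy IDE device."""
    return any(d["mode"].startswith("UDMA") for d in disks)


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python this_script.py
    found = parse_disks(sys.stdin.read())
    for d in found:
        print(d["dev"], d["model"], d["chan"], d["role"], d["mode"])
    print("SATA disks on legacy IDE channel?", looks_combined(found))
```

Fed the first listing, this should pick up ad0 and ad1 on ata0 with
UDMA100. Fed the second, it should pick up ad4 and ad6 on ata2 and ata3
with SATA150.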


Crazy as it seems, I wasn't kidding about changing the BIOS.
The other 2 settings are "SATA only" and "Auto". When the drives
started flaking out (timeouts on reads) I would go into the BIOS and
cycle through the settings. After changing the setting once or twice,
things would be fine for months at a time.

My best guess is that "something" makes the ICH5 a little flaky,
and twiddling the BIOS clears it somehow. My only supporting evidence
is that twice, after this error had happened, the BIOS stalled while
probing the drives, and I had to physically remove the drives, twiddle
the BIOS settings, and reinstall the drives before it would work again.

On OpenBSD, this problem on the same hardware manifests as a read
timeout failure during the initial boot probes. Same fix: play with
the BIOS and it suddenly works. There's a term in the Jargon File for
this, but I can't recall it at the moment.


