Date:      Tue, 6 Dec 2011 13:42:38 -0600
From:      Reid Linnemann <lreid@cs.okstate.edu>
To:        "C. P. Ghost" <cpghost@cordula.ws>
Cc:        Julien Cigar <jcigar@ulb.ac.be>, FreeBSD Mailing List <freebsd-questions@freebsd.org>
Subject:   Re: AHCI timeout
Message-ID:  <CA%2B0MdpNS3juJGgcOr3co9ebx0bcJDKnf_0SFO4PJ2KU5WDEJUg@mail.gmail.com>
In-Reply-To: <CADGWnjUjqn76PgcCFkgUssE8VhUamjdJZej8=ytNGRnQw3tzBg@mail.gmail.com>
References:  <4EDE37A1.5030306@ulb.ac.be> <CADGWnjUjqn76PgcCFkgUssE8VhUamjdJZej8=ytNGRnQw3tzBg@mail.gmail.com>

On Tue, Dec 6, 2011 at 12:25 PM, C. P. Ghost <cpghost@cordula.ws> wrote:
> On Tue, Dec 6, 2011 at 4:41 PM, Julien Cigar <jcigar@ulb.ac.be> wrote:
>> Hello,
>>
>> I'm running 9.0-RC3 on a HP Proliant Microserver (N40L). A disk died in my
>> graid3 array and I replaced it with a new one, and now have tons of:
>>
>> ahcich3: Timeout on slot 5 port 0
>> ahcich3: is 00000000 cs 00000000 ss 00003f60 rs 00003f60 tfd 40 serr
>> 00000000 cmd 0000ed17
>
> Check the connectors, both on disk and on the controller. They're
> usually the culprit. Sometimes it is also a firmware problem, but
> I'll try to replace the cables first.
>
>> (...)
>>
>> Those are Seagate disks:
>>
>> jcigar@backup conf % sudo camcontrol devlist
>> <VB0250EAVER HPG0>                 at scbus0 target 0 lun 0 (pass0,ada0)
>> <ST31000528AS CC38>                at scbus1 target 0 lun 0 (pass1,ada1)
>> <ST31000528AS CC38>                at scbus2 target 0 lun 0 (pass2,ada2)
>> <ST31000333AS CC1H>                at scbus3 target 0 lun 0 (pass3,ada3)
>>
>> The controller is:
>>
>> ahci0@pci0:0:17:0:      class=0x010601 card=0x1609103c chip=0x43911002
>> rev=0x40 hdr=0x00
>>    vendor     = 'ATI Technologies Inc'
>>    device     = 'SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]'
>>    class      = mass storage
>>    subclass   = SATA
>>
>> jcigar@backup conf % vmstat -i
>> interrupt                          total       rate
>> irq17: ehci0 ehci1+                    2          0
>> irq18: ohci0 ohci1+                   30          0
>> irq256: bge0                       31354          4
>> irq257: ahci0                   19012658       2477
>> irq258: hpet0:t0                 4926229        641
>> irq259: hpet0:t1                 4635261        603
>> Total                           28605534       3727
>>
>>
>> Any idea what could be the cause of this ... ?
>>
>>
>> Thanks,
>> Julien
>
> -cpghost.
>
> --
> Cordula's Web. http://www.cordula.ws/
> _______________________________________________
> freebsd-questions@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"

I've had similar problems with a failing power supply when I used to
run a gmirror on 7-STABLE. I was not running with AHCI, so I did not
get the same messages, but I did get repeated WRITE_DMA timeouts on my
da disks that eventually resulted in one disk being detached from the
mirror. Cold booting was an arduous process because 9 boots out of 10
the system would start sputtering out on DMA timeouts almost
immediately after mounting the filesystems, and take well over 30
minutes just to get through rc. I changed cables, swapped the disks
around, and checked smartctl over and over, to no avail. Eventually I
bought a new rig and hooked it up to the original power supply; the
problems persisted. I swapped in the new power supply and hey presto!
the problems went away. You mentioned hardware failure in the original
disk, so it might not be too much of a stretch to consider that the
power supply might also have suffered a failure.
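
One rough way to tell a single bad port, cable or disk from something
shared such as the power supply is to count the timeouts per AHCI
channel. What follows is only a minimal sketch in Python, assuming the
kernel messages have been saved as text (from dmesg, for instance).
The function name count_timeouts and the ahcich1 line in the sample
are made up for illustration; only the ahcich3 lines come from the
report quoted above.

```python
import re
from collections import Counter

# Matches kernel lines such as "ahcich3: Timeout on slot 5 port 0".
TIMEOUT_RE = re.compile(r"^(ahcich\d+): Timeout on slot (\d+) port (\d+)")


def count_timeouts(log_text):
    """Return a Counter mapping AHCI channel name to its timeout count."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample input. The ahcich3 lines are taken from the original report;
# the ahcich1 line is invented to show a second channel being counted.
sample = """\
ahcich3: Timeout on slot 5 port 0
ahcich3: is 00000000 cs 00000000 ss 00003f60 rs 00003f60 tfd 40 serr 00000000 cmd 0000ed17
ahcich1: Timeout on slot 2 port 0
"""

if __name__ == "__main__":
    for channel, n in sorted(count_timeouts(sample).items()):
        print(channel, n)
```

If nearly all the timeouts land on one channel, that would point
towards that disk, its cable or its port. Timeouts spread across
several channels would make a shared cause such as the power supply or
the controller more plausible. Neither result is proof on its own.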


