Date:      Thu, 08 Dec 2011 12:23:37 +0100
From:      Julien Cigar <jcigar@ulb.ac.be>
To:        "C. P. Ghost" <cpghost@cordula.ws>
Cc:        FreeBSD Mailing List <freebsd-questions@freebsd.org>
Subject:   Re: AHCI timeout
Message-ID:  <4EE09E39.1060309@ulb.ac.be>
In-Reply-To: <CADGWnjUjqn76PgcCFkgUssE8VhUamjdJZej8=ytNGRnQw3tzBg@mail.gmail.com>
References:  <4EDE37A1.5030306@ulb.ac.be> <CADGWnjUjqn76PgcCFkgUssE8VhUamjdJZej8=ytNGRnQw3tzBg@mail.gmail.com>


On 12/06/2011 19:25, C. P. Ghost wrote:
> On Tue, Dec 6, 2011 at 4:41 PM, Julien Cigar <jcigar@ulb.ac.be> wrote:
>> Hello,
>>
>> I'm running 9.0-RC3 on a HP Proliant Microserver (N40L). A disk died in my
>> graid3 array and I replaced it with a new one, and now have tons of:
>>
>> ahcich3: Timeout on slot 5 port 0
>> ahcich3: is 00000000 cs 00000000 ss 00003f60 rs 00003f60 tfd 40 serr
>> 00000000 cmd 0000ed17
>
> Check the connectors, both on disk and on the controller. They're
> usually the culprit. Sometimes it is also a firmware problem, but
> I'll try to replace the cables first.

I tried two different connectors, but the problem persists. However,
I noticed that the problem only appears at high I/O rates (during a
graid3 resync, for example): the machine runs Bacula and the backup job
completed successfully last night, but it was for a remote machine, so
the I/O writes didn't go above 2 MB/s ...

Do you think the problem could be the firmware of the disk?
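
In case it helps anyone read the register dump quoted above, here is a
minimal Python sketch. It assumes that cs, ss and rs are 32-bit masks
with one bit per command slot; that is my reading of the driver output,
not something confirmed in this thread. The names slots() and decode()
are just illustrative helpers.

```python
import re

# Matches the register dump that follows a "Timeout on slot N" message.
# Assumption: cs, ss and rs are hex bitmasks with one bit per command slot.
DUMP_RE = re.compile(
    r"(?P<chan>ahcich\d+): is (?P<is>[0-9a-f]{8}) cs (?P<cs>[0-9a-f]{8}) "
    r"ss (?P<ss>[0-9a-f]{8}) rs (?P<rs>[0-9a-f]{8})"
)


def slots(mask_hex):
    """Return the slot numbers whose bits are set in a 32-bit hex mask."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(32) if mask & (1 << bit)]


def decode(line):
    """Decode one register-dump line, or return None if it does not match."""
    m = DUMP_RE.search(line)
    if m is None:
        return None
    return {
        "channel": m.group("chan"),
        "cs": slots(m.group("cs")),
        "ss": slots(m.group("ss")),
        "rs": slots(m.group("rs")),
    }


if __name__ == "__main__":
    # The dump from my first mail, with the wrapped line joined back together.
    line = ("ahcich3: is 00000000 cs 00000000 ss 00003f60 rs 00003f60 "
            "tfd 40 serr 00000000 cmd 0000ed17")
    print(decode(line))
```

For the dump above this gives slots 5, 6 and 8 to 13 for both ss and rs,
and none for cs. If the reading above is right, the timed-out slot 5 is
one of eight commands that were still outstanding on ahcich3.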

>
>> (...)
>>
>> Those are Seagate disks:
>>
>> jcigar@backup conf % sudo camcontrol devlist
>> <VB0250EAVER HPG0>                   at scbus0 target 0 lun 0 (pass0,ada0)
>> <ST31000528AS CC38>                  at scbus1 target 0 lun 0 (pass1,ada1)
>> <ST31000528AS CC38>                  at scbus2 target 0 lun 0 (pass2,ada2)
>> <ST31000333AS CC1H>                  at scbus3 target 0 lun 0 (pass3,ada3)
>>
>> The controller is:
>>
>> ahci0@pci0:0:17:0:      class=0x010601 card=0x1609103c chip=0x43911002
>> rev=0x40 hdr=0x00
>>     vendor     = 'ATI Technologies Inc'
>>     device     = 'SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]'
>>     class      = mass storage
>>     subclass   = SATA
>>
>> jcigar@backup conf % vmstat -i
>> interrupt                          total       rate
>> irq17: ehci0 ehci1+                    2          0
>> irq18: ohci0 ohci1+                   30          0
>> irq256: bge0                       31354          4
>> irq257: ahci0                   19012658       2477
>> irq258: hpet0:t0                 4926229        641
>> irq259: hpet0:t1                 4635261        603
>> Total                           28605534       3727
>>
>>
>> Any idea what could be the cause of this ... ?
>>
>>
>> Thanks,
>> Julien
>
> -cpghost.
>


-- 
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.



