Date:      Sat, 30 Apr 2011 15:19:27 -0600
From:      "Kenneth D. Merry" <ken@FreeBSD.org>
To:        Dmitry Morozovsky <marck@rinet.ru>
Cc:        freebsd-stable@FreeBSD.org
Subject:   Re: mps driver instability under stable/8
Message-ID:  <20110430211927.GA67374@nargothrond.kdm.org>
In-Reply-To: <alpine.BSF.2.00.1104291145080.29081@woozle.rinet.ru>
References:  <alpine.BSF.2.00.1104291145080.29081@woozle.rinet.ru>

On Fri, Apr 29, 2011 at 11:51:21 +0400, Dmitry Morozovsky wrote:
> Dear Ken,
> 
> I have a SuperMicro server using the mps driver you maintain, with 24 SATA
> disks behind a SAS x36 expander and a large ZFS pool.
> 
> Sometimes, under random disk load such as the daily find, it loses all its devices:
> 
> [-- MARK -- Fri Apr 29 03:00:00 2011]
> mps0: IOC Fault 0x40005900, Resetting
> (pass20:mps0:0:22:0): SCSI command timeout on device handle 0x0020 SMID 442
> mps0: IOC Fault 0x40001500, Resetting
> (da19:mps0:0:21:0): SCSI command timeout on device handle 0x001f SMID 172
> (da19:mps0:0:21:0): SCSI command timeout on device handle 0x001f SMID 511
> (da20:mps0:0:20:0): SCSI command timeout on device handle 0x001e SMID 240
> 
> ..
> 
> (da4:mps0:0:0:0): SCSI command timeout on device handle 0x000a SMID 844
> (da22:mps0:0:23:0): SCSI command timeout on device handle 0x0021 SMID 713
> (da18:mps0:0:22:0): SCSI command timeout on device handle 0x0020 SMID 603
> 
> and hangs there forever (in zio state).
> 
> I've prepared a debugging kernel with DDB and would be glad to help track
> down the problem.

Hmm...

Can you send full dmesg output?  What I'm most interested in is whether
there is more kernel output before the IOC Fault that might shed some light
on what is going on.

Also, what brand (LSI, Maxim, etc.) and speed (3Gb, 6Gb) is the expander on
the backplane?

What model LSI controller do you have?  How many lanes are connected
between the controller and the backplane?

What model disks do you have in the system?  (The dmesg output will show
that, obviously.)

Hopefully we can find some clues to point to the problem.

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG


