Date:      Tue, 8 Feb 2011 13:13:10 -0700
From:      "Kenneth D. Merry" <ken@freebsd.org>
To:        Joachim Tingvold <joachim@tingvold.com>
Cc:        freebsd-scsi@freebsd.org, Alexander Motin <mav@freebsd.org>
Subject:   Re: mps0-troubles
Message-ID:  <20110208201310.GA97635@nargothrond.kdm.org>
In-Reply-To: <DE11FC96-06DB-479F-8673-B9ACE2805390@tingvold.com>
References:  <20110114001758.GA12793@nargothrond.kdm.org> <D24332F3-56AF-484C-9592-1097BF684E37@tingvold.com> <07392102-4584-4690-9188-5202728CC7CA@tingvold.com> <20110120155746.GA22515@nargothrond.kdm.org> <BC40CE83-6116-49CD-8D37-5BC29893449D@tingvold.com> <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> <20110203221056.GA25389@nargothrond.kdm.org> <FFF5EF18-055C-4E0C-8F9B-03564217F80F@tingvold.com> <20110204180011.GA38067@nargothrond.kdm.org> <DE11FC96-06DB-479F-8673-B9ACE2805390@tingvold.com>

On Tue, Feb 08, 2011 at 02:35:35 +0100, Joachim Tingvold wrote:
> On Fri, Feb 04, 2011, at 19:00:11PM GMT+01:00, Kenneth D. Merry wrote:
> >Perhaps it could depend on memory fragmentation somewhat.  Over time you
> >may see the low water mark go down a bit.
> >
> >The good news is that it doesn't look like we have a leak.
> 
> <http://home.komsys.org/~jocke/dmesg_mps0_freebsd-scsi_5.txt>
> 

This particular error is interesting:

mps0: (0:40:0) terminated ioc 804b scsi 0 state c xfer 0
mps0: (0:40:0) terminated ioc 804b scsi 0 state c xfer 0

It means that the chip terminated the command for some reason.  I have been
talking to LSI about it.  I'm working on getting an analyzer trace when it
happens, so I can send that to LSI.
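
For anyone who wants to pick these lines out of a dmesg dump, here is a
minimal sketch of a decoder.  The constant values are assumptions recalled
from the MPI 2.0 headers (mpi2.h) and should be checked against the header
before relying on them; the reading of (0:40:0) as bus:target:lun and of
xfer as a decimal count is also an assumption.  The function name is
hypothetical and not part of the driver.

```python
import re

# Assumed values, recalled from mpi2.h; verify against the header.
IOCSTATUS_FLAG_LOG_INFO_AVAILABLE = 0x8000  # high bit: extra log info present
IOCSTATUS_MASK = 0x7FFF                     # low 15 bits: the status code
IOCSTATUS_SCSI_IOC_TERMINATED = 0x004B      # command terminated by the IOC

# Matches lines such as:
#   mps0: (0:40:0) terminated ioc 804b scsi 0 state c xfer 0
LINE_RE = re.compile(
    r"mps\d+: \((\d+):(\d+):(\d+)\) terminated ioc ([0-9a-fA-F]+) "
    r"scsi ([0-9a-fA-F]+) state ([0-9a-fA-F]+) xfer (\d+)"
)


def decode_terminated(line):
    """Split a 'terminated' log line into its fields, or return None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    ioc = int(m.group(4), 16)
    status = ioc & IOCSTATUS_MASK
    return {
        "bus": int(m.group(1)),      # assumed bus:target:lun ordering
        "target": int(m.group(2)),
        "lun": int(m.group(3)),
        "ioc_status": status,
        "log_info_available": bool(ioc & IOCSTATUS_FLAG_LOG_INFO_AVAILABLE),
        "ioc_terminated": status == IOCSTATUS_SCSI_IOC_TERMINATED,
        "scsi_status": int(m.group(5), 16),
        "scsi_state": int(m.group(6), 16),
        "xfer": int(m.group(7)),     # assumed to be printed in decimal
    }
```

Under those assumptions, 804b splits into the log-info flag (0x8000) plus
status 0x4b, which matches the "terminated by the chip" reading above.  It
does not say why the chip terminated the command.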

What kind of expander do you have in your system?  How many expanders do
you have?  How many drives do you have?  Can you send 'camcontrol devlist
-v' output?

> [jocke@filserver ~]$ sysctl hw.mps.0
> hw.mps.0.debug_level: 0
> hw.mps.0.allow_multiple_tm_cmds: 0
> hw.mps.0.io_cmds_active: 1
> hw.mps.0.io_cmds_highwater: 959
> hw.mps.0.chain_free: 2048
> hw.mps.0.chain_free_lowwater: 1721
> hw.mps.0.chain_alloc_fail: 0
> 
> This time I did a recursive copy of a folder with no large files at  
> all (it contained only small documents), from 'storage' to 'storage'.
> 
> However, it recovered, so the copy just continued where it left off --
> which is a change from previous crashes.

Yes, it looks like we're not running into the out-of-chain-frames problem:
chain_alloc_fail is 0, and chain_free_lowwater never dropped below 1721.
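
If you want to check this after each test run, something like the following
could work.  It is only a sketch: it parses the text printed by
'sysctl hw.mps.0' (with or without the "> " quote prefix) and treats a
nonzero chain_alloc_fail as the sign of chain exhaustion.  The function
names are hypothetical, and using chain_alloc_fail as the only indicator is
an assumption drawn from the counters shown above.

```python
def parse_mps_sysctl(text):
    """Return {short_name: int} for numeric lines like 'hw.mps.0.chain_free: 2048'."""
    values = {}
    for line in text.splitlines():
        line = line.lstrip("> ").strip()   # drop any mail quote prefix
        name, sep, value = line.partition(":")
        value = value.strip()
        if not sep or not value.isdigit():
            continue                       # skip prompts and non-numeric lines
        # Keep only the last component, e.g. 'chain_alloc_fail'.
        values[name.strip().rsplit(".", 1)[-1]] = int(value)
    return values


def ran_out_of_chains(values):
    """Assumption: any chain allocation failure means we ran out of chain frames."""
    return values.get("chain_alloc_fail", 0) > 0


sample = """\
> [jocke@filserver ~]$ sysctl hw.mps.0
> hw.mps.0.debug_level: 0
> hw.mps.0.io_cmds_highwater: 959
> hw.mps.0.chain_free: 2048
> hw.mps.0.chain_free_lowwater: 1721
> hw.mps.0.chain_alloc_fail: 0
"""

counters = parse_mps_sysctl(sample)
print(counters["chain_free_lowwater"], ran_out_of_chains(counters))
```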

The timeouts could be due to all sorts of problems.  The IOC terminated
errors I'm still not sure about.  I need to get a trace and send that along
with a diagnostic ring buffer dump from the card to LSI to get some answers
about what is going on.

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG


