Date:      Thu, 20 Jan 2011 11:38:07 +0200
From:      Alexander Motin <mav@FreeBSD.org>
To:        Joachim Tingvold <joachim@tingvold.com>
Cc:        freebsd-scsi@freebsd.org, "Kenneth D. Merry" <ken@freebsd.org>
Subject:   Re: mps0-troubles
Message-ID:  <4D38027F.8080505@FreeBSD.org>
In-Reply-To: <07392102-4584-4690-9188-5202728CC7CA@tingvold.com>
References:  <mailpost.1294832739.2809102.16331.mailing.freebsd.scsi@FreeBSD.cs.nctu.edu.tw> <4D2DAA45.30602@FreeBSD.org> <B2CFC8A1-FA1D-4718-99C3-AC3430A905C2@tingvold.com> <41C64262-4300-4187-B5FD-04A5EFB7F87C@tingvold.com> <20110113203750.GA39494@nargothrond.kdm.org> <B22C5568-24D0-4530-B90A-BA6A6CAF111C@tingvold.com> <20110114001758.GA12793@nargothrond.kdm.org> <D24332F3-56AF-484C-9592-1097BF684E37@tingvold.com> <07392102-4584-4690-9188-5202728CC7CA@tingvold.com>

Joachim Tingvold wrote:
> On Fri, Jan 14, 2011, at 01:38:27AM GMT+01:00, Joachim Tingvold wrote:
>>> I don't think so.  If anything, I think that this is likely triggered by
>>> a large number of outstanding commands, or perhaps a leak somewhere.  If
>>> it's the former, hopefully this will fix it.  If it's the latter, you'll
>>> eventually run into the problem again.
>>
>> Okay. I've changed the value. I'll keep you posted.
> 
> Some more errors;
> <http://home.komsys.org/~jocke/dmesg_mps0_freebsd-scsi_3.txt>. Same as
> earlier, but without the "out of chain"-messages (probably just because
> I increased the number, so that this specific error-message isn't
> printed?).
> 
> This happened without direct user-interaction (it's a cron-job that
> copies files from 'zroot' to 'storage', and deletes some stuff on
> 'zroot' afterwards). However, it seems as if I can access the
> files/folders on 'storage' without the terminal freezing (based on the
> log-messages, I was able to pinpoint the last 2-3 copied folders, and
> those I could list/copy files from without the terminal freezing). I'm
> not sure why, though. Maybe because the error/crash occurred when no
> files were being copied/moved?

I would say it is because error recovery has now managed to do what it
should. If the problem causing the timeouts is not permanent, CAM should
recover from them correctly, and the system should keep working, just
with some delays for recovery. In any case, processes should not freeze
forever as they did: they should either work (even if slowly) or
terminate with I/O errors.

-- 
Alexander Motin


