Date:      Tue, 27 Apr 2010 14:26:14 -0600
From:      Scott Long <scottl@samsco.org>
To:        Andy Farkas <chuzzwassa@gmail.com>
Cc:        freebsd-scsi@freebsd.org
Subject:   Re: MFC of "Large set of CAM improvements" breaks I/O to Adaptec 29160 SCSI controller
Message-ID:  <76C33FA5-993A-4D23-8ECB-F0913E77A677@samsco.org>
In-Reply-To: <w2hff80e6381004271320m665ae062t8bea44c799a40cbc@mail.gmail.com>
References:  <E1O6ilc-0000GP-Q3@dilbert.ticketswitch.com> <4BD6F266.5080403@feral.com> <o2rff80e6381004271308l302a7173qe2dbcd4e4f038305@mail.gmail.com> <4BD74535.4060503@feral.com> <w2hff80e6381004271320m665ae062t8bea44c799a40cbc@mail.gmail.com>

On Apr 27, 2010, at 2:20 PM, Andy Farkas wrote:

> On Wed, Apr 28, 2010 at 6:12 AM, Matthew Jacob <mj@feral.com> wrote:
>
>> Does anything time out (eventually)?
>
> No. I left it sitting overnight and it was still deadlocked
> in the morning...
>

A couple of possible scenarios here:

1.  A command completed with an error, that error was reported up to the
periph layer, and the periph failed to properly handle it, leading to a
lost command that eventually livelocked the VM/block layer.
2.  An error happened in the transport layer, and the aic7xxx tried to
freeze the CAM queues to perform error recovery.  Something broke in the
freeze/unfreeze API, so the aic7xxx was left stranded.

The more I think about it, the more likely case 2 seems, since I know that
Alexander has been working in or near that code.

Scott



