Date:      Mon, 24 May 1999 14:53:22 +1000
From:      Stephen McKay <syssgm@detir.qld.gov.au>
To:        Warner Losh <imp@harmony.village.org>
Cc:        freebsd-scsi@FreeBSD.ORG, syssgm@detir.qld.gov.au
Subject:   Re: aha1542 brokenness, and CAM technique query 
Message-ID:  <199905240453.OAA11833@nymph.detir.qld.gov.au>
In-Reply-To: <199905240343.VAA16346@harmony.village.org> from Warner Losh at "Sun, 23 May 1999 21:43:35 -0600"
References:  <199905231225.WAA05009@nymph.detir.qld.gov.au> <199905240343.VAA16346@harmony.village.org>

On Sunday, 23rd May 1999, Warner Losh wrote:

>In message <199905231225.WAA05009@nymph.detir.qld.gov.au> Stephen McKay writes
>:
>: Big deal, you say?  Well, the true answer is that aha_cmd() is broken,
>: and there is no easy fix that I know of.  Fiddling with splcam() is not
>: sufficient.  aha_cmd() should not be called if there are any other scsi
>: commands in flight.  Otherwise, AOP_START_MBOX might be issued in the
>: *middle* of another command.  This is impossible with the 1542 since
>: there is just one shared command and parameter port, and the two colliding
>: commands fail.  Lucky that, since issuing random scsi commands is a
>: file system scrambler.
>
>Yuck.  I suspect that the buslogic will have same problem.  Also,
>Justin keeps telling me that I need to merge his changes from bt_cmd.
>One of the things it does is to check the interrupt status while
>sending commands, but it doesn't seem to try any interlocking.  I
>don't know if the buslogic boards have a similar limitation or not.

I expect that the buslogics will fail also, but haven't tried any of my
cards yet.  Justin has ported the bt_cmd changes to the aha driver, and
I'll give that a go in case my analysis is not correct.  But I don't
expect it to help, given that I have already played with the same stuff
while attempting to fix it myself.

Copying from my mail to Justin:
A mailbox command (i.e. a real SCSI command) is in flight when aha_cmd()
is called.  After the command byte has been written, but before all of
the parameter bytes are, the mailbox command completes.  That calls
aha_intr(), which calls ahadone(), which calls xpt_done(), which (as far
as I can tell) allows something else to run.  That something queues
another command, which issues another mailbox command.  Crunch!  At
least, that's what I think is happening.
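
The suspected sequence can be illustrated with a small userspace model.
This is only a sketch of the reasoning above, not driver code: every
sim_* name is hypothetical, and the stand-in for AOP_START_MBOX is
assumed to take no parameter bytes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one shared command/parameter port. */
struct sim_port {
    int  params_expected; /* parameter bytes still owed to the current command */
    bool collided;        /* set when a command byte lands mid-command */
    bool mbox_pending;    /* a mailbox command is in flight */
};

/* Write a command opcode that takes nparams parameter bytes. */
static void sim_write_cmd(struct sim_port *p, int nparams)
{
    if (p->params_expected > 0)
        p->collided = true;   /* the previous command was not finished */
    p->params_expected = nparams;
}

/* Write one parameter byte for the current command. */
static void sim_write_param(struct sim_port *p)
{
    if (p->params_expected > 0)
        p->params_expected--;
}

/*
 * Stand-in for the chain aha_intr() -> ahadone() -> xpt_done() ->
 * "something else queues a command and starts a mailbox".
 */
static void sim_completion(struct sim_port *p)
{
    p->mbox_pending = false;
    sim_write_cmd(p, 0);      /* stand-in for AOP_START_MBOX (assumed 0 params) */
}

/*
 * Stand-in for aha_cmd().  intr_after is the number of parameter bytes
 * written before the completion fires; a negative value means it never
 * fires.  Returns true if the sequence went out without a collision.
 */
static bool sim_adapter_cmd(struct sim_port *p, int nparams, int intr_after)
{
    sim_write_cmd(p, nparams);
    for (int i = 0; i < nparams; i++) {
        if (i == intr_after && p->mbox_pending)
            sim_completion(p);
        sim_write_param(p);
    }
    return !p->collided;
}
```

With nothing in flight the sequence goes out cleanly.  With a mailbox
command pending, the modelled completion lands between parameter bytes
and the port sees a second command byte in the middle of the first.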

I have thought of a way to fix this using splcam(), but I'm not sure if
everyone will accept the tradeoff.  If priority is raised before the
command byte is issued and lowered after all parameter bytes have been
sent, then the problem cannot occur.  I have not yet tested this, but
will tonight, and then the wrangling over the ugliness can begin.
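
A rough userspace sketch of that bracketing follows.  It assumes that
raising priority simply holds a completion interrupt pending until
priority is lowered again.  sim_splcam() and sim_splx() are hypothetical
stand-ins written for this illustration, not the kernel's splcam() and
splx(), and the untested caveat above applies equally here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; this is not the kernel API. */
static bool sim_masked;              /* true while "priority" is raised */
static bool sim_irq_pending;         /* completion arrived while masked */
static int  sim_bytes_written;       /* bytes sent to the port so far */
static int  sim_handler_ran_at = -1; /* bytes written when the handler ran */

/* Stand-in for the completion interrupt handler. */
static void sim_handler(void)
{
    sim_handler_ran_at = sim_bytes_written;
}

/* The hardware raises a completion interrupt. */
static void sim_raise_irq(void)
{
    if (sim_masked)
        sim_irq_pending = true;      /* held off until priority drops */
    else
        sim_handler();
}

/* Raise "priority" and return the previous level. */
static bool sim_splcam(void)
{
    bool old = sim_masked;
    sim_masked = true;
    return old;
}

/* Restore the previous level and deliver anything that was held off. */
static void sim_splx(bool old)
{
    sim_masked = old;
    if (!sim_masked && sim_irq_pending) {
        sim_irq_pending = false;
        sim_handler();
    }
}

/*
 * Send one command byte plus nparams parameter bytes with priority
 * raised throughout.  The interrupt is raised after irq_at bytes
 * (negative: never).  Returns the byte count at which the handler
 * ran, or -1 if it did not run.
 */
static int sim_send_protected(int nparams, int irq_at)
{
    sim_bytes_written = 0;
    sim_handler_ran_at = -1;
    bool s = sim_splcam();
    for (int i = 0; i < 1 + nparams; i++) {
        if (i == irq_at)
            sim_raise_irq();
        sim_bytes_written++;
    }
    sim_splx(s);
    return sim_handler_ran_at;
}
```

In this model the handler can only run once the whole sequence is out.
The cost is that completions are held off for the entire sequence, which
is presumably the tradeoff in question.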

A cleaner solution (assuming my analysis is correct) is to find some way
to tell CAM to single thread the aha driver for a while.  I don't know
anything about CAM, so I'm asking.  Is it easy?
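
For comparison, the driver-side alternative (a busy flag that holds back
new mailbox starts while an adapter command is being written) might look
roughly like the sketch below.  This is not a CAM facility, and every
sim_* name is hypothetical; it only models the bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_QMAX 8

/* Hypothetical per-adapter state for the sketch. */
struct sim_drv {
    bool cmd_busy;           /* an adapter command sequence is in progress */
    int  deferred[SIM_QMAX]; /* mailbox starts held back in the meantime */
    int  ndeferred;
    int  started[SIM_QMAX];  /* mailbox starts actually issued, in order */
    int  nstarted;
};

/*
 * Request a mailbox start for command id.  While an adapter command is
 * in progress the start is queued instead of issued.  Returns false if
 * the relevant table is full.
 */
static bool sim_start_mbox(struct sim_drv *d, int id)
{
    if (d->cmd_busy) {
        if (d->ndeferred == SIM_QMAX)
            return false;
        d->deferred[d->ndeferred++] = id;
        return true;
    }
    if (d->nstarted == SIM_QMAX)
        return false;
    d->started[d->nstarted++] = id;
    return true;
}

/* Mark the start of an adapter command sequence. */
static void sim_cmd_begin(struct sim_drv *d)
{
    d->cmd_busy = true;
}

/* Mark the end of the sequence and issue everything that was held back. */
static void sim_cmd_end(struct sim_drv *d)
{
    d->cmd_busy = false;
    for (int i = 0; i < d->ndeferred; i++)
        sim_start_mbox(d, d->deferred[i]);
    d->ndeferred = 0;
}
```

A start requested between sim_cmd_begin() and sim_cmd_end() is issued
only after the adapter command has finished, so the two never share the
port.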

>: So, either I implement some sort of mutex in the bowels of the driver,
>: or there is a simple way of telling CAM to single thread things for
>: a while.  Please, CAM experts, let me know if this is easy.  Otherwise,
>: I will do it the hard way.
>
>I suspect that Justin will have something to say about that :-)

I'm sure he will, as soon as I've convinced him that there is a real problem.

Stephen.

