Date:      Tue, 27 Sep 2016 19:18:29 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-scsi@FreeBSD.org
Subject:   [Bug 212841] getting panic during mps reinitialization.
Message-ID:  <bug-212841-5312-8npO42pAem@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-212841-5312@https.bugs.freebsd.org/bugzilla/>
References:  <bug-212841-5312@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212841

--- Comment #11 from Stephen McConnell <slm@freebsd.org> ---
The reset timing in the driver looks fine to me. There is a requirement that
the host wait a certain amount of time when it first accesses the controller
during a reset, and then a certain time to wait on checking registers, etc.
But, it looks fine.

What doesn't make sense is that you're waiting some arbitrary amount of time
after the initial failure and then it works. The time that you're waiting
comes after the reset completes and then after some calls to other functions.
After all of that, some access to the DOORBELL fails. Then, waiting 2 ms
fixes it. That's strange.

There are two ways that this will fail in Step 4 of mps_request_sync(). The
first is when reading the Interrupt Status REG. If this register does not
show an interrupt within 5 seconds, it fails (that's a really long time). The
second is when reading the DOORBELL REG. If the DOORBELL_USED bit is not set,
it fails. I can't tell which one of these fails. But when it does fail, your
fix just waits 2 ms and then retries, and then it's successful (at least
within 10 ms, i.e. 5 retries).
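
To make the two failure paths concrete, here is a minimal, simulated sketch of
a check that returns a distinct result for each one, so a caller could log
which of them actually fired. Everything in it (the register struct, the bit
values, the function and enum names) is hypothetical and only for
illustration; it is not the real mps driver code or its register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values and names; NOT the real mps(4) definitions. */
#define SIM_INT_DOORBELL   0x1u  /* interrupt status: doorbell interrupt seen */
#define SIM_DOORBELL_USED  0x2u  /* doorbell register: "used" bit */
#define SIM_TIMEOUT_MS     5000  /* the 5 second budget mentioned above */

enum step4_result {
	STEP4_OK,
	STEP4_INT_TIMEOUT,        /* failure 1: no interrupt within the budget */
	STEP4_DOORBELL_NOT_USED   /* failure 2: DOORBELL_USED bit not set */
};

/* Simulated hardware state standing in for the real registers. */
struct sim_regs {
	uint32_t doorbell;
	int int_ready_after_ms;   /* when the interrupt shows up; < 0 means never */
};

static uint32_t
sim_read_int_status(const struct sim_regs *r, int elapsed_ms)
{
	if (r->int_ready_after_ms >= 0 && elapsed_ms >= r->int_ready_after_ms)
		return (SIM_INT_DOORBELL);
	return (0);
}

static enum step4_result
step4_check(const struct sim_regs *r)
{
	bool seen = false;
	int ms;

	/* Poll the (simulated) interrupt status once per millisecond. */
	for (ms = 0; ms <= SIM_TIMEOUT_MS; ms++) {
		if (sim_read_int_status(r, ms) & SIM_INT_DOORBELL) {
			seen = true;
			break;
		}
	}
	if (!seen)
		return (STEP4_INT_TIMEOUT);

	/* The interrupt arrived; now the doorbell must report "used". */
	if ((r->doorbell & SIM_DOORBELL_USED) == 0)
		return (STEP4_DOORBELL_NOT_USED);

	return (STEP4_OK);
}
```

Logging the result of a check like this on the first failed attempt would
show whether the retry is papering over a missed interrupt or a doorbell that
was not yet marked as used.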

What I'm wondering is, does it really matter that you have a delay between
mps_request_sync() calls? To me, it looks like something is messed up in FW
and just doing a retry fixes it.

Now, with all of that said, I'm not sure there really is a better fix, except
that the delay may not need to be there. Having the delay there would make
someone think that we're just not waiting long enough, which really is not
the case. It also looks a little scary: someone could think the driver timing
for this is very fragile, when it's really not.

Sean, let me know what you think about removing the delay. If you want the
delay, I would at least say to add a comment that explains the delay and
retry, since none of this is really supposed to happen and I think it's some
FW or HW workaround.
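
For discussion, here is a rough, self-contained sketch of what a commented
retry could look like, with the delay made optional so the two variants are
easy to compare. The wrapper, the fake hardware struct and the constants are
all made up for illustration and are not the code from the patch; treating
"5 retries" as at most 5 attempts is also an assumption.

```c
#include <stddef.h>

#define SKETCH_MAX_ATTEMPTS    5  /* assumption: "5 retries" = 5 attempts */
#define SKETCH_RETRY_DELAY_MS  2  /* the 2 ms wait discussed above */

typedef int  (*request_fn)(void *ctx);        /* returns 0 on success */
typedef void (*delay_fn)(int ms, void *ctx);

/* Fake hardware: fails a fixed number of times, then succeeds. */
struct fake_hw {
	int failures_left;
	int calls;
	int delayed_ms;
};

static int
fake_request(void *ctx)
{
	struct fake_hw *hw = ctx;

	hw->calls++;
	if (hw->failures_left > 0) {
		hw->failures_left--;
		return (1);
	}
	return (0);
}

static void
fake_delay(int ms, void *ctx)
{
	struct fake_hw *hw = ctx;

	hw->delayed_ms += ms;  /* record the wait instead of sleeping */
}

static int
request_with_retry(request_fn req, delay_fn delay, void *ctx, int delay_ms)
{
	int attempt, error = 1;

	for (attempt = 0; attempt < SKETCH_MAX_ATTEMPTS; attempt++) {
		error = req(ctx);
		if (error == 0)
			break;
		/*
		 * The request is not expected to fail here; the reset timing
		 * is believed to be correct.  The retry (and the optional
		 * delay) is a workaround for a suspected FW or HW problem,
		 * not a sign that the driver is not waiting long enough.
		 */
		if (delay != NULL && delay_ms > 0)
			delay(delay_ms, ctx);
	}
	return (error);
}
```

Calling request_with_retry() with delay_ms set to 0 gives the retry-only
variant, which is an easy way to test whether the delay matters at all.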

-- 
You are receiving this mail because:
You are on the CC list for the bug.


