From owner-freebsd-scsi@freebsd.org Tue Sep 27 19:18:29 2016 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D9C0DBEC16B for ; Tue, 27 Sep 2016 19:18:29 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A29099E0 for ; Tue, 27 Sep 2016 19:18:29 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id u8RJITr9077061 for ; Tue, 27 Sep 2016 19:18:29 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-scsi@FreeBSD.org Subject: [Bug 212841] getting panic during mps reinitialization. Date: Tue, 27 Sep 2016 19:18:29 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 9.2-RELEASE X-Bugzilla-Keywords: patch X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: slm@freebsd.org X-Bugzilla-Status: New X-Bugzilla-Resolution: X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-bugs@FreeBSD.org X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Sep 2016 19:18:30 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D212841 --- Comment #11 from Stephen McConnell --- The reset timing in the driver looks fine to me. There is a requirement that the host wait a certain amount of time when it first accesses the controller during a reset, and then a certain time to wait on checking registers, etc. But, it looks fine. What doesn't make sense is that you're waiting some arbitrary amount of time after the initial failure and then it works. This time that your waiting is after the reset completes and then after some calls to other functions. Aft= er all of that, some access to the DOORBELL fails. Then, waiting 2 mSecs fixes= it. That's strange. There are two ways that this will fail in Step 4 of mps_request_sync(). The first is when reading the Interrupt Status REG. If this Register does not s= how an interrupt within 5 seconds, it fails (that's a really long time). The se= cond is when reading the DOORBELL REG. If the DOORBELL_USED bit is not set, it fails. I can't tell which one of these fails. But, because it fails your fix will just wait 2 mSecs and then retry, then it's successful (at least withi= n 10 mSecs - 5 retries). What I'm wondering is, does it really matter that you have a delay between mps_request_sync() calls? To me, it looks like something is messed up in FW= and just doing a retry fixes it. Now, with all of that said, I'm not sure there really is a better fix except that the delay may not need to be there. Having the delay there would make someone think that we're just not waiting long enough, which really is not = the case and looks a little scary, meaning someone could think the driver timing for this is very fragile, when it's really not. Sean, let me know what you think about removing the delay. If you want the delay, I would at least say to add a comment that explains the delay and re= try, since none of this is really supposed to happen and I think it's some FW or= HW workaround. --=20 You are receiving this mail because: You are on the CC list for the bug.=