Date:      Mon, 14 Nov 2016 13:26:30 +0200
From:      Andriy Gapon <avg@FreeBSD.org>
To:        gljennjohn@gmail.com
Cc:        FreeBSD Current <freebsd-current@FreeBSD.org>, FreeBSD Hackers <freebsd-hackers@FreeBSD.org>
Subject:   Re: firewire panic
Message-ID:  <372e33d1-0304-51f9-8f8e-94d3ca36420c@FreeBSD.org>
In-Reply-To: <20161114105807.02dfbe66@ernst.home>
References:  <91a1440d-14c7-2cc6-6cbb-2b62bfd2c27d@FreeBSD.org> <f64e4840-7b80-1ead-e4d1-f64f378775a6@FreeBSD.org> <20161114105807.02dfbe66@ernst.home>

On 14/11/2016 11:58, Gary Jennejohn wrote:
> On Sun, 13 Nov 2016 23:56:09 +0200
> Andriy Gapon <avg@FreeBSD.org> wrote:
> 
>> On 11/11/2016 14:25, Andriy Gapon wrote:
>>> panic: mutex sbp not owned at /usr/src/sys/dev/firewire/sbp.c:967
>>> cpuid = 2
>>> curthread: 0xfffff8000ada5000
>>> stack: 0xfffffe0504ded000 - 0xfffffe0504df1000
>>> stack pointer: 0xfffffe0504df0a00
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at 0xffffffff80420bbb = db_trace_self_wrapper+0x2b/frame 0xfffffe0504df0930
>>> kdb_backtrace() at 0xffffffff80670359 = kdb_backtrace+0x39/frame 0xfffffe0504df09e0
>>> vpanic() at 0xffffffff8063986c = vpanic+0x14c/frame 0xfffffe0504df0a20
>>> panic() at 0xffffffff806395b3 = panic+0x43/frame 0xfffffe0504df0a80
>>> __mtx_assert() at 0xffffffff8061c40d = __mtx_assert+0xed/frame 0xfffffe0504df0ac0
>>> sbp_cam_scan_lun() at 0xffffffff80474667 = sbp_cam_scan_lun+0x37/frame 0xfffffe0504df0af0
>>> xpt_done_process() at 0xffffffff802aacfa = xpt_done_process+0x2da/frame 0xfffffe0504df0b30
>>> xpt_done_td() at 0xffffffff802ac2e5 = xpt_done_td+0xd5/frame 0xfffffe0504df0b80
>>
>> So, it's pretty obvious that the sbp mutex can not be held when
>> sbp_cam_scan_lun() is called.
>>
> 
> The code seems to assume that the scan_callout callout is still
> holding the mutex when sbp_cam_scan_lun() is entered.
> 
> Seems reasonable, since the man page claims that the callout routine
> keeps the mutex locked until the callout function, in this case that's
> sbp_cam_scan_target(), returns.  Since sbp_cam_scan_target() invokes
> xpt_action() with sbp_cam_scan_lun() as its callback, it seems like
> the assumption should be true.

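The wrong assumption in your reasoning is that the callback is executed in the
same thread as the callout function.  The backtrace shows sbp_cam_scan_lun()
being called via xpt_done_process() from xpt_done_td(), not from the callout
function, so the sbp mutex is not owned by the calling thread at that point.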
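
To illustrate the point outside the kernel, here is a minimal userland sketch
using pthreads.  It is not the sbp or CAM code: sbp_lock(), sbp_owned(),
scan_lun_cb(), done_thread() and run_demo() are hypothetical names made up for
this example, and the per-thread flag only imitates the ownership check that
mtx_assert() performs.  The callback is invoked once directly by the thread
that holds the lock and once from a separate "done" thread while the first
thread still holds it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the sbp mutex, plus a per-thread ownership flag. */
static pthread_mutex_t sbp_mtx = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool sbp_mtx_held;

static void sbp_lock(void)
{
	pthread_mutex_lock(&sbp_mtx);
	sbp_mtx_held = true;
}

static void sbp_unlock(void)
{
	assert(sbp_mtx_held);	/* only the owning thread may unlock */
	sbp_mtx_held = false;
	pthread_mutex_unlock(&sbp_mtx);
}

/* Ownership is a property of the calling thread, not of the call chain. */
static bool sbp_owned(void)
{
	return (sbp_mtx_held);
}

/* Completion callback: records whether the calling thread owns the mutex. */
static void scan_lun_cb(bool *owned_out)
{
	*owned_out = sbp_owned();
}

struct request {
	void (*cb)(bool *);
	bool owned;
};

/* Imitates a completion thread that runs callbacks on behalf of others. */
static void *done_thread(void *arg)
{
	struct request *req = arg;

	req->cb(&req->owned);
	return (NULL);
}

struct demo_result {
	bool direct_call_owned;	/* callback called by the lock holder */
	bool done_thread_owned;	/* callback called from the done thread */
	int error;		/* pthread_create() result */
};

static struct demo_result run_demo(void)
{
	struct demo_result res = { false, false, 0 };
	struct request req = { scan_lun_cb, false };
	pthread_t td;

	sbp_lock();
	/* Same thread: the callback sees the mutex as owned. */
	scan_lun_cb(&res.direct_call_owned);
	/* Different thread: the lock is still held here, but not by that thread. */
	res.error = pthread_create(&td, NULL, done_thread, &req);
	if (res.error == 0) {
		pthread_join(td, NULL);
		res.done_thread_owned = req.owned;
	}
	sbp_unlock();
	return (res);
}
```

If run_demo() is called from a main() and the file is built with -pthread,
direct_call_owned should come back true and done_thread_owned false, even
though the first thread has not released the lock yet.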

> Perhaps there's some asynchronous action happening with the
> firewire code which is releasing the mutex prematurely.
> 
> Or maybe the sbp used in sbp_cam_scan_lun() is wrong?  Dunno.


-- 
Andriy Gapon


