Date:      Wed, 24 Apr 2002 13:30:01 -0400 (EDT)
From:      John Baldwin <jhb@FreeBSD.org>
To:        Matthew Jacob <mjacob@feral.com>
Cc:        hackers@FreeBSD.org
Subject:   RE: mutex owned stuff fallible?
Message-ID:  <XFMail.20020424133001.jhb@FreeBSD.org>
In-Reply-To: <Pine.BSF.4.21.0204240912560.60421-100000@beppo>

On 24-Apr-2002 Matthew Jacob wrote:
> 
> 
> On Wed, 24 Apr 2002, John Baldwin wrote:
> 
>> 
>> On 24-Apr-2002 Matthew Jacob wrote:
>> > 
>> > This is a recent i386 SMP kernel:
>> > 
>> > 
>> > panic: mutex isp not owned at ../../../kern/kern_synch.c:449
>> > cpuid = 0; lapic.id = 00000000
>> > Debugger("panic")
>> > Stopped at      Debugger+0x41:  xorl    %eax,%eax
>> > db>
>> > db> t
>> > Debugger(c031189a) at Debugger+0x41
>> > panic(c0310ae8,c030470d,c0312018,1c1,d2d08438) at panic+0xd8
>> > _mtx_assert(d2d0843c,9,c0312018,1c1,69) at _mtx_assert+0x59
>> > msleep(d2d08438,d2d0843c,4c,c0301260,7d0) at msleep+0x157
>> > isp_mboxcmd(d2d08400,d2d19c04,f,d07dee8,0) at isp_mboxcmd+0x19c
>> > isp_fw_state(d2d08400,d2d19c54,d2d08400,d2d09000,d2d08400) at
>> > isp_fw_state+0x2b
>> > isp_fclink_test(d2d08400,1e8480,d2d08400,d2d09000,d2d0843c) at
>> > isp_fclink_test+0x5d
>> > isp_control(d2d08400,4,d2d19d18) at isp_control+0x28b
>> > isp_kthread(d2d08400,d2d19d48,d2d02a3c,c017b25c,0) at isp_kthread+0x6d
>> > fork_exit(c017b25c,d2d08400,d2d19d48) at fork_exit+0x88
>> > fork_trampoline() at fork_trampoline+0x37
>> 
>> Is this code that is checked into the tree?
> 
> Yes.
> 
>>  If so I can't see where
>> isp_kthread() calls isp_control().
> 
> isp_fc_runstate is an inline that calls isp_control.

Ah, ok.

>>  mtx_owned() should always work.  If
>> we own the lock then we were the last to write to it, so the value in our
>> cache can't be stale (at least, not the thread value, the contested bit
>> could be set by another CPU, but we mask off that bit when reading the
>> owner, so its value doesn't matter).  If we don't own the lock, its
>> value may be stale, but we don't care so long as we don't get a false
>> positive.  Since we would have to write out the unowned cookie before
>> another thread could grab it though, we would at least have a value that is
>> up to date, so we wouldn't read a stale value that had us owning the lock
>> when we didn't.
> 
> This pp is hard to parse, but I think we're in agreement that this occurrence
> is 'inconceivable'.

Yes.
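
To make the masking argument concrete, here is a minimal sketch of the
ownership test in plain userland C.  It is a model, not the kernel's code:
model_mtx_owned() is a made-up name, the thread pointer is passed in as a
plain integer, and I am assuming the recursed and contested flags are 0x1 and
0x2 (MTX_UNOWNED being 0x4 is from the discussion further down).

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits in the low bits of the lock word (0x1 and 0x2 are assumed). */
#define MTX_RECURSED	0x1u
#define MTX_CONTESTED	0x2u
#define MTX_UNOWNED	0x4u	/* "nobody owns this" cookie */

/*
 * Hypothetical stand-in for mtx_owned(): mask off the flag bits another
 * CPU may set behind our back, then compare what is left with curthread.
 */
static int
model_mtx_owned(uintptr_t lock_word, uintptr_t curthread)
{
	uintptr_t owner;

	owner = lock_word & ~(uintptr_t)(MTX_RECURSED | MTX_CONTESTED);
	return (owner == curthread);
}
```

With a made-up, aligned thread pointer td, both model_mtx_owned(td, td) and
model_mtx_owned(td | MTX_CONTESTED, td) are true, while
model_mtx_owned(MTX_UNOWNED, td) is false, which is the "no false positive"
property argued for above.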

> I am *very* puzzled.

Me, too.  The next time this happens, try dumping the contents of the mutex
structure from ddb.  The first argument to mtx_assert() and the second
argument to msleep() are both the pointer to the mutex (d2d0843c in your
trace), so you have the address.  (The pointer looks right, since the name was
right in the panic message at least.)  The first part of the structure is a
struct lock_object, which contains 3 pointers, an int, and then 2 more
pointers.  The next word is the actual lock contents.  You can use 'show pcpu'
to get the per-CPU information containing (among other things) curthread.  The
value of the lock should be curthread (possibly with bits 1 or 2 set).  If it
is 0x4 (MTX_UNOWNED), it means the lock was released somehow.  If that is the
case, you can compile KTR into your kernel with lock tracing using:

options         KTR
options         KTR_COMPILE=KTR_LOCK
options         KTR_MASK=KTR_LOCK

Then when it breaks do a 'show ktr' in ddb to get a trace of the most recent
lock operations.  You might want to turn on KTR_PROC as well
(s/KTR_LOCK/(KTR_LOCK|KTR_PROC)/ above) so that you see when we switch
processes so it is less confusing.  This info might be useful to look at
anyways.
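
Spelled out, the substitution above would give something like the following.
This is just the s/// applied by hand to the three option lines already shown;
nothing else is assumed.

```
options         KTR
options         KTR_COMPILE=(KTR_LOCK|KTR_PROC)
options         KTR_MASK=(KTR_LOCK|KTR_PROC)
```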

Hmm, I wonder if the mutex is recursed and mtx_assert() isn't printing the
right error message?  Hmm, nope.
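
Going back to the lock word check described earlier, the arithmetic can be
sketched in userland C.  This is only a model of the i386 layout as I
described it (3 pointers, an int, 2 more pointers, then the lock word, all 32
bits wide); the struct and field names are made up, and the real struct
lock_object in your tree may differ, so check the header before trusting the
offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical i386 model: every pointer and int is 32 bits wide. */
struct model_lock_object {
	uint32_t ptr0, ptr1, ptr2;	/* 3 pointers */
	uint32_t an_int;		/* an int */
	uint32_t ptr3, ptr4;		/* 2 more pointers */
};

struct model_mtx {
	struct model_lock_object obj;
	uint32_t lock_word;		/* the actual lock contents */
};

enum lock_state { LOCK_UNOWNED, LOCK_OWNED_BY_US, LOCK_OWNED_BY_OTHER };

/* Address of the lock word, given the mutex address from the backtrace. */
static uint32_t
lock_word_addr(uint32_t mtx_addr)
{
	return (mtx_addr + (uint32_t)offsetof(struct model_mtx, lock_word));
}

/* Interpret the dumped lock word against curthread from 'show pcpu'. */
static enum lock_state
classify(uint32_t lock_word, uint32_t curthread)
{
	if (lock_word == 0x4u)			/* MTX_UNOWNED */
		return (LOCK_UNOWNED);
	if ((lock_word & ~0x3u) == curthread)	/* ignore the two flag bits */
		return (LOCK_OWNED_BY_US);
	return (LOCK_OWNED_BY_OTHER);
}
```

Under those assumptions the lock word sits 0x18 bytes into the mutex, so for
the mutex at d2d0843c it would be at d2d08454, and something like
'x/x 0xd2d08454' in ddb should show it.  LOCK_UNOWNED means the lock was
released somehow; LOCK_OWNED_BY_OTHER would be a different and stranger
problem.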

-- 

John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/