From owner-freebsd-hackers Wed Apr 24 10:31:49 2002
Date: Wed, 24 Apr 2002 13:30:01 -0400 (EDT)
From: John Baldwin
To: Matthew Jacob
Subject: RE: mutex owned stuff fallible?
Cc: hackers@FreeBSD.org

On 24-Apr-2002 Matthew Jacob wrote:
>
> On Wed, 24 Apr 2002, John Baldwin wrote:
>
>>
>> On 24-Apr-2002 Matthew Jacob wrote:
>> >
>> > This is a recent i386 SMP kernel:
>> >
>> > panic: mutex isp not owned at ../../../kern/kern_synch.c:449
>> > cpuid = 0; lapic.id = 00000000
>> > Debugger("panic")
>> > Stopped at      Debugger+0x41:  xorl    %eax,%eax
>> > db>
>> > db> t
>> > Debugger(c031189a) at Debugger+0x41
>> > panic(c0310ae8,c030470d,c0312018,1c1,d2d08438) at panic+0xd8
>> > _mtx_assert(d2d0843c,9,c0312018,1c1,69) at _mtx_assert+0x59
>> > msleep(d2d08438,d2d0843c,4c,c0301260,7d0) at msleep+0x157
>> > isp_mboxcmd(d2d08400,d2d19c04,f,d07dee8,0) at isp_mboxcmd+0x19c
>> > isp_fw_state(d2d08400,d2d19c54,d2d08400,d2d09000,d2d08400) at isp_fw_state+0x2b
>> > isp_fclink_test(d2d08400,1e8480,d2d08400,d2d09000,d2d0843c) at isp_fclink_test+0x5d
>> > isp_control(d2d08400,4,d2d19d18) at isp_control+0x28b
>> > isp_kthread(d2d08400,d2d19d48,d2d02a3c,c017b25c,0) at isp_kthread+0x6d
>> > fork_exit(c017b25c,d2d08400,d2d19d48) at fork_exit+0x88
>> > fork_trampoline() at fork_trampoline+0x37
>>
>> Is this code that is checked into the tree?
>
> Yes.
>
>> If so I can't see where
>> isp_kthread() calls isp_control().
>
> isp_fc_runstate is an inline that calls isp_control.

Ah, ok.

>> mtx_owned() should always work.  If
>> we own the lock then we were the last to write to it, so the value in our
>> cache can't be stale (at least, not the thread value, the contested bit
>> could be set by another CPU, but we mask off that bit when reading the
>> owner, so it's value doesn't matter).  If we don't own the lock, it's
>> value but we don't care so long as we don't get a false positive.
>> Since we would have to write out the unowned cookie before another lock
>> could grab it though, we would at least have a value that up to date, so
>> we wouldn't read a stale value that had us owning the lock when we didn't.
>
> This pp is hard to parse, but I think we're in agreement that this occurrence
> is 'inconceivable'.

Yes.

> I am *very* puzzled.

Me, too.  The next time this happens, try dumping the contents of the mutex
structure from ddb.  The first argument to mtx_assert() and the second
argument to msleep() are both the pointer to the mutex (d2d0843c in your
trace), so you have the address.  (The pointer looks right, since the name
in the panic message was right at least.)  The structure starts with a
struct lock_object, which contains 3 pointers, an int, and then 2 more
pointers.  The next word is the actual lock contents; by that count it sits
24 bytes (0x18) into the structure on i386.  You can use 'show pcpu' to get
the per-CPU information, which contains (among other things) curthread.  The
value of the lock should be curthread (possibly with the low flag bits 0x1
or 0x2 set).  If it is 0x4 (MTX_UNOWNED), the lock was released somehow.
In that case, you can compile KTR into your kernel with lock tracing using:

	options KTR
	options KTR_COMPILE=KTR_LOCK
	options KTR_MASK=KTR_LOCK

Then when it breaks, do a 'show ktr' in ddb to get a trace of the most
recent lock operations.  You might want to turn on KTR_PROC as well
(s/KTR_LOCK/(KTR_LOCK|KTR_PROC)/ above) so that you can see when we switch
processes, which makes the trace less confusing.  This info might be useful
to look at anyway.

Hmm, I wonder if the mutex is recursed and mtx_assert() isn't printing the
right error message?  Hmm, nope.

-- 

John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
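As a rough illustration of the check described in the message, here is a
userland C sketch that decodes a lock word by hand: mask off the flag bits,
then compare what is left against curthread or the unowned cookie.  It is
not kernel code.  The 0x4 value for MTX_UNOWNED comes from the message; the
names MTX_RECURSED and MTX_CONTESTED for the 0x1 and 0x2 bits are assumed
from sys/mutex.h of that era, and decode_lock_word(), lock_owner(),
lock_word_addr() and the enum are hypothetical helpers invented for this
example.  The 24-byte offset follows only from the "3 pointers, an int,
2 more pointers" description on a 32-bit machine.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits kept in the low bits of the lock word (names assumed). */
#define MTX_RECURSED	((uintptr_t)0x1)	/* owner has recursed on the lock */
#define MTX_CONTESTED	((uintptr_t)0x2)	/* another thread is waiting */
#define MTX_UNOWNED	((uintptr_t)0x4)	/* cookie for a free lock (from the message) */
#define MTX_FLAGMASK	(~(MTX_RECURSED | MTX_CONTESTED))

/* Assumed i386 layout: 3 pointers + 1 int + 2 pointers, 4 bytes each. */
#define LOCK_OBJECT_SIZE_I386	(6u * 4u)

enum lock_state { LOCK_UNOWNED, LOCK_OWNED_BY_US, LOCK_OWNED_BY_OTHER };

/* Strip the flag bits and return the owner cookie. */
static inline uintptr_t
lock_owner(uintptr_t lock_word)
{
	return (lock_word & MTX_FLAGMASK);
}

/* Classify a lock word read from memory against the current thread pointer. */
static inline enum lock_state
decode_lock_word(uintptr_t lock_word, uintptr_t curthread)
{
	uintptr_t owner;

	/* Thread pointers must leave the two flag bits clear. */
	assert((curthread & ~MTX_FLAGMASK) == 0);

	owner = lock_owner(lock_word);
	if (owner == MTX_UNOWNED)
		return (LOCK_UNOWNED);
	if (owner == curthread)
		return (LOCK_OWNED_BY_US);
	return (LOCK_OWNED_BY_OTHER);
}

/* Address of the lock word given the address of the mutex (32-bit only). */
static inline uint32_t
lock_word_addr(uint32_t mtx_addr)
{
	return (mtx_addr + LOCK_OBJECT_SIZE_I386);
}
```

With the mutex address from the trace, lock_word_addr(0xd2d0843c) gives
0xd2d08454 as the word to examine.  A result of LOCK_UNOWNED for that word
would correspond to the "released somehow" case, where the KTR trace is the
next thing to look at.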