Date: Wed, 5 Jul 2006 19:40:22 GMT
From: John Baldwin <jhb@freebsd.org>
To: freebsd-bugs@FreeBSD.org
Subject: Re: kern/99094: panic: sleeping thread (Sleeping thread ... owns a non-sleepable lock)
Message-ID: <200607051940.k65JeM6N013120@freefall.freebsd.org>
The following reply was made to PR kern/99094; it has been noted by GNATS.

From: John Baldwin <jhb@freebsd.org>
To: Eirik Øverby <ltning@anduin.net>
Cc: bug-followup@freebsd.org, des@freebsd.org
Subject: Re: kern/99094: panic: sleeping thread (Sleeping thread ... owns a non-sleepable lock)
Date: Wed, 5 Jul 2006 14:25:41 -0400

On Saturday 01 July 2006 08:04, Eirik Øverby wrote:
> Hi again,
>
> I now have WITNESS and INVARIANTS in the kernel, and today it hung
> again. It looks somewhat different than before, but I am fairly
> certain it's the same error.
>
> Below you'll find the panic message, a bt, a ps, and then the output
> of a "c", which is exactly the same as the first message except it's
> not chopped off due to terminal size, and finally the panic resulting
> from the boot() call.
>
> /Eirik
>
> malloc(M_WAITOK) of "1024", forcing M_NOWAIT with the following
> non-sleepable locks held:
> exclusive sleep mutex vm object (standard object) r = 0
> (0xffffff0018f3fe00) locked @ /usr/src/sys/compat/linprocfs/lin9
> KDB: enter: witness_warn
> [thread pid 77487 tid 100323 ]
> Stopped at kdb_enter+0x2f: nop
> db>
>
>
> db> bt
> Tracing pid 77487 tid 100323 td 0xffffff00531794c0
> kdb_enter() at kdb_enter+0x2f
> witness_warn() at witness_warn+0x2e0
> uma_zalloc_arg() at uma_zalloc_arg+0x1ee
> malloc() at malloc+0xab
> vn_fullpath() at vn_fullpath+0x56
> linprocfs_doprocmaps() at linprocfs_doprocmaps+0x31e

Well, the problem is in linprocfs. It is trying to do some very expensive
things while holding a mutex.
Here's the code excerpt:

        if (lobj) {
                vp = lobj->handle;
                VM_OBJECT_LOCK(lobj);
                off = IDX_TO_OFF(lobj->size);
                if (lobj->type == OBJT_VNODE && lobj->handle) {
                        vn_fullpath(td, vp, &name, &freename);
                        VOP_GETATTR(vp, &vat, td->td_ucred, td);
                        ino = vat.va_fileid;
                }
                flags = obj->flags;
                ref_count = obj->ref_count;
                shadow_count = obj->shadow_count;
                VM_OBJECT_UNLOCK(lobj);

The VM_OBJECT_LOCK() is a mutex, and it can't really hold a mutex while
calling things like vn_fullpath() and VOP_GETATTR() as those can block, etc.
It needs to probably be reordered to grab copies of the object fields under
the object lock, take a ref on the vnode (via vref) then do the vn_fullpath()
and VOP_GETATTR() after dropping the vm object lock and finally do a vrele()
to drop the vnode reference. I'm cc'ing des@ as he's the linprocfs
maintainer and should be able to help with this further.

-- 
John Baldwin
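The reordering described above (copy the fields under the lock, take a
reference, drop the lock, do the blocking work, release the reference) can be
sketched in user-space C. This is only an analogy built on pthreads, not
kernel code and not the eventual linprocfs fix: struct file_node,
struct mem_object, struct map_info, node_ref(), node_rele(), node_query() and
describe_object() are all hypothetical names standing in for the vnode, the
VM object, vref(), vrele() and the vn_fullpath()/VOP_GETATTR() pair.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a vnode: reference counted. */
struct file_node {
	pthread_mutex_t ref_lock;
	int refs;
	const char *path;
	long fileid;
};

/* Hypothetical stand-in for a VM object protected by a mutex. */
struct mem_object {
	pthread_mutex_t lock;
	struct file_node *handle;
	int flags;
	int ref_count;
	int shadow_count;
};

/* Snapshot handed back to the caller. */
struct map_info {
	int flags;
	int ref_count;
	int shadow_count;
	long ino;
	char path[128];
};

/* Plays the role of vref(): short and non-blocking. */
static void
node_ref(struct file_node *n)
{
	pthread_mutex_lock(&n->ref_lock);
	n->refs++;
	pthread_mutex_unlock(&n->ref_lock);
}

/* Plays the role of vrele(). */
static void
node_rele(struct file_node *n)
{
	pthread_mutex_lock(&n->ref_lock);
	assert(n->refs > 0);
	n->refs--;
	pthread_mutex_unlock(&n->ref_lock);
}

/*
 * Plays the role of vn_fullpath() and VOP_GETATTR(): assumed to be able
 * to block, so it must only be called with no mutex held.
 */
static void
node_query(struct file_node *n, struct map_info *out)
{
	snprintf(out->path, sizeof(out->path), "%s", n->path);
	out->ino = n->fileid;
}

static void
describe_object(struct mem_object *obj, struct map_info *out)
{
	struct file_node *n;

	memset(out, 0, sizeof(*out));

	/* Step 1: copy the fields and pin the node under the object lock. */
	pthread_mutex_lock(&obj->lock);
	out->flags = obj->flags;
	out->ref_count = obj->ref_count;
	out->shadow_count = obj->shadow_count;
	n = obj->handle;
	if (n != NULL)
		node_ref(n);
	pthread_mutex_unlock(&obj->lock);

	/* Step 2: do the potentially blocking work with the lock dropped. */
	if (n != NULL) {
		node_query(n, out);
		/* Step 3: drop the reference taken in step 1. */
		node_rele(n);
	}
}
```

The reference taken in step 1 is what keeps the node valid once the object
lock has been released; without it, nothing would stop the node from going
away while node_query() is running.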