Date:      Sat, 28 Apr 2007 14:55:03 +0300
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Stefan Ehmann <shoesoft@gmx.net>
Cc:        freebsd-current@freebsd.org, des@freebsd.org
Subject:   Re: strace causes panic: sleeping thread
Message-ID:  <20070428115503.GM2441@deviant.kiev.zoral.com.ua>
In-Reply-To: <200704281128.44077.shoesoft@gmx.net>
References:  <200704281128.44077.shoesoft@gmx.net>

On Sat, Apr 28, 2007 at 11:28:43AM +0200, Stefan Ehmann wrote:
> I see this on freshly build CURRENT (i386):
>
> Using strace causes an immediate panic, e.g. strace echo foo:
>
> [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so:
> Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd".
>
> Unread portion of the kernel message buffer:
> Sleeping thread (tid 100078, pid 896) owns a non-sleepable lock
> sched_switch(c2da4a20,0,1) at sched_switch+0xff
> mi_switch(1,0) at mi_switch+0x1d4
> sleepq_switch(c2f7e168) at sleepq_switch+0x8b
> sleepq_wait_sig(c2f7e168,c2da4a20,5c,0,100,...) at sleepq_wait_sig+0x1d
> _sleep(c2f7e168,c2f7e060,15c,c0759778,0,...) at _sleep+0x26e
> procfs_ioctl(c2da4a20,c2f7e000,c2b27200,40147004,c2ca3b80) at procfs_ioctl+0x1f5
> pfs_ioctl(d4e18b84) at pfs_ioctl+0x60
> VOP_IOCTL_APV(c0794f60,d4e18b84) at VOP_IOCTL_APV+0x38
> vn_ioctl(c2ba2480,40147004,c2ca3b80,c2f60180,c2da4a20) at vn_ioctl+0x18d
> kern_ioctl(c2da4a20,3,40147004,c2ca3b80) at kern_ioctl+0x282
> ioctl(c2da4a20,d4e18d00) at ioctl+0xf1
> syscall(d4e18d38) at syscall+0x2a2
> Xint0x80_syscall() at Xint0x80_syscall+0x20
> --- syscall (54, FreeBSD ELF32, ioctl), eip = 0x2816474f, esp = 0xbfbfe38c,
> ebp = 0xbfbfe418 ---
> panic: sleeping thread
> cpuid = 0
> KDB: enter: panic
> Uptime: 21s
> Physical memory: 470 MB
> Dumping 42 MB: 27 11
>
> #0  doadump () at pcpu.h:172
> 172	pcpu.h: No such file or directory.
> 	in pcpu.h
> (kgdb) bt
> #0  doadump () at pcpu.h:172
> #1  0xc055bfe1 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #2  0xc055c323 in panic (fmt=0xc07619aa "sleeping thread")
>     at /usr/src/sys/kern/kern_shutdown.c:563
> #3  0xc0584384 in propagate_priority (td=0xc2da4a20)
>     at /usr/src/sys/kern/subr_turnstile.c:205
> #4  0xc0584d90 in turnstile_wait (lock=0xc2f7e060, owner=0xc2da4a20, queue=0)
>     at /usr/src/sys/kern/subr_turnstile.c:682
> #5  0xc0552d87 in _mtx_lock_sleep (m=0xc2f7e060, tid=3269086160, opts=0,
>     file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:415
> #6  0xc0567658 in thread_suspend_check (return_instead=0)
>     at /usr/src/sys/kern/kern_thread.c:830
> #7  0xc0583d57 in userret (td=0xc2da4bd0, frame=0xd4e1bd38)
>     at /usr/src/sys/kern/subr_trap.c:112
> #8  0xc070d0c5 in syscall (frame=0xd4e1bd38)
>     at /usr/src/sys/i386/i386/trap.c:1078
> #9  0xc06f60c0 in Xint0x80_syscall ()
>     at /usr/src/sys/i386/i386/exception.s:196
> #10 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)

You get this particular panic because your kernel is built without
INVARIANTS. With INVARIANTS you would get the "recursed on non-recursive
mutex" panic instead.

pfs_ioctl() locks the process that is the target of the ioctl
(pfs_ioctl()->pfs_visible()->pfind()), and then pseudofs_ioctl() does
PROC_LOCK() on it again.
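
The double locking described above can be modelled in userspace. What
follows is a minimal single-threaded sketch, not kernel code: toy_mtx,
toy_proc, toy_handler() and call_handler_locked() are hypothetical names,
and returning EDEADLK merely stands in for the recursion check that an
INVARIANTS kernel would turn into a panic.

```c
#include <assert.h>
#include <errno.h>

/* Toy non-recursive mutex: single-threaded model, only tracks ownership. */
struct toy_mtx {
	int locked;
};

/* Hypothetical stand-in for struct proc; only the mutex matters here. */
struct toy_proc {
	struct toy_mtx p_mtx;
};

static int
toy_lock(struct toy_mtx *m)
{
	if (m->locked)
		return (EDEADLK);	/* recursion on a non-recursive mutex */
	m->locked = 1;
	return (0);
}

static void
toy_unlock(struct toy_mtx *m)
{
	assert(m->locked);
	m->locked = 0;
}

/* Plays the role of the ioctl handler: it takes the process lock itself. */
static int
toy_handler(struct toy_proc *p)
{
	int error;

	error = toy_lock(&p->p_mtx);
	if (error != 0)
		return (error);
	/* ... the real work would happen here ... */
	toy_unlock(&p->p_mtx);
	return (0);
}

/* Plays the role of a caller that invokes the handler with the lock held. */
static int
call_handler_locked(struct toy_proc *p)
{
	int error;

	error = toy_lock(&p->p_mtx);
	if (error != 0)
		return (error);
	error = toy_handler(p);		/* second lock attempt fails */
	toy_unlock(&p->p_mtx);
	return (error);
}
```

In this model call_handler_locked() reports EDEADLK, while calling
toy_handler() on an unlocked toy_proc succeeds.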

This behaviour was introduced by rev. 1.62 of
fs/pseudofs/pseudofs_vnops.c; before that revision the process was held
during the pn_ioctl() call instead of being locked. The same change also
seems to affect getextattr().
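
A minimal single-threaded userspace model of holding the process instead
of locking it across the handler call. All toy_* names are hypothetical;
toy_hold() and toy_rele() only approximate _PHOLD() and PRELE(), on the
assumption that _PHOLD() expects the process lock to be held by the caller
and PRELE() acquires it itself.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct proc. */
struct toy_proc {
	int p_locked;	/* models the non-recursive process mutex */
	int p_holdcnt;	/* models the hold count that keeps the process around */
};

static int
toy_proc_lock(struct toy_proc *p)
{
	if (p->p_locked)
		return (EDEADLK);	/* would be a recursion panic */
	p->p_locked = 1;
	return (0);
}

static void
toy_proc_unlock(struct toy_proc *p)
{
	assert(p->p_locked);
	p->p_locked = 0;
}

/* Approximates _PHOLD(): the caller must already own the lock. */
static void
toy_hold(struct toy_proc *p)
{
	assert(p->p_locked);
	p->p_holdcnt++;
}

/* Approximates PRELE(): takes the lock, drops one hold, unlocks. */
static void
toy_rele(struct toy_proc *p)
{
	int error;

	error = toy_proc_lock(p);
	assert(error == 0);
	(void)error;
	assert(p->p_holdcnt > 0);
	p->p_holdcnt--;
	toy_proc_unlock(p);
}

/* Handler that locks the process itself, as the ioctl handler does. */
static int
toy_handler(struct toy_proc *p)
{
	int error;

	error = toy_proc_lock(p);
	if (error != 0)
		return (error);
	toy_proc_unlock(p);
	return (0);
}

/* Lock, take a hold, unlock, call the handler, then release the hold. */
static int
call_handler_held(struct toy_proc *p)
{
	int error;

	error = toy_proc_lock(p);
	if (error != 0)
		return (error);
	toy_hold(p);
	toy_proc_unlock(p);
	error = toy_handler(p);		/* free to lock the process now */
	toy_rele(p);
	return (error);
}
```

Because the lock is dropped before the handler runs, the handler's own
lock attempt succeeds, and the hold count is back at zero afterwards.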

With the following patch, I was able to strace ls successfully. As a side
note, the procfs ABI seems to have changed: strace built on RELENG_6 cannot
run on CURRENT.


Index: fs/pseudofs/pseudofs_vnops.c
===================================================================
RCS file: /usr/local/arch/ncvs/src/sys/fs/pseudofs/pseudofs_vnops.c,v
retrieving revision 1.63
diff -u -r1.63 pseudofs_vnops.c
--- fs/pseudofs/pseudofs_vnops.c	15 Apr 2007 20:35:18 -0000	1.63
+++ fs/pseudofs/pseudofs_vnops.c	28 Apr 2007 11:54:34 -0000
@@ -265,10 +265,15 @@
 	if (!pfs_visible(curthread, pn, pvd->pvd_pid, &proc))
 		PFS_RETURN (EIO);
 
-	error = pn_ioctl(curthread, proc, pn, va->a_command, va->a_data);
+	if (proc != NULL) {
+		_PHOLD(proc);
+		PROC_UNLOCK(proc);
+	}
+
+	error = (pn->pn_ioctl)(curthread, proc, pn, va->a_command, va->a_data);
 
 	if (proc != NULL)
-		PROC_UNLOCK(proc);
+		PRELE(proc);
 
 	PFS_RETURN (error);
 }
@@ -297,13 +302,17 @@
 
 	if (pn->pn_getextattr == NULL)
 		error = EOPNOTSUPP;
-	else
+	else {
+		if (proc != NULL) {
+			_PHOLD(proc);
+			PROC_UNLOCK(proc);
+		}
 		error = pn_getextattr(curthread, proc, pn,
 		    va->a_attrnamespace, va->a_name, va->a_uio,
 		    va->a_size, va->a_cred);
-
-	if (proc != NULL)
-		PROC_UNLOCK(proc);
+		if (proc != NULL)
+			PRELE(proc);
+	}
 
 	pfs_unlock(pn);
 	PFS_RETURN (error);
