Date:      Sun, 17 Nov 2002 14:54:04 -0500 (EST)
From:      Robert Watson <rwatson@freebsd.org>
To:        "Joel M. Baldwin" <qumqats@outel.org>
Cc:        current@freebsd.org
Subject:   Re: more info from panic from running dnet on SMP kernel ( lock order reversal, recursed on non-recursive lock )
Message-ID:  <Pine.NEB.3.96L.1021117145242.93303I-100000@fledge.watson.org>
In-Reply-To: <211086306.1037497825@[192.168.1.20]>

Hmm.  It looks like there is indeed a lock leak in the RFTHREAD code.
Maybe a change like the following might help:

                        PROC_LOCK(p2);
                        psignal(p2, SIGKILL);
                        PROC_UNLOCK(p2);
                }

Change the } to:
		} else
			PROC_UNLOCK(p1->p_leader);

And see if that gets rid of the problem.  Any chance this is highly
reproducible, btw? :-)  And what app are you running that's using
RFTHREAD -- linux thread stuff?
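
To illustrate the kind of leak suspected above, here is a minimal userland
sketch in C.  It is not the kern_fork.c code: toy_lock, toy_proc,
check_leader_leaky and check_leader_fixed are hypothetical names, and the
toy lock is only a single-threaded stand-in for a non-recursive mutex.  It
assumes the surrounding code takes the leader's lock before the test and
drops it only inside the if branch, which is what the proposed else clause
implies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded stand-in for a non-recursive mutex. */
struct toy_lock {
	bool held;
};

/* Returns 0 on success, -1 if already held (the "recursed" case). */
static int
toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return (-1);
	l->held = true;
	return (0);
}

static void
toy_lock_release(struct toy_lock *l)
{
	assert(l->held);	/* releasing an unheld lock is a bug */
	l->held = false;
}

/* Hypothetical stand-in for the thread-group leader. */
struct toy_proc {
	struct toy_lock lock;
	bool exiting;
};

/* Leaky shape: the lock is dropped only when the leader is exiting. */
static void
check_leader_leaky(struct toy_proc *leader)
{
	(void)toy_lock_acquire(&leader->lock);
	if (leader->exiting) {
		toy_lock_release(&leader->lock);
		/* ... the child would be signalled here ... */
	}
	/* leader not exiting: we return with the lock still held */
}

/* Fixed shape: the added else branch drops the lock on the other path. */
static void
check_leader_fixed(struct toy_proc *leader)
{
	(void)toy_lock_acquire(&leader->lock);
	if (leader->exiting) {
		toy_lock_release(&leader->lock);
		/* ... the child would be signalled here ... */
	} else
		toy_lock_release(&leader->lock);
}
```

When the leader is not exiting, check_leader_leaky returns with the lock
still held, so a later toy_lock_acquire on the same lock fails; that is the
toy equivalent of a "recursed on non-recursive lock" report.
check_leader_fixed leaves the lock free on both paths.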

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories

On Sun, 17 Nov 2002, Joel M. Baldwin wrote:

>
> running dnet on a SMP kernel causes the kernel to panic.
>
>
> lock order reversal
>  1st 0xc2c803e8 process lock (process lock) @
> ../../../kern/kern_fork.c:571
>  2nd 0xc03cfce0 proctree (proctree) @ ../../../kern/kern_fork.c:596
> recursed on non-recursive lock (sleep mutex) process lock @
> ../../../kern/kern_fork.c:599
> first acquired @ ../../../kern/kern_fork.c:571
> panic: recurse
> cpuid = 1; lapic.id = 01000000
> Debugger("panic")
> Stopped at      Debugger+0x55:  xchgl   %ebx,in_Debugger.0
> db> t
> Debugger(c03926fa,1000000,c0395ada,d26f5c08,1) at Debugger+0x55
> panic(c0395ada,c038feab,23b,c038feab,257) at panic+0x11f
> witness_lock(c2c803e8,8,c038feab,257,0) at witness_lock+0x3e6
> _mtx_lock_flags(c2c803e8,0,c038feab,257,d26f5cb8) at
> _mtx_lock_flags+0xb2
> fork1(c2773d00,6050,0,d26f5cd4,c2c803e8) at fork1+0xbfc
> rfork(c2773d00,d26f5d10,c03b07a2,407,1) at rfork+0x65
> syscall(2f,2f,2f,0,80ddf10) at syscall+0x28e
> Xint0x80_syscall() at Xint0x80_syscall+0x1d
> --- syscall (251, FreeBSD ELF32, rfork), eip = 0x8087d14, esp =
> 0xbfbff4a8, ebp = 0xbfbff524 ---
> db> ps
>   pid   proc     addr    uid  ppid  pgrp  flag   stat  wmesg    wchan
> cmd
>  6217 c2b98e00 d28a7000    0  6215  6216 0000000 newpanic: unknown
> thread state
> cpuid = 1; lapic.id = 01000000
> boot() called on cpu#1
> Uptime: 1h43m39s
> pfs_vncache_unload(): 1 entries remaining
> Dumping 255 MB
>  16 32 48 64 80 96 112 128 144 160 176 192 208 224 240
> Dump complete
> Automatic reboot in 15 seconds - press a key on the console to abort
> Rebooting...
> cpu_reset called on cpu#1
> cpu_reset: Restarting BSP
> cpu_reset_proxy: Stopped CPU 1
>
>
>
> And then when the system came back up and I took a closer
> look at the core dump.
>
>
> (kgdb) where
> #0  doadump () at ../../../kern/kern_shutdown.c:232
> #1  0xc02114ad in boot (howto=260) at ../../../kern/kern_shutdown.c:364
> #2  0xc0211767 in panic () at ../../../kern/kern_shutdown.c:517
> #3  0xc014f2bc in db_ps (dummy1=-1070342907, dummy2=0, dummy3=-1,
> dummy4=0xd26f5a24 "")
>     at ../../../ddb/db_ps.c:169
> #4  0xc014d142 in db_command (last_cmdp=0xc03be920, cmd_table=0x0,
> aux_cmd_tablep=0xc03b5540,
>     aux_cmd_tablep_end=0xc03b5558) at ../../../ddb/db_command.c:346
> #5  0xc014d256 in db_command_loop () at ../../../ddb/db_command.c:472
> #6  0xc014feea in db_trap (type=3, code=0) at ../../../ddb/db_trap.c:72
> #7  0xc033da10 in kdb_trap (type=3, code=0, regs=0xd26f5b80)
>     at ../../../i386/i386/db_interface.c:166
> #8  0xc0356a3f in trap (frame=
>       {tf_fs = -1069481960, tf_es = 16, tf_ds = 16, tf_edi =
> -1032372992, tf_esi = 256, tf_ebp = -764453940, tf_isp = -764453972,
> tf_ebx = 0, tf_edx = 0, tf_ecx = 1, tf_eax = 18, tf_trapno = 3, tf_err
> = 0, tf_eip = -1070342907, tf_cs = 8, tf_eflags = 662, tf_esp =
> -1069883258, tf_ss = -1069996294}) at ../../../i386/i386/trap.c:603
> #9  0xc033f238 in calltrap () at {standard input}:99
> #10 0xc021174f in panic (fmt=0x0) at ../../../kern/kern_shutdown.c:503
> #11 0xc02333d6 in witness_lock (lock=0xc2c803e8, flags=8,
>     file=0xc038feab "../../../kern/kern_fork.c", line=599) at
> ../../../kern/subr_witness.c:609
> #12 0xc02079c2 in _mtx_lock_flags (m=0xc03cf4c0, opts=0,
> file=0xc042cfd4 "è\003ÈÂ«þ8À;\002",
>     line=-1027079192) at ../../../kern/kern_mutex.c:328
> #13 0xc01fd3ec in fork1 (td=0xc2773d00, flags=24656, pages=0,
> procp=0xd26f5cd4)
>     at ../../../kern/kern_fork.c:599
> #14 0xc01fc6c5 in rfork (td=0xc2773d00, uap=0xd26f5d10) at
> ../../../kern/kern_fork.c:168
> #15 0xc035739e in syscall (frame=
>       {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 0, tf_esi =
> 135126800, tf_ebp = -1077938908, tf_isp = -764453516, tf_ebx = 2,
> tf_edx = 135381248, tf_ecx = 135381248, tf_eax = 251, tf_trapno = 0,
> tf_err = 2, tf_eip = 134774036, tf_cs = 31, tf_eflags = 659, tf_esp =
> -1077939032, tf_ss = 47})
>     at ../../../i386/i386/trap.c:1033
> #16 0xc033f28d in Xint0x80_syscall () at {standard input}:141
> ---Can't read userspace from dump, or kernel process---
>
>
>
>
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-current" in the body of the message
>
