Date:      Wed, 3 Mar 2004 14:35:23 -0500
From:      John Baldwin <jhb@FreeBSD.org>
To:        des@des.no (Dag-Erling Smørgrav)
Cc:        cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/sys/kern subr_sleepqueue.c
Message-ID:  <200403031435.23839.jhb@FreeBSD.org>
In-Reply-To: <xzpk722vgbe.fsf@dwp.des.no>
References:  <200403021502.i22F28vF032585@repoman.freebsd.org> <200403021708.43422.jhb@FreeBSD.org> <xzpk722vgbe.fsf@dwp.des.no>

On Tuesday 02 March 2004 09:02 pm, Dag-Erling Smørgrav wrote:
> John Baldwin <jhb@FreeBSD.org> writes:
> > I never saw that case and this is the first I've heard of it.  ddb tends
> > to freeze when you enter it holding a spin lock.  Do you have any log
> > messages from the mis-matched locks for msleep?
>
> Mismatched locks to msleep(0xc9376000, pause):
>   old 0xc935d06c (process lock), new 0xc64d1e2c (process lock)
> Stack backtrace:
> sleepq_add(c748cdc0,c9376000,c64d1e2c,c05bbdf1,0) at sleepq_add+0x1ee
> msleep(c9376000,c64d1e2c,168,c05bbdf1,0) at msleep+0x19f
> kern_sigsuspend(c64d23f0,0,0,0,0) at kern_sigsuspend+0xa1
> linux_rt_sigsuspend(c64d23f0,ebb20d14,2,279b5,200212) at linux_rt_sigsuspend+0x4f
> syscall(2f,bfbf002f,bfbf002f,28636da8,bfbfe0c0) at syscall+0x129
> Xint0x80_syscall() at Xint0x80_syscall+0x1d
> --- syscall (179), eip = 0x288379b6, esp = 0xbfbfe0a0, ebp = 0xbfbfe0a8 ---
>
> DES

I see the bug.  Here are the msleep's on p_sigacts:

> grep msleep kern_sig.c
        error = msleep(ps, &p->p_mtx, PPAUSE|PCATCH, "sigwait", hz);
        while (msleep(p->p_sigacts, &p->p_mtx, PPAUSE|PCATCH, "pause", 0) == 0)
        while (msleep(p->p_sigacts, &p->p_mtx, PPAUSE|PCATCH, "opause", 0) == 0)

Now realize that p_sigacts is a refcount'd struct shared between rfork'd
processes (i.e. Linux threads).  The sleep's don't actually get woken up via
a wakeup, they get woken up via a signal, so the wait channel is really a
dummy.  Try changing those three msleep's to sleep on &ps and &p->p_sigacts
and see if that fixes the panic.
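
To illustrate why the shared pointer trips the check, here is a toy
userland model.  It is only a sketch and not the code in
subr_sleepqueue.c: the struct layouts, toy_sleepq_add() and toy_reset()
are made up for illustration, and the only assumption about the real code
is that a wait channel is expected to be used with a single lock at a
time, as the "Mismatched locks" message suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only what is needed. */
struct sigacts { int refcount; };
struct mtx { int dummy; };
struct proc {
	struct mtx	p_mtx;		/* per-process lock */
	struct sigacts	*p_sigacts;	/* may be shared after rfork */
};

#define	TOY_MAXQ	8

/* One entry per wait channel, remembering the lock first used with it. */
struct toy_sleepq {
	const void		*wchan;
	const struct mtx	*lock;
};

struct toy_sleepq toy_queues[TOY_MAXQ];
size_t toy_nqueues;

/* Forget all sleepers (illustration helper). */
void
toy_reset(void)
{
	toy_nqueues = 0;
}

/*
 * Record a sleeper on wchan holding lock.  Returns 0 on success and -1 if
 * the channel already has a sleeper that used a different lock.
 */
int
toy_sleepq_add(const void *wchan, const struct mtx *lock)
{
	size_t i;

	for (i = 0; i < toy_nqueues; i++) {
		if (toy_queues[i].wchan == wchan)
			return (toy_queues[i].lock == lock ? 0 : -1);
	}
	assert(toy_nqueues < TOY_MAXQ);
	toy_queues[toy_nqueues].wchan = wchan;
	toy_queues[toy_nqueues].lock = lock;
	toy_nqueues++;
	return (0);
}
```

In this model, two procs that share one struct sigacts and sleep on
p->p_sigacts collide on the same channel with different p_mtx locks, while
sleeping on &p->p_sigacts gives each proc its own channel because the
address lies inside its own struct proc.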

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org


