Date:      Mon, 15 Sep 2008 18:13:58 -0700
From:      Norbert Papke <fbsd-ml@scrapper.ca>
To:        freebsd-stable@freebsd.org
Cc:        Gavin Atkinson <gavin@freebsd.org>
Subject:   Re: Possible UDP related deadlock in 7.1-PRERELEASE
Message-ID:  <200809151813.58749.fbsd-ml@scrapper.ca>
In-Reply-To: <1221471431.49328.5.camel@buffy.york.ac.uk>
References:  <200809141219.24943.fbsd-ml@scrapper.ca> <1221471431.49328.5.camel@buffy.york.ac.uk>

On September 15, 2008, Gavin Atkinson wrote:
> On Sun, 2008-09-14 at 12:19 -0700, Norbert Papke wrote:
> > Symptoms:
> >
> > * I can trigger this lockup reliably by starting ktorrent.  After a short
> > while (one to two minutes), it locks up.  Other commands, e.g., netstat,
> > also lock up.
> > * The console generates "nfe0: watchdog timeout" error messages.
> > * The system becomes unusable and must be rebooted.
> >
> > Attempted Diagnosis:
> >
> > If I break into DDB, the 'ps' output shows a number of processes that
> > seem to be locked related to udp.
> >
> > [irq18:dc0]    L   *udp
> > ktorrent       L   *udpinp
> > hald           L   *udp
> > ntpd           L   *udp
> >
> > Unfortunately, I am rapidly getting out of my depth here.  I have no idea
> > how to go about further analyzing this problem and would appreciate help.
>
> Can you add:
>      options WITNESS
>      options WITNESS_SKIPSPIN
>
> to your kernel, recompile and wait for the problem to happen again?
> When it does, from the debugger issue "sh alllocks" and make a note of
> the output?

With WITNESS enabled, the system now panics, so I could not follow your
instructions.  No core dump is produced.  The following is logged
to /var/log/messages:

shared lock of (rw) udpinp @ /usr/src/sys/netinet/udp_usrreq.c:864
while exclusively locked from /usr/src/sys/netinet6/udp6_usrreq.c:940
panic: share->excl
KDB: stack backtrace:
db_trace_self_wrapper(c06fda7c,f6b96978,c052046a,c06fbb5d,c07695c0,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c06fbb5d,c07695c0,c06febd1,f6b96984,f6b96984,...) at kdb_backtrace+0x29
panic(c06febd1,c070c409,3ac,c0709eee,360,...) at panic+0xaa
witness_checkorder(ccd5209c,1,c0709eee,360,8,...) at witness_checkorder+0x17c
_rw_rlock(ccd5209c,c0709eee,360,c07780e0,cd4652c8,...) at _rw_rlock+0x2a
udp_send(d3942000,0,c580f400,c68faa00,0,...) at udp_send+0x197
udp6_send(d3942000,0,c580f400,c68faa00,0,...) at udp6_send+0x140
sosend_generic(d3942000,c68faa00,f6b96be8,0,0,...) at sosend_generic+0x50d
sosend(d3942000,c68faa00,f6b96be8,0,0,...) at sosend+0x3f
kern_sendit(cd465230,f,f6b96c64,0,0,...) at kern_sendit+0x106
sendit(0,871b9fe,0,c68faa00,1c,...) at sendit+0x182
sendto(cd465230,f6b96cfc,18,cd465230,c072bab8,...) at sendto+0x4f
syscall(f6b96d38) at syscall+0x293
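
As far as I can read the trace above, udp6_send() holds the udpinp lock
exclusively (udp6_usrreq.c:940) and then calls udp_send(), which asks for the
same lock in shared mode (udp_usrreq.c:864).  Below is a minimal user-space
sketch of that pattern.  It is only a toy model under my own assumptions: the
toy_* and model_* names are hypothetical, and it is not the kernel's rwlock
or WITNESS code.

```c
#include <assert.h>

/* Toy single-threaded model; NOT the kernel's rwlock or WITNESS code. */
enum toy_result { TOY_OK, TOY_SHARE_AFTER_EXCL };

struct toy_rwlock {
    int excl_owner;   /* id of the exclusive holder, 0 if none */
    int shared_count; /* number of shared holders */
};

/* Take the lock exclusively; the model requires it to be free. */
static void toy_wlock(struct toy_rwlock *l, int tid)
{
    assert(l->excl_owner == 0 && l->shared_count == 0);
    l->excl_owner = tid;
}

/* Release an exclusive hold; only the holder may do so. */
static void toy_wunlock(struct toy_rwlock *l, int tid)
{
    assert(l->excl_owner == tid);
    l->excl_owner = 0;
}

/*
 * Take the lock shared.  If the caller already holds it exclusively,
 * report the condition instead of blocking forever.
 */
static enum toy_result toy_rlock(struct toy_rwlock *l, int tid)
{
    if (l->excl_owner == tid)
        return TOY_SHARE_AFTER_EXCL;
    assert(l->excl_owner == 0); /* the model has no other contenders */
    l->shared_count++;
    return TOY_OK;
}

/* Release one shared hold. */
static void toy_runlock(struct toy_rwlock *l)
{
    assert(l->shared_count > 0);
    l->shared_count--;
}

/* Stand-in for the IPv4 send path: shared lock around the send. */
static enum toy_result model_udp_send(struct toy_rwlock *inp, int tid)
{
    enum toy_result r = toy_rlock(inp, tid);
    if (r == TOY_OK)
        toy_runlock(inp);
    return r;
}

/* Stand-in for the IPv6 send path: exclusive lock, then the IPv4 path. */
static enum toy_result model_udp6_send(struct toy_rwlock *inp, int tid)
{
    enum toy_result r;

    toy_wlock(inp, tid);
    r = model_udp_send(inp, tid);
    toy_wunlock(inp, tid);
    return r;
}
```

In this model, calling model_udp_send() directly succeeds, while going
through model_udp6_send() returns TOY_SHARE_AFTER_EXCL.  That would be
consistent with a hang when WITNESS is off and a "share->excl" panic when it
is on, but I cannot say whether this is what actually happens in the kernel.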


Note that I do not use IPv6; none of my network interfaces is configured for
it.

Also, since I enabled WITNESS, I get the following logged during system 
startup:

Enabling pf.
lock order reversal:
 1st 0xc09af92c pf task mtx (pf task mtx) @ /usr/src/sys/modules/pf/../../contrib/pf/net/pf_ioctl.c:1394
 2nd 0xc07b4d68 ifnet (ifnet) @ /usr/src/sys/net/if.c:1558
KDB: stack backtrace:
db_trace_self_wrapper(c06fda7c,f4914a60,c0552c75,c06fed11,c07b4d68,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c06fed11,c07b4d68,c0703ca2,c0703ca2,c0703c73,...) at kdb_backtrace+0x29
witness_checkorder(c07b4d68,9,c0703c73,616,572,...) at witness_checkorder+0x5e5
_mtx_lock_flags(c07b4d68,0,c0703c73,616,c0104414,...) at _mtx_lock_flags+0x34
ifunit(c6ef5c20,0,c09adfb5,572,c0703a71,...) at ifunit+0x2f
pfioctl(c566ce00,c0104414,c6ef5c20,3,c60c38c0,...) at pfioctl+0x2b43
devfs_ioctl_f(c588bb94,c0104414,c6ef5c20,c54bb900,c60c38c0,...) at devfs_ioctl_f+0xe6
kern_ioctl(c60c38c0,3,c0104414,c6ef5c20,1000000,...) at kern_ioctl+0x243
ioctl(c60c38c0,f4914cfc,c,c0718d59,c072b350,...) at ioctl+0x134
syscall(f4914d38) at syscall+0x293
Xint0x80_syscall() at Xint0x80_syscall+0x20
--- syscall (54, FreeBSD ELF32, ioctl), eip = 0x281ab6f3, esp = 0xbfbfde3c, ebp = 0xbfbfde68 ---
pf enabled


I unloaded pf to see whether it was the culprit.  However, the panic still
occurs without pf loaded.

Is there anything else I can try to provide better insight into what might be 
going on?

Cheers,

-- Norbert.


