Date:      Tue, 14 Sep 2004 10:34:39 -0400 (EDT)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        Julian Elischer <julian@elischer.org>
Cc:        freebsd-threads@freebsd.org
Subject:   Re: Unkillable KSE threaded proc
Message-ID:  <16711.383.448500.578640@grasshopper.cs.duke.edu>
In-Reply-To: <4146AAC1.5020701@elischer.org>
References:  <16703.11479.679335.588170@grasshopper.cs.duke.edu> <16703.12410.319869.29996@grasshopper.cs.duke.edu> <413F55B8.50003@elischer.org> <16703.28031.454342.774229@grasshopper.cs.duke.edu> <413F8DBB.5040502@elischer.org> <16704.40876.708925.425911@grasshopper.cs.duke.edu> <4140AA2A.90605@elischer.org> <16704.45327.42494.922427@grasshopper.cs.duke.edu> <4140C04D.1060906@elischer.org> <16704.49447.290897.602540@grasshopper.cs.duke.edu> <4146AAC1.5020701@elischer.org>

Julian Elischer writes:
 > Andrew Gallatin wrote:
 > > Julian Elischer writes:
 > >  > >
 > >  > >Maybe this would be easier to debug if I disabled preemption?
 > >  > >
 > >  > 
 > >  > 
 > >  > I think that this would possibly GO AWAY if you disabled preemption.
 > >  > which would make it very hard to debug :-)
 > >  > 
 > > 
 > > Yes and no.  You initially asked me to try in -current because of
 > > some changes you'd made to the exit code.  RELENG_5 (with the old
 > > exit code and no preemption) shows a different problem (proc is
 > > just not killable).    If the proc was killable without preemption,
 > > that would at least show your new code is better..
 > 
 > try the attached diff:
 > 

This is worse.

It's worse in that the application never starts running fully, and it
seems to ignore signals entirely.  I can't attach a debugger to it
to see how far it got before hanging due to the signal problem.  When
it hangs (both before and after a signal is sent), the CPU utilization
is 0%.  Before it's sent a signal, it looks like this:

  573 c1f3b8c0 e88ae000 1387   517   573 000c082 (threaded) mx_pingpong
   thread 0xc1f3e320 ksegrp 0xc19ead20 [RUNQ]
   thread 0xc1f3e4b0 ksegrp 0xc19ead20 [RUNQ]
   thread 0xc1f3e640 ksegrp 0xc19eaaf0 [SLPQ ksesigwait 0xc1f3b9c0][SLP]


db> call db_trace_thread(0xc1f3e320, -1)
sched_switch(c1f3e320,0,1,1862ccb2,994777d8) at sched_switch+0x137
mi_switch(1,0,c05fdf59,804c000,c2b8c2ec) at mi_switch+0x1ce
turnstile_wait(c1a518c0,c06c53e0,c1a4d7d0,0,1) at turnstile_wait+0x339
_mtx_lock_sleep(c06c53e0,c1f3e320,0,0,0) at _mtx_lock_sleep+0x122
vm_fault(c187a5dc,804c000,1,0,0) at vm_fault+0x214
trap_pfault(e88b8d48,1,804c800,3,804c800) at trap_pfault+0x136
trap(2f,2f,2f,805d13c,805d13c) at trap+0x201
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0x804c800, esp = 0xbfbfe66c, ebp = 0xbfbfe678 ---
0

db> call db_trace_thread(0xc1f3e4b0, -1)
sched_switch(c1f3e4b0,0,1,f0007932,9935c3e9) at sched_switch+0x137
mi_switch(1,0,c19ead60,e88bbc5c,c1f3e4b0) at mi_switch+0x1ce
sleepq_switch(c19ead60,c1f3e4b0,0,e88bbc94,c04e5da6) at sleepq_switch+0x171
sleepq_timedwait_sig(c19ead60,0,c1f3b92c,c0677640,100) at sleepq_timedwait_sig+0x13
msleep(c19ead60,c1f3b92c,168,c0677640,1771) at msleep+0x37b
kse_release(c1f3e4b0,e88bbd14,4,c04c47ab,0) at kse_release+0x29b
syscall(2f,2f,2f,8054200,0) at syscall+0x2fc
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (383, FreeBSD ELF32, kse_release), eip = 0x280a3d4f, esp = 0x8194f80, ebp = 0x8194fbc ---
0

db> call db_trace_thread(0xc1f3e640, -1)
sched_switch(c1f3e640,0,1,bc7c14b2,97d6ec54) at sched_switch+0x137
mi_switch(1,0,0,0,0) at mi_switch+0x1ce
sleepq_switch(c1f3b9c0,c1f3e640,0,e88bec94,c04e5da6) at sleepq_switch+0x171
sleepq_timedwait_sig(c1f3b9c0,0,0,0,0) at sleepq_timedwait_sig+0x13
msleep(c1f3b9c0,c1f3b92c,168,c0677635,bb9) at msleep+0x37b
kse_release(c1f3e640,e88bed14,4,c04c47ab,0) at kse_release+0x1a1
syscall(2f,2f,2f,1,81) at syscall+0x2fc
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (383, FreeBSD ELF32, kse_release), eip = 0x280a3d4f, esp = 0xbfafef30, ebp = 0xbfafef8c ---
0


A different run, but after sending it a ^C from the command line:

 547 c1f3b1c0 e88aa000    0     1   547 000c482 (threaded)  mx_pingpong
   thread 0xc1f3e960 ksegrp 0xc19eaee0 [RUNQ]
   thread 0xc1f3eaf0 ksegrp 0xc19eaee0 [RUNQ]
   thread 0xc1f3ec80 ksegrp 0xc19eab60 [SUSP]

db> call db_trace_thread(0xc1f3e960, -1)
sched_switch(c1f3e960,0,2,e7ff39b6,d6d80c8c) at sched_switch+0x137
mi_switch(2,0,0,0,0) at mi_switch+0x1ce
ast(e88c4d48) at ast+0x4eb
doreti_ast() at doreti_ast+0x17
0
db> call db_trace_thread(0xc1f3eaf0, -1)
sched_switch(c1f3eaf0,0,1,6e2ca4e6,d6924d2f) at sched_switch+0x137
mi_switch(1,0,c19eaf20,e88c7c5c,c1f3eaf0) at mi_switch+0x1ce
sleepq_switch(c19eaf20,c1f3eaf0,0,e88c7c94,c04e5da6) at sleepq_switch+0x171
sleepq_timedwait_sig(c19eaf20,0,c1f3b22c,c0677640,100) at sleepq_timedwait_sig+0x13
msleep(c19eaf20,c1f3b22c,168,c0677640,1771) at msleep+0x37b
kse_release(c1f3eaf0,e88c7d14,4,c04c47ab,0) at kse_release+0x29b
syscall(2f,2f,2f,8054200,0) at syscall+0x2fc
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (383, FreeBSD ELF32, kse_release), eip = 0x280a3d4f, esp = 0x8194f80, ebp = 0x8194fbc ---
0
db> call db_trace_thread(0xc1f3ec80, -1)
sched_switch(c1f3ec80,0,1,26e24232,4249ca0b) at sched_switch+0x137
mi_switch(1,0,0,0,0) at mi_switch+0x1ce
thread_single(1,c1f3ec80,c1f3b1c0,e88cac5c,c0500581) at thread_single+0x1d7
exit1(c1f3ec80,2,e88cacb8,c04f1736,0) at exit1+0x115
expand_name(c1f3ec80,2,c1f3ec80,e88cad48,0) at expand_name
kse_thr_interrupt(c1f3ec80,e88cad14,c,c1f3ec80,e88cad3c) at kse_thr_interrupt+0x329
syscall(2f,2f,2f,8054100,805a800) at syscall+0x2fc
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (382, FreeBSD ELF32, kse_thr_interrupt), eip = 0x280a3d6f, esp = 0xbfafee60, ebp = 0xbfafeefc ---
0


If you want line number translations, please let me know.  I saved the
kernel that this came from and also took a dump.

Drew


