Date:      Wed, 29 Sep 2004 17:45:31 -0400 (EDT)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        Julian Elischer <julian@elischer.org>
Cc:        freebsd-threads@freebsd.org
Subject:   Re: easy to reproduce unkillable threads
Message-ID:  <16731.11515.504636.53058@grasshopper.cs.duke.edu>
In-Reply-To: <415B1ED6.8010809@elischer.org>
References:  <16728.37731.540143.307772@grasshopper.cs.duke.edu> <41589B4A.9080508@elischer.org> <415AB791.10809@freebsd.org> <16730.48642.4481.841374@grasshopper.cs.duke.edu> <415B13E8.2090205@elischer.org> <16731.6010.446877.347190@grasshopper.cs.duke.edu> <415B1ED6.8010809@elischer.org>


I tried a -current kernel (w/o your patch) from today (still RELENG_5
userland), and I still see the problem.

% ssh scream 'skill -9 -u gallatin'
Connection to scream closed by remote host.

% ssh scream 'ps axH | grep testc'
  580  ??  SLs    0:00.01 csh -c ps axH | grep testc
  586  ??  RL     0:00.00 grep testc
  535  p0- WL     0:06.21 ./testcdev

On scream's console, after sending a break to enter the debugger:
Stopped at      kdb_enter+0x30: leave
db> ps
  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  548 c1a39c40 e67ee000 1387   547   548 0004002 [SLPQ ttyin 0xc1830c10][SLP] csh
  547 c1a39a80 e67ed000 1387   545   545 0000100 [SLPQ select 0xc071aaa4][SLP] sshd
  545 c1817000 e5556000    0   450   545 0000100 [SLPQ sbwait 0xc1991320][SLP] sshd
  535 c1a34e00 e67e6000 1387     1   535 020c482 (threaded)  testcdev
   thread 0xc164dc80 ksegrp 0xc15e57e0 [SUSP]
  511 c1a34a80 e67e4000    0     1   511 0004002 [SLPQ ttyin 0xc1705810][SLP] getty

db> trace 535
sched_switch(c164dc80,c164daf0,1,4ec51334,ed18649a) at sched_switch+0x137
mi_switch(1,c164daf0,0,0,0) at mi_switch+0x1d4
thread_single(1,c164dc80,0,0,0) at thread_single+0x1d7
exit1(c164dc80,9,0,0,c051996e) at exit1+0x115
expand_name(c164dc80,9,100,0,0) at expand_name
postsig(9,c164dc80,0,0,0) at postsig+0x204
ast(e52d1d48) at ast+0x5e4
doreti_ast() at doreti_ast+0x17
db> c


It seems to be just a problem with skill -9.  skill -2 works fine.

As I said before, libthr seems to behave differently.  Rather than a
lingering thread, the polling thread (doing the while(1)) is stuck on
the CPU (using 100% of one CPU in a dual-CPU system), and the thread
which was doing the cv_wait() is stuck with the exact same stack as above:

 629 c1a1da80 e67e7000 1387     1   629 0004482 (threaded)  testcdev
   thread 0xc164dc80 ksegrp 0xc15e54d0 [SUSP]
   thread 0xc1879af0 ksegrp 0xc15e54d0 [CPU 1]

db> trace 629
sched_switch(c164dc80,0,1,b5d71f28,b4e1d6c8) at sched_switch+0x137
mi_switch(1,0,c1870880,c164dc80,c164dc80) at mi_switch+0x1d4
thread_single(1,c164dc80,e52d1c54,c1b14100,c164dc80) at thread_single+0x1d7
exit1(c164dc80,9,0,e52d1ce4,c051996e) at exit1+0x115
expand_name(c164dc80,9,100,0,0) at expand_name
postsig(9,246,c06e7bd0,36,bfafefb4) at postsig+0x1a4
ast(e52d1d48) at ast+0x5e4
doreti_ast() at doreti_ast+0x17
db> c
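
The two-thread layout described above (one thread polling in a tight
loop, one thread blocked in a wait) can be sketched in userland roughly
as follows.  This is not the testcdev source, which is not included in
this message.  It assumes a pthread condition variable as a stand-in for
the cv_wait() call, and every name in it (sketch_state, poller, waiter,
run_sketch) is hypothetical.  Unlike the while(1) described above, the
loop here checks a stop flag so that the sketch can shut itself down.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical shared state; none of these names come from testcdev. */
struct sketch_state {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             wake;   /* predicate, guarded by lock */
    atomic_int      stop;   /* lets the poller exit in this sketch */
    atomic_ulong    spins;  /* iterations completed by the poller */
};

/* Polling thread: stands in for the thread doing the while(1). */
static void *poller(void *arg)
{
    struct sketch_state *s = arg;

    while (!atomic_load(&s->stop))
        atomic_fetch_add(&s->spins, 1);
    return NULL;
}

/* Waiting thread: userland stand-in for the thread blocked in cv_wait(). */
static void *waiter(void *arg)
{
    struct sketch_state *s = arg;

    pthread_mutex_lock(&s->lock);
    while (!s->wake)
        pthread_cond_wait(&s->cond, &s->lock);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/*
 * Run both threads for roughly `ms` milliseconds, then wake the waiter,
 * stop the poller, and join them.  Returns the poller's iteration count.
 */
unsigned long run_sketch(long ms)
{
    struct sketch_state s;
    pthread_t tp, tw;
    struct timespec ts = { .tv_sec = ms / 1000,
                           .tv_nsec = (ms % 1000) * 1000000L };
    int rc;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.cond, NULL);
    s.wake = 0;
    atomic_init(&s.stop, 0);
    atomic_init(&s.spins, 0);

    rc = pthread_create(&tp, NULL, poller, &s);
    assert(rc == 0);
    rc = pthread_create(&tw, NULL, waiter, &s);
    assert(rc == 0);
    (void)rc;

    nanosleep(&ts, NULL);           /* let the threads settle */

    atomic_store(&s.stop, 1);       /* release the poller */
    pthread_mutex_lock(&s.lock);
    s.wake = 1;                     /* release the waiter */
    pthread_cond_signal(&s.cond);
    pthread_mutex_unlock(&s.lock);

    pthread_join(tp, NULL);
    pthread_join(tw, NULL);
    pthread_cond_destroy(&s.cond);
    pthread_mutex_destroy(&s.lock);
    return atomic_load(&s.spins);
}
```

To approximate the situation in this report, one would presumably leave
both threads running instead of shutting them down, send the process
SIGKILL (as skill -9 does), and then check for leftover threads with
ps axH.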


Drew


