From owner-freebsd-bugs@FreeBSD.ORG Wed Feb 1 22:20:11 2012
Return-Path:
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
        by hub.freebsd.org (Postfix) with ESMTP id 8E8D31065678
        for ; Wed, 1 Feb 2012 22:20:11 +0000 (UTC)
        (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
        by mx1.freebsd.org (Postfix) with ESMTP id 730398FC18
        for ; Wed, 1 Feb 2012 22:20:11 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
        by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q11MKBa3053016
        for ; Wed, 1 Feb 2012 22:20:11 GMT
        (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
        by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q11MKBK2053012;
        Wed, 1 Feb 2012 22:20:11 GMT
        (envelope-from gnats)
Date: Wed, 1 Feb 2012 22:20:11 GMT
Message-Id: <201202012220.q11MKBK2053012@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: =?windows-1251?B?yu7t/Oru4iDF4uPl7ejp?=
Cc:
Subject: Re[2]: bin/164526: kill(1) can not kill process despite on -KILL
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: =?windows-1251?B?yu7t/Oru4iDF4uPl7ejp?=
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 01 Feb 2012 22:20:11 -0000

The following reply was made to PR bin/164526; it has been noted by GNATS.

From: =?windows-1251?B?yu7t/Oru4iDF4uPl7ejp?=
To: Jilles Tjoelker
Cc: bug-followup@FreeBSD.org, freeradius-users@lists.freeradius.org,
Subject: Re[2]: bin/164526: kill(1) can not kill process despite on -KILL
Date: Thu, 2 Feb 2012 00:16:39 +0200

Hello, Jilles.

You wrote on 28 January 2012, 20:24:07:

>> [stuck process cannot be killed, system hangs when reboot is
>> attempted]

JT> A signal cannot forcibly kill a process that is stuck in the kernel.
JT> Allowing this would put the integrity of the kernel data structures at
JT> risk and likely cause hangs, data corruption or panics later on.

JT> If a process is stuck in the kernel for a long time, this can be things
JT> like broken hardware, a non-responsive NFS server or a bug.

JT> A state 'T' (stopped) probably means the process is multi-threaded and
JT> is trying to suspend but one or more threads will not cooperate
JT> (non-interruptible sleep or running in the kernel).

JT> Useful commands to obtain more information (supposing pid is 45471):
JT> ps Hl45471
JT> procstat -k 45471

JT> Of course, this does not help if you already rebooted.

It happened again. The bug is repeatable:
1. radiusd + mod_perl + example.pl (it connects to Firebird) + Firebird
2. restart Firebird
3. try to restart radiusd
4. the radiusd process falls into the STOP state

# ps awx | grep radi
 9438  ??  TLs    5:10.12 /usr/local/sbin/radiusd
27603   2  S+     0:00.00 grep radi

# procstat -k 9438
  PID    TID COMM             TDNAME           KSTACK
 9438 100080 radiusd          -                mi_switch sleepq_switch sleepq_wait _sx_xlock_hard _sx_xlock _vm_map_lock_upgrade vm_map_lookup vm_fault_hold vm_fault trap_pfault trap calltrap
 9438 100195 radiusd          -                mi_switch sleepq_switch sleepq_wait __lockmgr_args ffs_lock VOP_LOCK1_APV _vn_lock vm_object_deallocate unlock_and_deallocate vm_fault_hold vm_fault trap_pfault trap calltrap
 9438 101144 radiusd          -                mi_switch thread_suspend_switch thread_single exit1 sigexit postsig ast doreti_ast

# ps wHl9438
  UID   PID  PPID CPU PRI NI    VSZ    RSS MWCHAN STAT  TT       TIME COMMAND
  133  9438     1   0  20  0 351124 322000 user m TLs   ??    0:03.65 /usr/local/sbin/radiusd
  133  9438     1   0  20  0 351124 322000 ufs    TLs   ??    0:00.00 /usr/local/sbin/radiusd
  133  9438     1   0  20  0 351124 322000 -      TLs   ??    0:05.28 /usr/local/sbin/radiusd

# top
last pid: 28497;  load averages:  0.56,  2.34,  9.37   up 0+10:23:14  00:12:5
162 processes: 1 running, 158 sleeping, 3 stopped
CPU:  1.9% user,  0.0% nice,  1.9% system,  5.3% interrupt, 90.8% idle
Mem: 525M Active, 1259M Inact, 182M Wired, 41M Cache, 112M Buf, 1890M Free
Swap: 4096M Total, 4096M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 6893 root          1  26    0 15392K  5580K select  0  21:17  6.10% snmpd
75797 bind          7  20    0   100M 77280K kqread  2   4:27  0.00% named
 5553 root          7  20    0 53544K 39832K select  1   0:19  0.00% mpd5
77411 dhcpd         1  20    0 15032K  5360K select  3   0:18  0.00% dhcpd
 3605 root          1  20    0 10460K  4004K select  3   0:11  0.00% zebra
 5316 root          1  20    0  9616K  1244K select  1   0:10  0.00% syslogd
 9438 freeradius    3  20    0   343M   314M STOP    0   0:09  0.00% radiusd
80843 mysql        26  20    0   402M   333M sbwait  0   0:05  0.00% mysqld
 3611 root          1  20    0 14660K  5348K select  2   0:05  0.00% bgpd
80396 www           1  20    0 37908K 22876K lockf   1   0:01  0.00% httpd
26278 root          1  20    0 33812K 15608K select  2   0:01  0.00% httpd
10559 www           1  20    0 42004K 26768K lockf   1   0:01  0.00% httpd

If I can supply any other useful debug info, please answer as soon as you
can; I cannot wait too long.

Thank you.

-- 
Best regards,
 Коньков                          mailto:kes-kes@yandex.ru
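
For readers comparing several such traces, here is a minimal sketch (not part of the original report) of one way to summarize `procstat -k` output per thread. It assumes the KSTACK column lists the innermost frame first and that TDNAME contains no spaces. The names `SCHEDULER_FRAMES`, `parse_procstat_k`, `first_blocking_frame` and `summarize` are hypothetical helpers, and the set of frames treated as generic scheduler code was chosen only to fit this sample.

```python
# Sketch: for each thread in `procstat -k` output, report the first kernel
# frame that is not generic scheduler / sleep-queue code.
# SAMPLE is copied from the procstat output in this message.

SAMPLE = """\
PID TID COMM TDNAME KSTACK
9438 100080 radiusd - mi_switch sleepq_switch sleepq_wait _sx_xlock_hard _sx_xlock _vm_map_lock_upgrade vm_map_lookup vm_fault_hold vm_fault trap_pfault trap calltrap
9438 100195 radiusd - mi_switch sleepq_switch sleepq_wait __lockmgr_args ffs_lock VOP_LOCK1_APV _vn_lock vm_object_deallocate unlock_and_deallocate vm_fault_hold vm_fault trap_pfault trap calltrap
9438 101144 radiusd - mi_switch thread_suspend_switch thread_single exit1 sigexit postsig ast doreti_ast
"""

# Assumption: these frames are generic context-switch / sleep plumbing.
# The list is illustrative and only covers what appears in SAMPLE.
SCHEDULER_FRAMES = {
    "mi_switch",
    "sleepq_switch",
    "sleepq_wait",
    "thread_suspend_switch",
}


def parse_procstat_k(text):
    """Return one dict per thread line; the header and blank lines are skipped."""
    threads = []
    for line in text.splitlines():
        fields = line.split()
        # A data line starts with a numeric PID and has at least one stack frame.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        pid, tid, comm, tdname, *stack = fields
        threads.append(
            {"pid": int(pid), "tid": int(tid), "comm": comm,
             "tdname": tdname, "stack": stack}
        )
    return threads


def first_blocking_frame(stack):
    """Return the innermost frame that is not in SCHEDULER_FRAMES, or None."""
    for frame in stack:
        if frame not in SCHEDULER_FRAMES:
            return frame
    return None


def summarize(text):
    """Map thread id -> first non-scheduler frame."""
    return {t["tid"]: first_blocking_frame(t["stack"])
            for t in parse_procstat_k(text)}


if __name__ == "__main__":
    for tid, frame in summarize(SAMPLE).items():
        print(tid, frame)
```

On the sample this reports `_sx_xlock_hard` for thread 100080, `__lockmgr_args` for thread 100195 and `thread_single` for thread 101144. Read that way, two threads appear to be sleeping on locks inside page-fault handling while the third is in the exit path waiting for the others to suspend. That is only a reading of the trace, not a diagnosis.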