From: Andrew Gallatin
Date: Thu, 9 Sep 2004 14:23:40 -0400 (EDT)
To: Julian Elischer
Cc: freebsd-threads@freebsd.org
Subject: Re: Unkillable KSE threaded proc
Message-ID: <16704.40876.708925.425911@grasshopper.cs.duke.edu>
In-Reply-To: <413F8DBB.5040502@elischer.org>

Julian Elischer writes:
 >
 > I think it is
 >
 > show thread (address)

FWIW, I think "call db_trace_thread(thread addr, -1)" seems to work better.
When I enter ddb, curproc is init, so show thread seems to show garbage.

db> ps
  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  623 c1b5f380 e6850000    0   472   472 0000000 [RUNQ] cron
  614 c1b5f8c0 e6853000    0   451   614 0000100 [RUNQ] sshd
  613 c1a22540 e680a000 1387     1   611 000c482 (threaded)  mx_pingpong
  thread 0xc1b617d0 ksegrp 0xc18779a0 [CPU 1]
  thread 0xc1b614b0 ksegrp 0xc18779a0 [SUSP]
  thread 0xc1b61320 ksegrp 0xc18779a0 [LOCK process lock c1b13200]
  thread 0xc2b6ce10 ksegrp 0xc1a270e0 [LOCK process lock c1b13200]
db> call db_trace_thread(0xc1b617d0, -1)
sched_switch(3249936336,3244003328,3244003328,468695918,1992661338) at sched_switch+216
mi_switch(2,3244003328,3244003668,3244003328,3867700060) at mi_switch+455
maybe_preempt(3244003328,252,0,3867700072,3226402603) at maybe_preempt+153
sched_add(70,3867700092,3226402999,3246881184,3867189248) at sched_add+259
end() at 3246881184
0
db> call db_trace_thread(0xc1b614b0, -1)
sched_switch(3249935536,3249936336,0,2929115342,3959095726) at sched_switch+216
mi_switch(1,3249936336,0,0,0) at mi_switch+455
thread_single(1,423437840,7706937,1737258498,3243666960) at thread_single+471
exit1(3249935536,9,3867675836,3867675876,3226344614) at exit1+277
expand_name(3249935536,9,256,0,0) at expand_name
postsig(9,3867675976,2,3243701424,0) at postsig+516
ast(3867675976) at ast+1508
doreti_ast() at doreti_ast+23
0
db> call db_trace_thread(0xc1b61320, -1)
sched_switch(3249935136,0,0,2147060238,4154263705) at sched_switch+216
mi_switch(1,0,3249936336,3228346184,0) at mi_switch+455
turnstile_wait(3249615360,3248629164,3249936336,3248629056,3249935136) at turnstile_wait+825
_mtx_lock_sleep(3248629164,3249935136,0,0,0) at _mtx_lock_sleep+290
kse_release(3249935136,3867663636,4,3249935136,3867663676) at kse_release+322
syscall(47,47,47,134562304,0) at syscall+764
Xint0x80_syscall() at Xint0x80_syscall+31
--- syscall (383, FreeBSD ELF32, kse_release), eip = 671759695, esp = 135876488, ebp = 135876548 ---
0
db> call db_trace_thread(0xc2b6ce10, -1)
sched_switch(3266760208,0,0,2564282502,2143396982) at sched_switch+216
mi_switch(1,0,3266760208,3244171108,3228328544) at mi_switch+455
turnstile_wait(3249615360,3248629164,3249936336,3248629056,3266760208) at turnstile_wait+825
_mtx_lock_sleep(3248629164,3266760208,0,0,0) at _mtx_lock_sleep+290
kse_release(3266760208,3901611284,4,3266760208,3901611324) at kse_release+322
syscall(47,47,3215917103,1,129) at syscall+764
Xint0x80_syscall() at Xint0x80_syscall+31
--- syscall (383, FreeBSD ELF32, kse_release), eip = 671759695, esp = 3215978288, ebp = 3215978380 ---
0

 > but if you can get a coredump it would be best..
 > in ddb do:
 > call doadump
 >
 > in this case it looks like thread 0xc1f2aaf0 has called exit() and is
 > waiting for the others to exit..
 > I wonder if the lock is the answer.. it would be good to follow the link
 > in the mutex in the proc structure at 0xc1a2d8c0
 > to see which thread OWNS it..

I'm following it from 0xc1a22540 for today's lockup:

(kgdb) p $proc->p_mtx
$3 = {
  mtx_object = {
    lo_class = 0xc069e55c,
    lo_name = 0xc067788d "process lock",
    lo_type = 0xc067788d "process lock",
    lo_flags = 0x430000,
    lo_list = {
      tqe_next = 0x0,
      tqe_prev = 0x0
    },
    lo_witness = 0x0
  },
  mtx_lock = 0xc1b617d2,
  mtx_recurse = 0x0
}

0xc1b617d2 is almost the same as the thread id of the first thread
(0xc1b617d0)..

I've still got the dump, so if you need more info please let me know.

Drew