From owner-freebsd-threads@FreeBSD.ORG Mon Sep 6 11:03:06 2004
Date: Mon, 6 Sep 2004 11:03:04 GMT
Message-Id: <200409061103.i86B34GF094501@freefall.freebsd.org>
From: FreeBSD bugmaster
To: freebsd-threads@FreeBSD.org
Subject: Current problem reports assigned to you
X-BeenThere: freebsd-threads@freebsd.org
Precedence: list
List-Id: Threading on FreeBSD

Current FreeBSD problem reports

Critical problems

S  Submitted    Tracker        Resp.    Description
-------------------------------------------------------------------------------
s  [2004/03/15] kern/64313     threads  FreeBSD (OpenBSD) pthread implicit set/un
o  [2004/04/22] threads/65883  threads  libkse's sigwait does not work after fork

2 problems total.

Serious problems

S  Submitted    Tracker        Resp.    Description
-------------------------------------------------------------------------------
o  [2000/06/13] kern/19247     threads  uthread_sigaction.c does not do anything
o  [2000/07/18] kern/20016     threads  pthreads: Cannot set scheduling timer/Can
o  [2000/08/26] kern/20861     threads  libc_r does not honor socket timeouts
o  [2001/01/20] threads/24472  threads  libc_r does not honor SO_SNDTIMEO/SO_RCVT
o  [2001/01/25] threads/24632  threads  libc_r delicate deviation from libc in ha
o  [2001/01/25] kern/24641     threads  pthread_rwlock_rdlock can deadlock
o  [2001/11/26] bin/32295      threads  pthread dont dequeue signals
o  [2002/02/01] threads/34536  threads  accept() blocks other threads
o  [2002/05/25] kern/38549     threads  the procces compiled whith pthread stoppe
o  [2002/06/27] threads/39922  threads  [PATCH?] Threaded applications executed w
o  [2002/08/04] kern/41331     threads  Pthread library open sets O_NONBLOCK flag
o  [2003/03/02] threads/48856  threads  Setting SIGCHLD to SIG_IGN still leaves z
o  [2003/03/10] threads/49087  threads  Signals lost in programs linked with libc
o  [2003/05/08] threads/51949  threads  thread in accept cannot be cancelled
o  [2004/08/26] threads/70975  threads  unexpected and unreliable behaviour when

15 problems total.

Non-critical problems

S  Submitted    Tracker        Resp.    Description
-------------------------------------------------------------------------------
o  [2000/05/26] kern/18824     threads  gethostbyname is not thread safe
o  [2000/10/21] kern/22190     threads  A threaded read(2) from a socketpair(2) f
o  [2001/09/09] threads/30464  threads  pthread mutex attributes -- pshared
o  [2002/05/02] threads/37676  threads  libc_r: msgsnd(), msgrcv(), pread(), pwri
s  [2002/07/16] threads/40671  threads  pthread_cancel doesn't remove thread from
o  [2004/07/13] threads/69020  threads  pthreads library leaks _gc_mutex

6 problems total.

From owner-freebsd-threads@FreeBSD.ORG Wed Sep 8 01:09:08 2004
X-Sender: yangshazhou@hotmail.com
Date: Wed, 8 Sep 2004 09:05:58 +0800
Subject: sun-jdk14 can't run with '-server' in 5.3beta2

We compiled /usr/ports/java/jdk14 on recent 5.3beta1 and beta2 systems,
and the build succeeded. However, 'java -server' hangs and cannot be
interrupted with Ctrl-C; it has to be ended with 'kill -9'. Is this a
problem with the new kernel?

From owner-freebsd-threads@FreeBSD.ORG Wed Sep 8 16:01:37 2004
From: Andrew Gallatin
Message-ID: <16703.11479.679335.588170@grasshopper.cs.duke.edu>
Date: Wed, 8 Sep 2004 12:01:27 -0400 (EDT)
To: freebsd-threads@freebsd.org
Subject: Unkillable KSE threaded proc

If I send a kill -9 to a threaded process in a creative way, I see it
get stuck forever exiting. (It is run from a /bin/sh script and killed
via ssh $MACHINE skill -9 -u gallatin.)

It shows up in a ddb ps like this:

 3403 c1652540 e52fe000 1387     1  3401 000c402 (threaded)  mx_pingpong
  thread 0xc2de4c60 ksegrp 0xc15b2200 [SUSP]

Doing a trace shows what I assume is the main thread waiting for the
other thread to exit:

db> tr 3403
sched_switch(c2de4c60,0,41508ec8,a87f7f6d,ffc00014) at sched_switch+0xa5
mi_switch(1,0,e89a3c44,c051f91d,c2de4c60) at mi_switch+0x1b6
thread_single(1,c06ea9c0,e89a3c64,c1652540,c2de4c60) at thread_single+0x1e0
exit1(c2de4c60,9,0,e89a3ce4,c0519447) at exit1+0x11d
expand_name(c2de4c60,9,100,0,0) at expand_name
postsig(9,202,c06e5db8,17f,8058f84) at postsig+0x204
ast(e89a3d48) at ast+0x5e7
doreti_ast() at doreti_ast+0x17

Looking at the proc in kgdb:

(kgdb) p $proc
$1 = (struct proc *) 0xc1652540
(kgdb) p * $proc
$2 = {
  p_list = { le_next = 0xc1b66e00, le_prev = 0xc1b858c0 },
  p_ksegrps = { tqh_first = 0xc2de3880, tqh_last = 0xc15b2204 },
  p_threads = { tqh_first = 0xc2de4c60, tqh_last = 0xc2de4c68 },
  p_suspended = { tqh_first = 0xc2de4c60, tqh_last = 0xc2de4c88 },
  p_ucred = 0xc1ac7d80,
  p_fd = 0xc187d300,
  p_fdtol = 0x0,
  p_stats = 0xe52fe000,
  p_limit = 0xc1bf1700,
  p_upages_obj = 0xc0c4218c,
  p_sigacts = 0xc21bc000,
  p_flag = 0xc402,
  p_sflag = 0x1,
  p_state = PRS_NORMAL,
  p_pid = 0xd4b,
  p_hash = { le_next = 0x0, le_prev = 0xc155552c },
  p_pglist = { le_next = 0x0, le_prev = 0xc1b64248 },
  p_pptr = 0xc1561e00,
  p_sibling = { le_next = 0xc1876c40, le_prev = 0xc1561e68 },
  p_children = { lh_first = 0x0 },
  p_mtx = {
    mtx_object = {
      lo_class = 0xc06e90bc,
      lo_name = 0xc06bb669 "process lock",
      lo_type = 0xc06bb669 "process lock",
      lo_flags = 0x430000,
      lo_list = { tqe_next = 0x0, tqe_prev = 0x0 },
      lo_witness = 0x0
    },
    mtx_lock = 0x4,
    mtx_recurse = 0x0
  },
  p_oppid = 0x0,
  p_vmspace = 0xc1af0258,
  p_swtime = 0x3a7,
  p_realtimer = {
    it_interval = { tv_sec = 0x0, tv_usec = 0x0 },
    it_value = { tv_sec = 0x0, tv_usec = 0x0 }
  },
  p_runtime = { sec = 0x8, frac = 0x8ce499fd61838320 },
  p_uu = 0x6ecf00,
  p_su = 0x13a8da,
  p_iu = 0x1,
  p_uticks = 0x3a2,
  p_sticks = 0xa5,
  p_iticks = 0x0,
  p_profthreads = 0x0,
  p_maxthrwaits = 0x0,
  p_traceflag = 0x0,
  p_tracevp = 0x0,
  p_tracecred = 0x0,
  p_textvp = 0xc1dfaa50,
  p_siglist = { __bits = {0x0, 0x0, 0x0, 0x0} },
  p_lock = 0x0,
  p_sigiolst = { slh_first = 0x0 },
  p_sigparent = 0x14,
  p_sig = 0x0,
  p_code = 0x0,
  p_stops = 0x0,
  p_stype = 0x0,
  p_step = 0x0,
  p_pfsflags = 0x0,
  p_nlminfo = 0x0,
  p_aioinfo = 0x0,
  p_singlethread = 0xc2de4c60,
  p_suspcount = 0x1,
  p_xthread = 0x0,
  p_magic = 0xbeefface,
  p_comm = "mx_pingpong\0\0\0\0\0\0\0\0",
  p_pgrp = 0xc1b64240,
  p_sysent = 0xc06ff000,
  p_args = 0xc2de0480,
  p_cpulimit = 0x7fffffffffffffff,
  p_nice = 0x0,
  p_xstat = 0x0,
  p_klist = { kl_lock = 0xc16525ac, kl_list = { slh_first = 0x0 } },
  p_numthreads = 0x1,
  p_numksegrps = 0x2,
  p_md = { md_ldt = 0xc181f5c0 },
  p_itcallout = {
    c_links = {
      sle = { sle_next = 0x0 },
      tqe = { tqe_next = 0x0, tqe_prev = 0x0 }
    },
    c_time = 0x0,
    c_arg = 0x0,
    c_func = 0,
    c_flags = 0x8
  },
  p_uarea = 0xe52fe000,
  p_acflag = 0x10,
  p_ru = 0x0,
  p_peers = 0x0,
  p_leader = 0xc1652540,
  p_emuldata = 0x0,
  p_label = 0x0,
  p_sched = 0xc1652700
}

This is happening as of this morning with RELENG_5 (SCHED_4BSD) and
with a ~3 month old 5-current (SCHED_4BSD). It seems to happen on both
i386 and amd64.

Question: Does the ddb ps indicate that there is another thread in the
kernel? If yes, how the heck can I get a trace of it? Neither
0xc2de4c60 nor 0xc15b2200 shows another stack when passed to ddb's tr.

I suspect the other thread is sleeping in a cv_wait_sig() in my driver,
but it would be nice to know for sure.

% ldd mx_pingpong
mx_pingpong:
        libpthread.so.1 => /usr/lib/libpthread.so.1 (0x4807e000)
        libc.so.5 => /lib/libc.so.5 (0x480a3000)

Thanks,

Drew

From owner-freebsd-threads@FreeBSD.ORG Wed Sep 8 16:17:08 2004
From: Andrew Gallatin
Message-ID: <16703.12410.319869.29996@grasshopper.cs.duke.edu>
Date: Wed, 8 Sep 2004 12:16:58 -0400 (EDT)
To: freebsd-threads@freebsd.org
In-Reply-To: <16703.11479.679335.588170@grasshopper.cs.duke.edu>
References: <16703.11479.679335.588170@grasshopper.cs.duke.edu>
Subject: Re: Unkillable KSE threaded proc

Andrew Gallatin writes:
 >
 > If I send a kill -9 to a threaded process in a creative way, I see it
 > get stuck forever exiting. (run from a /bin/sh script,
 > killed via ssh $MACHINE skill -9 -u gallatin)
 >
 > It shows up in a ddb ps like this:
 >
 >  3403 c1652540 e52fe000 1387     1  3401 000c402 (threaded)  mx_pingpong
 >   thread 0xc2de4c60 ksegrp 0xc15b2200 [SUSP]
 >

FWIW, before sending it a skill -9, another run of the same program
will show up in ddb ps like this:

 3514 c1b65540 e6842000    0  3058  3514 000c002 (threaded)  mx_pingpong
  thread 0xc2e0ab00 ksegrp 0xc1b60100 [SLPQ kserel 0xc1b6015c][SLP]
  thread 0xc1af7840 ksegrp 0xc1b60100 [CPU 1][kse 0xc1af8c00]
  thread 0xc2de5840 ksegrp 0xc1b60100 [SLPQ mx cv 0xc16b9e40][SLP]
  thread 0xc2de56e0 ksegrp 0xc2de3800 [SLPQ ksesigwait 0xc1b65640][SLP]

Since there's only one thread left in the hung case, could it just
be some sort of race in the exit code?

Thanks,

Drew

From owner-freebsd-threads@FreeBSD.ORG Wed Sep 8 17:49:44 2004
Date: Wed, 8 Sep 2004 13:49:42 -0400
From: Brian Fundakowski Feldman
To: threads@FreeBSD.org
Message-ID: <20040908174941.GF928@green.homeunix.org>
Subject: [mistry.7@osu.edu: Re: FreeBSD and wine mmap]

I don't think that any allocated "red zones" should be left empty --
they should be explicitly allocated with no protection so that actually
free memory space can be explicitly searched for. Comments?

----- Forwarded message from Anish Mistry -----

From: Anish Mistry
To: Brian Fundakowski Feldman
Subject: Re: FreeBSD and wine mmap
Date: Sun, 5 Sep 2004 13:15:15 -0400
Cc: freebsd-current@freebsd.org

On Sunday 29 August 2004 12:59 am, you wrote:
> On Tue, Aug 10, 2004 at 01:53:14PM -0400, Anish Mistry wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> >
> > On Wednesday 04 August 2004 06:39 pm, you wrote:
> > > On Wed, Aug 04, 2004 at 06:28:02PM -0400, Anish Mistry wrote:
> > > > Ok, so we need something like vm_map_findspace(), but for process
> > > > address mapping? i.e. pmap_findspace() that will return an address
> > > > to a large enough free chunk?
> > >
> > > That's a good start, just to get something to work with. How this
> > > fits in with the vm code and whether it is ultimately suitable in
> > > the long run is probably up to Alan Cox. For now, just get
> > > something that (a) doesn't break anything else; and (b) lets Wine
> > > behave the way it needs to.
> > >
> > > AFAIK, there are still pthread issues with Wine, but those can't be
> > > addressed until the mmap issue has a work-around.
> >
> > I've got a small patch that gets by the initial problem about not
> > being able to mmap the memory for the libraries, but the addresses
> > that are mmap'ed seem to overlap with memory that the current pthread
> > implementation wants to mmap for the "red zone" when wine tries to
> > create a thread. It can't mmap the "red zone" addresses since all
> > those address mappings were gobbled up before the thread launched.
> > I'll try to figure out a way to maybe leave a space for the "red
> > zone" and see if that works.
> > Someone who actually knows what they are doing should probably take
> > a look.
>
> The red pages are implemented by leaving the memory space unallocated;
> I don't like that one bit -- this will cause those spaces to be
> allocated but given no protection, which should provide the crash
> feature that the guard pages are there for, but be less bogus (and it
> doesn't use more "memory," but it will use a few more vm_map_entrys).
>
> Index: lib/libpthread/thread/thr_stack.c
> ===================================================================
> RCS file: /usr/ncvs/src/lib/libpthread/thread/thr_stack.c,v
> retrieving revision 1.8
> diff -u -r1.8 thr_stack.c
> --- lib/libpthread/thread/thr_stack.c 14 Sep 2003 22:39:44 -0000 1.8
> +++ lib/libpthread/thread/thr_stack.c 29 Aug 2004 04:50:28 -0000
> @@ -214,6 +214,17 @@
>                      stacksize, PROT_READ | PROT_WRITE, MAP_STACK,
>                      -1, 0)) == MAP_FAILED)
>                          attr->stackaddr_attr = NULL;
> +                if (attr->stackaddr_attr != NULL) {
> +                        void *red;
> +
> +                        red = mmap((char *)attr->stackaddr_attr + stacksize,
> +                            _thr_guard_default, PROT_NONE,
> +                            MAP_ANON | MAP_FIXED | MAP_PRIVATE, -1, 0);
> +                        if (red == MAP_FAILED) {
> +                                (void)munmap(attr->stackaddr_attr, stacksize);
> +                                attr->stackaddr_attr = NULL;
> +                        }
> +                }
>          }
>          if (attr->stackaddr_attr != NULL)
>                  return (0);

This is good. Can this be committed?

--
Anish Mistry

----- End forwarded message -----

--
Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
  <> green@FreeBSD.org                               \  The Power to Serve! \
       Opinions expressed are my own.                 \,,,,,,,,,,,,,,,,,,,,,,\

From owner-freebsd-threads@FreeBSD.ORG Wed Sep 8 18:55:52 2004
Message-ID: <413F55B8.50003@elischer.org>
Date: Wed, 08 Sep 2004 11:55:52 -0700
From: Julian Elischer
To: Andrew Gallatin
References: <16703.11479.679335.588170@grasshopper.cs.duke.edu> <16703.12410.319869.29996@grasshopper.cs.duke.edu>
In-Reply-To: <16703.12410.319869.29996@grasshopper.cs.duke.edu>
cc: freebsd-threads@freebsd.org
Subject: Re: Unkillable KSE threaded proc

It is possible. However, you should try this on -current (please),
because I rewrote some of the exit code and may have already fixed it.

A -current kernel can run a 5.3 userland in general, so you may just
need to recompile the kernel.

Andrew Gallatin wrote:

>Andrew Gallatin writes:
> >
> > If I send a kill -9 to a threaded process in a creative way, I see it
> > get stuck forever exiting. (run from a /bin/sh script,
> > killed via ssh $MACHINE skill -9 -u gallatin)
> >
> > It shows up in a ddb ps like this:
> >
> >  3403 c1652540 e52fe000 1387     1  3401 000c402 (threaded)  mx_pingpong
> >   thread 0xc2de4c60 ksegrp 0xc15b2200 [SUSP]
> >
>
>FWIW, before sending it a skill -9, another run of the same program
>will show up in ddb ps like this:
>
> 3514 c1b65540 e6842000    0  3058  3514 000c002 (threaded)  mx_pingpong
>  thread 0xc2e0ab00 ksegrp 0xc1b60100 [SLPQ kserel 0xc1b6015c][SLP]
>  thread 0xc1af7840 ksegrp 0xc1b60100 [CPU 1][kse 0xc1af8c00]
>  thread 0xc2de5840 ksegrp 0xc1b60100 [SLPQ mx cv 0xc16b9e40][SLP]
>  thread 0xc2de56e0 ksegrp 0xc2de3800 [SLPQ ksesigwait 0xc1b65640][SLP]
>
>Since there's only one thread left in the hung case, could it just
>be some sort of race in the exit code?
>
>Thanks,
>
>Drew

From owner-freebsd-threads@FreeBSD.ORG Wed Sep 8 19:09:47 2004
Date: Wed, 8 Sep 2004 15:09:41 -0400 (EDT)
From: Daniel Eischen
To: Brian Fundakowski Feldman
In-Reply-To: <20040908174941.GF928@green.homeunix.org>
cc: threads@freebsd.org
Subject: Re: [mistry.7@osu.edu: Re: FreeBSD and wine mmap]
Reply-To: Daniel Eischen

On Wed, 8 Sep 2004, Brian Fundakowski Feldman wrote:

> I don't think that any allocated "red zones" should be left empty --
> they should be explicitly allocated with no protection so that actually
> free memory space can be explicitly searched for. Comments?

Yes, that's fine. See specific comments below.

> > Index: lib/libpthread/thread/thr_stack.c
> > ===================================================================
> > RCS file: /usr/ncvs/src/lib/libpthread/thread/thr_stack.c,v
> > retrieving revision 1.8
> > diff -u -r1.8 thr_stack.c
> > --- lib/libpthread/thread/thr_stack.c 14 Sep 2003 22:39:44 -0000 1.8
> > +++ lib/libpthread/thread/thr_stack.c 29 Aug 2004 04:50:28 -0000
> > @@ -214,6 +214,17 @@
> >                      stacksize, PROT_READ | PROT_WRITE, MAP_STACK,
> >                      -1, 0)) == MAP_FAILED)
> >                          attr->stackaddr_attr = NULL;
> > +                if (attr->stackaddr_attr != NULL) {
> > +                        void *red;

Declare red above with the rest of the locals.

> > +
> > +                        red = mmap((char *)attr->stackaddr_attr + stacksize,
> > +                            _thr_guard_default, PROT_NONE,
> > +                            MAP_ANON | MAP_FIXED | MAP_PRIVATE, -1, 0);
> > +                        if (red == MAP_FAILED) {
> > +                                (void)munmap(attr->stackaddr_attr, stacksize);
> > +                                attr->stackaddr_attr = NULL;
> > +                        }
> > +                }
> >          }
> >          if (attr->stackaddr_attr != NULL)
> >                  return (0);

I don't know if this was stripped of tabs, but please use them and
obey style(9).

--
Dan Eischen

From owner-freebsd-threads@FreeBSD.ORG Wed Sep 8 20:37:27 2004
From: Andrew Gallatin
Message-ID: <16703.28031.454342.774229@grasshopper.cs.duke.edu>
Date: Wed, 8 Sep 2004 16:37:19 -0400 (EDT)
To: Julian Elischer
In-Reply-To: <413F55B8.50003@elischer.org>
References: <16703.11479.679335.588170@grasshopper.cs.duke.edu> <16703.12410.319869.29996@grasshopper.cs.duke.edu> <413F55B8.50003@elischer.org>
cc: freebsd-threads@freebsd.org
Subject: Re: Unkillable KSE threaded proc

Julian Elischer writes:
 > It is possible. However, you should try this on -current (please),
 > because I rewrote some of the exit code and may have already fixed it.
 >
 > A -current kernel can run a 5.3 userland in general, so you may just
 > need to recompile the kernel.

OK, I built a -current kernel from CVS sources dated 8amPDT.
And it is worse.

The initial skill -9 -u gallatin seems to be ignored by the threaded
process, and it gets re-parented to init when skill takes out its
parent (sh) and its parent's parents (csh and sshd):

# ps axwl | grep ping | grep -v grep
 1387   607     1 591 132  0 18260 11480 -      R     p0-    5:18.18 tests/mx_pingpong -e 2 -M 2 -E 3000000 -d scream:0

Logging in again and doing 'kill -9 607' results in other stuff
starting to hang. (Can't ssh in again, kill never seems to return.)
In the following ps, the shell that launched the second kill -9
is pid 624 (^T also claims it's running):

db> ps
  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  624 c1a28c40 e6808000 1387   623   624 0004002 [CPU 0] csh
  623 c1f24540 e8858000 1387   621   621 0000100 [SLPQ select 0xc06cb5c4][SLP] sshd
  621 c1647a80 e52e3000    0   451   621 0000100 [SLPQ sbwait 0xc1990d40][SLP] sshd
  607 c1a2d8c0 e680f000 1387     1   605 000c482 (threaded)  mx_pingpong
  thread 0xc1f25960 ksegrp 0xc18808c0 [CPU 1]
  thread 0xc1f2aaf0 ksegrp 0xc18808c0 [SUSP]
  thread 0xc1f2a960 ksegrp 0xc18808c0 [RUNQ]
  thread 0xc1f2a4b0 ksegrp 0xc1f282a0 [LOCK process lock c1b37bc0]

db> tr 607
sched_switch(c1f25960,c15b9000,c15b9000,ae1ed572,3db79502) at sched_switch+0xd8
mi_switch(2,c15b9000,c15b9154,c15b9000,e884db50) at mi_switch+0x1c7
maybe_preempt(c15b9000,82,0,c1568c40,c15b9000) at maybe_preempt+0x99
sched_add(e884db70,46,c1f2a960,46,c18808c0) at sched_add+0x103
resetpriority(e884db84,e680f000,46,46,c1a2d8c0) at resetpriority+0x62
_end(c1f282a4,c1f25960,c1f2a970,c1f2a960,c1f2a988) at 0xc1f25960
(null)(c1f282a0,c18808c4,c1f25960,c1f2a4b8,c1f2aaf0) at 0
end(c1f28850,c1f28854,c1f25320,c1f25328,0) at 0xc1647a80
end(c1880af0,c1880af4,c1a29af0,c1a29af8,0) at 0xc1a2d8c0
_end(c1995000,c1995004,c187f7d0,c187f7d8,0) at 0xc1f24e00

<_end() is repeated quite a few times>

Is there any way to get a trace of the other threads from ddb?

Drew

From owner-freebsd-threads@FreeBSD.ORG Wed Sep 8 22:54:52 2004
Message-ID: <413F8DBB.5040502@elischer.org>
Date: Wed, 08 Sep 2004 15:54:51 -0700
From: Julian Elischer
To: Andrew Gallatin
References: <16703.11479.679335.588170@grasshopper.cs.duke.edu> <16703.12410.319869.29996@grasshopper.cs.duke.edu> <413F55B8.50003@elischer.org> <16703.28031.454342.774229@grasshopper.cs.duke.edu>
In-Reply-To: <16703.28031.454342.774229@grasshopper.cs.duke.edu>
cc: freebsd-threads@freebsd.org
Subject: Re: Unkillable KSE threaded proc

Andrew Gallatin wrote:

>Julian Elischer writes:
> > It is possible. However, you should try this on -current (please),
> > because I rewrote some of the exit code and may have already fixed it.
> >
> > A -current kernel can run a 5.3 userland in general, so you may just
> > need to recompile the kernel.
>
>OK, I built a -current kernel from CVS sources dated 8amPDT.
>And it is worse.
>
>The initial skill -9 -u gallatin seems to be ignored by the threaded
>process, and it gets re-parented to init when skill takes out its
>parent (sh) and its parent's parents (csh and sshd):
>
># ps axwl | grep ping | grep -v grep
> 1387   607     1 591 132  0 18260 11480 -      R     p0-    5:18.18 tests/mx_pingpong -e 2 -M 2 -E 3000000 -d scream:0
>
>Logging in again and doing 'kill -9 607' results in other stuff
>starting to hang. (Can't ssh in again, kill never seems to return.)
>In the following ps, the shell that launched the second kill -9
>is pid 624 (^T also claims it's running):
>
>db> ps
>  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
>  624 c1a28c40 e6808000 1387   623   624 0004002 [CPU 0] csh
>  623 c1f24540 e8858000 1387   621   621 0000100 [SLPQ select 0xc06cb5c4][SLP] sshd
>  621 c1647a80 e52e3000    0   451   621 0000100 [SLPQ sbwait 0xc1990d40][SLP] sshd
>  607 c1a2d8c0 e680f000 1387     1   605 000c482 (threaded)  mx_pingpong
>  thread 0xc1f25960 ksegrp 0xc18808c0 [CPU 1]
>  thread 0xc1f2aaf0 ksegrp 0xc18808c0 [SUSP]
>  thread 0xc1f2a960 ksegrp 0xc18808c0 [RUNQ]
>  thread 0xc1f2a4b0 ksegrp 0xc1f282a0 [LOCK process lock c1b37bc0]
>
>db> tr 607
>sched_switch(c1f25960,c15b9000,c15b9000,ae1ed572,3db79502) at sched_switch+0xd8
>mi_switch(2,c15b9000,c15b9154,c15b9000,e884db50) at mi_switch+0x1c7
>maybe_preempt(c15b9000,82,0,c1568c40,c15b9000) at maybe_preempt+0x99
>sched_add(e884db70,46,c1f2a960,46,c18808c0) at sched_add+0x103
>resetpriority(e884db84,e680f000,46,46,c1a2d8c0) at resetpriority+0x62
>_end(c1f282a4,c1f25960,c1f2a970,c1f2a960,c1f2a988) at 0xc1f25960
>(null)(c1f282a0,c18808c4,c1f25960,c1f2a4b8,c1f2aaf0) at 0
>end(c1f28850,c1f28854,c1f25320,c1f25328,0) at 0xc1647a80
>end(c1880af0,c1880af4,c1a29af0,c1a29af8,0) at 0xc1a2d8c0
>_end(c1995000,c1995004,c187f7d0,c187f7d8,0) at 0xc1f24e00
>
><_end() is repeated quite a few times>
>
>Is there any way to get a trace of the other threads from ddb?
>

Yes, I think it is
    show thread (address)
but if you can get a coredump it would be best. In ddb do:
    call doadump

In this case it looks like thread 0xc1f2aaf0 has called exit() and is
waiting for the others to exit. I wonder if the lock is the answer.
It would be good to follow the link in the mutex in the proc structure
at 0xc1a2d8c0 to see which thread OWNS it.

>
>Drew
>

From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 01:27:46 2004
Message-ID: <413FB291.7060501@freebsd.org>
Date: Thu, 09 Sep 2004 09:32:01 +0800
From: David Xu
To: Brian Fundakowski Feldman
References: <20040908174941.GF928@green.homeunix.org>
In-Reply-To: <20040908174941.GF928@green.homeunix.org>
cc: threads@FreeBSD.org
Subject: Re: [mistry.7@osu.edu: Re: FreeBSD and wine mmap]

Brian Fundakowski Feldman wrote:

> I don't think that any allocated "red zones" should be left empty --
> they should be explicitly allocated with no protection so that actually
> free memory space can be explicitly searched for. Comments?
>
> ----- Forwarded message from Anish Mistry -----
>
> From: Anish Mistry
> To: Brian Fundakowski Feldman
> Subject: Re: FreeBSD and wine mmap
> Date: Sun, 5 Sep 2004 13:15:15 -0400
> Cc: freebsd-current@freebsd.org
>
> On Sunday 29 August 2004 12:59 am, you wrote:
>
>>On Tue, Aug 10, 2004 at 01:53:14PM -0400, Anish Mistry wrote:
>>
>>>-----BEGIN PGP SIGNED MESSAGE-----
>>>Hash: SHA1
>>>
>>>On Wednesday 04 August 2004 06:39 pm, you wrote:
>>>
>>>>On Wed, Aug 04, 2004 at 06:28:02PM -0400, Anish Mistry wrote:
>>>>
>>>>>Ok, so we need something like vm_map_findspace(), but for process
>>>>>address mapping? i.e. pmap_findspace() that will return an address
>>>>>to a large enough free chunk?
>>>>
>>>>That's a good start, just to get something to work with. How this
>>>>fits in with the vm code and whether it is ultimately suitable in
>>>>the long run is probably up to Alan Cox. For now, just get
>>>>something that (a) doesn't break anything else; and (b) lets Wine
>>>>behave the way it needs to.
>>>>
>>>>AFAIK, there are still pthread issues with Wine, but those can't be
>>>>addressed until the mmap issue has a work-around.
>>>
>>>I've got a small patch that gets by the initial problem about not
>>>being able to mmap the memory for the libraries, but the addresses
>>>that are mmap'ed seem to overlap with memory that the current pthread
>>>implementation wants to mmap for the "red zone" when wine tries to
>>>create a thread. It can't mmap the "red zone" addresses since all
>>>those address mappings were gobbled up before the thread launched.
>>>I'll try to figure out a way to maybe leave a space for the "red
>>>zone" and see if that works.
>>>Someone who actually knows what they are doing should probably take
>>>a look.
>>
>>The red pages are implemented by leaving the memory space unallocated;
>>I don't like that one bit -- this will cause those spaces to be
>>allocated but given no protection, which should provide the crash
>>feature that the guard pages are there for, but be less bogus (and it
>>doesn't use more "memory," but it will use a few more vm_map_entrys).
>>
>>Index: lib/libpthread/thread/thr_stack.c
>>===================================================================
>>RCS file: /usr/ncvs/src/lib/libpthread/thread/thr_stack.c,v
>>retrieving revision 1.8
>>diff -u -r1.8 thr_stack.c
>>--- lib/libpthread/thread/thr_stack.c 14 Sep 2003 22:39:44 -0000 1.8
>>+++ lib/libpthread/thread/thr_stack.c 29 Aug 2004 04:50:28 -0000
>>@@ -214,6 +214,17 @@
>>                     stacksize, PROT_READ | PROT_WRITE, MAP_STACK,
>>                     -1, 0)) == MAP_FAILED)
>>                         attr->stackaddr_attr = NULL;
>>+                if (attr->stackaddr_attr != NULL) {
>>+                        void *red;
>>+
>>+                        red = mmap((char *)attr->stackaddr_attr + stacksize,
>>+                            _thr_guard_default, PROT_NONE,
>>+                            MAP_ANON | MAP_FIXED | MAP_PRIVATE, -1, 0);
>>+                        if (red == MAP_FAILED) {
>>+                                (void)munmap(attr->stackaddr_attr, stacksize);
>>+                                attr->stackaddr_attr = NULL;
>>+                        }
>>+                }
>>         }
>>         if (attr->stackaddr_attr != NULL)
>>                 return (0);
>
> This is good. Can this be committed?

So can the newest thread still overflow its stack with this code?
Also, what if another thread allocates the red zone area (but does not
use it as a red zone) before your thread can execute this mmap with
MAP_FIXED? This introduces a failure case we didn't have.
I think you should map stacksize + guard_size in one call and use
mprotect() to set up the red zone page.
From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 01:40:31 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4B88616A4CE; Thu, 9 Sep 2004 01:40:31 +0000 (GMT) Received: from green.homeunix.org (pcp04368961pcs.nrockv01.md.comcast.net [69.140.212.7]) by mx1.FreeBSD.org (Postfix) with ESMTP id 95FE343D31; Thu, 9 Sep 2004 01:40:30 +0000 (GMT) (envelope-from green@green.homeunix.org) Received: from green.homeunix.org (green@localhost [127.0.0.1]) by green.homeunix.org (8.13.1/8.13.1) with ESMTP id i891eTq6080674; Wed, 8 Sep 2004 21:40:29 -0400 (EDT) (envelope-from green@green.homeunix.org) Received: (from green@localhost) by green.homeunix.org (8.13.1/8.13.1/Submit) id i891eTqa080673; Wed, 8 Sep 2004 21:40:29 -0400 (EDT) (envelope-from green) Date: Wed, 8 Sep 2004 21:40:29 -0400 From: Brian Fundakowski Feldman To: David Xu Message-ID: <20040909014029.GL928@green.homeunix.org> References: <20040908174941.GF928@green.homeunix.org> <413FB291.7060501@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <413FB291.7060501@freebsd.org> User-Agent: Mutt/1.5.6i cc: threads@freebsd.org Subject: Re: [mistry.7@osu.edu: Re: FreeBSD and wine mmap] X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 09 Sep 2004 01:40:31 -0000 On Thu, Sep 09, 2004 at 09:32:01AM +0800, David Xu wrote: > Brian Fundakowski Feldman wrote: > > >I don't think that any allocated "red zones" should be left empty -- > >they should be explicitly allocated with no protection so that actually > >free memory space can be explicitly searched for. Comments? 
> > > >----- Forwarded message from Anish Mistry ----- > > > >From: Anish Mistry > >To: Brian Fundakowski Feldman > >Subject: Re: FreeBSD and wine mmap > >Date: Sun, 5 Sep 2004 13:15:15 -0400 > >Cc: freebsd-current@freebsd.org > > > >On Sunday 29 August 2004 12:59 am, you wrote: > > > >>On Tue, Aug 10, 2004 at 01:53:14PM -0400, Anish Mistry wrote: > >> > >>>-----BEGIN PGP SIGNED MESSAGE----- > >>>Hash: SHA1 > >>> > >>>On Wednesday 04 August 2004 06:39 pm, you wrote: > >>> > >>>>On Wed, Aug 04, 2004 at 06:28:02PM -0400, Anish Mistry wrote: > >>>> > >>>>>Ok, so we need something like vm_map_findspace(), but for > > > >process > > > >>>>>address mapping? ie. pmap_findspace() that will return an > > > >address to > > > >>>>>a large enough free chunk? > >>>> > >>>>That's a good start, just to get something to work with. How this > > > >fits > > > >>>>in with the vm code and whether it is ultimately suitable in the > > > >long > > > >>>>run is probably up to Alan Cox. For now, just get something that > > > >(a) > > > >>>>doesn't break anything else; and (b) lets Wine behave the way it > > > >needs > > > >>>>to. > >>>> > >>>>AFAIK, there are still pthread issues with Wine, but those can't > > > >be > > > >>>>addressed until the mmap issue has a work-around. > >>> > >>>I've got a small patch that gets by the initial problem about not > > > >being > > > >>>to mmap the memory for the libraries, but the addresses that are > > > >mmap'ed > > > >>>seem to seem to overlap with memory that the current pthread > >>>implementation want to mmap for the "red zone" when wine tries to > > > >create > > > >>>a thread. It can't mmap the "red zone" addresses since all those > > > >address > > > >>>mapping where gobbled up before the thread launched. > >>>I'll try to figure out a way to maybe leave a space for the "red > > > >zone" > > > >>>and see if that works. > >>>Someone who actually knows what they are doing should probably take > > > >a > > > >>>look. 
> >> > >>The red pages are implemented by leaving the memory space unallocated; > >>I don't like that one bit -- this will cause those spaces to be > > > >allocated > > > >>but given no protection, which should provide the crash feature that > > > >the > > > >>guard pages are there for, but be less bogus (and it doesn't use more > >>"memory," but it will use a few more vm_map_entrys. > >> > >>Index: lib/libpthread/thread/thr_stack.c > >>=================================================================== > >>RCS file: /usr/ncvs/src/lib/libpthread/thread/thr_stack.c,v > >>retrieving revision 1.8 > >>diff -u -r1.8 thr_stack.c > >>--- lib/libpthread/thread/thr_stack.c 14 Sep 2003 22:39:44 -0000 1.8 > >>+++ lib/libpthread/thread/thr_stack.c 29 Aug 2004 04:50:28 -0000 > >>@@ -214,6 +214,17 @@ > >> stacksize, PROT_READ | PROT_WRITE, MAP_STACK, > >> -1, 0)) == MAP_FAILED) > >> attr->stackaddr_attr = NULL; > >>+ if (attr->stackaddr_attr != NULL) { > >>+ void *red; > >>+ > >>+ red = mmap((char *)attr->stackaddr_attr + stacksize, > >>+ _thr_guard_default, PROT_NONE, > >>+ MAP_ANON | MAP_FIXED | MAP_PRIVATE, -1, 0); > >>+ if (red == MAP_FAILED) { > >>+ (void)munmap(attr->stackaddr_attr, stacksize); > >>+ attr->stackaddr_attr = NULL; > >>+ } > >>+ } > >> } > >> if (attr->stackaddr_attr != NULL) > >> return (0); > > > >This is good. Can this be committed? > > So can newest thread still overflow its stack with this code ? > Also how about if another thread allocated the red zone(but not used > as red zone) before your thread can execute this mmap with MAP_FIXED ? > This introduces a failure case we didn't have. > > I think you'd map stacksize + guard_size, and use mprotect() to set > red zone page. That's a very good point indeed. Would you like to modify the code to do that and commit it? -- Brian Fundakowski Feldman \'[ FreeBSD ]''''''''''\ <> green@FreeBSD.org \ The Power to Serve! \ Opinions expressed are my own. 
\,,,,,,,,,,,,,,,,,,,,,,\ From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 09:42:01 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D6F7416A4CE for ; Thu, 9 Sep 2004 09:42:01 +0000 (GMT) Received: from tts.orel.ru (tts.orel.ru [213.59.64.67]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1E6F643D4C for ; Thu, 9 Sep 2004 09:42:01 +0000 (GMT) (envelope-from bel@orel.ru) Received: from orel.ru (lg.orel.ru [62.33.11.59]) by tts.orel.ru (8.12.10/8.12.10/bel) with ESMTP id i899fveT004957 for ; Thu, 9 Sep 2004 13:41:59 +0400 Message-ID: <41402562.5050901@orel.ru> Date: Thu, 09 Sep 2004 13:41:54 +0400 From: Andrew Belashov Organization: ORIS User-Agent: Mozilla/5.0 (X11; U; FreeBSD sparc64; en-US; rv:1.6) Gecko/20040407 X-Accept-Language: ru, en-us, en MIME-Version: 1.0 To: freebsd-threads@freebsd.org X-Enigmail-Version: 0.83.5.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Zombi-Check: on netra2.orel.ru Subject: libkse: propagate test X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 09 Sep 2004 09:42:02 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hello, All! I'm debugging libkse library for FreeBSD/sparc64. Some tests from lib/libpthread/test/ failed. Please, explain me, that it means: =================================== bel@bel$ ./propagate_s.pl 1..1 ~ U close ~ U close ~ U wait ~ U sleep 4 propagation(s) not ok 1 =================================== How it can be corrected? - -- With best regards, Andrew Belashov. 
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFBQCVdwF8YpH80o/IRAoMTAJ0fHiXPq6hy7S9UmbXdz6GH1xZwEQCgnIWB q3QjdaAwOx5KWYtG8o1x+Fg= =7/Rl -----END PGP SIGNATURE----- From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 14:16:31 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5359316A4CE for ; Thu, 9 Sep 2004 14:16:31 +0000 (GMT) Received: from mail.ntplx.net (mail.ntplx.net [204.213.176.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id E5C6643D1F for ; Thu, 9 Sep 2004 14:16:30 +0000 (GMT) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) i89EGTdt012575; Thu, 9 Sep 2004 10:16:29 -0400 (EDT) Date: Thu, 9 Sep 2004 10:16:29 -0400 (EDT) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: Andrew Belashov In-Reply-To: <41402562.5050901@orel.ru> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.ntplx.net) cc: freebsd-threads@freebsd.org Subject: Re: libkse: propagate test X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: Daniel Eischen List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 09 Sep 2004 14:16:31 -0000 On Thu, 9 Sep 2004, Andrew Belashov wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hello, All! > > I'm debugging libkse library for FreeBSD/sparc64. > Some tests from lib/libpthread/test/ failed. > > Please, explain me, that it means: > > =================================== > bel@bel$ ./propagate_s.pl > 1..1 > ~ U close > > ~ U close > > ~ U wait > > ~ U sleep It means something in libc is calling those functions instead of using the internal names (e.g., _close() instead of close()). 
At a quick glance, the offenders seem to be: libc/gen/rcmdsh.c close, wait libc/gen/sysconf.c close libc/rpc/getnetconfig.c sleep -- Dan Eischen From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 18:23:56 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 04EB516A4CF for ; Thu, 9 Sep 2004 18:23:56 +0000 (GMT) Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7C27043D53 for ; Thu, 9 Sep 2004 18:23:55 +0000 (GMT) (envelope-from gallatin@cs.duke.edu) Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30]) by duke.cs.duke.edu (8.12.10/8.12.10) with ESMTP id i89INmJt020395 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 9 Sep 2004 14:23:48 -0400 (EDT) Received: (from gallatin@localhost) by grasshopper.cs.duke.edu (8.12.9p2/8.12.9/Submit) id i89INeHk058544; Thu, 9 Sep 2004 14:23:40 -0400 (EDT) (envelope-from gallatin) From: Andrew Gallatin MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16704.40876.708925.425911@grasshopper.cs.duke.edu> Date: Thu, 9 Sep 2004 14:23:40 -0400 (EDT) To: Julian Elischer In-Reply-To: <413F8DBB.5040502@elischer.org> References: <16703.11479.679335.588170@grasshopper.cs.duke.edu> <16703.12410.319869.29996@grasshopper.cs.duke.edu> <413F55B8.50003@elischer.org> <16703.28031.454342.774229@grasshopper.cs.duke.edu> <413F8DBB.5040502@elischer.org> X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid cc: freebsd-threads@freebsd.org Subject: Re: Unkillable KSE threaded proc X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 09 Sep 2004 18:23:56 -0000 Julian Elischer writes: > > I think it is > > 
show thread (address) FWIW, I think db_trace(thread addr, -1) seems to work better. When I enter ddb, currproc is init, so show thread seems to show garbage. db> ps pid proc uarea uid ppid pgrp flag stat wmesg wchan cmd 623 c1b5f380 e6850000 0 472 472 0000000 [RUNQ] cron 614 c1b5f8c0 e6853000 0 451 614 0000100 [RUNQ] sshd 613 c1a22540 e680a000 1387 1 611 000c482 (threaded) mx_pingpong thread 0xc1b617d0 ksegrp 0xc18779a0 [CPU 1] thread 0xc1b614b0 ksegrp 0xc18779a0 [SUSP] thread 0xc1b61320 ksegrp 0xc18779a0 [LOCK process lock c1b13200] thread 0xc2b6ce10 ksegrp 0xc1a270e0 [LOCK process lock c1b13200] db> call db_trace_thread(0xc1b617d0, -1) sched_switch(3249936336,3244003328,3244003328,468695918,1992661338) at sched_switch+216 mi_switch(2,3244003328,3244003668,3244003328,3867700060) at mi_switch+455 maybe_preempt(3244003328,252,0,3867700072,3226402603) at maybe_preempt+153 sched_add(70,3867700092,3226402999,3246881184,3867189248) at sched_add+259 end() at 3246881184 0 db> call db_trace_thread(0xc1b614b0, -1) sched_switch(3249935536,3249936336,0,2929115342,3959095726) at sched_switch+216 mi_switch(1,3249936336,0,0,0) at mi_switch+455 thread_single(1,423437840,7706937,1737258498,3243666960) at thread_single+471 exit1(3249935536,9,3867675836,3867675876,3226344614) at exit1+277 expand_name(3249935536,9,256,0,0) at expand_name postsig(9,3867675976,2,3243701424,0) at postsig+516 ast(3867675976) at ast+1508 doreti_ast() at doreti_ast+23 0 db> call db_trace_thread(0xc1b61320, -1) sched_switch(3249935136,0,0,2147060238,4154263705) at sched_switch+216 mi_switch(1,0,3249936336,3228346184,0) at mi_switch+455 turnstile_wait(3249615360,3248629164,3249936336,3248629056,3249935136) at turnstile_wait+825 _mtx_lock_sleep(3248629164,3249935136,0,0,0) at _mtx_lock_sleep+290 kse_release(3249935136,3867663636,4,3249935136,3867663676) at kse_release+322 syscall(47,47,47,134562304,0) at syscall+764 Xint0x80_syscall() at Xint0x80_syscall+31 --- syscall (383, FreeBSD ELF32, kse_release), eip = 
671759695, esp = 135876488, ebp = 135876548 --- 0 db> call db_trace_thread(0xc2b6ce10, -1) sched_switch(3266760208,0,0,2564282502,2143396982) at sched_switch+216 mi_switch(1,0,3266760208,3244171108,3228328544) at mi_switch+455 turnstile_wait(3249615360,3248629164,3249936336,3248629056,3266760208) at turnstile_wait+825 _mtx_lock_sleep(3248629164,3266760208,0,0,0) at _mtx_lock_sleep+290 kse_release(3266760208,3901611284,4,3266760208,3901611324) at kse_release+322 syscall(47,47,3215917103,1,129) at syscall+764 Xint0x80_syscall() at Xint0x80_syscall+31 --- syscall (383, FreeBSD ELF32, kse_release), eip = 671759695, esp = 3215978288, ebp = 3215978380 --- 0 > but if yuo can get a coredump it would be best.. > in ddb do: > call doadump > > in this case it looks like thread 0xc1f2aaf0 has called exit() and is > waiting for the others to exit.. > I wonder if the lock is the answer.. it woul dbe good to follow the link > in the mutex in the proc structure at 0xc1a2d8c0 > to see which thread OWNS it.. I'm following it from 0xc1a22540 for today's lockup: (kgdb) p $proc->p_mtx $3 = { mtx_object = { lo_class = 0xc069e55c, lo_name = 0xc067788d "process lock", lo_type = 0xc067788d "process lock", lo_flags = 0x430000, lo_list = { tqe_next = 0x0, tqe_prev = 0x0 }, lo_witness = 0x0 }, mtx_lock = 0xc1b617d2, mtx_recurse = 0x0 } 0xc1b617d2 is almost the same as the thread id of the first thread (0xc1b617d0).. I've still got the dump, so if you need more info please let me know. 
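
One way to read that near-match: if the low-order bits of mtx_lock are flag bits rather than part of the pointer, masking them off should give the owning thread. A minimal sketch under that assumption; the macro and function names are hypothetical and the flag values are modeled on sys/mutex.h, not copied from it:

```c
#include <stdint.h>

/* Assumed flag layout of the lock word (illustrative values only). */
#define LOCK_RECURSED	((uintptr_t)0x1)	/* assumption */
#define LOCK_CONTESTED	((uintptr_t)0x2)	/* assumption */
#define LOCK_FLAGMASK	(LOCK_RECURSED | LOCK_CONTESTED)

/* Strip the assumed flag bits to recover the owner's thread address. */
uintptr_t
lock_owner(uintptr_t lockword)
{
	return (lockword & ~LOCK_FLAGMASK);
}

/* Nonzero if the assumed "contested" bit is set in the lock word. */
int
lock_is_contested(uintptr_t lockword)
{
	return ((lockword & LOCK_CONTESTED) != 0);
}
```

Applied to the value above, lock_owner(0xc1b617d2) gives 0xc1b617d0, the address of the first thread in the ps listing.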
Drew

From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 19:08:27 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DEB7616A4CE; Thu, 9 Sep 2004 19:08:27 +0000 (GMT)
Received: from mail.vicor-nb.com (bigwoop.vicor-nb.com [208.206.78.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id A216A43D58; Thu, 9 Sep 2004 19:08:26 +0000 (GMT) (envelope-from julian@elischer.org)
Received: from elischer.org (julian.vicor-nb.com [208.206.78.97]) by mail.vicor-nb.com (Postfix) with ESMTP id 31D977A3E1; Thu, 9 Sep 2004 12:08:26 -0700 (PDT)
Message-ID: <4140AA2A.90605@elischer.org>
Date: Thu, 09 Sep 2004 12:08:26 -0700
From: Julian Elischer
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3.1) Gecko/20030516
X-Accept-Language: en, hu
MIME-Version: 1.0
To: Andrew Gallatin
References: <16703.11479.679335.588170@grasshopper.cs.duke.edu> <16703.12410.319869.29996@grasshopper.cs.duke.edu> <413F55B8.50003@elischer.org> <16703.28031.454342.774229@grasshopper.cs.duke.edu> <413F8DBB.5040502@elischer.org> <16704.40876.708925.425911@grasshopper.cs.duke.edu>
In-Reply-To: <16704.40876.708925.425911@grasshopper.cs.duke.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
cc: John Baldwin
cc: freebsd-threads@freebsd.org
Subject: Re: Unkillable KSE threaded proc
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Sep 2004 19:08:28 -0000

thanks,
I'm flooded with work for a couple of days..
it looks as if one of the threads (0xc1b614b0) has called exit,
which means it is in thread_single()
waiting for all the other threads to suicide, but at least one of them
doesn't want to..

Two of them (0xc1b61320 and 0xc2b6ce10) are refusing to finish up and exit
because they need the proc lock, which is owned by a fourth one..
(0xc1b617d0)

the fourth one has just preempted itself with some other thread
(3244003328 whatever that is in
hex (0xC15B9000)) do you still have the 'ps'?
what is thread (0xC15B9000)?

the thread that holds the lock is the first one below..
[skip below for further comments.]

interestingly

Andrew Gallatin wrote:

>Julian Elischer writes:
> >
> > I think it is
> >
> > show thread (address)
>
>FWIW, I think db_trace(thread addr, -1) seems to work better.
>When I enter ddb, currproc is init, so show thread
>seems to show garbage.
>
>
>db> ps
> pid proc uarea uid ppid pgrp flag stat wmesg wchan cmd
>
> thread 0xc1b617d0 ksegrp 0xc18779a0 [CPU 1]
> thread 0xc1b614b0 ksegrp 0xc18779a0 [SUSP]
> thread 0xc1b61320 ksegrp 0xc18779a0 [LOCK process lock c1b13200]
> thread 0xc2b6ce10 ksegrp 0xc1a270e0 [LOCK process lock c1b13200]
>
>db> call db_trace_thread(0xc1b617d0, -1)
>sched_switch(3249936336,3244003328,3244003328,468695918,1992661338) at sched_switch+216
>mi_switch(2,3244003328,3244003668,3244003328,3867700060) at mi_switch+455
>maybe_preempt(3244003328,252,0,3867700072,3226402603) at maybe_preempt+153
>sched_add(70,3867700092,3226402999,3246881184,3867189248) at sched_add+259
>end() at 3246881184
>0
>
>

odd that the stack trace stops there?? that in itself is weird..
I don't understand why the thread is marked as currently running on
CPU1. it called sched_switch that should have saved its state
and put it on the run queue (and marked it as such) so its state should
be RUNQ.
unless it has got into some infinite loop there, either going in or out of
the switchout.
it would be interesting to see the actual instruction pointer..
notice that preemption is involved...
john may also have an idea..
(CC'd)_ >db> call db_trace_thread(0xc1b614b0, -1) >sched_switch(3249935536,3249936336,0,2929115342,3959095726) at sched_switch+216 >mi_switch(1,3249936336,0,0,0) at mi_switch+455 >thread_single(1,423437840,7706937,1737258498,3243666960) at thread_single+471 >exit1(3249935536,9,3867675836,3867675876,3226344614) at exit1+277 >expand_name(3249935536,9,256,0,0) at expand_name >postsig(9,3867675976,2,3243701424,0) at postsig+516 >ast(3867675976) at ast+1508 >doreti_ast() at doreti_ast+23 >0 > >db> call db_trace_thread(0xc1b61320, -1) >sched_switch(3249935136,0,0,2147060238,4154263705) at sched_switch+216 >mi_switch(1,0,3249936336,3228346184,0) at mi_switch+455 >turnstile_wait(3249615360,3248629164,3249936336,3248629056,3249935136) at turnstile_wait+825 >_mtx_lock_sleep(3248629164,3249935136,0,0,0) at _mtx_lock_sleep+290 >kse_release(3249935136,3867663636,4,3249935136,3867663676) at kse_release+322 >syscall(47,47,47,134562304,0) at syscall+764 >Xint0x80_syscall() at Xint0x80_syscall+31 >--- syscall (383, FreeBSD ELF32, kse_release), eip = 671759695, esp = 135876488, ebp = 135876548 --- >0 > >db> call db_trace_thread(0xc2b6ce10, -1) >sched_switch(3266760208,0,0,2564282502,2143396982) at sched_switch+216 >mi_switch(1,0,3266760208,3244171108,3228328544) at mi_switch+455 >turnstile_wait(3249615360,3248629164,3249936336,3248629056,3266760208) at turnstile_wait+825 >_mtx_lock_sleep(3248629164,3266760208,0,0,0) at _mtx_lock_sleep+290 >kse_release(3266760208,3901611284,4,3266760208,3901611324) at kse_release+322 >syscall(47,47,3215917103,1,129) at syscall+764 >Xint0x80_syscall() at Xint0x80_syscall+31 >--- syscall (383, FreeBSD ELF32, kse_release), eip = 671759695, esp = 3215978288, ebp = 3215978380 --- >0 > > > > but if yuo can get a coredump it would be best.. > > in ddb do: > > call doadump > > > > in this case it looks like thread 0xc1f2aaf0 has called exit() and is > > waiting for the others to exit.. > > I wonder if the lock is the answer.. 
it woul dbe good to follow the link > > in the mutex in the proc structure at 0xc1a2d8c0 > > to see which thread OWNS it.. > >I'm following it from 0xc1a22540 for today's lockup: > >(kgdb) p $proc->p_mtx >$3 = { > mtx_object = { > lo_class = 0xc069e55c, > lo_name = 0xc067788d "process lock", > lo_type = 0xc067788d "process lock", > lo_flags = 0x430000, > lo_list = { > tqe_next = 0x0, > tqe_prev = 0x0 > }, > lo_witness = 0x0 > }, > mtx_lock = 0xc1b617d2, > mtx_recurse = 0x0 >} > > >0xc1b617d2 is almost the same as the thread id of the >first thread (0xc1b617d0).. > >I've still got the dump, so if you need more info please let me know. > >Drew >_______________________________________________ >freebsd-threads@freebsd.org mailing list >http://lists.freebsd.org/mailman/listinfo/freebsd-threads >To unsubscribe, send any mail to "freebsd-threads-unsubscribe@freebsd.org" > > From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 19:12:30 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A616B16A4CE; Thu, 9 Sep 2004 19:12:30 +0000 (GMT) Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4095643D1F; Thu, 9 Sep 2004 19:12:30 +0000 (GMT) (envelope-from gallatin@cs.duke.edu) Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30]) by duke.cs.duke.edu (8.12.10/8.12.10) with ESMTP id i89JCSJt027623 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 9 Sep 2004 15:12:28 -0400 (EDT) Received: (from gallatin@localhost) by grasshopper.cs.duke.edu (8.12.9p2/8.12.9/Submit) id i89JCNuQ058586; Thu, 9 Sep 2004 15:12:23 -0400 (EDT) (envelope-from gallatin) From: Andrew Gallatin MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16704.43799.598836.408741@grasshopper.cs.duke.edu> Date: Thu, 9 Sep 2004 15:12:23 -0400 (EDT) To: 
Julian Elischer
In-Reply-To: <4140AA2A.90605@elischer.org>
References: <16703.11479.679335.588170@grasshopper.cs.duke.edu> <16703.12410.319869.29996@grasshopper.cs.duke.edu> <413F55B8.50003@elischer.org> <16703.28031.454342.774229@grasshopper.cs.duke.edu> <413F8DBB.5040502@elischer.org> <16704.40876.708925.425911@grasshopper.cs.duke.edu> <4140AA2A.90605@elischer.org>
X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid
cc: John Baldwin
cc: freebsd-threads@freebsd.org
Subject: Re: Unkillable KSE threaded proc
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Sep 2004 19:12:30 -0000

Julian Elischer writes:
 > > thread 0xc1b617d0 ksegrp 0xc18779a0 [CPU 1]
 > > thread 0xc1b614b0 ksegrp 0xc18779a0 [SUSP]
 > > thread 0xc1b61320 ksegrp 0xc18779a0 [LOCK process lock c1b13200]
 > > thread 0xc2b6ce10 ksegrp 0xc1a270e0 [LOCK process lock c1b13200]
 > >
 > >db> call db_trace_thread(0xc1b617d0, -1)
 > >sched_switch(3249936336,3244003328,3244003328,468695918,1992661338) at sched_switch+216
 > >mi_switch(2,3244003328,3244003668,3244003328,3867700060) at mi_switch+455
 > >maybe_preempt(3244003328,252,0,3867700072,3226402603) at maybe_preempt+153
 > >sched_add(70,3867700092,3226402999,3246881184,3867189248) at sched_add+259
 > >end() at 3246881184
 > >0
 > >
 > >
 >
 > odd that the stack trace stops there?? that in itself is weird..
 > I don't understand why the thread is marked as currently running on
 > CPU1. it called sched_switch that should have saved its state
 > and put it on the run queue (and marked it as such) so its state should
 > be RUNQ.

In another context, John recently told me:

for threads on CPUs we don't save the current thread state anywhere
when we enter the debugger, so the backtrace is only relevant for info
since the last context switch.

Maybe that explains at least this weirdness..

Drew

From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 19:26:29 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id BA44C16A4CE
    for ; Thu, 9 Sep 2004 19:26:29 +0000 (GMT)
Received: from mail2.speakeasy.net (mail2.speakeasy.net [216.254.0.202])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 939CE43D3F
    for ; Thu, 9 Sep 2004 19:26:29 +0000 (GMT)
    (envelope-from jhb@FreeBSD.org)
Received: (qmail 824 invoked from network); 9 Sep 2004 19:26:29 -0000
Received: from dsl027-160-063.atl1.dsl.speakeasy.net (HELO server.baldwin.cx) ([216.27.160.63])
    (envelope-sender ) encrypted SMTP
    for ; 9 Sep 2004 19:26:29 -0000
Received: from [10.50.40.210] (gw1.twc.weather.com [216.133.140.1])
    (authenticated bits=0)
    by server.baldwin.cx (8.12.11/8.12.11) with ESMTP id i89JQQOM009310;
    Thu, 9 Sep 2004 15:26:26 -0400 (EDT)
    (envelope-from jhb@FreeBSD.org)
From: John Baldwin
To: Julian Elischer
Date: Thu, 9 Sep 2004 15:26:29 -0400
User-Agent: KMail/1.6.2
References: <16703.11479.679335.588170@grasshopper.cs.duke.edu>
    <16704.40876.708925.425911@grasshopper.cs.duke.edu>
    <4140AA2A.90605@elischer.org>
In-Reply-To: <4140AA2A.90605@elischer.org>
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <200409091526.29468.jhb@FreeBSD.org>
X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on server.baldwin.cx
cc: Andrew Gallatin
cc: freebsd-threads@FreeBSD.org
Subject: Re: Unkillable KSE threaded proc
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Sep 2004 19:26:29 -0000

On Thursday 09 September 2004 03:08 pm, Julian Elischer wrote:
> >I'm following it from 0xc1a22540 for today's lockup:
> >
> >(kgdb) p $proc->p_mtx
> >$3 = {
> >  mtx_object = {
> >    lo_class = 0xc069e55c,
> >    lo_name = 0xc067788d "process lock",
> >    lo_type = 0xc067788d "process lock",
> >    lo_flags = 0x430000,
> >    lo_list = {
> >      tqe_next = 0x0,
> >      tqe_prev = 0x0
> >    },
> >    lo_witness = 0x0
> >  },
> >  mtx_lock = 0xc1b617d2,
> >  mtx_recurse = 0x0
> >}
> >
> >
> >0xc1b617d2 is almost the same as the thread id of the
> >first thread (0xc1b617d0)..

That means it has a flag set, 2 = MTX_CONTESTED meaning that it is a contested
lock. 0xc1b617d0 is the thread that owns the lock.

--
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org

From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 19:37:59 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id CACAF16A4CE;
    Thu, 9 Sep 2004 19:37:59 +0000 (GMT)
Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 7512343D49;
    Thu, 9 Sep 2004 19:37:59 +0000 (GMT)
    (envelope-from gallatin@cs.duke.edu)
Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30])
    by duke.cs.duke.edu (8.12.10/8.12.10) with ESMTP id i89JbuJt001827
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
    Thu, 9 Sep 2004 15:37:56 -0400 (EDT)
Received: (from gallatin@localhost)
    by grasshopper.cs.duke.edu (8.12.9p2/8.12.9/Submit) id i89JbpkK058606;
    Thu, 9 Sep 2004 15:37:51 -0400 (EDT)
    (envelope-from gallatin)
From: Andrew Gallatin
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <16704.45327.42494.922427@grasshopper.cs.duke.edu>
Date: Thu, 9 Sep 2004 15:37:51 -0400 (EDT)
To: Julian Elischer
In-Reply-To: <4140AA2A.90605@elischer.org>
References: <16703.11479.679335.588170@grasshopper.cs.duke.edu>
    <16703.12410.319869.29996@grasshopper.cs.duke.edu>
    <413F55B8.50003@elischer.org>
    <16703.28031.454342.774229@grasshopper.cs.duke.edu>
    <413F8DBB.5040502@elischer.org>
    <16704.40876.708925.425911@grasshopper.cs.duke.edu>
    <4140AA2A.90605@elischer.org>
X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid
cc: John Baldwin
cc: freebsd-threads@freebsd.org
Subject: Re: Unkillable KSE threaded proc
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Sep 2004 19:37:59 -0000

Julian Elischer writes:
 > thanks,
 > I'm flooded with work for a couple of days..

Me too.. Sorry for the terrible latency in giving you more info.

 > it looks as if one of the threads (0xc1b614b0) has called exit,
 > which means it is in thread_single()
 > waiting for all the other threads to suicide, but at least one of them
 > doesn't want to..
 >
 > Two of them (0xc1b61320 and 0xc2b6ce10) are refusing to finish up and exit
 > because they need the proc lock, which is owned by a fourth one..
 > (0xc1b617d0)
 >
 > the fourth one has just preempted itself with some other thread
 > (3244003328 whatever that is in
 > hex (0xC15B9000)) do you still have the 'ps'?
 > what is thread (0xC15B9000)?
 >

No, but I've got the dump. It looks like it was preempted by
the fxp ethernet driver's ithread:

(kgdb) p ((struct thread*)0xC15B9000)->td_proc->p_comm
$7 = "irq31: fxp0\0\0\0\0\0\0\0\0"

Maybe this would be easier to debug if I disabled preemption?
% cat opt_sched.h
#define PREEMPTION 1
#define SCHED_4BSD 1

Drew

From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 20:42:53 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id D6A8216A4CE;
    Thu, 9 Sep 2004 20:42:53 +0000 (GMT)
Received: from mail.vicor-nb.com (bigwoop.vicor-nb.com [208.206.78.2])
    by mx1.FreeBSD.org (Postfix) with ESMTP id BD71843D53;
    Thu, 9 Sep 2004 20:42:53 +0000 (GMT)
    (envelope-from julian@elischer.org)
Received: from elischer.org (julian.vicor-nb.com [208.206.78.97])
    by mail.vicor-nb.com (Postfix) with ESMTP id 86D347A3D2;
    Thu, 9 Sep 2004 13:42:53 -0700 (PDT)
Message-ID: <4140C04D.1060906@elischer.org>
Date: Thu, 09 Sep 2004 13:42:53 -0700
From: Julian Elischer
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3.1) Gecko/20030516
X-Accept-Language: en, hu
MIME-Version: 1.0
To: Andrew Gallatin
References: <16703.11479.679335.588170@grasshopper.cs.duke.edu>
    <16703.12410.319869.29996@grasshopper.cs.duke.edu>
    <413F55B8.50003@elischer.org>
    <16703.28031.454342.774229@grasshopper.cs.duke.edu>
    <413F8DBB.5040502@elischer.org>
    <16704.40876.708925.425911@grasshopper.cs.duke.edu>
    <4140AA2A.90605@elischer.org>
    <16704.45327.42494.922427@grasshopper.cs.duke.edu>
In-Reply-To: <16704.45327.42494.922427@grasshopper.cs.duke.edu>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
cc: John Baldwin
cc: freebsd-threads@freebsd.org
Subject: Re: Unkillable KSE threaded proc
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Sep 2004 20:42:54 -0000

Andrew Gallatin wrote:

>Julian Elischer writes:
> > thanks,
> > I'm flooded with work for a couple of days..
>
>Me too.. Sorry for the terrible latency in giving you more info.
>
> > it looks as if one of the threads (0xc1b614b0) has called exit,
> > which means it is in thread_single()
> > waiting for all the other threads to suicide, but at least one of them
> > doesn't want to..
> >
> > Two of them (0xc1b61320 and 0xc2b6ce10) are refusing to finish up and exit
> > because they need the proc lock, which is owned by a fourth one..
> > (0xc1b617d0)
> >
> > the fourth one has just preempted itself with some other thread
> > (3244003328 whatever that is in
> > hex (0xC15B9000)) do you still have the 'ps'?
> > what is thread (0xC15B9000)?
> >
>
>No, but I've got the dump. It looks like it was preempted by
>the fxp ethernet driver's ithread:
>
>(kgdb) p ((struct thread*)0xC15B9000)->td_proc->p_comm
>$7 = "irq31: fxp0\0\0\0\0\0\0\0\0"
>
>Maybe this would be easier to debug if I disabled preemption?
>

I think that this would possibly GO AWAY if you disabled preemption,
which would make it very hard to debug :-)

>
>% cat opt_sched.h
>#define PREEMPTION 1
>#define SCHED_4BSD 1
>
>
>Drew
>
>

From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 20:46:38 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id E9B5B16A4CE;
    Thu, 9 Sep 2004 20:46:38 +0000 (GMT)
Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 944F343D39;
    Thu, 9 Sep 2004 20:46:38 +0000 (GMT)
    (envelope-from gallatin@cs.duke.edu)
Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30])
    by duke.cs.duke.edu (8.12.10/8.12.10) with ESMTP id i89KkaJt013470
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
    Thu, 9 Sep 2004 16:46:36 -0400 (EDT)
Received: (from gallatin@localhost)
    by grasshopper.cs.duke.edu (8.12.9p2/8.12.9/Submit) id i89KkVeb058664;
    Thu, 9 Sep 2004 16:46:31 -0400 (EDT)
    (envelope-from gallatin)
From: Andrew Gallatin
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <16704.49447.290897.602540@grasshopper.cs.duke.edu>
Date: Thu, 9 Sep 2004 16:46:31 -0400 (EDT)
To: Julian Elischer
In-Reply-To: <4140C04D.1060906@elischer.org>
References: <16703.11479.679335.588170@grasshopper.cs.duke.edu>
    <16703.12410.319869.29996@grasshopper.cs.duke.edu>
    <413F55B8.50003@elischer.org>
    <16703.28031.454342.774229@grasshopper.cs.duke.edu>
    <413F8DBB.5040502@elischer.org>
    <16704.40876.708925.425911@grasshopper.cs.duke.edu>
    <4140AA2A.90605@elischer.org>
    <16704.45327.42494.922427@grasshopper.cs.duke.edu>
    <4140C04D.1060906@elischer.org>
X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid
cc: John Baldwin
cc: freebsd-threads@freebsd.org
Subject: Re: Unkillable KSE threaded proc
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Sep 2004 20:46:39 -0000

Julian Elischer writes:
 > >
 > >Maybe this would be easier to debug if I disabled preemption?
 > >
 >
 >
 > I think that this would possibly GO AWAY if you disabled preemption,
 > which would make it very hard to debug :-)
 >

Yes and no. You initially asked me to try in -current because of
some changes you'd made to the exit code. RELENG_5 (with the old
exit code and no preemption) shows a different problem (proc is
just not killable). If the proc was killable without preemption,
that would at least show your new code is better..
Drew

From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 21:38:41 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id EF71416A4CE;
    Thu, 9 Sep 2004 21:38:40 +0000 (GMT)
Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1])
    by mx1.FreeBSD.org (Postfix) with ESMTP id A071143D48;
    Thu, 9 Sep 2004 21:38:40 +0000 (GMT)
    (envelope-from gallatin@cs.duke.edu)
Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30])
    by duke.cs.duke.edu (8.12.10/8.12.10) with ESMTP id i89LccJt021382
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
    Thu, 9 Sep 2004 17:38:38 -0400 (EDT)
Received: (from gallatin@localhost)
    by grasshopper.cs.duke.edu (8.12.9p2/8.12.9/Submit) id i89LcX8V058712;
    Thu, 9 Sep 2004 17:38:33 -0400 (EDT)
    (envelope-from gallatin)
From: Andrew Gallatin
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <16704.52569.375858.857614@grasshopper.cs.duke.edu>
Date: Thu, 9 Sep 2004 17:38:33 -0400 (EDT)
To: Julian Elischer
In-Reply-To: <4140C04D.1060906@elischer.org>
References: <16703.11479.679335.588170@grasshopper.cs.duke.edu>
    <16703.12410.319869.29996@grasshopper.cs.duke.edu>
    <413F55B8.50003@elischer.org>
    <16703.28031.454342.774229@grasshopper.cs.duke.edu>
    <413F8DBB.5040502@elischer.org>
    <16704.40876.708925.425911@grasshopper.cs.duke.edu>
    <4140AA2A.90605@elischer.org>
    <16704.45327.42494.922427@grasshopper.cs.duke.edu>
    <4140C04D.1060906@elischer.org>
X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid
cc: John Baldwin
cc: freebsd-threads@freebsd.org
Subject: Re: Unkillable KSE threaded proc
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Sep 2004 21:38:41 -0000

Julian Elischer writes:
 > I think that this would possibly GO AWAY if you disabled preemption,
 > which would make it very hard to debug :-)

Nope, still happens w/o preempt.. And it's the "worse" problem of
deadlocking the system rather than just having the process fail to exit.

db> ps
  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  579 c37e41c0 e8855000 1387   578   579 0004002 [SLPQ ttyin 0xc17df810][SLP] csh
  578 c1817540 e671a000 1387   576   576 0000100 [SLPQ select 0xc06cb704][SLP] sshd
  576 c37e4540 e8857000    0   451   576 0000100 [SLPQ sbwait 0xc1983e84][SLP] sshd
  566 c1a1fc40 e67ba000 1387     1   564 000c482 (threaded) mx_pingpong
   thread 0xc37944b0 ksegrp 0xc1a20460 [CPU 0]
   thread 0xc3794640 ksegrp 0xc1a20460 [SUSP]
   thread 0xc187e320 ksegrp 0xc1a20460 [RUNQ]
   thread 0xc187e4b0 ksegrp 0xc187fee0 [CPU 1]

db> call db_trace_thread(0xc37944b0, -1)
kdb_enter(c0686ceb,c0645179,fc,c37944b0,c16bd000) at kdb_enter+0x30
siointr1(c16bd000,2,fc,e8842ba0,c0650df2) at siointr1+0xd1
siointr(c16bd000,c37944b0,c1a1fc40,4,c37944b0) at siointr+0x77
intr_execute_handlers(c1556e90,e8842be0,e8842c28,c0639f53,34) at intr_execute_handlers+0x8d
lapic_handle_intr(34) at lapic_handle_intr+0x3b
Xapic_isr1() at Xapic_isr1+0x33
--- interrupt, eip = 0xc04ea44a, esp = 0xe8842c24, ebp = 0xe8842c28 ---
thread_suspend_check(0,246,e8842c60,c0501b86,c37944b0) at thread_suspend_check+0x21f
exit1(c37944b0,9,0,0,c04e1e66) at exit1+0x109
expand_name(c37944b0,9,100,0,0) at expand_name
postsig(9,c37944b0,0,0,0) at postsig+0x204
ast(e8842d48) at ast+0x5e4
doreti_ast() at doreti_ast+0x17
0
db> call db_trace_thread(0xc3794640, -1)
sched_switch(c3794640,c37944b0,0,94bc2c2e,2227b660) at sched_switch+0xd8
mi_switch(1,c37944b0,0,0,0) at mi_switch+0x1c7
thread_single(1,c3794640,0,0,0) at thread_single+0x1d7
exit1(c3794640,9,e8845cbc,e8845ce4,c04e1e66) at exit1+0x115
expand_name(c3794640,9,100,0,0) at expand_name
postsig(9,c3794640,0,0,0) at postsig+0x204
ast(e8845d48) at ast+0x5e4
doreti_ast() at doreti_ast+0x17
0
db> call db_trace_thread(0xc187e320, -1)
sched_switch(c187e320,0,0,9a67657e,e2359ef4) at sched_switch+0xd8
mi_switch(2,0,0,0,0) at mi_switch+0x1c7
ast(e6749d48) at ast+0x4eb
doreti_ast() at doreti_ast+0x17
0
db> call db_trace_thread(0xc187e4b0, -1)
sched_switch(c187fee0,1e,0,1e,0) at sched_switch+0xd8
0
db> show pcpu
cpuid        = 0
curthread    = 0xc37944b0: pid 566 "mx_pingpong"
curpcb       = 0xe8842da0
fpcurthread  = none
idlethread   = 0xc1561640: pid 12 "idle: cpu0"
APIC ID      = 0
currentldt   = 0x30
db> show pcpu 1
cpuid        = 1
curthread    = 0xc187e4b0: pid 566 "mx_pingpong"
curpcb       = 0xe674cda0
fpcurthread  = none
idlethread   = 0xc15614b0: pid 11 "idle: cpu1"
APIC ID      = 1
currentldt   = 0x30

According to kgdb, the lock holder for the proc lock is 0xc37944b0:

(kgdb) p/x td->td_proc->p_mtx->mtx_lock
$8 = 0xc37944b2

Maybe it's some sort of spinlock deadlock.. I'm going to enable
witness and try again.

Drew

From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 22:40:33 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id C3CE516A4D0;
    Thu, 9 Sep 2004 22:40:33 +0000 (GMT)
Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 32FE743D39;
    Thu, 9 Sep 2004 22:40:33 +0000 (GMT)
    (envelope-from gallatin@cs.duke.edu)
Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30])
    by duke.cs.duke.edu (8.12.10/8.12.10) with ESMTP id i89MeRJt029538
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
    Thu, 9 Sep 2004 18:40:27 -0400 (EDT)
Received: (from gallatin@localhost)
    by grasshopper.cs.duke.edu (8.12.9p2/8.12.9/Submit) id i89MeMUO058769;
    Thu, 9 Sep 2004 18:40:22 -0400 (EDT)
    (envelope-from gallatin)
From: Andrew Gallatin
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <16704.56278.102480.817628@grasshopper.cs.duke.edu>
Date: Thu, 9 Sep 2004 18:40:22 -0400 (EDT)
To:
    Julian Elischer
In-Reply-To: <4140C04D.1060906@elischer.org>
References: <16703.11479.679335.588170@grasshopper.cs.duke.edu>
    <16703.12410.319869.29996@grasshopper.cs.duke.edu>
    <413F55B8.50003@elischer.org>
    <16703.28031.454342.774229@grasshopper.cs.duke.edu>
    <413F8DBB.5040502@elischer.org>
    <16704.40876.708925.425911@grasshopper.cs.duke.edu>
    <4140AA2A.90605@elischer.org>
    <16704.45327.42494.922427@grasshopper.cs.duke.edu>
    <4140C04D.1060906@elischer.org>
X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid
cc: John Baldwin
cc: freebsd-threads@freebsd.org
Subject: Re: Unkillable KSE threaded proc
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Sep 2004 22:40:34 -0000

Here it is with WITNESS, after hacking it to enter my driver's
spinlocks into the order list, and to expand by 4x the size of the
witness pool..

db> ps
  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  595 c16478c0 e52e2000 1387     1   593 000c482 (threaded) mx_pingpong
   thread 0xc35ff960 ksegrp 0xc15ba7e0 [CPU 0]
   thread 0xc35ffaf0 ksegrp 0xc15ba7e0 [SUSP]
   thread 0xc35ffc80 ksegrp 0xc15ba7e0 [RUNQ]
   thread 0xc35ffe10 ksegrp 0xc1880e70 [CPU 1]
db> sho pcpu
cpuid        = 0
curthread    = 0xc35ff960: pid 595 "mx_pingpong"
curpcb       = 0xe782dda0
fpcurthread  = none
idlethread   = 0xc1561640: pid 12 "idle: cpu0"
APIC ID      = 0
currentldt   = 0x30
spin locks held:
exclusive spin mutex sio r = 0 (0xc0702220) locked @ dev/sio/sio.c:1709
db> sho locks
exclusive sleep mutex process lock r = 0 (0xc164792c) locked @ kern/kern_exit.c:136
exclusive spin mutex sio r = 0 (0xc0702220) locked @ dev/sio/sio.c:1709
db> sho pcpu 1
cpuid        = 1
curthread    = 0xc35ffe10: pid 595 "mx_pingpong"
curpcb       = 0xe7836da0
fpcurthread  = none
idlethread   = 0xc15614b0: pid 11 "idle: cpu1"
APIC ID      = 1
currentldt   = 0x30
spin locks held:
db> call db_trace_thread(0xc35ffaf0, -1)
sched_switch(c35ffaf0,c35ff960,0,11d,f187c1ca) at sched_switch+0xfd
mi_switch(1,c35ff960,c065f8ac,335,c164792c) at mi_switch+0x2a0
thread_single(1,0,c065c2a3,88,e7830c68) at thread_single+0x1d7
exit1(c35ffaf0,9,c065eebc,996,1) at exit1+0xd5
expand_name(c35ffaf0,9,c065eebc,928,0) at expand_name
postsig(9,0,c0661a77,100,1020800) at postsig+0x1e0
ast(e7830d48) at ast+0x48a
doreti_ast() at doreti_ast+0x17
0
db> call db_trace_thread(0xc35ffc80, -1)
sched_switch(c35ffc80,0,0,117,7b0cc0ca) at sched_switch+0xfd
mi_switch(2,0,c0661a77,f5,1010000) at mi_switch+0x2a0
ast(e7833d48) at ast+0x3dd
doreti_ast() at doreti_ast+0x17
0
db> call db_trace_thread(0xc35ffe10, -1)
sched_switch(18e,bb9,c1880e70,1e,0) at sched_switch+0xfd
__func__.0() at __func__.0+0xac79
0
db>

Show witness is kinda long.
Sleep locks:
0 mxsleep(0,2): es->wait_sync -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/mx.c:774
0 mxsleep(0,2): es->cmd_sync -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/mx.c:774
0 mxsleep(0,1): es->wait_sync -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/mx.c:774
0 mxsleep(0,1): es->cmd_sync -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/mx.c:774
0 mxsleep(0,0): mapper sync -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/mx.c:774
0 mxsleep(0,-1): route update sync -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/mx.c:774
0 mxsleep(0,-1): route cmd sync -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/mx.c:774
0 mx(-1,0): mx mapper mapbuf -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/../common/mx_msgbuf.c:84
0 mx(-1,0): mx mapper msgbuf -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/../common/mx_msgbuf.c:84
0 ATAPI CD bioqueue lock -- last acquired @ dev/ata/atapi-cd.c:1100
0 g_xdown -- last acquired @ geom/geom_io.c:392
4 ATA queue lock -- last acquired @ dev/ata/ata-queue.c:172
4 bio queue -- last acquired @ geom/geom_io.c:65
4 ATA disk bioqueue lock -- last acquired @ dev/ata/ata-disk.c:236
12 system map -- last acquired @ vm/vm_map.c:2313
13 kmem object -- last acquired @ vm/vm_kern.c:398
14 vm page queue mutex -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/mx.c:842
15 vnode interlock -- last acquired @ kern/vfs_subr.c:2099
16 spechash -- last acquired @ kern/vfs_subr.c:1903
16 Syncer mtx -- last acquired @ kern/vfs_subr.c:1504
16 vnode_free_list -- last acquired @ kern/vfs_subr.c:3217
16 cdev -- last acquired @ kern/kern_conf.c:81
17 Malloc Stats -- last acquired @ kern/kern_malloc.c:185
17 UMA pcpu -- last acquired @ vm/uma_core.c:1800
18 KMAP ENTRY -- last acquired @ vm/uma_core.c:2224
19 UMA zone -- last acquired @ vm/uma_core.c:1817
16 Name Cache -- last acquired @ kern/vfs_cache.c:352
15 CMAPCADDR12 -- last acquired @ i386/i386/pmap.c:2488
15 pmap -- last acquired @ i386/i386/pmap.c:847
16 uma object -- last acquired @ vm/uma_core.c:963
17 UMA pcpu -- (already displayed)
13 kernel object -- last acquired @ vm/vm_object.c:454
14 vm page queue mutex -- (already displayed)
0 g_xup -- last acquired @ geom/geom_io.c:449
1 g_disk_done -- last acquired @ geom/geom_disk.c:196
4 bio queue -- (already displayed)
17 UMA pcpu -- (already displayed)
3 Giant -- last acquired @ kern/kern_timeout.c:247
4 kobj -- last acquired @ kern/subr_kobj.c:298
4 struct mount mtx -- last acquired @ kern/vfs_subr.c:901
15 vnode interlock -- (already displayed)
4 bounce pages lock -- last acquired @ i386/i386/busdma_machdep.c:860
4 vm86 lock -- last acquired @ i386/i386/vm86.c:582
4 standard object -- last acquired @ vm/vm_object.c:454
5 dev_pager list -- last acquired @ vm/device_pager.c:163
5 vm object_list -- last acquired @ vm/vm_object.c:643
14 vm page queue mutex -- (already displayed)
4 udp -- last acquired @ netinet/udp_usrreq.c:263
5 UMA lock -- last acquired @ vm/uma_core.c:1466
12 system map -- (already displayed)
5 udpinp -- last acquired @ netinet/udp_usrreq.c:401
6 arc4_mtx -- last acquired @ libkern/arc4random.c:137
6 accept -- last acquired @ kern/uipc_socket.c:334
7 so_snd -- last acquired @ kern/uipc_socket.c:2091
8 so_rcv -- last acquired @ kern/uipc_socket.c:2092
9 radix node head -- last acquired @ net/route.c:148
10 ifnet -- last acquired @ net/if.c:697
10 rtentry -- last acquired @ netinet/ip_output.c:824
11 rts_inq -- last acquired @ net/netisr.c:231
11 network driver -- last acquired @ dev/fxp/if_fxp.c:1225
12 bpf interface lock -- last acquired @ net/bpf.c:1155
13 bpf cdev lock -- last acquired @ net/bpf.c:1157
14 sellck -- last acquired @ kern/sys_generic.c:726
12 if send queue -- last acquired @ dev/fxp/if_fxp.c:1267
12 knlist lock for lockless objects -- last acquired @ kern/kern_event.c:1598
12 system map -- (already displayed)
11 ifaddr -- last acquired @ net/if.c:594
9 process lock -- last acquired @ kern/kern_exit.c:136
10 ktrace -- last acquired @ kern/kern_exit.c:347
10 sigacts -- last acquired @ kern/subr_trap.c:256
10 struct pargs.ref -- last acquired @ kern/kern_proc.c:1100
10 session -- last acquired @ kern/kern_proc.c:464
15 vnode interlock -- (already displayed)
11 tty -- last acquired @ kern/tty.c:2746
11 uidinfo hash -- last acquired @ kern/kern_resource.c:1004
12 sleep mtxpool -- last acquired @ kern/kern_descrip.c:1896
12 uidinfo struct -- last acquired @ order list:0
13 allprison -- last acquired @ kern/kern_jail.c:460
4 GEOM orphanage -- last acquired @ geom/geom_event.c:170
4 ATA disk bioqueue lock -- (already displayed)
4 ithread -- last acquired @ kern/kern_intr.c:276
4 kernel linker -- last acquired @ kern/kern_linker.c:461
4 protect sysfilt_ops -- last acquired @ kern/kern_event.c:667
4 TID lock -- last acquired @ kern/kern_thread.c:206
4 rman head -- last acquired @ kern/subr_rman.c:111
4 rman -- last acquired @ kern/subr_rman.c:448
12 system map -- (already displayed)
4 bio queue -- (already displayed)
4 taskqueue list -- last acquired @ kern/subr_taskqueue.c:85
4 sf_buf -- last acquired @ i386/i386/vm_machdep.c:674
4 domain list -- last acquired @ kern/uipc_domain.c:110
4 buffer daemon lock -- last acquired @ kern/vfs_bio.c:401
4 ttylist -- last acquired @ kern/tty.c:2745
11 tty -- (already displayed)
4 if_cloners lock -- last acquired @ net/if_clone.c:199
4 pseudofs -- last acquired @ fs/pseudofs/pseudofs_fileno.c:86
4 pbuf mutex -- last acquired @ vm/vm_pager.c:414
4 accounting -- last acquired @ kern/kern_acct.c:232
4 if_clone lock -- last acquired @ net/if_clone.c:304
4 pfil_head_mtx -- last acquired @ net/pfil.c:166
5 pfil_head_list lock -- last acquired @ net/pfil.c:172
4 bdone lock -- last acquired @ kern/vfs_bio.c:3759
4 nfsd_mtx -- last acquired @ nfsserver/nfs_srvsock.c:811
4 lo_mtx -- last acquired @ net/if_loop.c:154
4 fdesc -- last acquired @ kern/kern_descrip.c:1614
5 filedesc structure -- last acquired @ kern/kern_descrip.c:1926
6 accept -- (already displayed)
6 devd -- last acquired @ kern/subr_bus.c:497
14 sellck -- (already displayed)
6 pipe mutex -- last acquired @ kern/sys_pipe.c:1520
14 sellck -- (already displayed)
7 sigio lock -- last acquired @ kern/kern_descrip.c:729
8 process group -- last acquired @ kern/kern_proc.c:458
9 process lock -- (already displayed)
4 mntid -- last acquired @ kern/vfs_subr.c:407
5 mountlist -- last acquired @ kern/vfs_subr.c:3464
4 pseudofs_vncache -- last acquired @ fs/pseudofs/pseudofs_vncache.c:239
4 ATA queue lock -- (already displayed)
4 taskqueue -- last acquired @ kern/subr_taskqueue.c:193
4 devstat -- last acquired @ kern/subr_devstat.c:190
4 buf queue lock -- last acquired @ kern/vfs_bio.c:1505
15 vnode interlock -- (already displayed)
4 ufs ihash -- last acquired @ ufs/ufs/ufs_ihash.c:120
15 vnode interlock -- (already displayed)
4 dirhash list -- last acquired @ ufs/ufs/ufs_dirhash.c:348
5 dirhash -- last acquired @ ufs/ufs/ufs_dirhash.c:349
4 needsbuffer lock -- last acquired @ kern/vfs_bio.c:296
4 runningbufspace lock -- last acquired @ kern/vfs_bio.c:314
4 eventhandler -- last acquired @ kern/subr_eventhandler.c:213
5 eventhandler list -- last acquired @ kern/kern_exit.c:199
17 Malloc Stats -- (already displayed)
17 UMA pcpu -- (already displayed)
4 rtsock route_cb lock -- last acquired @ net/rtsock.c:234
4 tcp -- last acquired @ netinet/tcp_timer.c:138
5 tcpinp -- last acquired @ netinet/tcp_input.c:737
6 tcp_hc_entry -- last acquired @ netinet/tcp_hostcache.c:287
12 system map -- (already displayed)
6 random reseed -- last acquired @ dev/random/yarrow.c:193
6 arc4_mtx -- (already displayed)
6 so_glabel -- last acquired @ kern/uipc_socket.c:282
6 accept -- (already displayed)
4 malloc -- last acquired @ kern/kern_malloc.c:518
17 Malloc Stats -- (already displayed)
4 mx(0,2): es->sync -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/mx.c:539
4 bpf global lock -- last acquired @ net/bpf.c:1445
4 if_afdata -- last acquired @ net/if.c:487
4 unp -- last acquired @ kern/uipc_usrreq.c:247
6 accept -- (already displayed)
4 mxsleep(0,-1): is->sync -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/mx.c:774
4 mx(-1,-1): mx_global_mutex -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/../common/mx_instance.c:1578
5 mx(0,-1): is->sync -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/../common/mx_common.c:952
4 mx(0,1): es->sync -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/mx.c:539
4 mx(-1,0): peer sync -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/../common/mx_peer.c:248
0 arp_inq -- last acquired @ net/netisr.c:231
0 igmp_mtx -- last acquired @ netinet/igmp.c:431
0 ip_inq -- last acquired @ net/netisr.c:231
0 ipqlock -- last acquired @ netinet/ip_input.c:1092
17 UMA pcpu -- (already displayed)
0 sem -- last acquired @ kern/sysv_sem.c:1174
0 GEOM topology -- last acquired @ geom/geom_event.c:202
4 bdone lock -- (already displayed)
4 GEOM orphanage -- (already displayed)
4 devstat -- (already displayed)
1 fdc lock -- last acquired @ dev/fdc/fdc.c:748
2 callout_wait_lock -- last acquired @ kern/kern_timeout.c:289
1 swapdev -- last acquired @ vm/swap_pager.c:2124
12 system map -- (already displayed)
4 ATA queue lock -- (already displayed)
4 bio queue -- (already displayed)
0 p_peers -- last acquired @ kern/kern_exit.c:243
0 module subsystem sx lock -- last acquired @ kern/kern_module.c:113
0 rawcb -- last acquired @ net/raw_usrreq.c:80
8 so_rcv -- (already displayed)
0 sysctl lock -- last acquired @ kern/kern_sysctl.c:1315
1 rip -- last acquired @ netinet/raw_ip.c:195
17 UMA pcpu -- (already displayed)
1 filelist lock -- last acquired @ kern/kern_descrip.c:1388
5 filedesc structure -- (already displayed)
1 allproc -- last acquired @ kern/kern_exit.c:690
2 user map -- last acquired @ vm/vm_map.c:301
3 Giant -- (already displayed)
0 kernel environment -- last acquired @ kern/kern_environment.c:285
0 dev_pager create -- last acquired @ vm/device_pager.c:150
4 standard object -- (already displayed)
0 ddp_list_mtx -- last acquired @ order list:0
1 ddp_mtx -- last acquired @ order list:0
0 slip_mtx -- last acquired @ order list:0
1 slip sc_mtx -- last acquired @ order list:0
0 proctree -- last acquired @ kern/kern_exit.c:583
1 allproc -- (already displayed)

Spin locks:
0 ap boot -- last acquired @ i386/i386/mp_machdep.c:517
1 sio -- last acquired @ dev/sio/sio.c:1709
2 cy -- last acquired @ order list:0
3 uart_hwmtx -- last acquired @ order list:0
4 sabtty -- last acquired @ order list:0
5 zstty -- last acquired @ order list:0
6 ng_node -- last acquired @ order list:0
7 ng_worklist -- last acquired @ order list:0
8 taskqueue_fast -- last acquired @ order list:0
9 intr table -- last acquired @ i386/i386/intr_machdep.c:89
10 ithread table lock -- last acquired @ order list:0
11 sleepq chain -- last acquired @ kern/subr_sleepqueue.c:223
12 sched lock -- last acquired @ kern/kern_clock.c:382
13 turnstile chain -- last acquired @ kern/subr_turnstile.c:411
14 td_contested -- last acquired @ kern/subr_turnstile.c:712
15 callout -- last acquired @ kern/kern_clock.c:227
16 entropy harvest -- last acquired @ dev/random/randomdev_soft.c:248
17 entropy harvest buffers -- last acquired @ dev/random/randomdev_soft.c:270
18 allpmaps -- last acquired @ i386/i386/pmap.c:1146
19 vm page queue free mutex -- last acquired @ vm/vm_page.c:1076
20 icu -- last acquired @ order list:0
21 smp rendezvous -- last acquired @ i386/i386/pmap.c:663
22 tlb -- last acquired @ order list:0
23 clk -- last acquired @ i386/isa/clock.c:404
24 mutex profiling lock -- last acquired @ order list:0
25 kse zombie lock -- last acquired @ kern/kern_thread.c:383
26 ALD Queue -- last acquired @ order list:0
27 pcicfg -- last acquired @ i386/pci/pci_cfgreg.c:205
28 kreqq spinlock -- last acquired @ /home/gallatin/mx/tiki/driver/freebsd/../common/mi_common.c:444
29 is->cmdq.spinlock -- last acquired @
/home/gallatin/mx/tiki/driver/freebsd/../common/mx_lanai_command.c:245

Locks which were never acquired:
<...>

No dump... it failed in witness..

Drew

From owner-freebsd-threads@FreeBSD.ORG Thu Sep 9 23:05:42 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 2612C16A50E;
    Thu, 9 Sep 2004 23:05:42 +0000 (GMT)
Received: from mail.vicor-nb.com (bigwoop.vicor-nb.com [208.206.78.2])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 25AC443D1F;
    Thu, 9 Sep 2004 23:05:39 +0000 (GMT)
    (envelope-from julian@elischer.org)
Received: from elischer.org (julian.vicor-nb.com [208.206.78.97])
    by mail.vicor-nb.com (Postfix) with ESMTP id CA2987A3D2;
    Thu, 9 Sep 2004 16:05:38 -0700 (PDT)
Message-ID: <4140E1C2.3020704@elischer.org>
Date: Thu, 09 Sep 2004 16:05:38 -0700
From: Julian Elischer
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3.1) Gecko/20030516
X-Accept-Language: en, hu
MIME-Version: 1.0
To: Andrew Gallatin
References: <16703.11479.679335.588170@grasshopper.cs.duke.edu>
    <16703.12410.319869.29996@grasshopper.cs.duke.edu>
    <413F55B8.50003@elischer.org>
    <16703.28031.454342.774229@grasshopper.cs.duke.edu>
    <413F8DBB.5040502@elischer.org>
    <16704.40876.708925.425911@grasshopper.cs.duke.edu>
    <4140AA2A.90605@elischer.org>
    <16704.45327.42494.922427@grasshopper.cs.duke.edu>
    <4140C04D.1060906@elischer.org>
    <16704.49447.290897.602540@grasshopper.cs.duke.edu>
In-Reply-To: <16704.49447.290897.602540@grasshopper.cs.duke.edu>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
cc: John Baldwin
cc: freebsd-threads@freebsd.org
Subject: Re: Unkillable KSE threaded proc
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 09 Sep 2004 23:05:42 -0000

Andrew Gallatin wrote:

>Julian Elischer writes:
> > >
> > >Maybe this would be easier to debug if I disabled preemption?
> > >
> >
> >
> > I think that this would possibly GO AWAY if you disabled preemption,
> > which would make it very hard to debug :-)
> >
>
>Yes and no. You initially asked me to try in -current because of
>some changes you'd made to the exit code. RELENG_5 (with the old
>exit code and no preemption) shows a different problem (proc is
>just not killable). If the proc was killable without preemption,
>that would at least show your new code is better..
>

yeah, well I have this on my radar; it's #4 on my to-do list :-)

>
>Drew
>
>

From owner-freebsd-threads@FreeBSD.ORG Fri Sep 10 10:31:36 2004
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id AD82E16A4CE;
    Fri, 10 Sep 2004 10:31:36 +0000 (GMT)
Received: from tts.orel.ru (tts.orel.ru [213.59.64.67])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 844E843D1F;
    Fri, 10 Sep 2004 10:31:35 +0000 (GMT)
    (envelope-from bel@orel.ru)
Received: from orel.ru (lg.orel.ru [62.33.11.59])
    by tts.orel.ru (8.12.10/8.12.10/bel) with ESMTP id i8AAVReT021917;
    Fri, 10 Sep 2004 14:31:28 +0400
Message-ID: <4141827F.5000002@orel.ru>
Date: Fri, 10 Sep 2004 14:31:27 +0400
From: Andrew Belashov
Organization: ORIS
User-Agent: Mozilla/5.0 (X11; U; FreeBSD sparc64; en-US; rv:1.6) Gecko/20040407
X-Accept-Language: ru, en-us, en
MIME-Version: 1.0
To: freebsd-gnats-submit@FreeBSD.org, freebsd-threads@FreeBSD.org
X-Enigmail-Version: 0.83.5.0
X-Enigmail-Supports: pgp-inline, pgp-mime
Content-Type: multipart/mixed; boundary="------------000002020101060604040907"
X-Zombi-Check: on netra2.orel.ru
Subject: [PATCH] Re: bin/32295: pthread dont dequeue signals
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date:
    Fri, 10 Sep 2004 10:31:36 -0000

This is a multi-part message in MIME format.
--------------000002020101060604040907
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Hello, All!

I have been using my patch for the open PR bin/32295 for about one year.
No problems have been detected on a high-load MySQL server.
The patch has also been tested with Mozilla and Firefox.

The patch is attached. Alternate location:

Problem description:

In the function _thread_sig_handler(), a received signal is queued (by
writing into the pipe).
===============================================================================
	if (_queue_signals != 0) {
		__sys_write(_thread_kern_pipe[1], &c, 1);
		DBG_MSG("Got signal %d, queueing to kernel pipe\n", sig);
	}
===============================================================================

But the flag _sigq_check_reqd (a check of the signal queue is required)
is not set if the signal should be ignored:
===============================================================================
	if (_thread_sigq[sig - 1].blocked == 0) {
		[.........]
		/* Indicate that there are queued signals: */
		_thread_sigq[sig - 1].pending = 1;
		_sigq_check_reqd = 1;
	}
	/* These signals need special handling: */
	else if (sig == SIGCHLD || sig == SIGTSTP ||
	    sig == SIGTTIN || sig == SIGTTOU) {
		_thread_sigq[sig - 1].pending = 1;
		_thread_sigq[sig - 1].signo = sig;
		_sigq_check_reqd = 1;
	}
	else {
		DBG_MSG("Got signal %d, ignored.\n", sig);
	}
===============================================================================

In this situation, the dequeue signal handler is not executed and the
signal is not dequeued from the pipe...

This is a bug, and the patch should be committed.

--
With best regards,
Andrew Belashov.
--------------000002020101060604040907
Content-Type: text/plain;
    name="uthread_sig.c.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
    filename="uthread_sig.c.diff"

--- lib/libc_r/uthread/uthread_sig.c.orig    Wed Dec 3 09:54:40 2003
+++ lib/libc_r/uthread/uthread_sig.c    Fri Sep 10 10:50:32 2004
@@ -160,8 +160,10 @@
             _thread_sigq[sig - 1].signo = sig;
             _sigq_check_reqd = 1;
         }
-        else
+        else {
             DBG_MSG("Got signal %d, ignored.\n", sig);
+            _sigq_check_reqd = 1;
+        }
     }
     /*
      * The signal handlers should have been installed so that they

--------------000002020101060604040907--

From owner-freebsd-threads@FreeBSD.ORG Fri Sep 10 10:40:31 2004
Return-Path:
Delivered-To: freebsd-threads@hub.freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 3DD8D16A4CE
    for ; Fri, 10 Sep 2004 10:40:31 +0000 (GMT)
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 15FF543D1F
    for ; Fri, 10 Sep 2004 10:40:31 +0000 (GMT)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
    i8AAeUUt051037
    for ; Fri, 10 Sep 2004 10:40:30 GMT
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.12.11/8.12.11/Submit) id i8AAeUUg051036;
    Fri, 10 Sep 2004 10:40:30 GMT
    (envelope-from gnats)
Date: Fri, 10 Sep 2004 10:40:30 GMT
Message-Id: <200409101040.i8AAeUUg051036@freefall.freebsd.org>
To: freebsd-threads@FreeBSD.org
From: Andrew Belashov
Subject: [PATCH] Re: bin/32295: pthread dont dequeue signals
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: Andrew Belashov
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 10 Sep 2004 10:40:31 -0000

The following reply was made to PR bin/32295; it has been noted by GNATS.

From: Andrew Belashov
To: freebsd-gnats-submit@FreeBSD.org, freebsd-threads@FreeBSD.org
Cc:
Subject: [PATCH] Re: bin/32295: pthread dont dequeue signals
Date: Fri, 10 Sep 2004 14:31:27 +0400
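
The failure mode described in the report for PR bin/32295 can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not the libc_r code: kern_pipe, sigq_check_reqd, model_sig_handler(), model_dequeue() and leftover_after_ignored() are hypothetical names invented here, no real signals are delivered, and the dequeue step is assumed to simply drain the pipe whenever the check flag is set. The "patched" argument models the effect of the one-line change in the diff (setting the flag for ignored signals as well).

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-ins for the libc_r internals named in the report. */
static int kern_pipe[2];     /* plays the role of _thread_kern_pipe */
static int sigq_check_reqd;  /* plays the role of _sigq_check_reqd */

/*
 * Model of the handler: every signal writes one byte to the pipe, but the
 * unpatched logic requests a queue check only for signals it does not ignore.
 */
static void model_sig_handler(int ignored, int patched)
{
    char c = 0;

    if (write(kern_pipe[1], &c, 1) != 1)
        return;
    if (!ignored || patched)
        sigq_check_reqd = 1;
}

/* Model of the dequeue step (assumed): drain the pipe only when asked to. */
static void model_dequeue(void)
{
    char buf[64];

    if (!sigq_check_reqd)
        return;
    while (read(kern_pipe[0], buf, sizeof(buf)) > 0)
        continue;
    sigq_check_reqd = 0;
}

/*
 * Deliver `count` ignored signals, running the dequeue step after each one.
 * Returns the number of bytes still sitting in the pipe, or -1 if the pipe
 * could not be set up. `count` is expected to be small (well below the
 * pipe buffer size) so that write() never blocks.
 */
int leftover_after_ignored(int count, int patched)
{
    char buf[64];
    ssize_t n;
    int i, left = 0;

    if (pipe(kern_pipe) != 0)
        return -1;
    /* Non-blocking read end, so draining an empty pipe returns at once. */
    if (fcntl(kern_pipe[0], F_SETFL, O_NONBLOCK) == -1) {
        close(kern_pipe[0]);
        close(kern_pipe[1]);
        return -1;
    }

    sigq_check_reqd = 0;
    for (i = 0; i < count; i++) {
        model_sig_handler(1, patched);
        model_dequeue();
    }

    /* Count whatever the dequeue step left behind. */
    while ((n = read(kern_pipe[0], buf, sizeof(buf))) > 0)
        left += (int)n;

    close(kern_pipe[0]);
    close(kern_pipe[1]);
    return left;
}
```

In this model, leftover_after_ignored(3, 0) leaves three bytes stranded in the pipe, while leftover_after_ignored(3, 1) leaves none, which matches the behaviour the report attributes to the unpatched and patched handler respectively.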