From owner-freebsd-threads@FreeBSD.ORG Sun Nov 16 09:18:35 2003
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3725616A4CE; Sun, 16 Nov 2003 09:18:35 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2E1B643FCB; Sun, 16 Nov 2003 09:18:34 -0800 (PST) (envelope-from eischen@vigrid.com)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id hAGHIX1G006463; Sun, 16 Nov 2003 12:18:33 -0500 (EST)
Date: Sun, 16 Nov 2003 12:18:33 -0500 (EST)
From: Daniel Eischen
X-Sender: eischen@pcnet5.pcnet.com
To: Marcel Moolenaar
In-Reply-To: <20031115193039.GA55917@dhcp01.pn.xcllnt.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: threads@freebsd.org
cc: davidxu@freebsd.org
Subject: Re: KSE/ia64 broken
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: deischen@freebsd.org
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 16 Nov 2003 17:18:35 -0000

On Sat, 15 Nov 2003, Marcel Moolenaar wrote:
> On Sat, Nov 15, 2003 at 12:36:42PM -0500, Daniel Eischen wrote:
> > On Fri, 14 Nov 2003, Marcel Moolenaar wrote:
> >
> > > Gang,
> > >
> > > The following change broke KSE on ia64:
> > >
> > > --------
> > > revision 1.18
> > > date: 2003/11/08 06:07:04; author: davidxu; state: Exp; lines: +16 -17
> > > Use THR lock instead of KSE lock to avoid scheduler be blocked in spinlock.
> > >
> > > Reviewed by: deischen
> > > --------
> > >
> > > We seem to be clobbering the thread structure instead of writing
> > > to the mailbox. This happens at initialization. Can it be that
> > > the change assumes PER_KSE and doesn't work for PER_THREAD?
> > >
> > I _think_ this may be because rtld-elf (at least for ia64) calls
> > malloc with the rtld lock held. I'm not sure how to test this
> > theory.
>
> No worries, I have a way to disprove it :-)
>
> Statically linked binaries are as much broken as dynamically linked
> binaries. So, if we have a rtld problem, it's not the only one:

Are you sure there's not an ia64 kernel bug or ia64 context
restoring bug? If I enable debug messages in thread/thr_kern.c
(uncomment #define DBG_MSG), I get:

Found completed thread 6000000000014000, name initial thread
Continuing thread 6000000000014000 in critical region
Switching out thread 6000000000014000, state 0
Found completed thread 6000000000014000, name initial thread
Switching out thread 6000000000014000, state 0
Threads in waiting queue:
Found completed thread 6000000000014000, name initial thread
Switching out thread 6000000000014000, state 0
Threads in waiting queue:
...

repeatedly. The first two lines tell us that the thread blocked
while in a critical region and the kernel thinks it is now
unblocked. The critical region may be the malloc spinlock being held
and the reason it blocked perhaps due to a page fault. Is
it possible that the blocked context is incorrectly marked,
or that it is just not being resumed properly?

-- 
Dan Eischen

From owner-freebsd-threads@FreeBSD.ORG Sun Nov 16 10:22:21 2003
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 57A2316A4CE; Sun, 16 Nov 2003 10:22:21 -0800 (PST)
Received: from ns1.xcllnt.net (209-128-86-226.BAYAREA.NET [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6275743F85; Sun, 16 Nov 2003 10:22:20 -0800 (PST) (envelope-from marcel@xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.9/8.12.9) with ESMTP id hAGIMKbe051480; Sun, 16 Nov 2003 10:22:20 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) hAGIMJbH060485; Sun, 16 Nov 2003 10:22:19 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net)
Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.10/8.12.10/Submit) id hAGIMJdk060484; Sun, 16 Nov 2003 10:22:19 -0800 (PST) (envelope-from marcel)
Date: Sun, 16 Nov 2003 10:22:19 -0800
From: Marcel Moolenaar
To: deischen@freebsd.org
Message-ID: <20031116182219.GB60377@dhcp01.pn.xcllnt.net>
References: <20031115193039.GA55917@dhcp01.pn.xcllnt.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.4i
cc: threads@freebsd.org
cc: davidxu@freebsd.org
Subject: Re: KSE/ia64 broken
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 16 Nov 2003 18:22:21 -0000

On Sun, Nov 16, 2003 at 12:18:33PM -0500, Daniel Eischen wrote:
>
> Are you sure there's not an ia64 kernel bug or ia64 context
> restoring bug?

There's nothing pointing in that direction yet. I keep thinking
that the case is related to having TP per thread on ia64, while
it's per KSE on i386.

> The critical region may be the malloc spinlock being held
> and the reason it blocked perhaps due to a page fault. Is
> it possible that the blocked context is incorrectly marked,
> or that it is just not being resumed properly?

The likelihood that it's incorrectly marked is larger than
the likelihood that it's improperly resumed.

-- 
Marcel Moolenaar    USPA: A-39004    marcel@xcllnt.net

From owner-freebsd-threads@FreeBSD.ORG Sun Nov 16 11:30:21 2003
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C4D4416A4CE; Sun, 16 Nov 2003 11:30:21 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id EB98343FB1; Sun, 16 Nov 2003 11:30:20 -0800 (PST) (envelope-from eischen@vigrid.com)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id hAGJUK1G003660; Sun, 16 Nov 2003 14:30:20 -0500 (EST)
Date: Sun, 16 Nov 2003 14:30:20 -0500 (EST)
From: Daniel Eischen
X-Sender: eischen@pcnet5.pcnet.com
To: Marcel Moolenaar
In-Reply-To: <20031116182219.GB60377@dhcp01.pn.xcllnt.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: threads@freebsd.org
cc: davidxu@freebsd.org
Subject: Re: KSE/ia64 broken
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 16 Nov 2003 19:30:21 -0000

On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
> On Sun, Nov 16, 2003 at 12:18:33PM -0500, Daniel Eischen wrote:
> >
> > Are you sure there's not an ia64 kernel bug or ia64 context
> > restoring bug?
>
> There's nothing pointing in that direction yet. I keep thinking
> that the case is related to having TP per thread on ia64, while
> it's per KSE on i386.

If you noop the spinlock/spinunlock, the problem still
occurs.

> > The critical region may be the malloc spinlock being held
> > and the reason it blocked perhaps due to a page fault. Is
> > it possible that the blocked context is incorrectly marked,
> > or that it is just not being resumed properly?
>
> The likelihood that it's incorrectly marked is larger than
> the likelihood that it's improperly resumed.

What should I be looking at, [um]c_flags?

$ simple
Found completed thread 6000000000014000, uc_flags 0x0, mc_flags 0x8, name initial thread
Switching out thread 6000000000014000, state 0
Threads in waiting queue:
Found completed thread 6000000000014000, uc_flags 0x0, mc_flags 0x3, name initial thread
Switching out thread 6000000000014000, state 0
Threads in waiting queue:
Found completed thread 6000000000014000, uc_flags 0x0, mc_flags 0x3, name initial thread
Switching out thread 6000000000014000, state 0
Threads in waiting queue:
Found completed thread 6000000000014000, uc_flags 0x0, mc_flags 0x3, name initial thread
Switching out thread 6000000000014000, state 0
Threads in waiting queue:
...

-- 
Dan Eischen

From owner-freebsd-threads@FreeBSD.ORG Sun Nov 16 11:53:44 2003
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4564F16A4CE; Sun, 16 Nov 2003 11:53:44 -0800 (PST)
Received: from ns1.xcllnt.net (209-128-86-226.BAYAREA.NET [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1A11243F3F; Sun, 16 Nov 2003 11:53:43 -0800 (PST) (envelope-from marcel@xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.9/8.12.9) with ESMTP id hAGJrgbe051893; Sun, 16 Nov 2003 11:53:42 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) hAGJrgbH060809; Sun, 16 Nov 2003 11:53:42 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net)
Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.10/8.12.10/Submit) id hAGJrgen060808; Sun, 16 Nov 2003 11:53:42 -0800 (PST) (envelope-from marcel)
Date: Sun, 16 Nov 2003 11:53:42 -0800
From: Marcel Moolenaar
To: Daniel Eischen
Message-ID: <20031116195342.GA60773@dhcp01.pn.xcllnt.net>
References: <20031116182219.GB60377@dhcp01.pn.xcllnt.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.4i
cc: threads@freebsd.org
cc: davidxu@freebsd.org
Subject: Re: KSE/ia64 broken
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 16 Nov 2003 19:53:44 -0000

On Sun, Nov 16, 2003 at 02:30:20PM -0500, Daniel Eischen wrote:
> On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
>
> > On Sun, Nov 16, 2003 at 12:18:33PM -0500, Daniel Eischen wrote:
> > >
> > > Are you sure there's not an ia64 kernel bug or ia64 context
> > > restoring bug?
> >
> > There's nothing pointing in that direction yet. I keep thinking
> > that the case is related to having TP per thread on ia64, while
> > it's per KSE on i386.
>
> If you noop the spinlock/spinunlock, the problem still
> occurs.

Hmmm, good to know. It tells me that the lock is in reality
already a no-op :-)

> What should I be looking at, [um]c_flags?

mc_flags is very informative.

> $ simple
> Found completed thread 6000000000014000, uc_flags 0x0, mc_flags 0x8, name initial thread

This is a context created by the kernel. It's one created by getcontext().
Only the kernel needs to preserve the return registers (which is what
mc_flags indicates) because it needs to be able to resume system calls.

> Switching out thread 6000000000014000, state 0
> Threads in waiting queue:
> Found completed thread 6000000000014000, uc_flags 0x0, mc_flags 0x3, name initial thread

This is an asynchronous context. Probably the result of a trap, but
possibly the result of an interrupt. Does this mean that the thread
has run since it was last found (i.e. the previous context) or do we
have a case where a context is clobbered (I don't see a switch in)?

-- 
Marcel Moolenaar    USPA: A-39004    marcel@xcllnt.net

From owner-freebsd-threads@FreeBSD.ORG Sun Nov 16 12:09:16 2003
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2FDB016A4CE; Sun, 16 Nov 2003 12:09:16 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4B8E543F75; Sun, 16 Nov 2003 12:09:15 -0800 (PST) (envelope-from eischen@vigrid.com)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id hAGK9E1G011960; Sun, 16 Nov 2003 15:09:14 -0500 (EST)
Date: Sun, 16 Nov 2003 15:09:14 -0500 (EST)
From: Daniel Eischen
X-Sender: eischen@pcnet5.pcnet.com
To: Marcel Moolenaar
In-Reply-To: <20031116195342.GA60773@dhcp01.pn.xcllnt.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: threads@freebsd.org
cc: davidxu@freebsd.org
Subject: Re: KSE/ia64 broken
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 16 Nov 2003 20:09:16 -0000

On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
> On Sun, Nov 16, 2003 at 02:30:20PM -0500, Daniel Eischen wrote:
>
> > What should I be looking at, [um]c_flags?
>
> mc_flags is very informative.
>
> > $ simple
> > Found completed thread 6000000000014000, uc_flags 0x0, mc_flags 0x8, name initial thread
>
> This is a context created by the kernel. It's one created by getcontext().

I'm not sure what you mean by "created by getcontext()". You
mean get_mcontext(), the syscall getcontext(), or userland
_ia64_save_context()? It shouldn't be from the syscall getcontext()
because it is the initial thread.

> Only the kernel needs to preserve the return registers (which is what
> mc_flags indicates) because it needs to be able to resume system calls.
>
> > Switching out thread 6000000000014000, state 0
> > Threads in waiting queue:
> > Found completed thread 6000000000014000, uc_flags 0x0, mc_flags 0x3, name initial thread
>
> This is an asynchronous context. Probably the result of a trap, but
> possibly the result of an interrupt. Does this mean that the thread
> has run since it was last found (i.e. the previous context) or do we
> have a case where a context is clobbered (I don't see a switch in)?

Yes, this is the main thread and has run, blocked, and now completed.
All three statements above are from the KSE scheduler as a result of
an upcall. The first statement is the scheduler detecting that there
was a thread that was running and that it is no longer running (it's
being switched out). There are no threads in the waiting queue.
The third statement is from checking the list of unblocked threads
in the KSE mailbox.

The same thread (main thread) is being resumed over and over again
which shouldn't happen for this simple program.

-- 
Dan Eischen

From owner-freebsd-threads@FreeBSD.ORG Sun Nov 16 12:56:18 2003
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DF91616A4CE; Sun, 16 Nov 2003 12:56:18 -0800 (PST)
Received: from ns1.xcllnt.net (209-128-86-226.BAYAREA.NET [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6B98A43F93; Sun, 16 Nov 2003 12:56:17 -0800 (PST) (envelope-from marcel@xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.9/8.12.9) with ESMTP id hAGKuHbe052191; Sun, 16 Nov 2003 12:56:17 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) hAGKuGbH061043; Sun, 16 Nov 2003 12:56:16 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net)
Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.10/8.12.10/Submit) id hAGKuG83061042; Sun, 16 Nov 2003 12:56:16 -0800 (PST) (envelope-from marcel)
Date: Sun, 16 Nov 2003 12:56:16 -0800
From: Marcel Moolenaar
To: Daniel Eischen
Message-ID: <20031116205616.GB60888@dhcp01.pn.xcllnt.net>
References: <20031116195342.GA60773@dhcp01.pn.xcllnt.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.4i
cc: threads@freebsd.org
cc: davidxu@freebsd.org
Subject: Re: KSE/ia64 broken
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 16 Nov 2003 20:56:19 -0000

On Sun, Nov 16, 2003 at 03:09:14PM -0500, Daniel Eischen wrote:
> On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
> > On Sun, Nov 16, 2003 at 02:30:20PM -0500, Daniel Eischen wrote:
> >
> > > What should I be looking at, [um]c_flags?
> >
> > mc_flags is very informative.
> >
> > > $ simple
> > > Found completed thread 6000000000014000, uc_flags 0x0, mc_flags 0x8, name initial thread
> >
> > This is a context created by the kernel. It's one created by getcontext().
>
> I'm not sure what you mean by "created by getcontext()". You
> mean get_mcontext(), the syscall getcontext(), or userland
> _ia64_save_context()? It shouldn't be from the syscall getcontext()
> because it is the initial thread.

I meant getcontext(2). _ia64_save_context() always creates contexts
that have mc_flags=0. Note that all synchronous contexts created
by the kernel have valid return registers. It's not only getcontext(2)
that does that. I was sloppy. The context could be the result of an
upcall due to a blocking system call.

> > Only the kernel needs to preserve the return registers (which is what
> > mc_flags indicates) because it needs to be able to resume system calls.
> >
> > > Switching out thread 6000000000014000, state 0
> > > Threads in waiting queue:
> > > Found completed thread 6000000000014000, uc_flags 0x0, mc_flags 0x3, name initial thread
> >
> > This is an asynchronous context. Probably the result of a trap, but
> > possibly the result of an interrupt. Does this mean that the thread
> > has run since it was last found (i.e. the previous context) or do we
> > have a case where a context is clobbered (I don't see a switch in)?
>
> Yes, this is the main thread and has run, blocked, and now completed.
> All three statements above are from the KSE scheduler as a result of
> an upcall.

See my comment above.

> The same thread (main thread) is being resumed over and over again
> which shouldn't happen for this simple program.

Can it be that the thread is deadlocked? There's no forward progress.
There's only context switching...

-- 
Marcel Moolenaar    USPA: A-39004    marcel@xcllnt.net

From owner-freebsd-threads@FreeBSD.ORG Sun Nov 16 13:55:46 2003
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1816D16A4CE; Sun, 16 Nov 2003 13:55:46 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2AD5143FBD; Sun, 16 Nov 2003 13:55:45 -0800 (PST) (envelope-from eischen@vigrid.com)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id hAGLti1G004585; Sun, 16 Nov 2003 16:55:44 -0500 (EST)
Date: Sun, 16 Nov 2003 16:55:44 -0500 (EST)
From: Daniel Eischen
X-Sender: eischen@pcnet5.pcnet.com
To: Marcel Moolenaar
In-Reply-To: <20031116205616.GB60888@dhcp01.pn.xcllnt.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: threads@freebsd.org
cc: davidxu@freebsd.org
Subject: Re: KSE/ia64 broken
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 16 Nov 2003 21:55:46 -0000

On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
> On Sun, Nov 16, 2003 at 03:09:14PM -0500, Daniel Eischen wrote:
> > On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
> > > On Sun, Nov 16, 2003 at 02:30:20PM -0500, Daniel Eischen wrote:
> > >
> > > > What should I be looking at, [um]c_flags?
> > >
> > > mc_flags is very informative.
> > >
> > > > $ simple
> > > > Found completed thread 6000000000014000, uc_flags 0x0, mc_flags 0x8, name initial thread
> > >
> > > This is a context created by the kernel. It's one created by getcontext().
> >
> > I'm not sure what you mean by "created by getcontext()". You
> > mean get_mcontext(), the syscall getcontext(), or userland
> > _ia64_save_context()?
> > It shouldn't be from the syscall getcontext()
> > because it is the initial thread.
>
> I meant getcontext(2). _ia64_save_context() always creates contexts
> that have mc_flags=0. Note that all synchronous contexts created
> by the kernel have valid return registers. It's not only getcontext(2)
> that does that. I was sloppy. The context could be the result of an
> upcall due to a blocking system call.

I think it should be the result of being blocked, probably in
malloc() and sbrk(). When we were using KSE locks for spinlocks,
upcalls were disabled.

> > > Only the kernel needs to preserve the return registers (which is what
> > > mc_flags indicates) because it needs to be able to resume system calls.
> > >
> > > > Switching out thread 6000000000014000, state 0
> > > > Threads in waiting queue:
> > > > Found completed thread 6000000000014000, uc_flags 0x0, mc_flags 0x3, name initial thread
> > >
> > > This is an asynchronous context. Probably the result of a trap, but
> > > possibly the result of an interrupt. Does this mean that the thread
> > > has run since it was last found (i.e. the previous context) or do we
> > > have a case where a context is clobbered (I don't see a switch in)?
> >
> > Yes, this is the main thread and has run, blocked, and now completed.
> > All three statements above are from the KSE scheduler as a result of
> > an upcall.
>
> See my comment above.
>
> > The same thread (main thread) is being resumed over and over again
> > which shouldn't happen for this simple program.
>
> Can it be that the thread is deadlocked? There's no forward progress.
> There's only context switching...

I don't think so. I think the thread stack/frame is corrupted, either
because it is copied out or resumed incorrectly. I'll do some more
digging.

-- 
Dan Eischen

From owner-freebsd-threads@FreeBSD.ORG Sun Nov 16 14:22:03 2003
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C52D516A4CE; Sun, 16 Nov 2003 14:22:03 -0800 (PST)
Received: from ns1.xcllnt.net (209-128-86-226.BAYAREA.NET [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3794043FBD; Sun, 16 Nov 2003 14:22:02 -0800 (PST) (envelope-from marcel@xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.9/8.12.9) with ESMTP id hAGMM2be052743; Sun, 16 Nov 2003 14:22:02 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) hAGMM1bH061333; Sun, 16 Nov 2003 14:22:01 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net)
Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.10/8.12.10/Submit) id hAGMM0pg061332; Sun, 16 Nov 2003 14:22:01 -0800 (PST) (envelope-from marcel)
Date: Sun, 16 Nov 2003 14:22:00 -0800
From: Marcel Moolenaar
To: Daniel Eischen
Message-ID: <20031116222200.GA61279@dhcp01.pn.xcllnt.net>
References: <20031116205616.GB60888@dhcp01.pn.xcllnt.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.4i
cc: threads@freebsd.org
cc: davidxu@freebsd.org
Subject: Re: KSE/ia64 broken
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 16 Nov 2003 22:22:03 -0000

On Sun, Nov 16, 2003 at 04:55:44PM -0500, Daniel Eischen wrote:
> On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
>
> > > The same thread (main thread) is being resumed over and over again
> > > which shouldn't happen for this simple program.
> >
> > Can it be that the thread is deadlocked? There's no forward progress.
> > There's only context switching...
>
> I don't think so. I think the thread stack/frame is corrupted, either
> because it is copied out or resumed incorrectly. I'll do some more
> digging.

I loaded it up in the simulator. The thread is continuously being
resumed because of a page fault that results in an upcall, which
ends up in the UTS, which selects the same thread, which causes the
page fault again. The page fault is the result of a bogus address
that in the debugger results in a SIGILL. However, when we don't
run in a debugger, the SIGILL doesn't get handled. Hence the
non-forward progress.

The extensive debug information I posted earlier is therefore still
relevant. Now that I have things running in the simulator I'll see
if I can figure out where things go wrong. Chances are that we now
have an upcall where we didn't have one before and that it exposes
incomplete state (such as a thread pointer that hasn't been set).
The incomplete state causes the corruption we're seeing.

Anyway: I'll be digging too...

-- 
Marcel Moolenaar    USPA: A-39004    marcel@xcllnt.net

From owner-freebsd-threads@FreeBSD.ORG Sun Nov 16 16:54:28 2003
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3A15916A4CE; Sun, 16 Nov 2003 16:54:28 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4D00643FEA; Sun, 16 Nov 2003 16:54:27 -0800 (PST) (envelope-from eischen@vigrid.com)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id hAH0sO1G012494; Sun, 16 Nov 2003 19:54:26 -0500 (EST)
Date: Sun, 16 Nov 2003 19:54:24 -0500 (EST)
From: Daniel Eischen
X-Sender: eischen@pcnet5.pcnet.com
To: Marcel Moolenaar
In-Reply-To: <20031116222200.GA61279@dhcp01.pn.xcllnt.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: threads@freebsd.org
cc: davidxu@freebsd.org
Subject: Re: KSE/ia64 broken
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: deischen@freebsd.org
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 17 Nov 2003 00:54:28 -0000

On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
> On Sun, Nov 16, 2003 at 04:55:44PM -0500, Daniel Eischen wrote:
> > On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
> >
> > > > The same thread (main thread) is being resumed over and over again
> > > > which shouldn't happen for this simple program.
> > >
> > > Can it be that the thread is deadlocked? There's no forward progress.
> > > There's only context switching...
> >
> > I don't think so. I think the thread stack/frame is corrupted, either
> > because it is copied out or resumed incorrectly. I'll do some more
> > digging.
>
> I loaded it up in the simulator.
> The thread is continuously being
> resumed because of a page fault that results in an upcall, which
> ends up in the UTS, which selects the same thread, which causes the
> page fault again.

Is it possible the thread is marked for an upcall when the
page is not yet present?

> The page fault is the result of a bogus address
> that in the debugger results in a SIGILL. However, when we don't
> run in a debugger, the SIGILL doesn't get handled. Hence the
> non-forward progress.
>
> The extensive debug information I posted earlier is therefore still
> relevant. Now that I have things running in the simulator I'll see
> if I can figure out where things go wrong. Chances are that we now
> have an upcall where we didn't have one before and that it exposes
> incomplete state (such as a thread pointer that hasn't been set).
> The incomplete state causes the corruption we're seeing.

This is kind of what I was thinking too.

> Anyway: I'll be digging too...

I'm not getting threads@ mail any longer, just the CC. Are
you?

-- 
Dan Eischen

From owner-freebsd-threads@FreeBSD.ORG Sun Nov 16 17:29:50 2003
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B01AB16A4CE; Sun, 16 Nov 2003 17:29:50 -0800 (PST)
Received: from exchhz01.viatech.com.cn (ip-167-164-97-218.anlai.com [218.97.164.167]) by mx1.FreeBSD.org (Postfix) with ESMTP id BCF7A43FE3; Sun, 16 Nov 2003 17:29:44 -0800 (PST) (envelope-from davidxu@viatech.com.cn)
Received: from viatech.com.cn (ip-240-1-168-192.rev.dyxnet.com [192.168.1.240]) by exchhz01.viatech.com.cn with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21) id WZMFW3WP; Mon, 17 Nov 2003 09:09:40 +0800
Message-ID: <3FB825D9.6050407@viatech.com.cn>
Date: Mon, 17 Nov 2003 09:35:21 +0800
From: David Xu
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5b) Gecko/20030723 Thunderbird/0.1
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: deischen@freebsd.org
References:
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
cc: threads@freebsd.org
cc: davidxu@freebsd.org
cc: Marcel Moolenaar
Subject: Re: KSE/ia64 broken
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Threading on FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 17 Nov 2003 01:29:50 -0000

Daniel Eischen wrote:

>On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
>
>>On Sun, Nov 16, 2003 at 04:55:44PM -0500, Daniel Eischen wrote:
>>
>>>On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
>>>
>>>>>The same thread (main thread) is being resumed over and over again
>>>>>which shouldn't happen for this simple program.
>>>>>
>>>>Can it be that the thread is deadlocked? There's no forward progress.
>>>>There's only context switching...
>>>>
>>>I don't think so.
>>>I think the thread stack/frame is corrupted, either
>>>because it is copied out or resumed incorrectly. I'll do some more
>>>digging.
>>>
>>I loaded it up in the simulator. The thread is continuously being
>>resumed because of a page fault that results in an upcall, which
>>ends up in the UTS, which selects the same thread, which causes the
>>page fault again.
>>
>
>Is it possible the thread is marked for an upcall when the
>page is not yet present?
>

Currently, on IA64, a page fault never schedules an upcall. I have only
enabled it on i386, and Peter enabled it on AMD64.

>>The page fault is the result of a bogus address
>>that in the debugger results in a SIGILL. However, when we don't
>>run in a debugger, the SIGILL doesn't get handled. Hence the
>>non-forward progress.
>>
>>The extensive debug information I posted earlier is therefore still
>>relevant. Now that I have things running in the simulator I'll see
>>if I can figure out where things go wrong. Chances are that we now
>>have an upcall where we didn't have one before and that it exposes
>>incomplete state (such as a thread pointer that hasn't been set).
>>The incomplete state causes the corruption we're seeing.
>>
>
>This is kind of what I was thinking too.
>

The returned memory block from malloc() is being used by unknown code.
I don't know why it occurs, but if you waste a memory block by applying
the following patch for thr_alloc(), then things work:

Index: thr_kern.c
===================================================================
RCS file: /home/ncvs/src/lib/libpthread/thread/thr_kern.c,v
retrieving revision 1.102
diff -u -r1.102 thr_kern.c
--- thr_kern.c	9 Nov 2003 00:37:14 -0000	1.102
+++ thr_kern.c	17 Nov 2003 01:24:59 -0000
@@ -2422,6 +2422,8 @@
 	struct pthread *thread = NULL;
 	int i;
 
+	malloc(sizeof(struct pthread));
+
 	if (curthread != NULL) {
 		if (GC_NEEDED())
 			_thr_gc(curthread);

From owner-freebsd-threads@FreeBSD.ORG Sun Nov 16 17:34:39 2003
Return-Path:
Delivered-To: freebsd-threads@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B852E16A4CE; Sun, 16 Nov 2003 17:34:39 -0800 (PST)
Received: from ns1.xcllnt.net (209-128-86-226.BAYAREA.NET [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8A99C43FCB; Sun, 16 Nov 2003 17:34:37 -0800 (PST) (envelope-from marcel@xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.9/8.12.9) with ESMTP id hAH1Ybbe053803; Sun, 16 Nov 2003 17:34:37 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) hAH1YabH061840; Sun, 16 Nov 2003 17:34:36 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net)
Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.10/8.12.10/Submit) id hAH1Ya2T061839; Sun, 16 Nov 2003 17:34:36 -0800 (PST) (envelope-from marcel)
Date: Sun, 16 Nov 2003 17:34:36 -0800
From: Marcel Moolenaar
To: deischen@freebsd.org
Message-ID: <20031117013436.GA61716@dhcp01.pn.xcllnt.net>
References: <20031116222200.GA61279@dhcp01.pn.xcllnt.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.4i
cc: threads@freebsd.org
cc: davidxu@freebsd.org
Subject: Re: KSE/ia64 broken

On Sun, Nov 16, 2003 at 07:54:24PM -0500, Daniel Eischen wrote:
> On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
>
> > On Sun, Nov 16, 2003 at 04:55:44PM -0500, Daniel Eischen wrote:
> > > On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
> > >
> > > > > The same thread (main thread) is being resumed over and over again
> > > > > which shouldn't happen for this simple program.
> > > >
> > > > Can it be that the thread is deadlocked? There's no forward progress.
> > > > There's only context switching...
> > >
> > > I don't think so. I think the thread stack/frame is corrupted, either
> > > because it is copied out or resumed incorrectly. I'll do some more
> > > digging.
> >
> > I loaded it up in the simulator. The thread is continuously being
> > resumed because of a page fault that results in an upcall, which
> > ends up in the UTS, which selects the same thread, which causes the
> > page fault again.
>
> Is it possible the thread is marked for an upcall when the
> page is not yet present?

No, it's just a side-effect. The problem is that we get an upcall
when we allocate struct pthread for the signal daemon. This is
before we set the maximum concurrency. The upcall is what's
causing problems, because previously the KSE lock prevented the
upcall. I don't yet know what exactly is being messed up, but
eventually we clobber memory. The clobbering invalidates pointers
and that's what's causing the page faults. Hence, the damage has
been done before we get the SIGILL.

> I'm not getting threads@ mail any longer, just the CC. Are
> you?

Yes, I get the threads@ mail.

--
 Marcel Moolenaar   USPA: A-39004   marcel@xcllnt.net

From owner-freebsd-threads@FreeBSD.ORG Sun Nov 16 17:46:22 2003
Date: Sun, 16 Nov 2003 17:46:20 -0800
From: Marcel Moolenaar
To: David Xu
Message-ID: <20031117014620.GB61716@dhcp01.pn.xcllnt.net>
References: <3FB825D9.6050407@viatech.com.cn>
In-Reply-To: <3FB825D9.6050407@viatech.com.cn>
cc: deischen@freebsd.org
cc: threads@freebsd.org
Subject: Re: KSE/ia64 broken

On Mon, Nov 17, 2003 at 09:35:21AM +0800, David Xu wrote:
> >
> The returned memory block from malloc() is being used by unknown code, I
> don't know why it occurs, but if you waste a memory block by applying the
> following patch for thr_alloc(), then things
> work:

The memory block is clobbered by a ucontext_t. This may be the result
of the kernel doing the upcall (though indirectly I would suspect).

--
 Marcel Moolenaar   USPA: A-39004   marcel@xcllnt.net

From owner-freebsd-threads@FreeBSD.ORG Mon Nov 17 11:02:44 2003
Date: Mon, 17 Nov 2003 11:02:42 -0800 (PST)
Message-Id: <200311171902.hAHJ2g0d006623@freefall.freebsd.org>
From: FreeBSD bugmaster
To: freebsd-threads@FreeBSD.org
Subject: Current problem reports assigned to you

Current FreeBSD problem reports

Critical problems

S  Submitted   Tracker     Resp.    Description
-------------------------------------------------------------------------------
o [2000/06/13] kern/19247  threads  uthread_sigaction.c does not do anything
o [2002/01/16] kern/33951  threads  pthread_cancel is ignored

2 problems total.

Serious problems

S  Submitted   Tracker     Resp.    Description
-------------------------------------------------------------------------------
o [2000/07/18] kern/20016  threads  pthreads: Cannot set scheduling timer/Can
o [2000/08/26] misc/20861  threads  libc_r does not honor socket timeouts
o [2001/01/19] bin/24472   threads  libc_r does not honor SO_SNDTIMEO/SO_RCVT
o [2001/01/25] bin/24632   threads  libc_r delicate deviation from libc in ha
o [2001/01/25] misc/24641  threads  pthread_rwlock_rdlock can deadlock
o [2001/04/02] bin/26307   threads  libc_r aborts when using the KDE media pl
o [2001/10/31] bin/31661   threads  pthread_kill signal handler doesn't get s
o [2001/11/26] bin/32295   threads  pthread dont dequeue signals
o [2002/02/01] i386/34536  threads  accept() blocks other threads
o [2002/03/07] bin/35622   threads  sigaltstack is missing in libc_r
o [2002/05/25] kern/38549  threads  the procces compiled whith pthread stoppe
o [2002/06/27] bin/39922   threads  [PATCH?] Threaded applications executed w
o [2002/08/04] misc/41331  threads  Pthread library open sets O_NONBLOCK flag
o [2002/10/10] kern/43887  threads  abnormal CPU useage when use pthread_mute
o [2003/03/02] bin/48856   threads  Setting SIGCHLD to SIG_IGN still leaves z
o [2003/03/10] bin/49087   threads  Signals lost in programs linked with libc
a [2003/04/08] bin/50733   threads  buildworld won't build, because of linkin
o [2003/05/07] bin/51949   threads  thread in accept cannot be cancelled
o [2003/05/30] kern/52817  threads  top(1) shows garbage for threaded process

19 problems total.

Non-critical problems

S  Submitted   Tracker     Resp.    Description
-------------------------------------------------------------------------------
o [2000/05/25] misc/18824  threads  gethostbyname is not thread safe
o [2000/10/21] misc/22190  threads  A threaded read(2) from a socketpair(2) f
o [2001/09/09] bin/30464   threads  pthread mutex attributes -- pshared
o [2002/05/02] bin/37676   threads  libc_r: msgsnd(), msgrcv(), pread(), pwri
o [2002/07/16] misc/40671  threads  pthread_cancel doesn't remove thread from

5 problems total.

From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 06:33:59 2003
Date: Wed, 19 Nov 2003 08:33:57 -0600
From: "Jacques A. Vidrine"
To: Petri Helenius
Message-ID: <20031119143357.GG28412@madman.celabo.org>
Mail-Followup-To: "Jacques A.
    Vidrine", Petri Helenius, freebsd-threads@FreeBSD.org
References: <3F4F5467.1050404@he.iki.fi>
    <20031010171704.GV45920@madman.celabo.org>
    <3F871BFD.4070506@he.iki.fi>
    <20031010210354.GA47867@madman.celabo.org>
    <3F87292F.2030702@he.iki.fi>
In-Reply-To: <3F87292F.2030702@he.iki.fi>
cc: freebsd-threads@FreeBSD.org
Subject: Re: threads and openssl

On Sat, Oct 11, 2003 at 12:48:31AM +0300, Petri Helenius wrote:
> Jacques A. Vidrine wrote:
>
> >On Fri, Oct 10, 2003 at 11:52:13PM +0300, Petri Helenius wrote:
> >
> >>Jacques A. Vidrine wrote:
> >>
> >>>
> >>I ended up putting CFLAGS+=-DOPENSSL_THREADS
> >>in make.conf and recompiling world. Works for me.
> >>
> >>However it would make sense to fix it more cleanly.
> >
> >You then linked against a threading library? Blew up here.
> >
> I´m sorry but I´m unable to follow your train of thought, please explain.

Can't recall if I replied to this message already.

When I built OpenSSL with -DOPENSSL_THREADS, and then linked an
OpenSSL-using application with one of our threaded libraries, the
application crashed.

Cheers,
--
Jacques Vidrine   NTT/Verio SME      FreeBSD UNIX       Heimdal
nectar@celabo.org jvidrine@verio.net nectar@freebsd.org nectar@kth.se

From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 07:04:18 2003
Date: Wed, 19 Nov 2003 10:04:15 -0500 (EST)
From: Daniel Eischen
To: "Jacques A. Vidrine"
In-Reply-To: <20031119143357.GG28412@madman.celabo.org>
cc: freebsd-threads@freebsd.org
Subject: Re: threads and openssl

On Wed, 19 Nov 2003, Jacques A. Vidrine wrote:

> On Sat, Oct 11, 2003 at 12:48:31AM +0300, Petri Helenius wrote:
> > Jacques A. Vidrine wrote:
> >
> > >On Fri, Oct 10, 2003 at 11:52:13PM +0300, Petri Helenius wrote:
> > >
> > >>Jacques A. Vidrine wrote:
> > >>
> > >>>
> > >>I ended up putting CFLAGS+=-DOPENSSL_THREADS
> > >>in make.conf and recompiling world. Works for me.
> > >>
> > >>However it would make sense to fix it more cleanly.
> > >
> > >You then linked against a threading library? Blew up here.
> > >
> > I´m sorry but I´m unable to follow your train of thought, please explain.
>
> Can't recall if I replied to this message already.
>
> When I built OpenSSL with -DOPENSSL_THREADS, and then linked an
> OpenSSL-using application with one of our threaded libraries, the
> application crashed.

  $ ldd /usr/lib/libssl.so.3    # or wherever your libssl lies
  $ ldd ssl_application

Are they using different thread libraries? I've never tried
this, but just want to be sure you're not blowing up because
different libraries are in use.

--
Dan Eischen

From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 07:08:42 2003
Date: Wed, 19 Nov 2003 09:08:38 -0600
From: "Jacques A. Vidrine"
To: Daniel Eischen
Message-ID: <20031119150838.GA63955@madman.celabo.org>
Mail-Followup-To: "Jacques A.
    Vidrine", Daniel Eischen, Petri Helenius, freebsd-threads@freebsd.org
References: <20031119143357.GG28412@madman.celabo.org>
cc: freebsd-threads@freebsd.org
Subject: Re: threads and openssl

On Wed, Nov 19, 2003 at 10:04:15AM -0500, Daniel Eischen wrote:
>   $ ldd /usr/lib/libssl.so.3    # or wherever your libssl lies
>   $ ldd ssl_application
>
> Are they using different thread libraries? I've never tried
> this, but just want to be sure you're not blowing up because
> different libraries are in use.

At the time, they were both referencing libc_r.

Cheers,
--
Jacques Vidrine   NTT/Verio SME      FreeBSD UNIX       Heimdal
nectar@celabo.org jvidrine@verio.net nectar@freebsd.org nectar@kth.se

From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 07:13:57 2003
Message-ID: <3FBB8893.3080607@he.iki.fi>
Date: Wed, 19 Nov 2003 17:13:23 +0200
From: Petri Helenius
To: Daniel Eischen
cc: "Jacques A. Vidrine"
cc: freebsd-threads@freebsd.org
Subject: Re: threads and openssl

Daniel Eischen wrote:

>  $ ldd /usr/lib/libssl.so.3    # or wherever your libssl lies
>  $ ldd ssl_application
>
>Are they using different thread libraries? I've never tried
>this, but just want to be sure you're not blowing up because
>different libraries are in use.
>
I use the only choice on 4.X and libkse on 5.X.

Pete

From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 07:21:51 2003
Date: Wed, 19 Nov 2003 10:21:39 -0500 (EST)
From: Daniel Eischen
To: "Jacques A. Vidrine"
In-Reply-To: <20031119150838.GA63955@madman.celabo.org>
cc: freebsd-threads@FreeBSD.org
Subject: Re: threads and openssl

On Wed, 19 Nov 2003, Jacques A.
Vidrine wrote:

> On Wed, Nov 19, 2003 at 10:04:15AM -0500, Daniel Eischen wrote:
> >   $ ldd /usr/lib/libssl.so.3    # or wherever your libssl lies
> >   $ ldd ssl_application
> >
> > Are they using different thread libraries? I've never tried
> > this, but just want to be sure you're not blowing up because
> > different libraries are in use.
>
> At the time, they were both referencing libc_r.
> Cheers,

Can you guys try building with debugging enabled and get some
more info (with libkse)? I'd like to see if it's openssl or
the application, or our thread library.

--
Dan Eischen

From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 07:43:41 2003
Date: Wed, 19 Nov 2003 10:43:39 -0500 (EST)
From: Daniel Eischen
To: Petri Helenius
In-Reply-To: <3FBB8D3B.10009@he.iki.fi>
cc: "Jacques A. Vidrine"
cc: freebsd-threads@FreeBSD.org
Subject: Re: threads and openssl

On Wed, 19 Nov 2003, Petri Helenius wrote:

> Daniel Eischen wrote:
>
> >Can you guys try building with debugging enabled and get some
> >more info (with libkse)?
> >I'd like to see if it's openssl or
> >the application, or our thread library.
> >
> Me? Everything works for me after doing the change described.
>
> If you don´t compile openssl thread-safe is crashing too violent option
> or is there some other concern?

No, I'm only interested in [thread library] problems when built
thread-safe. If you don't build openssl thread safe, then all
bets are off if you try using it multi-threaded.

--
Dan Eischen

From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 07:53:01 2003
Message-ID: <3FBB91BD.8040807@he.iki.fi>
Date: Wed, 19 Nov 2003 17:52:29 +0200
From: Petri Helenius
To: Daniel Eischen
cc: "Jacques A. Vidrine"
cc: freebsd-threads@FreeBSD.org
Subject: Re: threads and openssl

Daniel Eischen wrote:

>No, I'm only interested in [thread library] problems when built
>thread-safe.
>If you don't build openssl thread safe, then all
>bets are off if you try using it multi-threaded.
>
It works for me, however I would love to see the default to be thread-safe
in future 5.X releases.

Pete

From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 12:27:20 2003
From: Sean McNeil
To: freebsd-threads@freebsd.org
Message-Id: <1069273626.93981.10.camel@blue.mcneil.com>
Date: Wed, 19 Nov 2003 12:27:06 -0800
Subject: Losing pages from a mmap in threaded app vs. non-threaded

Hello all,

I have an interesting problem. I've ported a device driver to FreeBSD
from Linux and all works just fine with the non-threaded application.
When I use threads, the first 8 pages of the map get removed from the
process vmspace.

A trick is used to mmap different memory pools of the driver. There are
3 of them and an offset is given to mmap to identify which one:

0x10000000
0x20000000
0x40000000

The buffer I see losing the first 8 pages is the one mmap'd with

0x40000000

I haven't looked into the other mmaps, as the above one is the most
important. mmap is called correctly and each page is returned
appropriately. Again, this works without threads.

Is there some mmap region that threads uses that is conflicting with my
choice? Is there any known issue with mmap and threads? This problem
happens with libc_r.so, libkse.so, and libthr.so.

The work I'm doing is split into driver/application and turnaround is
high to get the application people to recompile with different values
(in progress), so I thought asking here might answer my question sooner
than experimentation.

All comments/replies are greatly appreciated,
Sean

From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 13:24:48 2003
Date: Wed, 19 Nov 2003 16:24:45 -0500 (EST)
From: Daniel Eischen
To: Sean McNeil
In-Reply-To: <1069273626.93981.10.camel@blue.mcneil.com>
cc: freebsd-threads@freebsd.org
Subject: Re: Losing pages from a mmap in threaded app vs.
    non-threaded

On Wed, 19 Nov 2003, Sean McNeil wrote:

> Hello all,
>
> I have an interesting problem. I've ported a device driver to FreeBSD
> from Linux and all works just fine with the non-threaded application.
> When I use threads, the first 8 pages of the map get removed from the
> process vmspace.
>
> A trick is used to mmap different memory pools of the driver. There are
> 3 of them and an offset is given to mmap to identify which one:
>
> 0x10000000
> 0x20000000
> 0x40000000
>
> The buffer I see losing the first 8 pages is the one mmap'd with
>
> 0x40000000
>
> I haven't looked into the other mmaps, as the above one is the most
> important. mmap is called correctly and each page is returned
> appropriately. Again, this works without threads.
>
> Is there some mmap region that threads uses that is conflicting with my
> choice? Is there any known issue with mmap and threads? This problem
> happens with libc_r.so, libkse.so, and libthr.so.
>
> The work I'm doing is split into driver/application and turnaround is
> high to get the application people to recompile with different values
> (in progress), so I thought asking here might answer my question sooner
> than experimentation.
>
> All comments/replies are greatly appreciated,
> Sean

The thread libraries use mmap to map kern.usrstack as
thread stacks and guard pages. I don't know how this
would affect your driver.

--
Dan Eischen

From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 13:26:27 2003
From: Sean McNeil
To: Daniel Eischen
Message-Id: <1069277169.4052.0.camel@blue.mcneil.com>
Date: Wed, 19 Nov 2003 13:26:10 -0800
cc: freebsd-threads@freebsd.org
Subject: Re: Losing pages from a mmap in threaded app vs. non-threaded

OK, would this happen to be 8 pages typically?

On Wed, 2003-11-19 at 13:24, Daniel Eischen wrote:
> On Wed, 19 Nov 2003, Sean McNeil wrote:
>
> > Hello all,
> >
> > I have an interesting problem. I've ported a device driver to FreeBSD
> > from Linux and all works just fine with the non-threaded application.
> > When I use threads, the first 8 pages of the map get removed from the
> > process vmspace.
> >
> > A trick is used to mmap different memory pools of the driver. There are
> > 3 of them and an offset is given to mmap to identify which one:
> >
> > 0x10000000
> > 0x20000000
> > 0x40000000
> >
> > The buffer I see losing the first 8 pages is the one mmap'd with
> >
> > 0x40000000
> >
> > I haven't looked into the other mmaps, as the above one is the most
> > important. mmap is called correctly and each page is returned
> > appropriately. Again, this works without threads.
> >
> > Is there some mmap region that threads uses that is conflicting with my
> > choice? Is there any known issue with mmap and threads? This problem
> > happens with libc_r.so, libkse.so, and libthr.so.
> >
> > The work I'm doing is split into driver/application and turnaround is
> > high to get the application people to recompile with different values
> > (in progress), so I thought asking here might answer my question sooner
> > than experimentation.
> >
> > All comments/replies are greatly appreciated,
> > Sean
>
> The thread libraries use mmap to map kern.usrstack as
> thread stacks and guard pages. I don't know how this
> would affect your driver.

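[Editor's sketch, not part of the thread.] The offset "trick" Sean
describes can be pictured as a small decoding step: the offset passed to
mmap selects one of three pools, and whatever remains is a position
inside that pool. The C below is only a minimal illustration of that
idea. The three base values come from the message; everything else
(struct pool_ref, pool_from_offset(), the 4 KB page size and the 256 MB
per-pool limit) is a hypothetical assumption and not taken from Sean's
driver.

```c
#include <assert.h>
#include <stdint.h>

/* Pool-selector offsets quoted in the message above. */
#define POOL_A_BASE 0x10000000u
#define POOL_B_BASE 0x20000000u
#define POOL_C_BASE 0x40000000u

/* Assumptions for illustration only: 4 KB pages, pools of at most 256 MB. */
#define PAGE_BYTES     4096u
#define POOL_MAX_BYTES 0x10000000u

/* Hypothetical result type: which pool an offset selects, and which page. */
struct pool_ref {
	int      pool;        /* 0, 1 or 2 for the three pools; -1 if none */
	uint32_t page_index;  /* page number within the selected pool */
};

/* Hypothetical helper: map an mmap offset to (pool, page within pool). */
static struct pool_ref
pool_from_offset(uint32_t offset)
{
	static const uint32_t bases[] = { POOL_A_BASE, POOL_B_BASE, POOL_C_BASE };
	struct pool_ref ref = { -1, 0 };
	int i;

	for (i = 0; i < 3; i++) {
		/* Check offset >= base first so the subtraction cannot wrap. */
		if (offset >= bases[i] && offset - bases[i] < POOL_MAX_BYTES) {
			ref.pool = i;
			ref.page_index = (offset - bases[i]) / PAGE_BYTES;
			assert(ref.page_index < POOL_MAX_BYTES / PAGE_BYTES);
			return ref;
		}
	}
	return ref;	/* offset does not fall inside any pool */
}
```

Under these assumptions, offset 0x40000000 selects the third pool at
page 0, and 0x40008000 is page 8 of that pool, i.e. the first page past
the eight that Sean reports losing.
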
From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 13:27:55 2003
Date: Wed, 19 Nov 2003 16:27:51 -0500 (EST)
From: Daniel Eischen
To: Marcel Moolenaar
In-Reply-To: <20031117014620.GB61716@dhcp01.pn.xcllnt.net>
cc: David Xu
cc: threads@freebsd.org
Subject: Re: KSE/ia64 broken

On Sun, 16 Nov 2003, Marcel Moolenaar wrote:

> On Mon, Nov 17, 2003 at 09:35:21AM +0800, David Xu wrote:
> > >
> > The returned memory block from malloc() is being used by unknown code, I
> > don't know why it occurs, but if you waste a memory block by applying the
> > following patch for thr_alloc(), then things work:
>
> The memory block is clobbered by a ucontext_t. This may be the result
> of the kernel doing the upcall (though indirectly I would suspect).

Any more on this? I haven't been able to find anything on our end.

-- Dan Eischen From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 13:30:39 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 79E2516A4CE for ; Wed, 19 Nov 2003 13:30:39 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9FA9D43FE1 for ; Wed, 19 Nov 2003 13:30:38 -0800 (PST) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id hAJLUc1G028033; Wed, 19 Nov 2003 16:30:38 -0500 (EST) Date: Wed, 19 Nov 2003 16:30:37 -0500 (EST) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Sean McNeil In-Reply-To: <1069277169.4052.0.camel@blue.mcneil.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-threads@freebsd.org Subject: Re: Losing pages from a mmap in threaded app vs. non-threaded X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Nov 2003 21:30:39 -0000 On Wed, 19 Nov 2003, Sean McNeil wrote: > OK, would this happen to be 8 pages typically? It depends; see the comment and ascii art in src/lib/libpthread/thread/thr_alloc.c. 
-- Dan Eischen From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 13:34:11 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7DBD916A4CE for ; Wed, 19 Nov 2003 13:34:11 -0800 (PST) Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8AD4A43F75 for ; Wed, 19 Nov 2003 13:34:10 -0800 (PST) (envelope-from marcel@xcllnt.net) Received: from ns1.xcllnt.net (localhost [127.0.0.1]) by ns1.xcllnt.net (8.12.9/8.12.9) with ESMTP id hAJLYAbe074992; Wed, 19 Nov 2003 13:34:10 -0800 (PST) (envelope-from marcel@ns1.xcllnt.net) Received: (from marcel@localhost) by ns1.xcllnt.net (8.12.9/8.12.9/Submit) id hAJLYAfg074991; Wed, 19 Nov 2003 13:34:10 -0800 (PST) (envelope-from marcel) Date: Wed, 19 Nov 2003 13:34:10 -0800 From: Marcel Moolenaar To: Daniel Eischen Message-ID: <20031119213410.GA74882@ns1.xcllnt.net> References: <20031117014620.GB61716@dhcp01.pn.xcllnt.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.4i cc: David Xu cc: threads@freebsd.org Subject: Re: KSE/ia64 broken X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Nov 2003 21:34:11 -0000 On Wed, Nov 19, 2003 at 04:27:51PM -0500, Daniel Eischen wrote: > On Sun, 16 Nov 2003, Marcel Moolenaar wrote: > > > On Mon, Nov 17, 2003 at 09:35:21AM +0800, David Xu wrote: > > > > > > > The returned memory block from malloc() is being used by unknown code, I > > > don't know > > > why it occurs, but if you waste a memory block by applying the following > > > patch for > > > thr_alloc(), then things work: > > > > The memory block is clobbered by a ucontext_t. 
This may be the result > > of the kernel doing the upcall (though indirectly I would suspect). > > Any more on this. I haven't been able to find anything > on our end. Not yet. I got side-tracked by dynamic root breakages. That's over with for now, so I'm back on KSE... -- Marcel Moolenaar USPA: A-39004 marcel@xcllnt.net From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 13:38:39 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4BEEE16A4CE for ; Wed, 19 Nov 2003 13:38:39 -0800 (PST) Received: from mail.chesapeake.net (chesapeake.net [208.142.252.6]) by mx1.FreeBSD.org (Postfix) with ESMTP id 190FA43FBF for ; Wed, 19 Nov 2003 13:38:38 -0800 (PST) (envelope-from jroberson@chesapeake.net) Received: from localhost (jroberson@localhost) by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id hAJLcTc32081; Wed, 19 Nov 2003 16:38:29 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Wed, 19 Nov 2003 16:38:29 -0500 (EST) From: Jeff Roberson To: Daniel Eischen In-Reply-To: Message-ID: <20031119163813.E10222-100000@mail.chesapeake.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: Sean McNeil cc: freebsd-threads@freebsd.org Subject: Re: Losing pages from a mmap in threaded app vs. non-threaded X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Nov 2003 21:38:39 -0000 On Wed, 19 Nov 2003, Daniel Eischen wrote: > On Wed, 19 Nov 2003, Sean McNeil wrote: > > > OK, would this happen to be 8 pages typically? > > It depends; see the comment and ascii art in > src/lib/libpthread/thread/thr_alloc.c. Have you tried with libc_r, libthr, and libkse? 
> > -- > Dan Eischen > > _______________________________________________ > freebsd-threads@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-threads > To unsubscribe, send any mail to "freebsd-threads-unsubscribe@freebsd.org" > From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 13:42:32 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 337EE16A4CE for ; Wed, 19 Nov 2003 13:42:32 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4D97243FDF for ; Wed, 19 Nov 2003 13:42:31 -0800 (PST) (envelope-from eischen@vigrid.com) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mail.pcnet.com (8.12.10/8.12.1) with ESMTP id hAJLgR1G000797; Wed, 19 Nov 2003 16:42:27 -0500 (EST) Date: Wed, 19 Nov 2003 16:42:27 -0500 (EST) From: Daniel Eischen X-Sender: eischen@pcnet5.pcnet.com To: Jeff Roberson In-Reply-To: <20031119163813.E10222-100000@mail.chesapeake.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: Sean McNeil cc: freebsd-threads@freebsd.org Subject: Re: Losing pages from a mmap in threaded app vs. non-threaded X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Nov 2003 21:42:32 -0000 On Wed, 19 Nov 2003, Jeff Roberson wrote: > > On Wed, 19 Nov 2003, Daniel Eischen wrote: > > > On Wed, 19 Nov 2003, Sean McNeil wrote: > > > > > OK, would this happen to be 8 pages typically? > > > > It depends; see the comment and ascii art in > > src/lib/libpthread/thread/thr_alloc.c. > > Have you tried with libc_r, libthr, and libkse? I'm sorry I cut the original email so soon. In it, he said he tried all three libraries with the same result. 
-- Dan Eischen From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 13:43:18 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0C7FB16A4CE for ; Wed, 19 Nov 2003 13:43:18 -0800 (PST) Received: from ms-smtp-03-eri0.socal.rr.com (ms-smtp-03-qfe0.socal.rr.com [66.75.162.135]) by mx1.FreeBSD.org (Postfix) with ESMTP id 208BC43F3F for ; Wed, 19 Nov 2003 13:43:15 -0800 (PST) (envelope-from sean@mcneil.com) Received: from blue.mcneil.com (cpe-66-75-176-109.socal.rr.com [66.75.176.109])hAJLh86h027684; Wed, 19 Nov 2003 13:43:09 -0800 (PST) Received: from [66.75.176.109] (mcneil.com [66.75.176.109]) by blue.mcneil.com (8.12.10/8.12.10) with ESMTP id hAJLh8Ue004148; Wed, 19 Nov 2003 13:43:08 -0800 (PST) (envelope-from sean@mcneil.com) From: Sean McNeil To: Jeff Roberson In-Reply-To: <20031119163813.E10222-100000@mail.chesapeake.net> References: <20031119163813.E10222-100000@mail.chesapeake.net> Content-Type: text/plain Organization: Sean McNeil Consulting Message-Id: <1069278187.4118.0.camel@blue.mcneil.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.5 Date: Wed, 19 Nov 2003 13:43:08 -0800 Content-Transfer-Encoding: 7bit X-DCC-dmv.com-Metrics: blue.mcneil.com 1181; Body=3 Fuz1=3 Fuz2=3 X-Virus-Scanned: Symantec AntiVirus Scan Engine cc: freebsd-threads@freebsd.org Subject: Re: Losing pages from a mmap in threaded app vs. non-threaded X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Nov 2003 21:43:18 -0000 Yes, I mentioned this in my original post. They all have the same problem. On Wed, 2003-11-19 at 13:38, Jeff Roberson wrote: > On Wed, 19 Nov 2003, Daniel Eischen wrote: > > > On Wed, 19 Nov 2003, Sean McNeil wrote: > > > > > OK, would this happen to be 8 pages typically? 
> > > > It depends; see the comment and ascii art in > > src/lib/libpthread/thread/thr_alloc.c. > > Have you tried with libc_r, libthr, and libkse? > > > > > -- > > Dan Eischen > > > > _______________________________________________ > > freebsd-threads@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-threads > > To unsubscribe, send any mail to "freebsd-threads-unsubscribe@freebsd.org" > > > From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 13:54:06 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B6A6016A4CE for ; Wed, 19 Nov 2003 13:54:06 -0800 (PST) Received: from mail.chesapeake.net (chesapeake.net [208.142.252.6]) by mx1.FreeBSD.org (Postfix) with ESMTP id 61F0C43FDF for ; Wed, 19 Nov 2003 13:54:05 -0800 (PST) (envelope-from jroberson@chesapeake.net) Received: from localhost (jroberson@localhost) by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id hAJLs0b42715; Wed, 19 Nov 2003 16:54:00 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Wed, 19 Nov 2003 16:54:00 -0500 (EST) From: Jeff Roberson To: Sean McNeil In-Reply-To: <1069278187.4118.0.camel@blue.mcneil.com> Message-ID: <20031119165234.D10222-100000@mail.chesapeake.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-threads@freebsd.org Subject: Re: Losing pages from a mmap in threaded app vs. non-threaded X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Nov 2003 21:54:06 -0000 On Wed, 19 Nov 2003, Sean McNeil wrote: > Yes, I mentioned this in my original post. They all have the same > problem. If you mount procfs you can look through the vm map for the process. You want /proc//map I believe. 
Please note that the address returned by your driver routine is a physical address that will be mapped by the kernel at a new virtual address. User-space can pass you only the offset into your memory range, and not a real address. Cheers, Jeff > > On Wed, 2003-11-19 at 13:38, Jeff Roberson wrote: > > On Wed, 19 Nov 2003, Daniel Eischen wrote: > > > > > On Wed, 19 Nov 2003, Sean McNeil wrote: > > > > > > > OK, would this happen to be 8 pages typically? > > > > > > It depends; see the comment and ascii art in > > > src/lib/libpthread/thread/thr_alloc.c. > > > > Have you tried with libc_r, libthr, and libkse? > > > > > > > > -- > > > Dan Eischen > > > > > > _______________________________________________ > > > freebsd-threads@freebsd.org mailing list > > > http://lists.freebsd.org/mailman/listinfo/freebsd-threads > > > To unsubscribe, send any mail to "freebsd-threads-unsubscribe@freebsd.org" > > > > > > From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 15:39:03 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 17E7016A4CE for ; Wed, 19 Nov 2003 15:39:03 -0800 (PST) Received: from ms-smtp-01-eri0.socal.rr.com (ms-smtp-01-qfe0.socal.rr.com [66.75.162.133]) by mx1.FreeBSD.org (Postfix) with ESMTP id F323843FB1 for ; Wed, 19 Nov 2003 15:39:01 -0800 (PST) (envelope-from sean@mcneil.com) Received: from blue.mcneil.com (cpe-66-75-176-109.socal.rr.com [66.75.176.109])hAJNcrcX014438; Wed, 19 Nov 2003 15:38:59 -0800 (PST) Received: from [66.75.176.109] (mcneil.com [66.75.176.109]) by blue.mcneil.com (8.12.10/8.12.10) with ESMTP id hAJNclUe004793; Wed, 19 Nov 2003 15:38:47 -0800 (PST) (envelope-from sean@mcneil.com) From: Sean McNeil To: Jeff Roberson In-Reply-To: <20031119165234.D10222-100000@mail.chesapeake.net> References: <20031119165234.D10222-100000@mail.chesapeake.net> Content-Type: text/plain Organization: Sean McNeil Consulting Message-Id: 
<1069285127.4781.2.camel@blue.mcneil.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.5 Date: Wed, 19 Nov 2003 15:38:47 -0800 Content-Transfer-Encoding: 7bit X-DCC-dmv.com-Metrics: blue.mcneil.com 1181; Body=3 Fuz1=3 Fuz2=3 cc: freebsd-threads@freebsd.org Subject: Re: Losing pages from a mmap in threaded app vs. non-threaded X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Nov 2003 23:39:03 -0000

Actually, what I'm doing is looking through the process map. What I'm seeing is that the first 8 pages are always missing. Here is the routine that I'm using:

static vm_map_entry_t
find_entry (vm_map_t map, vm_paddr_t addr)
{
    vm_map_entry_t entry = &map->header;

    while (1) {
        vm_page_t page;
        vm_paddr_t paddr;
        vm_object_t object;

        if (entry->eflags & MAP_ENTRY_IS_SUB_MAP) {
            vm_map_entry_t sub_entry =
                find_entry (entry->object.sub_map, addr);
            if (sub_entry)
                return sub_entry;
            goto next_entry;
        }

        object = entry->object.vm_object;
        if (object == NULL || object->type != OBJT_DEVICE)
            goto next_entry;

        VM_OBJECT_LOCK (object);
        TAILQ_FOREACH (page, &object->un_pager.devp.devp_pglist, pageq) {
            if (VM_PAGE_TO_PHYS (page) == addr) {
                VM_OBJECT_UNLOCK (object);
                return entry;
            }
        }
        VM_OBJECT_UNLOCK (object);

next_entry:
        entry = entry->next;
        if (entry == &map->header)
            break;
    }
    return NULL;
}

I've now tested with memory mmap'd at offsets of both 0x40000000 and 0x04000000, with the same result for both: the entry returned by the above routine is NULL.

Sean

On Wed, 2003-11-19 at 13:54, Jeff Roberson wrote:
> On Wed, 19 Nov 2003, Sean McNeil wrote:
>
> > Yes, I mentioned this in my original post. They all have the same
> > problem.
>
> If you mount procfs you can look through the vm map for the process. You
> want /proc//map I believe.
Please note that the address returned by > your driver routine is a physical address that will be mapped by the > kernel at a new virtual address. User-space can pass you only the offset > into your memory range, and not a real address. > > Cheers, > Jeff > > > > > On Wed, 2003-11-19 at 13:38, Jeff Roberson wrote: > > > On Wed, 19 Nov 2003, Daniel Eischen wrote: > > > > > > > On Wed, 19 Nov 2003, Sean McNeil wrote: > > > > > > > > > OK, would this happen to be 8 pages typically? > > > > > > > > It depends; see the comment and ascii art in > > > > src/lib/libpthread/thread/thr_alloc.c. > > > > > > Have you tried with libc_r, libthr, and libkse? > > > > > > > > > > > -- > > > > Dan Eischen > > > > > > > > _______________________________________________ > > > > freebsd-threads@freebsd.org mailing list > > > > http://lists.freebsd.org/mailman/listinfo/freebsd-threads > > > > To unsubscribe, send any mail to "freebsd-threads-unsubscribe@freebsd.org" > > > > > > > > > > From owner-freebsd-threads@FreeBSD.ORG Wed Nov 19 16:02:41 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 10B2816A4CE for ; Wed, 19 Nov 2003 16:02:41 -0800 (PST) Received: from ms-smtp-01-eri0.socal.rr.com (ms-smtp-01-qfe0.socal.rr.com [66.75.162.133]) by mx1.FreeBSD.org (Postfix) with ESMTP id E59A343FE3 for ; Wed, 19 Nov 2003 16:02:39 -0800 (PST) (envelope-from sean@mcneil.com) Received: from blue.mcneil.com (cpe-66-75-176-109.socal.rr.com [66.75.176.109])hAK02ZcX000581; Wed, 19 Nov 2003 16:02:36 -0800 (PST) Received: from [66.75.176.109] (mcneil.com [66.75.176.109]) by blue.mcneil.com (8.12.10/8.12.10) with ESMTP id hAK02OUe004970; Wed, 19 Nov 2003 16:02:24 -0800 (PST) (envelope-from sean@mcneil.com) From: Sean McNeil To: Jeff Roberson In-Reply-To: <20031119165234.D10222-100000@mail.chesapeake.net> References: <20031119165234.D10222-100000@mail.chesapeake.net> Content-Type: 
text/plain Organization: Sean McNeil Consulting Message-Id: <1069286544.4964.1.camel@blue.mcneil.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.5 Date: Wed, 19 Nov 2003 16:02:24 -0800 Content-Transfer-Encoding: 7bit X-DCC-dmv.com-Metrics: blue.mcneil.com 1181; Body=3 Fuz1=3 Fuz2=3 X-Virus-Scanned: Symantec AntiVirus Scan Engine cc: freebsd-threads@freebsd.org Subject: Re: Losing pages from a mmap in threaded app vs. non-threaded X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Nov 2003 00:02:41 -0000 I think I know what the significance of the 8 pages is: I essentially write 8 pages at a time. So I think that those pages are getting moved from the proc vmspace map somehow after a thread writes to them. Where would the pages go? On Wed, 2003-11-19 at 13:54, Jeff Roberson wrote: > On Wed, 19 Nov 2003, Sean McNeil wrote: > > > Yes, I mentioned this in my original post. They all have the same > > problem. > > If you mount procfs you can look through the vm map for the process. You > want /proc//map I believe. Please note that the address returned by > your driver routine is a physical address that will be mapped by the > kernel at a new virtual address. User-space can pass you only the offset > into your memory range, and not a real address. > > Cheers, > Jeff > > > > > On Wed, 2003-11-19 at 13:38, Jeff Roberson wrote: > > > On Wed, 19 Nov 2003, Daniel Eischen wrote: > > > > > > > On Wed, 19 Nov 2003, Sean McNeil wrote: > > > > > > > > > OK, would this happen to be 8 pages typically? > > > > > > > > It depends; see the comment and ascii art in > > > > src/lib/libpthread/thread/thr_alloc.c. > > > > > > Have you tried with libc_r, libthr, and libkse? 
> > > > > > > > > > > -- > > > > Dan Eischen > > > > > > > > _______________________________________________ > > > > freebsd-threads@freebsd.org mailing list > > > > http://lists.freebsd.org/mailman/listinfo/freebsd-threads > > > > To unsubscribe, send any mail to "freebsd-threads-unsubscribe@freebsd.org" > > > > > > > > > > From owner-freebsd-threads@FreeBSD.ORG Thu Nov 20 02:35:47 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 02D1616A4CF for ; Thu, 20 Nov 2003 02:35:47 -0800 (PST) Received: from ns1.xcllnt.net (209-128-86-226.BAYAREA.NET [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 32A4C43FBF for ; Thu, 20 Nov 2003 02:34:43 -0800 (PST) (envelope-from marcel@xcllnt.net) Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.9/8.12.9) with ESMTP id hAKAXebe079222; Thu, 20 Nov 2003 02:33:40 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net) Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) hAKAXdmK010482; Thu, 20 Nov 2003 02:33:39 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net) Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.10/8.12.10/Submit) id hAKAWO0k010463; Thu, 20 Nov 2003 02:32:24 -0800 (PST) (envelope-from marcel) Date: Thu, 20 Nov 2003 02:32:24 -0800 From: Marcel Moolenaar To: Daniel Eischen Message-ID: <20031120103224.GB10417@dhcp01.pn.xcllnt.net> References: <20031117014620.GB61716@dhcp01.pn.xcllnt.net> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="fUYQa+Pmc3FrFX/N" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.4i cc: David Xu cc: threads@freebsd.org Subject: Re: KSE/ia64 broken X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Nov 2003 10:35:47 
-0000

--fUYQa+Pmc3FrFX/N
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Nov 19, 2003 at 04:27:51PM -0500, Daniel Eischen wrote:
> On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
>
> > The memory block is clobbered by a ucontext_t. This may be the result
> > of the kernel doing the upcall (though indirectly I would suspect).
>
> Any more on this. I haven't been able to find anything
> on our end.

Another piece of the puzzle. If you apply the attached patch, KSE works with rev 1.18 of thr_spinlock.c. I have to analyze how this makes a difference, but my first guess is that the syscall results in an upcall and a context switch (be it to the same thread). This sanitizes internal state and prevents corruption.

More tomorrow...

-- 
Marcel Moolenaar USPA: A-39004 marcel@xcllnt.net

--fUYQa+Pmc3FrFX/N
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="kse.diff"

Index: pthread_md.h
===================================================================
RCS file: /home/ncvs/src/lib/libpthread/arch/ia64/include/pthread_md.h,v
retrieving revision 1.11
diff -u -r1.11 pthread_md.h
--- pthread_md.h 19 Sep 2003 23:28:13 -0000 1.11
+++ pthread_md.h 20 Nov 2003 10:07:37 -0000
@@ -35,7 +35,10 @@
 #define KSE_STACKSIZE 16384
-#define THR_GETCONTEXT(ucp) _ia64_save_context(&(ucp)->uc_mcontext)
+#define THR_GETCONTEXT(ucp) do { \
+        __sys_write(1, "X\n", 2); \
+        _ia64_save_context(&(ucp)->uc_mcontext); \
+    } while (0)
 #define THR_SETCONTEXT(ucp) PANIC("THR_SETCONTEXT() now in use!\n")
 #define PER_THREAD
--fUYQa+Pmc3FrFX/N--

From owner-freebsd-threads@FreeBSD.ORG Fri Nov 21 02:17:19 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1128F16A4CE for ; Fri, 21 Nov 2003 02:17:19 -0800 (PST) Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 570AB43FD7
for ; Fri, 21 Nov 2003 02:16:15 -0800 (PST) (envelope-from marcel@xcllnt.net) Received: from athlon.pn.xcllnt.net (athlon.pn.xcllnt.net [192.168.4.3]) by ns1.xcllnt.net (8.12.9/8.12.9) with ESMTP id hALAFCEG004800; Fri, 21 Nov 2003 02:15:12 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net) Received: from athlon.pn.xcllnt.net (localhost [127.0.0.1]) hALAFC20092718; Fri, 21 Nov 2003 02:15:12 -0800 (PST) (envelope-from marcel@athlon.pn.xcllnt.net) Received: (from marcel@localhost) by athlon.pn.xcllnt.net (8.12.10/8.12.10/Submit) id hALADuGp092706; Fri, 21 Nov 2003 02:13:56 -0800 (PST) (envelope-from marcel) Date: Fri, 21 Nov 2003 02:13:56 -0800 From: Marcel Moolenaar To: Daniel Eischen Message-ID: <20031121101356.GA92329@athlon.pn.xcllnt.net> References: <20031117014620.GB61716@dhcp01.pn.xcllnt.net> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="nFreZHaLTZJo0R7j" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.5.1i cc: David Xu cc: threads@freebsd.org Subject: Re: KSE/ia64 broken X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Nov 2003 10:17:19 -0000 --nFreZHaLTZJo0R7j Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Wed, Nov 19, 2003 at 04:27:51PM -0500, Daniel Eischen wrote: > > > > > > > The returned memory block from malloc() is being used by unknown code, I > > > don't know > > > why it occurs, but if you waste a memory block by applying the following > > > patch for > > > thr_alloc(), then things work: > > > > The memory block is clobbered by a ucontext_t. This may be the result > > of the kernel doing the upcall (though indirectly I would suspect). > > Any more on this. I haven't been able to find anything > on our end. Ok. More pieces of the puzzle. 
If I apply the attached patch (against clean sources), I get the following:

itanium% ./foo.bad
XXX:_thr_alloc: thread=200000000008a000, tcb=2000000000085000
XXX:_thr_alloc: thread=2000000000090000, tcb=2000000000090000

The second _thr_alloc() is screwed up, in that malloc() returns the same pointer twice. Hence thread->tcb points to thread itself and we're clobbering our thread structure. Since thr_spinlock.c affects the locking of malloc(), we may have a race condition. Note that forcing an upcall (by adding a _thread_printf() in the code stream) seems to fix it. Does the UTS call malloc when first invoked?

-- 
Marcel Moolenaar USPA: A-39004 marcel@xcllnt.net

--nFreZHaLTZJo0R7j
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="kse.diff"

Index: thr_kern.c
===================================================================
RCS file: /home/ncvs/src/lib/libpthread/thread/thr_kern.c,v
retrieving revision 1.102
diff -u -r1.102 thr_kern.c
--- thr_kern.c 9 Nov 2003 00:37:14 -0000 1.102
+++ thr_kern.c 21 Nov 2003 09:31:22 -0000
@@ -2443,6 +2443,8 @@
             free(thread);
             thread = NULL;
         } else {
+            _thread_printf(1, "XXX:%s: thread=%p, tcb=%p\n",
+                __func__, thread, thread->tcb);
             /*
              * Initialize thread locking.
* Lock initializing needs malloc, so don't --nFreZHaLTZJo0R7j-- From owner-freebsd-threads@FreeBSD.ORG Fri Nov 21 04:22:23 2003 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0997E16A4CE; Fri, 21 Nov 2003 04:22:23 -0800 (PST) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5D00543FE0; Fri, 21 Nov 2003 04:22:22 -0800 (PST) (envelope-from davidxu@freebsd.org) Received: from freebsd.org (davidxu@localhost [127.0.0.1]) by freefall.freebsd.org (8.12.9/8.12.9) with ESMTP id hALCMGFY018182; Fri, 21 Nov 2003 04:22:18 -0800 (PST) (envelope-from davidxu@freebsd.org) Message-ID: <3FBE061B.3070206@freebsd.org> Date: Fri, 21 Nov 2003 20:33:31 +0800 From: David Xu User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.5) Gecko/20031025 Thunderbird/0.3 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Marcel Moolenaar References: <20031117014620.GB61716@dhcp01.pn.xcllnt.net> <20031121101356.GA92329@athlon.pn.xcllnt.net> In-Reply-To: <20031121101356.GA92329@athlon.pn.xcllnt.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit cc: David Xu cc: threads@freebsd.org Subject: Re: KSE/ia64 broken X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Nov 2003 12:22:23 -0000 Marcel Moolenaar wrote: >On Wed, Nov 19, 2003 at 04:27:51PM -0500, Daniel Eischen wrote: > > >>>>The returned memory block from malloc() is being used by unknown code, I >>>>don't know >>>>why it occurs, but if you waste a memory block by applying the following >>>>patch for >>>>thr_alloc(), then things work: >>>> >>>> >>>The memory block is clobbered by a ucontext_t. 
This may be the result
>>>of the kernel doing the upcall (though indirectly I would suspect).
>>>
>>>
>>Any more on this. I haven't been able to find anything
>>on our end.
>>
>>
>
>Ok. More pieces of the puzzle. If I apply the attached patch (against
>clean sources), I get the following:
>
>itanium% ./foo.bad
>XXX:_thr_alloc: thread=200000000008a000, tcb=2000000000085000
>XXX:_thr_alloc: thread=2000000000090000, tcb=2000000000090000
>
>The second _thr_alloc() is screwed up, in that malloc() returns
>the same pointer twice. Hence thread->tcb points to thread itself
>and we're clobbering our thread structure.
>
I saw the same result.

>Since thr_spinlock.c
>affects the locking of malloc(), we may have a race condition.
>Note that forcing an upcall (by adding a _thread_printf() in the
>code stream) seems to fix it. Does the UTS call malloc when first
>invoked?
>
>
>
No, we never call malloc in that case. I suspect we do not fully restore the thread's context. In the kernel, I pass zero as the third parameter to get_mcontext(); is that enough for ia64?
From owner-freebsd-threads@FreeBSD.ORG Fri Nov 21 06:12:19 2003
Date: Fri, 21 Nov 2003 09:12:15 -0500 (EST)
From: Daniel Eischen
To: David Xu
In-Reply-To: <3FBE061B.3070206@freebsd.org>
cc: David Xu
cc: threads@freebsd.org
cc: Marcel Moolenaar
Subject: Re: KSE/ia64 broken
Reply-To: deischen@freebsd.org

On Fri, 21 Nov 2003, David Xu wrote:

> Marcel Moolenaar wrote:
>
> >Ok. More pieces of the puzzle. If I apply the attached patch (against
> >clean sources), I get the following:
> >
> >itanium% ./foo.bad
> >XXX:_thr_alloc: thread=200000000008a000, tcb=2000000000085000
> >XXX:_thr_alloc: thread=2000000000090000, tcb=2000000000090000
> >
> >The second _thr_alloc() is screwed up, in that malloc() returns
> >the same pointer twice. Hence thread->tcb points to thread itself
> >and we're clobbering our thread structure.
> >
> I saw the same result.
>
> >Since thr_spinlock.c
> >affects the locking of malloc(), we may have a race condition.
> >Note that forcing an upcall (by adding a _thread_printf() in the
> >code stream) seems to fix it. Does the UTS call malloc when first
> >invoked?
> >
> No, we never call malloc in such case. I suspect we do not
> fully restore thread's context. In kernel, I pass zero as third
> parameter to get_mcontext(), is it enough for ia64 ?

Well, we do call malloc at library initialization. We malloc the
initial KSE & thread, locks, and a few other things. But this
is before __isthreaded gets set (so the spinlock shouldn't be
used). All this is done before the first thread is created.

-- 
Dan Eischen

From owner-freebsd-threads@FreeBSD.ORG Fri Nov 21 08:26:17 2003
Date: Fri, 21 Nov 2003 08:26:09 -0800
From: Marcel Moolenaar
To: David Xu
Message-ID: <20031121162609.GA3258@dhcp01.pn.xcllnt.net>
References: <20031117014620.GB61716@dhcp01.pn.xcllnt.net> <20031121101356.GA92329@athlon.pn.xcllnt.net> <3FBE061B.3070206@freebsd.org>
In-Reply-To: <3FBE061B.3070206@freebsd.org>
cc: David Xu
cc: threads@freebsd.org
Subject: Re: KSE/ia64 broken

On Fri, Nov 21, 2003 at 08:33:31PM +0800, David Xu wrote:
> >
> >Ok. More pieces of the puzzle. If I apply the attached patch (against
> >clean sources), I get the following:
> >
> >itanium% ./foo.bad
> >XXX:_thr_alloc: thread=200000000008a000, tcb=2000000000085000
> >XXX:_thr_alloc: thread=2000000000090000, tcb=2000000000090000
> >
> >The second _thr_alloc() is screwed up, in that malloc() returns
> >the same pointer twice. Hence thread->tcb points to thread itself
> >and we're clobbering our thread structure.
> >
> I saw the same result.
>
> >Since thr_spinlock.c
> >affects the locking of malloc(), we may have a race condition.
> >Note that forcing an upcall (by adding a _thread_printf() in the
> >code stream) seems to fix it. Does the UTS call malloc when first
> >invoked?
> >
> No, we never call malloc in such case. I suspect we do not
> fully restore thread's context. In kernel, I pass zero as third
> parameter to get_mcontext(), is it enough for ia64 ?

Yes. The context is asynchronous. We save and restore all scratch
registers, including the high FP registers. Note that an incorrect
context restoration would very likely not have such a clean failure
mode.

The thing that bugs me is that if you add a _thread_printf() just
prior to the call to _thr_alloc(), you trigger an upcall. That seems
to make all the difference. It's as if we have to keep the UTS from
getting its first upcall with a spinlock held.

What also bugs me is that the second malloc happily returns the
same address as the malloc immediately prior to it. There's no
indication of corruption. It's as if the first malloc never happened
or the memory got freed in between.
If you look at it from a more context-oriented point of view, it's
like the second malloc is returning the result of the first malloc,
as if the context of the first (assuming it got saved) is restored
by the second. This could mean that, if the context switching itself
is working normally, we missed saving a context and we're restoring
a stale one. Anyway: upcalls play a key role.

BTW: Maybe an interesting experiment is to disable upcalls on page
faults on i386 and see if that makes a difference. We do not have
upcalls for page faults on ia64. There may be an upcall on i386 that
we do not get on ia64...

-- 
 Marcel Moolenaar	  USPA: A-39004		marcel@xcllnt.net
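
The stale-context hypothesis above can be illustrated with a toy
model. This is only a sketch under loud assumptions: the bump
allocator, `struct toy_thread`, and every `toy_*` / `demo_*` name
below are hypothetical and have nothing to do with the real malloc()
or libpthread code. It merely shows that if allocator bookkeeping
captured before the first allocation is put back before the second,
the second allocation hands out the same address again, with no sign
of corruption, and thread->tcb ends up pointing at the thread itself.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ARENA_SIZE 4096

/* Hypothetical arena; the union keeps the bytes pointer-aligned. */
static union {
    void *align;
    unsigned char bytes[TOY_ARENA_SIZE];
} toy_arena;

/* Hypothetical allocator bookkeeping that a saved context could capture. */
struct toy_alloc_state {
    size_t next_free;               /* offset of the next unused byte */
};

static struct toy_alloc_state toy_live;

/* Hypothetical stand-in for a thread structure with a tcb pointer. */
struct toy_thread {
    void *tcb;
    char pad[56];
};

/* Bump allocation: return the next unused chunk and advance the offset. */
void *toy_alloc(size_t size)
{
    assert(size <= TOY_ARENA_SIZE - toy_live.next_free);
    void *p = &toy_arena.bytes[toy_live.next_free];
    toy_live.next_free += size;
    return p;
}

/*
 * Allocate a thread and then its tcb. With restore_stale != 0, the
 * bookkeeping snapshot taken before the first allocation is restored
 * before the second, mimicking a stale context being resumed.
 * Returns 1 when thread->tcb aliases the thread itself, else 0.
 */
int demo_alloc_pair(int restore_stale)
{
    toy_live.next_free = 0;                     /* start from a clean arena */
    struct toy_alloc_state saved = toy_live;    /* "context" saved too early */

    struct toy_thread *thread = toy_alloc(sizeof *thread);
    if (restore_stale)
        toy_live = saved;                       /* first allocation is forgotten */

    thread->tcb = toy_alloc(sizeof(struct toy_thread));
    return thread->tcb == (void *)thread;
}
```

Calling demo_alloc_pair(0) models the healthy case (two distinct
addresses, as in the first XXX:_thr_alloc line), while
demo_alloc_pair(1) models the failure (thread == tcb, as in the
second line). It says nothing about where a stale restore would
actually come from in the KSE code.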