Date:      Fri, 18 Dec 2009 10:09:51 -0500
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-stable@freebsd.org
Cc:        freebsd-hackers@freebsd.org, Steven Hartland <killing@multiplay.co.uk>
Subject:   Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?
Message-ID:  <200912181009.51798.jhb@freebsd.org>
In-Reply-To: <28F90357192743E085ABEE7CD4C9FDF9@multiplay.co.uk>
References:  <DD0B1DB4EEAE4FB49FFFE1FDF5E9D7E3@multiplay.co.uk> <200912170908.49119.jhb@freebsd.org> <28F90357192743E085ABEE7CD4C9FDF9@multiplay.co.uk>

On Thursday 17 December 2009 12:27:17 pm Steven Hartland wrote:
> ----- Original Message ----- 
> From: "John Baldwin" <jhb@freebsd.org>
> > For the hang it seems you have a thread waiting in a blocking read(), a thread 
> > waiting in a blocking accept(), and lots of threads creating condition 
> > variables.  However, the pthread_cond_init() in libpthread (libthr on FreeBSD) 
> > doesn't call pthread_cleanup_push(), so your stack trace doesn't make sense to 
> > me.  However, that may be gdb getting confused.  The pthread_cleanup_push() 
> > frame may be cond_init().  However, it doesn't call umtx_op() (the 
> > _thr_umutex_init() call it makes just initializes the structure, it doesn't 
> > make a _umtx_op() system call).  You might try posting on threads@ to try to 
> > get more info on this, but your pthread_cond_init() stack traces don't really 
> > make sense.  Can you rebuild libc and libthr with debug symbols?
> > 
> > For example:
> > 
> > # cd /usr/src/lib/libc
> > # make clean 
> > # make DEBUG_FLAGS=-g
> > # make DEBUG_FLAGS=-g install
> > 
> > However, if you are hanging in read(), that usually means you have a socket 
> > that just doesn't have data.  That might be an application bug of some sort.
> > 
> > The segv trace doesn't include the first part of GDB messages which show which 
> > thread actually had a seg fault.  It looks like it was the thread that was 
> > throwing an exception.  However, nanosleep() doesn't throw exceptions, so that 
> > stack trace doesn't really make sense either.  Perhaps that stack is hosed by 
> > the exception handling code?
> 
> I've uploaded two more traces for the oxt test failure / segv.
> http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1
> 
> From looking at the test case, it is testing the capture of failures and its
> ability to create a stack trace output, so that may give others some
> indication of where the issue may be.
> 
> I will look to do the same for the hang issue, but that's on a live site so
> will need to schedule some downtime before I can get those rebuilt and then
> wait for it to hang again, which could be quite some time :(

Hmmm, the only seg fault I see is happening down inside libgcc in the stack
unwinding code, and that is 3rd party code from gcc.

-- 
John Baldwin


