Date:      Mon, 21 Oct 2002 16:09:29 -0400 (EDT)
From:      Daniel Eischen <eischen@pcnet1.pcnet.com>
To:        Peter Pentchev <roam@ringlet.net>
Cc:        Linus Kendall <linus@angliaab.se>, freebsd-hackers@FreeBSD.ORG
Subject:   Re: PThreads problem
Message-ID:  <Pine.GSO.4.10.10210211555100.29011-100000@pcnet1.pcnet.com>
In-Reply-To: <20021021194453.GB377@straylight.oblivion.bg>

On Mon, 21 Oct 2002, Peter Pentchev wrote:
> Okay, I can see what the problem is; however, I have absolutely no idea
> how it is to be solved :(
> 
> The DNS resolution routines of libcurl use alarm() as a timeout
> mechanism for the system DNS resolving functions.  To enforce the
> timeout even when the resolver functions are automatically restarted
> after the SIGALRM signal, libcurl attempts to set a jump buffer in the
> thread doing the DNS lookup, and to siglongjmp() to it from the SIGALRM
> handler.
> 
> This works just fine on Linux, where each thread executes as a separate
> process; the signal is correctly delivered to the thread which invoked
> alarm(), and, consequently, exactly the one that set the jump buffer in
> the first place.

This demonstrates one of the evils of LinuxThreads.  Now there's
code that is seemingly dependent on non-portable thread behaviour.
Well, perhaps it's not an evil of LinuxThreads, but of an evil
library/application ;-)

> On FreeBSD, however, the signal is delivered merely to the currently
> executing thread; if the resolver routines are currently in the process
> of sending or receiving data on a network socket, the currently
> executing thread may very well not be the one that has requested the
> resolving, and so siglongjmp() may be called from a thread which is NOT
> the one the jump buffer has been set in.  As the abort error message
> states, this is behavior not covered by any standards, and, I dare say,
> not very easy to implement at all, so it is currently unimplemented in
> FreeBSD.  For a standards reference, the SUSv2 siglongjmp() manpage at
> http://www.opengroup.org/onlinepubs/007908799/xsh/siglongjmp.html
> explicitly states at the end of the DESCRIPTION section:
> 
>   The effect of a call to siglongjmp() where initialisation of the jmp_buf
>   structure was not performed in the calling thread is undefined.

Right, and I think there's a little stronger wording than that
in the '96 POSIX spec.  It doesn't make sense regardless because
you'd be trying to jump to a context that uses another thread's
stack.

> > Blocking all signals resulted in an application which executed but
> > still I got problems with slow responses from libcurl
> 
> As I understand it, the only reason for SIGALRM to make a difference
> would be a situation where a DNS query times out, at least by libcurl's
> standards.  Is your application trying to do such lookups?
> 
> If anybody is interested, I am attaching a short proof-of-concept
> program which starts up two threads, then waits for a signal handler to
> hit.  If the longjmp() call is commented out, it displays the thread ID
> of the thread which received the signal - almost always the main thread,
> the one listed as 'me' in the list output at the program start, and most
> definitely not the last thread to call setjmp(), as that would be 't2'.
> If the longjmp() call is uncommented, the signal handler executing in
> the 'me' thread will longjmp() to a buffer initialized in the 't2'
> thread, and the program will abort with your error message with a 100%
> failure (or would that be success in proving the concept?) rate.

You shouldn't be calling pthread_mutex_lock() and friends from
a signal handler; they are not async-signal-safe.

> People knowledgeable about threads: would there be a way to fix that
> problem?  I don't know.. something like examining the jump buffer, then
> activating the thread that is stored there, and resuming the currently
> executing thread at the point where it was interrupted by the signal?
> Without looking at the code, I can guess that most probably the answer
> would be a short burst of hysterical laughter :)  Still.. one may hope..
> :)

No, this can't be easily fixed nor would we want to fix it ;-)

There are other ways to wait for events.  If multiple threads
need to have timeouts then you can always create a server
thread to process timeouts.  You can use mutexes and condition
variables to add timeout events to, and remove them from, the
server thread's list, and the server thread can use alarm() and
sigwait() or sigsuspend() to wait for the alarm.  When
sigwait()/sigsuspend() wakes up, the server thread can pull expired
timeouts off its list and signal the timed-out threads using
pthread_kill() or whatever.

-- 
Dan Eischen

