Date:      Mon, 21 Oct 2002 22:44:53 +0300
From:      Peter Pentchev <roam@ringlet.net>
To:        Linus Kendall <linus@angliaab.se>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: PThreads problem
Message-ID:  <20021021194453.GB377@straylight.oblivion.bg>
In-Reply-To: <1035218026.24330.33.camel@bilbo>
References:  <1035200159.24315.13.camel@bilbo> <20021021124520.GS389@straylight.oblivion.bg> <1035206648.24315.20.camel@bilbo> <20021021134834.GA41198@straylight.oblivion.bg> <20021021135045.GB41198@straylight.oblivion.bg> <1035218026.24330.33.camel@bilbo>


On Mon, Oct 21, 2002 at 06:33:46PM +0200, Linus Kendall wrote:
> Answer inline below.
>
> Mon 2002-10-21 at 15.50, Peter Pentchev wrote:
> > On Mon, Oct 21, 2002 at 04:48:34PM +0300, Peter Pentchev wrote:
> > > On Mon, Oct 21, 2002 at 03:24:08PM +0200, Linus Kendall wrote:
> > > > Mon 2002-10-21 at 14.45, Peter Pentchev wrote:
> > > > > On Mon, Oct 21, 2002 at 01:35:59PM +0200, Linus Kendall wrote:
> > > > > > Hi,
> > > > > >=20
> > > > > > I'm trying to port a heavily threaded application from Linux (D=
ebian
> > > > > > 3.0, 2.4.19) to
> > > > > > FreeBSD (4.6-RELEASE). The program compiles successfully using =
gcc with
> > > > > > -pthreads. But, when I try to run the application I get the fol=
lowing
> > > > > > error after a while (after spawning 11 threads):
> > > > > >=20
> > > > > > Fatal error 'siglongjmp()ing between thread contexts is undefin=
ed by
> > > > > > POSIX 1003.1' at line ? in file
> > > > > > /usr/src/lib/libc_r/uthread/uthread_jmp.c (errno =3D ?)
> > > > > > Abort trap - core dumped
> > > > > >=20
[snip]
> > > This is interesting; can you produce a simple testcase?  If not, I will
> > > be able to take a look at it some time later today or tomorrow, but not
> > > right now :(
>
> I'm not sure if I've really got time to produce a testcase. As I've
> understood the main cause of the crash was that in *BSD the signals
> are sent to each thread but in Linux they're sent to the process.

Okay, I can see what the problem is; however, I have absolutely no idea
how it is to be solved :(

The DNS resolution routines of libcurl use alarm() as a timeout
mechanism for the system DNS resolving functions.  To enforce the
timeout even when the resolver functions are automatically restarted
after the SIGALRM signal, libcurl attempts to set a jump buffer in the
thread doing the DNS lookup, and to siglongjmp() to it from the SIGALRM
handler.
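
To make the mechanism concrete, here is a minimal single-threaded sketch
of the alarm()/siglongjmp() timeout pattern described above.  It is not
libcurl's actual code: call_with_timeout(), slow_lookup() and
fast_lookup() are hypothetical names, and slow_lookup() merely stands in
for a resolver call that never returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static sigjmp_buf timeout_buf;	/* set by the caller, used by the handler */

static void
on_alarm(int sig)
{
	(void)sig;
	/* Jump back to the sigsetjmp() call, which then returns 1. */
	siglongjmp(timeout_buf, 1);
}

/*
 * Hypothetical helper: run slow_call() for at most `seconds` seconds.
 * Returns 0 if it finished in time, -1 if the alarm cut it short.
 */
int
call_with_timeout(void (*slow_call)(void), unsigned int seconds)
{
	struct sigaction sa, old;

	assert(slow_call != NULL && seconds > 0);
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGALRM, &sa, &old);

	/* Second argument 1: save the signal mask, so SIGALRM is
	 * unblocked again once the handler has jumped back here. */
	if (sigsetjmp(timeout_buf, 1) != 0) {
		sigaction(SIGALRM, &old, NULL);
		return (-1);		/* arrived via siglongjmp() */
	}
	alarm(seconds);
	slow_call();
	alarm(0);			/* finished in time: cancel the alarm */
	sigaction(SIGALRM, &old, NULL);
	return (0);
}

/* Stand-ins for a resolver call that hangs or returns promptly. */
void
slow_lookup(void)
{
	for (;;)
		pause();
}

void
fast_lookup(void)
{
}
```

With these stand-ins, call_with_timeout(fast_lookup, 1) should return 0
and call_with_timeout(slow_lookup, 1) should return -1 after about a
second.  The jump is only well-defined because the handler runs in the
same thread that called sigsetjmp().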

This works just fine on Linux, where each thread executes as a separate
process; the signal is correctly delivered to the thread which invoked
alarm(), and, consequently, exactly the one that set the jump buffer in
the first place.

On FreeBSD, however, the signal is delivered merely to the currently
executing thread; if the resolver routines are currently in the process
of sending or receiving data on a network socket, the currently
executing thread may very well not be the one that has requested the
resolving, and so siglongjmp() may be called from a thread which is NOT
the one the jump buffer has been set in.  As the abort error message
states, this is behavior not covered by any standards, and, I dare say,
not very easy to implement at all, so it is currently unimplemented in
FreeBSD.  For a standards reference, the SUSv2 siglongjmp() manpage at
http://www.opengroup.org/onlinepubs/007908799/xsh/siglongjmp.html
explicitly states at the end of the DESCRIPTION section:

  The effect of a call to siglongjmp() where initialisation of the jmp_buf
  structure was not performed in the calling thread is undefined.
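
One common way to stay within what the standard defines is to make sure
SIGALRM can only be delivered to the thread that initialised the jump
buffer, by blocking it in every other thread with pthread_sigmask().
The following is only a sketch under that assumption, not something
libcurl or libc_r does; worker() and run_demo() are hypothetical names,
it presumes a POSIX-conforming threads implementation, and it only helps
when a single thread at a time uses the alarm() timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static sigjmp_buf worker_buf;	/* initialised only in the worker thread */

static void
on_alarm(int sig)
{
	(void)sig;
	siglongjmp(worker_buf, 1);
}

/* The only thread that unblocks SIGALRM, so it alone can receive it. */
static void *
worker(void *arg)
{
	int *timed_out = arg;
	sigset_t set;

	sigemptyset(&set);
	sigaddset(&set, SIGALRM);
	/* Set the jump buffer first, while SIGALRM is still blocked. */
	if (sigsetjmp(worker_buf, 1) != 0) {
		*timed_out = 1;		/* arrived via siglongjmp() */
		return (NULL);
	}
	pthread_sigmask(SIG_UNBLOCK, &set, NULL);
	alarm(1);
	for (;;)			/* stand-in for a hanging lookup */
		pause();
}

/* Hypothetical driver: returns 1 if the worker's own timeout fired. */
int
run_demo(void)
{
	struct sigaction sa;
	sigset_t set;
	pthread_t t;
	int rc, timed_out = 0;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGALRM, &sa, NULL);

	/* Block SIGALRM here; threads created afterwards inherit the mask. */
	sigemptyset(&set);
	sigaddset(&set, SIGALRM);
	pthread_sigmask(SIG_BLOCK, &set, NULL);

	rc = pthread_create(&t, NULL, worker, &timed_out);
	assert(rc == 0);
	(void)rc;
	pthread_join(t, NULL);
	return (timed_out);
}
```

Here run_demo() should return 1 after about a second: the handler runs
in the worker, the only thread with SIGALRM unblocked, so siglongjmp()
is called from the thread that set the buffer.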

> Blocking all signals resulted in an application which executed but
> still I got problems with slow responses from libcurl

As I understand it, the only reason for SIGALRM to make a difference
would be a situation where a DNS query times out, at least by libcurl's
standards.  Is your application trying to do such lookups?

If anybody is interested, I am attaching a short proof-of-concept
program which starts up two threads, then waits for a signal handler to
hit.  If the siglongjmp() call is commented out, it displays the thread ID
of the thread which received the signal - almost always the main thread,
the one listed as 'me' in the list output at the program start, and most
definitely not the last thread to call sigsetjmp(), as that would be 't2'.
If the siglongjmp() call is uncommented, the signal handler executing in
the 'me' thread will siglongjmp() to a buffer initialized in the 't2'
thread, and the program will abort with your error message with a 100%
failure (or would that be success in proving the concept?) rate.

People knowledgeable about threads: would there be a way to fix that
problem?  I don't know.. something like examining the jump buffer, then
activating the thread that is stored there, and resuming the currently
executing thread at the point where it was interrupted by the signal?
Without looking at the code, I can guess that most probably the answer
would be a short burst of hysterical laughter :)  Still.. one may hope..
:)

G'luck,
Peter

-- 
Peter Pentchev	roam@ringlet.net	roam@FreeBSD.org
PGP key:	http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint	FDBA FD79 C26F 3C51 C95E  DF9E ED18 B68D 1619 4553
Hey, out there - is it *you* reading me, or is it someone else?

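#include <sys/types.h>

#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

pthread_mutex_t	 mtxQ;		/* protects the three arrays/counters below */
int		 q[16];		/* signal numbers seen by the handler */
pthread_t	 tq[16];	/* thread that ran the handler each time */
size_t		 qcnt;		/* number of recorded signals */
sigjmp_buf	 jmpbuf;	/* last set by whichever thread ran sigsetjmp() */

/*
 * SIGALRM handler: record which thread received the signal.
 * (Locking a mutex here is not async-signal-safe; good enough for a demo.)
 */
static void
sigalarm(int f)
{

	pthread_mutex_lock(&mtxQ);
	q[qcnt] = f;
	tq[qcnt] = pthread_self();
	qcnt++;
	pthread_mutex_unlock(&mtxQ);
	/* Uncomment to jump into a buffer set by another thread (aborts). */
//	siglongjmp(jmpbuf, 5);
}

/* Thread body: overwrite the shared jump buffer, then sleep arg seconds. */
static void *
thr(void *arg)
{

	sigsetjmp(jmpbuf, 0);
	sleep((unsigned int)(intptr_t)arg);
	return (NULL);
}

int
main(void)
{
	pthread_t t1, t2;
	size_t i;
	struct sigaction sa;

	sigsetjmp(jmpbuf, 0);
	pthread_mutex_init(&mtxQ, NULL);
	printf("me = %ld\n", (long)pthread_self());
	pthread_create(&t1, NULL, thr, (void *)4);
	printf("t1 = %ld\n", (long)t1);
	pthread_create(&t2, NULL, thr, (void *)5);
	printf("t2 = %ld\n", (long)t2);
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = sigalarm;
	sigemptyset(&sa.sa_mask);
	sigaddset(&sa.sa_mask, SIGALRM);
	sigaction(SIGALRM, &sa, NULL);
	alarm(1);
	printf("qcnt = %zu\n", qcnt);
	sleep(3);
	printf("qcnt = %zu\n", qcnt);
	sleep(3);
	printf("qcnt = %zu\n", qcnt);
	sleep(3);
	printf("qcnt = %zu\n", qcnt);
	/* List every recorded signal and the thread that handled it. */
	for (i = 0; i < qcnt; i++)
		printf("%2zu\t%d\t%ld\n", i, q[i], (long)tq[i]);
	return (0);
}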
