From owner-freebsd-hackers Mon Oct 21 12:45:22 2002
Date: Mon, 21 Oct 2002 22:44:53 +0300
From: Peter Pentchev
To: Linus Kendall
Cc: freebsd-hackers@freebsd.org
Subject: Re: PThreads problem
Message-ID: <20021021194453.GB377@straylight.oblivion.bg>
In-Reply-To: <1035218026.24330.33.camel@bilbo>

On Mon, Oct 21, 2002 at 06:33:46PM +0200, Linus Kendall wrote:
> Answer inline below.
>
> Mon 2002-10-21 at 15:50, Peter Pentchev wrote:
> > On Mon, Oct 21, 2002 at 04:48:34PM +0300, Peter Pentchev wrote:
> > > On Mon, Oct 21, 2002 at 03:24:08PM +0200, Linus Kendall wrote:
> > > > Mon 2002-10-21 at 14:45, Peter Pentchev wrote:
> > > > > On Mon, Oct 21, 2002 at 01:35:59PM +0200, Linus Kendall wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I'm trying to port a heavily threaded application from Linux
> > > > > > (Debian 3.0, 2.4.19) to FreeBSD (4.6-RELEASE). The program
> > > > > > compiles successfully using gcc with -pthreads. But, when I try
> > > > > > to run the application I get the following error after a while
> > > > > > (after spawning 11 threads):
> > > > > >
> > > > > > Fatal error 'siglongjmp()ing between thread contexts is undefined by
> > > > > > POSIX 1003.1' at line ? in file
> > > > > > /usr/src/lib/libc_r/uthread/uthread_jmp.c (errno = ?)
> > > > > > Abort trap - core dumped
> > > > > >
[snip]
> > > This is interesting; can you produce a simple testcase?  If not, I will
> > > be able to take a look at it some time later today or tomorrow, but not
> > > right now :(
>
> I'm not sure if I've really got time to produce a testcase. As I've
> understood the main cause of the crash was that in *BSD the signals
> are sent to each thread but in Linux they're sent to the process.

Okay, I can see what the problem is; however, I have absolutely no idea
how it is to be solved :(

The DNS resolution routines of libcurl use alarm() as a timeout
mechanism for the system DNS resolving functions.  To enforce the
timeout even when the resolver functions are automatically restarted
after the SIGALRM signal, libcurl attempts to set a jump buffer in the
thread doing the DNS lookup, and to siglongjmp() to it from the SIGALRM
handler.

This scheme works just fine on Linux, where each thread executes as a
separate process; the signal is correctly delivered to the thread which
invoked alarm(), and, consequently, exactly the one that set the jump
buffer in the first place.  On FreeBSD, however, the signal is delivered
merely to the currently executing thread; if the resolver routines are
currently in the process of sending or receiving data on a network
socket, the currently executing thread may very well not be the one that
has requested the resolving, and so siglongjmp() may be called from a
thread which is NOT the one the jump buffer has been set in.  As the
abort error message states, this is behavior not covered by any
standards, and, I dare say, not very easy to implement at all, so it is
currently unimplemented in FreeBSD.

For a standards reference, the SUSv2 siglongjmp() manpage at
http://www.opengroup.org/onlinepubs/007908799/xsh/siglongjmp.html
explicitly states at the end of the DESCRIPTION section:

  The effect of a call to siglongjmp() where initialisation of the
  jmp_buf structure was not performed in the calling thread is
  undefined.

> Blocking all signals resulted in an application which executed but
> still I got problems with slow responses from libcurl

As I understand it, the only reason for SIGALRM to make a difference
would be a situation where a DNS query times out, at least by libcurl's
standards.  Is your application trying to do such lookups?

If anybody is interested, I am attaching a short proof-of-concept
program which starts up two threads, then waits for a signal handler to
hit.  If the siglongjmp() call is commented out, it displays the thread
ID of the thread which received the signal - almost always the main
thread, the one listed as 'me' in the list output at the program start,
and most definitely not the last thread to call sigsetjmp(), as that
would be 't2'.
If the siglongjmp() call is uncommented, the signal handler executing in
the 'me' thread will siglongjmp() to a buffer initialized in the 't2'
thread, and the program will abort with your error message with a 100%
failure (or would that be success in proving the concept?) rate.

People knowledgeable about threads: would there be a way to fix that
problem?  I don't know.. something like examining the jump buffer, then
activating the thread that is stored there, and resuming the currently
executing thread at the point where it was interrupted by the signal?
Without looking at the code, I can guess that most probably the answer
would be a short burst of hysterical laughter :)  Still.. one may hope.. :)

G'luck,
Peter

-- 
Peter Pentchev	roam@ringlet.net	roam@FreeBSD.org
PGP key:	http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint	FDBA FD79 C26F 3C51 C95E  DF9E ED18 B68D 1619 4553
Hey, out there - is it *you* reading me, or is it someone else?

#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

pthread_mutex_t	mtxQ;		/* protects q[], tq[] and qcnt */
int		q[16];		/* signal numbers seen by the handler */
pthread_t	tq[16];		/* threads the handler ran in */
size_t		qcnt;
sigjmp_buf	jmpbuf;		/* last initialized by thread t2 */

static void
sigalarm(int f)
{
	/*
	 * Record which thread got the signal.  Locking a mutex here is
	 * not async-signal-safe; it is tolerated for this demo only.
	 */
	pthread_mutex_lock(&mtxQ);
	q[qcnt] = f;
	tq[qcnt] = pthread_self();
	qcnt++;
	pthread_mutex_unlock(&mtxQ);
	/* Uncomment to jump into a buffer set up by another thread: */
//	siglongjmp(jmpbuf, 5);
}

static void *
thr(void *arg)
{
	/* Each thread overwrites jmpbuf; t2 is the last one to do so. */
	sigsetjmp(jmpbuf, 0);
	sleep((int)(long)arg);
	return (NULL);
}

int
main(void)
{
	pthread_t t1, t2;
	size_t i;
	struct sigaction sa;

	sigsetjmp(jmpbuf, 0);
	pthread_mutex_init(&mtxQ, NULL);
	printf("me = %ld\n", (long)pthread_self());
	pthread_create(&t1, NULL, thr, (void *)4);
	printf("t1 = %ld\n", (long)t1);
	pthread_create(&t2, NULL, thr, (void *)5);
	printf("t2 = %ld\n", (long)t2);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = sigalarm;
	sigemptyset(&sa.sa_mask);
	sigaddset(&sa.sa_mask, SIGALRM);
	sigaction(SIGALRM, &sa, NULL);
	alarm(1);

	/* qcnt is a size_t, so cast it for a portable printf(). */
	printf("qcnt = %lu\n", (unsigned long)qcnt);
	sleep(3);
	printf("qcnt = %lu\n", (unsigned long)qcnt);
	sleep(3);
	printf("qcnt = %lu\n", (unsigned long)qcnt);
	sleep(3);
	printf("qcnt = %lu\n", (unsigned long)qcnt);

	/* Dump every recorded signal: index, signal number, thread. */
	for (i = 0; i < qcnt; i++)
		printf("%2lu\t%d\t%ld\n", (unsigned long)i, q[i],
		    (long)tq[i]);
	return (0);
}