Date:      Tue, 4 Dec 2001 12:46:24 -0600
From:      Alfred Perlstein <bright@mu.org>
To:        Daniel Eischen <deischen@gdeb.com>
Cc:        Dan Eischen <eischen@vigrid.com>, Louis-Philippe Gagnon <louisphilippe@macadamian.com>, freebsd-current@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Subject:   Re: Possible libc_r pthread bug
Message-ID:  <20011204124624.L92148@elvis.mu.org>
In-Reply-To: <3C0D1680.E3461FB@gdeb.com>; from deischen@gdeb.com on Tue, Dec 04, 2001 at 01:31:28PM -0500
References:  <094601c179ea$7cca85c0$2964a8c0@MACADAMIAN.com> <Pine.SUN.3.91.1011130170847.14642A-100000@pcnet1.pcnet.com> <20011204021815.E92148@elvis.mu.org> <3C0CC2FE.275F4C68@vigrid.com> <20011204114236.H92148@elvis.mu.org> <3C0D1680.E3461FB@gdeb.com>

* Daniel Eischen <deischen@gdeb.com> [011204 12:32] wrote:
> Alfred Perlstein wrote:
> > 
> > * Dan Eischen <eischen@vigrid.com> [011204 06:26] wrote:
> > >
> > > There are already cancellation tests when resuming threads
> > > whose contexts are not saved as a result of a signal interrupt
> > > (ctxtype != CTX_UC). You shouldn't test for cancellation when
> > > ctxtype == CTX_UC because you are running on the scheduler
> > > stack, not the thread's stack.
> > 
> > That makes sense, but why?
> 
> Because when a thread gets cancelled, pthread_exit gets called
> which then calls the scheduler again.  It is also possible to
> get interrupted during this process and the thread's context
> (which is operating on the scheduler stack) could get saved.
> The scheduler could get entered again, and if the thread
> gets resumed, it'll longjmp to the saved context which is the
> scheduler stack (and which was just trashed by entering the
> scheduler again).
> 
> It is too confusing to try to handle conditions like this, and
> the threads library doesn't need to get any more confusing ;-)
> Once the scheduler is entered, no pthread routines should
> be called and the scheduler should not be recursively
> entered.  The only way out of the scheduler should be a
> longjmp or sigreturn to a saved thread context.
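
To check my understanding of the rule above, here is a toy sketch of the
"save with setjmp, leave the scheduler only by longjmp, never re-enter"
pattern. This is not libc_r code: saved_ctx, in_sched, toy_sched() and
yield_and_resume() are names I made up, standing in loosely for
curthread->ctx.jb, _thread_kern_in_sched and _thread_kern_sched.

```c
#include <setjmp.h>

static jmp_buf saved_ctx;          /* made-up stand-in for curthread->ctx.jb */
static int in_sched = 0;           /* made-up stand-in for _thread_kern_in_sched */
static int reentry_attempts = 0;   /* counts refused recursive entries */

/* Toy scheduler: its only way out on success is a longjmp to a saved context. */
static void toy_sched(void)
{
    if (in_sched) {
        /* Already inside the scheduler: refuse to enter recursively. */
        reentry_attempts++;
        return;
    }
    in_sched = 1;
    /* A real scheduler would choose the next thread here. */
    longjmp(saved_ctx, 1);
}

/* Save the caller's context, enter the scheduler, and get resumed. */
static int yield_and_resume(void)
{
    if (setjmp(saved_ctx) == 0) {
        toy_sched();
        return 0;   /* reached only when the scheduler refused entry */
    }
    /* Resumed via longjmp: we are out of the scheduler again. */
    in_sched = 0;
    return 1;
}
```

Calling yield_and_resume() from main() returns 1 on the normal path. If
in_sched is already set when it is called, toy_sched() returns without
jumping and the call returns 0, which is the recursive entry the rule
forbids.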

Ok, for the sake of beating a clue into me...

In uthread_kern.c, in _thread_kern_sched():

                /* Save the state of the current thread: */
                if (_setjmp(curthread->ctx.jb) == 0) {
                        /* Flag the jump buffer was the last state saved: */
                        curthread->ctxtype = CTX_JB_NOSIG;
                        curthread->longjmp_val = 1;
                } else {
                        DBG_MSG("Returned from ___longjmp, thread %p\n",
                            curthread);
                        /*
                         * This point is reached when a longjmp() is called
                         * to restore the state of a thread.
                         *
                         * This is the normal way out of the scheduler.
                         */
                        _thread_kern_in_sched = 0;

                        if (curthread->sig_defer_count == 0) {
                                if (((curthread->cancelflags &
                                    PTHREAD_AT_CANCEL_POINT) == 0) &&
                                    ((curthread->cancelflags &
                                    PTHREAD_CANCEL_ASYNCHRONOUS) != 0))
                                        /*
                                         * Cancellations override signals.
                                         *
                                         * Stick a cancellation point at the
                                         * start of each async-cancellable
                                         * thread's resumption.
                                         *
                                         * We allow threads woken at cancel
                                         * points to do their own checks.
                                         */
                                        pthread_testcancel();
                        }

Why isn't this "working"?  Shouldn't it be doing the right thing?
What if curthread->sig_defer_count wasn't tested?
Maybe this should be a test against curthread->sig_defer_count <= 1?
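
To make the question concrete, the test above can be pulled out into a
plain predicate and poked at in isolation. This is only a sketch: the
TOY_ flag values are invented for illustration (the real
PTHREAD_AT_CANCEL_POINT and PTHREAD_CANCEL_ASYNCHRONOUS values live in
libc_r's private headers), and would_testcancel() is a hypothetical
helper, not something in the library.

```c
/* Invented bit values, for illustration only. */
#define TOY_AT_CANCEL_POINT      0x0004
#define TOY_CANCEL_ASYNCHRONOUS  0x0002

/*
 * Returns 1 when the resume path quoted above would reach
 * pthread_testcancel(), 0 otherwise.
 */
static int would_testcancel(int sig_defer_count, int cancelflags)
{
    /* Any nonzero deferral count skips the check entirely. */
    if (sig_defer_count != 0)
        return 0;

    /* Only async-cancellable threads that are not at a cancel point. */
    return (cancelflags & TOY_AT_CANCEL_POINT) == 0 &&
           (cancelflags & TOY_CANCEL_ASYNCHRONOUS) != 0;
}
```

With this, would_testcancel(1, TOY_CANCEL_ASYNCHRONOUS) is 0, so a
thread resumed with sig_defer_count == 1 never reaches the cancellation
check as the code is written now.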

I'll play with this some more when I get back to my box at home;
it just seems bizarro to me.


-- 
-Alfred Perlstein [alfred@freebsd.org]
'Instead of asking why a piece of software is using "1970s technology,"
 start asking why software is ignoring 30 years of accumulated wisdom.'
                           http://www.morons.org/rants/gpl-harmful.php3

