Date:      Tue, 4 Nov 2008 14:51:57 -0800
From:      "Peter Wemm" <peter@wemm.org>
To:        "John Baldwin" <jhb@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, David Xu <davidxu@freebsd.org>
Subject:   Re: svn commit: r184216 - head/sys/kern
Message-ID:  <e7db6d980811041451w1da54fa3lceeed73a640f51a0@mail.gmail.com>
In-Reply-To: <200811041707.26052.jhb@freebsd.org>
References:  <200810240103.m9O13V7f071075@svn.freebsd.org> <200811041707.26052.jhb@freebsd.org>

On Tue, Nov 4, 2008 at 2:07 PM, John Baldwin <jhb@freebsd.org> wrote:
> On Thursday 23 October 2008 09:03:31 pm David Xu wrote:
>> Author: davidxu
>> Date: Fri Oct 24 01:03:31 2008
>> New Revision: 184216
>> URL: http://svn.freebsd.org/changeset/base/184216
>>
>> Log:
>>   partly revert revision 184199, because TDF_NEEDSIGCHK is persitent
>>   when thread is in kernel mode, it can cause dead loop, now unlock
>>   process lock after acquired sleep queue lock and thread lock to
>>   avoid the problem. This means TDF_NEEDSIGCHK and TDF_NEEDSUSPCHK must
>>   be set with process lock and thread lock being hold at same time.
>
> You can't unlock the proc lock while holding the thread_lock().  This will
> lead to deadlock due to the way that thread_lock() works.  This is different
> from the rules in 6.x where you could drop a mutex while holding sched_lock.
> You will need to revert this.
>
> --
> John Baldwin


I had to back out both r184216 and r184199 entirely in order to stop my
machine from dying.

Compile this dumb program:
http://people.freebsd.org/~peter/pth.c
$ cc -pthread -o pth pth.c
Run it in a shell while loop so that the whole program is exec'd and
exits repeatedly:
$ while true; do date; ./pth; done

On my 2-core Athlon64 box at home, and on the 8-core ref8-i386 in the
freebsd.org cluster, this causes a lockup within seconds.

Backing out these two changes solves it.

My machine:
spin lock 0xffffff00a4037000 (turnstile lock) held by 0xffffff01045746e0 (tid 100355) too long
panic: spin lock held too long

ref8-i386:
spin lock 0xc06436c0 (sched lock 5) held by 0xd374f690 (tid 100249) too long
panic: spin lock held too long
-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell


