Date:      Tue, 17 Aug 2004 00:14:52 -0500
From:      Jon Noack <noackjr@alumni.rice.edu>
To:        Julian Elischer <julian@elischer.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: Deadlocks with recent SMP current
Message-ID:  <4121944C.5060802@alumni.rice.edu>
In-Reply-To: <411EF85A.30006@elischer.org>
References:  <20040813121208.M31181@cvs.imp.ch> <20040813102922.E93695@carver.gumbysoft.com> <411D20DF.2000503@samsco.org> <411E9399.3050200@alumni.rice.edu> <411EF85A.30006@elischer.org>

On 08/15/04 00:44, Julian Elischer wrote:
> Jon Noack wrote:
>> On 08/13/04 15:13, Scott Long wrote:
>>> Can you try the patch below? It's really only a band-aid, but might
>>> make things usable for now. Also, are more lockups being seen under
>>> ULE or under 4BSD? There was a recent change to ULE (rev 1.120 of
>>> sched_ule.c) that seems to have aggravated the scheduler problems on
>>> my test systems.
>>>
>>> Scott
>>>
>>> Index: kern_switch.c
>>> ===================================================================
>>> RCS file: /usr/ncvs/src/sys/kern/kern_switch.c,v
>>> retrieving revision 1.78
>>> diff -u -r1.78 kern_switch.c
>>> --- kern_switch.c       10 Aug 2004 00:26:25 -0000      1.78
>>> +++ kern_switch.c       13 Aug 2004 20:11:27 -0000
>>> @@ -345,6 +345,8 @@
>>>                 return;
>>>         }
>>>
>>> +       critical_enter();
>>> +
>>>         tda = kg->kg_last_assigned;
>>>         if ((ke = td->td_kse) == NULL) {
>>>                 if (kg->kg_idle_kses) {
>>> @@ -441,6 +443,7 @@
>>>                 CTR3(KTR_RUNQ, "setrunqueue: held: td%p kg%p pid%d",
>>>                         td, td->td_ksegrp, td->td_proc->p_pid);
>>>         }
>>> +       critical_exit();
>>>  }
>>>
>>>  /*
>>
>> Here's a data point:
>> My dual Pentium3 system has been up for 20+ hours with this patch. 
>> Previously, it wouldn't survive for more than an hour or so 
>> (regardless of load).
> 
> try the following change instead:
> in maybe_preempt() in kern_switch.c
> 
>         ctd = curthread;
> +        if ((ctd->td_kse == NULL) || (ctd->td_kse->ke_thread != ctd))
> +               return (0);
>         pri = td->td_priority;

With the previous patch I still had difficulties getting through a 
buildworld in multi-user (while running apache, postfix+amavisd-new, 
nfs, etc.).  With this patch I have not run into any issues (make -j4 
buildworlds are stable on my dual p3 even after uncommenting 
-DUSE_KQUEUE and rebuilding make).  If the last patch was a band-aid,
this is one of those new-fangled "sport" band-aids that are water- and
sweat-resistant... ;-)
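
For anyone following along, here is how I read the guard in that patch, as a small userland sketch. This is not the kernel code: the two structs are cut-down stand-ins that carry only the fields the check reads, owns_kse() and toy_maybe_preempt() are names made up for illustration, and the final priority comparison is a simplifying assumption about what the rest of maybe_preempt() decides. The point is only that preemption is refused up front when the current thread has no KSE, or when its KSE is bound to a different thread.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; only the fields the guard reads. */
struct thread;

struct kse {
	struct thread *ke_thread;	/* thread this KSE is currently bound to */
};

struct thread {
	struct kse *td_kse;		/* KSE assigned to this thread, or NULL */
	int td_priority;		/* lower value means higher priority */
};

/*
 * Hypothetical helper: true only if ctd has a KSE and that KSE points
 * back at ctd. This is the negation of the condition added in the patch.
 */
static int
owns_kse(const struct thread *ctd)
{
	return (ctd->td_kse != NULL && ctd->td_kse->ke_thread == ctd);
}

/*
 * Simplified model of the decision: bail out with 0 when the current
 * thread does not own its KSE, otherwise preempt only if td has a
 * strictly better (numerically lower) priority. The real function does
 * considerably more than this.
 */
static int
toy_maybe_preempt(const struct thread *ctd, const struct thread *td)
{
	assert(ctd != NULL && td != NULL);

	if (!owns_kse(ctd))
		return (0);
	return (td->td_priority < ctd->td_priority);
}
```

In this model a current thread whose td_kse is NULL, or whose KSE has already been handed to another thread, is never preempted, whatever the priority of the incoming thread.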

Jon


