From owner-freebsd-current@FreeBSD.ORG Thu Nov 17 19:42:57 2011 Return-Path: Delivered-To: freebsd-current@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 05982106564A; Thu, 17 Nov 2011 19:42:57 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id D5D0C8FC13; Thu, 17 Nov 2011 19:42:55 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id VAA22942; Thu, 17 Nov 2011 21:42:54 +0200 (EET) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1RR7rK-000C6b-2G; Thu, 17 Nov 2011 21:42:54 +0200 Message-ID: <4EC563BB.60209@FreeBSD.org> Date: Thu, 17 Nov 2011 21:42:51 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:8.0) Gecko/20111108 Thunderbird/8.0 MIME-Version: 1.0 To: John Baldwin References: <20111113083215.GV50300@deviant.kiev.zoral.com.ua> <201111171137.18663.jhb@freebsd.org> <4EC53D1B.4000308@FreeBSD.org> <201111171409.37629.jhb@freebsd.org> In-Reply-To: <201111171409.37629.jhb@freebsd.org> X-Enigmail-Version: undefined Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Kostik Belousov , Alexander Motin , freebsd-current@FreeBSD.org Subject: Re: Stop scheduler on panic X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Nov 2011 19:42:57 -0000 on 17/11/2011 21:09 John Baldwin said the following: > On Thursday, November 17, 2011 11:58:03 am Andriy Gapon wrote: >> on 17/11/2011 18:37 John Baldwin said the following: >>> On Thursday, November 17, 2011 4:47:42 am Andriy Gapon wrote: >>>> on 17/11/2011 10:34 Andriy Gapon said the following: >>>>> on 17/11/2011 10:15 Kostik Belousov said the following: >>>>>> I have the following change for eons on my test boxes. Without it, >>>>>> I simply cannot get _any_ dump. >>>>>> >>>>>> diff --git a/sys/cam/cam_xpt.c b/sys/cam/cam_xpt.c >>>>>> index 10b89c7..a38e42f 100644 >>>>>> --- a/sys/cam/cam_xpt.c >>>>>> +++ b/sys/cam/cam_xpt.c >>>>>> @@ -4230,7 +4230,7 @@ xpt_done(union ccb *done_ccb) >>>>>> TAILQ_INSERT_TAIL(&cam_simq, sim, links); >>>>>> mtx_unlock(&cam_simq_lock); >>>>>> sim->flags |= CAM_SIM_ON_DONEQ; >>>>>> - if (first) >>>>>> + if (first && panicstr == NULL) >>>>>> swi_sched(cambio_ih, 0); >>>>>> } >>>>>> } >>>>> >>>>> I think that this (or similar) change should go into the patch and the tree. >>>>> >>>> >>>> And, BTW, I still would like to do something like the following (perhaps with >>>> td_oncpu = NOCPU and td_flags &= ~TDF_NEEDRESCHED also moved to the common code): >>>> >>>> Index: sys/kern/sched_ule.c >>>> =================================================================== >>>> --- sys/kern/sched_ule.c (revision 227608) >>>> +++ sys/kern/sched_ule.c (working copy) >>>> @@ -1790,7 +1790,6 @@ sched_switch(struct thread *td, struct thread *new >>>> td->td_oncpu = NOCPU; >>>> if (!(flags & SW_PREEMPT)) >>>> td->td_flags &= ~TDF_NEEDRESCHED; >>>> - td->td_owepreempt = 0; >>>> tdq->tdq_switchcnt++; >>>> /* >>>> * The lock pointer in an idle thread should never change. Reset it >>>> Index: sys/kern/kern_synch.c >>>> =================================================================== >>>> --- sys/kern/kern_synch.c (revision 227608) >>>> +++ sys/kern/kern_synch.c (working copy) >>>> @@ -406,6 +406,8 @@ mi_switch(int flags, struct thread *newtd) >>>> ("mi_switch: switch must be voluntary or involuntary")); >>>> KASSERT(newtd != curthread, ("mi_switch: preempting back to ourself")); >>>> >>>> + td->td_owepreempt = 0; >>>> + >>>> /* >>>> * Don't perform context switches from the debugger. >>>> */ >>>> Index: sys/kern/sched_4bsd.c >>>> =================================================================== >>>> --- sys/kern/sched_4bsd.c (revision 227608) >>>> +++ sys/kern/sched_4bsd.c (working copy) >>>> @@ -940,7 +940,6 @@ sched_switch(struct thread *td, struct thread *new >>>> td->td_lastcpu = td->td_oncpu; >>>> if (!(flags & SW_PREEMPT)) >>>> td->td_flags &= ~TDF_NEEDRESCHED; >>>> - td->td_owepreempt = 0; >>>> td->td_oncpu = NOCPU; >>>> >>>> /* >>>> >>>> Does anybody see any potential problems with such a change? >>> >>> Hmm, does this mean the preemption will be lost if you break into the >>> debugger and continue in the non-panic case? >> >> Not sure which exact scenario you have in mind. >> Please note that the above diff just moves resetting of td_owepreempt to an >> earlier place. As far as I can see there are no checks of td_owepreempt value >> between the new place and the old places. > > I'm worried that you are clearing td_owepreempt even in cases where a context > switch is not performed. So say you enter DDB with td_owepreempt set and that > DDB bails on a context switch. With your change it will now clear td_owepreempt > and "lose" the preemption. > And without the change we get the recursion and double-fault because of kdb_switch -> thread_unlock -> spinlock_exit -> critical_exit -> mi_switch in this case ? BTW, it is my opinion that we really should not let the debugger code call mi_switch for any reason. -- Andriy Gapon