Date:      Sat, 4 Jun 2011 18:33:59 -0400
From:      Attilio Rao <attilio@freebsd.org>
To:        Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        freebsd-current@freebsd.org, freebsd-stable@freebsd.org, Andriy Gapon <avg@freebsd.org>
Subject:   Re: [poll / rfc] kdb_stop_cpus
Message-ID:  <BANLkTinJHTW9UEh8F2WFmy65uhmete%2B1wQ@mail.gmail.com>
In-Reply-To: <4DE8FD83.6030503@freebsd.org>
References:  <4DE8FA2E.4030202@FreeBSD.org> <4DE8FD83.6030503@freebsd.org>

2011/6/3 Nathan Whitehorn <nwhitehorn@freebsd.org>:
> On 06/03/11 10:13, Andriy Gapon wrote:
>>
>> I wonder if anybody uses kdb_stop_cpus with non-default value.
>> If yes, I am very interested to learn about your use case for it.
>>
>> I think that the default kdb behavior is the correct one, so it doesn't make
>> sense to have a knob to turn on incorrect behavior.
>> But I may be missing something obvious.
>>
>> The comment in the code doesn't really satisfy me:
>> /*
>>  * Flag indicating whether or not to IPI the other CPUs to stop them on
>>  * entering the debugger.  Sometimes, this will result in a deadlock as
>>  * stop_cpus() waits for the other cpus to stop, so we allow it to be
>>  * disabled.  In order to maximize the chances of success, use a hard
>>  * stop for that.
>>  */
>>
>> The hard stop should be sufficiently mighty.
>> Yes, I am aware of supposedly extremely rare situations where a deadlock
>> could happen even when using hard stop.  But I'd rather fix that than have
>> this switch.
>>
>> Oh, the commit message (from 2004) explains it:
>>>
>>> Add a new sysctl, debug.kdb.stop_cpus, which controls whether or not we
>>> attempt to IPI other cpus when entering the debugger in order to stop
>>> them while in the debugger.  The default remains to issue the stop;
>>> however, that can result in a hang if another cpu has interrupts disabled
>>> and is spinning, since the IPI won't be received and the KDB will wait
>>> indefinitely.  We probably need to add a timeout, but this is a useful
>>> stopgap in the mean time.
>>
>> But that was before we started using hard stop in this context (in 2009).
>
> Some non-x86 platforms (e.g. PPC) don't support real NMIs, and so this still
> applies.

Well, if I get Andriy's proposal right, he just wants to remove the
option of not stopping the CPUs on entering KDB. I'm not entirely
sure why there is a sysctl for disabling that, and I really don't want
it.

Note that the lack of an NMI/privileged interrupt is not going to
be a factor in this request, unless you are very worried about the easy
deadlock that a normal stop operation may lead to.
If that is the case, I think that the upcoming work on skipping
locking while entering KDB/panic is going to help a lot with this
case. At that point, removing the ability to turn off CPU stopping
will be a good idea, IMHO.
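
To make the trade-off concrete, here is a rough user-space model of the
behaviour discussed in this thread. It is only a sketch and not FreeBSD
kernel code: every name in it (model_cpu, model_stop_cpus,
model_kdb_enter, the poll budget of 1000) is hypothetical. It assumes
that a soft stop is ignored by a CPU spinning with interrupts disabled,
that a hard (NMI-like) stop is always taken, and that the wait is
bounded by a poll count in the spirit of the timeout the 2004 commit
message mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one CPU; this is not a FreeBSD structure. */
struct model_cpu {
	bool intr_disabled;	/* spinning with interrupts off */
	bool stopped;		/* has acknowledged a stop request */
};

enum model_stop_kind { MODEL_STOP_SOFT, MODEL_STOP_HARD };

/*
 * Ask every CPU except `self` to stop, polling at most `max_polls`
 * times instead of waiting forever.  Returns the number of CPUs that
 * are still running when the function gives up or finishes.
 */
static size_t
model_stop_cpus(struct model_cpu *cpus, size_t ncpus, size_t self,
    enum model_stop_kind kind, unsigned max_polls)
{
	size_t pending;

	assert(cpus != NULL && self < ncpus);
	pending = ncpus - 1;
	for (unsigned poll = 0; poll < max_polls && pending > 0; poll++) {
		pending = 0;
		for (size_t i = 0; i < ncpus; i++) {
			if (i == self)
				continue;
			/* Assumption: only a hard stop bypasses the interrupt mask. */
			if (kind == MODEL_STOP_HARD || !cpus[i].intr_disabled)
				cpus[i].stopped = true;
			if (!cpus[i].stopped)
				pending++;
		}
	}
	return (pending);
}

/*
 * Model of debugger entry.  `stop_cpus_knob` stands in for the sysctl
 * under discussion: when it is false, nobody is asked to stop.
 * Returns the number of other CPUs still running.
 */
static size_t
model_kdb_enter(struct model_cpu *cpus, size_t ncpus, size_t self,
    bool stop_cpus_knob, enum model_stop_kind kind)
{
	size_t running = 0;

	assert(cpus != NULL && self < ncpus);
	if (stop_cpus_knob)
		return (model_stop_cpus(cpus, ncpus, self, kind, 1000));
	for (size_t i = 0; i < ncpus; i++)
		if (i != self && !cpus[i].stopped)
			running++;
	return (running);
}
```

In this model, a soft stop leaves a CPU that spins with interrupts
disabled running until the poll budget is exhausted, while a hard stop
reaches it on the first poll. Turning the knob off leaves every other
CPU running while the debugger is active.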

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


