Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 17 Nov 2011 13:14:05 +0200
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Alexander Motin <mav@freebsd.org>
Cc:        freebsd-current@freebsd.org, Andriy Gapon <avg@freebsd.org>
Subject:   Re: Stop scheduler on panic
Message-ID:  <20111117111405.GT50300@deviant.kiev.zoral.com.ua>
In-Reply-To: <4EC4D3DE.8080103@FreeBSD.org>
References:  <20111113083215.GV50300@deviant.kiev.zoral.com.ua> <20111116202714.5ee4bd53@fabiankeil.de> <4EC43764.1020202@FreeBSD.org> <4EC4423A.3020904@FreeBSD.org> <20111117081533.GP50300@deviant.kiev.zoral.com.ua> <4EC4C89A.2040208@FreeBSD.org> <20111117090653.GR50300@deviant.kiev.zoral.com.ua> <4EC4D3DE.8080103@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help

--Qzzl9rayGpn1ceqJ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Nov 17, 2011 at 11:29:02AM +0200, Alexander Motin wrote:
> On 11/17/11 11:06, Kostik Belousov wrote:
> > On Thu, Nov 17, 2011 at 10:40:58AM +0200, Alexander Motin wrote:
> >> On 11/17/11 10:15, Kostik Belousov wrote:
> >>> On Thu, Nov 17, 2011 at 01:07:38AM +0200, Alexander Motin wrote:
> >>>> On 17.11.2011 00:21, Andriy Gapon wrote:
> >>>>> on 16/11/2011 21:27 Fabian Keil said the following:
> >>>>>> Kostik Belousov<kostikbel@gmail.com>  wrote:
> >>>>>>
> >>>>>>> I was tricked into finishing the work by Andrey Gapon, who develo=
ped
> >>>>>>> the patch to reliably stop other processors on panic.  The patch
> >>>>>>> greatly improves the chances of getting dump on panic on SMP host.
> >>>>>>
> >>>>>> I tested the patch trying to get a dump (from the debugger) for
> >>>>>> kern/162036, which currently results in the double fault reported =
in:
> >>>>>> http://lists.freebsd.org/pipermail/freebsd-current/2011-September/=
027766.html
> >>>>>>
> >>>>>> It didn't help, but also didn't make anything worse.
> >>>>>>
> >>>>>> Fabian
> >>>>>
> >>>>> The mi_switch recursion looks very familiar to me:
> >>>>> mi_switch() at mi_switch+0x270
> >>>>> critical_exit() at critical_exit+0x9b
> >>>>> spinlock_exit() at spinlock_exit+0x17
> >>>>> mi_switch() at mi_switch+0x275
> >>>>> critical_exit() at critical_exit+0x9b
> >>>>> spinlock_exit() at spinlock_exit+0x17
> >>>>> [several pages of the previous three lines skipped]
> >>>>> mi_switch() at mi_switch+0x275
> >>>>> critical_exit() at critical_exit+0x9b
> >>>>> spinlock_exit() at spinlock_exit+0x17
> >>>>> intr_even_schedule_thread() at intr_event_schedule_thread+0xbb
> >>>>> ahci_end_transaction() at ahci_end_transaction+0x398
> >>>>> ahci_ch_intr() at ahci_ch_intr+0x2b5
> >>>>> ahcipoll() at ahcipoll+0x15
> >>>>> xpt_polled_action() at xpt_polled_action+0xf7
> >>>>>
> >>>>> In fact I once discussed with jhb this recursion triggered from a d=
ifferent
> >>>>> place.  To quote myself:
> >>>>> <avg>    spinlock_exit ->  critical_exit ->  mi_switch ->  kdb_swit=
ch ->
> >>>>> thread_unlock ->  spinlock_exit ->  critical_exit ->  mi_switch -> =
 ...
> >>>>> <avg>    in the kdb context
> >>>>> <avg>    this issue seems to be triggered by td_owepreempt being tr=
ue at=20
> >>>>> the time
> >>>>> kdb is entered
> >>>>> <avg>    and there of course has to be an initial spinlock_exit cal=
l=20
> >>>>> somewhere
> >>>>> <avg>    in my case it's because of usb keyboard
> >>>>> <avg>    I wonder if it would make sense to clear td_owepreempt rig=
ht=20
> >>>>> before
> >>>>> calling kdb_switch in mi_switch
> >>>>> <avg>    instead of in sched_switch()
> >>>>> <avg>    clearing td_owepreempt seems like a scheduler-independent=
=20
> >>>>> operation to me
> >>>>> <avg>    or is it better to just skip locking in usb when kdb_activ=
e is set
> >>>>> <avg>    ?
> >>>>>
> >>>>> The workaround described above should work in this case.
> >>>>> Another possibility is to pessimize mtx_unlock_spin() implementatio=
ns to=20
> >>>>> check
> >>>>> SCHEDULER_STOPPED() and to bypass any further actions in that case.=
  But=20
> >>>>> that
> >>>>> would add unnecessary overhead to the sunny day code paths.
> >>>>>
> >>>>> Going further up the stack one can come up with the following propo=
sals:
> >>>>> - check SCHEDULER_STOPPED() swi_sched() and return early
> >>>>> - do not call swi_sched() from xpt_done() if we somehow know that w=
e are=20
> >>>>> in a
> >>>>> polling mode
> >>>>
> >>>> There is no flag in CAM now to indicate polling mode, but if needed,=
 it=20
> >>>> should not be difficult to add one and not call swi_sched().
> >>>
> >>> I have the following change for eons on my test boxes. Without it,
> >>> I simply cannot get _any_ dump.
> >>>
> >>> diff --git a/sys/cam/cam_xpt.c b/sys/cam/cam_xpt.c
> >>> index 10b89c7..a38e42f 100644
> >>> --- a/sys/cam/cam_xpt.c
> >>> +++ b/sys/cam/cam_xpt.c
> >>> @@ -4230,7 +4230,7 @@ xpt_done(union ccb *done_ccb)
> >>>  			TAILQ_INSERT_TAIL(&cam_simq, sim, links);
> >>>  			mtx_unlock(&cam_simq_lock);
> >>>  			sim->flags |=3D CAM_SIM_ON_DONEQ;
> >>> -			if (first)
> >>> +			if (first && panicstr =3D=3D NULL)
> >>>  				swi_sched(cambio_ih, 0);
> >>>  		}
> >>>  	}
> >>
> >> That should be OK for kernel dumping. I was thinking about CAM abusing
> >> polling not only for dumping. But looking on cases where it does it no=
w,
> >> may be it is better to rewrite them instead.
> >=20
> > So, should I interpret your response as 'Reviewed by" ?
>=20
> It feels somehow dirty to me. I don't like these global variables. If
> you consider it is fine, proceed, I see no much harm. But if not, I can
> add polling flag to the CAM. Flip a coin for me. :)
You promised to add the polling at summer' meeting in Kiev. Will you do
it now ?

--Qzzl9rayGpn1ceqJ
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (FreeBSD)

iEYEARECAAYFAk7E7H0ACgkQC3+MBN1Mb4hTQACfbU7Gf0DWa1tO4ZiC9fzlEv+d
92kAnRotW8eifQEaYvgGPjwlsFUqn/hW
=IIQ4
-----END PGP SIGNATURE-----

--Qzzl9rayGpn1ceqJ--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20111117111405.GT50300>