Date:      Wed, 15 Dec 2004 07:15:26 -0800
From:      Kris Kennaway <kris@obsecurity.org>
To:        John Baldwin <jhb@FreeBSD.org>
Cc:        Kris Kennaway <kris@obsecurity.org>
Subject:   Re: cvs commit: src/sys/i386/i386 vm_machdep.c
Message-ID:  <20041215151526.GA3462@xor.obsecurity.org>
In-Reply-To: <200412142148.48019.jhb@FreeBSD.org>
References:  <200411300618.iAU6IkQX065609@repoman.freebsd.org> <41BF6F44.2090407@root.org> <20041215001034.GA60875@xor.obsecurity.org> <200412142148.48019.jhb@FreeBSD.org>

On Tue, Dec 14, 2004 at 09:48:48PM -0500, John Baldwin wrote:
> On Tuesday 14 December 2004 07:10 pm, Kris Kennaway wrote:
> > On Tue, Dec 14, 2004 at 02:55:00PM -0800, Nate Lawson wrote:
> > > >Erm, well, that's not always easy since sometimes when you panic you
> > > > can't talk to the other CPUs for whatever reason.  Putting back the
> > > > proxy reset doesn't hurt for now but does restore functionality in at
> > > > least some cases.  I'd rather have that than certain hard panics not
> > > > get into ddb because we couldn't get onto the BSP to run ddb.
> > >
> > > Perhaps you could give me some pointers on what is counted on to be
> > > > working when panic() is called?  I can't come up with a situation where
> > > the proxy code couldn't be used upon entry to ddb.  If there were any
> > > cases like this, the proxy code wouldn't work for cpu_reset() either.
> > > > Also, in such a case, it's hard to see how ddb could be usable since it
> > > > tries to stop other processors, which requires similar code to the proxy.
> > >
> > > Or in other words, if you have enough capability to call panic() or
> > > break to ddb, then you have enough resources to do an IPI and get onto
> > > the BSP.
> >
> > NB: DDB often isn't usable on SMP machines these days, and will hang
> > when a panic tries to enter it.
>
> Try debug.kdb.stop_cpus=0 (sysctl and tunable) to prevent KDB from trying to
> stop the other CPUs.  Another possible fix that ups@ has talked about is
> changing IPI_STOP to use an NMI rather than a vector (you can send NMI IPIs
> via the local APIC) so that IPI_STOP is more reliable.

This is already set, and it doesn't always fix the problem.  I often
get overlapping panics from the other CPUs on this machine, and it
often locks up when trying to enter DDB, or while printing the panic
string (the other day it only got as far as 'p' before hanging).

Kris
