Date:      Sat, 4 Nov 2017 23:35:47 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        Peter Wemm <peter@wemm.org>
Cc:        "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, Warner Losh <imp@freebsd.org>,  src-committers <src-committers@freebsd.org>,  "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>
Subject:   Re: svn commit: r325378 - head/sys/dev/ipmi
Message-ID:  <CANCZdfpkwhcVefhr1bp7oAVvq_uy1ASLot6XdV=zCTYnY3cx7g@mail.gmail.com>
In-Reply-To: <1595776.mmy5sTxHyV@overcee.wemm.org>
References:  <201711040301.vA431wdY002757@repo.freebsd.org> <2932858.xKWtPkGhRe@overcee.wemm.org> <CANCZdfq8jnuO8_=5PFFbXeEu_V14LM4_zYxjF2EBsmk9g-srMQ@mail.gmail.com> <1595776.mmy5sTxHyV@overcee.wemm.org>

On Sat, Nov 4, 2017 at 11:19 PM, Peter Wemm <peter@wemm.org> wrote:

> On Saturday, November 04, 2017 11:03:55 PM Warner Losh wrote:
> > On Sat, Nov 4, 2017 at 10:50 PM, Peter Wemm <peter@wemm.org> wrote:
> > > On Saturday, November 04, 2017 03:01:58 AM Warner Losh wrote:
> > > > Author: imp
> > > > Date: Sat Nov  4 03:01:58 2017
> > > > New Revision: 325378
> > > > URL: https://svnweb.freebsd.org/changeset/base/325378
> > > >
> > > > Log:
> > > >   Make the startup timeout 0 seconds by default rather than 420s. This
> > > >   makes the default fail safe when watchdogd is disabled (which is also
> > > >   the default).
> > >
> > > We're still getting unanticipated reboots.
> > >
> > > I think what is happening is:
> > > 1) orderly reboot initiated.
> > > 2) By default, the watchdog code sets a 420 second timer, even with no
> > > watchdogd.
> > > 3) reboot completes, system comes up.
> > > 4) A few minutes later, the pre-reboot 420 second timer expires and
> > > *another* reboot happens.
> > >
> > > Setting hw.ipmi.on="0" in loader.conf stops this...
> > >
> > > eg: reboot at 4:41:47.. system comes back up, and later:
> > > ...
> > > Uptime: 322 Sun Nov 5 04:48:45 UTC 2017
> > > Uptime: 323 Sun Nov 5 04:48:46 UTC 2017
> > > Uptime: 324 Sun Nov 5 04:48:47 UTC 2017
> > > Stopping cron.
> > > Waiting for PIDS: 1004.
> > > Stopping sshd.
> > > Waiting for PIDS: 994.
> > > Stopping nginx.
> > > ...
> > > That's exactly 420 seconds after the original reboot, which matches the
> > > wd_shutdown_countdown timer that is still enabled.
> >
> > Good detective work. I suspect this will need to be opt-in as well...
> > Though the other option is to disable the watchdog on attach if we're not
> > enabling the early watchdog, which would give us a watchdog when we hang on
> > shutdown...  I need to think this through.... Fix it early with less
> > protection by setting this to 0, or fix it later with more protection, but
> > perhaps odd behavior for some edge cases like downgrade.
> >
> > In the meantime, hw.ipmi.wd_shutdown_countdown=0 should also fix it. Can
> > you confirm that?
> >
> > Warner
>
> We have a number of obnoxious machines that take 5+ minutes in POST.  The 7
> minute timer is cutting it awfully close.
>
> However, what I'm more worried about: what if you're going to boot something
> other than FreeBSD?  Or going into the BIOS to tweak something?   If I break
> into the loader to pause booting, it'll just silently reboot out from under me
> a few minutes later.   I don't see how this can be anything but opt-in by
> default.  As it's a timer initiated by an orderly shutdown/reboot there should
> be plenty of time for an appropriate value to be safely set.
>
> Yes, setting the sysctl after boot did prevent the spurious reboot after the
> next boot-up.


OK. Given the edge cases aren't so edgy as I was originally thinking, I'm
inclined to agree here: both features have to be opt-in. Attempts at being
clever only work in a monoculture of FreeBSD where one is always moving
forward in versions and never back. There are problems with both of these
assumptions...

Sorry for what sounds like a lot of hassle to diagnose this.

Warner
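
For reference, the timing argument in the quoted report can be checked with a short sketch. It assumes the reboot at 4:41:47 happened on the same UTC date as the logged "Uptime" lines (Sun Nov 5 2017), which the report implies but does not state outright; the variable names are illustrative only and do not come from the ipmi driver.

```python
from datetime import datetime, timedelta

# Time the orderly reboot was initiated ("reboot at 4:41:47" in the report).
# Assumption: same UTC date as the logged Uptime lines.
reboot_started = datetime(2017, 11, 5, 4, 41, 47)

# Shutdown countdown that was armed at reboot time (420 seconds, per the report).
countdown = timedelta(seconds=420)

# Timestamp of the last "Uptime" line before services began stopping.
observed_shutdown = datetime(2017, 11, 5, 4, 48, 47)

expected_expiry = reboot_started + countdown
print(expected_expiry.time())                # 04:48:47
print(expected_expiry == observed_shutdown)  # True
```

The match is consistent with the stale pre-reboot timer being the trigger, though it does not by itself rule out other causes.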


