Date:      Thu, 14 Nov 2019 09:02:32 -0700
From:      Ian Lepore <ian@freebsd.org>
To:        Daniel Braniss <danny@cs.huji.ac.il>
Cc:        freebsd-hackers <freebsd-hackers@freebsd.org>
Subject:   Re: can the hardware watchdog reboot a hung kernel?
Message-ID:  <828605fef472e04311c83a7de0d1f4df429ae717.camel@freebsd.org>
In-Reply-To: <2AD912BF-97B0-421D-B561-722D74864DC9@cs.huji.ac.il>
References:  <EC4DB495-55D0-44BB-8D6A-0301785FADC7@cs.huji.ac.il> <9cded04a-9ae1-881e-3962-7ef0322e96ed@grosbein.net> <2AD912BF-97B0-421D-B561-722D74864DC9@cs.huji.ac.il>

On Thu, 2019-11-14 at 17:35 +0200, Daniel Braniss wrote:
> > On 14 Nov 2019, at 17:28, Eugene Grosbein <eugen@grosbein.net>
> > wrote:
> > 
> > 14.11.2019 21:52, Daniel Braniss wrote:
> > 
> > > hi,
> > > I have several hundred Nano-pi NEO running, and sometimes they
> > > hang, since there is no console
> > > available, the only solution is to do a power cycle - not so easy
> > > since they are distributed in three buildings :-)
> > > 
> > > I am looking at the watchdog stuff, but it seems that what I want
> > > is not supported, i.e.
> > > 	reboot the kernel when hung 
> > > 
> > > wishful thinking?
> > 
> > It's possible if the hardware has such a watchdog and kernel
> > subsystem watchdog(4) supports it.
> > rc.conf(5) manual page describes watchdogd_enable option.
> > 
> 
> yes, but it relies on userland, what if the kernel is hung?
> 

It relies on the userland daemon to issue the ioctl() calls to pet the
dog.  If the kernel is hung, then userland code isn't going to run
either, so the watchdog petting stops and the hardware reboots the
system once its timeout expires.

We use this at $work specifically to reboot if the kernel hangs, using
this config:

watchdogd_enable=YES
watchdogd_flags="-s 16 -t 64 -x 64"

That says the daemon should pet the dog every 16 seconds, and the
hardware is programmed to reboot if 64 seconds elapse without petting.
In addition, when watchdogd is shut down normally (like during a normal
system reboot) it doesn't disable the watchdog hardware; instead it
sets the timeout to 64s to protect against any kind of hang during the
reboot.
The -t and -x times can be different, 64s just happens to work well for
us in both cases.
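
As a rough illustration of the timing described above, here is a small
Python model. It is only a sketch, not how watchdogd itself is written:
the function names are made up for this example, it assumes the daemon
pets at exact multiples of the -s interval starting at t=0, and it
ignores any rounding the driver or hardware may apply to the timeout.

```python
def reboot_time(hang_at, sleep_s=16, timeout_s=64):
    """Return when the hardware would fire in this simplified model.

    The daemon pets at t = 0, sleep_s, 2*sleep_s, ... while the system
    is alive, and each pet restarts a timeout_s countdown. Nothing pets
    after hang_at, so the countdown started by the last pet runs out.
    """
    if sleep_s <= 0 or timeout_s <= sleep_s:
        raise ValueError("timeout must exceed the petting interval")
    last_pet = (hang_at // sleep_s) * sleep_s  # last pet at or before the hang
    return last_pet + timeout_s


def missed_pets_tolerated(sleep_s=16, timeout_s=64):
    """Scheduled pets that fall strictly before the countdown expires."""
    return -(-timeout_s // sleep_s) - 1  # ceil(timeout / sleep) - 1


if __name__ == "__main__":
    hang = 100
    fire = reboot_time(hang)
    print(f"hang at {hang}s, reset at {fire}s ({fire - hang}s later)")
    print(f"missed pets tolerated: {missed_pets_tolerated()}")
```

In this model, with -s 16 and -t 64, a hang at t=100s follows a last pet
at t=96s, so the reset comes at t=160s. The delay between hang and reset
is always somewhere between 48s and 64s, and up to three consecutive
pets can be missed without triggering a reset.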

-- Ian