Date: Thu, 14 Nov 2019 11:13:41 -0700
From: Ian Lepore <ian@freebsd.org>
To: Daniel Braniss <danny@cs.huji.ac.il>
Cc: freebsd-hackers <freebsd-hackers@freebsd.org>
Subject: Re: can the hardware watchdog reboot a hung kernel?
Message-ID: <8814791e9634980810a41b9cc229612e225a40ee.camel@freebsd.org>
In-Reply-To: <BEC1714A-2361-4B62-BEB9-82808920C269@cs.huji.ac.il>
References: <EC4DB495-55D0-44BB-8D6A-0301785FADC7@cs.huji.ac.il>
 <9cded04a-9ae1-881e-3962-7ef0322e96ed@grosbein.net>
 <2AD912BF-97B0-421D-B561-722D74864DC9@cs.huji.ac.il>
 <828605fef472e04311c83a7de0d1f4df429ae717.camel@freebsd.org>
 <BEC1714A-2361-4B62-BEB9-82808920C269@cs.huji.ac.il>

On Thu, 2019-11-14 at 20:10 +0200, Daniel Braniss wrote:
> > On 14 Nov 2019, at 18:02, Ian Lepore <ian@freebsd.org> wrote:
> >
> > On Thu, 2019-11-14 at 17:35 +0200, Daniel Braniss wrote:
> > > > On 14 Nov 2019, at 17:28, Eugene Grosbein <eugen@grosbein.net>
> > > > wrote:
> > > >
> > > > 14.11.2019 21:52, Daniel Braniss wrote:
> > > >
> > > > > hi,
> > > > > I have several hundred Nano-pi NEO running, and sometimes they
> > > > > hang, since there is no console available, the only solution
> > > > > is to do a power cycle - not so easy since they are
> > > > > distributed in three buildings :-)
> > > > >
> > > > > I am looking at the watchdog stuff, but it seems that what I
> > > > > want is not supported, i.e. reboot the kernel when hung
> > > > >
> > > > > wishful thinking?
> > > >
> > > > It's possible if the hardware has such a watchdog and kernel
> > > > subsystem watchdog(4) supports it.
> > > > rc.conf(5) manual page describes watchdogd_enable option.
> > > >
> > >
> > > yes, but it relies on user land, what if the kernel is hung?
> > >
> >
> > It relies on the userland daemon to issue the ioctl() calls to pet
> > the dog.  If the kernel is hung, then userland code isn't going to
> > run either, and the watchdog petting won't happen, and eventually
> > the hardware reboots.
> >
> > We use this at $work specifically to reboot if the kernel hangs,
> > using this config:
> >
> >   watchdogd_enable=YES
> >   watchdogd_flags="-s 16 -t 64 -x 64"
> >
> > That says the daemon should pet the dog every 16 seconds, and the
> > hardware is programmed to reboot if 64 seconds elapse without
> > petting.
> > In addition, when watchdogd is shut down normally (like during a
> > normal system reboot) it doesn't disable the watchdog hardware, it
> > sets the timeout to 64s to protect against any kind of hang during
> > the reboot.
> > The -t and -x times can be different, 64s just happens to work well
> > for us in both cases.
> >
> > -- Ian
> >
>
> ok, that is very encouraging, now a last question
> how can i hang the kernel to test that the watchdog kicks in? apart
> from writing a kernel module :-)
>

Drop into the kernel debugger and just let it sit there until it
reboots (or fails to, I guess).  Do "sysctl debug.kdb.enter=1".

-- Ian
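
For anyone timing the debugger test above, here is a rough sketch of how long
the wait should be with "-s 16 -t 64".  It is only an idealized model, not
anything taken from watchdogd itself: it assumes the daemon pets at exact
multiples of the interval starting at t=0, and that the hardware fires exactly
`timeout` seconds after the last pet (real hardware may round the timeout).
The function names are made up for illustration.

```python
def reboot_time(hang_at, pet_interval=16.0, timeout=64.0):
    """Time at which an idealized watchdog resets the board.

    Assumption: pets occur at t = 0, pet_interval, 2 * pet_interval, ...
    and stop once the kernel hangs at t = hang_at.
    """
    # The last pet that still got through before (or exactly at) the hang.
    last_pet = (hang_at // pet_interval) * pet_interval
    # The hardware counts down from the last pet, not from the hang.
    return last_pet + timeout


def delay_after_hang(hang_at, pet_interval=16.0, timeout=64.0):
    """Seconds between the hang and the watchdog-triggered reset."""
    return reboot_time(hang_at, pet_interval, timeout) - hang_at


if __name__ == "__main__":
    for hang_at in (96.0, 100.0, 111.9):
        delay = delay_after_hang(hang_at)
        print(f"hang at {hang_at:6.1f}s -> reset about {delay:.1f}s later")
```

Under those assumptions the reset arrives somewhere between (timeout -
interval) and timeout seconds after the hang, so roughly 48 to 64 seconds
with the settings quoted above.  If nothing has happened well past that
after "sysctl debug.kdb.enter=1", the watchdog probably isn't armed.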