Date:      Mon, 25 Feb 2013 00:04:41 -0800
From:      Alfred Perlstein <bright@mu.org>
To:        Mark Atkinson <atkin901@gmail.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: watchdogs
Message-ID:  <512B1B19.3000207@mu.org>
In-Reply-To: <kg87el$iip$1@ger.gmane.org>
References:  <512525C1.1070502@norma.perm.ru> <kg87el$iip$1@ger.gmane.org>

On 2/22/13 8:47 AM, Mark Atkinson wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 02/20/2013 11:36, Eugene M. Zheganin wrote:
>> Hi.
>>
>> I have a bunch of FreeBSDs that hangs (and I really want to do
>> something to fight this). May be it's the zfs or may be it's the pf
>> (I also have a bunch of really stable ones, so it's hard to isolate
>> and tell). Since 9.x hang more often I suppose it's pf. I use
>> ichwd.ko and watchdogd to reboot a machine when it hangs.  It works
>> pretty well; I'm also working on a various WITNESS/INVARIANTS stuff
>> and I'm trying to report it to gnats, but obviously it would be
>> much nicer if the system would panic and leave some debuggable core
>> after a hang (so far I don't have any, so I can only guess). I've
>> read about software watchdog in kernel and I don't quite
>> understand: it's said that kernel software watchdog is able to
>> panic when a deadlock occurs. Can this be achieved with ichwd ?
>> Another one: as far as I understand ichwd reboots my machine on a
>> hardware level, right ? So am I right saying that software watchdog
>> can be, in theory, also deadlocked, thus, being kinda less reliable
>> solution ?
> I just want to /metoo that I have 32bit/i386 box running zfs, pf and
> -current that is hardlocking randomly (usually has an uptime for a few
> days to a couple weeks).   SW_WATCHDOG won't fire when it locks so it
> must be locking pretty fast.
>
> I just noticed that ichwd will load on this box, so I'll try that
> instead, but now I'm wondering if the SW_WATCHDOG kernel will
> interfere or rather if watchdogd is smart enough to handle both?

watchdog(4) will arm all watchdogs.

watchdogd uses watchdog(4) so yes, both watchdogs (SW_WATCHDOG & ichwd) 
should be armed.
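
For reference, a minimal sketch of the setup being discussed, assuming a
stock FreeBSD layout. The variable names come from the rc.d script and the
ichwd(4)/watchdog(4) man pages; the 16 second timeout is only an
illustrative value, not something recommended in this thread.

```shell
# /boot/loader.conf: load the Intel ICH hardware watchdog driver at boot
ichwd_load="YES"

# /etc/rc.conf: start watchdogd, which pats the dog through watchdog(4)
# and therefore arms every registered watchdog (ichwd and SW_WATCHDOG)
watchdogd_enable="YES"
# Illustrative timeout in seconds (assumption); see watchdogd(8) for -t
watchdogd_flags="-t 16"

# The software watchdog itself has to be compiled into the kernel, e.g.
# in the kernel config file:
#   options SW_WATCHDOG
```

With both in place a single watchdogd instance should be enough; there is
no need to run one daemon per watchdog.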

>
> This box used to occasionally panic on the ZFS stack panic so I did
> the KSTACK_PAGES=4 change to the kernel and now it just hardlocks.
> I'm not saying they are related.

Interesting.  What is the default for KSTACK_PAGES?

Btw, from all I've heard, less than 4GB of RAM + ZFS == you're gonna have a
bad time.

There are supposedly some ways to make it somewhat reliable by disabling
certain features, but I don't know the tricks offhand.

-Alfred


