From owner-freebsd-arm@FreeBSD.ORG Wed Dec 10 15:23:07 2014
Subject: Re: RPi - watchdogd not working anymore (since r273154+)
From: Ian Lepore
To: Andreas Schwarz
In-Reply-To: <457ba92c65e.30c60c9a@mail.schwarzes.net>
References: <457b82448a7.53997a24@mail.schwarzes.net>
 <1418185467.1064.166.camel@revolution.hippie.lan>
 <457ba92c65e.30c60c9a@mail.schwarzes.net>
Date: Wed, 10 Dec 2014 08:22:58 -0700
Message-ID: <1418224978.1064.182.camel@revolution.hippie.lan>
Cc:
 freebsd-arm@FreeBSD.org

On Wed, 2014-12-10 at 06:29 +0100, Andreas Schwarz wrote:
> On 09.12.14, Ian Lepore wrote:
>
> Hi Ian,
>
> > watchdogd requests a timeout of approximately 128 seconds.  It used to
> > pet the dog once per second, and recently it was changed to only do so
> > once every 10 seconds for efficiency.
> >
> > The rpi watchdog hardware is unable to set a timeout longer than 15
> > seconds, but a bug in the driver let the value wrap around then get
> > bitmasked such that the request for a 128 second timeout was actually
> > getting handled as a 9 second timeout.  The 9 second thing worked when
> > we were petting the dog once a second, but now fails at once every 10
> > seconds.
> >
> > I've got a fix for the wrapping problem, but all it will do is make the
> > timer not get set at all if you ask for 128 seconds (that's what the
> > interface for watchdogs requires: don't set the timer if it can't be at
> > least as long as requested).
> >
> > To fix your problem you'll need to set watchdogd_flags="-s 4 -t 8" in
> > your rc.conf.  That will make the timeout 8 seconds and pet the dog
> > every 4 seconds.  You don't have too many options for the timeout value
> > (-t) because of the goofy way the timeout is represented in the kernel.
> > The only choices that work on rpi are 1, 2, 4, 8 seconds.  If you ask for 9
> > it gets represented as a value that translates to 17.5 seconds and rpi
> > can't do it.
>
> Thank you for your copious explanation. I understand the problem and
> was able to run the watchdog again.
> In general, it's a little bit unsatisfying
> that we have (limited) watchdog hardware which does not fit the requirements
> of the freebsd watchdog implementation (which normally should cover all
> the watchdog hardware).
>
> best regards,
> Andreas
>

The hardware is what it is, and the rpi isn't the only modern arm system
with a 16 second max timeout.  I think the freebsd watchdog interface
reflects its age... a timeout of 128 or even 256 seconds is probably
reasonable for some big server where you want to be extra-sure it's a
genuine lockup because a reboot is pretty drastic.  But if your
smartphone locks up, do you really want to wait 5 minutes for it to
recover?  Timeouts at the low end seem more important these days.

It does kind of suck, though, that the hardware can do 1-15 seconds and
we can only hit 4 datapoints in the bottom half of that range.  I'm
pondering ways we could extend the current interface in the kernel to do
better, without breaking existing drivers.

On the other hand, what we've been getting by accident for a couple
years is a 9-second timeout with 1-second petting, so -s 4 -t 8 is
actually a bit better than that. :)

-- Ian
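
The arithmetic discussed in this thread can be sketched roughly as follows. This is only an illustration in Python, not the driver or watchdogd source. It assumes the requested timeout is rounded up to the next power of two in nanoseconds, and that the hardware counter is 20 bits wide with the low 16 bits holding fractions of a second. Neither detail is stated in the thread, and the function and constant names are made up for the example. Under those assumptions the numbers come out the same way as described above: a 128 second request wraps to 9 seconds in the old behaviour, and only 1, 2, 4 and 8 seconds fit once the wrap is rejected.

```python
NS_PER_SEC = 1_000_000_000

# Assumptions for illustration only (not taken from the thread):
# a 20-bit counter whose low 16 bits are fractions of a second.
FRACTION_BITS = 16
COUNTER_MASK = 0xFFFFF


def pow2ns_exponent(seconds):
    """Smallest n such that 2**n nanoseconds covers the requested seconds."""
    return (seconds * NS_PER_SEC - 1).bit_length()


def whole_seconds(exponent):
    """Whole seconds contained in 2**exponent nanoseconds."""
    return (1 << exponent) // NS_PER_SEC


def buggy_ticks(exponent):
    """Old behaviour as described: shift, then mask, so large values wrap."""
    return (whole_seconds(exponent) << FRACTION_BITS) & COUNTER_MASK


def fixed_ticks(exponent):
    """Fixed behaviour as described: refuse what the counter cannot hold."""
    ticks = whole_seconds(exponent) << FRACTION_BITS
    return ticks if ticks <= COUNTER_MASK else None


if __name__ == "__main__":
    for requested in (1, 2, 4, 8, 9, 128):
        n = pow2ns_exponent(requested)
        actual = (1 << n) / NS_PER_SEC
        old = buggy_ticks(n) >> FRACTION_BITS
        new = fixed_ticks(n)
        status = "not set" if new is None else f"{new >> FRACTION_BITS} s"
        print(f"-t {requested}: 2^{n} ns = {actual:.2f} s, old {old} s, fixed {status}")
```

With this model, a request for 9 seconds is rounded up to 2^34 ns, a little over 17 seconds, which is beyond what the counter can hold, so the fixed path leaves the timer unset.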