Subject: Re: can the hardware watchdog reboot a hung kernel?
From: Ian Lepore
To: Daniel Braniss
Cc: freebsd-hackers
Date: Thu, 14 Nov 2019 09:02:32 -0700

On Thu, 2019-11-14 at 17:35 +0200, Daniel Braniss wrote:
> 
> On 14 Nov 2019, at 17:28, Eugene Grosbein wrote:
> > 
> > 14.11.2019 21:52, Daniel Braniss wrote:
> > 
> > > hi,
> > > I have several hundred Nano-pi NEO running, and sometimes they
> > > hang, since there is no console
> > > available, the only solution is to do a power cycle - not so easy
> > > since they are distributed in three buildings :-)
> > > 
> > > I am looking at the watchdog stuff, but it seems that what I want
> > > is not supported, i.e.
> > > reboot the kernel when hung
> > > 
> > > wishful thinking?
> > 
> > It's possible if the hardware has such a watchdog and kernel
> > subsystem watchdog(4) supports it.
> > rc.conf(5) manual page describes watchdogd_enable option.
> > 
> 
> yes, but it relies on user land, what if the kernel is hung?
> 

It relies on the userland daemon to issue the ioctl() calls to pet the dog.
If the kernel is hung, then userland code isn't going to run either, so
the watchdog petting won't happen, and eventually the hardware reboots.

We use this at $work specifically to reboot if the kernel hangs, using
this config:

  watchdogd_enable=YES
  watchdogd_flags="-s 16 -t 64 -x 64"

That says the daemon should pet the dog every 16 seconds, and the
hardware is programmed to reboot if 64 seconds elapse without petting.
In addition, when watchdogd is shut down normally (like during a normal
system reboot) it doesn't disable the watchdog hardware; it sets the
timeout to 64s to protect against any kind of hang during the reboot.
The -t and -x times can be different; 64s just happens to work well for
us in both cases.

-- Ian
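
To make the timing in that config concrete, here is a small Python
sketch of the arithmetic only. It is not how watchdogd is implemented.
It assumes the daemon pets at exactly t = 0, 16, 32, ... seconds, that
a pet due at the very instant of the hang still goes through, and that
the hardware timeout is exactly 64 seconds (any rounding the driver or
hardware may apply is ignored). The function names are made up for
illustration.

```python
def reboot_time(hang_at, pet_interval=16, timeout=64):
    """Time (seconds after the daemon starts) at which the hardware
    watchdog fires, if the kernel hangs at `hang_at`.

    Assumes idealized petting at t = 0, pet_interval, 2*pet_interval, ...
    """
    # The last pet that still happens is the latest multiple of
    # pet_interval that is not after the hang.
    last_pet = (hang_at // pet_interval) * pet_interval
    # The hardware counts `timeout` seconds from that last pet.
    return last_pet + timeout


def delay_after_hang(hang_at, pet_interval=16, timeout=64):
    """Seconds between the hang and the forced reboot."""
    return reboot_time(hang_at, pet_interval, timeout) - hang_at


if __name__ == "__main__":
    for hang_at in (0, 15, 16, 100):
        print(hang_at, reboot_time(hang_at), delay_after_hang(hang_at))
```

Under these assumptions a hung box comes back somewhere between
timeout - pet_interval and timeout seconds after the hang (roughly 48
to 64 seconds with -s 16 -t 64), depending on how recently the dog was
petted when the kernel stopped.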