From owner-freebsd-hackers Sun Feb 4 06:52:31 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3) id GAA06037 for hackers-outgoing; Sun, 4 Feb 1996 06:52:31 -0800 (PST)
Received: from hauki.clinet.fi (root@hauki.clinet.fi [194.100.0.1]) by freefall.freebsd.org (8.7.3/8.7.3) with ESMTP id GAA05939 for ; Sun, 4 Feb 1996 06:50:08 -0800 (PST)
Received: from newzetor.clinet.fi (root@newzetor.clinet.fi [194.100.0.11]) by hauki.clinet.fi (8.7.3/8.6.4) with ESMTP id QAA19257; Sun, 4 Feb 1996 16:48:43 +0200 (EET)
Received: (hsu@localhost) by newzetor.clinet.fi (8.7.3/8.6.4) id QAA23064; Sun, 4 Feb 1996 16:48:44 +0200 (EET)
Date: Sun, 4 Feb 1996 16:48:44 +0200 (EET)
Message-Id: <199602041448.QAA23064@newzetor.clinet.fi>
From: Heikki Suonsivu
To: joerg_wunsch@uriah.heep.sax.de (Joerg Wunsch)
Cc: freebsd-hackers@freebsd.org
In-reply-to: J Wunsch's message of 2 Feb 1996 23:43:36 +0200
Subject: Re: Watchdog timers
Organization: Clinet Ltd, Espoo, Finland
References: <199602021020.LAA04930@uriah.heep.sax.de>
Sender: owner-hackers@freebsd.org
Precedence: bulk

In article <199602021020.LAA04930@uriah.heep.sax.de> J Wunsch writes:

   Idea stolen from Linux: create a /dev/watchdog for this purpose.
   Once it is held open by a process, the kernel resets the CPU if it
   doesn't get a response on a device after a certain time.

   The idea behind this is that most of the hanging systems have still
   a running async portion of the kernel, i.e. things like interrupt
   handling continue to work, but the process context switching hangs
   for some reason (e.g. SCSI bus hangs etc.). The chances are good
   that the kernel could still kill itself.

Me too; this would save us a lot of trouble. We are trying to use
FreeBSD on big servers, and it seems hopeless: the heavier the load,
the more often it locks up (daily for the most heavily loaded server)!

-- 
Heikki Suonsivu, Täysikuu 10 C 83/02210 Espoo/FINLAND, hsu@clinet.fi
work +358-0-4375209 fax -4555276 home -8031121
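
For readers following the thread, the mechanism J Wunsch describes might be sketched roughly as below. This is only a minimal userland model of the timing logic, not kernel code and not the Linux driver: the struct and the wd_* function names are hypothetical, time is counted in abstract ticks, and disarming on close is an assumption the original message does not state. The point is that the expiry check needs nothing but the clock interrupt, so it can still fire when process scheduling has hung.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical watchdog state; every name here is illustrative only. */
struct watchdog {
    bool     armed;     /* true while a process holds the device open */
    uint64_t timeout;   /* ticks allowed between two responses */
    uint64_t deadline;  /* tick at which a reset would be triggered */
};

/* A process opens the device: arm the timer. */
void wd_open(struct watchdog *wd, uint64_t now, uint64_t timeout)
{
    wd->armed = true;
    wd->timeout = timeout;
    wd->deadline = now + timeout;
}

/* The process responds (for example by writing to the device):
 * push the deadline forward by one full timeout. */
void wd_pet(struct watchdog *wd, uint64_t now)
{
    if (wd->armed)
        wd->deadline = now + wd->timeout;
}

/* The process closes the device. Disarming here is an assumption. */
void wd_close(struct watchdog *wd)
{
    wd->armed = false;
}

/* Called from the periodic clock interrupt, i.e. the asynchronous part
 * of the kernel that tends to survive a hang. Returns true when the
 * deadline has passed and the machine should be reset. */
bool wd_tick(const struct watchdog *wd, uint64_t now)
{
    return wd->armed && now >= wd->deadline;
}
```

In this model a userland daemon would call wd_open() once and wd_pet() periodically; if context switching stops, the daemon stops responding, and the next wd_tick() past the deadline reports that a reset is due.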