From: Rob
Date: Wed, 29 Dec 2004 19:06:45 +0900
To: freebsd-questions@freebsd.org
Subject: Re: 5.3 in diskless cluster: irregular reboots at 14:09 hr. ?!?!

Colin J. Raven wrote:
> On Dec 29, Rob launched this into the bitstream:
>
>>
>> I'm running 5.3-Stable on all PC's.
>>
>> I have a master/router with 7 diskless slaves. One of the
>> slaves shows irregular reboots, without a trace, not even
>> a shutdown message in the logs.
>>
>> Until now I have the following sudden reboots of one particular
>> slave happen:
>>   Nov. 16  14:09:41
>>   Nov. 30  14:09:23
>>   Dec. 28  14:09:34
>>
>> Each is exactly at the same time; this is rather peculiar, isn't it?
>>
>> Any idea what's going on here, or how to trace this problem?
>
>
> What *else* is happening at (or immediately before) 14:09 on this
> machine??
> For example, is something rather intense occurring immediately
> beforehand? I'm thinking power supply failure when it gets loaded
> beyond a certain point...so, pursuant to that is there maybe a big log
> grep happening beforehand, or some other event that stresses components,
> thus consuming more power?

Thank you Colin.
What would be a good command to run to find out how heavily loaded the PC
is right before the reboot? Is 'top' good enough? Or is there something
better, 'ps auxw' for example?

Since I don't know on what date it will happen next, I will start a
cron job each day at 14:08 to check how loaded the PC is. It will write
the output of the job to disk.

> It has that funny; "I'll bet the PSU is on the way out" feeling to it,
> but actually proving that can be tedious.

I may also swap the UPS between two slaves and see if the reboots are
related to a shaky UPS. I don't want to replace the PSU yet :(.

Rob.
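
A minimal sketch of the daily 14:08 snapshot job described above, assuming a
Python 3 interpreter is available on the slave. The script name, the log
path, and the choice of 'uptime' alongside 'ps auxw' are illustrative
assumptions, not something from the thread. The file is flushed and fsynced
after each run because a sudden reboot could otherwise discard output that
is still buffered.

```python
#!/usr/bin/env python3
"""Append a timestamped snapshot of system load to a log file."""
import os
import subprocess
import sys
import time

# Commands to record; 'ps auxw' is the one mentioned in the message,
# 'uptime' (load averages) is an added assumption.
COMMANDS = [
    ["uptime"],
    ["ps", "auxw"],
]


def snapshot(commands, log_path):
    """Run each command, append its output to log_path, then fsync."""
    with open(log_path, "a") as log:
        log.write("=== %s ===\n" % time.strftime("%Y-%m-%d %H:%M:%S"))
        for cmd in commands:
            log.write("$ %s\n" % " ".join(cmd))
            try:
                result = subprocess.run(
                    cmd,
                    stdout=subprocess.PIPE,
                    stderr=subprocess.STDOUT,
                    universal_newlines=True,
                    timeout=20,
                )
                log.write(result.stdout)
            except (OSError, subprocess.TimeoutExpired) as exc:
                # A missing or hung command should not stop the snapshot.
                log.write("(failed: %s)\n" % exc)
        # Push the data out of userland and kernel buffers before returning.
        log.flush()
        os.fsync(log.fileno())


# Only runs when a log path is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    snapshot(COMMANDS, sys.argv[1])
```

A crontab entry along the lines of
`8 14 * * * /usr/local/bin/python3 /root/load_snapshot.py /var/log/load-snapshot.log`
would run it at 14:08 every day; both paths are hypothetical. On a diskless
slave the log would presumably live on an NFS-mounted filesystem, so the
snapshot only survives the reboot if it reaches the server in time.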