Date:      Mon, 26 Aug 2002 10:16:00 +0100
From:      Matthew Seaman <m.seaman@infracaninophile.co.uk>
To:        Robert Covell <rcovell@rolet.com>
Cc:        freebsd-questions@FreeBSD.ORG
Subject:   Re: Incorrect Uptime
Message-ID:  <20020826091600.GA3238@happy-idiot-talk.infracaninophi>
In-Reply-To: <003701c24c85$ee9fb300$6401a8c0@kc.rr.com>
References:  <003701c24c85$ee9fb300$6401a8c0@kc.rr.com>

On Sun, Aug 25, 2002 at 05:22:38PM -0500, Robert Covell wrote:

> We have a mail server running FreeBSD 4.1.1-RELEASE.  Every so
> often, say once a month, uptime says the server is running at 100%
> for 1, 5, and 15 displays.  But when I go into top the system is
> 100% idle (or very close to it).  The only way I have found to fix
> it is to reboot the box.  I have found out that if uptime returns
> 1.00 for the one minute it means the cpu is at 0% utilization.  If
> it says 1.01 it is at 1% utilization.  Anyone have an idea of why
> this would be happening?  We use uptime to monitor the performance
> on the server, and cannot determine why this would be happening when
> it is really not at 100%.

The 1, 5 and 15 minute load averages aren't the same thing as CPU
utilization.  The load averages are a measure of the number of
processes sitting in the queue requesting a time slice on the CPU,
averaged over the last 1, 5 and 15 minutes.  A reading of 1.00
therefore means that, on average, one process was in that queue; it
is not a percentage.  On an unloaded system, where there are plenty
of spare CPU cycles, a process will get a time slice almost
immediately so the load average won't be affected much.
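
To make the "averaged" part concrete, here is a rough sketch of how
such a figure can be produced.  It assumes the scheme I understand the
classic BSD kernels to use: sample the queue length every 5 seconds
and fold it into three exponentially decaying averages.  The names
(update_load, simulate) and the exact constants are illustrative
assumptions, not kernel code.

```python
import math

# Assumed sampling period and averaging windows, in seconds.
SAMPLE_INTERVAL = 5.0
WINDOWS = (60.0, 300.0, 900.0)  # 1, 5 and 15 minutes


def update_load(loads, queued):
    """Fold one sample of the queue length into the three averages."""
    updated = []
    for load, window in zip(loads, WINDOWS):
        decay = math.exp(-SAMPLE_INTERVAL / window)
        # Old value fades by `decay`; the new sample fills the rest.
        updated.append(load * decay + queued * (1.0 - decay))
    return tuple(updated)


def simulate(queued, seconds, loads=(0.0, 0.0, 0.0)):
    """Hold the queue length constant for `seconds` and return the averages."""
    steps = int(seconds / SAMPLE_INTERVAL)
    for _ in range(steps):
        loads = update_load(loads, queued)
    return loads


if __name__ == "__main__":
    # One process that never leaves the queue, observed for an hour.
    one, five, fifteen = simulate(queued=1, seconds=3600)
    print(f"{one:.2f} {five:.2f} {fifteen:.2f}")
```

With a single process permanently counted in the queue, all three
figures settle toward 1.00 regardless of how little CPU time is
actually being used, which is the pattern described in the question.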

Now, the CPU utilization and the load averages usually correlate
pretty well, but it is possible for the load average to increase
without the CPU usage going up.  This can indicate that the kernel is
so busy dealing with some other matter that it hasn't got round to
dealing out time slices to processes very promptly.  Usually that
means some higher-priority interrupt triggered by hardware.
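
Since you are already monitoring with uptime, a check along these
lines might catch the mismatch automatically.  It is only a sketch:
the sample line is made up, the function names and the two thresholds
are my own assumptions, and the idle percentage has to come from
somewhere else (top, for instance).

```python
import re

# Matches "load averages: 1.00, 1.01, 0.98" (BSD) and
# "load average: ..." (the singular form used elsewhere).
LOAD_RE = re.compile(
    r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)"
)


def parse_load_averages(uptime_line):
    """Return the (1, 5, 15) minute load averages from one uptime line."""
    match = LOAD_RE.search(uptime_line)
    if match is None:
        raise ValueError("no load averages found in: %r" % uptime_line)
    return tuple(float(group) for group in match.groups())


def load_disagrees_with_cpu(one_minute_load, cpu_idle_percent,
                            load_threshold=0.9, idle_threshold=90.0):
    """True when the load is high although the CPU is almost idle.

    Both thresholds are arbitrary illustrative defaults.
    """
    return (one_minute_load >= load_threshold
            and cpu_idle_percent >= idle_threshold)


if __name__ == "__main__":
    # Hypothetical uptime output, not taken from the machine in question.
    sample = "10:16AM  up 30 days,  2:03, 1 user, load averages: 1.00, 1.00, 1.00"
    one_minute, _, _ = parse_load_averages(sample)
    if load_disagrees_with_cpu(one_minute, cpu_idle_percent=99.5):
        print("load average and CPU usage disagree; check the hardware")
```

A hit from this check doesn't tell you what is wrong, only that the
load figure is no longer tracking CPU usage and the machine deserves
a closer look.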

This can be an indication of failing hardware: the kernel is
desperately trying to get a response out of a piece of equipment that
has gone a bit catatonic.  It can be down to something as trivial as a
broken wire in a network cable, or as bad as an impending failure of
your main hard drive.

Make sure your backups are comprehensive and up to date.  Survey the
system log files for other evidence of problems --- the kernel will
usually log something when it hits one.  Run healthd or xmbmon
or the like to monitor motherboard and CPU temperatures ---
overheating is one of the most common causes of things going horribly
wrong.  Schedule some down time to perform preventive maintenance like
cleaning dust out of the fans and so forth.

	Cheers,

	Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.                       26 The Paddocks
                                                      Savill Way
                                                      Marlow
Tel: +44 1628 476614                                  Bucks., SL7 1TH UK
