Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 18 Oct 2005 15:04:54 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Andre Oppermann <andre@FreeBSD.org>
Cc:        cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/usr.bin/vmstat vmstat.c src/usr.bin/w w.c
Message-ID:  <20051018135821.L93164@delplex.bde.org>
In-Reply-To: <200510171537.j9HFbMdW073713@repoman.freebsd.org>
References:  <200510171537.j9HFbMdW073713@repoman.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, 17 Oct 2005, Andre Oppermann wrote:

> andre       2005-10-17 15:37:22 UTC
>
>  FreeBSD src repository
>
>  Modified files:
>    usr.bin/vmstat       vmstat.c
>    usr.bin/w            w.c
>  Log:
>  Obtain true uptime through clock_gettime(CLOCK_MONOTONIC, struct *timespec)
>  instead of subtracting 'bootime' from 'now'.

This is bogus, and it breaks vmstat some more in the dead kernel case.

In the live kernel case, clock_gettime() returns the time since an
unspecified point in the past.  It is still necessary to subtract the
boottime, one measured by the same clock, especially under systems
like FreeBSD where the "unspecified point in the past" is undocumented.

In the dead kernel case, clock_gettime() cannot be used, and using
it gives a wrong uptime (that of the running system, not that of the
dead kernel).

The old code is wrong too.  It subtracts the boot time from the current
time, where the current time is for the live kernel and the boot time
is for the live or dead kernel.  This gives an even wronger uptime
for the dead kernel case, since a live kernel may be run longer after
the boot time of a dead kernel than after its own boot time.

I don't know of any good way to determine the uptime of a dead kernel
now.  I only know of a wrong way: kern_shutdown.c has a low quality
function print_uptime() which does user-interface things that don't
belong in the kernel.  It prints the uptime to the console.  To ensure
its low quality, it does this after the dump has completed, so that
its output doesn't get printed in the dump via the message buffer.
Thus its output can only be seen if you are watching the shutdown in
some way, and is soon forgotten unless your watcher has logging features.

The death time for a dead kernel should be saved in a variable near its
boottime variable so that utilities like vmstat can determine it easily.

For live kernels, subtracting the boot time from the current _real_
time using difftime() is the correct method.  Both times are relative
to the same clock, so their difference gives the elasped time unless
there are bugs in this clock.  The clock should be the real time clock
and not the monotonic clock, since uptimes are in real time.  In
practice there should only be a difference of a leap second or two and
no one would be able to see it.  However, there are bugs.  For the
real time clock, the boot time at least used to be bogusly adjusted
to compensate for adjustements to the real time by settime(), mainly
so that subtracting the boot time from the current time gives a value
near the uptime.  This is correct if the adjustment is to compensate
for an initial error in the real time as is normal if the system clock
is on local time (here local time is 36000 seconds in advance of UTC,
so both the real time and the boot time start up 36000 too high; then
the settime() in adjkerntz sets them both back 36000; if the boottime
were not adjusted then the uptime shown by vmstat would be off by 36000
seconds).  This is wrong if the adjustment is to compensate for clock
drift.  Then the uptime really should change to adjust for the error
in the real time, but instead the boot time is messed up to keep the
uptime the same.  Using the monotonic clock for the uptime gives
essentially the same bug -- in general, the real time clock and the
monotonic clock will drift at different rates, and since only the
real time clock can be synced with an external clock, the inaccuracy
can only be bounded for the real time clock.

There are also some style bugs in vmstat.c in the changes and old ones
nearby:

% Index: vmstat.c
% ===================================================================
% RCS file: /home/ncvs/src/usr.bin/vmstat/vmstat.c,v
% retrieving revision 1.90
% retrieving revision 1.91
% diff -u -2 -r1.90 -r1.91
% --- vmstat.c	6 Aug 2005 13:56:21 -0000	1.90
% +++ vmstat.c	17 Oct 2005 15:37:22 -0000	1.91
% @@ -397,24 +397,12 @@
%  getuptime(void)
%  {
% -	static struct timeval boottime;
% -	static time_t now;
% +	struct timespec sp;
%  	time_t uptime;
% 
% -	if (boottime.tv_sec == 0) {
% -		if (kd != NULL) {
% -			kread(X_BOOTTIME, &boottime, sizeof(boottime));

These leaves dead code in the nlist.

% -		} else {
% -			size_t size;
% -
% -			size = sizeof(boottime);
% -			mysysctl("kern.boottime", &boottime, &size, NULL, 0);
% -			if (size != sizeof(boottime))
% -				errx(1, "kern.boottime size mismatch");

There was error checking for the sysctl.

% -		}
% -	}
% -	(void)time(&now);
% -	uptime = now - boottime.tv_sec;
% +	(void)clock_gettime(CLOCK_MONOTONIC, &sp);

There was/is no error checking for the time calls.

% +	uptime = sp.tv_sec;
%  	if (uptime <= 0 || uptime > 60*60*24*365*10)
%  		errx(1, "time makes no sense; namelist must be wrong");

This check is now bogus.  Dead kernels are no longer used to determine the
uptime; in particular the namelist is not used.  The only possibility of
an error is clock_gettime() but errors from it were voided.

% +

Extra blank line.

%  	return(uptime);

No space before return expression.

%  }

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20051018135821.L93164>