Date:      Tue, 8 Apr 2003 17:05:47 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc:        Dag-Erling Smorgrav <des@ofug.org>
Subject:   Re: cvs commit: src/sys/conf options.i386 src/sys/i386/i386 tsc.c src/sys/i386/conf NOTES 
Message-ID:  <20030408163515.M7458@gamplex.bde.org>
In-Reply-To: <20030407192803.C3990@gamplex.bde.org>
References:  <4145.1049705887@critter.freebsd.dk> <20030407192803.C3990@gamplex.bde.org>

On Mon, 7 Apr 2003, Bruce Evans wrote:

> On Mon, 7 Apr 2003, Poul-Henning Kamp wrote:
>
> > In message <20030407163148.L3478@gamplex.bde.org>, Bruce Evans writes:
> > >The old calibration code calibrates them relative to another clock so
> > >the only error in their relative frequencies is from the different time
> > >that it takes to read their counters.  The i8254 counter typically takes
> > >5 usec longer to read, but for some reason the actual error is less than
> > >1 i8254 cycle on all of my active systems.
> >
> > That is because the i8254 access is synchronized to the "virtual" ISA
> > bus frequency and your CPU is much faster than that.
>
> That must be the reason for the 0-cycle differences but not for the
> accuracy of the old algorithm.  It was essentially:
>
> 	read RTC using rtcin(); wait for it to change
> 	(1) read i8254 using getit()
> 	(2) read TSC using rdtsc()
> 	read RTC using rtcin(); wait for it to change again
> 	(1a) read i8254 using getit()
> 	(2a) read TSC using rdtsc()
> 	i8254 freq = (1a) - (1)
> 	TSC freq = (2a) - (2)
>
> This is better than I remembered.  The only algorithmic problem with
> it is that the cache state is different for some of the reads, in
> particular the last 2.  The reads of the TSC are delayed by however
> long it takes to read the i8254.  This time is almost constant, so the
> difference is almost independent of it on fast enough CPUs.  Warming
> up the cache might make it completely independent.
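
As a rough illustration of why the two samples have to be taken
identically, here is a minimal user-space simulation of the quoted
sequence.  It is a sketch only: struct sim, sim_getit(), sim_rdtsc(),
sim_wait_rtc_edge() and calibrate_tsc() are hypothetical stand-ins, not
the kernel's getit(), rdtsc() or rtcin(), and the 7500-cycle read cost
used in the note below is an assumed figure (about 5 usec at 1.5 GHz).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simulated machine; simulated time is kept in TSC cycles. */
struct sim {
	uint64_t now_cycles;	/* current simulated TSC value */
	uint64_t tsc_hz;	/* true TSC frequency, cycles per RTC second */
	uint64_t getit_cycles;	/* assumed cost of one i8254 read, in cycles */
};

/* Stand-in for rdtsc(): reading the TSC is treated as free. */
static uint64_t
sim_rdtsc(const struct sim *s)
{
	return (s->now_cycles);
}

/* Stand-in for getit(): only its cost matters here, not its value. */
static void
sim_getit(struct sim *s)
{
	s->now_cycles += s->getit_cycles;
}

/* Stand-in for polling rtcin() until the seconds value changes. */
static void
sim_wait_rtc_edge(struct sim *s)
{
	assert(s->tsc_hz > 0);
	s->now_cycles += s->tsc_hz - (s->now_cycles % s->tsc_hz);
}

/* One sample: wait for an RTC edge, read the i8254, then read the TSC. */
static uint64_t
sim_sample(struct sim *s, int extra_getits)
{
	sim_wait_rtc_edge(s);
	for (int i = 0; i < extra_getits; i++)
		sim_getit(s);		/* models an asymmetric code path */
	sim_getit(s);			/* step (1) / (1a) */
	return (sim_rdtsc(s));		/* step (2) / (2a) */
}

/* TSC cycles counted across one RTC second. */
uint64_t
calibrate_tsc(struct sim *s, int extra_getits_in_second_sample)
{
	uint64_t first = sim_sample(s, 0);
	uint64_t second = sim_sample(s, extra_getits_in_second_sample);

	return (second - first);
}
```

With tsc_hz = 1500000000 and getit_cycles = 7500, calibrate_tsc(&s, 0)
returns exactly 1500000000, while one extra i8254 read in the second
sample, calibrate_tsc(&s, 1), returns 1500007500.  The constant read
cost cancels only when both samples pay it equally.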

I found the bug that caused the 5 usec jitter on my Athlon.  The algorithm
wasn't quite as above.  Steps (2) and (2a) weren't quite symmetrical, so
there was an extra getit() sometimes.  getit() takes about 5 usec...
Fixing this and also removing the inb(0x84)'s from rtcin() (which shouldn't
matter) gave the following output:

%%%
Calibrating clock(s) ... TSC clock: 1532754558 Hz, i8254 clock: 1193128 Hz
Press a key on the console to abort clock calibration
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193129 Hz
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz
CLK_USE_I8254_CALIBRATION not specified - using default frequency
Timecounter "i8254"  frequency 1193182 Hz
CLK_USE_TSC_CALIBRATION not specified - using old calibration method
TSC clock: 1532823404 Hz
TSC clock: 1532823869 Hz
raw: 230135339339 230136881932 230136881943 231671247940 231671247951 248533852639
TSC clock: 1532823253 Hz
TSC clock: 1532823868 Hz
raw: 248535096508 248536639084 248536639095 250071004924 250071004935 266933609453
TSC clock: 1532823236 Hz
TSC clock: 1532823868 Hz
raw: 266934826763 266936369356 266936369367 268470735196 268470735207 285333339725
TSC clock: 1532823236 Hz
TSC clock: 1532823868 Hz
raw: 285334458755 285336001348 285336001359 286870367188 286870367199 303732971717
TSC clock: 1532823236 Hz
TSC clock: 1532823868 Hz
raw: 303734137955 303735680548 303735680559 305270046388 305270046399 322132650917
TSC clock: 1532823236 Hz
TSC clock: 1532823868 Hz
raw: 322133801363 322135343956 322135343967 323669709796 323669709807 340532314325
Timecounter "TSC"  frequency 1532823868 Hz
%%%

There are now apparently only 2 sources of jitter for the
"Calibrating clocks" part:
- non-determinism reading the i8254.  Causes a max jitter of 1 i8254 cycle.
- cache not warm.  Causes the first part to take a whole 162 cycles longer.

Calibrating clock(s) ... TSC clock: 1532754558 Hz, i8254 clock: 1193128 Hz
Press a key on the console to abort clock calibration
Calibrating clock(s) ... TSC clock: 1532754720 Hz, i8254 clock: 1193128 Hz

The jitter in the "old" calibration part probably has similar causes.
Counting TSC cycles for the 10+-second DELAY() gives a result that is
consistently about 0.4 usec/second = 4 usec/10 seconds larger than for
the 1+-second DELAY().  4 usec is the magic number for the overhead of
a single getit().  The cache apparently takes 2 passes instead of 1
to warm up for the 1-second delay.  After that the cycle counts are
perfectly stable.  The cycle counts are also very stable across boots.
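
The 0.4 usec/second figure can be checked with a little integer
arithmetic.  The sketch below assumes that the two stable values in the
log, 1532823236 and 1532823868, are the 1+-second and 10+-second results
respectively (the log does not label them); excess_nsec() is a
hypothetical helper, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert the gap between two frequency estimates (cycles per second)
 * into nanoseconds of apparent extra time accumulated over `seconds`
 * seconds, using the larger estimate as the cycle rate.
 */
uint64_t
excess_nsec(uint64_t hz_long, uint64_t hz_short, uint64_t seconds)
{
	assert(hz_long >= hz_short && hz_long > 0);
	return ((hz_long - hz_short) * seconds * 1000000000ULL / hz_long);
}
```

Under that assumption excess_nsec(1532823868, 1532823236, 1) is 412 and
the same call with seconds = 10 is 4123, which matches about 0.4
usec/second and about 4 usec over 10 seconds.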

> Changing this to
>
> 	(2) read TSC
> 	DELAY(1000000)
> 	(2a) read TSC
> 	TSC freq = (2a) - (2)
>
> gave 2 new sources of errors although it fixes the RTC source: any
> error in calibration of DELAY(), plus non-determinism from the loop in
> DELAY().  The latter may be precisely 0 in much the same cases that
> the difference in the delays in the old algorithm is precisely 0.
>
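
To make the first of those error sources concrete: if DELAY() counts
i8254 ticks using an assumed frequency that differs from the real one,
the TSC estimate is scaled by the same ratio.  The sketch below is only
an illustration under that assumption; delay_scaled_tsc_hz() is a
hypothetical name, and the inputs in the note are borrowed from the log
in this message as plausible magnitudes, not as a claim about what the
kernel computed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * TSC cycles counted across a nominal 1-second delay that is timed by
 * counting assumed_i8254_hz ticks of a counter really running at
 * actual_i8254_hz.  The product fits in 64 bits for a GHz-range TSC
 * and a MHz-range i8254.
 */
uint64_t
delay_scaled_tsc_hz(uint64_t true_tsc_hz, uint64_t assumed_i8254_hz,
    uint64_t actual_i8254_hz)
{
	assert(actual_i8254_hz > 0);
	return (true_tsc_hz * assumed_i8254_hz / actual_i8254_hz);
}
```

For example, delay_scaled_tsc_hz(1532754720, 1193182, 1193128) returns
1532824091, about 45 parts per million high.
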
> > The best result I have had so far, and the only one I have sufficient
> > faith in to advocate its use in general, takes an entirely different
> > route:
> >
> > The RTC interrupts us at 128Hz for statclock, divide this in software
> > to get 1Hz and take timestamps and feed them to the NTP kernel-FLL code
> > and tell NTPD to lock to that at a high stratum.
> >
> > This will synchronize the clock to the RTC frequency.

This can also be used to calibrate the clocks again after booting.  For
the TSC, we can get more precision by waiting for more than 1 tick,
and the waiting would be interrupt-driven instead of busy-waiting so
it wouldn't slow down the boot.  Unfortunately we would also get jitter
from interrupt latency, especially in -current (see phk's graph).
ntp can filter the jitter but I think it would take many hundreds of
seconds to get near the apparent precision of unbroken busy-waiting
(1 part in <TSC freq>!?).  Of course, clock drift is likely to
invalidate that much precision.
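
A minimal sketch of that idea, assuming a 128 Hz RTC interrupt as quoted
above: count interrupts down to 1 Hz, timestamp the TSC on each whole
second, and divide the total cycle count by the number of seconds.
struct tsc_cal, tsc_cal_tick() and tsc_cal_hz() are hypothetical names
for illustration; a real kernel version would hook the statclock
interrupt and would still need NTP-style filtering for latency jitter.

```c
#include <assert.h>
#include <stdint.h>

#define RTC_HZ	128	/* statclock interrupt rate quoted in the thread */

struct tsc_cal {
	uint32_t ticks;		/* interrupts seen in the current second */
	uint32_t seconds;	/* whole seconds since the first timestamp */
	uint64_t first_tsc;	/* TSC value at the first interrupt */
	uint64_t last_tsc;	/* TSC value at the latest whole second */
	int started;		/* nonzero once first_tsc is valid */
};

/* Call once per RTC interrupt with the TSC value read in the handler. */
void
tsc_cal_tick(struct tsc_cal *c, uint64_t tsc_now)
{
	if (!c->started) {
		c->first_tsc = c->last_tsc = tsc_now;
		c->started = 1;
		return;
	}
	if (++c->ticks < RTC_HZ)
		return;			/* software divide: 128 Hz -> 1 Hz */
	c->ticks = 0;
	c->seconds++;
	c->last_tsc = tsc_now;
}

/* Average TSC frequency over all whole seconds observed so far. */
uint64_t
tsc_cal_hz(const struct tsc_cal *c)
{
	assert(c->seconds > 0);
	return ((c->last_tsc - c->first_tsc) / c->seconds);
}
```

Any latency error in the end timestamps is divided by the number of
seconds, so a timestamp taken 5000 cycles late costs 500 Hz over 10
seconds but only 50 Hz over 100 seconds.  That is the sense in which
waiting longer buys precision.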

Bruce


