Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 5 Feb 2002 09:21:31 -0800 (PST)
From:      John Polstra <jdp@polstra.com>
To:        hackers@freebsd.org
Cc:        phk@critter.freebsd.dk
Subject:   Re: A question about timecounters 
Message-ID:  <200202051721.g15HLVW03792@vashon.polstra.com>
In-Reply-To: <86051.1012909502@critter.freebsd.dk>
References:  <86051.1012909502@critter.freebsd.dk>

next in thread | previous in thread | raw e-mail | index | archive | help
In article <86051.1012909502@critter.freebsd.dk>,
Poul-Henning Kamp  <phk@critter.freebsd.dk> wrote:
> In message <200202050141.g151fxW02520@vashon.polstra.com>, John Polstra writes:
> >That's the global variable named "timecounter", right?  I did notice
> >one potential problem: that variable is not declared volatile.  So
> >in this part ...
> 
> This may be a problem, I have yet to see GCC make different code for
> that but I should probably have committed the "volatile" anyway.

It should be committed, but it is not causing the problem in this
case.  I changed it and then compared MD5s of the object files.  The
only changes that resulted were unimportant.

> >I also noticed this in tco_forward():
> >
> >        tco = timecounter;
> >        tc = sync_other_counter();
> >	[...]
> >        if (tco->tc_poll_pps)
> >
> >But sync_other_counter() loads its own copy of "timecounter",
> >and there's no guarantee it hasn't changed from the value that
> >tco_forward() saved in its local variable.  I'm not sure yet if
> >that's a potential problem.  It could corrected by passing "tco" as
> >an argument to sync_other_counter.  I'll try that too.
> 
> This code is actually correct, the tc_poll_pps needs to be done on
> the "old" timecounter, because that would be the reference for any
> captured hardware timestamps, if I did it on the new timecounter I
> might get negative deltas which would complicate things.  Also the
> new timecounter may have a changed frequency/offset (tickadj/ntpd
> and all that).

I don't think I follow your reasoning here.  If the call to
sync_other_counter were inlined, we'd have something like this:

    tco = timecounter;
    tco_in_sync_other_counter = timecounter;
    [...]
    if (tco->tc_poll_pps)

Obviously tco and tco_in_sync_other_counter will have the same value
almost all of the time, so the code can't be relying on them being
different.

Anyway, I realize now that this also isn't the problem, because
tco_forward is only ever called at splclock.  It can't be
interrupted or re-entered, at least not on the uniprocessor -stable
systems I'm looking at.

> There is one more failure mode which you have overlooked:  The individual
> timecounters maintain a binary counter of a certain width, if interrupt
> latency gets too bad, this may overflow.
> 
> This is a non-issue for the TSC, which is 64bit wide in hardware.

Many of the systems where I see this problem are using the TSC as
the timecounter.  They don't have APM in the kernel, and they aren't
running ntpd.  I.e., it's not only the i8254 that's the problem.  The
fastest of these systems is a 1.13 GHz PIII, and it would take the 32
bits of the TSC which are actually used 3.8 seconds to wrap around.

> Hope this helps...

Yep, thanks.  I have some ideas of other things to try.

John
-- 
  John Polstra
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200202051721.g15HLVW03792>