Date:      Wed, 18 Jan 2017 19:56:23 -0600
From:      Leif Pedersen <bilbo@hobbiton.org>
To:        Bruce Walker <bruce.walker@gmail.com>
Cc:        freebsd-cloud@freebsd.org
Subject:   Re: GCE: significant clock drift - the solution
Message-ID:  <CAK-wPOiUAn7qw0SQdDsm8Uwknm2_TphG3fejUWbRaj6pzy8xaA@mail.gmail.com>
In-Reply-To: <CAJUU0CcAbxijE--e3iM0bVbAT9D0gbu7kERkX%2BR2dGBjgsoN-w@mail.gmail.com>
References:  <CAJUU0CcAbxijE--e3iM0bVbAT9D0gbu7kERkX%2BR2dGBjgsoN-w@mail.gmail.com>

Just for others to learn from my head-pounding, heh:

On FreeBSD 10.3, I've been using kern.timecounter.hardware=ACPI-safe.

I set it with sysctl after booting, then recorded it in sysctl.conf. No
loader tweaking was necessary. Beyond speculating from experience, I don't
know which setting is best or why. I arrived at this one by trying all the
options until one kept the clock stable for me, after other suggestions
didn't work consistently.
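
A minimal sketch of the "set it, then record it" step described above.
set_timecounter_conf is a hypothetical helper name, not a FreeBSD tool; it
rewrites a sysctl.conf-style file so that it carries exactly one
kern.timecounter.hardware line. The sysctl commands in the trailing comments
are what one would presumably run as root on the guest.

```shell
#!/bin/sh
# Hypothetical helper: record a timecounter choice in a sysctl.conf-style file.
# $1 = timecounter name, $2 = config file (defaults to /etc/sysctl.conf)
set_timecounter_conf() {
    name=$1
    conf=${2:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    # Drop any earlier setting so the file ends up with a single entry.
    if [ -f "$conf" ]; then
        grep -v '^kern\.timecounter\.hardware=' "$conf" > "$tmp" || true
    fi
    printf 'kern.timecounter.hardware=%s\n' "$name" >> "$tmp"
    # Write back through the original file so its permissions are kept.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# On the FreeBSD guest, as root (left as comments so the sketch is safe to source):
#   sysctl kern.timecounter.choice               # see which timecounters exist
#   sysctl kern.timecounter.hardware=ACPI-safe   # switch at runtime
#   set_timecounter_conf ACPI-safe               # keep the choice across reboots
```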

I forgot which, but one option actually seemed to work for days, then made
my VMs appear to hang because the virtual clock stopped ticking completely.
Upon rebooting the clock seemed fine for a few more days. It was
repeatable, and changing the setting decidedly stopped the problem.

The takeaway I'm getting at is, picking the wrong thing can (or at least
did) make for very weird symptoms. So it's something to suspect if you find
that your VMs appear "unstable" in mysterious ways. I can imagine that
things get pretty weird if the clock abruptly stops, leaving only I/O events
to drive the kernel.

Again, I'm not *advising* anyone to use my setting. But if you find you
have seemingly unrelated stability problems or are just into jolly bizarre
experiments, there's something to see here by selecting different hardware
clocks. And it may vary by provider. I observed this on Google Compute
Engine.
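
For anyone who does want to run that experiment, a small sketch of how the
candidates might be enumerated. list_timecounters is a hypothetical helper;
it assumes input in the "name(quality)" form that
sysctl -n kern.timecounter.choice prints, and it skips entries with negative
quality. The sample line and its quality numbers are illustrative only, not
taken from a real GCE instance.

```shell
#!/bin/sh
# Hypothetical helper: read "name(quality) name(quality) ..." on stdin and
# print the names whose quality is non-negative, one per line.
list_timecounters() {
    # Split on spaces, then keep only tokens whose quality has no minus sign.
    tr ' ' '\n' | sed -n 's/^\([^(]*\)([0-9][0-9]*)$/\1/p'
}

# Illustrative input; on a FreeBSD guest use instead:
#   sysctl -n kern.timecounter.choice | list_timecounters
printf 'TSC-low(1000) ACPI-safe(850) i8254(0) dummy(-1000000)\n' | list_timecounters
```

Each name printed can then be tried in turn with
sysctl kern.timecounter.hardware=NAME while watching the clock for a few days.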

-Leif

On Jan 18, 2017 7:10 PM, "Bruce Walker" <bruce.walker@gmail.com> wrote:

Hi gang, first message. I am developing on Google Compute Engine using
the FreeBSD community image, and so far it's been pretty painless. Thank
you!

But I noticed that the system clock on one of my two instances was drifting
fast over time, and I mean really fast. I estimate about 1 second ahead for
every 20-30 seconds of elapsed time.

So I would stop and restart ntpd and that would reset the time. Then the
creep again ...
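
To put that estimate in the units clock tools usually report, here is a
quick sketch. drift_ppm is a hypothetical helper that converts "seconds
gained per elapsed seconds" into parts per million. For comparison, ntpd is
generally documented as slewing by no more than about 500 ppm, which would
explain why it could not hold this clock steady.

```shell
#!/bin/sh
# Hypothetical helper: clock error in parts per million.
# $1 = seconds gained, $2 = seconds of real time that elapsed
drift_ppm() {
    awk -v off="$1" -v el="$2" 'BEGIN { printf "%.0f\n", off / el * 1000000 }'
}

drift_ppm 1 20   # prints 50000
drift_ppm 1 30   # prints 33333
```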

Long story short: I searched the Google gce-discussion group and found
another FreeBSD user with the same issue, and a solution was provided by
one Andy Carrel. In a nutshell, two system boot files need tweaks:

/boot/loader.conf
    machdep.disable_tsc_calibration=1
    kern.timecounter.invariant_tsc=1

/etc/sysctl.conf
    kern.timecounter.hardware=TSC-low

Andy further explains: "The loader.conf changes instruct the kernel to not
try to do calibration of the TSC, since this calibration could give bad
results due to the virtualization, and to ignore the fact that the
'invariant TSC' feature is not advertised by the CPU and just assume the
CPU has this feature.

The sysctl.conf change instructs the kernel to use the TSC-low timecounter
which was made available by the loader.conf changes."
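
A small sketch for checking that both files actually carry the settings
listed above. missing_settings is a hypothetical helper, not part of
FreeBSD; it prints every wanted key=value line that a file lacks, treating
a quoted value such as "1" the same as 1.

```shell
#!/bin/sh
# Hypothetical helper: print each wanted "key=value" that is absent from a file.
# $1 = config file, remaining arguments = wanted settings
missing_settings() {
    file=$1; shift
    for want in "$@"; do
        # Trim surrounding whitespace and drop double quotes, then look for
        # an exact whole-line match (-x) of the fixed string (-F).
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//; s/"//g' "$file" 2>/dev/null |
            grep -xF -- "$want" >/dev/null || printf '%s\n' "$want"
    done
}

# On the FreeBSD guest:
#   missing_settings /boot/loader.conf \
#       machdep.disable_tsc_calibration=1 kern.timecounter.invariant_tsc=1
#   missing_settings /etc/sysctl.conf kern.timecounter.hardware=TSC-low
```

No output means every listed setting is present. After a reboot,
sysctl kern.timecounter.hardware should report which timecounter is in use.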

For reference, here's a link to the posting:
https://groups.google.com/forum/#!msg/gce-discussion/NKhl1QOVucQ/EDyLd_FxCAAJ


Maybe these can make it back into the community cloud image.

--
-bmw