Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 27 Jun 2018 12:49:17 -0600
From:      Ian Lepore <ian@freebsd.org>
To:        "Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net>, Alan Somers <asomers@freebsd.org>
Cc:        Jung-uk Kim <jkim@freebsd.org>, Andriy Gapon <avg@freebsd.org>, FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: TSC calibration in virtual machines
Message-ID:  <1530125357.24573.101.camel@freebsd.org>
In-Reply-To: <201806271705.w5RH5WZ7009269@pdx.rh.CN85.dnsmgr.net>
References:  <201806271705.w5RH5WZ7009269@pdx.rh.CN85.dnsmgr.net>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 2018-06-27 at 10:05 -0700, Rodney W. Grimes wrote:
> > 
> > On Wed, Jun 27, 2018 at 10:36 AM, Jung-uk Kim <jkim@freebsd.org>
> > wrote:
> > 
> > > 
> > > On 06/27/2018 03:14, Andriy Gapon wrote:
> > > > 
> > > > 
> > > > It seems that TSC calibration in virtual machines sometimes can
> > > > do more
> > > harm
> > > > 
> > > > than good.  Should we default to trusting the information
> > > > provided by a
> > > hypervisor?
> > > > 
> > > > 
> > > > Specifically, I am observing a problem on GCE instances where
> > > > calibrated
> > > TSC
> > > > 
> > > > frequency is about 10% lower than advertised frequency.  And
> > > > apparently
> > > the
> > > > 
> > > > advertised frequency is the right one.
> > > > 
> > > > I found this thread with similar reports and a variety of
> > > > workarounds
> > > from
> > > > 
> > > > administratively disabling the calibration to switching to a
> > > > different
> > > timecounter:
> > > > 
> > > > https://lists.freebsd.org/pipermail/freebsd-cloud/2017-
> > > January/000080.html
> > > 
> > > We already do that for VMware hosts since r221214.
> > > 
> > > https://svnweb.freebsd.org/changeset/base/221214
> > > 
> > > We should do the same for each hypervisor.
> > > 
> > > Jung-uk Kim
> > > 
> > > 
> > We probably should.  But why does calibration fail in the first
> > place?  If
> > it can fail in a VM, then it can probably fail on bare metal
> > too.  It would
> > be worth investigating.
> No, the failure in a VM is unique to a VM, it has to do with the fact
> your have the hypervisor timeslicing a CPU that you believe to be
> 100%
> dedicated to you.
> 
> There are several white papers, including one from VMWare about what
> they have done to help with the time keeping problems.
> 
> What is suggested above would be a correct thing to do.
> Bhyve creates these issues as well, and use of certain timers
> in a bhyve guest can cause you nightmares with ntp.

Iirc, bhyve's arithmetic when doing timer emulation leads to roundoff
errors that accumulate to effectively make the emulated timer run off-
frequency. The hpet timer was trivial to fix by just redefining it to
run at a power-of-2 frequency to eliminate rounding errors. The other
timers have to run at fixed frequencies, so better arithmetic will be
the way to fix them. I vaguely remember that being harder to do than to
say because of the way the code is currently structured, which is why I
just did the easy fix to the hpet so that people would have at least
one usable timer that didn't give ntpd fits in guest OSes.

-- Ian



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1530125357.24573.101.camel>