From: Ian Lepore
To: "Rodney W. Grimes", Alan Somers
Cc: Jung-uk Kim, Andriy Gapon, FreeBSD Current
Subject: Re: TSC calibration in virtual machines
Date: Wed, 27 Jun 2018 12:49:17 -0600
Message-ID: <1530125357.24573.101.camel@freebsd.org>
In-Reply-To: <201806271705.w5RH5WZ7009269@pdx.rh.CN85.dnsmgr.net>

On Wed, 2018-06-27 at 10:05 -0700, Rodney W. Grimes wrote:
> > On Wed, Jun 27, 2018 at 10:36 AM, Jung-uk Kim wrote:
> > > On 06/27/2018 03:14, Andriy Gapon wrote:
> > > > It seems that TSC calibration in virtual machines sometimes can
> > > > do more harm than good.  Should we default to trusting the
> > > > information provided by a hypervisor?
> > > >
> > > > Specifically, I am observing a problem on GCE instances where
> > > > calibrated TSC frequency is about 10% lower than advertised
> > > > frequency.  And apparently the advertised frequency is the
> > > > right one.
> > > >
> > > > I found this thread with similar reports and a variety of
> > > > workarounds from administratively disabling the calibration to
> > > > switching to a different timecounter:
> > > > https://lists.freebsd.org/pipermail/freebsd-cloud/2017-January/000080.html
> > >
> > > We already do that for VMware hosts since r221214.
> > >
> > > https://svnweb.freebsd.org/changeset/base/221214
> > >
> > > We should do the same for each hypervisor.
> > >
> > > Jung-uk Kim
> >
> > We probably should.  But why does calibration fail in the first
> > place?  If it can fail in a VM, then it can probably fail on bare
> > metal too.  It would be worth investigating.
>
> No, the failure in a VM is unique to a VM, it has to do with the fact
> you have the hypervisor timeslicing a CPU that you believe to be 100%
> dedicated to you.
>
> There are several white papers, including one from VMware, about what
> they have done to help with the time keeping problems.
>
> What is suggested above would be a correct thing to do.
> Bhyve creates these issues as well, and use of certain timers
> in a bhyve guest can cause you nightmares with ntp.

IIRC, bhyve's arithmetic when doing timer emulation leads to roundoff
errors that accumulate to effectively make the emulated timer run
off-frequency.  The hpet timer was trivial to fix by just redefining it
to run at a power-of-2 frequency to eliminate rounding errors.  The
other timers have to run at fixed frequencies, so better arithmetic
will be the way to fix them.  I vaguely remember that being harder to
do than to say because of the way the code is currently structured,
which is why I just did the easy fix to the hpet so that people would
have at least one usable timer that didn't give ntpd fits in guest
OSes.

-- Ian
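
The roundoff effect described in the message can be illustrated with a
small numeric sketch.  This is not bhyve's code.  It assumes, purely for
illustration, that the emulation stores the duration of one timer tick
as a truncated 32.32 fixed-point value and derives the tick count from
it.  The function names and the two example frequencies (10 MHz and
2^24 Hz) are hypothetical choices, not values taken from the thread.

```python
# Illustrative only: assumes a 32.32 fixed-point time representation,
# where one second is 2**32 units.  Names and frequencies are hypothetical.
SBT_1S = 1 << 32


def ticks_per_second(freq_hz):
    """Ticks reported after one second when the per-tick period is truncated."""
    period = SBT_1S // freq_hz   # fixed-point length of one tick, rounded down
    return SBT_1S // period      # ticks that fit into one real second


def drift_ppm(freq_hz):
    """Apparent frequency error of the emulated timer, in parts per million."""
    effective = ticks_per_second(freq_hz)
    return (effective - freq_hz) / freq_hz * 1e6


if __name__ == "__main__":
    for freq in (10_000_000, 1 << 24):
        print(freq, ticks_per_second(freq), round(drift_ppm(freq), 1))
```

Under these assumptions the 10 MHz timer appears to run more than
1000 ppm fast, while the power-of-2 frequency divides 2^32 exactly and
shows no drift at all.  ntpd can only correct frequency errors up to
500 ppm, so an error of that size would be enough to upset it.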