Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 20 Oct 2017 12:21:36 -0600
From:      Ian Lepore <ian@freebsd.org>
To:        Boris Samorodov <bsam@passap.ru>, freebsd-current@FreeBSD.org, freebsd-virtualization@FreeBSD.org
Subject:   Re: host, bhyve vm and ntpd
Message-ID:  <1508523696.1383.75.camel@freebsd.org>
In-Reply-To: <30992c14-7b78-ab9f-5693-931e6ca41f1b@passap.ru>
References:  <2931f1cc-6574-b58d-4b94-5f77fa5cdb85@passap.ru> <1508512327.1383.55.camel@freebsd.org> <39bf2426-2edf-d485-7c81-519e931154be@passap.ru> <bf8eae88-ee44-58d5-bb3a-565a0314fdfe@passap.ru> <1508517160.1383.63.camel@freebsd.org> <76ff7afb-3d3a-96f6-1275-89472ff5683d@passap.ru> <1508522667.1383.69.camel@freebsd.org> <30992c14-7b78-ab9f-5693-931e6ca41f1b@passap.ru>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, 2017-10-20 at 21:15 +0300, Boris Samorodov wrote:
> 20.10.2017 21:04, Ian Lepore ΠΙΫΕΤ:
> > 
> > On Fri, 2017-10-20 at 20:20 +0300, Boris Samorodov wrote:
> > > 
> > > (CC to freebsd-virtualization@)
> > > 
> > > 20.10.2017 19:32, Ian Lepore ΠΙΫΕΤ:
> > > > 
> > > > 
> > > > On Fri, 2017-10-20 at 18:36 +0300, Boris Samorodov wrote:
> > > > > 
> > > > > 
> > > > > 20.10.2017 18:31, Boris Samorodov ΠΙΫΕΤ:
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 20.10.2017 18:12, Ian Lepore ΠΙΫΕΤ:
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > On Fri, 2017-10-20 at 14:46 +0300, Boris Samorodov wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Hi All,
> > > > > > > > 
> > > > > > > > I have got a host:
> > > > > > > > ---
> > > > > > > > bhyve-host% uname -a
> > > > > > > > FreeBSD sm.bsnet 12.0-CURRENT FreeBSD 12.0-CURRENT #3 r322868: Fri Aug
> > > > > > > > 25 05:25:26 MSK 2017
> > > > > > > > bsam@builder.bsnet:/usr/obj/usr/src/sys/GENERIC-FASTššamd64 amd64
> > > > > > > > ---
> > > > > > > > 
> > > > > > > > And a bhyve vm:
> > > > > > > > ---
> > > > > > > > bhyve-vm: uname -a
> > > > > > > > FreeBSD builder.bsnet 12.0-CURRENT FreeBSD 12.0-CURRENT #58 r324782: Fri
> > > > > > > > Oct 20 05:12:17 MSK 2017
> > > > > > > > bsam@builder.bsnet:/usr/obj/usr/src/sys/PKG64Xššamd64 amd64
> > > > > > > > ---
> > > > > > > > 
> > > > > > > > The only difference at kernel configs is a colored console. :-)
> > > > > > > > 
> > > > > > > > And here I get some weird (is it?) result at the VM (I expect ntpd to be
> > > > > > > > more stable):
> > > > > > > > ---
> > > > > > > > bhyve-vm% for t in `jot 10`; do ntpq -pn; sleep 64; done
> > > > > > > > šššššremotešššššššššššrefidššššššst t when poll reachšššdelayšššoffset
> > > > > > > > jitter
> > > > > > > > ==============================================================================
> > > > > > > > šXX.XX.XX.1ššššššXX.XX.XX.245ššššš4 ušššš9ššš64šššš3šššš0.605ššš-1.202
> > > > > > > > 316.407
> > > > > > > > šXX.XX.XX.1ššššššXX.XX.XX.245ššššš4 ušššš7ššš64šššš7šššš0.605ššš-1.202
> > > > > > > > 358.395
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 ušššš5ššš64ššš17šššš0.615šš-328.42
> > > > > > > > 181.405
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 ušššš3ššš64ššš37šššš0.615šš-328.42
> > > > > > > > 214.868
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 uššš67ššš64ššš37šššš0.615šš-328.42
> > > > > > > > 214.868
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 uššš63ššš64ššš77šššš0.615šš-328.42
> > > > > > > > 268.618
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 uššš60ššš64šš177šššš0.615šš-328.42
> > > > > > > > 333.175
> > > > > > > > šXX.XX.XX.1šššššš.STEP.šššššššššš16 u 1910ššš64šššš0šššš0.000šššš0.000
> > > > > > > > 0.000
> > > > > > > > šXX.XX.XX.1ššššššXX.XX.XX.245ššššš4 uššš27ššš64šššš1šššš0.703šš-262.63
> > > > > > > > 0.004
> > > > > > > > šXX.XX.XX.1ššššššXX.XX.XX.245ššššš4 uššš31ššš64šššš1šššš0.649šš-331.43
> > > > > > > > 68.800
> > > > > > > > ---
> > > > > > > > 
> > > > > > > > At the same time host's results are very stable:
> > > > > > > > ---
> > > > > > > > bhyve-host% for t in `jot 10`; do ntpq -pn; sleep 64; done
> > > > > > > > šššššremotešššššššššššrefidššššššst t when poll reachšššdelayšššoffset
> > > > > > > > jitter
> > > > > > > > ==============================================================================
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 ušššš1ššš64šššš1šššš0.401šššš0.176
> > > > > > > > 0.106
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 ušššš6ššš64šššš3šššš0.401šššš0.176
> > > > > > > > 0.459
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 ušššš3ššš64šššš7šššš0.401šššš0.176
> > > > > > > > 0.940
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 uššš67ššš64šššš7šššš0.401šššš0.176
> > > > > > > > 0.940
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 uššš64ššš64ššš17šššš0.401šššš0.176
> > > > > > > > 1.566
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 uššš60ššš64ššš37šššš0.448šššš1.275
> > > > > > > > 1.739
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 uššš55ššš64ššš77šššš0.448šššš1.275
> > > > > > > > 2.365
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 uššš53ššš64šš177šššš0.448šššš1.275
> > > > > > > > 3.110
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 uššš50ššš64šš377šššš0.448šššš1.275
> > > > > > > > 3.929
> > > > > > > > *XX.XX.XX.1ššššššXX.XX.XX.245ššššš4 uššš45ššš64šš377šššš0.443šššš8.750
> > > > > > > > 4.722
> > > > > > > > ---
> > > > > > > > 
> > > > > > > > The network is organized via bridge -- host igb and vm tap interfaces
> > > > > > > > are members of one bridge.
> > > > > > > > 
> > > > > > > > Are those results expected? Does it smell like a bug? Should I dig
> > > > > > > > furter?
> > > > > > > > 
> > > > > > > So it is repeatedly stepping the clock in the VM? (Set
> > > > > > > kern.timecounter.stepwarnings=1 to log steps).
> > > > > > No kernel/ntpd messages for 20 minutes after setting this sysctl.
> > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > šThat is usually a sign
> > > > > > > that the chosen timecounter is running at a different frequency than it
> > > > > > > claimed to be when it registered itself -- the host may not be
> > > > > > > emulating the timer hardware properly in the guest. šWhat is the output
> > > > > > > of sysctl kern.timecounter in the vm?
> > > > > > ---
> > > > > > bhyve-vm% sysctl kern.timecounter
> > > > > > 
> > > > > > kern.timecounter.tsc_shift: 1
> > > > > > kern.timecounter.smp_tsc_adjust: 0
> > > > > > kern.timecounter.smp_tsc: 0
> > > > > > kern.timecounter.invariant_tsc: 1
> > > > > > kern.timecounter.fast_gettime: 1
> > > > > > kern.timecounter.tick: 1
> > > > > > kern.timecounter.choice: ACPI-fast(900) HPET(950) i8254(0) TSC-low(-100)
> > > > > > dummy(-1000000)
> > > > > > kern.timecounter.hardware: HPET
> > > > > > kern.timecounter.alloweddeviation: 5
> > > > > > kern.timecounter.stepwarnings: 1
> > > > > > kern.timecounter.tc.ACPI-fast.quality: 900
> > > > > > kern.timecounter.tc.ACPI-fast.frequency: 3579545
> > > > > > kern.timecounter.tc.ACPI-fast.counter: 4161213491
> > > > > > kern.timecounter.tc.ACPI-fast.mask: 4294967295
> > > > > > kern.timecounter.tc.HPET.quality: 950
> > > > > > kern.timecounter.tc.HPET.frequency: 10000000
> > > > > > kern.timecounter.tc.HPET.counter: 3518036865
> > > > > > kern.timecounter.tc.HPET.mask: 4294967295
> > > > > > kern.timecounter.tc.i8254.quality: 0
> > > > > > kern.timecounter.tc.i8254.frequency: 1193182
> > > > > > kern.timecounter.tc.i8254.counter: 47597
> > > > > > kern.timecounter.tc.i8254.mask: 65535
> > > > > > kern.timecounter.tc.TSC-low.quality: -100
> > > > > > kern.timecounter.tc.TSC-low.frequency: 1199886114
> > > > > > kern.timecounter.tc.TSC-low.counter: 1274338278
> > > > > > kern.timecounter.tc.TSC-low.mask: 4294967295
> > > > > > ---
> > > > > BTW, here is the host kern.timecounter:
> > > > > ---
> > > > > kern.timecounter.tsc_shift: 1
> > > > > kern.timecounter.smp_tsc_adjust: 0
> > > > > kern.timecounter.smp_tsc: 1
> > > > > kern.timecounter.invariant_tsc: 1
> > > > > kern.timecounter.fast_gettime: 1
> > > > > kern.timecounter.tick: 1
> > > > > kern.timecounter.choice: ACPI-fast(900) HPET(950) i8254(0) TSC-low(1000)
> > > > > dummy(-1000000)
> > > > > kern.timecounter.hardware: TSC-low
> > > > > kern.timecounter.alloweddeviation: 5
> > > > > kern.timecounter.stepwarnings: 0
> > > > > kern.timecounter.tc.ACPI-fast.quality: 900
> > > > > kern.timecounter.tc.ACPI-fast.frequency: 3579545
> > > > > kern.timecounter.tc.ACPI-fast.counter: 9047194
> > > > > kern.timecounter.tc.ACPI-fast.mask: 16777215
> > > > > kern.timecounter.tc.HPET.quality: 950
> > > > > kern.timecounter.tc.HPET.frequency: 14318180
> > > > > kern.timecounter.tc.HPET.counter: 2232552795
> > > > > kern.timecounter.tc.HPET.mask: 4294967295
> > > > > kern.timecounter.tc.i8254.quality: 0
> > > > > kern.timecounter.tc.i8254.frequency: 1193182
> > > > > kern.timecounter.tc.i8254.counter: 43410
> > > > > kern.timecounter.tc.i8254.mask: 65535
> > > > > kern.timecounter.tc.TSC-low.quality: 1000
> > > > > kern.timecounter.tc.TSC-low.frequency: 1200067168
> > > > > kern.timecounter.tc.TSC-low.counter: 2463146362
> > > > > kern.timecounter.tc.TSC-low.mask: 4294967295
> > > > > ---
> > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > Also, what is the output of ntptime(8) in the vm?
> > > > > > ---
> > > > > > bhyve-vm% ntptime
> > > > > > 
> > > > > > ntp_gettime() returns code 0 (OK)
> > > > > > š time dd94930f.20ea2900ššFri, Oct 20 2017 18:21:51.128, (.128573699),
> > > > > > š maximum error 1309110 us, estimated error 3 us, TAI offset 37
> > > > > > ntp_adjtime() returns code 0 (OK)
> > > > > > š modes 0x0 (),
> > > > > > š offset 0.000 us, frequency 0.000 ppm, interval 1 s,
> > > > > > š maximum error 1309110 us, estimated error 3 us,
> > > > > > š status 0x2001 (PLL,NANO),
> > > > > > š time constant 6, precision 0.001 us, tolerance 496 ppm,
> > > > > > ---
> > > > > > 
> > > > > > Ian, thank you for your help!
> > > > > > 
> > > > It seems odd to me that the frequency of the host HPET is 14.3mhz and
> > > > that of the guest is 10.0mhz, but maybe that's a normal condition for
> > > > bhyve. šI did find some google hits[1] for bhyve guest timekeeping
> > > > trouble with the HPET timer which was solved by switching to a
> > > > different timecounter. šTimecounter choices can't be controlled from
> > > > loader.conf, so I guess a sysctl.conf entry of
> > > > kern.timecounter.hardware="ACPI-fast" is the only way to fix that. šYou
> > > > can also just do that command interactively first and see if it stops
> > > > the time steps and ntp settles down.
> > > The process seems to become more monotonic. But steps nevertheless:
> > > ---
> > > *XX.1ššššššXX.245ššššš4 uššš20ššš64šššš1šššš0.717šš-12.771ššš4.193
> > > *XX.1ššššššXX.245ššššš4 uššš28ššš64šššš3šššš0.751šš-41.970šš32.342
> > > *XX.1ššššššXX.245ššššš4 uššš23ššš64šššš7šššš0.748šš-59.505šš46.624
> > > *XX.1ššššššXX.245ššššš4 uššš18ššš64ššš17šššš0.699šš-75.164šš56.482
> > > *XX.1ššššššXX.245ššššš4 uššš14ššš64ššš37šššš0.669šš-90.112šš63.767
> > > *XX.1ššššššXX.245ššššš4 uššš11ššš64ššš77šššš0.605šš-10.567šš60.914
> > > *XX.1ššššššXX.245ššššš4 ušššš7ššš64šš177šššš0.591šš-169.54 116.762
> > > *XX.1ššššššXX.245ššššš4 ušššš3ššš64šš377šššš0.591šš-169.54 102.107
> > > *XX.1ššššššXX.245ššššš4 uššš68ššš64šš377šššš0.591šš-169.54 102.107
> > > *XX.1ššššššXX.245ššššš4 uššš63ššš64šš377šššš0.591šš-169.54šš88.424
> > > *XX.1ššššššXX.245ššššš4 uššš58ššš64šš377šššš0.591šš-169.54šš92.949
> > > *XX.1ššššššXX.245ššššš4 uššš55ššš64šš377šššš0.591šš-169.54 111.512
> > > *XX.1ššššššXX.245ššššš4 uššš50ššš64šš377šššš0.591šš-169.54 140.827
> > > *XX.1ššššššXX.245ššššš4 uššš45ššš64šš377šššš0.591šš-169.54 177.360
> > > *XX.1ššššššXX.245ššššš4 uššš43ššš64šš377šššš0.591šš-169.54 219.057
> > > šXX.1šššššš.STEP.šššš16 ušš588ššš64šššš0šššš0.000šššš0.000ššš0.000
> > > ---
> > > 
> > > > 
> > > > 
> > > > This would be a workaround, not a fix per se. šIf the time steps go
> > > > away, then something in bhyve's emulation of HPET (maybe only on some
> > > > hardware?) must be buggy.
> > > > 
> > > > 
> > > > [1] https://lists.freebsd.org/pipermail/freebsd-virtualization/2015-April/003492.html
> > > Also thanks for the link. Unfortunately the problem seems to persist.
> > > 
> > Hmm, that almost looks normal... it noticed the clock was drifting
> > fast, so it went into a frequency-training cycle (at the point where
> > the offset stopped changing at -169.54) and then once it had figured
> > out the frequency it did a step to get realigned, and (presumably)
> > adjusted the frequency of the kernel clock. šThe output of ntptime
> > before and after the step would show that... before the step the
> > frequency would show as zero (which was the case in your ntptime output
> > earlier), and after the step it would be non-zero. šNo more steps would
> > occur after the first one, if that's what is happening here.
> Those were following steps:
> ---
> šXX.1ššššššXX.245ššššš4 uššš15ššš64šššš1šššš0.597šš-51.802ššš7.547
> šXX.1ššššššXX.245ššššš4 uššš17ššš64šššš1šššš0.597šš-51.802šš29.526
> šXX.1ššššššXX.245ššššš4 uššš19ššš64šššš1šššš0.597šš-51.802šš52.760
> šXX.1ššššššXX.245ššššš4 uššš19ššš64šššš3šššš0.597šš-51.802šš77.354
> šXX.1ššššššXX.245ššššš4 uššš16ššš64šššš7šššš0.597šš-51.802 103.182
> šXX.1ššššššXX.245ššššš4 uššš12ššš64ššš17šššš0.597šš-51.802 138.976
> *XX.1ššššššXX.245ššššš4 ušššš9ššš64ššš37šššš0.700šš-200.07 109.275
> šXX.1šššššš.STEP.šššš16 u 1100ššš64šššš0šššš0.000šššš0.000ššš0.002
> šXX.1ššššššXX.245ššššš4 uššš63ššš64šššš1šššš0.645ššš-3.620ššš0.002
> šXX.1ššššššXX.245ššššš4 uššš34ššš64šššš1šššš0.645ššš-3.620šš66.907
> šXX.1ššššššXX.245ššššš4 uššš36ššš64šššš1šššš0.643šš-115.30šš85.082---
> 
> No stability. :-) But I see your point. I'll leave it till morning
> with hope it may stabilize.
> 

I don't think it will. šntpd doesn't need more than one cycle to figure
out the frequency of a clock and adjust it (and record the amount in
ntpd.drift for future use). šEither the clock is drifting at more than
the max 500ppm rate, or it isn't really drifting steadily, it's still
jumping around in an uneven manner.

It might be interesting to try one of the lower-quality clocks such asš
i8254, or try TSC-low, although there was a comment in that mail thread
about how TSC might be a bad choice.

Beyond that, I'm not sure what else to try. šIt might be necessary to
get some bhyve developers involved (I know almost nothing about it).

-- Ian




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1508523696.1383.75.camel>