Date:      Fri, 17 Mar 2017 18:05:07 +0100
From:      "O. Hartmann" <ohartmann@walstatt.org>
To:        Cy Schubert <Cy.Schubert@komquats.com>
Cc:        "O. Hartmann" <ohartmann@walstatt.org>, freebsd-current <freebsd-current@freebsd.org>, Don Lewis <truckman@FreeBSD.org>
Subject:   Re: ntpd dies nightly on a server with jails
Message-ID:  <20170317180507.5c64fb26@thor.intern.walstatt.dynvpn.de>
In-Reply-To: <201703152012.v2FKCbvg078762@slippy.cwsent.com>
References:  <20170315071724.78bb0bdc@freyja.zeit4.iv.bundesimmobilien.de> <201703152012.v2FKCbvg078762@slippy.cwsent.com>

On Wed, 15 Mar 2017 13:12:37 -0700
Cy Schubert <Cy.Schubert@komquats.com> wrote:

> Hi O.Hartmann,
>
> I'll try to answer as much as I can in the noon hour I have left.
>
> In message <20170315071724.78bb0bdc@freyja.zeit4.iv.bundesimmobilien.de>,
> "O. Hartmann" writes:
> > Running a host with several jails on recent CURRENT (12.0-CURRENT #8 r315187:
> > Sun Mar 12 11:22:38 CET 2017 amd64) gives me trouble on a daily basis.
> >
> > The box is an older two-socket Fujitsu server equipped with two four-core
> > Intel(R) Xeon(R) CPU L5420  @ 2.50GHz.
> >
> > The box has several jails; the jails do NOT run the ntpd service. Each jail
> > has its dedicated loopback, lo1 through lo5 (for the moment), with dedicated
> > IPs: 127.0.1.1 - 127.0.5.1 (if this matters; I believe not).
> >
> > The host itself has two main NICs, Broadcom based. bcm0 is dedicated to the
> > host, bcm1 is shared amongst the jails: each jail has an IP bound to bcm1 via
> > which the jails communicate with the network.
> >
> > I try to capture log information via syslog, but FreeBSD's ntpd seems to be
> > very, very sparse with such information, converging to nothing. I can't see
> > anything suitable in the logs explaining why ntpd dies almost every night,
> > leaving the system with a wild reset of time. Sometimes it is a gain of
> > 6 hours, sometimes it is only half an hour. I usually leave the box at 16:00
> > local time and take care of it again at ~7 o'clock in the morning local time.
>
> We will need to turn on debugging. Unfortunately debug code is not compiled
> into the binary. We have two options. You can either update
> src/usr.sbin/ntp/config.h to enable DEBUG or build the port (it's the exact
> same ntp) with the DEBUG option -- this is probably simpler. Then enable
> debug with -d and -D. -D increases verbosity. I just committed a debug
> option to both ntp ports to assist here.
>
> Next question: Do you see any indication of a core dump? I'd be interested
> in looking at it if possible.
>
> >
> > When the clock is floating that wildly, ntpd is in all cases no longer
> > running. I try to restart with options -g and -G to adjust the time quickly
> > at the beginning, which works fine.
>
> This is disconcerting. If your clock is floating wildly without ntpd
> running there are other issues that might be at play here. At most the
> clock might drift a little, maybe a minute or two a day but not by a lot.
> Does the drift cause your clocks to run fast or slow?
>
> >
> > Apart from possible misconfigurations of the jails (I'm quite new to jails
> > and their pitfalls), I was wondering what causes ntpd to die. I can't
> > determine the exact time of its death, so it might be related to
> > diurnal/periodic processes (I use only the most vanilla configuration of
> > periodic, except for having the ZFS scrub check enabled).
>
> As I'm a little rushed for time, I didn't catch whether the jails
> themselves were also running ntpd... just thought I'd ask. I don't see how
> zfs scrubbing or any other periodic scripts could cause this.
>
> >
> > I haven't had the chance to check whether the hardware is completely all
> > right, but from a superficial point of view there is no issue with high gain
> > of the internal clock or other hardware issues.
>
> It's probably a good idea to check. I don't think that would cause ntpd any
> gas. I've seen RTC battery messages on my gear which haven't caused ntpd
> any problem. I have two machines which complain about RTC battery being
> dead, where in fact I have replaced the batteries and the messages still
> are displayed at boot. I'm not sure if it's possible for a kernel to damage
> the RTC. In my case that doesn't cause ntpd any problems. It's probably
> good to check anyway.
>
> >
> > If there are known issues with jails (the problem has occurred since I
> > started using them), advice is appreciated.
>
> Not that I know of.
>

Just some strange news:

I left the server running the whole day with ntpd disabled and did not observe the RTC
gaining even one second, even while stressing the machine.

But soon after restarting ntpd, I immediately noticed an offset of 30 minutes! This morning,
the discrepancy was almost 5 hours - it looked more like a weird adjustment to another
time base than UTC.

Over the weekend I'll leave the server with ntpd disabled and only the RTC running. I have the
strange feeling that something is intentionally readjusting the ntpd time due to a
misconfiguration or a rogue ntp server in the X.CC.pool.ntp.org
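
One way to test the rogue-server suspicion would be to ask each configured
server for its time directly and compare the answers against the local clock.
The following is only a minimal sketch in Python, not something taken from
this thread: it sends a plain SNTP client request over UDP port 123 and reads
the transmit timestamp from the reply. The function names are made up for
illustration, the server names have to be supplied on the command line, and
the offset is rough because it assumes a symmetric network delay.

```python
import socket
import struct
import sys
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_request():
    """48-byte SNTP client request: LI=0, VN=3, Mode=3 gives first byte 0x1b."""
    return b"\x1b" + 47 * b"\0"


def transmit_time(packet):
    """Return the server's transmit timestamp as Unix time in seconds."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    # The transmit timestamp occupies bytes 40..47: 32-bit seconds, 32-bit fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def rough_offset(packet, t_send, t_recv):
    """Server time minus local time, taking the midpoint of send/receive as 'now'."""
    return transmit_time(packet) - (t_send + t_recv) / 2.0


def query_offset(server, timeout=2.0):
    """Query one server and return its rough offset to the local clock in seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t_send = time.time()
        sock.sendto(build_request(), (server, 123))
        packet, _ = sock.recvfrom(512)
        t_recv = time.time()
    return rough_offset(packet, t_send, t_recv)


if __name__ == "__main__":
    # Server names come from the command line; none are assumed here.
    for server in sys.argv[1:]:
        try:
            print(f"{server}: {query_offset(server):+.3f} s")
        except (OSError, ValueError) as exc:
            print(f"{server}: no usable answer ({exc})")
```

Run with the servers from ntp.conf as arguments. A server whose offset differs
from the others by minutes or hours would be a candidate for the suspected
rogue; if all servers agree with each other but not with the local clock, the
problem is more likely on the host itself.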

-- 
O. Hartmann

I object to the use or transmission of my data for advertising purposes
or for market or opinion research (§ 28 Abs. 4 BDSG).


Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20170317180507.5c64fb26>