Date:      Wed, 17 Apr 2013 11:50:52 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Alexander Motin <mav@FreeBSD.org>
Cc:        "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>, Jim Harris <jim.harris@gmail.com>
Subject:   Re: Synchronizing TSC
Message-ID:  <20130417085052.GZ2930@kib.kiev.ua>
In-Reply-To: <516E4537.7050205@FreeBSD.org>
References:  <516DCAF7.20400@FreeBSD.org> <CAJP=Hc8fzwZBCp-9K8dHP1HrMh7r62FHm%2BSKWooBf_yUGS6mFQ@mail.gmail.com> <516E4537.7050205@FreeBSD.org>

On Wed, Apr 17, 2013 at 09:46:15AM +0300, Alexander Motin wrote:
> On 17.04.2013 03:25, Jim Harris wrote:
> >
> > On Tue, Apr 16, 2013 at 3:04 PM, Alexander Motin <mav@freebsd.org> wrote:
> >
> >     Hi.
> >
> >     Recently I've got 6-core/12-thread system on Sandy Bridge-E Core
> >     i7-3930K CPU and was unpleasantly surprised to see that TSCs are not
> >     synchronized there. While all 11 APs were synchronized, BSP was far
> >     behind them. Since it is single-socket system, I don't know any good
> >     reason for such behavior except some BIOS bug. But I've recalled
> >     that somewhere was some discussions about possible TSC
> >     synchronization. I've implemented patch below that allows to adjust
> >     TSC values of BSPs to AP's one on boot using CPU MSRs, hoping that
> >     they should not diverge after that:
> >     http://people.freebsd.org/~mav/tsc_adj2.patch
> >
> >     I don't know very much about all different TSC hardware to predict
> >     when it is safe to enable the functionality, but at least on my
> >     system being enabled via loader tunable it seems working well.
> >
> >     Comments?
> >
> >
> > You may be remembering this thread on r238755 last year:
> >
> > http://lists.freebsd.org/pipermail/svn-src-head/2012-July/038992.html
> >
> > This was a bug fix in the TSC synchronization test code though, not
> > anything for trying to adjust out-of-sync TSCs.
>
> I remember that thread, but I think I've seen somebody told somewhere
> that it could be interesting to implement some MI mechanism. Never mind.
>
> > The Intel SDM (volume 3, section 17.13 of March 2013 revision) says
> > earlier models can only write to lower 32 bits of
> > IA32_TIME_STAMP_COUNTER, but these models also should not have invariant
> > TSC so they would never even get to your new routine.  So your patch
> > seems OK for Intel CPUs, at least as a tunable that is disabled by
> > default.
>
> Thanks.
>
> > My only concern would be why TSC on the BSP started out-of-sync on your
> > system.  Theoretically, BIOS could adjust TSCs in SMM to try to hide SMI
> > code execution from the OS, which could then make them out-of-sync
> > again.  Not sure if that's what's happening here, but might be worth a
> > test putting the TSC test code on a periodic timer to see if they ever
> > get out of sync again.
>
> I did one more interesting observation: on every reboot drift between
> BSP and APs is growing proportionally to the previous system power-on
> time. On first boot it is -3878361036 (just above one second), after
> reboot some minutes later it is -1123454492776 (about 6 minutes), after
> another reboot it is -1853033521804 (about 10 minutes).
>
> Unless my adjustment code would be active, I would guess that AP's TSC
> is running linearly while BSP's for some reason reset to zero on every
> reboot. But since I am synchronizing them on each boot, the only
> possibility for it I see is that there is some other timer(s) /
> counter(s) not affected by MSR writes that ticks linearly and reloading
> AP's TSC, but for some reason not reloading BSP's.
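
For reference, the tick counts quoted above can be turned into wall-clock
time with a small sketch like the one below. It assumes the TSC ticks at
the i7-3930K's nominal 3.2 GHz, which the thread does not state, and
ticks_to_seconds is an illustrative name, not something from the patch.

```python
# Assumed TSC frequency: the i7-3930K's nominal 3.2 GHz clock.
# The real value on a given system may differ.
ASSUMED_TSC_HZ = 3.2e9


def ticks_to_seconds(ticks: int, tsc_hz: float = ASSUMED_TSC_HZ) -> float:
    """Convert a TSC offset in ticks to seconds, ignoring its sign."""
    return abs(ticks) / tsc_hz


# BSP-vs-AP offsets reported in the thread for three consecutive boots.
reported_offsets = [-3878361036, -1123454492776, -1853033521804]

for offset in reported_offsets:
    secs = ticks_to_seconds(offset)
    print(f"{offset:>15d} ticks = {secs:8.2f} s = {secs / 60:5.2f} min")
```

Under that assumption the three offsets come out at roughly one second,
six minutes and ten minutes, which matches the estimates given in the
quoted text.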

To me it sounds like a BIOS bug, indeed. Could you check the contents
of IA32_TSC_ADJUST on all cores (I believe it is present on E5)?
Also, using TSC_ADJUST to correct the skew seems to be preferable,
according to the Intel docs.
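
A rough model of the TSC_ADJUST arithmetic, as I read the Intel SDM:
IA32_TSC_ADJUST is MSR 0x3B, its presence is reported by
CPUID.(EAX=7,ECX=0):EBX bit 1, and adding X to it also adds X to that
logical processor's TSC. The sketch below only does the arithmetic in
user space. The helper name and the starting value of 0 are assumptions
for illustration; a kernel would read and write the MSR on each CPU
instead.

```python
MSR_IA32_TSC_ADJUST = 0x3B  # MSR address per the Intel SDM


def tsc_adjust_for_skew(current_adjust: int, skew: int) -> int:
    """Return the TSC_ADJUST value that would cancel `skew`.

    `skew` is (this core's TSC - reference TSC) in ticks, so a negative
    value means this core is behind the reference.
    """
    return current_adjust - skew


# First-boot offset from the thread: the BSP is behind the APs.
bsp_skew = -3878361036

# Hypothetical: assume the BSP's TSC_ADJUST currently reads 0.
new_adjust = tsc_adjust_for_skew(0, bsp_skew)
print(f"BSP: write {new_adjust} to MSR {MSR_IA32_TSC_ADJUST:#x}")
```

A side benefit of this route is that a non-zero TSC_ADJUST on only some
cores would itself be a visible trace of whoever rewrote the TSC.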

Why do you use cpuid in the assembly sequence? As I understand it, you
want to ensure that there is a serialization point, but why do you need it?

The common knowledge is that on CPUs with invariant TSC, the TSC
counter is a single instance located in the uncore. For single-socket
configurations, your patch would be fine. But on multi-socket
machines each package has its own counter, and the counters might drift.
As a result, the initial synchronization would still allow an eventual
de-sync, and this is problematic.
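
To give a feel for the scale of the multi-socket concern, here is a small
sketch of how quickly two per-package counters would diverge after a
perfect boot-time sync. Both the 3.2 GHz tick rate and the 1 ppm
frequency difference are made-up illustrative numbers, not measurements
from any machine, and skew_ticks is a hypothetical helper.

```python
def skew_ticks(tsc_hz: float, ppm_diff: float, seconds: float) -> float:
    """Ticks of skew accumulated between two counters whose rates
    differ by `ppm_diff` parts per million, `seconds` after a sync."""
    return tsc_hz * seconds * ppm_diff / 1e6


TSC_HZ = 3.2e9   # assumed nominal TSC rate
PPM_DIFF = 1.0   # assumed rate mismatch between the two packages

for label, secs in (("1 second", 1), ("1 hour", 3600), ("1 day", 86400)):
    ticks = skew_ticks(TSC_HZ, PPM_DIFF, secs)
    print(f"after {label:>8}: {ticks:,.0f} ticks of skew")
```

With these assumed numbers the counters are thousands of ticks apart
after one second, so a one-time adjustment at boot would not be enough
on such hardware.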
