Date: Wed, 28 Oct 2015 18:47:55 +0200 From: Andriy Gapon <avg@FreeBSD.org> To: Konstantin Belousov <kostikbel@gmail.com>, Alexander Motin <mav@FreeBSD.org> Cc: freebsd-hackers <freebsd-hackers@FreeBSD.org>, Poul-Henning Kamp <phk@phk.freebsd.dk>, Jung-uk Kim <jkim@FreeBSD.org> Subject: Re: instability of timekeeping Message-ID: <5630FC3B.2070908@FreeBSD.org> In-Reply-To: <20151027140403.GB2257@kib.kiev.ua> References: <56261398.60102@FreeBSD.org> <56261FE6.90302@FreeBSD.org> <56274FFC.2000608@FreeBSD.org> <20151021184850.GX2257@kib.kiev.ua> <562F3E2F.2010100@FreeBSD.org> <20151027115810.GU2257@kib.kiev.ua> <562F8109.4050203@FreeBSD.org> <20151027140403.GB2257@kib.kiev.ua>
next in thread | previous in thread | raw e-mail | index | archive | help
On 27/10/2015 16:04, Konstantin Belousov wrote: > On Tue, Oct 27, 2015 at 03:50:01PM +0200, Andriy Gapon wrote: >> On 27/10/2015 13:58, Konstantin Belousov wrote: >>> On Tue, Oct 27, 2015 at 11:04:47AM +0200, Andriy Gapon wrote: >>>> And now another observation. I have C1E option enabled in BIOS. It >>>> means that if all cores enter C1 state, then the whole processor >>>> is magically placed into a deep C-state (C3, I think). LAPIC timer >>>> on this CPU model does not run in the deep C-state. So, I had to >>>> disable C1E option to test the LAPIC timer in a useful way. But >>>> before actually testing it I first tried to reproduce the problem. As >>>> you might have already guessed the problem is gone with that option >>>> disabled. Scratching my head to understand the implications of this >>>> observation. >>> >>> Most obvious explanation would be that the latency of wakeup is very large. >>> What is the HPET frequency when the jitter occur ? >>> >> >> kern.timecounter.tc.TSC-low.frequency: 1607351869 >> kern.eventtimer.et.HPET.frequency: 14318180 >> >> Or did you mean the actual rate of timer interrupts? > > Actual rate, and I used the wrong word. Since most likely your eventtimer > is not periodic, I mean the next timer interrupt deadline. > I wasn't sure how to answer your question, so I went ahead and obtained a little bit of debugging information using KTR. For a start, here is a patch that adds couple more trace points: https://people.freebsd.org/~avg/timekeeping-ktr.patch Please note that the patch has a small bug in hardclock_cnt(): + CTR1(KTR_SPARE2, "hardclock_cnt at %d: newticks=%d", + newticks); Two format specifiers but only one argument. I mention this, so that the trace is not misinterpreted. Then, here is a snippet of the trace that seems to be of interest: https://people.freebsd.org/~avg/timekeeping.ktrdump.txt I must admit that I have a hard time interpreting what that trace says. As I understand the KTR timestamps are raw TSC values. So, there is a huge gap between entries #999543 and #999544. It's larger than 2 * 2^32, so even after applying tsc_shift it's still larger that 2^32. I am not sure how the HPET timer could get programmed for such a long delay. Or maybe there is some lower level problem that caused the interrupt to be delayed by (16450821032024 - 16442184278422) / 2 / 1607351869 ≈ 2.687 seconds. That's quite a long time. One thing that I can tell is that my earlier hypothesis seems to be wrong as all raw TSC values are increasing. Grepping for lines where the timer gets reprogrammed I was surprised to see the following: 999543 1 16442184278422 load at 1: next 5042.d25692b3 eq 0 999507 3 16442183843960 load at 3: next 5042.d23e2146 eq 0 999503 2 16442183825488 load at 2: next 5042.d25692b3 eq 0 999472 0 16442183792870 load at 0: next 5042.d23e2146 eq 0 This looks like the next event time has flipped between those two values somehow. Also, the following seemed a little bit strange: 999508 3 16442183860624 ipi at 3: now 5042.d2301033 999467 3 16442183787851 active at 3: now 5042.d2306108 The later even reports a value of 'now' that is smaller. I think that that can be explained by the fact that in one case the value is obtained via sbinuptime() while in the other case it's taken from state->now, which is set when an actual timer interrupt is received (before sending a hardclock IPI). In either case I am going to add a few more trace points in et_start and the HPET timer code and see if I can catch anything interesting there. -- Andriy Gapon
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5630FC3B.2070908>