Date:      Fri, 25 Nov 2016 04:12:50 -0800
From:      Jason Harmening <jason.harmening@gmail.com>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject:   Re: huge nanosleep variance on 11-stable
Message-ID:  <CAM=8qakFsacJcURw6SN5LksbOYAqwgqu0BV6Kym_0x30DxF0Dg@mail.gmail.com>
In-Reply-To: <20161125092503.GZ54029@kib.kiev.ua>
References:  <c88341e2-4c52-ed3c-a469-6446da4415f4@gmail.com> <6167392c-c37a-6e39-aa22-ca45435d6088@gmail.com> <20161102075509.GF54029@kib.kiev.ua> <3620f62e-0f4c-2d62-dcf8-e2fdff459250@gmail.com> <20161102162808.GI54029@kib.kiev.ua> <20161125092503.GZ54029@kib.kiev.ua>

On Fri, Nov 25, 2016 at 1:25 AM, Konstantin Belousov <kostikbel@gmail.com>
wrote:

> On Wed, Nov 02, 2016 at 06:28:08PM +0200, Konstantin Belousov wrote:
> > On Wed, Nov 02, 2016 at 09:18:15AM -0700, Jason Harmening wrote:
> > > I think you are probably right.  Hacking out the Intel-specific
> > > additions to C-state parsing in acpi_cpu_cx_cst() from r282678 (thus
> > > going back to sti;hlt instead of monitor+mwait at C1) fixed the problem
> > > for me.  But r282678 also had the effect of enabling C2 and C3 on my
> > > system, because ACPI only presents MWAIT entries for those states and
> > > not p_lvlx.
> > You can do the same with "debug.acpi.disabled=mwait" loader tunable
> > without hacking the code. And set sysctl hw.acpi.cpu.cx_lowest to C1 to
> > enforce use of hlt instruction even when mwait states were requested.
>
> I believe I now understand the problem.  First, I got definitive
> confirmation that the LAPIC timer on Nehalems is stopped in any C-state
> deeper than C1/C1E, i.e. even if C2 is enabled, the LAPIC eventtimer
> cannot be used.  This is consistent with the ARAT CPUID bit
> CPUID[0x6].eax[2] being reported as zero.
>
> On SandyBridge and IvyBridge CPUs, it seems that ARAT might be either 0
> or 1 according to the same source, but all CPUs I saw have ARAT = 1.
> And for Haswell and later generations, ARAT is claimed to be always
> implemented.
>
> The actual issue is a somewhat silly bug, I must admit: if ncpus >= 8 and
> the HPET uses non-FSB interrupt routing, the default HPET eventtimer
> quality of 450 is reduced by 100, i.e. it is 350. OTOH, the LAPIC default
> quality is 600 and it is reduced by 200 if ARAT is not reported. We end up
> with HPET quality 350 < LAPIC quality 400, even though ARAT is not set.
>
> The patch below sets LAPIC eventtimer quality to 100 if not ARAT.  Also
> I realized that there is no reason to disable deadline mode regardless
> of ARAT.
>
> diff --git a/sys/x86/x86/local_apic.c b/sys/x86/x86/local_apic.c
> index d9a3453..1b1547d 100644
> --- a/sys/x86/x86/local_apic.c
> +++ b/sys/x86/x86/local_apic.c
> @@ -478,8 +478,9 @@ native_lapic_init(vm_paddr_t addr)
>                 lapic_et.et_quality = 600;
>                 if (!arat) {
>                         lapic_et.et_flags |= ET_FLAGS_C3STOP;
> -                       lapic_et.et_quality -= 200;
> -               } else if ((cpu_feature & CPUID_TSC) != 0 &&
> +                       lapic_et.et_quality = 100;
> +               }
> +               if ((cpu_feature & CPUID_TSC) != 0 &&
>                     (cpu_feature2 & CPUID2_TSCDLT) != 0 &&
>                     tsc_is_invariant && tsc_freq != 0) {
>                         lapic_timer_tsc_deadline = 1;
>

Ah, that makes sense.  Thanks!
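
A minimal sketch of the quality arithmetic described above, written as plain C so the numbers can be checked outside the kernel. It only models the figures quoted in this thread and is not the kernel code: the helper names hpet_quality(), lapic_quality() and lapic_preferred() are hypothetical, and the sketch assumes the eventtimer with the higher quality is the one selected.

```c
#include <assert.h>
#include <stdbool.h>

/* Figures taken from the quoted explanation; names are illustrative only. */
enum {
	HPET_QUALITY_DEFAULT   = 450,
	HPET_PENALTY_MANY_CPUS = 100,	/* ncpus >= 8 and non-FSB routing */
	LAPIC_QUALITY_DEFAULT  = 600,
	LAPIC_PENALTY_NO_ARAT  = 200,	/* behaviour before the patch */
	LAPIC_QUALITY_NO_ARAT  = 100	/* behaviour after the patch */
};

/* HPET quality as described: 450, minus 100 on large non-FSB systems. */
static int
hpet_quality(int ncpus, bool fsb_routing)
{
	int q = HPET_QUALITY_DEFAULT;

	assert(ncpus > 0);
	if (ncpus >= 8 && !fsb_routing)
		q -= HPET_PENALTY_MANY_CPUS;
	return (q);
}

/* LAPIC quality with and without the patch applied. */
static int
lapic_quality(bool arat, bool patched)
{
	if (arat)
		return (LAPIC_QUALITY_DEFAULT);
	if (patched)
		return (LAPIC_QUALITY_NO_ARAT);
	return (LAPIC_QUALITY_DEFAULT - LAPIC_PENALTY_NO_ARAT);
}

/* Assumption: the timer with the strictly higher quality is preferred. */
static bool
lapic_preferred(int ncpus, bool fsb_routing, bool arat, bool patched)
{
	return (lapic_quality(arat, patched) >
	    hpet_quality(ncpus, fsb_routing));
}
```

With ncpus = 8, non-FSB routing and no ARAT, this gives HPET 350 against LAPIC 400 before the patch, so the LAPIC is preferred; after the patch it is HPET 350 against LAPIC 100, so the HPET is preferred.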

I'll try the patch as soon as I get back from vacation.  I've been able to
verify that setting cx_lowest and disabling mwait fixes the problem without
hacking the code.  But I've been too busy at $(WORK) to check anything
else, namely whether forcing HPET would also fix the problem.
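
To see whether a particular machine is in the affected class, the ARAT bit mentioned above (CPUID[0x6].eax[2]) can be read from userland. A minimal sketch, assuming a GCC- or Clang-compatible compiler that ships <cpuid.h>; arat_from_eax() and cpu_has_arat() are hypothetical helper names, and on non-x86 builds the query simply reports "unknown":

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

#define CPUID_LEAF_THERMAL_POWER	0x6
#define ARAT_BIT			(1u << 2)

static_assert(ARAT_BIT == 0x4u, "ARAT is bit 2 of CPUID[0x6].eax");

/* Extract the ARAT flag from a raw CPUID[0x6].eax value. */
static bool
arat_from_eax(uint32_t eax)
{
	return ((eax & ARAT_BIT) != 0);
}

/*
 * Returns 1 if the CPU reports ARAT, 0 if it does not, and -1 if the
 * leaf is unavailable or this is not an x86 build.
 */
static int
cpu_has_arat(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 when the requested leaf is unsupported. */
	if (__get_cpuid(CPUID_LEAF_THERMAL_POWER, &eax, &ebx, &ecx, &edx) == 0)
		return (-1);
	return (arat_from_eax(eax) ? 1 : 0);
#else
	return (-1);
#endif
}
```

A result of 0 from cpu_has_arat() corresponds to the case described in the quoted text, where the LAPIC timer stops in C-states deeper than C1/C1E.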
