Date: Fri, 18 Mar 2016 11:16:21 -0700 From: Stanislav Sedov <stas@FreeBSD.org> To: John Baldwin <jhb@freebsd.org> Cc: Adrian Chadd <adrian.chadd@gmail.com>, Konstantin Belousov <kostikbel@gmail.com>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org> Subject: Re: Peformance issues with r278325 Message-ID: <8EE51E0E-41F4-4B5A-A755-B58E8E1D1776@FreeBSD.org> In-Reply-To: <3277812.DVsZx4uMun@ralph.baldwin.cx> References: <FA50A68E-7F3D-4361-8A8A-EB7F97EF3D00@FreeBSD.org> <8403291.NqUNo0Qq5W@ralph.baldwin.cx> <CAJ-VmonUdHRJ0jcFphqXi1w2PwbDycLG190Yd5z8JQWGxW_1iQ@mail.gmail.com> <3277812.DVsZx4uMun@ralph.baldwin.cx>
next in thread | previous in thread | raw e-mail | index | archive | help
> On Mar 18, 2016, at 10:37 AM, John Baldwin <jhb@freebsd.org> wrote: >=20 > On Thursday, March 17, 2016 10:45:39 AM Adrian Chadd wrote: >> On 17 March 2016 at 10:23, John Baldwin <jhb@freebsd.org> wrote: >>=20 >>>=20 >>> I had not expected this commit to have this impact, but Konstantin = is correct. >>> I wonder if simply reducing the DELAY() from 5 down to 1 or so would = be >>> sufficient? (Note that you need to adjust the prior loop to use +=3D = 1 instead >>> of +=3D 5 in that case.) >>>=20 >>> Note that DETECT_DEADLOCK is not enabled by default, so the = AFTER_SPIN case >>> (which waits for an IPI just sent to be delivered) shouldn't be = enabled (and >>> in fact I'd like to just remove that code entirely). This means = that only >>> BEFORE_SPIN should be spinning, and it should only be spinning if a = CPU sends >>> IPIs back to back such that the previous IPI is still pending (not = yet >>> delivered) when the CPU wants to send another IPI. >>>=20 >>> We can probably assume a TSC if we have SMP, so if changing the = delay from 5 >>> to 1 doesn't work we can try just using the TSC directly to control = the >>> spin length and go back to using a simple pause. >>>=20 >>> I have an old set of changes that might also be interesting that = permit >>> TLB shootdown IPI handlers to run while spinlocks are held (by using = cr8/TPR >>> to control interrupts when a spinlock is held instead of disabling = all >>> interrupts). I haven't found a workload where that helped yet. = However, >>> yours might be an interesting workload to try those changes out on. >>=20 >> Do you think it's worth just reverting it for now just so it lands in >> 10.3-RELEASE? >=20 > Probably. If the '1' change fixes it that is a simple test, otherwise = we > can revert in 10.x. I think I'll likely just convert it to use a = direct > TSC delay loop always in HEAD (assuming that verifies ok in testing as = well). >=20 FWIW we are currently testing the delay '1' change. Unfortunately, the = test is not easy to repeat (we didn't find a synthetic one yet that results in = the same outcome), so it does take more time that I would like. Will follow up = with the=20 results. We did try HEAD as well a while ago, and although it exhibited the same = pattern. However it did not utilize the x2apic unfortunately, as it does seem to = be disables in the BIOS (FreeBSD reports it being disables in the DMAR table). Thanks for looking into it! -- Stanislav Sedov ST4096-RIPE
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?8EE51E0E-41F4-4B5A-A755-B58E8E1D1776>