Date: Fri, 19 Apr 2019 22:28:26 -0700 From: Mark Millard <marklmi@yahoo.com> To: Justin Hibbits <chmeeedalf@gmail.com> Cc: FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, Nathan Whitehorn <nwhitehorn@freebsd.org> Subject: Re: Looks like ap_letgo use needs platform specific code to allow avoiding the "sleep-gets-stuck" problem on PowerMac11,2's . . . Message-ID: <1C697AD4-6CAC-4C33-8D77-72D9B53A7648@yahoo.com> In-Reply-To: <2E386EE0-782D-47CB-978B-B5A010AFCF88@yahoo.com> References: <B5B1CC39-1D75-42F4-9661-62DA9D029D34@yahoo.com> <CAHSQbTBJtUc70XZgEk449-dLEhOydsypxD2NpVNsrxy0NcxnNA@mail.gmail.com> <2E386EE0-782D-47CB-978B-B5A010AFCF88@yahoo.com>
next in thread | previous in thread | raw e-mail | index | archive | help
[The or 31,31,31 and or 6,6,6 test behaved as you said.] On 2019-Apr-19, at 21:42, Mark Millard <marklmi at yahoo.com> wrote: > On 2019-Apr-19, at 20:48, Justin Hibbits <chmeeedalf at gmail.com> = wrote: >=20 >> On Fri, Apr 19, 2019 at 10:36 PM Mark Millard <marklmi@yahoo.com> = wrote: >>>=20 >>> [Note: My context is tied to getting usefdt mode operable on >>> old PowerMacs. The below is only tested for usefdt mode >>> so far.] >>>=20 >>> The following investigatory patch has so-far stopped my having >>> sleep-gets-stuck problems (only seen on 2-socket/1-core-each >>> 970 MP G5 Powermac11,2's as far as I know): >>>=20 >>> # svnlite diff /usr/src/sys/powerpc/powerpc/mp_machdep.c | more = Index: = /usr/src/sys/powerpc/powerpc/mp_machdep.c >>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>> --- /usr/src/sys/powerpc/powerpc/mp_machdep.c (revision 345758) >>> +++ /usr/src/sys/powerpc/powerpc/mp_machdep.c (working copy) >>> @@ -77,9 +77,10 @@ >>> PCPU_SET(awake, 1); >>> __asm __volatile("msync; isync"); >>>=20 >>> + powerpc_sync(); >>> while (ap_letgo =3D=3D 0) >>> - __asm __volatile("or 31,31,31"); >>> - __asm __volatile("or 6,6,6"); >>> + powerpc_sync(); >>> + isync(); >>>=20 >>> /* >>> * Set timebase as soon as possible to meet an implicit = rendezvous >>> @@ -262,8 +263,11 @@ >>> __asm __volatile("msync; isync"); >>>=20 >>> /* Let APs continue */ >>> - atomic_store_rel_int(&ap_letgo, 1); >>> + ap_letgo=3D 1; // depend on prior sync, no need to lwsync = first >>>=20 >>> + powerpc_sync(); // analogous to what the ap's do (more = similar time frame?) >>> + if (ap_letgo) isync(); >>> + >>> platform_smp_timebase_sync(ap_timebase, 0); >>>=20 >>> while (ap_awake < smp_cpus) >>>=20 >>> Apparently, the use of "or 31,31,31" causes sizable >>> variations in the time frame when the platform_smp_timebase_sync >>> happens on the various cores across the two 970MPs. >>>=20 >>> It looks something like a platform_ap_letgo_wait is appropriate, >>> with a powermac_ap_letgo_wait specific one, say. (Or, possibly, >>> AIM specific but spanning powermac?) >>>=20 >>> The above patch has booted and operated the 2-socket PowerMac7,2 >>> context fine as well so far. I'll check the G5's and a G4 >>> dual-socket with a 32-bit powerpc build. >>>=20 >>> I've no clue if there are any time-mismatch issues across >>> sockets/cores/hw-threads for the "8-way SMT" contexts with >>> "dozens to hundreds of CPUs". >>>=20 >>> I've only been testing for part of today and I do not have >>> access to any non-PowerMac PowerPC contexts. So this is >>> preliminary but I do not expect "or 31,31,31" is going to >>> be appropriate to the PowerMac11,2 contexts that caused >>> my investigation of the issue. >>=20 >> Those nops are just that on the G5: nops. They do absolutely nothing >> on any processor that's not multithreaded. On multithreaded CPUs = they >> are priority hints. >=20 > I thought PowerISA 2.03 was before multi-threaded (but spanning > multi-core). It lists 31,31,31 in a table with other values. Is > there a better match to the 970MP vintage of things for me to > reference? >=20 > I'll test: >=20 > powerpc_sync(); > while (ap_letgo =3D=3D 0) > { > __asm __volatile("or 31,31,31"); > powerpc_sync(); > } > __asm __volatile("or 6,6,6"); > isync(); >=20 > (or some other such if you want). Let me know if you > have some specific variation you prefer. Some quick testing seems to be doing what you indicated. I've updated https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D233863 to have the updated code. >> What most likely 'fixed' the problem for you was the addition of the >> synchronization primitives in the code path. You can add them = without >> sacrificing the nops. Does this patch fix the boot issue on the G5 >> quad without the usefdt=3D1 setting, and without reverting the KVA >> change? >=20 > We already had an exchange about my forcing an slb entry as > needed for pcpup->pc_curpcb to be used. It massively changed > the frequency of hangups (rare now). >=20 > As a reminder, I had added >=20 > hack_into_slb_if_needed(pcpup->pc_curpcb); >=20 > in cpudep_ap_bootstrap. This was because other > activity from: >=20 > SI_SUB_KTHREAD_INIT =3D 0xe000000, /* init process*/ > SI_SUB_KTHREAD_PAGE =3D 0xe400000, /* pageout daemon*/ > SI_SUB_KTHREAD_VM =3D 0xe800000, /* vm daemon*/ > SI_SUB_KTHREAD_BUF =3D 0xea00000, /* buffer daemon*/ > SI_SUB_KTHREAD_UPDATE =3D 0xec00000, /* update daemon*/ > SI_SUB_KTHREAD_IDLE =3D 0xee00000, /* idle procs*/ > #ifndef EARLY_AP_STARTUP > SI_SUB_SMP =3D 0xf000000, /* start the APs*/ > #endif=20 >=20 > was competing for slb entries and doing slb entry replacements in > parallel with the ap startup activity so sometimes no slot covered > the pcpup->pc_curpcb relted address range. (Replacement slots are > picked based on mftb()%n_slbs .) >=20 > Are you asking me to disable that call and see what happens? I've not done anything about disabling the replacement of an slb entry for spanning what pcpup->pc_curpcb-> refers to when there is no such spanning entry already. (The code makes no replacement if an entry does span the address range. So, effectively, then, the code is a no-op for such conditions.) > With the hack_into_slb_if_needed call and the other patches > reported in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D233863 > I'm booting and using historical and usefdt modes just fine > as far as I've tested. (usefdt mode needing vt.) The patches do > not involve reverting the KVA change. One of the test machines is=20 > a "G5 quad" and its is the primary one I build on and test. =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar)
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1C697AD4-6CAC-4C33-8D77-72D9B53A7648>