Date: Thu, 2 May 2019 22:23:25 -0700
From: Mark Millard <marklmi@yahoo.com>
To: Justin Hibbits <chmeeedalf@gmail.com>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject: 970MP PowerMac G5s: What printf's show about cpu_mp_unleash hangups on the test variant of head -r347003 (surprising, important)
Message-ID: <EF44A358-CC6D-4244-A911-6D4DACFF4B21@yahoo.com>
[Note: I still have your requested loop change, my isync
additions, and my libc string compare code change in what
I'm working with for head -r347003 .]

I started using printf to help identify which code managed
to execute and which did not before a hang-up. This note is
just about cpu_mp_unleash observations and the experiments
prompted by what the printf's showed.

I did:

static void
cpu_mp_unleash(void *dummy)
{
. . . (omitted as all earlier printf's printed) . . .
	printf("cpu_mp_unleash: before DELAY\n");

	/* Let the APs get into the scheduler */
	DELAY(10000);

	printf("cpu_mp_unleash: after DELAY\n");
}

What I saw was that only the first of the two DELAY printf's
shown above printed when cpu_mp_unleash hung up, such a
hangup being the normal case when vt_upgrade did not hang up
first.

So I looked at /mnt/usr/src/sys/powerpc/powerpc/clock.c and
its DELAY routine and came up with only one thing that looked
like a useful experiment. Note what I then commented out:

# svnlite diff /mnt/usr/src/sys/powerpc/powerpc/clock.c
Index: /mnt/usr/src/sys/powerpc/powerpc/clock.c
===================================================================
--- /mnt/usr/src/sys/powerpc/powerpc/clock.c	(revision 347003)
+++ /mnt/usr/src/sys/powerpc/powerpc/clock.c	(working copy)
@@ -309,10 +309,10 @@
 	TSENTER();
 	tb = mftb();
 	ttb = tb + howmany((uint64_t)n * 1000000, ps_per_tick);
-	nop_prio_vlow();
+	//nop_prio_vlow();
 	while (tb < ttb)
 		tb = mftb();
-	nop_prio_medium();
+	//nop_prio_medium();
 	TSEXIT();
 }

After this change I've not (yet?) seen another cpu_mp_unleash
hangup in my test context.

Even if not documented to do so, it appears to me that the
or Rx,Rx,Rx code that is behind nop_prio_vlow() does
something specific on the 970MP's in the 2-socket/2-core-each
G5 PowerMac11,2's, and that what it does interferes with
making progress in DELAY, in at least that specific use of it
and/or any others on the APs during cpu_mp_unleash.
Of course, this testing is probabilistic, and I do not have
hundreds or more examples of any specific condition at this
point. But, so far, the change in behavior seems clear: I went
from always-hanging-up-so-far to always-booting-so-far (when
vt_upgrade did not prevent the test in each context).

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)