Date:      Thu, 2 May 2019 22:23:25 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Justin Hibbits <chmeeedalf@gmail.com>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   970MP PowerMac G5s: What printf's show about cpu_mp_unleash hangups on the test variant of head -r347003 (surprising, important)
Message-ID:  <EF44A358-CC6D-4244-A911-6D4DACFF4B21@yahoo.com>

[Note: I still have your requested loop change, my
isync additions, and my libc string compare code
change in what I'm working with for head -r347003 .]

I started using printf to help identify more about what
code managed to execute vs what code did not for
hang-ups.

This note is just about cpu_mp_unleash observations and
experiments related to what printf's showed.

I did:

static void
cpu_mp_unleash(void *dummy)
{
. . . (omitted as all earlier printf's printed) . . .
printf("cpu_mp_unleash: before DELAY\n");
        /* Let the APs get into the scheduler */
        DELAY(10000);
printf("cpu_mp_unleash: after DELAY\n");

}

What I saw was that only the first of the two DELAY printf's
shown above printed when cpu_mp_unleash hung up, such a
hangup being the normal case when vt_upgrade did not hang up
first.
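
[Illustration: a minimal userland sketch of the same
bracketing technique, not the kernel code itself. CHECKPOINT,
suspect_step and run_traced are hypothetical names; the
stand-in step is empty where the kernel had DELAY(10000). If
only the "before" line shows up, the step between the two
checkpoints is where execution stopped.]

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from the FreeBSD tree): print a checkpoint
 * tagged with the enclosing function and line, and count how many
 * checkpoints were reached.
 */
static int checkpoints_reached = 0;

#define CHECKPOINT(msg) do {					\
	printf("%s:%d: %s\n", __func__, __LINE__, (msg));	\
	checkpoints_reached++;					\
} while (0)

/* Stand-in for the step under suspicion; the kernel had DELAY(10000). */
static void
suspect_step(void)
{
}

/* Bracket the suspect step and report how many checkpoints printed. */
static int
run_traced(void)
{
	CHECKPOINT("before suspect_step");
	suspect_step();
	CHECKPOINT("after suspect_step");
	return (checkpoints_reached);
}
```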

So I looked at /mnt/usr/src/sys/powerpc/powerpc/clock.c
and its DELAY routine and came up with only one thing
that looked like a useful experiment. Note what I
then commented out:

# svnlite diff /mnt/usr/src/sys/powerpc/powerpc/clock.c
Index: /mnt/usr/src/sys/powerpc/powerpc/clock.c
===================================================================
--- /mnt/usr/src/sys/powerpc/powerpc/clock.c	(revision 347003)
+++ /mnt/usr/src/sys/powerpc/powerpc/clock.c	(working copy)
@@ -309,10 +309,10 @@
 	TSENTER();
 	tb = mftb();
 	ttb = tb + howmany((uint64_t)n * 1000000, ps_per_tick);
-	nop_prio_vlow();
+	//nop_prio_vlow();
 	while (tb < ttb)
 		tb = mftb();
-	nop_prio_medium();
+	//nop_prio_medium();
 	TSEXIT();
 }

After this change I've not (yet?) seen another cpu_mp_unleash
hangup in my test context.
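
[Illustration: a userland sketch of the deadline arithmetic
and polling loop from the diff above, so the loop can be run
without the hardware. HOWMANY mirrors the round-up division of
howmany(); read_tb and fake_tb are hypothetical stand-ins for
mftb(); the 30000 ps tick length used below is a made-up value
for the example, not a measured one.]

```c
#include <assert.h>
#include <stdint.h>

/* Round-up division, as howmany() from <sys/param.h> does. */
#define HOWMANY(x, y)	(((x) + ((y) - 1)) / (y))

/*
 * Timebase ticks needed to cover n microseconds, given the length of
 * one tick in picoseconds (1 microsecond = 1000000 picoseconds).
 */
static uint64_t
delay_ticks(int n, uint64_t ps_per_tick)
{
	assert(n >= 0 && ps_per_tick != 0);
	return (HOWMANY((uint64_t)n * 1000000, ps_per_tick));
}

/*
 * Poll the counter until it reaches the deadline, as the DELAY loop
 * does. Returns the number of polls made inside the loop.
 */
static uint64_t
spin_until(uint64_t (*read_tb)(void), uint64_t ticks)
{
	uint64_t tb, ttb, polls;

	tb = read_tb();
	ttb = tb + ticks;
	polls = 0;
	while (tb < ttb) {
		tb = read_tb();
		polls++;
	}
	return (polls);
}

/* Fake timebase (assumption for the demo): advances 1000 ticks per read. */
static uint64_t fake_tb_now;

static uint64_t
fake_tb(void)
{
	fake_tb_now += 1000;
	return (fake_tb_now);
}
```

[With the assumed 30000 ps tick, delay_ticks(10000, 30000)
comes to 333334 ticks. The loop only exits once the counter
read is seen to advance past the deadline, so anything that
stalls the polling thread shows up as a DELAY that never
returns.]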

Even if not documented to do so, it appears to me that the
or Rx,Rx,Rx code behind nop_prio_vlow() does something
specific on the 970MPs in the 2-socket/2-core-each G5
PowerMac11,2's, and that what it does interferes with making
progress in DELAY, in at least that specific use of it and/or
in any other uses on the APs during cpu_mp_unleash.
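
[Illustration: a sketch of what such priority-hint helpers
look like, with a switch to run a spin loop with or without
them. The hint_prio_* names, PRIO_HINTS_AVAILABLE and spin are
hypothetical; the encodings are the Power ISA priority no-op
forms as I understand them (or 31,31,31 for very low priority,
or 2,2,2 for medium). On non-PowerPC builds the helpers are
empty so the file still compiles.]

```c
/*
 * Priority-hint no-ops, assumed encodings:
 *   or 31,31,31  request very low priority
 *   or 2,2,2     request medium (normal) priority
 * Compiled to nothing on other architectures.
 */
#if defined(__powerpc__) || defined(__powerpc64__)
#define PRIO_HINTS_AVAILABLE 1
static inline void
hint_prio_vlow(void)
{
	__asm__ __volatile__("or 31,31,31");
}

static inline void
hint_prio_medium(void)
{
	__asm__ __volatile__("or 2,2,2");
}
#else
#define PRIO_HINTS_AVAILABLE 0
static inline void
hint_prio_vlow(void)
{
}

static inline void
hint_prio_medium(void)
{
}
#endif

/*
 * Spin for a fixed number of iterations, optionally bracketed by the
 * hints the way the DELAY loop is. Returns the iterations completed.
 */
static unsigned long
spin(unsigned long iterations, int use_hints)
{
	volatile unsigned long count = 0;
	unsigned long i;

	if (use_hints)
		hint_prio_vlow();
	for (i = 0; i < iterations; i++)
		count++;
	if (use_hints)
		hint_prio_medium();
	return (count);
}
```

[A runtime switch like use_hints would allow comparing the
two behaviors without rebuilding, which is the same experiment
as commenting the calls out.]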

Of course, this kind of testing is probabilistic and I do not
have hundreds or more of examples of any specific condition
at this point. But, so far, the change in behavior seems
clear: I went from always-hanging-up-so-far to
always-booting-so-far (when vt_upgrade did not prevent the
test in each context).


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
