Date:      Mon, 6 May 2019 22:43:36 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Justin Hibbits <chmeeedalf@gmail.com>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   head -r347003 on 2-socket/2-cores-each G5 PowerMac11, 2's: one type of boot-blocking context found
Message-ID:  <D2CEBBBA-40A5-4924-9817-53A8ED81011E@yahoo.com>

Every example of boot failure during cpu_mp_unleash,
where I've had the tracking in place, has had 1 or more
examples of srr0<DMAP_BASE_ADDRESS (EXC_ISE) in
handle_kernel_slb_spill before cpu_mp_unleash tries to
start its first ap.

Every example of boot success, where I've had the tracking
in place, has had no examples of srr0<DMAP_BASE_ADDRESS
(EXC_ISE) in handle_kernel_slb_spill before the
cpu_mp_unleash finished. (Successful boots are rare
in my current test context, so there are fewer examples
of this.)

In other words: for the successful boots, the original
live-G5 information for the segment was still present
throughout that time frame, thus avoiding a slbtrap for
such a fetch address over the time frame involved.



In the code:

        rstvec = rstvec_virtbase + reset;
printf("powermac_smp_start_cpu: about to use *rstvec==4\n");
        *rstvec = 4;
        powerpc_sync();
        (void)(*rstvec);
        powerpc_sync();
        DELAY(1);
printf("powermac_smp_start_cpu: about to use *rstvec==0\n");
        *rstvec = 0;
        powerpc_sync();
        (void)(*rstvec);
        powerpc_sync();
printf("powermac_smp_start_cpu: done using *rstvec==0\n");

In every boot failure, the last line reported via
FireWire dcons has been the first of those 3 printf's,
with CPU 2 as the target (of 0-3).

The above code appears to me to execute with MSR.IR=1
on the bsp.

But, then, what would *rstvec do if there is no ESID=0
V=1 combination active for the live-G5 information at
the time? Does that block the exception code that
is in what would be ESID=0's address range, effectively
preventing slbtrap from being invoked to enable ESID=0?

In other words: when MSR.IR=1, does there always
need to be an ESID=0 V=1 entry? Is it appropriate
to reserve one for ESID=0 V=1 (after invalidating
any arbitrarily placed ESID=0 V=1 entry present
before the kernel even started)?
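
To make the ESID=0 V=1 question concrete, here is a minimal
sketch of what "an ESID=0 V=1 entry is present" would mean for
a saved copy of the SLB entries. It is not the kernel's code.
The sketch_ names are hypothetical, and the 256 MB segment
shift, the V bit position, and the ESID mask are my assumptions
about the 64-bit SLB entry layout, to be checked against the
real headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the ESID word of an SLB entry (hypothetical names). */
#define SKETCH_SEGMENT_SHIFT   28                     /* 256 MB segments */
#define SKETCH_SLBE_VALID      0x0000000008000000ULL  /* V bit */
#define SKETCH_SLBE_ESID_MASK  0xfffffffff0000000ULL  /* ESID field */

/* Effective address -> ESID for 256 MB segments. */
static inline uint64_t
sketch_ea_to_esid(uint64_t ea)
{
	return (ea >> SKETCH_SEGMENT_SHIFT);
}

/*
 * Scan a saved array of SLB ESID words and report whether any
 * entry has V=1 and matches the requested ESID.
 */
static bool
sketch_slb_has_valid_esid(const uint64_t *slbe, size_t n, uint64_t esid)
{
	size_t i;

	assert(slbe != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if ((slbe[i] & SKETCH_SLBE_VALID) == 0)
			continue;	/* V=0: entry does not count */
		if (((slbe[i] & SKETCH_SLBE_ESID_MASK) >>
		    SKETCH_SEGMENT_SHIFT) == esid)
			return (true);
	}
	return (false);
}
```

Under those assumptions, any address below 0x10000000 maps to
ESID=0, so sketch_slb_has_valid_esid(saved, n, 0) is the kind of
check the extended tracking could record.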



I am extending the tracking to include ESID=0
specific tracking as well.

And I'll keep going in case this is an inference
from too small a sample.


Notes:


So far I've only seen EXC_ISE in handle_kernel_slb_spill
for srr0<DMAP_BASE_ADDRESS. The rest are dar (EXC_DSE)
based, and for those I've seen each of:

dar<DMAP_BASE_ADDRESS
DMAP_BASE_ADDRESS<=dar<VM_MIN_KERNEL_ADDRESS
VM_MIN_KERNEL_ADDRESS<=dar

No count of examples of handle_kernel_slb_spill
for a range (given the type) has been more than
4 so far. From boot to boot the counts that
are ever non-zero (so far) vary in value. I've
never had all the counts be zero.
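
The counting described above can be sketched roughly as follows.
This is an illustration of the classification, not the code I
actually have in the kernel. All sketch_ names are hypothetical,
and the two boundary constants are assumed values standing in
for DMAP_BASE_ADDRESS and VM_MIN_KERNEL_ADDRESS. The address
passed in is srr0 for EXC_ISE and dar for EXC_DSE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for the kernel's boundary constants. */
#define SKETCH_DMAP_BASE_ADDRESS      0xc000000000000000ULL
#define SKETCH_VM_MIN_KERNEL_ADDRESS  0xe000000000000000ULL

enum sketch_exc {
	SKETCH_EXC_DSE = 0,	/* data side: classify dar */
	SKETCH_EXC_ISE,		/* instruction side: classify srr0 */
	SKETCH_EXC_COUNT
};

enum sketch_range {
	SKETCH_BELOW_DMAP = 0,	/* addr < DMAP_BASE_ADDRESS */
	SKETCH_DMAP_TO_KERNEL,	/* DMAP_BASE_ADDRESS <= addr < VM_MIN_KERNEL_ADDRESS */
	SKETCH_KERNEL_AND_UP,	/* VM_MIN_KERNEL_ADDRESS <= addr */
	SKETCH_RANGE_COUNT
};

/* One counter per (exception type, address range) pair. */
struct sketch_spill_counts {
	unsigned int n[SKETCH_EXC_COUNT][SKETCH_RANGE_COUNT];
};

/* Map an address to one of the three ranges listed above. */
static enum sketch_range
sketch_classify(uint64_t addr)
{
	if (addr < SKETCH_DMAP_BASE_ADDRESS)
		return (SKETCH_BELOW_DMAP);
	if (addr < SKETCH_VM_MIN_KERNEL_ADDRESS)
		return (SKETCH_DMAP_TO_KERNEL);
	return (SKETCH_KERNEL_AND_UP);
}

/* Bump the counter for one observed spill. */
static void
sketch_record_spill(struct sketch_spill_counts *c, enum sketch_exc type,
    uint64_t addr)
{
	assert(c != NULL);
	c->n[type][sketch_classify(addr)]++;
}
```

With counters like these, the failure pattern reported above
corresponds to n[SKETCH_EXC_ISE][SKETCH_BELOW_DMAP] being
non-zero before the first ap is started.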



Part of having "tracking in place" is the use of
FireWire dcons to observe printf output. Information
from prior to my working that way is not all that
useful: I did not then know that there was missing
printf output that had been generated.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



