Date:      Sat, 20 Apr 2019 14:14:22 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Justin Hibbits <chmeeedalf@gmail.com>
Cc:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, Nathan Whitehorn <nwhitehorn@freebsd.org>
Subject:   Side subject: powerpc64 related PowerMac boot hangs slb details that drove having hack_into_slb_if_needed
Message-ID:  <D80DF79C-89A0-43C5-8D73-8CF1255AB17B@yahoo.com>
In-Reply-To: <C93E2906-864D-40F3-80A0-73DA5F7DA3E7@yahoo.com>
References:  <B5B1CC39-1D75-42F4-9661-62DA9D029D34@yahoo.com> <CAHSQbTBJtUc70XZgEk449-dLEhOydsypxD2NpVNsrxy0NcxnNA@mail.gmail.com> <2E386EE0-782D-47CB-978B-B5A010AFCF88@yahoo.com> <1C697AD4-6CAC-4C33-8D77-72D9B53A7648@yahoo.com> <C93E2906-864D-40F3-80A0-73DA5F7DA3E7@yahoo.com>

[Looks like my memory was wrong about details of what is going
on for the slb issue: the context. So, correcting, with newly
looked-up detail and avoiding incorrect memories.]

On 2019-Apr-19, at 23:08, Mark Millard <marklmi at yahoo.com> wrote:
. . .
>>> Does this patch fix the boot issue on the G5
>>> quad without the usefdt=1 setting, and without reverting the KVA
>>> change?
>> 
>> We already had an exchange about my forcing an slb entry as
>> needed for pcpup->pc_curpcb to be used. It massively changed
>> the frequency of hangups (rare now).
>> 
>> As a reminder, I had added
>> 
>> hack_into_slb_if_needed(pcpup->pc_curpcb);

More complete for cpudep_ap_bootstrap :

        pcpup->pc_curthread = pcpup->pc_idlethread;
. . .
        pcpup->pc_curpcb = pcpup->pc_curthread->td_pcb;
. . .
#ifdef __powerpc64__
        hack_into_slb_if_needed(pcpup->pc_curpcb); // HACK!!!
#endif

        sp = pcpup->pc_curpcb->pcb_sp;

The hack_into_slb_if_needed call is there to force
pcpup->pc_curpcb-> dereferences to be possible this
early, in other words enabling use of:

pcpup->pc_idlethread->td_pcb->

This is based on expecting handle_kernel_slb_spill
to not yet be working for the cpu being started.

Later, once enough is set up for the cpu being
started, handle_kernel_slb_spill should work.

>> in cpudep_ap_bootstrap. This was because other
>> activity from:
>> 
>>      SI_SUB_KTHREAD_INIT     = 0xe000000,    /* init process*/
>>      SI_SUB_KTHREAD_PAGE     = 0xe400000,    /* pageout daemon*/
>>      SI_SUB_KTHREAD_VM       = 0xe800000,    /* vm daemon*/
>>      SI_SUB_KTHREAD_BUF      = 0xea00000,    /* buffer daemon*/
>>      SI_SUB_KTHREAD_UPDATE   = 0xec00000,    /* update daemon*/
>>      SI_SUB_KTHREAD_IDLE     = 0xee00000,    /* idle procs*/
>> #ifndef EARLY_AP_STARTUP
>>      SI_SUB_SMP              = 0xf000000,    /* start the APs*/
>> #endif 
>> 
>> was competing for slb entries and doing slb entry replacements in
>> parallel with the ap startup activity so sometimes no slot covered
>> the pcpup->pc_curpcb related address range. (Replacement slots are
>> picked based on mftb()%n_slbs .)

Ignore the above: I was not thinking of the right context.

>> Are you asking me to disable that call and see what happens?
> 
> I've not done anything about disabling the replacement of an
> slb entry for spanning what pcpup->pc_curpcb-> refers to
> when there is no such spanning entry already. (The code makes
> no replacement if an entry does span the address range.
> So, effectively, then, the code is a no-op for such conditions.)
> 
>> With the hack_into_slb_if_needed call and the other patches
>> reported in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=233863
>> I'm booting and using historical and usefdt modes just fine
>> as far as I've tested. (usefdt mode needing vt.) The patches do
>> not involve reverting the KVA change. One of the test machines is 
>> a "G5 quad" and it is the primary one I build on and test.

Looking up what I thought I remembered when writing some of
the above (and what I've written in other places) . . .

/usr/src/sys/powerpc/aim/moea64_native.c has:

void
cpu_pcpu_init(struct pcpu *pcpu, int cpuid, size_t sz)
{
#ifdef __powerpc64__
        /* Copy the SLB contents from the current CPU */
        memcpy(pcpu->pc_aim.slb, PCPU_GET(aim.slb), sizeof(pcpu->pc_aim.slb));
#endif
}

cpu_pcpu_init is used via /usr/src/sys/kern/subr_pcpu.c 's pcpu_init :

void
pcpu_init(struct pcpu *pcpu, int cpuid, size_t size)
{

        bzero(pcpu, size);
        KASSERT(cpuid >= 0 && cpuid < MAXCPU,
            ("pcpu_init: invalid cpuid %d", cpuid));
        pcpu->pc_cpuid = cpuid;
        cpuid_to_pcpu[cpuid] = pcpu;
        STAILQ_INSERT_TAIL(&cpuhead, pcpu, pc_allcpu);
        cpu_pcpu_init(pcpu, cpuid, size);
        pcpu->pc_rm_queue.rmq_next = &pcpu->pc_rm_queue;
        pcpu->pc_rm_queue.rmq_prev = &pcpu->pc_rm_queue;
}

That is in turn used by powerpc-specific powerpc_init (for the bsp)
and by cpu_mp_start for other cpus, with cpu_mp_start used by
mp_start during SI_SUB_CPU, much earlier than what I listed before.

powerpc_init does the following just for the bsp:

        /*
         * Set up per-cpu data for the BSP now that the platform can tell
         * us which that is.
         */
        if (platform_smp_get_bsp(&bsp) != 0)
                bsp.cr_cpuid = 0;
        pc = &__pcpu[bsp.cr_cpuid];
        __asm __volatile("mtsprg 0, %0" :: "r"(pc));
        pcpu_init(pc, bsp.cr_cpuid, sizeof(struct pcpu));
        pc->pc_curthread = &thread0;
        thread0.td_oncpu = bsp.cr_cpuid;
        pc->pc_cpuid = bsp.cr_cpuid;
        pc->pc_hwref = bsp.cr_hwref;

cpu_mp_start does (in a loop on the bsp cpu):

                if (cpu.cr_cpuid != bsp.cr_cpuid) {
                        void *dpcpu;

                        pc = &__pcpu[cpu.cr_cpuid];
                        dpcpu = (void *)kmem_malloc(DPCPU_SIZE, M_WAITOK |
                            M_ZERO);
                        pcpu_init(pc, cpu.cr_cpuid, sizeof(*pc));
                        dpcpu_init(dpcpu, cpu.cr_cpuid);

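As a toy user-space illustration (not kernel code) of why the memcpy in
cpu_pcpu_init matters here: the copy is a one-time snapshot of the bsp's
slb material, so anything the bsp gains afterwards is not seen by the
other cpu's copy. The snap_* names, the slot count of 64, and the
"0 means unused" convention are all assumptions made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SNAP_N_SLBS 64                /* assumed slot count, illustration only */

struct snap_pcpu {
        uint64_t slb_esid[SNAP_N_SLBS];   /* 0 means "slot unused" in this toy */
};

/* Mirrors the memcpy in cpu_pcpu_init: a one-time copy, not a live link. */
static void
snap_copy_slb(struct snap_pcpu *ap, const struct snap_pcpu *bsp)
{
        memcpy(ap->slb_esid, bsp->slb_esid, sizeof(ap->slb_esid));
}

/* Return 1 if esid is present in the per-cpu table, else 0. */
static int
snap_has_esid(const struct snap_pcpu *pc, uint64_t esid)
{
        assert(esid != 0);            /* 0 is reserved for "unused" here */
        for (int i = 0; i < SNAP_N_SLBS; i++)
                if (pc->slb_esid[i] == esid)
                        return (1);
        return (0);
}
```

If the bsp table gains an entry after snap_copy_slb has run, snap_has_esid
on the copy still reports 0 for it, which is the situation the
cpudep_ap_bootstrap dereference may run into.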

The memory for the pcpup->pc_idlethread->td_pcb->
use in cpudep_ap_bootstrap is not necessarily covered by
the slb material copied from the bsp during cpu_mp_start.
This is what led to my investigative addition of:

hack_into_slb_if_needed(pcpup->pc_curpcb);

in cpudep_ap_bootstrap .


I will note that the hangups have been so rare after the
hack_into_slb_if_needed(pcpup->pc_curpcb) addition that
I have no evidence of just where the low level hangup
is. They could be a completely different issue and I'd
not know it yet.

I have added a printf that would indicate if 
cpudep_ap_bootstrap got just past evaluating:

pcpup->pc_curpcb->pcb_sp

when it (later?) hangs up. But I'm waiting to see a
hangup during my other activities in order to see
the evidence.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



