Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 16 Sep 2017 19:21:57 -0700
From:      Mark Millard <markmi@dsl-only.net>
To:        Emmanuel Vadot <manu@bidouilliste.com>, freebsd-arm <freebsd-arm@freebsd.org>, freebsd-hackers <freebsd-hackers@freebsd.org>
Subject:   Re: FYI: Pine64+ 2GB (so A64) booting and non-debug vs. debug kernel: "APs not started" for failure cases only, possible missing atomic_load_acq_int's?
Message-ID:  <A266F7F9-3A95-4E0A-9631-569F085B97E2@dsl-only.net>
In-Reply-To: <2FC8A531-5E8F-4765-A1F3-A8D6A6AA0C14@dsl-only.net>
References:  <1C18FF04-6772-4E9C-88C5-B8D5478C5809@dsl-only.net> <E450D0E4-0086-468B-9D0C-ED3DBBB13945@dsl-only.net> <6D63486A-E933-4CC2-9A24-0688BE01A0DA@dsl-only.net> <8E15A747-3413-4537-9ECA-5EDAD1285351@dsl-only.net> <256CF612-1D52-4BCC-981B-E476F6EEC9AB@dsl-only.net> <2FC8A531-5E8F-4765-A1F3-A8D6A6AA0C14@dsl-only.net>

next in thread | previous in thread | raw e-mail | index | archive | help
[As the following does not flow well from the previous
message and stands somewhat on its own in some respects:
I top post this.]

ffff0000005fb14c <release_aps+0xb4> adrp        x8, ffff000000aaa000 =
<tmpbuffer+0x8a0>
ffff0000005fb150 <release_aps+0xb8> add x8, x8, #0x778
ffff0000005fb154 <release_aps+0xbc> adrp        x0, ffff0000006c9000 =
<digits+0x1e74e>
ffff0000005fb158 <release_aps+0xc0> add x0, x0, #0xb8a
ffff0000005fb15c <release_aps+0xc4> stlr        w20, [x8]
ffff0000005fb160 <release_aps+0xc8> sev

I ran into the following mention of SEV and making sure
it is appropriately delayed:

> The mechanism that signals an event to other PEs is IMPLEMENTATION =
DEFINED. The PE is not required to guarantee the ordering of this event =
with respect to the completion of memory accesses by instructions before =
the SEV instruction. Therefore, ARM recommends that software includes a =
DSB instruction before any SEV instruction.
>=20
> The SEVL instruction appears to execute in program order relative to =
any subsequent WFE instruction executed on the same PE, without the need =
for any explicit insertion of barrier instructions.

=3D=3D=3D
Mark Millard
markmi at dsl-only.net

On 2017-Sep-16, at 4:08 PM, Mark Millard <markmi at dsl-only.net> wrote:

> [Adding a couple of numbers that help
> interpret what I found as far as what
> specifically did not work as expected.]
>=20
> On 2017-Sep-16, at 3:17 PM, Mark Millard <markmi@dsl-only.net> wrote:
>=20
>> A new finding:
>>=20
>> When verbose boot messages are enabled
>> there is an earlier contrast between when
>> booting works overall vs. when it later
>> fails:
>>=20
>> When it works:
>>=20
>> subsystem f000000
>>  release_aps(0)... Release APs
>> done.
>>=20
>> When it fails:=20
>>=20
>> subsystem f000000
>>  release_aps(0)... Release APs
>> APs not started
>> done.
>>=20
>> And it well explains why ->pc_curthread
>> ends up NULL for secondaries (in particular
>> cpu =3D=3D 1), init_secondary had never executed
>> the assignments show below:=20
>>=20
>>       while (!aps_ready)
>>               __asm __volatile("wfe");
>>=20
>>       /* Initialize curthread */
>>       KASSERT(PCPU_GET(idlethread) !=3D NULL, ("no idle thread"));
>>       pcpup->pc_curthread =3D pcpup->pc_idlethread;
>>       pcpup->pc_curpcb =3D pcpup->pc_idlethread->td_pcb;
>>=20
>> The subsystem messages are from:
>>=20
>> static void
>> release_aps(void *dummy __unused)
>> {      =20
>>       int i;
>>=20
>>       /* Only release CPUs if they exist */
>>       if (mp_ncpus =3D=3D 1)
>>               return;
>>=20
>>       intr_pic_ipi_setup(IPI_AST, "ast", ipi_ast, NULL);
>>       intr_pic_ipi_setup(IPI_PREEMPT, "preempt", ipi_preempt, NULL);
>>       intr_pic_ipi_setup(IPI_RENDEZVOUS, "rendezvous", =
ipi_rendezvous, NULL);
>>       intr_pic_ipi_setup(IPI_STOP, "stop", ipi_stop, NULL);
>>       intr_pic_ipi_setup(IPI_STOP_HARD, "stop hard", ipi_stop, NULL);
>>       intr_pic_ipi_setup(IPI_HARDCLOCK, "hardclock", ipi_hardclock, =
NULL);
>>=20
>>       atomic_store_rel_int(&aps_ready, 1);
>>       /* Wake up the other CPUs */
>>       __asm __volatile("sev");
>>=20
>>       printf("Release APs\n");
>>=20
>>       for (i =3D 0; i < 2000; i++) {
>>               if (smp_started)
>>                       return;
>>               DELAY(1000);
>>       }
>>=20
>>       printf("APs not started\n");
>> }      =20
>> SYSINIT(start_aps, SI_SUB_SMP, SI_ORDER_FIRST, release_aps, NULL);
>>=20
>>=20
>> init_secondary has an example or two of not using
>> atomic_load_acq_int when atomic_store_rel_int is in
>> use. One is:
>>=20
>>       while (!aps_ready)
>>               __asm __volatile("wfe");
>>=20
>>       /* Initialize curthread */
>>       KASSERT(PCPU_GET(idlethread) !=3D NULL, ("no idle thread"));
>>       pcpup->pc_curthread =3D pcpup->pc_idlethread;
>>       pcpup->pc_curpcb =3D pcpup->pc_idlethread->td_pcb;
>>=20
>> where aps_ready was declared via:
>>=20
>> /* Set to 1 once we're ready to let the APs out of the pen. */
>> volatile int aps_ready =3D 0;
>>=20
>> where release_aps has the use of atomic_store_rel_int:
>>=20
>>       atomic_store_rel_int(&aps_ready, 1);
>>       /* Wake up the other CPUs */
>>       __asm __volatile("sev");
>>=20
>> There is also in init_secondary:
>>=20
>>       atomic_add_rel_32(&smp_cpus, 1);
>>=20
>>       if (smp_cpus =3D=3D mp_ncpus) {
>>               /* enable IPI's, tlb shootdown, freezes etc */
>>               atomic_store_rel_int(&smp_started, 1);
>>       }
>>=20
>> where smp_cpus is accessed without being explicitly
>> atomic. mp_ncpus seems to have no atomic use at all.
>>=20
>> Where:
>>=20
>> /usr/src/sys/sys/smp.h:extern int smp_cpus;
>> /usr/src/sys/kern/subr_smp.c:int smp_cpus =3D 1;  /* how many cpu's =
running */
>>=20
>> So no "volatile", unlike the earlier example.
>>=20
>> /usr/src/sys/kern/kern_umtx.c:          if (smp_cpus > 1) {
>> /usr/src/sys/kern/subr_smp.c:SYSCTL_INT(_kern_smp, OID_AUTO, cpus, =
CTLFLAG_RD|CTLFLAG_CAPRD, &smp_cpus, 0,
>>=20
>> /usr/src/sys/sys/smp.h:extern int mp_ncpus;
>> /usr/src/sys/kern/subr_smp.c:int mp_ncpus;
>>=20
>>=20
>> The smp_started is not explicitly accessed as atomic
>> in release_aps but in init_secondary has its update
>> to 1 via:
>>=20
>>       mtx_lock_spin(&ap_boot_mtx);
>>=20
>>       atomic_add_rel_32(&smp_cpus, 1);
>>=20
>>       if (smp_cpus =3D=3D mp_ncpus) {
>>               /* enable IPI's, tlb shootdown, freezes etc */
>>               atomic_store_rel_int(&smp_started, 1);
>>       }
>>=20
>>       mtx_unlock_spin(&ap_boot_mtx);
>>=20
>> where:
>>=20
>> /usr/src/sys/sys/smp.h:extern volatile int smp_started;
>> /usr/src/sys/kern/subr_smp.c:volatile int smp_started;
>>=20
>> ("volatile" again for this context.)
>>=20
>> I'll also note that for the sparc64 architecture
>> there is some code like:
>>=20
>>    if (__predict_false(atomic_load_acq_int(&smp_started) =3D=3D 0))
>>=20
>> that is explicitly matched to the atomic_store_rel_int
>> in its mp_machdep.c .
>>=20
>> I do not have enough background aarch64 knowledge to
>> know if it is provable that atomic_load_acq_int is not
>> needed in some of these cases.
>>=20
>> But getting "APs not started" at least sometimes
>> suggests an intermittent failure of the code as
>> it is.
>>=20
>>=20
>> Another difference is lack of explicit initialization
>> of smp_started but explicit initialization of aps_ready
>> and smp_cpus .
>>=20
>>=20
>>=20
>> I have no clue if the boot sequence is supposed to
>> handle "APs not started" by reverting to not being
>> a symmetric multiprocessing boot or some other
>> specific way instead of trying to avoiding use of
>> what was not initialized by:
>>=20
>>       pcpup->pc_curthread =3D pcpup->pc_idlethread;
>>       pcpup->pc_curpcb =3D pcpup->pc_idlethread->td_pcb;
>>=20
>> in init_secondary.
>=20
>=20
> Example mp_ncpus and smp_cpus figures from a
> failed Pine64+ 2GB boot:
>=20
> db> print/x *smp_cpus
>       1
> db> print/x *mp_ncpus
> 138800000004
>=20
> But that should be a 4 byte width. Showing
> some context for reference:
>=20
> db> x/bx mp_ncpus-4,4=20
> rebooting:      0   0   0   0
> db> x/bx mp_ncpus,4
> mp_ncpus:       4   0   0   0
> db> x/bx mp_ncpus+4,4=20
> scsi_delay:     88  13  0   0
>=20
> For completeness:
>=20
> db> x/bx smp_cpus-4,4
> sysctl___kern_smp_disabled+0x5c:        0   0   0   0
> db> x/bx smp_cpus,4
> smp_cpus:       1   0   0   0
> db> x/bx smp_cpus+4,4=20
> smp_cpus+0x4:   0   0   0   0
>=20
> So smp_cpus was not incremented in memory. This
> goes along with no occurances of:
>=20
>       pcpup->pc_curthread =3D pcpup->pc_idlethread;
>       pcpup->pc_curpcb =3D pcpup->pc_idlethread->td_pcb;
>=20
> updates happening in init_secondary:
>=20
>        /* Spin until the BSP releases the APs */
>        while (!aps_ready)
>                __asm __volatile("wfe");
>=20
>        /* Initialize curthread */
>        KASSERT(PCPU_GET(idlethread) !=3D NULL, ("no idle thread"));
>        pcpup->pc_curthread =3D pcpup->pc_idlethread;
>        pcpup->pc_curpcb =3D pcpup->pc_idlethread->td_pcb;
> . . .
>        mtx_lock_spin(&ap_boot_mtx);
>=20
>        atomic_add_rel_32(&smp_cpus, 1);
>=20
>        if (smp_cpus =3D=3D mp_ncpus) {
>                /* enable IPI's, tlb shootdown, freezes etc */
>                atomic_store_rel_int(&smp_started, 1);
>        }
>=20
>        mtx_unlock_spin(&ap_boot_mtx);
>=20
> Which seems to imply that the aps_ready
> update:
>=20
>        atomic_store_rel_int(&aps_ready, 1);
>        /* Wake up the other CPUs */
>        __asm __volatile("sev");
>=20
> in release_aps was not seen in the:
>=20
>        while (!aps_ready)
>                __asm __volatile("wfe");
>=20
> in init_secondary.
>=20
> My guess is that "while (!aps_ready)" needs
> to be explicit about its atomic status.
> aps_ready is already volatile but apparently
> that is not enough for this context to be
> reliable.
>=20
> The other potential needs for explicit atomics
> are for later execution but may be required
> overall as well.





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?A266F7F9-3A95-4E0A-9631-569F085B97E2>