From: Mark Millard
Subject: Re: FYI: Pine64+ 2GB (so A64) booting and non-debug vs. debug kernel: "APs not started" for failure cases only, possible missing atomic_load_acq_int's?
Date: Sat, 16 Sep 2017 15:17:25 -0700
To: Emmanuel Vadot, freebsd-arm, freebsd-hackers

A new finding:

When verbose boot messages are enabled, there is an earlier
contrast between a boot that works overall and one that later
fails.

When it works:

subsystem f000000
   release_aps(0)... Release APs
   done.

When it fails:

subsystem f000000
   release_aps(0)... Release APs
APs not started
   done.

This explains well why ->pc_curthread ends up NULL for
secondaries (in particular cpu == 1): init_secondary never
executed the assignments shown below:

	while (!aps_ready)
		__asm __volatile("wfe");

	/* Initialize curthread */
	KASSERT(PCPU_GET(idlethread) != NULL, ("no idle thread"));
	pcpup->pc_curthread = pcpup->pc_idlethread;
	pcpup->pc_curpcb = pcpup->pc_idlethread->td_pcb;

The "Release APs" and "APs not started" messages are from:

static void
release_aps(void *dummy __unused)
{
	int i;

	/* Only release CPUs if they exist */
	if (mp_ncpus == 1)
		return;

	intr_pic_ipi_setup(IPI_AST, "ast", ipi_ast, NULL);
	intr_pic_ipi_setup(IPI_PREEMPT, "preempt", ipi_preempt, NULL);
	intr_pic_ipi_setup(IPI_RENDEZVOUS, "rendezvous", ipi_rendezvous, NULL);
	intr_pic_ipi_setup(IPI_STOP, "stop", ipi_stop, NULL);
	intr_pic_ipi_setup(IPI_STOP_HARD, "stop hard", ipi_stop, NULL);
	intr_pic_ipi_setup(IPI_HARDCLOCK, "hardclock", ipi_hardclock, NULL);

	atomic_store_rel_int(&aps_ready, 1);
	/* Wake up the other CPUs */
	__asm __volatile("sev");

	printf("Release APs\n");

	for (i = 0; i < 2000; i++) {
		if (smp_started)
			return;
		DELAY(1000);
	}

	printf("APs not started\n");
}
SYSINIT(start_aps, SI_SUB_SMP, SI_ORDER_FIRST, release_aps, NULL);

init_secondary has an example or two of reading a variable
without atomic_load_acq_int even though the matching write
uses atomic_store_rel_int. One is:

	while (!aps_ready)
		__asm __volatile("wfe");

	/* Initialize curthread */
	KASSERT(PCPU_GET(idlethread) != NULL, ("no idle thread"));
	pcpup->pc_curthread = pcpup->pc_idlethread;
	pcpup->pc_curpcb = pcpup->pc_idlethread->td_pcb;

where aps_ready is declared via:

/* Set to 1 once we're ready to let the APs out of the pen. */
volatile int aps_ready = 0;

and release_aps stores to it with atomic_store_rel_int:

	atomic_store_rel_int(&aps_ready, 1);
	/* Wake up the other CPUs */
	__asm __volatile("sev");

Also in init_secondary there is:

	atomic_add_rel_32(&smp_cpus, 1);

	if (smp_cpus == mp_ncpus) {
		/* enable IPI's, tlb shootdown, freezes etc */
		atomic_store_rel_int(&smp_started, 1);
	}

where the comparison reads smp_cpus with a plain access that
is not explicitly atomic. mp_ncpus seems to have no atomic
use at all. Where:

/usr/src/sys/sys/smp.h:extern int smp_cpus;
/usr/src/sys/kern/subr_smp.c:int smp_cpus = 1;  /* how many cpu's running */

So no "volatile", unlike the earlier example.
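
For reference, the acquire/release pairing in question for the
aps_ready case can be sketched in portable userland C11. This is
only an analogy, not kernel code: memory_order_release and
memory_order_acquire stand in for atomic_store_rel_int and
atomic_load_acq_int, a pthread stands in for a secondary CPU, and
the names ready_flag, payload, waiter and run_handoff are made up
for illustration. It shows the pattern only; it does not show that
the plain loads in init_secondary are what causes "APs not started".

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userland stand-ins; these names are not from the kernel. */
static atomic_int ready_flag;	/* plays the role of aps_ready */
static int payload;		/* data written before the flag is set */

/* Waiter: plays the role of a secondary CPU waiting in init_secondary. */
static void *
waiter(void *arg)
{
	int *seen = arg;

	assert(seen != NULL);

	/* Acquire load: pairs with the release store in run_handoff(). */
	while (atomic_load_explicit(&ready_flag, memory_order_acquire) == 0)
		;	/* busy-wait; the kernel loop uses wfe instead */

	/* Ordered after the acquire load, so the payload write is visible. */
	*seen = payload;
	return (NULL);
}

/* Publisher: plays the role of release_aps. Returns what the waiter saw. */
static int
run_handoff(int value)
{
	pthread_t t;
	int seen = -1;

	atomic_store_explicit(&ready_flag, 0, memory_order_relaxed);
	if (pthread_create(&t, NULL, waiter, &seen) != 0)
		return (-1);

	payload = value;	/* plain write, published by the store below */
	atomic_store_explicit(&ready_flag, 1, memory_order_release);

	if (pthread_join(t, NULL) != 0)
		return (-1);
	return (seen);
}
```

Calling run_handoff(42) from a test program should return 42: the
waiter only reads payload after its acquire load has observed the
release store of the flag.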

/usr/src/sys/kern/kern_umtx.c:          if (smp_cpus > 1) {
/usr/src/sys/kern/subr_smp.c:SYSCTL_INT(_kern_smp, OID_AUTO, cpus, CTLFLAG_RD|CTLFLAG_CAPRD, &smp_cpus, 0,

/usr/src/sys/sys/smp.h:extern int mp_ncpus;
/usr/src/sys/kern/subr_smp.c:int mp_ncpus;

smp_started is not explicitly accessed atomically in
release_aps, but init_secondary sets it to 1 via:

	mtx_lock_spin(&ap_boot_mtx);

	atomic_add_rel_32(&smp_cpus, 1);

	if (smp_cpus == mp_ncpus) {
		/* enable IPI's, tlb shootdown, freezes etc */
		atomic_store_rel_int(&smp_started, 1);
	}

	mtx_unlock_spin(&ap_boot_mtx);

where:

/usr/src/sys/sys/smp.h:extern volatile int smp_started;
/usr/src/sys/kern/subr_smp.c:volatile int smp_started;

("volatile" again for this context.)

I'll also note that for the sparc64 architecture there is
some code like:

	if (__predict_false(atomic_load_acq_int(&smp_started) == 0))

that is explicitly matched to the atomic_store_rel_int in
its mp_machdep.c.

I do not have enough aarch64 background knowledge to know
whether it is provable that atomic_load_acq_int is not needed
in some of these cases. But getting "APs not started" at
least sometimes suggests an intermittent failure of the code
as it is.

Another difference is the lack of explicit initialization of
smp_started, in contrast to the explicit initialization of
aps_ready and smp_cpus.

I have no clue if the boot sequence is supposed to handle
"APs not started" by reverting to not being a symmetric
multiprocessing boot, or in some other specific way, instead
of trying to avoid use of what was not initialized by:

	pcpup->pc_curthread = pcpup->pc_idlethread;
	pcpup->pc_curpcb = pcpup->pc_idlethread->td_pcb;

in init_secondary.

===
Mark Millard
markmi at dsl-only.net