Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 13 Feb 2017 21:24:49 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        andrew@freebsd.org, Shawn Webb <shawn.webb@hardenedbsd.org>, Tom Vijlbrief <tvijlbrief@gmail.com>
Cc:        freebsd-arm <freebsd-arm@freebsd.org>
Subject:   Re: pine64 (an A64 Cortex-A53 context, e.g. -r312982): sh`forkshell child-process path after fork sometimes has a bad stack pointer value
Message-ID:  <93064627-5F72-4167-90B1-0A98ABF4C99C@dsl-only.net>
In-Reply-To: <2D04FF37-DEC8-42CE-961D-AE8CD58A0EAA@dsl-only.net>
References:  <DC3CC3BE-9D8C-41ED-ADD0-AFD4019B2E90@dsl-only.net> <2D04FF37-DEC8-42CE-961D-AE8CD58A0EAA@dsl-only.net>

next in thread | previous in thread | raw e-mail | index | archive | help
[A top post of 7 summaries of more stack pointer changes for
before fork vs. after fork.]

Unless I figure out a way to well-track what is happening
internal to fork's activity this is likely my last post
of evidence for the issue.

Based on the new test that detects both directions and
checks the parent and child cases for fork's return
value. . .

All the examples of core dumps report:

(lldb) print pid
(pid_t) $0 =3D 0

and so go down the child-process code path in sh`forkshell.
There are no examples of the parent code path cases having
any such issue.

I do not show the "print commandname" output but it varies
greatly.


I have 3 examples of the stack pointer getting smaller
instead of staying at the same value, each landed in the
potential active stack area and so did not trash other
memory areas:

(lldb) up 3
frame #3: 0x000000000040f56c sh`forkshell(jp=3D<unavailable>, =
n=3D<unavailable>, mode=3D<unavailable>) + 1080 at jobs.c:859
   856 		stack_address_before_fork =3D example_stack_address();
   857 		pid =3D fork();
   858 		stack_address_after_fork  =3D example_stack_address();
-> 859 		if (stack_address_after_fork !=3D =
stack_address_before_fork) abort();
   860 		if (pid =3D=3D -1) {
   861 			TRACE(("Fork failed, errno=3D%d\n", errno));
   862 			INTON;
(lldb) register read
General Purpose Registers:
       x19 =3D 0x0000000000000000
       x20 =3D 0x0000000040a33180
       x21 =3D 0x0000000000000000
       x22 =3D 0x0000000040abce68
       x23 =3D 0x0000000000434000  sh..bss + 6336
       x24 =3D 0x0000000000434000  sh..bss + 6336
       x25 =3D 0x0000000000000003
       x26 =3D 0x0000000040a33180
       x27 =3D 0x0000ffffffffcb08
       x28 =3D 0x0000000000000000
        fp =3D 0x0000ffffffffca70
        lr =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
        sp =3D 0x0000ffffffffc960
        pc =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
20 registers were unavailable.

(lldb) print/x stack_address_before_fork
(uintptr_t) $0 =3D 0x0000ffffffffca10
(lldb) print/x stack_address_after_fork
(uintptr_t) $1 =3D 0x0000ffffffffc940

0x0000ffffffffca10-0x0000ffffffffc940 =3D 0xD0

(lldb) up 3
frame #3: 0x000000000040f56c sh`forkshell(jp=3D<unavailable>, =
n=3D<unavailable>, mode=3D<unavailable>) + 1080 at jobs.c:859
   856 		stack_address_before_fork =3D example_stack_address();
   857 		pid =3D fork();
   858 		stack_address_after_fork  =3D example_stack_address();
-> 859 		if (stack_address_after_fork !=3D =
stack_address_before_fork) abort();
   860 		if (pid =3D=3D -1) {
   861 			TRACE(("Fork failed, errno=3D%d\n", errno));
   862 			INTON;
(lldb) register read
General Purpose Registers:
       x19 =3D 0x0000000000000000
       x20 =3D 0x0000000040a23300
       x21 =3D 0x0000000000000000
       x22 =3D 0x0000000040a50510
       x23 =3D 0x0000000000434000  sh..bss + 6336
       x24 =3D 0x0000000000434000  sh..bss + 6336
       x25 =3D 0x0000000040a23300
       x26 =3D 0x00000000ffffffff
       x27 =3D 0x0000000000434000  sh..bss + 6336
       x28 =3D 0x0000000040a50538
        fp =3D 0x0000ffffffffbdc0
        lr =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
        sp =3D 0x0000ffffffffbcb0
        pc =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
20 registers were unavailable.

(lldb) print/x stack_address_before_fork
(uintptr_t) $0 =3D 0x0000ffffffffbd60
(lldb) print/x stack_address_after_fork
(uintptr_t) $1 =3D 0x0000ffffffffbc90

0x0000ffffffffbd60-0x0000ffffffffbc90 =3D 0xD0

(No more examples of 0xD0 though.)


(lldb) up 3
frame #3: 0x000000000040f56c sh`forkshell(jp=3D<unavailable>, =
n=3D<unavailable>, mode=3D<unavailable>) + 1080 at jobs.c:859
   856 		stack_address_before_fork =3D example_stack_address();
   857 		pid =3D fork();
   858 		stack_address_after_fork  =3D example_stack_address();
-> 859 		if (stack_address_after_fork !=3D =
stack_address_before_fork) abort();
   860 		if (pid =3D=3D -1) {
   861 			TRACE(("Fork failed, errno=3D%d\n", errno));
   862 			INTON;
(lldb) register read
General Purpose Registers:
       x19 =3D 0x0000000000000000
       x20 =3D 0x0000000040a36180
       x21 =3D 0x0000000000000002
       x22 =3D 0x0000000040a53150
       x23 =3D 0x0000000000434000  sh..bss + 6336
       x24 =3D 0x0000000000434000  sh..bss + 6336
       x25 =3D 0x00000000000000fa
       x26 =3D 0x0000000040a53e40
       x27 =3D 0x0000000040a53226
       x28 =3D 0x0000000000434000  sh..bss + 6336
        fp =3D 0x0000ffffffffcf90
        lr =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
        sp =3D 0x0000ffffffffb540
        pc =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
20 registers were unavailable.

(lldb) print/x stack_address_before_fork
(uintptr_t) $0 =3D 0x0000ffffffffcf30
(lldb) print/x stack_address_after_fork
(uintptr_t) $1 =3D 0x0000ffffffffb520

0x0000ffffffffcf30-0x0000ffffffffb520 =3D 0x1A10



For the rest of the examples the stack pointer gets larger,
but by widely variable amounts:

(lldb) up 3
frame #3: 0x000000000040f56c sh`forkshell(jp=3D<unavailable>, =
n=3D<unavailable>, mode=3D<unavailable>) + 1080 at jobs.c:859
   856 		stack_address_before_fork =3D example_stack_address();
   857 		pid =3D fork();
   858 		stack_address_after_fork  =3D example_stack_address();
-> 859 		if (stack_address_after_fork !=3D =
stack_address_before_fork) abort();
   860 		if (pid =3D=3D -1) {
   861 			TRACE(("Fork failed, errno=3D%d\n", errno));
   862 			INTON;
(lldb) register read
General Purpose Registers:
       x19 =3D 0x0000000000000000
       x20 =3D 0x0000000040a350c0
       x21 =3D 0x0000000000000002
       x22 =3D 0x0000000040a49898
       x23 =3D 0x0000000000434000  sh..bss + 6336
       x24 =3D 0x0000000000434000  sh..bss + 6336
       x25 =3D 0x00000000000000fb
       x26 =3D 0x0000000040a49b90
       x27 =3D 0x0000000040a49918
       x28 =3D 0x0000000000434000  sh..bss + 6336
        fp =3D 0x0000ffffffffcee0
        lr =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
        sp =3D 0x0000ffffffffd180
        pc =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
20 registers were unavailable.

(lldb) print/x stack_address_before_fork
(uintptr_t) $0 =3D 0x0000ffffffffce80
(lldb) print/x stack_address_after_fork
(uintptr_t) $1 =3D 0x0000ffffffffd160

0x0000ffffffffd160-0x0000ffffffffce80 =3D 0x2E0

(lldb) up 3
frame #3: 0x000000000040f56c sh`forkshell(jp=3D<unavailable>, =
n=3D<unavailable>, mode=3D<unavailable>) + 1080 at jobs.c:859
   856 		stack_address_before_fork =3D example_stack_address();
   857 		pid =3D fork();
   858 		stack_address_after_fork  =3D example_stack_address();
-> 859 		if (stack_address_after_fork !=3D =
stack_address_before_fork) abort();
   860 		if (pid =3D=3D -1) {
   861 			TRACE(("Fork failed, errno=3D%d\n", errno));
   862 			INTON;
(lldb) register read
General Purpose Registers:
       x19 =3D 0x0000000000000000
       x20 =3D 0x0000000040a33180
       x21 =3D 0x0000000000000000
       x22 =3D 0x0000000040a53748
       x23 =3D 0x0000000000434000  sh..bss + 6336
       x24 =3D 0x0000000000434000  sh..bss + 6336
       x25 =3D 0x0000000040a33180
       x26 =3D 0x00000000ffffffff
       x27 =3D 0x0000000000434000  sh..bss + 6336
       x28 =3D 0x0000000040a53770
        fp =3D 0x0000ffffffffcae0
        lr =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
        sp =3D 0x0000ffffffffd5a0
        pc =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
20 registers were unavailable.

(lldb) print/x stack_address_before_fork
(uintptr_t) $0 =3D 0x0000ffffffffca80
(lldb) print/x stack_address_after_fork
(uintptr_t) $1 =3D 0x0000ffffffffd580

0x0000ffffffffd580-0x0000ffffffffca80 =3D 0xB00


(lldb) up 3
frame #3: 0x000000000040f56c sh`forkshell(jp=3D<unavailable>, =
n=3D<unavailable>, mode=3D<unavailable>) + 1080 at jobs.c:859
   856 		stack_address_before_fork =3D example_stack_address();
   857 		pid =3D fork();
   858 		stack_address_after_fork  =3D example_stack_address();
-> 859 		if (stack_address_after_fork !=3D =
stack_address_before_fork) abort();
   860 		if (pid =3D=3D -1) {
   861 			TRACE(("Fork failed, errno=3D%d\n", errno));
   862 			INTON;
(lldb) register read
General Purpose Registers:
       x19 =3D 0x0000000000000000
       x20 =3D 0x0000000040a33180
       x21 =3D 0x0000000000000000
       x22 =3D 0x0000000040a9f6b8
       x23 =3D 0x0000000000434000  sh..bss + 6336
       x24 =3D 0x0000000000434000  sh..bss + 6336
       x25 =3D 0x0000000040a33180
       x26 =3D 0x0000000000000003
       x27 =3D 0x0000000000434000  sh..bss + 6336
       x28 =3D 0x0000000040a9f6a8
        fp =3D 0x0000ffffffffc710
        lr =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
        sp =3D 0x0000ffffffffce40
        pc =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
20 registers were unavailable.

(lldb) print/x stack_address_before_fork
(uintptr_t) $0 =3D 0x0000ffffffffc6b0
(lldb) print/x stack_address_after_fork
(uintptr_t) $1 =3D 0x0000ffffffffce20

0x0000ffffffffce20-0x0000ffffffffc6b0 =3D 0x770


(lldb) up 3
frame #3: 0x000000000040f56c sh`forkshell(jp=3D<unavailable>, =
n=3D<unavailable>, mode=3D<unavailable>) + 1080 at jobs.c:859
   856 		stack_address_before_fork =3D example_stack_address();
   857 		pid =3D fork();
   858 		stack_address_after_fork  =3D example_stack_address();
-> 859 		if (stack_address_after_fork !=3D =
stack_address_before_fork) abort();
   860 		if (pid =3D=3D -1) {
   861 			TRACE(("Fork failed, errno=3D%d\n", errno));
   862 			INTON;
(lldb) register read
General Purpose Registers:
       x19 =3D 0x0000000000000000
       x20 =3D 0x0000000040a33180
       x21 =3D 0x0000000000000000
       x22 =3D 0x0000000040a9be28
       x23 =3D 0x0000000000434000  sh..bss + 6336
       x24 =3D 0x0000000000434000  sh..bss + 6336
       x25 =3D 0x0000000040a33180
       x26 =3D 0x0000000000000003
       x27 =3D 0x0000000000434000  sh..bss + 6336
       x28 =3D 0x0000000040a9bdb0
        fp =3D 0x0000ffffffffca50
        lr =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
        sp =3D 0x0000ffffffffd8a0
        pc =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
20 registers were unavailable.

(lldb) print/x stack_address_before_fork
(uintptr_t) $2 =3D 0x0000ffffffffc9f0
(lldb) print/x stack_address_after_fork
(uintptr_t) $3 =3D 0x0000ffffffffd880

0x0000ffffffffd880-0x0000ffffffffc9f0 =3D 0xE90


That is it for evidence, at least for now.

=3D=3D=3D
Mark Millard
markmi at dsl-only.net

On 2017-Feb-12, at 6:46 PM, Mark Millard <markmi atdsl-only.net> wrote:

> On 2017-Feb-12, at 2:52 AM, Mark Millard <markmi at dsl-only.net> =
wrote:
>=20
>> On pine64 (an A64 Cortex-A53 context) multiple people on the lists
>> including me have reported sh getting occasional core dumps.
>>=20
>> I've analyzed a bunch of the sh core dumps and all failed in the
>> child-process path of forkshell when forkshell tried to return.=20
>> I've since done experiments with code to detect some forms of
>> odd stack pointer values so that the adjusted code calls abort
>> for such a detection before such a return would happen. [This
>> gives a nicer context to look at in core dumps (before things
>> are very messed up if the sp is bad).]
>>=20
>> In sh`forkshell, just after the fork returns, on the child-process
>> path there is sometimes a messed up sp value by what direction
>> it is from the prior frame-pointers on the stack --and on occasion
>> the value difference is very large, such as:
>> (from: lldb register read on the frame with the pc in sh`forkshell )
>>=20
>>       fp =3D 0x0000ffffffffce90
>>       sp =3D 0x0000ffffffffe980
>>=20
>> This has the sp with a larger address than what sh`__start
>> stored as the frame-pointer back-link when it is put to use via
>> ld-elf.so.1`.rtld_start (more like 0x0000ffffffffde10 as I
>> remember): outside the active stack region.
>>=20
>> [Note: my experiments so far would not establish if the sp
>> might sometimes have an unexpectedly large distance toward
>> lower memory addresses, specially if it was still in the
>> potential stack-region. It may be that both directions
>> happen.]
>>=20
>> The distance when it fails is vary variable across examples.
>> I just picked an example were stack frames would be written
>> over the top of other material when sh`forkshell makes other
>> calls on the child-process path, material that would be
>> outside what should be the active stack region.
>>=20
>> # uname -apKU
>> FreeBSD pine64 12.0-CURRENT FreeBSD 12.0-CURRENT  r312982M  arm64 =
aarch64 1200020 1200020
>>=20
>> (I've frozen at that version for this exploration.
>> It has taken me a while.)
>>=20
>> Looking around I see what might be a few possibilities. . .
>> (I'm no expert so some might be trivially eliminated.)
>>=20
>>=20
>> Possibility #0 (possibilities in no particular order):
>>=20
>> sys/arm64/arm64/vm_machdep.c :
>>=20
>> In cpu_fork what if the bcopy of td1-td_frame might not
>> always have access to the latest updated values, needing
>> some form of memory "fence" to be sure that such values are
>> accessible? :
>>=20
>>       tf =3D (struct trapframe *)STACKALIGN((struct trapframe *)pcb2 =
- 1);
>>       bcopy(td1->td_frame, tf, sizeof(*tf));
>>       tf->tf_x[0] =3D 0;
>>       tf->tf_x[1] =3D 0;
>>       tf->tf_spsr =3D 0;
>>=20
>>       td2->td_frame =3D tf;
>>=20
>>       /* Set the return value registers for fork() */
>>       td2->td_pcb->pcb_x[8] =3D (uintptr_t)fork_return;
>>       td2->td_pcb->pcb_x[9] =3D (uintptr_t)td2;
>>       td2->td_pcb->pcb_x[PCB_LR] =3D (uintptr_t)fork_trampoline;
>>       td2->td_pcb->pcb_sp =3D (uintptr_t)td2->td_frame;
>>       td2->td_pcb->pcb_fpusaved =3D &td2->td_pcb->pcb_fpustate;
>>       td2->td_pcb->pcb_vfpcpu =3D UINT_MAX;
>>=20
>>       /* Setup to release spin count in fork_exit(). */
>>       td2->td_md.md_spinlock_count =3D 1;
>>       td2->td_md.md_saved_daif =3D 0;
>>=20
>>=20
>> Possibility #1:
>>=20
>> sys/arm64/arm64/swtch.S :
>>=20
>> ENTRY(fork_trampoline)
>> . . .
>>       /* Restore sp and lr */
>>       ldp     x0, x1, [sp]
>>       msr     sp_el0, x0
>>       mov     lr, x1
>>=20
>> Similar point to #0 but for the ldp memory accesses
>> shown.
>>=20
>>=20
>> Possibility #3:
>>=20
>> sys/arm64/arm64/exception.S :
>>=20
>> Both of:
>>=20
>> handle_el0_sync
>> handle_el0_irq
>>=20
>> also update sp_el0 and so if any such can happen
>> during any part of fork_trampoline after its
>> "msr sp_el0, x0" but before its "msr daifset, #2"
>> (disabling interrupts), then the wrong sp_el0 value
>> would be in place at fork_tramploine's eret .
>>=20
>>=20
>> It will be interesting to see what the problem actually
>> was once it has been fixed.
>=20
> I have updated to a test that should report any time
> fork changes the stack pointer from what it was before
> the fork, parent or child process:
>=20
> # svnlite diff /usr/src/bin/sh/miscbltin.c /usr/src/bin/sh/jobs.c
> Index: /usr/src/bin/sh/miscbltin.c
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> --- /usr/src/bin/sh/miscbltin.c (revision 312982)
> +++ /usr/src/bin/sh/miscbltin.c (working copy)
> @@ -64,6 +64,15 @@
>=20
> #undef eflag
>=20
> +
> +/* JUST FOR TESTING */
> +uintptr_t example_stack_address(void)
> +{
> +    volatile uintptr_t test =3D 0;
> +    return (uintptr_t)(void*)&test;
> +}
> +
> +
> int readcmd(int, char **);
> int umaskcmd(int, char **);
> int ulimitcmd(int, char **);
> Index: /usr/src/bin/sh/jobs.c
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> --- /usr/src/bin/sh/jobs.c      (revision 312982)
> +++ /usr/src/bin/sh/jobs.c      (working copy)
> @@ -51,6 +51,9 @@
> #include <stdlib.h>
> #include <unistd.h>
>=20
> +/* JUST FOR TESTING */
> +#include <stdint.h>
> +
> #include "shell.h"
> #if JOBS
> #include <termios.h>
> @@ -833,6 +836,11 @@
>  * in a pipeline).
>  */
>=20
> +extern uintptr_t example_stack_address(void);
> +
> +uintptr_t stack_address_before_fork =3D 0;
> +uintptr_t stack_address_after_fork =3D 0;
> +
> pid_t
> forkshell(struct job *jp, union node *n, int mode)
> {
> @@ -845,7 +853,10 @@
>        if (mode =3D=3D FORK_BG && (jp =3D=3D NULL || jp->nprocs =3D=3D =
0))
>                checkzombies();
>        flushall();
> +       stack_address_before_fork =3D example_stack_address();
>        pid =3D fork();
> +       stack_address_after_fork  =3D example_stack_address();
> +       if (stack_address_after_fork !=3D stack_address_before_fork) =
abort();
>        if (pid =3D=3D -1) {
>                TRACE(("Fork failed, errno=3D%d\n", errno));
>                INTON;
> @@ -946,7 +957,6 @@
>        return pid;
> }
>=20
> -
> pid_t
> vforkexecshell(struct job *jp, char **argv, char **envp, const char =
*path, int idx, int pip[2])
> {
>=20
>=20
> I've been using repeated attempts to build devel/aarch64-none-elf-gcc
> for testing because the configure activity does lots of shell work
> that forks. (I first noticed the problem in such a context.)
>=20
> The first example core file from repeated attempts to build
> devel/aarch64-none-elf-gcc with this new test shows:
>=20
> (lldb) bt
> * thread #1: tid =3D 100188, 0x0000000040554e54 libc.so.7`_thr_kill + =
8, name =3D 'sh', stop reason =3D signal SIGABRT
>  * frame #0: 0x0000000040554e54 libc.so.7`_thr_kill + 8
>    frame #1: 0x0000000040554e18 libc.so.7`__raise(s=3D6) + 64 at =
raise.c:52
>    frame #2: 0x0000000040554d8c libc.so.7`abort + 84 at abort.c:65
>    frame #3: 0x000000000040f56c sh`forkshell(jp=3D<unavailable>, =
n=3D<unavailable>, mode=3D<unavailable>) + 1080 at jobs.c:859
> (lldb) up 3
> frame #3: 0x000000000040f56c sh`forkshell(jp=3D<unavailable>, =
n=3D<unavailable>, mode=3D<unavailable>) + 1080 at jobs.c:859
>   856 		stack_address_before_fork =3D =
example_stack_address();
>   857 		pid =3D fork();
>   858 		stack_address_after_fork  =3D =
example_stack_address();
> -> 859 		if (stack_address_after_fork !=3D =
stack_address_before_fork) abort();
>   860 		if (pid =3D=3D -1) {
>   861 			TRACE(("Fork failed, errno=3D%d\n", =
errno));
>   862 			INTON;
>=20
> (lldb) register read
> General Purpose Registers:
>       x19 =3D 0x0000000000000000
>       x20 =3D 0x0000000040a350c0
>       x21 =3D 0x0000000000000002
>       x22 =3D 0x0000000040a49898
>       x23 =3D 0x0000000000434000  sh..bss + 6336
>       x24 =3D 0x0000000000434000  sh..bss + 6336
>       x25 =3D 0x00000000000000fb
>       x26 =3D 0x0000000040a49b90
>       x27 =3D 0x0000000040a49918
>       x28 =3D 0x0000000000434000  sh..bss + 6336
>        fp =3D 0x0000ffffffffcee0
>        lr =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
>        sp =3D 0x0000ffffffffd180
>        pc =3D 0x000000000040f56c  sh`forkshell + 1080 at jobs.c:862
> 20 registers were unavailable.
>=20
> (lldb) print/x stack_address_before_fork
> (uintptr_t) $0 =3D 0x0000ffffffffce80
>=20
> (lldb) print/x stack_address_after_fork
> (uintptr_t) $1 =3D 0x0000ffffffffd160
>=20
> (lldb) print/x main_handler
> (jmploc) $2 =3D {
>  loc =3D {
>    [0] =3D {
>      _jb =3D {
>        [0] =3D 0x0000ffffffffdb60fb5d25837d7ff700
>        [1] =3D 0x00000000000000170000ffffffffdc08
>        [2] =3D 0x00000000004320380000000000434a10
>        [3] =3D 0x00000000000000000000000000000000
>        [4] =3D 0x00000000000000000000000000000000
>        [5] =3D 0x00000000000000000000000000000000
>        [6] =3D 0x0000000000410c800000ffffffffdbb0
> . . .
>=20
> (main_handler is from setjmp in main.)
>=20
> In this example of the context for sh`forkshell after the fork:
>=20
> fp =3D 0x0000ffffffffcee0 < sp =3D 0x0000ffffffffd180 < =
0x0000ffffffffdbb0
>=20
> That explains why the bt stopped with frame #3: part of the stack
> had been trashed by the calls to example_stack_address and to abort
> (and what its internal call chain involves), unlike the prior example
> that trashed memory outside the active stack region.
>=20
> Compared to the prior example's sp =3D 0x0000ffffffffe980 the
> distance from the frame pointer in sh`forkshell is much smaller this
> time --but it is still in the same (wrong) direction.
>=20
> This illustrates the variability in the bad sp value from bad-case
> to bad-case.
>=20
> I plan to leave the repeated builds running for a time to accumulate
> some more core files based on this new test. If any examples happen
> of sp decreasing instead of increasing, the code should core dump for
> those as well, unlike my prior testing.
>=20
>=20
>=20
> Context details:
>=20
> # more /usr/src/sys/arm64/conf/GENERIC-NODBG
> #
> # GENERIC -- Custom configuration for the arm64/aarch64
> #
>=20
> include "GENERIC"
>=20
> ident   GENERIC-NODBG
>=20
> makeoptions     DEBUG=3D-g                # Build kernel with gdb(1) =
debug symbols
>=20
> options         ALT_BREAK_TO_DEBUGGER
>=20
> options         KDB                     # Enable kernel debugger =
support
>=20
> # For minimum debugger support (stable branch) use:
> #options        KDB_TRACE               # Print a stack trace for a =
panic
> options         DDB                     # Enable the kernel debugger
>=20
> # Extra stuff:
> #options        VERBOSE_SYSINIT         # Enable verbose sysinit =
messages
> #options        BOOTVERBOSE=3D1
> #options        BOOTHOWTO=3DRB_VERBOSE
> #options        KTR
> #options        KTR_MASK=3DKTR_TRAP
> ##options       KTR_CPUMASK=3D0xF
> #options        KTR_VERBOSE
>=20
> # Disable any extra checking for. . .
> nooptions       DEADLKRES               # Enable the deadlock resolver
> nooptions       INVARIANTS              # Enable calls of extra sanity =
checking
> nooptions       INVARIANT_SUPPORT       # Extra sanity checks of =
internal structures, required by INVARIANTS
> nooptions       WITNESS                 # Enable checks to detect =
deadlocks and cycles
> nooptions       WITNESS_SKIPSPIN        # Don't run witness on =
spinlocks for speed
> nooptions       DIAGNOSTIC
> nooptions       MALLOC_DEBUG_MAXZONES   # Separate malloc(9) zones
>=20
>=20
> The -r312982 variant is a cross build with -mcpu=3Dcortex-a53
> specified:
>=20
> # more ~/src.configs/src.conf.pine64-clang-bootstrap.amd64-host=20
> TO_TYPE=3Daarch64
> TOOLS_TO_TYPE=3D${TO_TYPE}
> #
> KERNCONF=3DGENERIC-NODBG
> TARGET=3Darm64
> .if ${.MAKE.LEVEL} =3D=3D 0
> TARGET_ARCH=3D${TO_TYPE}
> .export TARGET_ARCH
> .endif
> #
> WITH_CROSS_COMPILER=3D
> WITHOUT_SYSTEM_COMPILER=3D
> #
> WITH_LIBCPLUSPLUS=3D
> WITHOUT_BINUTILS_BOOTSTRAP=3D
> WITHOUT_ELFTOOLCHAIN_BOOTSTRAP=3D
> WITH_CLANG_BOOTSTRAP=3D
> WITH_CLANG=3D
> WITH_CLANG_IS_CC=3D
> WITH_CLANG_FULL=3D
> WITH_CLANG_EXTRAS=3D
> WITH_LLD=3D
> WITH_LLDB=3D
> #
> WITH_BOOT=3D
> WITHOUT_LIB32=3D
> WITHOUT_LIBSOFT=3D
> #
> WITHOUT_GCC_BOOTSTRAP=3D
> WITHOUT_GCC=3D
> WITHOUT_GCC_IS_CC=3D
> WITHOUT_GNUCXX=3D
> #
> NO_WERROR=3D
> #WERROR=3D
> MALLOC_PRODUCTION=3D
> #
> WITH_REPRODUCIBLE_BUILD=3D
> WITH_DEBUG_FILES=3D
> #
> CROSS_BINUTILS_PREFIX=3D/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
> XCFLAGS+=3D -B${CROSS_BINUTILS_PREFIX}
> XCXXFLAGS+=3D -B${CROSS_BINUTILS_PREFIX}
> # There is no XCPPFLAGS but XCPP gets XCFLAGS content.
> #
> XCFLAGS+=3D -mcpu=3Dcortex-a53
> XCXXFLAGS+=3D -mcpu=3Dcortex-a53
> # There is no XCPPFLAGS but XCPP gets XCFLAGS content.
>=20
>=20
> In order to cut down on the Spurious interrupt notices on the
> console I eventually did the below personal adjustment specific
> to my context (and it is in use above):
>=20
> # svnlite diff /usr/src/sys/arm/arm/gic.c
> Index: /usr/src/sys/arm/arm/gic.c
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> --- /usr/src/sys/arm/arm/gic.c  (revision 312982)
> +++ /usr/src/sys/arm/arm/gic.c  (working copy)
> @@ -672,9 +672,13 @@
>=20
>        if (irq >=3D sc->nirqs) {
> #ifdef GIC_DEBUG_SPURIOUS
> +#define EXPECTED_SPURIOUS_IRQ 1023
> +           if (irq !=3D EXPECTED_SPURIOUS_IRQ) {
>                device_printf(sc->gic_dev,
> -                   "Spurious interrupt detected: last irq: %d on =
CPU%d\n",
> +                   "Spurious interrupt %d detected of %d: last irq: =
%d on CPU%d\n",
> +                   irq, sc->nirqs,
>                    sc->last_irq[PCPU_GET(cpuid)], PCPU_GET(cpuid));
> +            }
> #endif
>                return (FILTER_HANDLED);
>        }
> @@ -720,6 +724,16 @@
>        if (irq < sc->nirqs)
>                goto dispatch_irq;
>=20
> +       if (irq !=3D EXPECTED_SPURIOUS_IRQ) {
> +#undef EXPECTED_SPURIOUS_IRQ
> +#ifdef GIC_DEBUG_SPURIOUS
> +               device_printf(sc->gic_dev,
> +                   "Spurious end interrupt %d detected of %d: last =
irq: %d on CPU%d\n",
> +                   irq, sc->nirqs,
> +                   sc->last_irq[PCPU_GET(cpuid)], PCPU_GET(cpuid));
> +#endif
> +       }
> +
>        return (FILTER_HANDLED);
> }
>=20
> The added message block was to report unexpected values for the
> exit path. None have been reported from there.
>=20
> The gic.c changes eliminated all the Spurious interrupt notices =
--which
> made the console more reasonable to use.
>=20
> But so far I've not come up with a way to discover what leads
> to the irq=3D=3D1023 reports for the original message context.
> [Interrupts handled by one core but sent to multiple cores?]
>=20
>=20
> Other notes:
>=20
> I switched to Andrew Turner's freebsd Email address. It might be more
> appropriate. I also fixed the mistyped "Contex-a53" in the Subject to
> say "Cortex-a53".





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?93064627-5F72-4167-90B1-0A98ABF4C99C>