Date:      Fri, 26 Sep 2014 23:55:59 -0700
From:      Mark Millard <markmi@dsl-only.net>
To:        Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        Justin Hibbits <chmeeedalf@gmail.com>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: lr=u_trap+0x10 and srr0=k_trap+0x28 for "stopped at 0 illegal instruction 0" before-copyright hang on PowerMac G5's
Message-ID:  <2E98A886-36FF-4B68-B729-F2143339E1DE@dsl-only.net>
In-Reply-To: <7008CDAA-2DA2-419F-9BEC-AD823ECBFCCC@dsl-only.net>
References:  <1118046C-0FF7-49FC-82DA-DB9A7A310991@dsl-only.net> <2ED3DB50-B985-4382-8FF2-3B44E7E65453@dsl-only.net> <CAHSQbTBXxrgXQdNeCs=C5wJaT_bmh9FU836O6VnJDbJuqCUujw@mail.gmail.com> <6D729F43-662A-429E-9503-0148EC3250B1@dsl-only.net> <72535F89-3942-45A6-B351-7F746209ED9F@dsl-only.net> <0703EF26-6E33-4446-9273-BBFD0CB72893@dsl-only.net> <37575F94-763C-43BF-8DD9-F648F4A7C09F@dsl-only.net> <5422E513.6010806@freebsd.org> <1C02D0D4-14B8-465F-B493-4D3A64E4C35C@dsl-only.net> <FDA60573-A9BD-4793-8273-22E960805CFF@dsl-only.net> <0DF8A9EC-C81C-4E15-9420-6831BA7D5F8E@dsl-only.net> <54248467.4050900@freebsd.org> <D7D2B0EF-7EEB-48E7-9485-990BACC709EE@dsl-only.net> <34AA4542-56A7-453E-A00E-868EE352C96C@dsl-only.net> <7008CDAA-2DA2-419F-9BEC-AD823ECBFCCC@dsl-only.net>

According to my adjusted dumping: At the "before Copyright"/ofwcall-for-peer crash ofw_real_mode==0.

And that does turn off exception vector save/restore:

__inline void
ofw_save_trap_vec(char *save_trap_vec)
{
        if (!ofw_real_mode)
                return;

        bcopy((void *)EXC_RST, save_trap_vec, EXC_LAST - EXC_RST);
}

static __inline void
ofw_restore_trap_vec(char *restore_trap_vec)
{
        if (!ofw_real_mode)
                return;

        bcopy(restore_trap_vec, (void *)EXC_RST, EXC_LAST - EXC_RST);
        __syncicache(EXC_RSVD, EXC_LAST - EXC_RSVD);
}

So now it is clear to me how FreeBSD's exception vectors could be involved in a context that does not have FreeBSD's environment in place. (Finally!)
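
To make the effect of that guard concrete, here is a small user-space sketch (not kernel code). All sim_* names and the 16-byte "vector area" are made up for illustration, and it assumes the initial save is made while the firmware's vectors are still installed. With the flag set to 0 both helpers are no-ops, so the "firmware" runs with the kernel's vectors still in place.

```c
#include <string.h>

/* Hypothetical stand-ins; the size and names are illustrative only. */
#define SIM_VEC_SIZE 16

static int sim_ofw_real_mode;            /* plays the role of ofw_real_mode */
static char sim_vectors[SIM_VEC_SIZE];   /* pretend exception vector area */

static void
sim_save_trap_vec(char *save)
{
	if (!sim_ofw_real_mode)
		return;                  /* no-op when not in real mode */
	memcpy(save, sim_vectors, SIM_VEC_SIZE);
}

static void
sim_restore_trap_vec(const char *restore)
{
	if (!sim_ofw_real_mode)
		return;                  /* no-op when not in real mode */
	memcpy(sim_vectors, restore, SIM_VEC_SIZE);
}

/*
 * Returns 'F' if the simulated firmware call would run with the firmware's
 * vectors installed, or 'K' if the kernel's vectors are still in place.
 */
static int
sim_vectors_during_ofwcall(int real_mode)
{
	char save_trap_init[SIM_VEC_SIZE] = { 0 };
	char save_trap_of[SIM_VEC_SIZE] = { 0 };
	char seen;

	sim_ofw_real_mode = real_mode;

	/* Early boot (assumed): firmware vectors present, initial save made. */
	memset(sim_vectors, 'F', SIM_VEC_SIZE);
	sim_save_trap_vec(save_trap_init);

	/* Later: the kernel installs its own vectors. */
	memset(sim_vectors, 'K', SIM_VEC_SIZE);

	/* Around the firmware call: swap in the saved vectors, then swap back. */
	sim_save_trap_vec(save_trap_of);
	sim_restore_trap_vec(save_trap_init);
	seen = sim_vectors[0];           /* what a trap inside firmware would hit */
	sim_restore_trap_vec(save_trap_of);

	return seen;
}
```

Calling sim_vectors_during_ofwcall(1) models the real-mode case and sim_vectors_during_ofwcall(0) models the ofw_real_mode==0 case reported above.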

For powerpc64/GENERIC64 it should also then establish OFW_STD_32BIT:

boolean_t
OF_bootstrap()
{
        boolean_t status = FALSE;

        if (openfirmware_entry != NULL) {
                if (ofw_real_mode) {
                        status = OF_install(OFW_STD_REAL, 0);
                } else {
                        #ifdef __powerpc64__
                        status = OF_install(OFW_STD_32BIT, 0);
                        #else
                        status = OF_install(OFW_STD_DIRECT, 0);
                        #endif
                        #endif
                }

This seems to be like OFW_STD_REAL in what it sets up: ofw_real_methods.

static ofw_def_t ofw_real = {
        OFW_STD_REAL,
        ofw_real_methods,
        0
};
OFW_DEF(ofw_real);

static ofw_def_t ofw_32bit = {
        OFW_STD_32BIT,
        ofw_real_methods,
        0
};
OFW_DEF(ofw_32bit);

ofw_real_mode is used to figure out the context when it matters from what I can tell so far.
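
The selection logic and the shared method table can be summarized as a small decision function. This is only a sketch: the enum names are hypothetical stand-ins for OFW_STD_REAL, OFW_STD_32BIT and OFW_STD_DIRECT, and the assumption that the direct interface has its own, separate method table is mine (the excerpts above only show that the real and 32-bit definitions both use ofw_real_methods).

```c
/* Hypothetical labels standing in for the interface names quoted above. */
enum sim_iface { SIM_STD_DIRECT, SIM_STD_REAL, SIM_STD_32BIT };

/* Hypothetical labels for the method tables behind each interface. */
enum sim_methods { SIM_DIRECT_METHODS, SIM_REAL_METHODS };

/* Mirrors the if/#ifdef structure of the OF_bootstrap excerpt. */
static enum sim_iface
sim_pick_interface(int real_mode, int is_powerpc64)
{
	if (real_mode)
		return SIM_STD_REAL;
	return is_powerpc64 ? SIM_STD_32BIT : SIM_STD_DIRECT;
}

/*
 * Both the real and the 32-bit definitions point at the same method table;
 * a separate table for the direct case is an assumption of this sketch.
 */
static enum sim_methods
sim_methods_for(enum sim_iface iface)
{
	return iface == SIM_STD_DIRECT ? SIM_DIRECT_METHODS : SIM_REAL_METHODS;
}
```

For the powerpc64 case with ofw_real_mode==0, sim_pick_interface(0, 1) yields the 32-bit interface, which still maps to the real-mode method table.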


Just to be sure, as an experiment I temporarily hacked ofw_save_trap_vec and ofw_restore_trap_vec to ignore ofw_real_mode, so that they would actually swap the exception vectors.

As I guessed, it still hangs before the copyright notice. (It does not get to DDB, so no dump information is displayed.)






===
Mark Millard
markmi at dsl-only.net

On Sep 26, 2014, at 10:18 PM, Mark Millard <markmi at dsl-only.net> wrote:

The first send of this was big enough for the moderator to be involved. So I canceled and am sending with less history included.

[I'll note that I seem to have trouble typing 0xdbb290 vs. 0xbdd290. The actual value is 0xdbb290. The references to the incorrect typing should say 0xbdd290, which is the wrong value. But I've had both types of references listing the wrong text... in various notes.]

===
Mark Millard
markmi@dsl-only.net

On Sep 26, 2014, at 10:11 PM, Mark Millard <markmi@dsl-only.net> wrote:

The openfirmware peer crash (i.e., the before-Copyright-notice crash) happens during/just after the MMU setup, and the peer ofwcall is the first ofwcall where pmap_bootstrapped is non-zero at the time. In other words: the very first ofwcall in the new context fails.

And this failure involves some of the same code area that I got a backtrace for and reported as a separate crash (with the trace listed). As a reminder, here is that backtrace, which has a different failure point:

.pvo_vaddr_compare+0x14, instruction ld r0, r4, 0x58 [or ld r0,88(r4) in an alternate notation]
.pvo_tree_RB_FIND+0x38
.moea64_dev_direct_mapped+0x90
.pmap_dev_direct_mapped+0x84 ("_dev" was missing in earlier note)
.bs_remap_earlyboot+0x6c
.moea64_late_bootstrap+0x178
.moea64_bootstrap_native+0x120
.pmap_bootstrap+0xac
.powerpc_init+0x514
btext+0xa8

As for the sequence of ofwcall's that I reported: starting at the last OF_finddevice before the OF_instance_to_package that I reported in the sequence of ofwcall's from quiesce until the crash...

moea64_late_bootstrap does

        chosen = OF_finddevice("/chosen");
        if (chosen != -1 && OF_getprop(chosen, "mmu", &mmui, 4) != -1) {
            mmu = OF_instance_to_package(mmui);
            if (mmu == -1 || (sz = OF_getproplen(mmu, "translations")) == -1)
                sz = 0;
            if (sz > 6144 /* tmpstksz - 2 KB headroom */)
                panic("moea64_bootstrap: too many ofw translations");

            if (sz > 0)
                moea64_add_ofw_mappings(mmup, mmu, sz);
        }

with moea64_add_ofw_mappings called. Then...

moea64_add_ofw_mappings does...

        bzero(translations, sz);
        OF_getprop(OF_finddevice("/"), "#address-cells", &acells,
            sizeof(acells));
        if (OF_getprop(mmu, "translations", trans_cells, sz) == -1)
                panic("moea64_bootstrap: can't get ofw translations");

And it is the next ofwcall after that last OF_getprop that fails. (It happens to be a peer request.) Adding a dump of the pmap_bootstrapped value with the ofwcall name in my hack for reporting things about the crash confirmed that peer ofwcall as the first with pmap_bootstrapped non-zero.

I will note here that it is somewhat later than the above code that pvo_vaddr_compare ends up executing via bs_remap_earlyboot. That earlier moea64_late_bootstrap code continues after the } from the first if above with:

        /*
         * Calculate the last available physical address.
         */
        for (i = 0; phys_avail[i + 2] != 0; i += 2)
                ;
        Maxmem = powerpc_btop(phys_avail[i + 1]);

        /*
         * Initialize MMU and remap early physical mappings
         */
        MMU_CPU_BOOTSTRAP(mmup,0);
        mtmsr(mfmsr() | PSL_DR | PSL_IR);
        pmap_bootstrapped++;
        bs_remap_earlyboot();

(and more). I've not found the peer call yet but it may well be after the pvo_vaddr_compare shown above as far as execution order goes.
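
As an aside on the "last available physical address" loop in that excerpt: it relies on phys_avail being a list of (start, end) pairs terminated by zero entries, so the empty-bodied for loop stops with i at the start of the last pair. Here is a minimal user-space sketch of the same loop shape. The sim_* names are hypothetical, and the 4 KB page size used for the byte-to-page conversion is an assumption of the sketch, not something taken from the kernel source.

```c
#include <stddef.h>
#include <stdint.h>

#define SIM_PAGE_SHIFT 12   /* assumed 4 KB pages, for illustration only */

/*
 * Returns the end address of the last (start, end) pair in a table that is
 * terminated by zero entries. Like the excerpt, this assumes the table holds
 * at least one pair and that no later pair starts at address 0.
 */
static uint64_t
sim_last_phys_end(const uint64_t *avail)
{
	size_t i;

	/* Advance while another pair follows the current one. */
	for (i = 0; avail[i + 2] != 0; i += 2)
		;
	return avail[i + 1];
}

/* Bytes-to-pages conversion in the spirit of powerpc_btop(). */
static uint64_t
sim_btop(uint64_t bytes)
{
	return bytes >> SIM_PAGE_SHIFT;
}
```

With a table such as { 0x1000, 0x8000, 0x10000, 0x20000, 0, 0 } the loop stops at the second pair and sim_last_phys_end returns 0x20000.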





===
Mark Millard
markmi at dsl-only.net

On Sep 25, 2014, at 2:41 PM, Mark Millard <markmi at dsl-only.net> wrote:

The first boot after make -8 kernel without quiesce also died during peer, I'd guess the same one.

Looks like quiesce does not matter for the issue. (But it is handy for identifying which peer fails.)



===
Mark Millard
markmi at dsl-only.net

On Sep 25, 2014, at 2:08 PM, Nathan Whitehorn <nwhitehorn at freebsd.org> wrote:

Can you comment out the call to quiesce? It may not be necessary on your system.
-Nathan

On 09/25/14 13:17, Mark Millard wrote:
> The "before copyright" hang/exception is during the first openfirmware "peer" after "quiesce". The ofw_restore_trap_vec(save_trap_init) completes fine, the ofwcall(args) is made but it does not return normally.
>
> Ignoring the ofwcall's from before quiesce, the sequence of ofwcall's is:
>
> quiesce
> finddevice
> parent
> getprop
> getprop
> getprop
> finddevice
> getprop
> instance-to-package
> getproplen
> finddevice
> getprop
> getprop
> peer
>
> And when the boot fails before the copyright that ofwcall for peer ends up resulting in the register dump with no register pointing to the kernel's normal stack area.
>
> I still have no clue what is happening during peer. ofw_restore_trap_vec(save_trap_init) is being called and is returning before ofwcall is used. For all I know some uses of peer could require not being quiesce'd in order for peer to be reliable.
>
> In the form of my display indicating what executed the text reported ends in:
>
> <peer>^
>
> where the ^ indicates the stage that last completed in the call sequence inside openfirmware_core. This information is displayed by the
>
> x/s ofw_name_history
>
> in the automatically created default script for DDB. I read the sequence backwards from the end marker (here ^), following the wraparound if there is that much text and if I care to go back that far.
>
> FreeBSD FBSDG5M1 10.1-BETA2 FreeBSD 10.1-BETA2 #11 r271944M: Thu Sep 25 12:14:05 PDT 2014     root@FBSDG5M1:/usr/obj/usr/src/sys/GENERIC64  powerpc
>
> My current hacks to get this information are:
>
> Index: /usr/src/sys/ddb/db_script.c
> ===================================================================
> --- /usr/src/sys/ddb/db_script.c (revision 271944)
> +++ /usr/src/sys/ddb/db_script.c (working copy)
> @@ -319,10 +319,25 @@
>  {
>   char scriptname[DB_MAXSCRIPTNAME];
>  
> + /* HACK!!! : Additional lines to force a basic default script to exist.
> +  * Will dump information even if ddb input is not available for early crash.
> +  * Used to get more information about PowerMac G5 "before Copyright" hangs.
> +  */
> + struct ddb_script *dsp = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT);
> + if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; bt; x/s ofw_name_history");
> +
>   snprintf(scriptname, sizeof(scriptname), "%s.%s",
>       DB_SCRIPT_KDBENTER_PREFIX, eventname);
>   if (db_script_exec(scriptname, 0) == ENOENT)
>   (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
> +
> + /* HACK!!! : Additional lines to always use the default script,
> +  *           even if scriptname existed and was executed.
> +  * Will dump information even if ddb input is not available for early crash.
> +  * Used to get more information about PowerMac G5 "before Copyright" hangs.
> +  */
> + else
> + (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
>  }
>  
>  /*-
> Index: /usr/src/sys/powerpc/conf/GENERIC64
> ===================================================================
> --- /usr/src/sys/powerpc/conf/GENERIC64 (revision 271944)
> +++ /usr/src/sys/powerpc/conf/GENERIC64 (working copy)
> @@ -76,6 +76,8 @@
>  # Debugging support.  Always need this:
>  options   KDB # Enable kernel debugger support.
>  options   KDB_TRACE # Print a stack trace for a panic.
> +options   DDB
> +options   GDB
>  
>  # Make an SMP-capable kernel by default
>  options   SMP # Symmetric MultiProcessor Kernel
> Index: /usr/src/sys/powerpc/ofw/ofw_machdep.c
> ===================================================================
> --- /usr/src/sys/powerpc/ofw/ofw_machdep.c (revision 271944)
> +++ /usr/src/sys/powerpc/ofw/ofw_machdep.c (working copy)
> @@ -324,6 +324,12 @@
>   openfirmware(&args);
>  }
>  
> +/* Part of HACK to have record of ofw call names */
> +#define ofw_name_history_record_size 256
> +char ofw_name_history[ofw_name_history_record_size+1] = {}; /* Initially: automatically '\0' filled */
> +char * ofw_name_history_pos = ofw_name_history;
> +/* End Part of HACK */
> +
>  static int
>  openfirmware_core(void *args)
>  {
> @@ -330,6 +336,42 @@
>   int result;
>   register_t oldmsr;
>  
> + { /* HACK to have record of ofw call names */
> + struct argtype_prefix {
> + cell_t name;
> + };
> +
> + char *name = (char*) (uintptr_t) (((struct argtype_prefix*)args)->name);
> + 
> + int i;
> +
> + *ofw_name_history_pos = '<';
> +
> + for(i=0; (*name) && i!=20; i++) {
> + ofw_name_history_pos++;
> + if (ofw_name_history_pos == &ofw_name_history[ofw_name_history_record_size]) {
> + ofw_name_history_pos = ofw_name_history;
> + }
> + *ofw_name_history_pos = *name;
> +
> + name++;
> + }
> +
> + ofw_name_history_pos++;
> + if (ofw_name_history_pos == &ofw_name_history[ofw_name_history_record_size]) {
> + ofw_name_history_pos = ofw_name_history;
> + }
> + *ofw_name_history_pos = '>';
> +
> + ofw_name_history_pos++;
> + if (ofw_name_history_pos == &ofw_name_history[ofw_name_history_record_size]) {
> + ofw_name_history_pos = ofw_name_history;
> + }
> + *ofw_name_history_pos = '@';
> +
> + ofw_name_history[ofw_name_history_record_size] = '\0'; /* Paranoia */
> + } /* HACK end */
> +
>   /*
>    * Turn off exceptions - we really don't want to end up
>    * anywhere unexpected with PCPU set to something strange
> @@ -337,14 +379,22 @@
>    */
>   oldmsr = intr_disable();
>  
> + *ofw_name_history_pos = '#'; /* HACK */
> +
>   ofw_sprg_prepare();
>  
> + *ofw_name_history_pos = '$'; /* HACK */
> +
>   /* Save trap vectors */
>   ofw_save_trap_vec(save_trap_of);
>  
> + *ofw_name_history_pos = '%'; /* HACK */
> +
>   /* Restore initially saved trap vectors */
>   ofw_restore_trap_vec(save_trap_init);
>  
> + *ofw_name_history_pos = '^'; /* HACK */
> +
>  #if defined(AIM) && !defined(__powerpc64__)
>   /*
>    * Clear battable[] translations
> @@ -357,13 +407,21 @@
>  
>   result = ofwcall(args);
>  
> + *ofw_name_history_pos = '&'; /* HACK */
> +
>   /* Restore trap vecotrs */
>   ofw_restore_trap_vec(save_trap_of);
>  
> + *ofw_name_history_pos = '*'; /* HACK */
> +
>   ofw_sprg_restore();
>  
> + *ofw_name_history_pos = '~'; /* HACK */
> +
>   intr_restore(oldmsr);
>  
> + *ofw_name_history_pos = '!'; /* HACK */
> +
>   return (result);
>  }
>
>
>
>
>
> ===
> Mark Millard
> markmi at dsl-only.net
>
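
For anyone wanting to see how the call-name history in the quoted patch is meant to be read, here is a user-space sketch of the same idea: a fixed-size ring that stores "<name>" followed by a single stage marker that is overwritten in place, plus a reader that walks backwards from the marker and follows the wraparound. The sim_* names and the 32-byte ring size are made up for illustration (the patch uses 256), and the reader is my own addition, not part of the patch.

```c
#include <stddef.h>

#define SIM_HIST_SIZE 32    /* illustrative; the quoted patch uses 256 */

static char sim_hist[SIM_HIST_SIZE + 1];  /* zero-filled; last byte stays '\0' */
static size_t sim_pos;                    /* index of the current stage marker */

/* Step the write position forward, wrapping at the end of the ring. */
static void
sim_advance(void)
{
	if (++sim_pos == SIM_HIST_SIZE)
		sim_pos = 0;
}

/* Record "<name>" (at most 20 name characters) plus an initial '@' marker. */
static void
sim_record_call(const char *name)
{
	int i;

	sim_hist[sim_pos] = '<';          /* overwrites the previous marker */
	for (i = 0; name[i] != '\0' && i != 20; i++) {
		sim_advance();
		sim_hist[sim_pos] = name[i];
	}
	sim_advance();
	sim_hist[sim_pos] = '>';
	sim_advance();
	sim_hist[sim_pos] = '@';
}

/* Overwrite the marker in place as each stage of the call completes. */
static void
sim_mark_stage(char marker)
{
	sim_hist[sim_pos] = marker;
}

/*
 * Recover the most recent call name by walking backwards from the marker.
 * Writes a NUL-terminated name into out (outsz must be at least 1) and
 * returns the current stage marker.
 */
static char
sim_last_call(char *out, size_t outsz)
{
	size_t p, len = 0, i;

	p = (sim_pos + SIM_HIST_SIZE - 1) % SIM_HIST_SIZE;   /* the '>' */
	while (len < SIM_HIST_SIZE) {
		p = (p + SIM_HIST_SIZE - 1) % SIM_HIST_SIZE;
		if (sim_hist[p] == '<')
			break;
		len++;
	}
	/* p is at '<'; copy the name forwards, following the wraparound. */
	for (i = 0; i < len && i + 1 < outsz; i++) {
		p = (p + 1) % SIM_HIST_SIZE;
		out[i] = sim_hist[p];
	}
	out[i] = '\0';
	return sim_hist[sim_pos];
}
```

Recording a call and then marking stages up to '^' without going further corresponds to the "<peer>^" ending described in the quoted message: the name is the last call made and the marker is the last stage that completed.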



