Date:      Wed, 18 Feb 2015 23:51:51 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, Nathan Whitehorn <nwhitehorn@freebsd.org>
Subject:   Re: PowerMac G5 powerpc64: new context where repeatedly booting varies between failing and working
Message-ID:  <AB5193CD-DAE1-4BE8-8F34-56F271B0E658@dsl-only.net>
In-Reply-To: <F54659D4-C846-4F88-B678-193511F5A7A9@dsl-only.net>
References:  <7CA43EE3-8C11-4FBD-9F8A-42DF08B82362@dsl-only.net> <ABDD60F1-72C0-41E0-8DFB-4CFDCA9ACA82@dsl-only.net> <C355D814-D486-4644-B9C6-92992092FD55@dsl-only.net> <5FE82152-BBF7-4C6D-932D-AEE70546CACA@dsl-only.net> <36C14790-8E66-4C9D-9F29-A137FB49439D@dsl-only.net> <836A3016-D41B-45CB-AD4B-946767212026@dsl-only.net> <F54659D4-C846-4F88-B678-193511F5A7A9@dsl-only.net>

My variant of your patch did not fix the sometimes-corruption (which
typically happens at the same place in the boot sequence). The same good
vs. bad values result, and the overall range of the corruption is similar
as well.



I've no hint that ofwcall itself (outside openfirmware) is writing the
memory locations that end up corrupted. The pattern of corruption in the
picture that I sent makes no sense for that stage doing it, as far as I
can see. But...

It would seem that either (A) openfirmware itself wrote those corrupted
locations or (B) some form of dynamic binding is involved and injected
some one-time code that did it.

In part I say this for (A) because at that point the openfirmware
exception vectors are supposed to be in place, so exception handling
would be openfirmware code too, as far as I know.


I wish I had a Logic Analyzer configuration for the G5 processor to
record and analyze activity with. I've not figured out a way to get
useful evidence from the context.



Thinking about it, if (A) is the issue: the patch places its storage
locations the FreeBSD powerpc64 ABI way, but Apple's openfirmware on the
G5s likely uses a Darwin PowerPC ABI style, which does not even use TOCs
and leaves %r2 available for general use as a volatile register.

In fact, as I remember, when I looked at the first few (under a dozen)
instructions at the openfirmware entry with x/i in ddb, it was something
like:

or    r2,r0,r2,   (a form of replaceable no-op given what follows?)
addis r2,r0,-0x49 (so %r2 ends up as 0xFFB70000 as a 32-bit interpretation?)
ori   r2,r2,0xf00 (so %r2 updates to 0xFFB70F00 as a ...?)
std   r1,r2,0x8,
std   r0,r2,0x10,
mfspr r0,lr
std   r0,r2,0x120,
mfmsr r1
std   r1,r2,0x108,

In other words: %r2's initial value is ignored; its value is quickly
set and then used to point to a memory area to write to.
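
A quick check of the addis/ori arithmetic above, as a minimal C sketch.
The helper names are made up for illustration and come from no FreeBSD
source; the only assumption is the usual rule that an rA field of 0 in
addis means the literal value 0 rather than the contents of r0.

```c
#include <stdint.h>

/*
 * addis rD,0,SIMM: with a zero base the result is the signed 16-bit
 * immediate shifted left by 16 bits and sign-extended to 64 bits.
 */
static uint64_t
ppc_addis_zero_base(int16_t simm)
{
	return ((uint64_t)((int64_t)simm * 65536));
}

/* ori rA,rS,UIMM: OR in a zero-extended 16-bit immediate. */
static uint64_t
ppc_ori(uint64_t rs, uint16_t uimm)
{
	return (rs | uimm);
}

/* Hypothetical helper: the %r2 value implied by the addis/ori pair. */
static uint64_t
of_entry_r2(void)
{
	return (ppc_ori(ppc_addis_zero_base(-0x49), 0xf00));
}
```

Truncating of_entry_r2() to 32 bits gives 0xFFB70F00, which matches the
32-bit reading noted next to the ori line; as a full 64-bit value the
upper half is all ones because of the sign extension.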


===
Mark Millard
markmi at dsl-only.net

On 2015-Feb-18, at 09:53 PM, Mark Millard <markmi at dsl-only.net> wrote:

Nathan W. wrote:

> Interesting. I'm assuming this is due to a bug in the 32-/64-bit ABI
> thunking that is required to call into Open Firmware. Could you see if
> the attached patch helps?
> -Nathan


It appears that direct use of TOC_ENTRY and such is not available
automatically/by default in 10.1-STABLE's context. Your patch is based
on 11.0-CURRENT after the relocatable kernel changes.

My context where I was lucky enough to get a memory layout that produced
the failure that allowed detecting the memory corruption (and that has a
known way to quickly detect the specific corruption) is some range of
versions of 10.1-STABLE when my GENERIC64vtsc has a particular set of
options enabled. I do not know how to take an arbitrary FreeBSD version
and give it such a handy context for the issue. So I will stick with
10.1-STABLE as much as I can for investigating this issue.

There is also the issue of the "once very early for many boots: %r1 and
%r3 corruption on openfirmware return". I've been using my hack to
"retry at most once per ofwcall use" to make my G5 quad-core PowerMac
context boot most of the time (rather than needing to power off then on
up to over a dozen times in a row to get a successful boot). The
super-early boot failure rate had been blocking most investigation
activities until I used this type of hack.

The closest 10.1-STABLE partial match to your patch, mixed with my
%r1/%r3 corruption handling, that I've come up with is as follows. (Tabs
probably turned to spaces.) Do you think it is sufficient for what you
want tested?

(My observations suggest that the non-volatile registers are preserved
by openfirmware even when I've seen other problems. I used to use
explicit storage instead but switched to this style for the
%r1/%r3-handling-hack part of the code because it is invariant to
relocatable vs. not. I've used %r29, %r28, %r27 as needing to survive
the openfirmware call. %r25 does not need to do so.)

Index: sys/powerpc/ofw/ofwcall64.S
===================================================================
--- sys/powerpc/ofw/ofwcall64.S (revision 278443)
+++ sys/powerpc/ofw/ofwcall64.S (working copy)
@@ -114,23 +114,64 @@
        * the old MSR so we can get them back later.
        */
       mr      %r5,%r1
-       lis     %r1,(ofwstk+OFWSTKSZ-32)@ha
-       addi    %r1,%r1,(ofwstk+OFWSTKSZ-32)@l
-       std     %r5,8(%r1)      /* Save real stack pointer */
-       std     %r2,16(%r1)     /* Save old TOC */
-       std     %r6,24(%r1)     /* Save old MSR */
+       lis     %r1,(ofwstk+OFWSTKSZ-64)@ha
+       addi    %r1,%r1,(ofwstk+OFWSTKSZ-64)@l
+       std     %r5,40(%r1)     /* Save real stack pointer */
+       std     %r2,48(%r1)     /* Save old TOC */
+       std     %r6,56(%r1)     /* Save old MSR */
       li      %r5,0
       stw     %r5,4(%r1)
       stw     %r5,0(%r1)

+       /* HACK: recording %r1 (FreeBSD SP) before openfirmware for use in
+        *       possible retry and also for testing for corruption (net-change).
+        *       %r29 is supposed to be non-volatile for darwin 32 bit ABI.
+        */
+       mr      %r29,%r1
+
+       /* HACK: recording %r3 before openfirmware for use in possible retry.
+        *       %r28 is supposed to be non-volatile for darwin 32 bit ABI.
+        */
+       mr      %r28,%r3
+
+       /* HACK: recording %r4 before openfirmware for use in possible retry.
+        *       %r27 is supposed to be non-volatile for darwin 32 bit ABI.
+        */
+       mr      %r27,%r4
+
       /* Finally, branch to OF */
       mtctr   %r4
       bctrl

+       /* HACK: check if %r1 was corrupted (had a net-change) */
+       cmpw    %r29,%r1
+       bne     2f /* stack pointer corrupted so go retry once */
+
+       /* HACK Notes: the observed corruption had %r1 changed and %r1=%r3.
+        *             This code is somewhat more general.
+        */
+
+       /* HACK: %r1 okay but check %r3 for being 0 or -1 vs. anything else */
+       xoris   %r25,%r3,0
+       cmpw    %r25,%r3
+       bne     2f /* %r3 was neither 0 nor -1 so corruption: go retry once */
+
+1:     /* HACK: here both %r1 and %r3 appear to be okay:
+        *       so sequential flow was for "no problems"
+        *       but jumping here is a retry result being
+        *       returned, possibly with forced-good values
+        *       indicating an openfirmware error status (%r3=-1).
+        */
+
+       /* HACK removal: I've removed the mtsprg0 that put back
+        *               FreeBSD's value to help with exceptions
+        *               and DDB display for when %r1 was corrupted.
+        */
+
       /* Reload stack pointer and MSR from the OFW stack */
-       ld      %r6,24(%r1)
-       ld      %r2,16(%r1)
-       ld      %r1,8(%r1)
+       ld      %r6,56(%r1)
+       ld      %r2,48(%r1)
+       ld      %r1,40(%r1)

       /* Now set the real MSR */
       mtmsrd  %r6
@@ -168,6 +209,40 @@
       mtlr    %r0
       blr

+/* HACK: code for %r1 and/or %r3 corruption's single-retry */
+/*       Still under openfirmware's msr, sprg0, stack values */
+
+2:     /* HACK: corruption observed so retry, restoring %r1 and %r3 first */
+       mr      %r1,%r29
+       mr      %r3,%r28
+       mtctr   %r27
+       bctrl
+
+       /* HACK: check if %r1 was corrupted (had a net-change) */
+       cmpw    %r29,%r1
+       bne     3f /* retry corrupted %r1
+                   * so go give up with %r3 being -1 and %r1 forced-good
+                   */
+
+       /* HACK Notes: the observed corruption had %r1 changed and %r1=%r3
+        *             This code is somewhat more general.
+        */
+
+       /* HACK: %r1 okay but check %r3 for being 0 or -1 vs. anything else */
+       xoris   %r25,%r3,0
+       cmpw    %r25,%r3
+       beq     1b /* %r3 also was 0 or -1 so no corruption observed on retry
+                   * so go do a normal return
+                   */
+
+3:     /* Either %r1 had a net change after retry
+        * or %r3 was not one of 0,-1 after retry
+        * so force %r1 and have %r3 be -1 then go return
+        */
+       mr      %r1,%r29
+       li      %r3,-1 /* the openfirmware failure return value */
+       b       1b
+
/*
 * RTAS 32-bit Entry Point. Similar to the OF one, but simpler (no separate
 * stack)
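
For reference, the single-retry flow that the assembly hack above aims
for can be modeled in C. This is only a sketch under stated assumptions:
struct fw_regs, fw_entry_fn and flaky_entry are hypothetical stand-ins,
not kernel interfaces, and the real code has to work on the registers
directly while still on the openfirmware stack.

```c
#include <stdint.h>

/* Hypothetical model of the two registers checked after a firmware call. */
struct fw_regs {
	uint64_t sp;	/* %r1 on return */
	int64_t  r3;	/* %r3 on return: 0 or -1 expected */
};

/* Hypothetical stand-in for branching to the firmware entry point. */
typedef struct fw_regs (*fw_entry_fn)(uint64_t sp, int64_t args);

/* Sane means: stack pointer unchanged and a status of 0 or -1. */
static int
fw_return_looks_sane(uint64_t sp_before, struct fw_regs after)
{
	return (after.sp == sp_before && (after.r3 == 0 || after.r3 == -1));
}

/* Call once; on apparent corruption retry once; then give up with -1. */
static int64_t
fw_call_retry_once(fw_entry_fn entry, uint64_t sp, int64_t args)
{
	int attempt;

	for (attempt = 0; attempt < 2; attempt++) {
		struct fw_regs out = entry(sp, args);

		if (fw_return_looks_sane(sp, out))
			return (out.r3);
	}
	return (-1);	/* forced failure status, as at label 3 */
}

/* Test double: corrupts %r1 (with %r1 == %r3) on its first call only. */
static int flaky_calls;

static struct fw_regs
flaky_entry(uint64_t sp, int64_t args)
{
	struct fw_regs out;

	(void)args;
	if (flaky_calls++ == 0) {
		out.sp = 0xdeadbeefULL;
		out.r3 = (int64_t)out.sp;
	} else {
		out.sp = sp;
		out.r3 = 0;
	}
	return (out);
}
```

Calling fw_call_retry_once(flaky_entry, 0x1000, 0) makes two calls and
returns 0, which mirrors the "corrupted on the first try, fine on the
retry" behavior described for the early-boot failures.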


The context would be:

root@FBSDG5M1:/usr/src # svnlite status
?       .snap
M       sys/ddb/db_main.c
M       sys/ddb/db_script.c
M      sys/powerpc/conf
?       sys/powerpc/conf/GENERIC64vtsc
M       sys/powerpc/ofw/ofw_machdep.c
M       sys/powerpc/ofw/ofwcall64.S
M       sys/powerpc/powermac/platform_powermac.c

root@FBSDG5M1:/usr/src # svnlite info
Path: .
Working Copy Root Path: /usr/src
URL: https://svn0.us-west.freebsd.org/base/stable/10
Relative URL: ^/stable/10
Repository Root: https://svn0.us-west.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 278443
Node Kind: directory
Schedule: normal
Last Changed Author: brooks
Last Changed Rev: 278443
Last Changed Date: 2015-02-09 01:22:47 -0800 (Mon, 09 Feb 2015)

The ddb changes are there to have an automatic display on failure if it
happens from a FreeBSD exception vector context.

ofw_machdep.c does the check for corruption around its ofwcall.

platform_powermac.c has a printf for reporting the expected pointer
value just before the point where it has ever been observed to go bad.

ofwcall64.S: See above if it is acceptable.

root@FBSDG5M1:/usr/src # more sys/powerpc/conf/GENERIC64vtsc
include GENERIC64
ident   GENERIC64vtsc

nooptions       PS3                     #Sony Playstation 3               HACK!!! to allow sc

options         DDB                     # HACK!!! to dump early crash info (but 11.0-CURRENT already has it)
options         GDB                     # HACK!!! ...
options         VERBOSE_SYSINIT
options         BOOTVERBOSE=1
options         BOOTHOWTO=RB_VERBOSE
#options        KTR
#options        KTR_MASK=KTR_TRAP
#options        KTR_CPUMASK=0xF
#options        KTR_VERBOSE

# HACK!!! to allow sc for 2560x1440 display on Radeon X1950 that vt historically mishandled during booting
device          sc
#device          kbdmux         # HACK: already listed by vt
options         SC_OFWFB        # OFW frame buffer
options         SC_DFLT_FONT    # compile font in
makeoptions     SC_DFLT_FONT=cp437


# Disable extra checking typically used for FreeBSD 11.0-CURRENT:
nooptions       DEADLKRES               #Enable the deadlock resolver
nooptions       INVARIANTS              #Enable calls of extra sanity checking
nooptions       INVARIANT_SUPPORT       #Extra sanity checks of internal structures, required by INVARIANTS
nooptions       WITNESS                 #Enable checks to detect deadlocks and cycles
nooptions       WITNESS_SKIPSPIN        #Don't run witness on spinlocks for speed
nooptions       MALLOC_DEBUG_MAXZONES   # Separate malloc(9) zones

root@FBSDG5M1:/usr/src # more /etc/make.conf
WRKDIRPREFIX=/usr/obj/portswork
WITH_DEBUG=
MALLOC_PRODUCTION=

root@FBSDG5M1:/usr/src # more /etc/src.conf
CFLAGS+=-DELF_VERBOSE
#WITH_DEBUG_FILES=
#WITHOUT_CLANG=

root@FBSDG5M1:/usr/src # more /boot/loader.conf
#kernel="kernel"
#kernel="kernel10.1RE"
kernel="kernel10.1S"
#kernel="kernel11C"
verbose_loading="YES"
kern.vty=vt


===
Mark Millard
markmi at dsl-only.net

On 2015-Feb-18, at 04:51 AM, Mark Millard <markmi at dsl-only.net> wrote:

I modified openfirmware_core to check on the status of the pointer value
between most of its stages. With this I've also seen later failures than
the usual one, such as after an OF_finddevice use has its ofwcall return.

And the change greatly narrows down the stage at which it corrupts
memory when it does fail...

// OKAY HERE
      result = ofwcall(args);
// SOMETIMES CORRUPTED HERE

Unfortunately, to get this far ofwcall has to be my variant in order to,
for example, enable recovery/retry from the observed bad r1/r3 register
problems that happened super-early on return from openfirmware in a high
percentage of my boot attempts. I have yet to see how close to normal I
can get ofwcall to be while still allowing this type of test.


The relevant detection code in openfirmware_core is...

/* HACK */
extern void** authnone_create(void);
...
static __inline void
ofw_restore_trap_vec(char *restore_trap_vec)
{
      if (!ofw_real_mode)
              return;

      bcopy(restore_trap_vec, (void *)EXC_RST, EXC_LAST - EXC_RST);
      __syncicache(EXC_RSVD, EXC_LAST - EXC_RSVD);
}
...
static int
openfirmware_core(void *args)
{
      int             result;
      register_t      oldmsr;

/* HACK */
void** jnk1pp;
void** jnk2pp;
void* jnk = *authnone_create();
if (jnk == *authnone_create()) jnk = *authnone_create();

      /*
       * Turn off exceptions - we really don't want to end up
       * anywhere unexpected with PCPU set to something strange
       * or the stack pointer wrong.
       */
      oldmsr = intr_disable();

/* HACK */
if (jnk == *authnone_create()) jnk = *authnone_create();

      ofw_sprg_prepare();

/* HACK */
if (jnk == *authnone_create()) jnk = *authnone_create();

      /* Save trap vectors */
      ofw_save_trap_vec(save_trap_of);

/* HACK */
if (jnk == *authnone_create()) jnk = *authnone_create();

      /* Restore initially saved trap vectors */
      ofw_restore_trap_vec(save_trap_init);

/* HACK */
jnk1pp = authnone_create();

#if defined(AIM) && !defined(__powerpc64__)
      /*
       * Clear battable[] translations
       */
      if (!(cpu_features & PPC_FEATURE_64))
              __asm __volatile("mtdbatu 2, %0\n"
                               "mtdbatu 3, %0" : : "r" (0));
      isync();
#endif

      result = ofwcall(args);

/* HACK */
jnk2pp = authnone_create();

      /* Restore trap vectors */
      ofw_restore_trap_vec(save_trap_of);

/* HACK */
if (jnk != *jnk1pp) jnk = *authnone_create();
if (jnk != *jnk2pp) jnk = *authnone_create();
/* Note: *jnk2pp above is what detects the bad pointer value when it goes bad */
if (jnk == *authnone_create()) jnk = *authnone_create();

      ofw_sprg_restore();

/* HACK */
if (jnk == *authnone_create()) jnk = *authnone_create();

      intr_restore(oldmsr);

/* HACK */
if (jnk == *authnone_create()) jnk = *authnone_create();

      return (result);
}

In the code this translates to...

00000000008a671c <.openfirmware_core+0x168> bl      00000000007a3de4 <.authnone_create>
00000000008a6720 <.openfirmware_core+0x16c> crmove  4*cr7+so,4*cr7+so
00000000008a6724 <.openfirmware_core+0x170> mr      r28,r3

Note: The above loads r28 with a good address that does not fail when
later dereferenced (while FreeBSD's exception vectors are in place).

00000000008a6728 <.openfirmware_core+0x174> mr      r3,r29
00000000008a672c <.openfirmware_core+0x178> bl      00000000008ac930 <.ofwcall>
00000000008a6730 <.openfirmware_core+0x17c> crmove  4*cr7+so,4*cr7+so
00000000008a6734 <.openfirmware_core+0x180> mr      r26,r3
00000000008a6738 <.openfirmware_core+0x184> bl      00000000007a3de4 <.authnone_create>
00000000008a673c <.openfirmware_core+0x188> crmove  4*cr7+so,4*cr7+so
00000000008a6740 <.openfirmware_core+0x18c> mr      r29,r3

Note: The above loads r29 with the bad address that is later detected by
dereferencing it. This is the corrupted pointer value.

00000000008a6744 <.openfirmware_core+0x190> ld      r3,21216(r2)
00000000008a6748 <.openfirmware_core+0x194> lwz     r0,0(r3)
00000000008a674c <.openfirmware_core+0x198> cmpwi   cr7,r0,0
00000000008a6750 <.openfirmware_core+0x19c> beq+    cr7,00000000008a6778 <.openfirmware_core+0x1c4>
00000000008a6754 <.openfirmware_core+0x1a0> addi    r3,r3,16
00000000008a6758 <.openfirmware_core+0x1a4> li      r4,256
00000000008a675c <.openfirmware_core+0x1a8> li      r5,11776
00000000008a6760 <.openfirmware_core+0x1ac> bl      00000000008c158c <.bcopy>
00000000008a6764 <.openfirmware_core+0x1b0> crmove  4*cr7+so,4*cr7+so
00000000008a6768 <.openfirmware_core+0x1b4> li      r3,0
00000000008a676c <.openfirmware_core+0x1b8> li      r4,12032
00000000008a6770 <.openfirmware_core+0x1bc> bl      00000000008d5358 <.__syncicache>

Note: At this point it is back to FreeBSD exception vectors, so the
kernel debugger display will work for bad pointer detection tests.

00000000008a6774 <.openfirmware_core+0x1c0> crmove  4*cr7+so,4*cr7+so
00000000008a6778 <.openfirmware_core+0x1c4> ld      r0,0(r28)

Note: The above dereference of the before-ofwcall pointer value (in r28)
does not detect a bad pointer.

00000000008a677c <.openfirmware_core+0x1c8> cmpd    cr7,r0,r30
00000000008a6780 <.openfirmware_core+0x1cc> beq-    cr7,00000000008a6790 <.openfirmware_core+0x1dc>
00000000008a6784 <.openfirmware_core+0x1d0> bl      00000000007a3de4 <.authnone_create>
00000000008a6788 <.openfirmware_core+0x1d4> crmove  4*cr7+so,4*cr7+so
00000000008a678c <.openfirmware_core+0x1d8> ld      r30,0(r3)
00000000008a6790 <.openfirmware_core+0x1dc> ld      r0,0(r29)

It is that last instruction (.openfirmware_core+0x1dc) that "detects"
the bad pointer and leads to a kernel debugger display of some of the
corrupted memory, including the stored pointer that the above code
accessed and dereferenced to detect the problem.

So the pointer was good just before the ofwcall and was bad just after
it.
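
The before/after bracketing used above can be reduced to a small C
sketch. Everything here is hypothetical naming for illustration
(call_and_watch, struct watch_report and the two test stages are not
kernel code); it only shows the idea of sampling one watched memory slot
on each side of a call.

```c
#include <stdint.h>

typedef int (*stage_fn)(void *arg);	/* hypothetical stage under test */

struct watch_report {
	int       result;	/* what the stage returned */
	int       changed;	/* nonzero if the watched slot changed */
	uintptr_t before;
	uintptr_t after;
};

/* Sample the slot, run the stage, sample again, and compare. */
static struct watch_report
call_and_watch(volatile uintptr_t *slot, stage_fn stage, void *arg)
{
	struct watch_report rep;

	rep.before = *slot;		/* OKAY HERE */
	rep.result = stage(arg);
	rep.after = *slot;		/* SOMETIMES CORRUPTED HERE */
	rep.changed = (rep.before != rep.after);
	return (rep);
}

/* Test double: a stage that scribbles on the slot it is handed. */
static int
scribbling_stage(void *arg)
{
	*(volatile uintptr_t *)arg = (uintptr_t)0x24000042u;
	return (0);
}

/* Test double: a well-behaved stage. */
static int
quiet_stage(void *arg)
{
	(void)arg;
	return (0);
}
```

In the kernel case the "slot" is the .got entry behind
authnone_create()'s return value and the "stage" is ofwcall, with the
extra constraint that the comparison can only safely fault once the
FreeBSD exception vectors are back in place.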

===
Mark Millard
markmi at dsl-only.net

On 2015-Feb-17, at 09:34 PM, Mark Millard <markmi at dsl-only.net> wrote:

[I had sent Nathan W. and Justin H. a picture of a display of a
boot-time corrupted memory region. This time I tried to find the start
and end of the region, and I'm documenting it in a textual form more
appropriate to the list. I have also removed prior Email history from
this Email, but there is much context for which one must check that
history.]

Several of the new values put in place by the .got memory corruption
reported below match up with .opd or other types of addresses reported
by objdump for my /boot/kernel10.1S/kernel. They are noted below as I
list detailed differences.

I made the early-boot-crash display cover a larger range, and the span
of the corruption of part of the .got area seemed to go as follows. I
also induced a dereference of the bad pointer as soon as it is
discovered after the OF_peer(0) in question returns, so later code
would not be involved when it crashes. (Crash early, crash often...)


Overall structure:

0xd2da37 and before as far as I looked: no corruption found.

The area from 0xd2da38-0xd2dc9F: largely corrupted. 0x268 or 616 bytes
or so in this corrupted range. 616=77*8.

After that range: good again as far as I looked.
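
The size arithmetic for that range can be double-checked with a couple
of one-line helpers. This is just a sketch; inclusive_span and
doublewords are illustrative names, not existing functions.

```c
#include <stdint.h>

/* Number of bytes in the inclusive address range [first, last]. */
static uint64_t
inclusive_span(uint64_t first, uint64_t last)
{
	return (last - first + 1);
}

/* Number of whole 8-byte doublewords in a byte count. */
static uint64_t
doublewords(uint64_t nbytes)
{
	return (nbytes / 8);
}
```

inclusive_span(0xd2da38, 0xd2dc9f) comes to 0x268 (616) bytes, and
doublewords(616) is 77 with no remainder, so the corrupted span is a
whole number of 64-bit .got-sized entries.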


The details:

Warning: The below is based on hand-transcribed information from screen
pictures that I took.

Showing pair of lines (good then corrupted), using x/x style lines:

0xd2da30: 0, b4fd2c, 0, b4fd70
0xd2da30: 0, b4fd2c, 0,      0

0xd2da40: 0,   e28948, 0, e1e460
0xd2da40: 0, 24000042, 0, d00058
(24000042 looks like a cr value?)
(0000000000d00058 l       .opd   0000000000000018 ofw_rendezvous_dispatch)

0xd2da50: 0, bc7de8,        0, bc7e08
0xd2da50: 0, cde110, c0000000,   8740
(0xc000000000008740 looks like a stack address?)
(0000000000cde110 g     F .opd   0000000000000018 smp_no_rendevous_barrier)

0xd2da60: 0, cd8470, 0, bd2608
0xd2da60: 0,      1, 0, c3a30c
(0000000000c3a30c g       .data  0000000000000000 ofw_sprg0_save)

0xd2da70: 0,  bb5ea0, 0, b70870
0xd2da70: 0, 1c35ec0, 0,      0

0xd2da80: 0,   c49918, 0, bc7e18
0xd2da80: 0, 44000022, 0, de4b30
(44000022 looks like a cr value?)
(0000000000de4b30 g     O .bss   0000000000000460 thread0)

0xd2da90:        0, b720a0, 0,   b71370
0xd2da90: 90000000,   1032, 0, ff846d78
(9000000000001032 looks like a SRR1 value.)
(ff846d78 is openfirmware entry point?)

0xd2daa0: 0, bc7e30,        0, bc7e58
0xd2daa0: 0, e39080, 10000000,   3030
(0000000000e39080 g     O .bss   0000000000020000 __pcpu)
(1000000000003030 looks like a SRR1 value?)

0xd2dab0:        0, bc7e80, 0, bc7eb0
0xd2dab0: c0000000,   83b0, 0, c3a280
(0xc0000000000083b0 looks like a stack address?)
(c3a280 is inside my PowerMac G5 specific hack's ofwstk area: c392a0 up
to c3a2a0)
(I've been gathering evidence about early-boot G5 crashes.)

0xd2dac0: 0, bc7ed0, 0, cf2960
0xd2dac0: 0, c40000, 0, c40000

0xd2dad0: 0, bc7f00, 0, bc7f28
0xd2dad0: 0, c40000, 0, c40000

0xd2dae0:        0, b72400, 0, bc7f28
0xd2dae0: c0000000,   8740, 0, cde110
(0xc000000000008740 looks like a stack address?)
(0000000000cde110 g     F .opd   0000000000000018 smp_no_rendevous_barrier)

0xd2daf0: 0, cf2b28, 0, b716a0
0xd2daf0: 0, d00058, 0, cde110
(d00058 was also at 0xd2da4c and was followed by cde110 there.)
(0000000000cde110 g     F .opd   0000000000000018 smp_no_rendevous_barrier)

0xd2db00: 0, cf2b88, 0, cf2b70
0xd2db00: 0, e6c280, 0,      0
(e6c280 is inside the emergency_buffer.7752 area: e6c278 up to e6c378)

0xd2db10:        0, cf2b58,        0, 8480
0xd2db10: 90000000,   1032, c0000000, 8740
(9000000000001032 looks like a SRR1 value?)
(0xc000000000008740 looks like a stack address?)

0xd2db20: 0, c2d920, 0, cf2b10
0xd2db20: 0, c2d920, 0, cf2b10 (yep: unchanged!)

0xd2db30: 0,   b71718,        0, c49888
0xd2db30: 0, ff846734, 10000000,   3030
(ff846734 would seem to be an openfirmware code address?)
(1000000000003030 looks like a SRR1 value?)

0xd2db40: 0, c498a0, 0,   c54000
0xd2db40: 0, c498a0, 0, ff846d78
(Yep: c498a0 was unchanged)
(ff846d78 is openfirmware entry point?)

0xd2db50:        0, e313a8, 0, e31608
0xd2db50: 24000042, e313a8, 0,      0
(24000042 looks like a cr value?)
(Trying to store to address 0x2400004200e313a8 for a specific
type of 10.1-STABLE build is how the problem was originally
noticed.)

0xd2db60: 0, c31f80, 0, bc81e8
0xd2db60: 0, c31f80, 0,      0
(Yep: 0x0000000000c31f80 is unchanged.)

0xd2db70:      0, e31408, 0, bc8228
0xd2db70: 200000, e31408, 0, bc8228
(Yep: Only the 0x200000 was a change.)

0xd2db80: 0, c32488,        0, bc8238
0xd2db80: 0,      1, 10000000,   3030
(1000000000003030 looks like a SRR1 value?)

0xd2db90: 0, e1e460, 0,   c31fc0
0xd2db90: 0,      0, 0, 7ff7e800

0xd2dba0: 0,   e31608, 0, bc8260
0xd2dba0: 0, 1000000a, 0, bc8260
(Yep: 0x0000000000bc8260 unchanged.)

0xd2dbb0: 0, e1e460, 0, e1fa60
0xd2dbb0: 0, e1e460, 0, e1fa60 (yep: unchanged!)

0xd2dbc0:      0, bc8288,        0, c32488
0xd2dbc0: 111081,      0, fd3c2000,      0
(fd3c2000 in openfirmware area?)

0xd2dbd0: 0, e3153c, 0, bc8298
0xd2dbd0: 10,     0, 0,      0

Now a few unchanged: 0xd2dbe0-0xd2dc1F

Then a change in the pattern of corruptions for the rest of the
corrupted area:

0xd2dc20: 0, bc8288,       0, bc82e8
0xd2dc20: 0, bc8288, 127f500, bc82e8

Note how bc8288 and bc82e8 did not change.
From here on those two columns are not
corrupted but the other two are.

0xd2dc30:       0, bc8300,      0, c32488
0xd2dc30: 8000000, bc8300, e7d540, c32488

0xd2dc40:     0, b4fef0,       0, e31558
0xd2dc40: ecc40, b4fef0, 84eec80, e31558

0xd2dc50:       0, bc8308,       0, cf2f00
0xd2dc50: 1e85440, bc8308, 8766200, cf2f00

0xd2dc60:      0, bc8310,       0, bc8350
0xd2dc60: fb9040, bc8310, 93bb000, bc8350

0xd2dc70:       0, c32038,       0, de5718
0xd2dc70: 94f6b00, c32038, 8632600, de5718

0xd2dc80:       0, de7768,       0, bc3760
0xd2dc80: 1fc0f40, de7768, 10f4b40, bc3760

0xd2dc90:       0, de7768,      0, e1fa00
0xd2dc90: 99e5700, cfc658, 228740, e1fa00

And after that things match for as far as I've looked: no corruptions.
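
For anyone re-reading the pairs above, here is a small C sketch of the
bookkeeping involved: joining two adjacent 32-bit words of an x/x line
into one 64-bit value, counting changed words in a good/corrupted pair,
and testing a few MSR bits on the values guessed to be SRR1 images. The
function names are made up for illustration; the MSR masks are the
standard 64-bit PowerPC bit positions, and reading those values as saved
MSRs is still only the guess stated in the notes.

```c
#include <stdint.h>

/* A few MSR bits of 64-bit PowerPC (SRR1 holds a saved MSR image). */
#define MSR_SF	0x8000000000000000ULL	/* 64-bit mode */
#define MSR_HV	0x1000000000000000ULL	/* hypervisor state */
#define MSR_ME	0x0000000000001000ULL	/* machine check enable */
#define MSR_IR	0x0000000000000020ULL	/* instruction relocate */
#define MSR_DR	0x0000000000000010ULL	/* data relocate */

/* Join two adjacent 32-bit words of an x/x line, big-endian order. */
static uint64_t
join_words(uint32_t hi, uint32_t lo)
{
	return (((uint64_t)hi << 32) | lo);
}

/* Count how many of the four words differ between a good and a bad line. */
static int
changed_words(const uint32_t good[4], const uint32_t bad[4])
{
	int i, n = 0;

	for (i = 0; i < 4; i++)
		if (good[i] != bad[i])
			n++;
	return (n);
}

/* Nonzero if every bit in mask is set in msr. */
static int
msr_has(uint64_t msr, uint64_t mask)
{
	return ((msr & mask) == mask);
}
```

For the 0xd2db50 pair, two words differ and the first two corrupted
words join into 0x2400004200e313a8, the address noted for that row. If
the SRR1 guesses are right, 0x9000000000001032 has SF set while
0x1000000000003030 does not, which would fit values saved on either side
of a 64-bit/32-bit mode switch; that reading is an inference, not
something verified here.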





===
Mark Millard
markmi at dsl-only.net







