FreeBSD Mail Archives

Date:      Mon, 9 Feb 2015 08:28:06 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: HEADS UP: ... [Better info for __syncicache exceptions at a dcbst: %DAR holding address not covered by PTEG's]
Message-ID:  <DA3763F9-748F-4A55-A4E3-47A14649EF26@dsl-only.net>
In-Reply-To: <DBFD0918-6274-47D3-8B35-5178ABD16728@dsl-only.net>
References:  <335C8DCD-33DF-4430-A0FA-77669C513C61@dsl-only.net> <449E0C48-B57D-4873-B2E7-BC217D891897@dsl-only.net> <E527514A-96F6-4794-8F03-504E51EC8CCB@dsl-only.net> <DBFD0918-6274-47D3-8B35-5178ABD16728@dsl-only.net>

An early copy/paste error propagated throughout my figures and messes =
things up. So I try again...

Also it turns out openfirmware has a "word" for dumping the registers =
for a crash: .registers

So I can coordinate some register values with __syncicache operation and =
that is better information than I tried to give before anyway. I used a =
failing direct boot attempt for the kernel10.1S that has len =3D =
0x10175d0.

01c27834 <__syncicache>      lis     r9,452
01c27838 <__syncicache+0x4>  lwz     r10,-25800(r9)

Note: cacheline_size =3D r10 above ends up being 0x20.
Yep the code is treating the 970MP processor as having a very small =
cache line size: 32. My understanding is that it is really 128. So each =
line is dcbst'd 4 times (offsets 0, 32, 64, and 96).

01c2783c <__syncicache+0x8>  addi    r11,r10,-1

Note: cacheline_size-1 =3D r11 above (0x1F).

[So fixing both my initial copy/paste of the from value and the =
cacheline_size-1 figures:

off =3D=3D from & (cacheline_size-1)
off =3D=3D 0x100000 & 0x1F (in all 3 cases)
off =3D=3D 0x0

So in essence that old material should be ignored.]

01c27840 <__syncicache+0xc>  and     r0,r3,r11
01c27844 <__syncicache+0x10> add     r0,r0,r4
01c27848 <__syncicache+0x14> andc    r8,r3,r11
01c2784c <__syncicache+0x18> subf    r0,r10,r0
01c27850 <__syncicache+0x1c> neg     r7,r10
01c27854 <__syncicache+0x20> mr      r9,r8
01c27858 <__syncicache+0x24> mr      r11,r0
01c2785c <__syncicache+0x28> dcbst   r0,r9

Note: p =3D r9 above.
The register dump shows r9=3D0xbe7000 and shows DAR=3D0xbe7000.
It also shows SRR0=3D0x1c2785c and interrupt vector 0x300.
DSISR=3D0x40000000 (not found in the primary or secondary PTEG).
And:
r10=3D0x20, r6=3D0x5305d0, r11=3D5305b0, r7=3D0xffffffffffffffe0 (so =
-0x20)
(r6,r11 together are related to l in the C source.)

01c27860 <__syncicache+0x2c> add     r11,r11,r7
01c27864 <__syncicache+0x30> add.    r6,r10,r11
01c27868 <__syncicache+0x34> add     r9,r9,r10
01c2786c <__syncicache+0x38> bgt+    01c2785c <__syncicache+0x28>
01c27870 <__syncicache+0x3c> sync   =20
01c27874 <__syncicache+0x40> icbi    r0,r8
01c27878 <__syncicache+0x44> add     r0,r0,r7
01c2787c <__syncicache+0x48> add.    r9,r0,r10
01c27880 <__syncicache+0x4c> add     r8,r8,r10
01c27884 <__syncicache+0x50> bgt+    01c27874 <__syncicache+0x40>
01c27888 <__syncicache+0x54> sync   =20
01c2788c <__syncicache+0x58> isync
01c27890 <__syncicache+0x5c> blr

So the problem seems to be not at the ends of the range but in the =
middle of it: a hole in the address range with no PTEG coverage.

=3D=3D=3D
Mark Millard
markmi@dsl-only.net

On 2015-Feb-8, at 04:20 PM, Mark Millard <markmi@dsl-only.net> wrote:

[Side note: My prior comment about having additional signed/unsigned =
conversion code related to "int len" in __syncicache was wrong. That the =
calling argument (*result)->f_size has size_t as a type ends up just =
depending on the value not being so big as to cause problems when it is =
converted to "int len" for __syncicache's len parameter.]

%SRR0 being 0x1c2785c is a loader code address and does not change =
unless one switches/updates the loader.

objdump shows the following for the unstripped variant of the loader =
that is in use =
(/usr/obj/usr/home/markmi/src_10_1_releng/sys/boot/powerpc/ofw/loader):

01c27834 <__syncicache>      lis     r9,452
01c27838 <__syncicache+0x4>  lwz     r10,-25800(r9)
01c2783c <__syncicache+0x8>  addi    r11,r10,-1
01c27840 <__syncicache+0xc>  and     r0,r3,r11
01c27844 <__syncicache+0x10> add     r0,r0,r4
01c27848 <__syncicache+0x14> andc    r8,r3,r11
01c2784c <__syncicache+0x18> subf    r0,r10,r0
01c27850 <__syncicache+0x1c> neg     r7,r10
01c27854 <__syncicache+0x20> mr      r9,r8
01c27858 <__syncicache+0x24> mr      r11,r0
01c2785c <__syncicache+0x28> dcbst   r0,r9
01c27860 <__syncicache+0x2c> add     r11,r11,r7
01c27864 <__syncicache+0x30> add.    r6,r10,r11
01c27868 <__syncicache+0x34> add     r9,r9,r10
01c2786c <__syncicache+0x38> bgt+    01c2785c <__syncicache+0x28>
01c27870 <__syncicache+0x3c> sync   =20
01c27874 <__syncicache+0x40> icbi    r0,r8
01c27878 <__syncicache+0x44> add     r0,r0,r7
01c2787c <__syncicache+0x48> add.    r9,r0,r10
01c27880 <__syncicache+0x4c> add     r8,r8,r10
01c27884 <__syncicache+0x50> bgt+    01c27874 <__syncicache+0x40>
01c27888 <__syncicache+0x54> sync   =20
01c2788c <__syncicache+0x58> isync
01c27890 <__syncicache+0x5c> blr

So the address in %SRR0 that is reported is for:

01c2785c <__syncicache+0x28> dcbst   r0,r9

The source shows:

void
__syncicache(void *from, int len)
{
       register_t l, off;
       char    *p;

       off =3D (uintptr_t)from & (cacheline_size - 1);
       l =3D len +=3D off;
       p =3D (char *)from - off;

       do {
               __asm __volatile ("dcbst 0,%0" :: "r"(p));
               p +=3D cacheline_size;
       } while ((l -=3D cacheline_size) > 0);
       __asm __volatile ("sync");
       p =3D (char *)from - off;
       do {
               __asm __volatile ("icbi 0,%0" :: "r"(p));
               p +=3D cacheline_size;
       } while ((len -=3D cacheline_size) > 0);
       __asm __volatile ("sync; isync");
}

Comparing the 3 examples of dcbst use for off and initial len values =
(using cacheline_size =3D=3D 128 so cacheline_size-1 =3D=3D 0x7F):

off =3D=3D 0x1c35b48 & 0x7F (in all 3 cases)
off =3D=3D 0x48

len =3D=3D 0x1014b80 is cache-line-multiple sized: This is the case that =
works fine.
from+len   =3D=3D 0x1c35b48 + 0x1014b80
from+len   =3D=3D 0x2c4A6C8
from+len-1 =3D=3D 0x2c4A6C7
start at 0x1c35b00
end with 0x2c4A680 (spans through 0x2c4A6FF)

len =3D=3D 0x1014bb0 is NOT cache-line-multiple sized: FAILS!
from+len   =3D=3D 0x1c35b48 + 0x1014b80
from+len   =3D=3D 0x2c4A6F8
from+len-1 =3D=3D 0x2c4A6F7
start at 0x1c35b00
end with 0x2c4A680 (spans through 0x2c4A6FF)
NOTE: Same start/end range as above but fails by causing an exception!

len =3D=3D 0x10175d0 is NOT cache-line-multiple sized: FAILS!
from+len   =3D=3D 0x1c35b48 + 0x10175d0
from+len   =3D=3D 0x2c4D118
from+len-1 =3D=3D 0x2c4D117
start at 0x1c35b00
end with 0x2c4D100 (spans through 0x2c4A67F)

The 3 cases have the same number of "extra before from" in the =
cache-lines.
But the 3 cases have varying numbers of "extra after from+len-1".

Only for len being a multiple of the cache-line-size would "extra before =
from" + "extra after from+len-1" be the cache-line-size (or zero if from =
was cache-line-aligned).

My guess is that something is configured based on "extra before from" + =
"extra after from+len-1" being the cache-line-size (or zero if from was =
cache-line-aligned) when it does not apply. So the processor ends up =
rejecting some bytes during a dcbst instruction for len values that are =
not multiples of the cache-line size.

For __syncicache's use from sys/boot/ofw/libofw/ppc64_elf_freebsd.c I've =
not figured out what all is configured related to the the range =
[from,from+len) (excludes from+len) --i.e. the range [from,from+len-1] =
since 0<len. where

from =3D=3D (void *) (*result)->f_addr
len  =3D=3D (*result)->f_size

Blindly listing things that can cause dcbst to exception (quotes form =
PowerISA_203_Final_Public.pdf):

> A Data Storage interrupt occurs when no higher priority exception =
exists and a data access cannot be per- formed for any of the following =
reasons. Data address translation is enabled (MSRDR=3D1) and the virtual =
address of any byte of the storage location specified by a Load, Store, =
icbi, dcbz, dcbst, dcbf[l], eciwx, or ecowx instruction cannot be =
translated to a real address.=20

> A Data Segment interrupt occurs when no higher prior- ity exception =
exists and a data access cannot be per- formed because data address =
translation is enabled and the effective address of any byte of the =
storage location specified by a Load, Store, icbi, dcbz, dcbst, dcbf[l] =
eciwx, or ecowx instruction cannot be trans- lated to a virtual address.=20=

> dcbf, dcbfep, dcbst, and dcbstep instructions are treated as Loads =
with respect to protection. Flushing or storing a line from the cache is =
not considered a Store since the store has already been done to update =
the cache and the dcbf, dcbfep, dcbst, or dcbstep instruction is only =
updating the copy in main storage. As a Load, they can cause Read Access =
Control exception type Data Storage interrupts.=20

(A table indicates Yes for Read Protection Violation and No for Write =
Protection Violation.)

> On implementations on which a Machine Check interrupt can be caused by =
referring to an invalid real address, executing a dcbz, dcbzep, or dcba =
instruction can cause a delayed Machine Check interrupt by establishing =
in the data cache a block that is associated with an invalid real =
address. See Section 3.2 of Book II. A Machine Check interrupt can =
eventually occur if and when a subsequent attempt is made to write that =
block to main storage, for example as the result of executing an =
instruc- tion that causes a cache miss for which the block is the target =
for replacement or as the result of exe- cuting a dcbst, dcbstep, dcbf, =
or dcbfep instruc- tion.=20

The dcbst related details for the code are (picking a failing example's =
argument values):

from  =3D=3D 0x1c35b48
len   =3D=3D 0x1014bb0 (example)

starting at 0x1c35b00
ending with 0x2c4A680 (but it will span through 0x2c4A6FF)

The code's calculations are based on:

off =3D=3D (uintptr_t)from & (cacheline_size - 1)
off =3D=3D 0x1c35b48 & 0x7F
off =3D=3D 0x48

(len as in the argument here, not later updates)
len+off =3D=3D 0x1014bb0 + 0x48
len+off =3D=3D 0x1014BF8
l =3D=3D 0x1014BF8 (initially)

Note that len+off is not a multiple of the cache size. Repeated =
subtractions of the cache size will end up with a non-zero remainder.

(p as it is initialized here, not later updates)
p   =3D=3D from - off
p   =3D=3D 0x1c35b48 - 0x48
p   =3D=3D 0x1c35b00
p+l =3D=3D 0x1c35b00+0x1014BF8
p+l =3D=3D 0x2c4A6F8

Note that 0x2c4A6F8 this is not cache aligned.
Note that as the dcbst loop progresses p+l is constant (other than in =
the middle of updating the pair).

So when...
l=3D=3D0xF8: p=3D=3D0x2c4A600
l=3D=3D0x78: p=3D=3D0x2c4A680
l<0: no dcbst for p

It appears that the dcbst loop does span the range of cache-lines that =
it should.

But apparently the processor is rejecting some bytes from that range =
when full cache-line sized units are counted. I would guess the =
rejection is on the l=3D0x78 side of things that varies from example to =
example.

=3D=3D=3D
Mark Millard
markmiatdsl-only.net

On 2015-Feb-8, at 01:16 AM, Mark Millard <markmi at dsl-only.net> wrote:

I've narrowed down greatly where the crashes happen, which need not be =
where root-cause is: that could be earlier.

In the following code [I'm using 10.1-RELEASE-p5 for reference here]

$ svnlite diff sys/boot/ofw/libofw/ppc64_elf_freebsd.c=20
Index: sys/boot/ofw/libofw/ppc64_elf_freebsd.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
--- sys/boot/ofw/libofw/ppc64_elf_freebsd.c	(revision 277808)
+++ sys/boot/ofw/libofw/ppc64_elf_freebsd.c	(working copy)
@@ -59,7 +59,11 @@
	 * be done by the kernel after relocation.
	 */
	if (!strcmp((*result)->f_type, "elf kernel"))
+{
+printf("ppc64_ofw_elf_loadfile before __syncicache\n");
		__syncicache((void *) (*result)->f_addr, =
(*result)->f_size);
+printf("ppc64_ofw_elf_loadfile after __syncicache\n");
+}
	return (0);
}

for a directly-bootable (no crash) kernel-build both printf's are =
displayed. (The above code part of /boot/loader .)

But for the kernels that I build that fail to directly boot only the =
first of the two printf's is displayed when a direct boot is attempted: =
Openfirmware's notice with %SRR0 and %SRR1 shows up after that instead =
of the text from the second printf.

Based on that much it looks like the crash is either in evaluating the =
arguments to __syncicache or happens during __syncicache's execution, =
not after.

Changing the first printf to something like the sequence:

printf("ppc64_ofw_elf_loadfile before __syncicache\n");
printf("(void*)result: %p\n",(void*)result);
printf("(void*)(*result): %p\n",(void*)(*result));
printf("(void*)(*result)->f_addr: %p\n",(void*)(*result)->f_addr);
printf("(*result)->f_size : 0x%lx\n",(*result)->f_size);

Shows that all of the stages print before the crash happens, answering =
the question about evaluation of the arguments: there is no problem =
evaluating them.

So the crashes are strictly during __syncicache's activity.

Only (*result)->f_size varies between the 3 examples that I've used:

10.1-RELEASE-p5 variant without VERBOSE_SYSINIT (and BOOTVERBOSE, =
BOOTHOWTO). (boots fine)
10.1-RELEASE-p5 variant with VERBOSE_SYSINIT (and BOOTVERBOSE, =
BOOTHOWTO). (crashes)
10.1-STABLE (-r278028) variant without VERBOSE_SYSINIT (and BOOTVERBOSE, =
BOOTHOWTO). (crashes)

What the above printf's reported was:

(void*)result:    0x1c35b48
(void*)(*result):   0x1ebc0
(*result)->f_addr: 0x100000

10.1-RELEASE-p5 variant without VERBOSE_SYSINIT: (*result)->f_size:      =
  0x1014b80
10.1-RELEASE-p5 variant with VERBOSE_SYSINIT: (*result)->f_size:         =
  0x1014bb0
10.1-STABLE (-r278028) variant without VERBOSE_SYSINIT: =
(*result)->f_size: 0x10175d0

(Listed in increasing order.)

As I remember for the last two the crash report listed %SRR0: 0x1c2785c =
for both. The "without VERBOSE_SYSINIT" one above does not crash and for =
it the "ppc64_ofw_elf_loadfile after __syncicache" message shows up as =
it should.

=3D=3D=3D
Mark Millard
markmi at dsl-only.net

On 2015-Feb-7, at 09:43 AM, Mark Millard <markmi at dsl-only.net> wrote:

Correction to earlier Email: VERBOSE_SYSINIT with DDB (and GDB) all =
enabled (indirectly booted via using kernel10.1RE) got 0x1c277ec for the =
%SRR0 value, not 0x1c277fc. So slightly different than Kernel10.1S's =
0x1c277fc (for this 10.1-STABLE variant). (I looked at the wrong notes =
when composing the original Email.)

More comparisons of kernel build options:

VERBOSE_SYSINIT enabled with DDB (and GDB) disabled still has the =
booting problem for my 10.1-RELEASE-p5 variant. It also still has the =
0x1c277ec for the %SRR0 value.

For VERBOSE_SYSINIT disabled (DDB and GDB enabled) directly booted...

Preloaded elf kernel "/boot/kernel/kernel" at 0x1106000.
...
real memory  =3D 17152118784 (16357 MB)
available KVA =3D 7222611967 (6888 MB)
Physical memory chunk(s):
0x0000000000024000 - 0x00000000000fffff, 901120 bytes (220 pages)
0x0000000001115000 - 0x00000000017fffff, 7254016 bytes (1771 pages)
0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages)
0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages)
0x0000000004cbd000 - 0x000000000fffffff, 187969536 bytes (45891 pages)
0x0000000020000000 - 0x000000007f5effff, 1600061440 bytes (390640 pages)
0x0000000100000000 - 0x0000000466827fff, 14604730368 bytes (3565608 =
pages)
0x0000000200000000 - 0x00000001ffffffff, 0 bytes (0 pages)
0x0000000300000000 - 0x00000002ffffffff, 0 bytes (0 pages)
0x0000000400000000 - 0x00000003ffffffff, 0 bytes (0 pages)
avail memory =3D 16374190080 (15615 MB)

So 0x1c277ec is between the two:

0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages)
0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages)

(But I do not know what most of the regions and holes are supposed to =
be.)

VERBOSE_SYSINIT, DDB, and GDB enabled but indirectly booted via =
kernel10.1RE (via /boot/loader.conf's kernel=3D"kernel10.1RE"), =
stopping, unloading, then doing "boot kernel":

Preloaded elf kernel "/boot/kernel/kernel" at 0x1116000.
...
real memory  =3D 17152118784 (16357 MB)
available KVA =3D 7222611967 (6888 MB)
Physical memory chunk(s):
0x0000000000024000 - 0x00000000000fffff, 901120 bytes (220 pages)
0x0000000001105000 - 0x0000000001114fff, 65536 bytes (16 pages)
0x0000000001125000 - 0x00000000017fffff, 7188480 bytes (1755 pages)
0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages)
0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages)
0x0000000004cbd000 - 0x000000000fffffff, 187969536 bytes (45891 pages)
0x0000000020000000 - 0x000000007f5effff, 1600061440 bytes (390640 pages)
0x0000000100000000 - 0x0000000466827fff, 14604730368 bytes (3565608 =
pages)
0x0000000200000000 - 0x00000001ffffffff, 0 bytes (0 pages)
0x0000000300000000 - 0x00000002ffffffff, 0 bytes (0 pages)
0x0000000400000000 - 0x00000003ffffffff, 0 bytes (0 pages)
avail memory =3D 16374190080 (15615 MB)

=3D=3D=3D
Mark Millard
markmi at dsl-only.net

On 2015-Feb-7, at 03:49 AM, Mark Millard <markmi at dsl-only.net> wrote:

Nathan, you had the below written about my problems with booting my =
builds of, say, 10.1-STABLE  (kernel=3D"kernel10.1S" in =
/boot/loaderl.conf) without involving the kernel from my build of =
10.1-RELEASE-p5 (kernel=3D"kernel10.1RE" or sometimes kernel=3D"kernel" =
in /boot/loader.conf), where kernel=3D"kernel10.1RE" in =
/boot/loader.conf boots just fine...

> So this has to be some kind of icache issue. If you unload and reload=20=

> the *same* kernel, does it also help?
> -Nathan

(Part of the evidence was: Using kernel=3D"kernel10.1RE" in =
/boot/loader.conf, stopping at the 10sec prompt, unloading, and doing =
"boot kernel 10.1S" lets my 10.1-STABLE builds boot that will not boot =
directly.)

Well I've got a little more information from a different direction: A =
way to create the problem when building my 10.1-RELEASE-p5 kernel is to =
enable VERBOSE_SYSINIT. More specifically the comparison/contrast I've =
done so far is...

I added the following 3 lines to my GENERIC64vtsc for my 10.1-RELEASE-p5 =
source tree (no other changes elsewhere at all)

options 	VERBOSE_SYSINIT
options 	BOOTVERBOSE=3D1
options 	BOOTHOWTO=3DRB_VERBOSE

and rebuilt kernel the via KERNCONF=3DGENERIC64vtsc INSTKERNNAME=3Dkernel =
the resulting kernel load fails if referenced by /boot/loader.conf via =
kernel=3D"kernel" line. The %SRR0 address value listed is the same as =
for kernel10.1S: 1c277fc. But booting using kernel=3D"kernel10.1RE" in =
/boot/loader.conf, stopping at the 10sec wait, unloading, and typing =
"boot kernel" boots fine --just like "boot kernel10.1S".

Note: GENERIC64vtsc has option DDB enabled (and GBD too). (This is =
associated my with my information gathering for early G5 boot =
crashes/hangups.)

Note: This is the first time I've ever tried any of those 3 options. My =
kernel10.1S build was not based on them.

Then I changed the 3 lines by just commenting out the first of the 3 =
that I had added

#options 	VERBOSE_SYSINIT
options 	BOOTVERBOSE=3D1
options 	BOOTHOWTO=3DRB_VERBOSE

and rebuilt via KERNCONF=3DGENERIC64vtsc INSTKERNNAME=3Dkernel again. =
The resulting /boot/kernel/... boots just fine when kernel=3D"kernel" is =
used in /boot/loader.conf : no need for using kernel10.1RE or for =
stopping to do anything special.

=3D=3D=3D
Mark Millard
markmi at dsl-only.net

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?DA3763F9-748F-4A55-A4E3-47A14649EF26>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation