Date:      Wed, 28 Jan 2015 16:57:45 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: 10.1 powerpc64 kernel build/boot-ability oddity (PowerMac):10.1-RELEASE-p4 boots 10.1-STABLE fails to
Message-ID:  <78919D58-B433-404C-ACBD-388EA66B9821@dsl-only.net>
In-Reply-To: <2B4FCA85-6874-41D8-A093-E87EC96CB5FA@dsl-only.net>
References:  <2B4FCA85-6874-41D8-A093-E87EC96CB5FA@dsl-only.net>

I was aware of the issue from the page Nathan referenced but my context
is backwards from the expected issue and from Nathan's wording (below):

A) When I do *not* stop it to switch kernels at the loader prompt is
when 10.1-STABLE *crashes*. (True both of loader.conf having kernel= set
to pick out /boot/kernel10.1S/ and of cp -ax of 10.1-STABLE to
/boot/kernel/, with loader.conf defaulted or explicit about
/boot/kernel/.)

B) When I *do* stop it and explicitly switch from 10.1-RELENG to
10.1-STABLE at the loader prompt is when it *works* fine for
10.1-STABLE.
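
For reference, the loader.conf variant of case A amounts to a fragment
like the one below. This is an illustrative sketch only: the directory
name comes from the /boot listing quoted later in this message, and
kernel is the standard loader.conf variable that names the directory
under /boot to load the kernel from.

```sh
# /boot/loader.conf (illustrative fragment, not a copy of the real file)
# Load /boot/kernel10.1S/kernel instead of the default /boot/kernel/kernel.
kernel="kernel10.1S"
```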

So far I've tried all my usual permutations of make.conf and src.conf
settings and the behavior is unchanged across the various builds.

I've tried building -r275566, -r276979, and -r277483 of 10.1-STABLE and
they all get the same result in my tests.

An interesting point is that across all those 10.1-STABLE builds the
following two lines are always the same for the failure (no variation in
address in SRR0 or in SRR1's value):

%SRR0: 00000000.01c277fc
%SRR1: 10000000.00003030
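
SRR1 holds a saved copy of the MSR from the point where the exception
was taken, so its bits can be named. The following is a minimal sketch
for doing that, assuming the usual Power ISA MSR bit positions (SF, HV,
EE, PR, FP, ME, IR, DR, RI) and ignoring any exception-specific status
bits that SRR1 can also carry. The function name decode_srr1 is made up
for this note.

```sh
#!/bin/sh
# decode_srr1 HIGH.LOW: name the MSR bits set in an SRR1 value printed
# as two 32-bit hex words, the way the crash display shows it.
# Bit positions are assumed from the Power ISA MSR layout.
decode_srr1() {
    hi=$((0x${1%%.*}))   # upper 32 bits (text before the dot)
    lo=$((0x${1##*.}))   # lower 32 bits (text after the dot)
    out=""
    if [ $((hi & 0x80000000)) -ne 0 ]; then out="$out SF"; fi  # 64-bit mode
    if [ $((hi & 0x10000000)) -ne 0 ]; then out="$out HV"; fi  # hypervisor state
    if [ $((lo & 0x8000)) -ne 0 ]; then out="$out EE"; fi      # external interrupts enabled
    if [ $((lo & 0x4000)) -ne 0 ]; then out="$out PR"; fi      # problem (user) state
    if [ $((lo & 0x2000)) -ne 0 ]; then out="$out FP"; fi      # floating point available
    if [ $((lo & 0x1000)) -ne 0 ]; then out="$out ME"; fi      # machine check enabled
    if [ $((lo & 0x0020)) -ne 0 ]; then out="$out IR"; fi      # instruction relocation on
    if [ $((lo & 0x0010)) -ne 0 ]; then out="$out DR"; fi      # data relocation on
    if [ $((lo & 0x0002)) -ne 0 ]; then out="$out RI"; fi      # recoverable interrupt
    echo "${out# }"
}

decode_srr1 10000000.00003030
```

For the value above this prints "HV FP ME IR DR". Under the stated
assumptions, SF is clear, which would mean the exception was taken in
32-bit mode with address translation enabled.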

The failure message normally says "Invalid memory address" but
occasionally says "Decrementer exception".

I have yet to find a way to build 10.1-STABLE that works for direct
booting but I've no problems with any 10.1-RELENG (or the 10.1-RELEASE)
based builds that I've tried.

I'll slowly keep looking into it. (Generally other things are limiting
me to synchronizing world and kernel once in a while for FreeBSD. My
time is mostly going elsewhere still.)


> Nathan wrote:
>
> This is a bug in loader, unfortunately. Due to the way that it interacts
> with Open Firmware's memory management, it is not in general possible to
> change kernels at the loader prompt. Depending on memory layout,
> sometimes it will work (as you noticed) and sometimes it will enter an
> inconsistent state, usually crashing very early (as you also noticed).
> This is the one "known issue" mentioned on the PowerPC port website at
> http://www.freebsd.org/platforms/ppc.html.
> -Nathan



===
Mark Millard
markmi@dsl-only.net

On 2015-Jan-26, at 03:25 AM, Mark Millard <markmi@dsl-only.net> wrote:

I discovered that I have a 10.1 powerpc64 kernel build/boot-ability
oddity (PowerMac).  First some context:

The builds are/were done on a PowerMac G5 quad-core.

$ ls -Fpald /boot/kernel*
drwxr-xr-x  2 root  wheel  26624 Jan 19 22:26 /boot/kernel/
drwxr-xr-x  2 root  wheel  26624 Jan 19 22:26 /boot/kernel.old/
drwxr-xr-x  2 root  wheel  26624 Jan 19 22:26 /boot/kernel10.1RE/
drwxr-xr-x  2 root  wheel  26624 Jan 23 23:44 /boot/kernel10.1S/
drwxr-xr-x  2 root  wheel  26624 Jan 25 19:52 /boot/kernel10.1S-alt/
$ freebsd-version -ku
10.1-RELEASE-p4
10.1-STABLE

kernel/, kernel.old/, and kernel10.1RE/ are all copies of each other
currently (cp -xa ...). They are my build of a variant of
10.1-RELEASE-p4. The other two are builds of variants of 10.1-STABLE
kernels (r276979 and r277483 variants).
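
A recursive comparison is one way to double-check that such directories
really are copies of each other. This is a minimal sketch, assuming a
diff that supports -r and -q (both the FreeBSD and GNU versions do). The
helper name same_tree is hypothetical. Note that it compares file
contents only, not ownership, modes, or timestamps.

```sh
#!/bin/sh
# same_tree DIR1 DIR2: succeed only if both trees hold the same files
# with identical contents (-r recurses, -q only reports whether files differ).
same_tree() {
    diff -rq "$1" "$2" >/dev/null 2>&1
}

# Example use against the directories listed above.
for d in /boot/kernel.old /boot/kernel10.1RE; do
    if same_tree /boot/kernel "$d"; then
        echo "$d matches /boot/kernel"
    else
        echo "$d differs from /boot/kernel (or is missing)"
    fi
done
```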

In this configuration I can boot kernel just fine. I can also stop in
Open Firmware and type any of...

boot kernel.old
boot kernel10.1RE
boot kernel10.1S
boot kernel10.1S-alt

and the boot works fine and "uname -a" then agrees with whichever one
that I picked. For example boot kernel10.1S-alt results in:

$ uname -a
FreeBSD FBSDG5M1 10.1-STABLE FreeBSD 10.1-STABLE #8 r277483M: Sun Jan 25 19:51:41 PST 2015     root@FBSDG5M1:/usr/obj/usr/src/sys/GENERIC64vtsc  powerpc

But if I do either of the following and then try to "shutdown -r now"
afterwards I end up with a decrementer error (sometimes) or addressing
error (the rest of the time, which is most of the time). This is while
Open Firmware is still displaying things before I can stop it by typing.
I end up with the options "mac-boot" and "shut-down".

cp -ax /boot/kernel10.1S/ /boot/kernel/
cp -ax /boot/kernel10.1S-alt/ /boot/kernel/

Power-off/power-on gets the same kinds of failures that "shutdown -r
now" gets. (Note: I will focus on kernel10.1S-alt since my source tree
has been updated after I built kernel10.1S so it no longer fully
matches.)

Booting from a USB stick instead of the SSD (cmd-option-OF, boot
ud:2,\ppc\bootinfo.txt) and picking shell, doing an appropriate mount,
and then one of

cp -ax /boot/kernel.old/ /boot/kernel/
cp -ax /boot/kernel10.1RE/ /boot/kernel/

and then umount and "shutdown -r now" reboots fine and things are back
to normal for future booting.

It seems that 10.1-RELEASE-p4 establishes context for 10.1-STABLE that
10.1-STABLE does not correctly establish for itself, at least in my
builds. But I've no clue what the issue is yet.



Context notes:

I have multiple source trees (with 10.1-STABLE in /usr/src and the other
elsewhere). I use "make -j 8 kernel KERNCONF=GENERIC64vtsc
INSTKERNNAME=...". (The later svnlite status "?" lines for any extra
files are not shown.)

$ svnlite info ~markmi/src_10_1_releng
Path: /home/markmi/src_10_1_releng
Working Copy Root Path: /home/markmi/src_10_1_releng
URL: https://svn0.us-west.freebsd.org/base/releng/10.1
Relative URL: ^/releng/10.1
Repository Root: https://svn0.us-west.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 277195
Node Kind: directory
Schedule: normal
Last Changed Author: delphij
Last Changed Rev: 277195
Last Changed Date: 2015-01-14 13:27:46 -0800 (Wed, 14 Jan 2015)

$ svnlite info /usr/src
Path: /usr/src
Working Copy Root Path: /usr/src
URL: https://svn0.us-west.freebsd.org/base/stable/10
Relative URL: ^/stable/10
Repository Root: https://svn0.us-west.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 277483
Node Kind: directory
Schedule: normal
Last Changed Author: smh
Last Changed Rev: 277483
Last Changed Date: 2015-01-21 01:45:48 -0800 (Wed, 21 Jan 2015)

$ svnlite status ~markmi/src_10_1_releng
M       /home/markmi/src_10_1_releng/sys/ddb/db_main.c
M       /home/markmi/src_10_1_releng/sys/ddb/db_script.c
M       /home/markmi/src_10_1_releng/sys/powerpc/ofw/ofw_machdep.c
M       /home/markmi/src_10_1_releng/sys/powerpc/ofw/ofwcall64.S
M       /home/markmi/src_10_1_releng/sys/powerpc/powermac/powermac_thermal.c
$ svnlite status /usr/src
M       /usr/src/sys/ddb/db_main.c
M       /usr/src/sys/ddb/db_script.c
M       /usr/src/sys/powerpc/ofw/ofw_machdep.c
M       /usr/src/sys/powerpc/ofw/ofwcall64.S
M       /usr/src/sys/powerpc/powermac/powermac_thermal.c

All of the above except powermac_thermal.c are tied to my trying to
produce evidence for intermittent PowerMac G5 boot issues that occur
later than what I'm reporting here. I will not get into the details for
why but I've set up to use a Justin Hibbits patch for
powermac_thermal.c, not that I need it for the PowerMac that I'm using
for this note. (I move the same SSD around between machines.)

I used svnlite diff for each of the above to produce .diff files.
Diffing the .diffs and then the original files is shown below (no
differences other than the revision numbers).

$ diff src10.1-RELENG.diff src10.1-STABLE.diff
3c3
< --- sys/ddb/db_main.c	(revision 277195)
---
> --- sys/ddb/db_main.c	(revision 277483)
27c27
< --- sys/ddb/db_script.c	(revision 277195)
---
> --- sys/ddb/db_script.c	(revision 277483)
57c57
< --- sys/powerpc/ofw/ofw_machdep.c	(revision 277195)
---
> --- sys/powerpc/ofw/ofw_machdep.c	(revision 277483)
73c73
< --- sys/powerpc/ofw/ofwcall64.S	(revision 277195)
---
> --- sys/powerpc/ofw/ofwcall64.S	(revision 277483)
401c401
< --- sys/powerpc/powermac/powermac_thermal.c	(revision 277195)
---
> --- sys/powerpc/powermac/powermac_thermal.c	(revision 277483)
$ diff ~markmi/src_10_1_releng/sys/ddb/db_main.c /usr/src/sys/ddb/db_main.c
$ diff ~markmi/src_10_1_releng/sys/ddb/db_script.c /usr/src/sys/ddb/db_script.c
$ diff ~markmi/src_10_1_releng/sys/powerpc/ofw/ofw_machdep.c /usr/src/sys/powerpc/ofw/ofw_machdep.c
$ diff ~markmi/src_10_1_releng/sys/powerpc/ofw/ofwcall64.S /usr/src/sys/powerpc/ofw/ofwcall64.S
$ diff ~markmi/src_10_1_releng/sys/powerpc/powermac/powermac_thermal.c /usr/src/sys/powerpc/powermac/powermac_thermal.c
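
The comparison of the two .diff files can be made less noisy by
stripping the "(revision N)" suffix before comparing, so that only real
differences would show up. This is a sketch, assuming the .diff files
were produced by svnlite diff as described and that sed accepts -E (the
FreeBSD and GNU versions do). The names strip_rev and same_patch are
invented here.

```sh
#!/bin/sh
# strip_rev FILE: print FILE with the trailing "(revision N)" removed
# from the "---" and "+++" header lines of an svn-style patch.
strip_rev() {
    sed -E 's/^((---|\+\+\+) .*)[[:space:]]\(revision [0-9]+\)$/\1/' "$1"
}

# same_patch A B: succeed if the two patches are identical once the
# revision numbers in their headers are ignored.
same_patch() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    strip_rev "$1" >"$a"
    strip_rev "$2" >"$b"
    cmp -s "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}

# Intended use with the files named above:
#   same_patch src10.1-RELENG.diff src10.1-STABLE.diff && echo "patches match"
```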

The same variant of GENERIC64 is used for both the source trees: I call
it GENERIC64vtsc:

$ more sys/powerpc/conf/GENERIC64vtsc
include GENERIC64
ident   GENERIC64vtsc

nooptions       PS3                     #Sony Playstation 3               HACK!!! to allow sc

options         DDB                     # HACK!!! to dump early crash info (but 11.0-CURRENT already has it)
options         GDB                     # HACK!!! ...
#options        KTR
#options        KTR_MASK=KTR_TRAP
#options        KTR_CPUMASK=0xF
#options        KTR_VERBOSE

# HACK!!! to allow sc for 2560x1440 display on Radeon X1950 that vt historically mishandled during booting
device          sc
#device          kbdmux         # HACK: already listed by vt
options         SC_OFWFB        # OFW frame buffer
options         SC_DFLT_FONT    # compile font in
makeoptions     SC_DFLT_FONT=cp437


# Disable extra checking typically used for FreeBSD 11.0-CURRENT:
nooptions       DEADLKRES               #Enable the deadlock resolver
nooptions       INVARIANTS              #Enable calls of extra sanity checking
nooptions       INVARIANT_SUPPORT       #Extra sanity checks of internal structures, required by INVARIANTS
nooptions       WITNESS                 #Enable checks to detect deadlocks and cycles
nooptions       WITNESS_SKIPSPIN        #Don't run witness on spinlocks for speed
nooptions       MALLOC_DEBUG_MAXZONES   # Separate malloc(9) zones

(I'm not referring in this Email to the context that I sometimes use the
file content for 11.0-CURRENT. That would be another thing to test but I
have not tried to have my 11.0-CURRENT variant as /boot/kernel/ so far.
But "boot kernel11C" does work when /boot/kernel/ is based on
10.1-RELEASE-p4.)

$ more /etc/make.conf
WRKDIRPREFIX=/usr/obj/portswork
WITH_DEBUG=
#MALLOC_PRODUCTION=
$ more /etc/src.conf
#WITH_DEBUG_FILES=
#WITHOUT_CLANG=

But ~markmi/src_10_1_releng was built longer ago and had the #'s removed
in /etc/src.conf and no MALLOC_PRODUCTION= line in /etc/make.conf at
all.

(I'll note that I use WITHOUT_CLANG when I use WITH_DEBUG_FILES because
clang fails to fully build otherwise.)

$ more /boot/loader.conf
verbose_loading="YES"
kern.vty=3Dvt

===
Mark Millard
markmi at dsl-only.net




