Date:      Sun, 19 Oct 2014 00:43:11 -0700
From:      Mark Millard <markmi@dsl-only.net>
To:        Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        Justin Hibbits <chmeeedalf@gmail.com>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: My PowerMac G5's no longer crash at boot: PowerMac G5 specific ofwcall changes with justifying evidence [current workaround]
Message-ID:  <0EAE6493-FF6B-4F90-8C7B-F32A62DBD6B7@dsl-only.net>
In-Reply-To: <0CEC8978-E208-4F57-8481-DD9C321EF673@dsl-only.net>
References:  <76F704FD-BB74-4439-8318-DB4C167B420F@dsl-only.net>	<543B3828.8070806@freebsd.org>	<9D9B0372-8D8F-4153-85B5-40066206EF67@dsl-only.net>	<379AA7FC-98C9-48B9-92BB-60E134817AF1@dsl-only.net>	<C614025F-6455-4929-8468-462E76079274@dsl-only.net>	<A2AB9066-259B-4B7D-BDDC-D03AE5827E13@dsl-only.net> <CAHSQbTCKi_MBhERh6d=kX2y-=%2B2OzqpGM%2BN=ZEShi-kX2r8NPQ@mail.gmail.com> <543D5ACD.20901@freebsd.org> <3D4A76B3-431A-4C94-8747-70369A8A1764@dsl-only.net> <0F85ACBD-F6D6-4ABA-B8FA-00C586A086DE@dsl-only.net> <EE3EC252-DC9E-4F95-977C-FAF9F364CA92@dsl-only.net> <49920E63-CB4A-429C-AB3A-984075AE183D@dsl-only.net> <0CEC8978-E208-4F57-8481-DD9C321EF673@dsl-only.net>

Short of extracting and analyzing the openfirmware code and its behavior
directly, I've run out of ideas for investigating the %r1 and %r3
corruptions during openfirmware calls on the PowerMac G5s.

So my next investigative direction will probably be to hack %r1 and %r3
validation into the powerpc/GENERIC 32-bit ofwcall code and have it
report anything odd that it finds. It may take me a while to get to
this, and some more time to conclude that nothing is being found, if
nothing is found.



Given the known problems and the observed %r1 and %r3 corruptions, I
believe the FreeBSD ofwcall code for powerpc64 on PowerMacs would be
safer if ofwcall were changed to have the following properties (at least
on/for powerpc64 PowerMacs):

A) Check whether %r3 ends up as something other than 0 or -1 and, if so,
change it to -1 for what is returned overall. In other words: do not
presume that information returned in other ways (fields of the struct
pointed to by the argument) is okay unless the openfirmware status
returned in %r3 is exactly zero. Otherwise have ofwcall return the
openfirmware error indicator (-1).

[Do all openfirmware implementations use the one's complement Boolean
style return values (0 vs. -1) that PowerMac G5s seem to use? If not,
the check above would not be very general.]

B) Similarly, check whether %r1 had a net change (a corruption). If so,
restore the known/recorded before-value and set %r3 to -1, so that the
code calling ofwcall gets a failure status.

C) Possibly make one automatic retry of the openfirmware call if an (A)
or (B) type problem happens, before returning such a failure (-1). Set
up %r1 and %r3 again first if a retry is attempted. Handle a retry
failure as in (A) and (B) above.

[This comes from my investigation finding only one-time failures in the
sequence of ofwcalls: after a failure, later calls in the same boot
sequence, up until shutdown, worked without observed corruptions of %r1
or %r3.]

D) As paranoia for now: have a general bias against depending on most
registers being preserved across the openfirmware call, since bad
register values are part of the observed problem. Probably be biased to
mostly use the registers that ofwcall already explicitly saves and
restores (non-volatile registers that openfirmware should also
explicitly save and restore), but use separate storage to save and then
recover values across any calls into openfirmware.
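
As a rough illustration of (A) through (C), the policy can be modeled in
C as follows. This is only a sketch under stated assumptions, not the
assembly in the patch: struct fw_result, fw_entry_t, ofw_call_checked and
the two simulated firmware functions are hypothetical names invented for
the example, and the firmware is modeled as a plain function that hands
back its %r3 and %r1 values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the register state after one firmware call. */
struct fw_result {
	int32_t  r3;	/* status as returned by the firmware */
	uint32_t r1;	/* stack pointer as returned by the firmware */
};

/* Hypothetical firmware entry: takes the argument block and %r1. */
typedef struct fw_result (*fw_entry_t)(void *args, uint32_t r1);

/* (A) and (B): %r1 must be unchanged and %r3 must be exactly 0 or -1. */
bool
fw_result_ok(struct fw_result res, uint32_t r1_before)
{
	return (res.r1 == r1_before && (res.r3 == 0 || res.r3 == -1));
}

/* (C): one automatic retry, then report the firmware error value. */
int32_t
ofw_call_checked(fw_entry_t entry, void *args, uint32_t r1_before)
{
	for (int attempt = 0; attempt < 2; attempt++) {
		/* Each attempt starts from the recorded %r1 and arguments. */
		struct fw_result res = entry(args, r1_before);

		if (fw_result_ok(res, r1_before))
			return (res.r3);
	}
	return (-1);
}

/* Simulated firmware that corrupts %r1 and %r3 on its first call only. */
struct fw_result
fw_glitch_once(void *args, uint32_t r1)
{
	int *calls = args;
	struct fw_result res = { 0, r1 };

	if ((*calls)++ == 0) {
		res.r3 = 0x1234;
		res.r1 = r1 + 16;
	}
	return (res);
}

/* Simulated firmware that returns a corrupt %r3 on every call. */
struct fw_result
fw_always_bad(void *args, uint32_t r1)
{
	int *calls = args;
	struct fw_result res = { 0x1234, r1 };

	(*calls)++;
	return (res);
}

/* Usage: a one-time glitch is absorbed, a persistent one becomes -1. */
void
ofw_call_checked_demo(void)
{
	int calls = 0;

	assert(ofw_call_checked(fw_glitch_once, &calls, 0x1000u) == 0);
	calls = 0;
	assert(ofw_call_checked(fw_always_bad, &calls, 0x1000u) == -1);
}
```

Note that a clean -1 from the firmware is passed through without a retry:
only a changed %r1 or an %r3 that is neither 0 nor -1 counts as
corruption in this model.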


However, such changes would mean that such PowerMac builds would not be
generic FreeBSD code unless the changes were tolerable for the other
powerpc64 contexts that use ofwcall from ofwcall64.S.



My code for this below certainly qualifies as a personal hack based on
information specific to PowerMac G5s. I have also left in place the
early restore of the FreeBSD sprg0 value, which gave the original
exception a proper value to use during my investigations. (Those
specific exceptions should no longer be possible with my code.) Because
I left this paranoia item in place, ofw_sprg0_save is accessible to and
used from both ofw_machdep.c and ofwcall64.S.

I also have DDB/GDB option additions in GENERIC64 and ddb hacks such
that early crashes tend to run "bt; show registers" before hanging.
(There is also the PS3 disable and the addition of sc.)

My context is still 10.1-RC1 based. /etc/make.conf has
WITH_DEBUG_FILES= , WITHOUT_CLANG= , WITH_DEBUG= , and WORKDIRPREFIX
assigned. I tend to have verbose_loading="YES" in /boot/loader.conf .
kern.vty depends on which video hardware is involved. Panic dumps are
effectively disabled because the dump attempts larger DMA transfers than
are actually supported: that size relationship ends up being reported
instead.

root@FBSDG5M1:/usr/home/markmi # svnlite diff /usr/src/sys/
Index: /usr/src/sys/ddb/db_main.c
===================================================================
--- /usr/src/sys/ddb/db_main.c	(revision 272558)
+++ /usr/src/sys/ddb/db_main.c	(working copy)
@@ -46,6 +46,9 @@
 #include <ddb/db_command.h>
 #include <ddb/db_sym.h>
 
+/* HACK: part of dealing with lack of input for early boot time */
+#include <ddb/db_output.h>
+
 SYSCTL_NODE(_debug, OID_AUTO, ddb, CTLFLAG_RW, 0, "DDB settings");
 
 static dbbe_init_f db_init;
@@ -210,6 +213,9 @@
 	watchpt = IS_WATCHPOINT_TRAP(type, code);
 
 	if (db_stop_at_pc(&bkpt)) {
+		/* HACK: part of early boot handling: no input possible */
+		db_disable_pager();
+
 		if (db_inst_count) {
 			db_printf("After %d instructions (%d loads, %d stores),\n",
 			    db_inst_count, db_load_count, db_store_count);
Index: /usr/src/sys/ddb/db_script.c
===================================================================
--- /usr/src/sys/ddb/db_script.c	(revision 272558)
+++ /usr/src/sys/ddb/db_script.c	(working copy)
@@ -319,10 +319,25 @@
 {
 	char scriptname[DB_MAXSCRIPTNAME];
 
+	/* HACK!!! : Additional lines to force a basic default script to exist.
+	 * Will dump information even if ddb input is not available for early crash.
+	 * Used to get more information about PowerMac G5 "before Copyright" hangs.
+	 */
+	struct ddb_script *dsp = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT);
+	if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "bt; show registers");
+
 	snprintf(scriptname, sizeof(scriptname), "%s.%s",
 	    DB_SCRIPT_KDBENTER_PREFIX, eventname);
 	if (db_script_exec(scriptname, 0) == ENOENT)
 		(void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
+
+	/* HACK!!! : Additional lines to always use the default script,
+	 *           even if scriptname existed and was executed.
+	 * Will dump information even if ddb input is not available for early crash.
+	 * Used to get more information about PowerMac G5 "before Copyright" hangs.
+	 */
+	else
+		(void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
 }
 
 /*-
Index: /usr/src/sys/powerpc/conf/GENERIC64
===================================================================
--- /usr/src/sys/powerpc/conf/GENERIC64	(revision 272558)
+++ /usr/src/sys/powerpc/conf/GENERIC64	(working copy)
@@ -28,7 +28,7 @@
 
 # Platform support
 options 	POWERMAC		#NewWorld Apple PowerMacs
-options 	PS3			#Sony Playstation 3
+#options 	PS3			#Sony Playstation 3               HACK!!! to allow sc
 options 	MAMBO			#IBM Mambo Full System Simulator
 options 	PSERIES			#PAPR-compliant systems (e.g. IBM p)
 
@@ -76,6 +76,12 @@
 # Debugging support.  Always need this:
 options 	KDB			# Enable kernel debugger support.
 options 	KDB_TRACE		# Print a stack trace for a panic.
+options 	DDB			# HACK!!! to dump early crash info
+options 	GDB			# HACK!!! ...
+#options 	KTR
+#options 	KTR_MASK=KTR_TRAP
+#options 	KTR_CPUMASK=0xF
+#options 	KTR_VERBOSE
 
 # Make an SMP-capable kernel by default
 options 	SMP			# Symmetric MultiProcessor Kernel
@@ -115,6 +121,14 @@
 device		vt		# Core console driver
 device		kbdmux
 
+# HACK!!! to allow sc for 2560x1440 display on Radeon X1950 that vt mishandled
+# syscons is a console driver, resembling an SCO console
+device          sc
+#device          kbdmux		# HACK: already listed by vt
+options         SC_OFWFB	# OFW frame buffer
+options         SC_DFLT_FONT	# compile font in
+makeoptions     SC_DFLT_FONT=cp437
+
 # Serial (COM) ports
 device		scc
 device		uart
Index: /usr/src/sys/powerpc/ofw/ofw_machdep.c
===================================================================
--- /usr/src/sys/powerpc/ofw/ofw_machdep.c	(revision 272558)
+++ /usr/src/sys/powerpc/ofw/ofw_machdep.c	(working copy)
@@ -94,6 +94,11 @@
 /*
  * Saved SPRG0-3 from OpenFirmware. Will be restored prior to the callback.
  */
+/* HACK: ofw_sprg0_save storage defined in ofwcall
+ *	 for use in very early FreeBSD sprg0 restore
+ *	 as part of ready-for-possible-exception paranoia.
+ */
+extern
 register_t	ofw_sprg0_save;
 
 static __inline void
Index: /usr/src/sys/powerpc/ofw/ofwcall64.S
===================================================================
--- /usr/src/sys/powerpc/ofw/ofwcall64.S	(revision 272558)
+++ /usr/src/sys/powerpc/ofw/ofwcall64.S	(working copy)
@@ -52,6 +52,20 @@
 GLOBAL(rtas_entry)
 	.llong	0			/* RTAS entry point */
 
+ /* HACK: part of dealing with openfirmware %r1, %r3 corruptions */
+ofw_entry_addr:				/* accessed under ofw msr */
+	.space	4
+ofw_r1_for_retry:			/* accessed under ofw msr */
+	.space	4
+ofw_r3_for_retry:			/* accessed under ofw msr */
+	.space	4
+
+ /* HACK: part of having FreeBSD sprg0 in place for potential exceptions */
+ofwsprg0save:				/* accessed under ofw msr */
+	.space	8 /* sizeof(register_t) */
+GLOBAL(ofw_sprg0_save)			/* accessed under FreeBSD msr */
+	.llong	0
+
 /*
  * Open Firmware Real-mode Entry Point. This is a huge pain.
  */
@@ -90,50 +104,121 @@
 	std	%r30,192(%r1)
 	std	%r31,200(%r1)
 
+	/* HACK: Avoid depending much on preserved registers
+	 *       and be biased to use the ones saved above
+	 */
+
 	/* Record the old MSR */
-	mfmsr	%r6
+	mfmsr	%r14
 
 	/* read client interface handler */
-	lis	%r4,openfirmware_entry@ha
-	ld	%r4,openfirmware_entry@l(%r4)
+	lis	%r15,openfirmware_entry@ha
+	ld	%r15,openfirmware_entry@l(%r15)
 
+	/* HACK: part of having FreeBSD's sprg0 in place for exceptions.
+	 *       Paranoia code at this point since corrupted %r1 values are
+	 *       avoided by forcing the before-openfirmware value.
+	 */
+	lis	%r16,ofw_sprg0_save@ha
+	ld	%r16,ofw_sprg0_save@l(%r16)
+
 	/*
 	 * Set the MSR to the OF value. This has the side effect of disabling
 	 * exceptions, which is important for the next few steps.
+	 * NOTE: The call chain may well have already disabled such in FreeBSD's
+	 *       msr.
 	 */
 
-	lis	%r5,ofmsr@ha
-	ld	%r5,ofmsr@l(%r5)
-	mtmsrd	%r5
+	lis	%r17,ofmsr@ha
+	ld	%r17,ofmsr@l(%r17)
+	mtmsrd	%r17
 	isync
 
 	/*
 	 * Set up OF stack. This needs to be accessible in real mode and
 	 * use the 32-bit ABI stack frame format. The pointer to the current
-	 * kernel stack is placed at the very top of the stack along with
-	 * the old MSR so we can get them back later.
+	 * kernel stack is placed below the effective ofw-stack along with the
+	 * active FreeBSD TOC and FreeBSD MSR so we can get them back later.
 	 */
-	mr	%r5,%r1
+	mr	%r18,%r1
 	lis	%r1,(ofwstk+OFWSTKSZ-32)@ha
 	addi	%r1,%r1,(ofwstk+OFWSTKSZ-32)@l
-	std	%r5,8(%r1)	/* Save real stack pointer */
-	std	%r2,16(%r1)	/* Save old TOC */
-	std	%r6,24(%r1)	/* Save old MSR */
-	li	%r5,0
-	stw	%r5,4(%r1)
-	stw	%r5,0(%r1)
+	std	%r18,8(%r1)	/* Save FreeBSD stack pointer */
+	std	%r2,16(%r1)	/* Save FreeBSD TOC */
+	std	%r14,24(%r1)	/* Save FreeBSD MSR */
+	li	%r19,0
+	stw	%r19,4(%r1)
+	stw	%r19,0(%r1)
 
+	/* HACK: Avoid depending much on preserved registers */
+
+	/* HACK: recording openfirmware entry address for use in possible retry */
+	lis	%r20,ofw_entry_addr@ha
+	stw	%r15,ofw_entry_addr@l(%r20)
+
+	/* HACK: recording %r1 before openfirmware for use in possible retry
+	 *       and also for testing for corruption (net-change)
+	 */
+	lis	%r21,ofw_r1_for_retry@ha
+	stw	%r1,ofw_r1_for_retry@l(%r21)
+
+	/* HACK: recording %r3 before openfirmware for use in possible retry */
+	lis	%r22,ofw_r3_for_retry@ha
+	stw	%r3,ofw_r3_for_retry@l(%r22)
+
+	/* HACK: part of having FreeBSD's sprg0 in place for exceptions.
+	 *       Paranoia code at this point since corrupted %r1 values are
+	 *       avoided by forcing the before-openfirmware value.
+	 */
+	lis	%r23,ofwsprg0save@ha
+	std	%r16,ofwsprg0save@l(%r23)
+
 	/* Finally, branch to OF */
-	mtctr	%r4
+	mtctr	%r15
 	bctrl
 
-	/* Reload stack pointer and MSR from the OFW stack */
-	ld	%r6,24(%r1)
+	/* HACK: check if %r1 was corrupted (had a net-change) */
+	lis	%r21,ofw_r1_for_retry@ha
+	lwz	%r24,ofw_r1_for_retry@l(%r21)
+	cmpw	%r24,%r1
+	bne	2f /* stack pointer corrupted so go retry once */
+
+	/* HACK: %r1 okay but check %r3 for being 0 or -1 vs. anything else */
+	xoris	%r25,%r3,0
+	cmpw	%r25,%r3
+	bne	2f /* %r3 was neither 0 nor -1 so corruption: go retry once */
+
+	/* HACK: here both %r1 and %r3 appeared to be okay:
+	 *       so sequential flow was for "no problems"
+	 */
+
+1:	/* HACK status: continue/return from whatever status,
+	 * trying to get back cleanly to the FreeBSD context
+	 */
+
+	/* HACK: part of having FreeBSD's sprg0 in place for any exception
+	 *       during return.
+	 *       Paranoia code at this point since corrupted %r1 values are
+	 *       avoided by forcing the before-openfirmware value.
+	 * NOTE: Calling code also deals with this but too late for the
+	 *       original exceptions after openfirmware returned to this code.
+	 */
+	lis	%r23,ofwsprg0save@ha
+	ld	%r16,ofwsprg0save@l(%r23)
+	mtsprg0	%r16
+
+	/* Reload FreeBSD stack pointer and MSR
+	 * from the bottom of the (i.e., below the effective) OFW stack
+	 *
+	 * HACK note: %r1 may have been forced to the before-openfirmware value
+	 *            (to avoid garbage results and the resulting exceptions)
+	 */
+	ld	%r26,24(%r1)
 	ld	%r2,16(%r1)
 	ld	%r1,8(%r1)
 
-	/* Now set the real MSR */
-	mtmsrd	%r6
+	/* Now set the FreeBSD MSR */
+	mtmsrd	%r26
 	isync
 
 	/* Sign-extend the return value from OF */
@@ -168,6 +253,43 @@
 	mtlr 	%r0
 	blr
 
+/* HACK: code for %r1 and/or %r3 corruption's single-retry */
+/*       Still under openfirmware's msr, sprg0, stack values */
+
+2:	/* HACK: corruption observed so retry, restoring %r1 and %r3 first */
+	lis	%r20,ofw_entry_addr@ha
+	lwz	%r15,ofw_entry_addr@l(%r20)
+	lis	%r21,ofw_r1_for_retry@ha
+	lwz	%r1,ofw_r1_for_retry@l(%r21)
+	lis	%r22,ofw_r3_for_retry@ha
+	lwz	%r3,ofw_r3_for_retry@l(%r22)
+	mtctr	%r15
+	bctrl
+
+	/* HACK: check if %r1 was corrupted (had a net-change) */
+	lis	%r21,ofw_r1_for_retry@ha
+	lwz	%r24,ofw_r1_for_retry@l(%r21)
+	cmpw	%r24,%r1
+	bne	3f /* retry corrupted %r1
+		    * so go give up with %r3 being -1 and %r1 forced-good
+		    */
+
+	/* HACK: %r1 okay but check %r3 for being 0 or -1 vs. anything else */
+	xoris	%r25,%r3,0
+	cmpw	%r25,%r3
+	beq	1b /* %r3 also was 0 or -1 so no corruption observed on retry
+		    * so go do a normal return
+		    */
+
+3:	/* Either %r1 had a net change after retry
+	 * or %r3 was not one of 0,-1 after retry
+	 * so force %r1 and have %r3 be -1 then go return
+	 */
+	lis	%r21,ofw_r1_for_retry@ha
+	lwz	%r1,ofw_r1_for_retry@l(%r21)
+	li	%r3,-1 /* the openfirmware failure return value */
+	b	1b
+
 /*
  * RTAS 32-bit Entry Point. Similar to the OF one, but simpler (no separate
  * stack)
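
A note on the %r3 test in the ofwcall64.S hunks above: in the PowerPC
instruction set, xoris XORs the source register with the 16-bit
immediate shifted left by 16 bits, so with an immediate of 0 it copies
%r3 unchanged into %r25, and the cmpw that follows compares equal for
any %r3 value. A test that does separate 0 and -1 from everything else
can be sketched in C as below. The function names are hypothetical and
the code only models the low 32 bits; it illustrates the arithmetic and
is not a tested replacement for the assembly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the PowerPC xoris instruction on the low 32 bits. */
uint32_t
xoris_model(uint32_t rs, uint16_t ui)
{
	return (rs ^ ((uint32_t)ui << 16));
}

/* True only for 0 and 0xffffffff (-1): adding 1 wraps -1 around to 0. */
bool
r3_is_zero_or_minus_one(uint32_t r3)
{
	return ((uint32_t)(r3 + 1u) <= 1u);
}
```

In assembly the same idea would presumably be an add of 1 followed by an
unsigned 32-bit compare against 1, branching to the retry code when the
result is greater.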





===
Mark Millard
markmi at dsl-only.net



