From owner-freebsd-ppc@FreeBSD.ORG Mon Sep 22 23:01:15 2014
From: Mark Millard
Date: Mon, 22 Sep 2014 16:01:04 -0700
To: FreeBSD PowerPC ML
Cc: Justin Hibbits
Subject: 10.1-BETA2 "as distributed": PowerMac G4 (yes: 4) no boot hang with GeForce4 Ti 4600, so no hang without options DDB/GDB?

I tried installing 10.1-BETA2 directly from the 10.1-BETA2 MANIFEST and *.txz files (via bsdinstall) to an SSD. The result booted the GeForce4 Ti 4600 PowerMac G4 just fine.

It appears that the additions of "options DDB" and "options GDB" to GENERIC (and then doing buildworld kernel ...) may be what led to the boot-time hang for the GeForce4 Ti 4600 video board based PowerMac G4. The alternative is that WITH_DEBUG_FILES=, WITHOUT_CLANG=, WITH_DEBUG= contributes, or that verbose_loading="YES" in /boot/loader.conf does. Or just building from source from a powerpc/GENERIC context (on a G5 PowerMac) leads to the issue.

At this point I've no clue why the GeForce4 Ti 4600 combined with any of those alternatives leads to the hang. I may never know since the hang is silent and rather early in the boot sequence. The Radeon-based PowerMac G4's did not hang. Nor did the GeForce 7800 GT PowerMac G5's.

===
Mark Millard
markmi at dsl-only.net

On Sep 17, 2014, at 11:01 PM, Mark Millard wrote:

For FreeBSD 10.1-BETA1 the 1.4GHz Dual-processor PowerMac G4 (yep: 4) with a NVIDIA GeForce4 Ti 4600 in it always hangs after:

> GDB: no debug ports present
> KDB: debugger backends: DDB
> KDB: current backend: DDB

without writing out the Copyright notice or anything else.

The same SSD boots an ATI Radeon 9000/PRO If (AGP/PCI) PowerMac G4 of the same PowerMac model just fine.
(I give uname -a and build details later.) The same SSD also boots a PowerMac G5 with a GeForce 7800 GT video board just fine. Other than the SSD, video board, and memory, there is only basic stock equipment in the PowerMacs. The two G4's match for those details as well: only the video boards are different models and the amounts of RAM match.

And my older 10.1-PRERELEASE #0 r271215 boot SSD still boots the GeForce4 Ti 4600 PowerMac G4 just fine.

While the place for the boot-hang is suggestive, I've nothing beyond that indicating any relationship to the random PowerMac G5 boot problem at the same place in the display sequence.

And on the G4 DDB is not reporting anything, unlike on the G5 when it hangs there.

Context for failed boots off the GeForce4 Ti 4600 PowerMac G4 (I've not tried much variation from this so no claims of essential status for any of it):

> FreeBSD FBSDG4S1 10.1-BETA1 FreeBSD 10.1-BETA1 #1 r271610M: Wed Sep 17 21:47:20 PDT 2014 root@FBSDG4S1:/usr/obj/usr/src/sys/GENERIC powerpc

The "M" status of r271610M (now that I "make -j 8 buildworld kernel" based on svn materials in /usr/src/) is from the GENERIC modification in the modifications listed below.

The non-default things are...

A) Adding to /usr/src/sys/powerpc/conf/GENERIC: "options DDB" and "options GDB".

B) Having /etc/make.conf use WITH_DEBUG_FILES=, WITHOUT_CLANG=, WITH_DEBUG=, as well as having a WRKDIRPREFIX=(path not listed here).

C) Having /boot/loader.conf with just: verbose_loading="YES".

Context for working boots of the same GeForce4 Ti 4600 PowerMac G4:

> FreeBSD FBSDG4S0 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #0 r271215: Sat Sep 6 23:56:15 PDT 2014 root@FBSDG4S0:/usr/obj/usr/src/sys/GENERIC powerpc

/usr/src/sys/powerpc/conf/GENERIC was not modified at all. (Thus no M suffix on r271215.)

But /etc/make.conf uses WITH_DEBUG_FILES=, WITHOUT_CLANG=, WITH_DEBUG=, as well as having a WRKDIRPREFIX=(path not listed here).
/boot/loader.conf empty.

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-ppc@FreeBSD.ORG Tue Sep 23 00:18:28 2014
From: Mark Millard
Subject: powerpc64/GENERIC64 use of dcbst vs. dcbf: is the dcbst use really okay? Anyone know?
Message-Id: <19413BD4-88D5-4897-B50C-48C47F5E2ACA@dsl-only.net>
Date: Mon, 22 Sep 2014 17:18:21 -0700
To: FreeBSD PowerPC ML, Nathan Whitehorn, Justin Hibbits

Anyone know why the following is true in FreeBSD (10.1-BETA2, for example) for kernel vs. openfirmware transitions (in both directions) for powerpc64/GENERIC64? (And some other places are noted.) The issue is dcbst vs. dcbf instruction usage.

(I later quote from pem_64_bit_v3.0.2005jul2005.pdf (from IBM).)

Some context first...

Apple's published BootX-81 always saves and restores the Exception Vectors when going between openfirmware and the kernel: it maintains separate vectors for the two contexts. In addition it carefully uses dcbf and icbi whether it copies to that area at address 0 or to a save area. And that is followed by isync. (And more, sync and eieio: Apple seems paranoid.)

Apple used dcbf instead of dcbst.

IBM writes of dcbst vs. dcbf:

> Instruction caches, if they exist, are not required to be consistent with data caches, memory, or I/O data transfers. Software must use the appropriate cache management instructions to ensure that instruction caches are kept coherent when instructions are modified by the processor or by input data transfer. When a processor alters a memory location that may be contained in an instruction cache, software must ensure that updates to memory are visible to the instruction fetching mechanism. Although the instructions to enforce consistency vary among implementations, the following sequence for a uniprocessor system is typical:
> 1. dcbst (update memory)
> 2. sync (wait for update)
> 3. icbi (invalidate copy in instruction cache)
> 4. isync (perform context synchronization)
> Note: Most operating systems will provide a system service for this function. These operations are necessary because the memory may be designated as write-back. Since instruction fetching may bypass the data cache, changes made to items in the data cache may not otherwise be reflected in memory until after the instruction fetch completes.
> For implementations used in multiprocessor systems, variations on this sequence may be recommended. For example, in a multiprocessor system with a unified instruction/data cache (at any level), if instructions are fetched without coherency being enforced, the preceding instruction sequence is inadequate. Because the icbi instruction does not invalidate blocks in a unified cache, a dcbf instruction should be used instead of a dcbst instruction for this case.

Then the point given that background information...

FreeBSD's powerpc64/GENERIC64 seems to have a mix of dcbst and dcbf use. The following have dcbst (unless patched separately at run time):

000000000086c1e8 <.agp_apple_unbind_page+0x60> dcbst r0,r0
000000000086c27c <.agp_apple_bind_page+0x64> dcbst r0,r0
00000000008b1b78 <.elf_reloc_internal+0x12c> dcbst r0,r30
00000000008bcd30 <.__syncicache+0x38> dcbst r0,r0

That last is used during the openfirmware vs. kernel transitions. The above are from "objdump -d --prefix-address /boot/kernel/kernel".

Is the dcbst use risky because of any unified caches at any level on any of the processors that powerpc64/GENERIC64 is supposed to handle?
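
For reference, the IBM sequence quoted above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not FreeBSD's __syncicache: the name sync_icache_sketch, the use_dcbf switch, and the caller-supplied line_size are all hypothetical, and a real cache-block size has to come from the CPU in question. On non-PowerPC hosts the cache instructions are compiled out, so there the sketch only counts blocks.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch: make stores to [addr, addr + len) visible to instruction fetch.
 * line_size is an assumed cache-block size (must be a power of two).
 * use_dcbf != 0 selects dcbf (write back and invalidate) instead of
 * dcbst (write back only).  Returns the number of blocks covered,
 * or 0 for an empty range or an unusable line_size.
 */
size_t
sync_icache_sketch(void *addr, size_t len, size_t line_size, int use_dcbf)
{
	uintptr_t start, end, p;
	size_t blocks = 0;

	if (len == 0 || line_size == 0 || (line_size & (line_size - 1)) != 0)
		return (0);
	start = (uintptr_t)addr & ~(uintptr_t)(line_size - 1);
	end = (uintptr_t)addr + len;

	/* Step 1: push each modified data-cache block out to memory. */
	for (p = start; p < end; p += line_size) {
#if defined(__powerpc__) || defined(__powerpc64__)
		if (use_dcbf)
			__asm__ volatile("dcbf 0,%0" : : "r"(p) : "memory");
		else
			__asm__ volatile("dcbst 0,%0" : : "r"(p) : "memory");
#else
		(void)use_dcbf;		/* no cache instructions on this host */
#endif
		blocks++;
	}

#if defined(__powerpc__) || defined(__powerpc64__)
	/* Step 2: wait for the updates to reach memory. */
	__asm__ volatile("sync" : : : "memory");
	/* Step 3: drop any stale copies from the instruction cache. */
	for (p = start; p < end; p += line_size)
		__asm__ volatile("icbi 0,%0" : : "r"(p) : "memory");
	/* Step 4: context synchronization. */
	__asm__ volatile("isync" : : : "memory");
#endif
	return (blocks);
}
```

Calling it with use_dcbf set to 1 gives the variant that the IBM text recommends when a unified cache may be fetched from without coherency being enforced.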

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-ppc@FreeBSD.ORG Tue Sep 23 02:00:11 2014
From: Mark Millard
Subject: powerpc64/GENERIC64: a mtmsrd without a "context synchronizing instruction" (immediately?) following...
Message-Id: <5A754BA9-544A-408F-B45C-691627DCA4ED@dsl-only.net>
Date: Mon, 22 Sep 2014 19:00:07 -0700
To: FreeBSD PowerPC ML

Context: 10.1-BETA2 powerpc64/GENERIC64 (with option DDB and option GDB).

(I later quote from pem_64_bit_v3.0.2005jul2005.pdf (from IBM).)

IBM writes of mtmsr/mtmsrd:

> For software that will run on processors that comply with earlier versions of the architecture, a context synchronizing instruction is required after the mtmsr[d] instruction.

That sort of principle does not seem to be followed by one example in powerpc64/GENERIC64:

0000000000102168 <.__start+0x78> rldimi r9,r8,63,0
000000000010216c <.__start+0x7c> mtmsrd r9
0000000000102170 <.__start+0x80> bl 0000000000101120
0000000000102174 <.__start+0x84> ld r2,40(r1)
0000000000102178 <.__start+0x88> lis r3,16
000000000010217c <.__start+0x8c> addi r3,r3,0
...

The other mtmsr's/mtmsrd's that I found had one or two isync's following, providing the context synchronizing instruction.

IBM also reports:

> Processors designed prior to Version 2.01 of the architecture ignore the L field. These processors set the MSR as if L were ‘0’, and perform synchronization as if L were ‘1’. Therefore software that uses mtmsrd and runs on such processors must obey the following rules.
>
> If L = ‘1’, the contents of bits of register rS other than bits [48] and [62] must be such that if L were ‘0’ the instruction would not alter the contents of the corresponding MSR bits.
> If L = ‘0’ and the instruction alters the contents of any of the MSR bits listed below, the instruction must be followed by a context synchronizing instruction or event in order to ensure that the context alteration caused by the mtmsrd instruction has taken effect on such processors.
> To obtain the best performance on processors, if the context synchronizing instruction is isync the isync should immediately follow the mtmsrd. (Some such processors treat an isync instruction that immediately follows an mtmsrd instruction having L = ‘0’ as a no-op, thereby avoiding the performance penalty of a second context synchronization.)

Another interesting IBM note for mtmsr (not mtmsrd), but effectively just a side note here:

> The mtmsr instruction, which is otherwise illegal in the 64-bit architecture may optionally be implemented in 64-bit bridge implementations.

FreeBSD powerpc64/GENERIC64 seems to use mtmsr fairly freely. (k_trap, trapexit, asttrapexit, .breakpoint, dbtrap, dbleave, ichss_set, prof_clock_cnt, hardclock_cpu, kdb_trap, powerpc_interrupt, flush_disable_caches, spinlock_exit, spinlock_enter, powerpc_init, cpu_sleep, moea64_add_ofw_mappings, moea64_late_bootstrap, moea64_mid_bootstrap, moea64_cpu_bootstrap_native, moea64_bootstrap_native, write_scom, read_scom, pcr_set, openfirmware_core, save_vec, enable_vec, configure_final, cpu_est_clockrate, cpu_idle_60x, save_fpu, enable_fpu, mps3_cpu_bootstrap. Apple also used mtmsr (not mtmsrd) in the openfirmware vs. kernel transitions in the published BootX-81 source code.)
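
The by-eye check described above (is each mtmsr/mtmsrd immediately followed by an isync?) could be sketched as a small C scanner over lines of "objdump -d --prefix-address" output. This is a hedged illustration, not a tool from the FreeBSD tree: the function names are hypothetical, and it assumes the mnemonic is the first token after the first '>' on each line. It is also stricter than the architecture text, which accepts any context synchronizing instruction or event, so anything it counts is only a candidate for a closer look.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Locate the mnemonic after the "<symbol+offset>" field; return its length. */
static size_t
mnemonic_of(const char *line, const char **start)
{
	const char *p = strchr(line, '>');
	size_t len = 0;

	if (p == NULL) {
		*start = line;
		return (0);
	}
	p++;
	while (*p != '\0' && isspace((unsigned char)*p))
		p++;
	*start = p;
	while (p[len] != '\0' && !isspace((unsigned char)p[len]))
		len++;
	return (len);
}

/* True when the line's mnemonic is exactly `name` (mtmsr != mtmsrd). */
static int
mnemonic_is(const char *line, const char *name)
{
	const char *m;
	size_t len = mnemonic_of(line, &m);

	return (len == strlen(name) && strncmp(m, name, len) == 0);
}

/*
 * Count mtmsr/mtmsrd lines whose next line is not an isync.
 * An mtmsr[d] on the last line counts, since nothing follows it.
 */
size_t
count_unsynced_mtmsr(const char *const *lines, size_t n)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++) {
		if (!mnemonic_is(lines[i], "mtmsr") &&
		    !mnemonic_is(lines[i], "mtmsrd"))
			continue;
		if (i + 1 >= n || !mnemonic_is(lines[i + 1], "isync"))
			hits++;
	}
	return (hits);
}
```

Fed the .__start lines listed earlier, the mtmsrd at +0x7c is counted because the next instruction is a bl rather than an isync.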

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-ppc@FreeBSD.ORG Tue Sep 23 04:15:54 2014
Message-ID: <5420F3F2.4010202@freebsd.org>
Date: Mon, 22 Sep 2014 21:15:46 -0700
From: Nathan Whitehorn
To: freebsd-ppc@freebsd.org
Subject: Re: powerpc64/GENERIC64: a mtmsrd without a "context synchronizing instruction" (immediately?) following...
References: <5A754BA9-544A-408F-B45C-691627DCA4ED@dsl-only.net>
In-Reply-To: <5A754BA9-544A-408F-B45C-691627DCA4ED@dsl-only.net>

On 09/22/14 19:00, Mark Millard wrote:
> Context: 10.1-BETA2 powerpc64/GENERIC64 (with option DDB and option GDB).
>
> (I later quote from pem_64_bit_v3.0.2005jul2005.pdf (from IBM).)
>
> IBM writes of mtmsr/mtmsrd:
>
>> For software that will run on processors that comply with earlier versions of the architecture, a context synchronizing instruction is required after the mtmsr[d] instruction.
>
> That sort of principle does not seem to be followed by one example in powerpc64/GENERIC64:
>
> 0000000000102168 <.__start+0x78> rldimi r9,r8,63,0
> 000000000010216c <.__start+0x7c> mtmsrd r9
> 0000000000102170 <.__start+0x80> bl 0000000000101120
> 0000000000102174 <.__start+0x84> ld r2,40(r1)
> 0000000000102178 <.__start+0x88> lis r3,16
> 000000000010217c <.__start+0x8c> addi r3,r3,0
> ...
>
> The other mtmsr's/mtmsrd's that I found had one or two isync's following, providing the context synchronizing instruction.

This one was missing. Nice catch! This is among the first instructions the CPU executes, so it cannot be the cause of any hangs that happen after anything has been displayed on the screen.

> IBM also reports:
>
>> Processors designed prior to Version 2.01 of the architecture ignore the L field. These processors set the MSR as if L were ‘0’, and perform synchronization as if L were ‘1’. Therefore software that uses mtmsrd and runs on such processors must obey the following rules.
>>
>> If L = ‘1’, the contents of bits of register rS other than bits [48] and [62] must be such that if L were ‘0’ the instruction would not alter the contents of the corresponding MSR bits.
>> If L = ‘0’ and the instruction alters the contents of any of the MSR bits listed below, the instruction must be followed by a context synchronizing instruction or event in order to ensure that the context alteration caused by the mtmsrd instruction has taken effect on such processors.
>> To obtain the best performance on processors, if the context synchronizing instruction is isync the isync should immediately follow the mtmsrd. (Some such processors treat an isync instruction that immediately follows an mtmsrd instruction having L = ‘0’ as a no-op, thereby avoiding the performance penalty of a second context synchronization.)
>
> Another interesting IBM note for mtmsr (not mtmsrd), but effectively just a side note here:
>
>> The mtmsr instruction, which is otherwise illegal in the 64-bit architecture may optionally be implemented in 64-bit bridge implementations.
>
> FreeBSD powerpc64/GENERIC64 seems to use mtmsr fairly freely. (k_trap, trapexit, asttrapexit, .breakpoint, dbtrap, dbleave, ichss_set, prof_clock_cnt, hardclock_cpu, kdb_trap, powerpc_interrupt, flush_disable_caches, spinlock_exit, spinlock_enter, powerpc_init, cpu_sleep, moea64_add_ofw_mappings, moea64_late_bootstrap, moea64_mid_bootstrap, moea64_cpu_bootstrap_native, moea64_bootstrap_native, write_scom, read_scom, pcr_set, openfirmware_core, save_vec, enable_vec, configure_final, cpu_est_clockrate, cpu_idle_60x, save_fpu, enable_fpu, mps3_cpu_bootstrap. Apple also used mtmsr (not mtmsrd) in the openfirmware vs. kernel transitions in the published BootX-81 source code.)

I think you are looking at very old documentation. The 32-bit mtmsr is implemented on all POWER ISA compliant CPUs (see e.g. page 886 of the 2.07 document).
-Nathan

From owner-freebsd-ppc@FreeBSD.ORG Tue Sep 23 04:55:56 2014
Message-ID: <5420FD56.2020003@freebsd.org>
Date: Mon, 22 Sep 2014 21:55:50 -0700
From: Nathan Whitehorn
To: Mark Millard, FreeBSD PowerPC ML, Justin Hibbits
Subject: Re: powerpc64/GENERIC64 use of dcbst vs. dcbf: is the dcbst use really okay? Anyone know?

On 09/22/14 17:18, Mark Millard wrote:
> Anyone know why the following is true in FreeBSD (10.1-BETA2, for example) for kernel vs. openfirmware transitions (in both directions) for powerpc64/GENERIC64? (And some other places are noted.) The issue is dcbst vs. dcbf instruction usage.
>
> (I later quote from pem_64_bit_v3.0.2005jul2005.pdf (from IBM).)

Yes, a mix is used. It's done deliberately. dcbf implies dcbst plus an invalidation of the cache block, but there is in general no need for an invalidation where dcbst is used in the kernel.

> Some context first...
>
> Apple's published BootX-81 always saves and restores the Exception Vectors when going between openfirmware and the kernel: it maintains separate vectors for the two contexts. In addition it carefully uses dcbf and icbi whether it copies to that area at address 0 or to a save area. And that is followed by isync. (And more, sync and eieio: Apple seems paranoid.)

There's a lot of paranoia that is useful to attach to this process. Open Firmware is supposed to restore its own exception vectors (and does, in general) if it needs to. We could imagine doing that too. Doing that would probably involve a great deal of work implementing the OF callback infrastructure that we don't currently support, however.

Is there a reason you think the exception vectors are causing a problem?
-Nathan

> Apple used dcbf instead of dcbst.
>
> IBM writes of dcbst vs. dcbf:
>
>> Instruction caches, if they exist, are not required to be consistent with data caches, memory, or I/O data transfers. Software must use the appropriate cache management instructions to ensure that instruction caches are kept coherent when instructions are modified by the processor or by input data transfer. When a processor alters a memory location that may be contained in an instruction cache, software must ensure that updates to memory are visible to the instruction fetching mechanism. Although the instructions to enforce consistency vary among implementations, the following sequence for a uniprocessor system is typical:
>> 1. dcbst (update memory)
>> 2. sync (wait for update)
>> 3. icbi (invalidate copy in instruction cache)
>> 4. isync (perform context synchronization)
>> Note: Most operating systems will provide a system service for this function. These operations are necessary because the memory may be designated as write-back. Since instruction fetching may bypass the data cache, changes made to items in the data cache may not otherwise be reflected in memory until after the instruction fetch completes.
>>
>> For implementations used in multiprocessor systems, variations on this sequence may be recommended. For example, in a multiprocessor system with a unified instruction/data cache (at any level), if instructions are fetched without coherency being enforced, the preceding instruction sequence is inadequate. Because the icbi instruction does not invalidate blocks in a unified cache, a dcbf instruction should be used instead of a dcbst instruction for this case.
>
> Then the point given that background information...
>
> FreeBSD's powerpc64/GENERIC64 seems to have a mix of dcbst and dcbf use.
The following have dcbst (unless patched separately at run time): > > 000000000086c1e8 <.agp_apple_unbind_page+0x60> dcbst r0,r0 > > 000000000086c27c <.agp_apple_bind_page+0x64> dcbst r0,r0 > > 00000000008b1b78 <.elf_reloc_internal+0x12c> dcbst r0,r30 > > 00000000008bcd30 <.__syncicache+0x38> dcbst r0,r0 > > That last is used during the openfirmware vs. kernel transitions. The > above are from "objdump -d --prefix-address /boot/kernel/kernel". > > Is the dcbst use risky because of any unified caches at any level on > any of the processors that powerpc64/GENERIC64 is supposed to handle? > > > === > Mark Millard > markmi at dsl-only.net > From owner-freebsd-ppc@FreeBSD.ORG Tue Sep 23 05:35:58 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 51AF4D0B for ; Tue, 23 Sep 2014 05:35:58 +0000 (UTC) Received: from asp.reflexion.net (outbound-240.asp.reflexion.net [69.84.129.240]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C841FE46 for ; Tue, 23 Sep 2014 05:35:57 +0000 (UTC) Received: (qmail 30640 invoked from network); 23 Sep 2014 05:35:55 -0000 Received: from unknown (HELO mail-cs-02.app.dca.reflexion.local) (10.81.19.2) by 0 (rfx-qmail) with SMTP; 23 Sep 2014 05:35:55 -0000 Received: by mail-cs-02.app.dca.reflexion.local (Reflexion email security v7.30.7) with SMTP; Tue, 23 Sep 2014 01:35:55 -0400 (EDT) Received: (qmail 10104 invoked from network); 23 Sep 2014 05:35:55 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (DHE-RSA-AES256-SHA encrypted) SMTP; 23 Sep 2014 05:35:55 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] 
(c-98-246-178-138.hsd1.or.comcast.net [98.246.178.138]) by iron2.pdx.net (Postfix) with ESMTPSA id 10DCB1C402B; Mon, 22 Sep 2014 22:35:47 -0700 (PDT)
Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\))
Subject: Re: powerpc64/GENERIC64 use of dcbst vs. dcbf: is the dcbst use really okay? Anyone know?
From: Mark Millard
In-Reply-To: <5420FD56.2020003@freebsd.org>
Date: Mon, 22 Sep 2014 22:35:53 -0700
Message-Id: <9C7E071E-2547-47F9-98B3-54F4908C4932@dsl-only.net>
References: <5420FD56.2020003@freebsd.org>
To: Nathan Whitehorn
X-Mailer: Apple Mail (2.1878.6)
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1
Cc: Justin Hibbits , FreeBSD PowerPC ML
X-BeenThere: freebsd-ppc@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Porting FreeBSD to the PowerPC
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 23 Sep 2014 05:35:58 -0000

I keep trying to get more evidence of what is going on for the PowerMac G5 boot-hang problem. (I'd like full memory with reliable boots some day...)

In the process I look around and research what I see.

And that can lead to such questions even if the question formed is not a possible explanation of a contribution to the G5 hangs.

In this case experiments had shown that dcbf's did not change the boot-hang problem. But I still had the question.

I gather from your explanation that those 4 places that use dcbst do not have issues with processors fetching "without coherence being enforced". .__syncicache seemed to be for more general use than the others and so seemed less likely to have special criteria that might always apply. That is part of what prompted me to ask: analyzing all the usage to prove such properties was more than I wanted to take on.
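
For reference, here is a minimal sketch in C of the sequence from the IBM text quoted earlier in this thread (dcbst, sync, icbi, isync), with dcbf as the alternative that text recommends for the unified-cache case. This is not FreeBSD's .__syncicache source: the name sync_icache_range, its parameters, and the use_dcbf switch are made up for illustration, and the caller is assumed to supply the cache block size. On a non-PowerPC host the cache instructions are compiled out and only the block counting remains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__powerpc__) || defined(__powerpc64__) || defined(__ppc__)
#define HAVE_PPC_CACHE_OPS 1
#endif

/*
 * Hypothetical helper (illustration only): make instructions that were
 * just written to [addr, addr + len) visible to instruction fetch.
 * line_size is the cache block size in bytes and must be a power of two.
 * use_dcbf != 0 selects dcbf (write back and invalidate) instead of
 * dcbst (write back only).
 * Returns the number of cache blocks covered by the range.
 */
size_t
sync_icache_range(const void *addr, size_t len, size_t line_size, int use_dcbf)
{
	uintptr_t start, end, p;
	size_t blocks = 0;

	(void)use_dcbf;	/* unused when the cache ops are compiled out */
	assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
	if (len == 0)
		return 0;

	/* Round the start down to a block boundary. */
	start = (uintptr_t)addr & ~(uintptr_t)(line_size - 1);
	end = (uintptr_t)addr + len;

	/* Step 1: push each modified data cache block out to memory. */
	for (p = start; p < end; p += line_size) {
#ifdef HAVE_PPC_CACHE_OPS
		if (use_dcbf)
			__asm__ __volatile__("dcbf 0,%0" : : "r"(p) : "memory");
		else
			__asm__ __volatile__("dcbst 0,%0" : : "r"(p) : "memory");
#endif
		blocks++;
	}

#ifdef HAVE_PPC_CACHE_OPS
	/* Step 2: wait for the updates to complete. */
	__asm__ __volatile__("sync" : : : "memory");

	/* Step 3: invalidate any stale copies in the instruction cache. */
	for (p = start; p < end; p += line_size)
		__asm__ __volatile__("icbi 0,%0" : : "r"(p) : "memory");

	/* Step 4: context synchronization, discarding prefetched instructions. */
	__asm__ __volatile__("isync" : : : "memory");
#endif

	return blocks;
}
```

With a 128-byte block size, a 256-byte range that starts on a block boundary covers 2 blocks, and a 128-byte range that starts one byte past a boundary also covers 2.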
Side notes:

For Apple's BootX-81: if/when the kernel returns, BootX-81 does its own restore of the openfirmware exception vectors before returning to openfirmware (returning a -1). So they seem to not presume openfirmware itself does so in that place. (More paranoia code?)

One point of paranoia Apple did not follow in BootX-81: They have dcbf then the matching icbi with nothing between and do all of the pairs of those instructions before doing either just one "isync; sync; eieio" (going into the kernel) or just one "sync; mtmsr ...; isync" on return from the kernel (kernel exit).

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-ppc@FreeBSD.ORG Tue Sep 23 08:03:36 2014
Return-Path:
Delivered-To: freebsd-ppc@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 53785A30 for ; Tue, 23 Sep 2014 08:03:36 +0000 (UTC)
Received: from asp.reflexion.net (outbound-240.asp.reflexion.net [69.84.129.240]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id EE62AE5B for ; Tue, 23 Sep 2014 08:03:35 +0000 (UTC)
Received: (qmail 27148 invoked from network); 23 Sep 2014 08:03:33 -0000
Received: from unknown (HELO mail-cs-03.app.dca.reflexion.local) (10.81.19.3) by 0 (rfx-qmail) with SMTP; 23 Sep 2014 08:03:33 -0000
Received: by mail-cs-03.app.dca.reflexion.local (Reflexion email security v7.30.7) with SMTP; Tue, 23 Sep 2014 04:03:33 -0400 (EDT)
Received: (qmail 19183 invoked from network); 23 Sep 2014 08:03:33 -0000
Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (DHE-RSA-AES256-SHA encrypted) SMTP; 23 Sep 2014 08:03:33 -0000
X-No-Relay: not in my network
Received: from [192.168.1.8] (c-98-246-178-138.hsd1.or.comcast.net [98.246.178.138]) by
iron2.pdx.net (Postfix) with ESMTPSA id 3A4B51C4052 for ; Tue, 23 Sep 2014 01:03:32 -0700 (PDT)
From: Mark Millard
Subject: 10.1-BETA2 default /etc/motd vs. tier 2: portmaster misc/freebsd-doc-en with default configurations is amd64/i386 only
Message-Id:
Date: Tue, 23 Sep 2014 01:03:31 -0700
To: FreeBSD PowerPC ML
Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\))
X-Mailer: Apple Mail (2.1878.6)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1
X-BeenThere: freebsd-ppc@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Porting FreeBSD to the PowerPC
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 23 Sep 2014 08:03:36 -0000

I tried to install misc/freebsd-doc-en per the new default /etc/motd:

> Documents installed with the system are in the /usr/local/share/doc/freebsd/
> directory, or can be installed later with: pkg install en-freebsd-doc
> For other languages, replace "en" with a language code like de or fr.

(interpreted to a portmaster context using default configuration selections). But the result was:

> ===>>> misc/freebsd-doc-en >> devel/apache-ant >> java/bootstrap-openjdk (8/30)
>
> ===> Cleaning for bootstrap-openjdk-
> ===> bootstrap-openjdk- is only for amd64 i386, while you are running powerpc64.
> *** Error code 1
>
> Stop.
> make: stopped in /usr/ports/java/bootstrap-openjdk
>
> ===>>> make build failed for java/bootstrap-openjdk
> ===>>> Aborting update

Looks like /etc/motd and its instructions tend to assume a tier 1 amd64/i386 context. Elsewhere, the default selections for building /usr/local/share/doc/freebsd (via misc/freebsd-doc-en) need not build, and do not build for powerpc/powerpc64.

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-ppc@FreeBSD.ORG Tue Sep 23 22:53:40 2014
Return-Path:
Delivered-To: freebsd-ppc@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6E760F83 for ; Tue, 23 Sep 2014 22:53:40 +0000 (UTC)
Received: from asp.reflexion.net (outbound-240.asp.reflexion.net [69.84.129.240]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E52C07B2 for ; Tue, 23 Sep 2014 22:53:38 +0000 (UTC)
Received: (qmail 32183 invoked from network); 23 Sep 2014 22:53:37 -0000
Received: from unknown (HELO mail-cs-02.app.dca.reflexion.local) (10.81.19.2) by 0 (rfx-qmail) with SMTP; 23 Sep 2014 22:53:37 -0000
Received: by mail-cs-02.app.dca.reflexion.local (Reflexion email security v7.30.7) with SMTP; Tue, 23 Sep 2014 18:53:37 -0400 (EDT)
Received: (qmail 16716 invoked from network); 23 Sep 2014 22:53:36 -0000
Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (DHE-RSA-AES256-SHA encrypted) SMTP; 23 Sep 2014 22:53:36 -0000
X-No-Relay: not in my network
Received: from [192.168.1.8] (c-98-246-178-138.hsd1.or.comcast.net [98.246.178.138]) by iron2.pdx.net (Postfix) with ESMTPSA id A16EF1C4053 for ; Tue, 23 Sep 2014 15:53:31 -0700 (PDT)
From: Mark Millard
Message-Id: <7BA54C8F-5B1C-4F8A-B0FD-E218A1D3E1F8@dsl-only.net>
Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\))
Subject: Re: //lib/libm.so.5: could not read symbols: Bad value and /usr/bin/ld: : invalid DSO for symbol `sin@@FBSD_1.0' definition
Date: Tue, 23 Sep 2014 15:53:35 -0700
References: <6FE3262D-7AC1-4A1A-B298-5DEABAE37750@dsl-only.net>
To: FreeBSD PowerPC ML
In-Reply-To: <6FE3262D-7AC1-4A1A-B298-5DEABAE37750@dsl-only.net>
X-Mailer: Apple Mail (2.1878.6)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1
X-BeenThere: freebsd-ppc@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Porting FreeBSD to the PowerPC
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 23 Sep 2014 22:53:40 -0000

For 10.1-BETA2 I used the MANIFEST and *.txz files with bsdinstall to make a powerpc/GENERIC SSD where I did not rebuild the world or kernel but did attempt portmaster my usual ports. No WITH_DEBUG= or other such added. But WRKDIRPREFIX=(path not listed here) present.

It still got the:

/usr/bin/ld: : invalid DSO for symbol `sin@@FBSD_1.0' definition
//lib/libm.so.5: could not read symbols: Bad value
*** [fractals] Error code 1

So even the standard way of building distributions has the problem for powerpc/powerpc64.

uname -a output:

FreeBSD FBSDG4S0 10.1-BETA2 FreeBSD 10.1-BETA2 #0 r271848: Fri Sep 19 03:54:33 UTC 2014 root@releng1.nyi.freebsd.org:/usr/obj/powerpc.powerpc/usr/src/sys/GENERIC powerpc

It would appear that .../graphics/freeglut/work/freeglut-2.8.1/configure.ac generation of progs/demos/Fractals/Makefile via:

# Generate output.
AC_CONFIG_FILES([
Makefile
doc/Makefile
include/GL/Makefile
include/Makefile
progs/Makefile
progs/demos/CallbackMaker/Makefile
progs/demos/Fractals/Makefile
progs/demos/Fractals_random/Makefile
progs/demos/Lorenz/Makefile
progs/demos/Makefile
progs/demos/One/Makefile
progs/demos/shapes/Makefile
progs/demos/smooth_opengl3/Makefile
progs/demos/spaceball/Makefile
progs/demos/subwin/Makefile
src/Makefile
])
AC_OUTPUT

needs to not only have the produced progs/demos/Fractals/Makefile contain:

LIBM = -lm

(which it does) but to put LIBM to use by effectively adding $(LIBM) to:

LIBS = -lXi -lXrandr -lXxf86vm

or some other way of having -lm show up in the link command. (Other things may need similar -l's.)

If true then the problem is not in/with libm.so.5 itself.

My guess is that .../graphics/freeglut/work/freeglut-2.8.1/progs/demos/Fractals/Makefile.am should have:

fractals_LDADD = ../../../src/lib@LIBRARY@.la $(GL_LIBS) $(LIBM)

(I added the $(LIBM).)

If so it would appear that pre-configure: in /usr/ports/graphics/freeglut/Makefile might use something like:

@${REINPLACE_CMD} -e "s|\$(GL_LIBS)|$(GL_LIBS) $(LIBM)|g" \
${WRKSRC}/progs/demos/Fractals/Makefile.am

(Again: There may be more than just Fractals and libm.so.5 involved overall.)

Or maybe a patch file for progs/demos/Fractals/Makefile.am could be set up.

===
Mark Millard
markmi at dsl-only.net

On Sep 18, 2014, at 12:02 AM, Mark Millard wrote:

For 10.1-??? I've been getting:

/usr/bin/ld: : invalid DSO for symbol `sin@@FBSD_1.0' definition
//lib/libm.so.5: could not read symbols: Bad value
*** [fractals] Error code 1

make[6]: stopped in /usr/obj/portswork/usr/ports/graphics/freeglut/work/freeglut-2.8.1/progs/demos/Fractals

when I attempt to portmaster xscreensaver. (The rest of the ports I try to build work fine, including all their dependencies. If xscreensaver finished it would be about 409 ports involved in all.)

I now note it to the list because I've now tried on powerpc/GENERIC and powerpc64/GENERIC64 with and without /etc/make.conf having:

WITH_DEBUG_FILES=
WITHOUT_CLANG=
WITH_DEBUG=

[WRKDIRPREFIX=(path not listed here) always present]

when I buildworld kernel and use portmaster for the ports.

It appears that no matter what style of build on a PowerMac under either powerpc/GENERIC or powerpc64/GENERIC64 /lib/libm.so.5 ends up with this problem (or the ld checks for invalid DSO's end up wrong --or both).

I first noticed this with 10.1-PRERELEASE

FreeBSD FBSDG4S0 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #0 r271215: Sat Sep 6 23:56:15 PDT 2014 root@FBSDG4S0:/usr/obj/usr/src/sys/GENERIC powerpc

I can not claim just what was the last prior working case I had with 10.0-STABLE but all those were "as distributed" installs instead of personal "buildworld kernel" based on source updates. For 10.1-??? I've been experimenting with source based tracking/building, mostly building on Quad-core PowerMac G5s (booted with either GENERIC based or GENERIC64 based worlds/kernels, up to DDB/GDB being added or not). Here GENERIC and GENERIC64 were not updated at all.

It has continued with the likes of

FreeBSD FBSDG4S1 10.1-BETA1 FreeBSD 10.1-BETA1 #1 r271610M: Wed Sep 17 21:47:20 PDT 2014 root@FBSDG4S1:/usr/obj/usr/src/sys/GENERIC powerpc

and its GENERIC64 variant. (M in r271610M because of DDB and GDB options added to GENERIC and GENERIC64.)

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-ppc@FreeBSD.ORG Tue Sep 23 23:38:20 2014
Return-Path:
Delivered-To: freebsd-ppc@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C71898A8 for ; Tue, 23 Sep 2014 23:38:20 +0000 (UTC)
Received: from d.mail.sonic.net (d.mail.sonic.net [64.142.111.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id ABAFCB65 for ; Tue, 23 Sep 2014 23:38:20 +0000 (UTC)
Received: from aurora.physics.berkeley.edu (aurora.Physics.Berkeley.EDU [128.32.117.67]) (authenticated bits=0) by d.mail.sonic.net (8.14.9/8.14.9) with ESMTP id s8NNcF2n012847 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Tue, 23 Sep 2014 16:38:15 -0700
Message-ID: <54220467.5070603@freebsd.org>
Date: Tue, 23 Sep 2014 16:38:15 -0700
From: Nathan Whitehorn User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.0 MIME-Version: 1.0 To: Mark Millard , FreeBSD PowerPC ML Subject: Re: //lib/libm.so.5: could not read symbols: Bad value and /usr/bin/ld: : invalid DSO for symbol `sin@@FBSD_1.0' definition References: <6FE3262D-7AC1-4A1A-B298-5DEABAE37750@dsl-only.net> <7BA54C8F-5B1C-4F8A-B0FD-E218A1D3E1F8@dsl-only.net> In-Reply-To: <7BA54C8F-5B1C-4F8A-B0FD-E218A1D3E1F8@dsl-only.net> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Sonic-CAuth: UmFuZG9tSVYF4ScJwGma5U+wIWw0AKR4Wzr4FEyYOuKkIshnwo1G3LOHTdl+GKEVuxj10WBfXtGw8IBU3YNK2G8UYGIDmEUU9HrcGgla7iw= X-Sonic-ID: C;OBSRsHpD5BGtEQDu5Qupew== M;TLi6sHpD5BGtEQDu5Qupew== X-Spam-Flag: No X-Sonic-Spam-Details: 0.0/5.0 by cerberusd X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Sep 2014 23:38:20 -0000 You might want to ask the people over on freebsd-toolchain about this. It looks like some issue with binutils. -Nathan On 09/23/14 15:53, Mark Millard wrote: > For 10.1-BETA2 I used the MANIFEST and *.txz files with bsdinstall to make a powerpc/GENERIC SSD where I did not rebuild the world or kernel but did attempt portmaster my usual ports. No WITH_DEBUG= or other such added. But WRKDIRPREFIX=(path not listed here) present. > > It still got the: > > /usr/bin/ld: : invalid DSO for symbol `sin@@FBSD_1.0' definition > //lib/libm.so.5: could not read symbols: Bad value > *** [fractals] Error code 1 > > So even the standard way of building distributions has the problem for powerpc/powerpc64. 
> > > === > Mark Millard > markmi at dsl-only.net > > > _______________________________________________ > freebsd-ppc@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-ppc > To unsubscribe, send any mail to "freebsd-ppc-unsubscribe@freebsd.org" > From owner-freebsd-ppc@FreeBSD.ORG Wed Sep 24 00:15:44 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 524D1CD3; Wed, 24 Sep 2014 00:15:44 +0000 (UTC) Received: from mail-lb0-x22a.google.com (mail-lb0-x22a.google.com [IPv6:2a00:1450:4010:c04::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A0897E74; Wed, 24 Sep 2014 00:15:43 +0000 (UTC) Received: by mail-lb0-f170.google.com with SMTP id z11so4366968lbi.15 for ; Tue, 23 Sep 2014 17:15:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=iNb0k+ME+cLUDr3/rs8ygeANW1Rb7yZbbhNtra9Gy0U=; b=CmTZ6/cojtXD0++o6f+URZsUIBPJ2CQV/o5hhFB+TBuMKGj35xYnML8Adm3WRAJs6B Uhd5Atkby528z6PNpqWIYVN0UykSll+AcXPCQP8f7Dzwlow1h4Cy9r9/O5WJgav2PC4h HNRJtWgVuBK/sPyXuwN6QzRKVWBv6xXL1NCF8DyrbU9QOLcR9EUbl50MIV2/sBZnNwAU wtzzE7By8Ye/lpAU1PIZm4plIic9zNfOzfxbtmObfrGPDLU0LU+O2NEVRKrqEUq77Afp 5PTVAzclnN0eX4f1tWdUs+lSNbeN4AyL/HLKjqDxpiZVfp6bgA3QB42WKrR3duHT5oHF PPQw== MIME-Version: 1.0 X-Received: by 10.112.184.161 with SMTP id ev1mr2621658lbc.82.1411517741415; Tue, 23 Sep 2014 17:15:41 -0700 (PDT) Sender: chmeeedalf@gmail.com Received: by 10.25.15.29 with HTTP; Tue, 23 Sep 2014 17:15:41 -0700 (PDT) In-Reply-To: <54220467.5070603@freebsd.org> References: <6FE3262D-7AC1-4A1A-B298-5DEABAE37750@dsl-only.net> 
<7BA54C8F-5B1C-4F8A-B0FD-E218A1D3E1F8@dsl-only.net> <54220467.5070603@freebsd.org> Date: Tue, 23 Sep 2014 17:15:41 -0700 X-Google-Sender-Auth: v0MarXybXNEcd2QhdPbUEjoLV2E Message-ID: Subject: Re: //lib/libm.so.5: could not read symbols: Bad value and /usr/bin/ld: : invalid DSO for symbol `sin@@FBSD_1.0' definition From: Justin Hibbits To: Nathan Whitehorn Content-Type: text/plain; charset=UTF-8 Cc: FreeBSD PowerPC ML , Mark Millard X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 24 Sep 2014 00:15:44 -0000 Actually it's an issue with the port. The toolchain guys made ld more strict, requiring all libraries to be specified on the command line, rather than following the DT_NEEDED links. There is a whole host of ports that have the problem of missing -lm. Some have been fixed, but many more remain. - Justin On Tue, Sep 23, 2014 at 4:38 PM, Nathan Whitehorn wrote: > You might want to ask the people over on freebsd-toolchain about this. It > looks like some issue with binutils. > -Nathan > > > On 09/23/14 15:53, Mark Millard wrote: >> >> For 10.1-BETA2 I used the MANIFEST and *.txz files with bsdinstall to make >> a powerpc/GENERIC SSD where I did not rebuild the world or kernel but did >> attempt portmaster my usual ports. No WITH_DEBUG= or other such added. But >> WRKDIRPREFIX=(path not listed here) present. >> >> It still got the: >> >> /usr/bin/ld: : invalid DSO for symbol `sin@@FBSD_1.0' definition >> //lib/libm.so.5: could not read symbols: Bad value >> *** [fractals] Error code 1 >> >> So even the standard way of building distributions has the problem for >> powerpc/powerpc64. 
>>
>> It has continued with the likes of
>>
>> FreeBSD FBSDG4S1 10.1-BETA1 FreeBSD 10.1-BETA1 #1 r271610M: Wed Sep 17
>> 21:47:20 PDT 2014 root@FBSDG4S1:/usr/obj/usr/src/sys/GENERIC powerpc
>>
>> and its GENERIC64 variant. (M in r271610M because of DDB and GDB options
>> added to GENERIC and GENERIC64.)
>>
>> ===
>> Mark Millard
>> markmi at dsl-only.net
>>
>> _______________________________________________
>> freebsd-ppc@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-ppc
>> To unsubscribe, send any mail to "freebsd-ppc-unsubscribe@freebsd.org"
>>

From owner-freebsd-ppc@FreeBSD.ORG Wed Sep 24 01:25:52 2014
Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (DHE-RSA-AES256-SHA encrypted) SMTP; 24
Sep 2014 01:25:43 -0000
From: Mark Millard
Subject: Re: powerpc64/GENERIC64: a mtmsrd without a "context synchronizing instruction" (immediately?) following...
Date: Tue, 23 Sep 2014 18:25:42 -0700
References: <5A754BA9-544A-408F-B45C-691627DCA4ED@dsl-only.net>
To: FreeBSD PowerPC ML, Nathan Whitehorn
In-Reply-To: <5A754BA9-544A-408F-B45C-691627DCA4ED@dsl-only.net>

Nathan Whitehorn wrote of the use of the document that I was referencing:

> I think you are looking at very old documentation. The 32-bit mtmsr is
> implemented on all POWER ISA compliant CPUs (see e.g. page 886 of the
> 2.07 document).
> -Nathan

I think we may be using different documents rather than different versions of the same document. I may need to find what Nathan is using and its time frame (PowerPC Architecture 2.07?). But he may want to check what I've been referencing. So...

pem_64bit_v3.0.2005jul15.pdf is Version 3.0 and directly from the IBM site and has 657 pages...

https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/F7E732FF811F783187256FDD004D3797

It is the current document of its type as far as I can tell.
The title is:

> PowerPC® Microprocessor Family:
>
> The Programming Environments Manual for 64-bit Microprocessors
>
> Version 3.0
>
> July 15, 2005.

It is described on its web page as:

> This manual is to help programmers provide software that is compatible across the family of PowerPC processors. This book provides a general description of features common to PPC processors and indicates those features that are optional or that may be implemented differently in the design of each processor. This book is for only 64-bit processors.

It is different from other architecture documents in that it also documents the Operating Environment Architecture (supervisor level/privileged-state resources for operating systems), not just the UISA and VEA. The document warns that while the UISA is always adhered to there can be VEA and OEA variations that the document does not cover. But it also says that the "general-purpose" PowerPC microprocessors comply with the document. In its own words...

> The three levels of the PowerPC Architecture are defined as follows:
>
> PowerPC user instruction set architecture (UISA) - The UISA defines the level of the architecture to which user-level (referred to as problem state in the architecture specification) software should conform. The UISA defines the base user-level instruction set, user-level registers, data types, floating-point memory conventions and exception model as seen by user programs, and the memory and programming models. The icon shown in the margin identifies text that is relevant with respect to the UISA.
> PowerPC virtual environment architecture (VEA) - The VEA defines additional user-level functionality that falls outside typical user-level software requirements.
The VEA describes the memory model for an environment in which multiple devices can access memory, defines aspects of the cache model, defines cache control instructions, and defines the time base facility from a user-level perspective. The icon shown in the margin identifies text that is relevant with respect to the VEA.
> PowerPC operating environment architecture (OEA) - The OEA defines supervisor-level (referred to as privileged state in the architecture specification) resources typically required by an operating system. The OEA defines the PowerPC memory management model, supervisor-level registers, synchronization requirements, and the exception model. The OEA also defines the time base feature from a supervisor-level perspective. The icon shown in the margin identifies text that is relevant with respect to the OEA.
> Implementations that adhere to the VEA level are guaranteed to adhere to the UISA level, but may not necessarily adhere to the OEA level; likewise, implementations that conform to the OEA level are also guaranteed to conform to the UISA and the VEA levels.
> All PowerPC devices adhere to the UISA, offering compatibility among all PowerPC application programs. However, there may be different versions of the VEA and OEA than those described here. For example, some devices, such as embedded controllers, may not require some of the features as defined by this VEA and OEA, and may implement a simpler or modified version of those features.
> The general-purpose PowerPC microprocessors comply both with the UISA and with the VEA and OEA discussed here. In this book, these three levels of the architecture are referred to collectively as the PowerPC Architecture.
The distinctions between the levels of the PowerPC Architecture are maintained clearly throughout this manual, using the conventions described in the Section Conventions on page 25.

===
Mark Millard
markmi at dsl-only.net

On Sep 22, 2014, at 7:00 PM, Mark Millard wrote:

Context: 10.1-BETA2 powerpc64/GENERIC64 (with option DDB and option GDB).

(I later quote from pem_64bit_v3.0.2005jul15.pdf (from IBM).)

IBM writes of mtmsr/mtmsrd:

> For software that will run on processors that comply with earlier versions of the architecture, a context synchronizing instruction is required after the mtmsr[d] instruction.

That sort of principle does not seem to be followed by one example in powerpc64/GENERIC64:

0000000000102168 <.__start+0x78> rldimi r9,r8,63,0
000000000010216c <.__start+0x7c> mtmsrd r9
0000000000102170 <.__start+0x80> bl 0000000000101120
0000000000102174 <.__start+0x84> ld r2,40(r1)
0000000000102178 <.__start+0x88> lis r3,16
000000000010217c <.__start+0x8c> addi r3,r3,0
...

The other mtmsr's/mtmsrd's that I found had one or two isync's following, providing the context synchronizing instruction.

IBM also reports:

> Processors designed prior to Version 2.01 of the architecture ignore the L field. These processors set the MSR as if L were '0', and perform synchronization as if L were '1'. Therefore software that uses mtmsrd and runs on such processors must obey the following rules.
>
> If L = '1', the contents of bits of register rS other than bits [48] and [62] must be such that if L were '0' the instruction would not alter the contents of the corresponding MSR bits.
> If L = '0' and the instruction alters the contents of any of the MSR bits listed below, the instruction must be followed by a context synchronizing instruction or event in order to ensure that the context alteration caused by the mtmsrd instruction has taken effect on such processors.
> To obtain the best performance on processors, if the context synchronizing instruction is isync the isync should immediately follow the mtmsrd. (Some such processors treat an isync instruction that immediately follows an mtmsrd instruction having L = '0' as a no-op, thereby avoiding the performance penalty of a second context synchronization.)
>

Another interesting IBM note for mtmsr (not mtmsrd), but effectively just a side note here:

> The mtmsr instruction, which is otherwise illegal in the 64-bit architecture may optionally be implemented in 64-bit bridge implementations.

FreeBSD powerpc64/GENERIC64 seems to use mtmsr fairly freely. (k_trap, trapexit, asttrapexit, .breakpoint, dbtrap, dbleave, ichss_set, prof_clock_cnt, hardclock_cpu, kdb_trap, powerpc_interrupt, flush_disable_caches, spinlock_exit, spinlock_enter, powerpc_init, cpu_sleep, moea64_add_ofw_mappings, moea64_late_bootstrap, moea64_mid_bootstrap, moea64_cpu_bootstrap_native, moea64_bootstrap_native, write_scom, read_scom, pcr_set, openfirmware_core, save_vec, enable_vec, configure_final, cpu_est_clockrate, cpu_idle_60x, save_fpu, enable_fpu, mps3_cpu_bootstrap. Apple also used mtmsr (not mtmsrd) in the openfirmware vs. kernel transitions in the published BootX-81 source code.)
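
For anyone wanting to repeat the check described in this message (an mtmsr/mtmsrd whose next instruction is not an isync), a listing can be scanned mechanically. The following is only a rough sketch: it assumes disassembly lines in the "address <symbol+offset> mnemonic operands" layout quoted in this message, the helper name find_unsynced_msr_writes is made up for illustration, and it looks only at the very next instruction, so it will also flag sequences that are in fact covered by some other context synchronizing instruction or event.

```python
import re

# One disassembly line: "<hex address> <symbol+offset> mnemonic [operands]".
LINE_RE = re.compile(r"^\s*([0-9a-fA-F]+)\s+<([^>]*)>\s+(\S+)")


def find_unsynced_msr_writes(lines):
    """Return (address, location, mnemonic) for each mtmsr/mtmsrd whose
    next decoded instruction is not isync. Hypothetical helper."""
    decoded = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            decoded.append((int(match.group(1), 16), match.group(2), match.group(3)))

    hits = []
    for i, (addr, where, mnemonic) in enumerate(decoded):
        if mnemonic not in ("mtmsr", "mtmsrd"):
            continue
        following = decoded[i + 1][2] if i + 1 < len(decoded) else None
        if following != "isync":
            hits.append((addr, where, mnemonic))
    return hits


# The .__start excerpt from this message (the bl target symbol is not shown).
SAMPLE = """\
0000000000102168 <.__start+0x78> rldimi r9,r8,63,0
000000000010216c <.__start+0x7c> mtmsrd r9
0000000000102170 <.__start+0x80> bl 0000000000101120
0000000000102174 <.__start+0x84> ld r2,40(r1)
""".splitlines()

for addr, where, mnemonic in find_unsynced_msr_writes(SAMPLE):
    print(f"{mnemonic} at {addr:#x} <{where}> is not followed by isync")
```

A hit from a scan like this is a place to look, not a verdict: whether a given mtmsr[d] needs the isync depends on which MSR bits it changes and on what follows it.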

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-ppc@FreeBSD.ORG Wed Sep 24 01:33:39 2014
Message-ID: <54221F6F.3080100@freebsd.org>
Date: Tue, 23 Sep 2014 18:33:35 -0700
From: Nathan Whitehorn
To: Mark Millard, FreeBSD PowerPC ML
Subject: Re: powerpc64/GENERIC64: a mtmsrd without a "context synchronizing instruction" (immediately?) following...
References: <5A754BA9-544A-408F-B45C-691627DCA4ED@dsl-only.net>

On 09/23/14 18:25, Mark Millard wrote:
> Nathan Whitehorn wrote of the use of the document that I was referencing:
>
>> I think you are looking at very old documentation. The 32-bit mtmsr is
>> implemented on all POWER ISA compliant CPUs (see e.g. page 886 of the
>> 2.07 document).
>> -Nathan
>
> I think we may be using different documents rather than different
> versions of the same document. I may need to find what Nathan is using
> and its time frame (PowerPC Architecture 2.07?). But he may want to
> check what I've been referencing. So...
>
> pem_64bit_v3.0.2005jul15.pdf is Version 3.0 and directly from the IBM
> site and has 657 pages...
>
> https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/F7E732FF811F783187256FDD004D3797
>
> It is the current document of its type as far as I can tell. The title is:
>>
>> PowerPC® Microprocessor Family:
>>
>> The Programming Environments Manual for 64-bit Microprocessors
>>
>> Version 3.0
>>
>> July 15, 2005.
>

Right, this is massively obsolete. Apparently they were planning to deprecate mtmsr and changed their minds. You want the current one from https://www.power.org/documentation/power-isa-version-2-07/.

> It is described on its web page as:
>
>> This manual is to help programmers provide software that is
>> compatible across the family of PowerPC processors. This book
>> provides a general description of features common to PPC processors
>> and indicates those features that are optional or that may be
>> implemented differently in the design of each processor. This book is
>> for only 64-bit processors.
>
> It is different from other architecture documents in that it also
> documents the Operating Environment Architecture (supervisor
> level/privileged-state resources for operating systems), not just the
> UISA and VEA. The document warns that while the UISA is always adhered
> to there can be VEA and OEA variations that the document does not
> cover. But it also says that the "general-purpose" PowerPC
> microprocessors comply with the document. In its own words...
>

Right, this is the same with the current version of ISA. Book-3S describes what was called the OEA at one point. In any event, your machine (a PowerPC 970) certainly supports the instruction.
-Nathan

From owner-freebsd-ppc@FreeBSD.ORG Wed Sep 24 03:38:10 2014
Subject: Re: powerpc64/GENERIC64: a mtmsrd without a "context synchronizing instruction" (immediately?) following...
From: Mark Millard
In-Reply-To: <54221F6F.3080100@freebsd.org>
Date: Tue, 23 Sep 2014 20:38:02 -0700
Message-Id: <92E42AE0-264B-4729-996B-15182E0A3BE3@dsl-only.net>
References: <5A754BA9-544A-408F-B45C-691627DCA4ED@dsl-only.net> <54221F6F.3080100@freebsd.org>
To: Nathan Whitehorn
Cc: FreeBSD PowerPC ML

Thanks for the reference! Better than what I'd found for sure. Agreed for what the PowerMac's processor supports.

Just FYI since I was curious and looked it up: The page labeled 1352 of the Power ISA 2.07 document says in section A.23 about the Move To Machine State Register Instruction...

> Also, mtmsr is optional in Power ISA but required in POWER.

From 1.3.5 "Categories"...

B: "Base" is required for all implementations
S: required for Server implementations
E: required for Embedded implementations
64: Required for 64-bit implementations; not defined for 32-bit implementations

(I'm omitting lots of categories, including dependent subcategories.)

Appendix I lists both S and E for mtmsr and lists S for mtmsrd. B and 64 are not mentioned for either.

So it would take wanting to span a non-server/non-embedded processor before there would be a problem for mtmsr. It only takes spanning a non-server for there to possibly be a problem for mtmsrd. mtmsrd shares "S" status with rvwinkle, described in the table as the "Rip Van Winkle" instruction.

Lots of instructions have S categorization. A non-B "S and E" double categorization is fairly rare. E by itself has some TLB variants, wrtee, wrteei but little else.
I'm not counting E.??'s or S.??'s here.

Interesting. Thanks again.

===
Mark Millard
markmi@dsl-only.net

On Sep 23, 2014, at 6:33 PM, Nathan Whitehorn wrote:

On 09/23/14 18:25, Mark Millard wrote:
> Nathan Whitehorn wrote of the use of the document that I was referencing:
>
>> I think you are looking at very old documentation. The 32-bit mtmsr is
>> implemented on all POWER ISA compliant CPUs (see e.g. page 886 of the
>> 2.07 document).
>> -Nathan
>
> I think we may be using different documents rather than different versions of the same document. I may need to find what Nathan is using and its time frame (PowerPC Architecture 2.07?). But he may want to check what I've been referencing. So...
>
> pem_64bit_v3.0.2005jul15.pdf is Version 3.0 and directly from the IBM site and has 657 pages...
>
> https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/F7E732FF811F783187256FDD004D3797
>
> It is the current document of its type as far as I can tell. The title is:
>> PowerPC® Microprocessor Family:
>>
>> The Programming Environments Manual for 64-bit Microprocessors
>>
>> Version 3.0
>>
>> July 15, 2005.
>

Right, this is massively obsolete. Apparently they were planning to deprecate mtmsr and changed their minds. You want the current one from https://www.power.org/documentation/power-isa-version-2-07/.

> It is described on its web page as:
>
>> This manual is to help programmers provide software that is compatible across the family of PowerPC processors. This book provides a general description of features common to PPC processors and indicates those features that are optional or that may be implemented differently in the design of each processor. This book is for only 64-bit processors.
>
>
> It is different from other architecture documents in that it also documents the Operating Environment Architecture (supervisor level/privileged-state resources for operating systems), not just the UISA and VEA. The document warns that while the UISA is always adhered to there can be VEA and OEA variations that the document does not cover. But it also says that the "general-purpose" PowerPC microprocessors comply with the document. In its own words...
>

Right, this is the same with the current version of ISA. Book-3S describes what was called the OEA at one point. In any event, your machine (a PowerPC 970) certainly supports the instruction.
-Nathan

From owner-freebsd-ppc@FreeBSD.ORG Wed Sep 24 07:27:29 2014
From: Mark Millard
Subject: A different PowerMac G5 boot crash but with a backtrace: fails at .pvo_vaddr_compare+0x14, instruction ld r0, r4, 0x58
Date: Wed, 24 Sep 2014 00:20:43 -0700
To: FreeBSD PowerPC ML, Nathan Whitehorn

powerpc64/GENERIC64 on PowerMac G5 Quad Core: I caught a different kernel/boot crash with a backtrace, failing at .pvo_vaddr_compare+0x14: ld r0,88(r4). Unfortunately with my current "show registers; bt; show registers/u; bt/u" script the beginning of the text from before the show's scrolled off screen. Still...
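
As a cross-check on the faulting instruction: for ld r0,88(r4) the effective address is the contents of r4 plus the displacement 88 (0x58). Below is a minimal sketch of that arithmetic, using the r4 and dar values from the register dump in this report. The helper name d_form_ea is only for illustration, and the values are hand-transcribed, so treat the result with the same caution as the dump itself.

```python
MASK64 = (1 << 64) - 1


def d_form_ea(base, displacement):
    """Effective address of a load such as ld rT,disp(rA) with rA != 0:
    base register contents plus the signed displacement, modulo 2**64."""
    return (base + displacement) & MASK64


r4 = 0x2E123E8   # r4 as transcribed in the register dump
dar = 0x2E12440  # data address register from the same dump

ea = d_form_ea(r4, 88)
print(f"r4 + 88 = {ea:#x}, dar = {dar:#x}, match = {ea == dar}")
```

With the values as transcribed the two agree, which is at least consistent with the fault being the data access made by that ld itself.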

register r4: 0x2e123e8
dar: 2e12440
srr0: 0x8b8648 .pvo_vaddr_compare+0x14
lr: 0x98b8fac .pvo_tree_RB_FIND+0x38
ctr: 0x883840 moea64_dev_direct_mapped

.pvo_vaddr_compare+0x14, instruction ld r0, r4, 0x58 [or ld r0,88(r4) in an alternate notation]
.pvo_tree_RB_FIND+0x38
.moea64_dev_direct_mapped+0x90
.pmap_direct_mapped+0x84
.bs_remap_earlyboot+0x6c
.moea64_late_bootstrap+0x178
.moea64_bootstrap_native+0x120
.pmap_bootstrap+0xac
.powerpc_init+0x514
btext+0xa8

srr1: 9000000000003030
cr: 2400024
xer: 0
dsisr: 40000000
r0: 0x98008000
r1: 0xbda9f0 tmpstk+0x39f0
r2: 0xd18468
r3: 0xbdab38 tmpstk+0x3938
r4: 0x2e123e8
r5: 0xe10000 __pcpu+0xa80
r6: 0
r7: 0
r8: 0xf
r9: 0x98008000
r10: 0x1
r11: 0
r12: 0x10000000
r13: 0xbdd290 thread0
r14-r19: all 0
r20: 0x10c1000
r21: 0x4
r22: 0x1801bd4
r23: 0xe42bf0 earlyboot_mapping
r24: 0
r25: 0
r26: 0x100000 kernbase
r27: 0xe42bf0 earlyboot_mapping
r28: 0xe10000 __pcpu+0xa80
r29: 0xbdab38 tmpstk+0x3938
r30: 0x2e123e8
r31: 0xbda9f0 tmpstk+0x39f0

Context:

FreeBSD FBSDG5M1 10.1-BETA2 FreeBSD 10.1-BETA2 #4 r271944M: Tue Sep 23 22:39:02 PDT 2014 root@FBSDG5M1:/usr/obj/usr/src/sys/GENERIC64 powerpc

$ svnlite diff /usr/src/sys
Index: /usr/src/sys/ddb/db_script.c
===================================================================
--- /usr/src/sys/ddb/db_script.c (revision 271944)
+++ /usr/src/sys/ddb/db_script.c (working copy)
@@ -319,10 +319,25 @@
 {
 	char scriptname[DB_MAXSCRIPTNAME];
 
+	/* HACK!!! : Additional lines to force a basic default script to exist.
+	 * Will dump information even if ddb input is not available for early crash.
+	 * Used to get more information about PowerMac G5 "before Copyright" hangs.
+	 */
+	struct ddb_script *dsp = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT);
+	if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; bt; show registers/u; bt/u");
+
 	snprintf(scriptname, sizeof(scriptname), "%s.%s",
 	    DB_SCRIPT_KDBENTER_PREFIX, eventname);
 	if (db_script_exec(scriptname, 0) == ENOENT)
 		(void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
+
+	/* HACK!!! : Additional lines to always use the default script,
+	 * even if scriptname existed and was executed.
+	 * Will dump information even if ddb input is not available for early crash.
+	 * Used to get more information about PowerMac G5 "before Copyright" hangs.
+	 */
+	else
+		(void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
 }
 
 /*-
Index: /usr/src/sys/powerpc/conf/GENERIC64
===================================================================
--- /usr/src/sys/powerpc/conf/GENERIC64 (revision 271944)
+++ /usr/src/sys/powerpc/conf/GENERIC64 (working copy)
@@ -76,6 +76,8 @@
 # Debugging support. Always need this:
 options KDB # Enable kernel debugger support.
 options KDB_TRACE # Print a stack trace for a panic.
+options DDB
+options GDB
 
 # Make an SMP-capable kernel by default
 options SMP # Symmetric MultiProcessor Kernel

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-ppc@FreeBSD.ORG Wed Sep 24 09:05:02 2014
Subject: Re: lr=u_trap+0x10 and srr0=k_trap+0x28 for "stopped at 0 illegal instruction 0" before-copyright hang on PowerMac G5's
From: Mark Millard
In-Reply-To: <0703EF26-6E33-4446-9273-BBFD0CB72893@dsl-only.net>
Date: Wed, 24 Sep 2014 02:04:56 -0700
Message-Id: <37575F94-763C-43BF-8DD9-F648F4A7C09F@dsl-only.net>
References: <1118046C-0FF7-49FC-82DA-DB9A7A310991@dsl-only.net> <2ED3DB50-B985-4382-8FF2-3B44E7E65453@dsl-only.net> <6D729F43-662A-429E-9503-0148EC3250B1@dsl-only.net> <72535F89-3942-45A6-B351-7F746209ED9F@dsl-only.net> <0703EF26-6E33-4446-9273-BBFD0CB72893@dsl-only.net>
To: FreeBSD PowerPC ML, Nathan Whitehorn
Cc: Justin Hibbits

Now that I've had a kernel/boot crash with a successful DDB bt and show registers (a different submittal) it makes for a good comparison/contrast with what DDB reports for this "before copyright" crash.

Something unique to the "before copyright" context is...

No registers are reported to have values that point into the range between tmpstk and esym.

In other words: There is no valid stack pointer reported as far as I can tell. r1 has the value 0 instead of holding a valid stack address.

tmpstk=0xbd7000 and esym=0xbdb000 (example for one of my WITH_DEBUG_FILES= and options DDB and GDB builds of 10.1-BETA2). That at least gives a ballpark for the range to expect for pointing into the stack even with some build variation.

It leaves me wondering if the DDB report is for a nested exception handling. That could explain why lr points to u_trap+0x10 and srr0 points to k_trap+0x28 when normally srr0 would point to the failing instruction (or the instruction after) and lr to where that routine would normally return to.
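
The "does anything point into the stack" check amounts to a simple range test. Here is a small sketch of it, assuming the tmpstk and esym values quoted above and treating the range as half-open; the function name is illustrative. The two sample values are ones reported in this thread: r1 = 0 from the before-copyright crash, and 0xbda9f0, the r1 that DDB labeled tmpstk+0x39f0 in the .pvo_vaddr_compare crash report (assuming that build has the same tmpstk, which the reported offset is consistent with).

```python
TMPSTK = 0xBD7000  # start of the early-boot temporary stack (value quoted above)
ESYM = 0xBDB000    # upper bound quoted above


def points_into_tmpstk(value, lo=TMPSTK, hi=ESYM):
    """True if value lies in the half-open range [lo, hi)."""
    return lo <= value < hi


samples = {
    "r1 (before-copyright crash)": 0x0,
    "r1 (.pvo_vaddr_compare crash)": 0xBDA9F0,
}

for name, value in samples.items():
    verdict = "inside" if points_into_tmpstk(value) else "outside"
    print(f"{name}: {value:#x} is {verdict} tmpstk..esym")
```

Running every value from a register dump through a test like this is a quick way to confirm that no candidate stack pointer was overlooked when reading the dump by eye.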

The register values that are reported for my 10.1-BETA2 builds that crash before the copyright notice are:

r0: 0
r1: 0
r2: 0xc81538 vop_unlock_desc
r3: 0xd18868
r4: 0x894b58
r5: 0
r6: 0xc1dee0 M_AUDITBSM
r7: 0xe3f818 ofw_real_mode
r8: 0x1
r9: 0xe0f580 __pcpu
r10: 0x1c35ec0
r11: 0
r12: 0x10000000
r13: 0xdbb290 thread0 (Note: another submittal has this mistyped as 0xdbb290.)
r14-r19: all 0
r20: 0x10c1000
r21: 0x4
r22: 0x180abd4
r23: 0x1803a28
r24: 0xc000000000008760
r25: 0xcc89b8 smp_no...
r26: 0xcea108 ofw_rend...
r27: 0x894b58 ofwcall+0xa8
r28: 0x894b58 ofwcall+0xa8
r29: 2400022
r30: 9000000000001032
r31: 0xbb7d38

srr0: 0x102720 k_trap+0x28
srr1: 9000000000001032
lr: 0x1026f0 u_trap+0x10
ctr: 0xff846d78
cr: 2000deb0
xer: 0
dar: f...d50 (lots of f's)
dsisr: 42000000

===
Mark Millard
markmi at dsl-only.net

On Sep 20, 2014, at 3:42 PM, Mark Millard wrote:

[I corrected the SSR0 in the subject to be SRR0.]

I did miss a register in my list (it matched the shown r30 value). And it turns out to probably be very important to interpreting what the "show registers" is reporting:

SRR1: 0x9000000000001032

But bits 43-46 of SRR1 are supposed to indicate which type of Program Exception, using a single binary 1 to do so. No such 1's are present.

Illegal instruction would have been bit 44 being 1. (PowerPC has the upper bit numbered zero and increases from there.)

So the ddb "show registers" is apparently not reporting the status as of when the "stopped at 0 illegal instruction 0" happened. Thus other things are also likely not from that exact time frame.

And I misinterpreted the LR value status: The LR value was just left over from the restore_kernsrs returning when it finished. Execution then flowed into k_trap. Nothing unusual involved.

===
Mark Millard
markmi@dsl-only.net

On Sep 18, 2014, at 8:57 PM, Mark Millard wrote:

I modified DDB to automatically "show registers" even at the early "before Copyright" crash time.
The end of this note will show the /usr/src/sys/ddb/db_script.c diff for the hack. While I also had DDB run bt, the bt does not actually print a back trace for this context. (It might for others.)

The registers give interesting context despite the lack of a back trace. I do not know if it will be sufficient to be of much immediate help if someone used the information to start looking at the problem.

I'll start with register lr: 0x1026f0 u_trap+0x10.

/usr/src/sys/powerpc/aim/trap_subr64.S has:

s_trap:
	bf	17,k_trap		/* branch if PSL_PR is false */
	GET_CPUINFO(%r1)
u_trap:
	ld	%r1,PC_CURPCB(%r1)
	mr	%r27,%r28		/* Save LR, r29 */
	mtsprg2	%r29
	bl	restore_kernsrs		/* enable kernel mapping */
	mfsprg2	%r29
	mr	%r28,%r27

/*
 * Now the common trap catching code.
 */
k_trap:
	FRAME_SETUP(PC_TEMPSAVE)
/* Call C interrupt dispatcher: */
trapagain:

and so this appears to indicate a pending return to execute the "mfsprg2 %r29" after "bl restore_kernsrs", which indicates that restore_kernsrs should be active.

But register srr0 indicates: 0x102720 k_trap+0x28. (So apparently in FRAME_SETUP(PC_TEMPSAVE) someplace.)

So it appears to me that the processor got to the k_trap code during the supposed restore_kernsrs time frame. (But I'm no expert at these sorts of things or for the processor.)
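The two register readings discussed so far (SRR1 bits 43-46 in the Sep 20 note, and the lr offset in this one) can be cross-checked with a little arithmetic. The C sketch below is illustrative only: the helper names are hypothetical, it assumes the bit numbering described in the thread (bit 0 is the most significant bit of a 64-bit register), and it assumes each line of the u_trap listing assembles to a single 4-byte instruction.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return bit n of a 64-bit register using the numbering from the thread:
 * bit 0 is the most significant bit, bit 63 the least significant.
 */
static unsigned ibm_bit(uint64_t value, unsigned n)
{
	assert(n < 64);
	return (unsigned)((value >> (63 - n)) & 1U);
}

/* Extract bits first..last (inclusive, same numbering) as an integer. */
static uint64_t ibm_field(uint64_t value, unsigned first, unsigned last)
{
	assert(first <= last && last < 64);
	unsigned width = last - first + 1;
	uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);

	return (value >> (63 - last)) & mask;
}

/*
 * PowerPC instructions are 4 bytes long, so the address that bl places
 * in lr is the address of the bl itself plus 4.
 */
static uint64_t return_address_after_bl(uint64_t bl_addr)
{
	return bl_addr + 4;
}
```

With SRR1 = 0x9000000000001032, `ibm_field(srr1, 43, 46)` is 0, which agrees with the remark that none of the Program Exception type bits are set. With lr = 0x1026f0 reported as u_trap+0x10, u_trap sits at 0x1026e0; under the one-instruction-per-line assumption the bl is the fourth instruction (offset 0xc), so `return_address_after_bl(0x1026e0 + 0xc)` gives 0x1026f0, the address of the mfsprg2 that follows it.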
I'll list the other register values:

r0: 0
r1: 0
r2: 0xc1be80 M_AUDITBSM
r3: 0xb16138
r4: 0x8926e8 .ofwcall+0xa8
r5: 0
r6: 0xbb5f90
r7: 0xe3d118 ofw_real_mode
r8: 0x1
r9: 0xe0ce80 __pcpu
r10: 0x1c35ec9
r11: 0
r12: 0x10000000
r13: db890 thread0
r14-r19: all 0
r20: 0x10bc000
r21: 0x4
r22: 0x1801db4
r23: 0x1803a28
r24: 0xc000000000008760
r25: 0xcc6908 smp_no_rendevous_barrier
r26: 0xec79e0 ofw_rendezvous_dispatch (yep one has v and the other zv)
r27: 0x8926e8 .ofwcall+0xa8
r28: 0x8926e8 .ofwcall+0xa8 (yep: same value)
r29: 0x24000022
r30: 0x9000000000001032
r31: 0xc7f488 vop_unlock_desc

ctr: 0xff846d78
cr: 0x2000d7b0
xer: 0
dar: 0xfffffffffffffd50
dsisr: 0x42000000

(Hopefully this manual transcription from the screen display is complete, and also accurate for what it does present.)

The personal HACK to /usr/src/sys/ddb/db_script.c's db_script_kdbenter(...) to have it show registers and try bt...

$ cd /usr/src/sys/ddb/
$ svnlite diff .
Index: db_script.c
===================================================================
--- db_script.c	(revision 271610)
+++ db_script.c	(working copy)
@@ -319,10 +319,25 @@
 {
 	char scriptname[DB_MAXSCRIPTNAME];
 
+	/* HACK!!! : Additional lines to force a basic default script to exist.
+	 * Will dump information even if ddb input is not available for early crash.
+	 * Used to get more information about PowerMac G5 "before Copyright" hangs.
+	 */
+	struct ddb_script *dsp = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT);
+	if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; bt");
+
 	snprintf(scriptname, sizeof(scriptname), "%s.%s",
 	    DB_SCRIPT_KDBENTER_PREFIX, eventname);
 	if (db_script_exec(scriptname, 0) == ENOENT)
 		(void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
+
+	/* HACK!!! : Additional lines to always use the default script,
+	 * even if scriptname existed and was executed.
+	 * Will dump information even if ddb input is not available for early crash.
+	 * Used to get more information about PowerMac G5 "before Copyright" hangs.
+	 */
+	else
+		(void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
 }
 
 /*-

===
Mark Millard
markmi at dsl-only.net

On Sep 16, 2014, at 9:28 PM, Mark Millard wrote:

In part I sent directly to you because of a past exchange (July-27) where you had written:

> Nathan and I both speculate that it's
> dropping into Open Firmware (we make extensive use of OFW), and then
> messing something up, taking a page fault or something.

The specific text that I report and its uniformity when it is produced seem to add a little information beyond a speculated "page fault or something" and so might eventually help a little. As I understand the text, it is reporting execution reaching address zero without any prior unhandled exceptions or other such that would stop it. A corrupted stack (pointer), so a bad return address or some such? I'd guess there are no explicit jumps to address zero, so I expect that indirection is likely involved, with the content for the indirection messed up.

I really wish that I had a logic analyzer configuration for this. I've not found a way to make the failing context visible so far and the extra way of looking at things might have helped.

===
Mark Millard
markmi@dsl-only.net

On Sep 16, 2014, at 8:28 PM, Justin Hibbits wrote:

Hi Mark,

I see this on my G5, and I think it's due to the amount of RAM in the machine. More than 4 GB seems to confuse Open Firmware when called by FreeBSD. There is some effort to remove the need for the callbacks, but thus far it's not far along.
The good news is that after it boots it's solid except when switching vtys, but earlier this year or last year I added a sysctl hack to disable the call into Open Firmware on vty switch (I don't recall the name offhand and am not at my computer right now, but if you grep the sysctl output for reset and ofw you can find it).

-Justin

On Sep 16, 2014 8:01 PM, "Mark Millard" wrote:

I've now spent time with rebooting and power-off/power-on for all 3 PowerMac G5's (one PowerMac7,2 and two PowerMac11,2's) and all 3 get the

> GDB: no debug ports present
> KDB: debugger backends: DDB
> KDB: current backend: DDB
> [ thread pid -1 tid 1006665719 ]
> Stopped at 0: illegal instruction 0
> db>

when they fail just before the Copyright notice would normally be displayed. None fail any earlier. At that spot none have failed any other way. It is the same SSD in all 3. (Happens with other SSD's as well.) Overall there is a mix of Radeon and NVIDIA display boards. Besides the SSD use and RAM upgrades, the rest is stock equipment. syscons used, not vt. (I've yet to try vt.)

Seeing a failure after the Copyright notice has been fairly rare in all my experiments from when I started last April or so. The ones that I've noted had Data Storage Interrupt reported. So far no examples of the above have been reported after the Copyright notice. So I'd guess that they are separate issues. Of course it seems that only in the last few days would I have seen the above sort of thing if it did happen after the Copyright notice: The prior history does not count for judgements about that.

===
Mark Millard
markmi at dsl-only.net

On Sep 16, 2014, at 8:15 AM, Mark Millard wrote:

Using 10.1-BETA1 I added "options DDB" and "options GDB" to powerpc64's GENERIC64. (I also used WITH_DEBUG_FILES=, WITHOUT_CLANG=, and WITH_DEBUG= in /etc/make.conf.) So buildworld, kernel was basically just set up to have more of a debugging context around (including for any ports builds).
The result was new information about the PowerMac G5 boot hangups: The screen is no longer blank when the G5 is hung up without there being a Copyright notice yet. It says...

> GDB: no debug ports present
> KDB: debugger backends: DDB
> KDB: current backend: DDB
> [ thread pid -1 tid 1006665719 ]
> Stopped at 0: illegal instruction 0
> db>

(I had no ability to input at that point.) Normally the Copyright notice would have displayed instead of "[...]" and what follows. (I do not claim to have all the spacing, capitalization, and such correct above.)

That text is constant from hang to hang when it hangs just before it would normally output the Copyright notice: The numbers do not vary, much less the other text. It has never failed until after the two KDB messages are present. So far I've only tested one PowerMac G5, booting over and over for a few hours.

(I do not claim to be set up for remote kernel debugging. I just decided to let GDB go along for the ride when I added DDB.)
===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-ppc@FreeBSD.ORG Wed Sep 24 15:36:59 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 360AC875 for ; Wed, 24 Sep 2014 15:36:59 +0000 (UTC) Received: from d.mail.sonic.net (d.mail.sonic.net [64.142.111.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 130D1E28 for ; Wed, 24 Sep 2014 15:36:58 +0000 (UTC) Received: from comporellon.tachypleus.net (polaris.tachypleus.net [75.101.50.44]) (authenticated bits=0) by d.mail.sonic.net (8.14.9/8.14.9) with ESMTP id s8OFaqIA008807 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Wed, 24 Sep 2014 08:36:52 -0700 Message-ID: <5422E513.6010806@freebsd.org> Date: Wed, 24 Sep 2014 08:36:51 -0700 From: Nathan Whitehorn User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.1.0 MIME-Version: 1.0 To: Mark Millard , FreeBSD PowerPC ML Subject: Re: lr=u_trap+0x10 and srr0=k_trap+0x28 for "stopped at 0 illegal instruction 0" before-copyright hang on PowerMac G5's References: <1118046C-0FF7-49FC-82DA-DB9A7A310991@dsl-only.net> <2ED3DB50-B985-4382-8FF2-3B44E7E65453@dsl-only.net> <6D729F43-662A-429E-9503-0148EC3250B1@dsl-only.net> <72535F89-3942-45A6-B351-7F746209ED9F@dsl-only.net> <0703EF26-6E33-4446-9273-BBFD0CB72893@dsl-only.net> <37575F94-763C-43BF-8DD9-F648F4A7C09F@dsl-only.net> In-Reply-To: <37575F94-763C-43BF-8DD9-F648F4A7C09F@dsl-only.net> X-Sonic-CAuth: UmFuZG9tSVZ6+wDQPg800AsRaoH8VYXXuh3qrQP9EUPt/hOMZ/bti5n2THp4Yx32P1ACI1RL4ebJQhSGNYaRLug5OkCZJd+6EGOSDXw6dkU= X-Sonic-ID: C;rOIemwBE5BGu8gDu5Qupew== M;WGNvmwBE5BGu8gDu5Qupew== X-Spam-Flag: No X-Sonic-Spam-Details: 0.0/5.0 by cerberusd Content-Type: text/plain;
charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: Justin Hibbits X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 24 Sep 2014 15:36:59 -0000

There shouldn't be any exceptions at that point, nested or otherwise. What I suspect is happening is that Open Firmware has turned them on for some bizarre reason, taken one, and ended up in the kernel's handlers but with the Open Firmware environment.

Saving and restoring the OF interrupt vectors would be a possible solution; flattening the device tree in loader so that the kernel doesn't call Open Firmware at all would be another. I think Justin may have tried the first at some point.

-Nathan

On 09/24/14 02:04, Mark Millard wrote:
> Now that I've had a kernel/boot crash with a successful DDB bt and show registers (a different submittal), it makes for a good comparison/contrast with what DDB reports for this "before copyright" crash.
>
> Something unique to the "before copyright" context is...
>
> No registers are reported to have values that point into the range between tmpstk and esym.
>
> In other words: There is no valid stack pointer reported as far as I can tell. r1 has the value 0 instead of holding a valid stack address. tmpstk=0xbd7000 and esym=0xbdb000 (example for one of my WITH_DEBUG_FILES= and options DDB and GDB builds of 10.1-BETA2). That at least gives a ballpark for the range to expect for pointing into the stack, even with some build variation.
>
> It leaves me wondering if the DDB report is for nested exception handling.
> > === > Mark Millard > markmi at dsl-only.net > > > > > > From owner-freebsd-ppc@FreeBSD.ORG Wed Sep 24 16:40:17 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 822E96CD; Wed, 24 Sep 2014 16:40:17 +0000 (UTC) Received: from mail-qa0-x231.google.com (mail-qa0-x231.google.com [IPv6:2607:f8b0:400d:c00::231]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 250BF910; Wed, 24 Sep 2014 16:40:17 +0000 (UTC) Received: by mail-qa0-f49.google.com with SMTP id n8so3365174qaq.22 for ; Wed, 24 Sep 2014 09:40:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:in-reply-to:references :mime-version:content-type:content-transfer-encoding; bh=7WIKIUNVACDBizWPVl5s1blsrmII291WZ3k+kO2um7c=; b=ym76F9Jfc/ooeFF0hqL0W2GCdNz+Boh/Azwz/PKWqNa4X915joZOhA2iXYZcMe6egP rYN8nULxwG2Szbp2ext0I3dqo2UUiRK/WNhH6WHWiwL4OCvbkZA780mXEE3RW+1oHi4S K0GcYE6u6SDvDx4hWDKrWEy/kO5K4yxN0EjII8TU8kOdOj0qBJsn2Oepa/XEY5H6cR4X 31FU8QM4uUi8ceg6dZ9K49qb/zglYGwO8zMoKPPMSl6pEBkVTdV/dIt22xncn8M0hRbN HPqiYKzp1KPqjTG0qS6iR2KAWk10vYpIHo79jPcfj5OJjn6140mgomNrBdDBJj5zGlye SJjw== X-Received: by 10.224.161.11 with SMTP id p11mr10978478qax.40.1411576816270; Wed, 24 Sep 2014 09:40:16 -0700 (PDT) Received: from zhabar.attlocal.net (107-222-186-3.lightspeed.sntcca.sbcglobal.net. 
[107.222.186.3]) by mx.google.com with ESMTPSA id k10sm13244989qaj.7.2014.09.24.09.40.14 for (version=SSLv3 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 24 Sep 2014 09:40:15 -0700 (PDT) Date: Wed, 24 Sep 2014 09:40:10 -0700 From: Justin Hibbits To: Nathan Whitehorn Subject: Re: lr=u_trap+0x10 and srr0=k_trap+0x28 for "stopped at 0 illegal instruction 0" before-copyright hang on PowerMac G5's Message-ID: <20140924094010.68225ddf@zhabar.attlocal.net> In-Reply-To: <5422E513.6010806@freebsd.org> References: <1118046C-0FF7-49FC-82DA-DB9A7A310991@dsl-only.net> <2ED3DB50-B985-4382-8FF2-3B44E7E65453@dsl-only.net> <6D729F43-662A-429E-9503-0148EC3250B1@dsl-only.net> <72535F89-3942-45A6-B351-7F746209ED9F@dsl-only.net> <0703EF26-6E33-4446-9273-BBFD0CB72893@dsl-only.net> <37575F94-763C-43BF-8DD9-F648F4A7C09F@dsl-only.net> <5422E513.6010806@freebsd.org> X-Mailer: Claws Mail 3.10.1 (GTK+ 2.24.22; powerpc64-portbld-freebsd11.0) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: FreeBSD PowerPC ML , Mark Millard X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 24 Sep 2014 16:40:17 -0000 I had started testing callbacks, but didn't get any farther than panicking the system with a bad callback handler :) Needless to say, I didn't get too far. Haven't tried restoring the vectors. - Justin On Wed, 24 Sep 2014 08:36:51 -0700 Nathan Whitehorn wrote: > There shouldn't be any exceptions at that point, nested or otherwise. > What I suspect is happening is that Open Firmware has turned them on > for some bizarre reason, taken one, and ended up in the kernel's > handlers but with the Open Firmware environment. 
Saving and restoring > the OF interrupt vectors would be a possible solution; flattening the > device tree in loader so that the kernel doesn't call Open Firmware > at all would be another. I think Justin may have tried the first at > some point. -Nathan > > On 09/24/14 02:04, Mark Millard wrote: > > Now that I've had a kernel/boot crash with a successful DDB bt and > > show registers (a different submittal) it makes for a good > > comparison/contrast with what DDB reports for this "before > > copyright" crash. > > > > Something unique to the "before copyright" context is... > > > > No registers are reported to have values that point into the range > > between tmpstk and esym. > > > > In other words: There is no valid stack pointer reported as far as > > I can tell. r1 has the value 0 instead of being a handling a valid > > stack address. tmpstk=0xbd7000 and esym=0xbdb000 (example for one > > of my WITH_DEBUG_FILES= and options DDB and GDB builds of > > 10.1-BETA2). That at least gives a ball park on the range to expect > > for pointing into the stack even with some build variation. > > > > It leaves me wondering if the DDB report is for a nested exception > > handling. That could explain why lr points to u_trap+0x10 and srr0 > > points to k_trap+0x28 when normally srr0 would point to the the > > failing instruction (or the instruction after) and lr to where that > > routine would normally return to. > > > > The register values that are reported for my 10.1-BETA2 builds that > > crash before the copyright notice are: > > > > r0: 0 > > r1: 0 > > r2: 0xc81538 vop_unlock_desc > > r3: 0xd18868 > > r4: 0x894b58 > > r5: 0 > > r6: 0xc1dee0 M_AUDITBSM > > r7: 0xe3f818 ofw_real_mode > > r8: 0x1 > > r9: 0xe0f580 __pcpu > > r10: 0x1c35ec0 > > r11: 0 > > r12: 0x10000000 > > r13: 0xdbb290 thread0 (Note: another submittal has this mistyped as > > 0xdbb290.) 
> > r14-r19: all 0 > > r20: 0x10c1000 > > r21: 0x4 > > r22: 0x180abd4 > > r23: 0x1803a28 > > r24: 0xc000000000008760 > > r25: 0xcc89b8 smp_no... > > r26: 0xcea108 ofw_rend... > > r27: 0x894b58 ofwcall+0xa8 > > r28: 0x894b58 ofwcall+0xa8 > > r29: 2400022 > > r30: 9000000000001032 > > r31: 0xbb7d38 > > > > srr0: 0x102720 k_trap+0x28 > > srr1: 9000000000001032 > > lr: 0x1026f0 u_trap+0x10 > > ctr: 0xff846d78 > > cr: 2000deb0 > > xer: 0 > > dar: f...d50 (lots of f's) > > dsisr: 42000000 > > > > > > > > > > > > > > === > > Mark Millard > > markmi at dsl-only.net > > > > On Sep 20, 2014, at 3:42 PM, Mark Millard > > wrote: > > > > [I corrected the SSR0 in the subject to be SRR0.] > > > > I did miss a register in my list (it matched the shown r30 value). > > And it turns out to probably be very important to interpreting what > > the "show registers" is reporting: > > > > SRR1: 0x9000000000001032 > > > > But bits 43-46 of SRR1 are supposed to indicate which type of > > Program Exception, using a single binary 1 to so. No such 1's are > > present. > > > > Illegal instruction would have been bit 44 being 1. (PowerPC has > > the upper bit numbered zero and increases from there.) > > > > So the ddb "show registers" is apparently not reporting the status > > as of when the "stopped at 0 illegal instruction 0" happened. Thus > > other things are also likely not from that exact time frame. > > > > > > > > And I misinterpreted the LR value status: The LR value was just > > left over from the restore_kernsrs returning when it finished. > > Execution then flowed into k_trap. Nothing unusual involved. > > > > > > > > > > > > === > > Mark Millard > > markmi@dsl-only.net > > > > On Sep 18, 2014, at 8:57 PM, Mark Millard > > wrote: > > > > I modified DDB to automatically "show registers" even at the early > > "before Copyright" crash time. The end of this note will show the > > /usr/src/sys/ddb/db_script.c diff for the hack. 
While I also had > > DDB bt, the bt does not actually print a back trace for this > > context. (It might for others.) > > > > The registers give interesting context despite the lack of a back > > trace. I do not know if it will be sufficient to be of much > > immediate help if someone used the information to start looking at > > the problem. > > > > I'll start with register lr: 0x1026f0 u_trap+0x10. > > > > /usr/src/sys/powerpc/aim/trap_subr64.S has: > > > > s_trap: > > bf 17,k_trap /* branch if PSL_PR is > > false */ GET_CPUINFO(%r1) > > u_trap: > > ld %r1,PC_CURPCB(%r1) > > mr %r27,%r28 /* Save LR, r29 */ > > mtsprg2 %r29 > > bl restore_kernsrs /* enable kernel mapping */ > > mfsprg2 %r29 > > mr %r28,%r27 > > > > /* > > * Now the common trap catching code. > > */ > > k_trap: > > FRAME_SETUP(PC_TEMPSAVE) > > /* Call C interrupt dispatcher: */ > > trapagain: > > > > and so this appears to indicate a pending return to execute the > > "mfsprg2 %r29" after "bl restore_kernsrs", which indicates that > > restore_kernsrs should be active. > > > > But register srr0 indicates: 0x102720 k_trap+0x28. (So apparently > > in FRAME_SETUP(PC_TEMPSAVE) someplace.) > > > > So it appears to me that the processor got to the k_trap code > > during the supposed restore_kernsrs time frame. (But I'm no expert > > at these sorts of things or for the processor.) 
> >
> > I'll list the other register values:
> >
> > r0: 0
> > r1: 0
> > r2: 0xc1be80 M_AUDITBSM
> > r3: 0xb16138
> > r4: 0x8926e8 .ofwcall+0xa8
> > r5: 0
> > r6: 0xbb5f90
> > r7: 0xe3d118 ofw_real_mode
> > r8: 0x1
> > r9: 0xe0ce80 __pcpu
> > r10: 0x1c35ec9
> > r11: 0
> > r12: 0x10000000
> > r13: db890 thread0
> > r14-r19: all 0
> > r20: 0x10bc000
> > r21: 0x4
> > r22: 0x1801db4
> > r23: 0x1803a28
> > r24: 0xc000000000008760
> > r25: 0xcc6908 smp_no_rendevous_barrier
> > r26: 0xec79e0 ofw_rendezvous_dispatch (yep one has v and the other zv)
> > r27: 0x8926e8 .ofwcall+0xa8
> > r28: 0x8926e8 .ofwcall+0xa8 (yep: same value)
> > r29: 0x24000022
> > r30: 0x9000000000001032
> > r31: 0xc7f488 vop_unlock_desc
> >
> > ctr: 0xff846d78
> > cr: 0x2000d7b0
> > xer: 0
> > dar: 0xfffffffffffffd50
> > dsisr: 0x42000000
> >
> > (Hopefully this manual transcription from the screen display is complete --and also accurate for what it does present.)
> >
> >
> > The personal HACK to /usr/src/sys/ddb/db_script.c's db_script_kdbenter(...) to have it show registers and try bt...
> >
> > $ cd /usr/src/sys/ddb/
> > $ svnlite diff .
> > Index: db_script.c
> > ===================================================================
> > --- db_script.c (revision 271610)
> > +++ db_script.c (working copy)
> > @@ -319,10 +319,25 @@
> >  {
> >         char scriptname[DB_MAXSCRIPTNAME];
> >
> > +       /* HACK!!! : Additional lines to force a basic default script to exist.
> > +        * Will dump information even if ddb input is not available for early crash.
> > +        * Used to get more information about PowerMac G5 "before Copyright" hangs.
> > +        */
> > +       struct ddb_script *dsp = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT);
> > +       if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; bt");
> > +
> >         snprintf(scriptname, sizeof(scriptname), "%s.%s",
> >             DB_SCRIPT_KDBENTER_PREFIX, eventname);
> >         if (db_script_exec(scriptname, 0) == ENOENT)
> >                 (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
> > +
> > +       /* HACK!!! : Additional lines to always use the default script,
> > +        * even if scriptname existed and was executed.
> > +        * Will dump information even if ddb input is not available for early crash.
> > +        * Used to get more information about PowerMac G5 "before Copyright" hangs.
> > +        */
> > +       else
> > +               (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
> >  }
> >
> >  /*-
> >
> >
> > ===
> > Mark Millard
> > markmi at dsl-only.net
> >
> > On Sep 16, 2014, at 9:28 PM, Mark Millard wrote:
> >
> > In part I sent directly to you because of a past exchange (July-27) where you had written:
> >
> >> Nathan and I both speculate that it's
> >> dropping into Open Firmware (we make extensive use of OFW), and
> >> then messing something up, taking a page fault or something.
> >
> > The specific text that I report and its uniformity when it is produced seems to add a little information beyond a speculated "page fault or something" and so might eventually help a little. As I understand the text it is reporting execution reaching address zero without any prior un-handled exceptions or other such that would stop it. A corrupted stack (pointer) so a bad return address or some such? I'd guess there are no explicit jumps to address zero so I expect that indirection is likely involved, with the content for the indirection messed up.
> >
> > I really wish that I had a logic analyzer configuration for this. I've not found a way to make the failing context visible so far and the extra way of looking at things might have helped.
> >
> > ===
> > Mark Millard
> > markmi@dsl-only.net
> >
> > On Sep 16, 2014, at 8:28 PM, Justin Hibbits wrote:
> >
> > Hi Mark,
> >
> > I see this on my G5, and I think it's due to the amount of RAM in the machine. More than 4 GB seems to confuse Open Firmware when called by FreeBSD. There is some effort to remove the need for the callbacks but thus far it's not far along. The good news is that after it boots it's solid except when switching vtys, but earlier this year or last year I added a sysctl hack to disable the call into Open Firmware on vty switch (don't recall offhand and not at my computer right now, but if you grep the sysctl output for reset and ofw you can find it).
> >
> > -Justin
> >
> > On Sep 16, 2014 8:01 PM, "Mark Millard" wrote:
> >
> > I've now spent time with rebooting and power-off/power-on for all 3 PowerMac G5's (one PowerMac7,2 and two PowerMac11,2's) and all 3 get the
> >
> >> GDB: no debug ports present
> >> KDB: debugger backends: DDB
> >> KDB: current backend: DDB
> >> [ thread pid -1 tid 1006665719 ]
> >> Stopped at 0: illegal instruction 0
> >> db>
> >
> > when they fail just before the Copyright notice would normally be displayed. None fail any earlier. At that spot none have failed any other way. It is the same SSD in all 3. (Happens with other SSD's as well.) Overall there is a mix of Radeon and NVIDIA display boards. Besides the SSD use and RAM upgrades the rest is stock equipment. syscons used, not vt. (I've yet to try vt.)
> >
> > Seeing a failure after the Copyright notice has been fairly rare in all my experiments from when I started last April or so. The ones that I've noted had Data Storage Interrupt reported. So far no examples of the above have been reported after the Copyright notice. So I'd guess that they are separate issues. Of course it seems that only in the last few days would I have seen the above sort of thing if it did happen after the Copyright notice: The prior history does not count for judgements about that.
> >
> > ===
> > Mark Millard
> > markmi at dsl-only.net
> >
> > On Sep 16, 2014, at 8:15 AM, Mark Millard wrote:
> >
> > Using 10.1-BETA1 I added "options DDB" and "options GDB" to powerpc64's GENERIC64. (I also used WITH_DEBUG_FILES=, WITHOUT_CLANG=, and WITH_DEBUG= in /etc/make.conf.) So buildworld, kernel was basically just set up to have more of a debugging context around (including for any ports builds).
> >
> > The result was new information about the PowerMac G5 boot hangups: The screen is no longer blank when the G5 is hung up without there being a Copyright notice yet. It says...
> >
> >> GDB: no debug ports present
> >> KDB: debugger backends: DDB
> >> KDB: current backend: DDB
> >> [ thread pid -1 tid 1006665719 ]
> >> Stopped at 0: illegal instruction 0
> >> db>
> >
> > (I had no ability to input at that point.) Normally the Copyright notice would have displayed instead of "[...]" and what follows. (I do not claim to have all the spacing, capitalization, and such correct above.)
> >
> > That text is constant from hang to hang when it hangs just before it would normally output the Copyright notice: The numbers do not vary, much less the other text. It has never failed until after the two KDB messages are present. So far I've only tested one PowerMac G5, booting over and over for a few hours.
> >
> > (I do not claim to be set up for remote kernel debugging. I just decided to let GDB go along for the ride when I added DDB.)
> >
> > ===
> > Mark Millard
> > markmi at dsl-only.net

From owner-freebsd-ppc@FreeBSD.ORG Thu Sep 25 09:13:03 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org 
[IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id EE2AB213 for ; Thu, 25 Sep 2014 09:13:03 +0000 (UTC) Received: from asp.reflexion.net (outbound-242.asp.reflexion.net [69.84.129.242]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2C18ECB4 for ; Thu, 25 Sep 2014 09:13:02 +0000 (UTC) Received: (qmail 801 invoked from network); 25 Sep 2014 09:12:55 -0000 Received: from unknown (HELO mail-cs-04.app.dca.reflexion.local) (10.81.19.4) by 0 (rfx-qmail) with SMTP; 25 Sep 2014 09:12:55 -0000 Received: by mail-cs-04.app.dca.reflexion.local (Reflexion email security v7.30.7) with SMTP; Thu, 25 Sep 2014 05:12:55 -0400 (EDT) Received: (qmail 17020 invoked from network); 25 Sep 2014 09:12:53 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (DHE-RSA-AES256-SHA encrypted) SMTP; 25 Sep 2014 09:12:53 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-98-246-178-138.hsd1.or.comcast.net [98.246.178.138]) by iron2.pdx.net (Postfix) with ESMTPSA id D522D1C402C; Thu, 25 Sep 2014 02:12:51 -0700 (PDT) Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: lr=u_trap+0x10 and srr0=k_trap+0x28 for "stopped at 0 illegal instruction 0" before-copyright hang on PowerMac G5's From: Mark Millard In-Reply-To: <5422E513.6010806@freebsd.org> Date: Thu, 25 Sep 2014 02:12:48 -0700 Message-Id: <1C02D0D4-14B8-465F-B493-4D3A64E4C35C@dsl-only.net> References: <1118046C-0FF7-49FC-82DA-DB9A7A310991@dsl-only.net> <2ED3DB50-B985-4382-8FF2-3B44E7E65453@dsl-only.net> <6D729F43-662A-429E-9503-0148EC3250B1@dsl-only.net> <72535F89-3942-45A6-B351-7F746209ED9F@dsl-only.net> <0703EF26-6E33-4446-9273-BBFD0CB72893@dsl-only.net> <37575F94-763C-43BF-8DD9-F648F4A7C09F@dsl-only.net> 
<5422E513.6010806@freebsd.org> To: Nathan Whitehorn X-Mailer: Apple Mail (2.1878.6) Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: Justin Hibbits , FreeBSD PowerPC ML X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Sep 2014 09:13:04 -0000

The register dump that has no kernel stack addresses in any registers does have register contents suggesting an ofwcall use, matching up reasonably with the code I looked at that is related to ofwcall. ofwcall is only reached via openfirmware_core from what I can tell. (If there are other paths into openfirmware than via ofwcall then the register dump suggests that they are not in use around the crash.)

And openfirmware_core has logic for exception vector swapping, going both directions:

> static int
> openfirmware_core(void *args)
> {
>         int             result;
>         register_t      oldmsr;
>
>         /*
>          * Turn off exceptions - we really don't want to end up
>          * anywhere unexpected with PCPU set to something strange
>          * or the stack pointer wrong.
>          */
>         oldmsr = intr_disable();
>
>         ofw_sprg_prepare();
>
>         /* Save trap vectors */
>         ofw_save_trap_vec(save_trap_of);
>
>         /* Restore initially saved trap vectors */
>         ofw_restore_trap_vec(save_trap_init);
>
> #if defined(AIM) && !defined(__powerpc64__)
>         /*
>          * Clear battable[] translations
>          */
>         if (!(cpu_features & PPC_FEATURE_64))
>                 __asm __volatile("mtdbatu 2, %0\n"
>                                  "mtdbatu 3, %0" : : "r" (0));
>         isync();
> #endif
>
>         result = ofwcall(args);
>
>         /* Restore trap vecotrs */
>         ofw_restore_trap_vec(save_trap_of);
>
>         ofw_sprg_restore();
>
>         intr_restore(oldmsr);
>
>         return (result);
> }

In turn openfirmware_core is used only by ofw_rendezvous_dispatch and in turn that is used only by openfirmware: only PCPU_GET(cpuid) == 0 does the above. save_trap_init is initialized by powerpc_init using ofw_save_trap_vec.

[Note that ofw_restore_trap_vec uses __syncicache which does not use dcbf after the bcopy but instead uses dcbst: That is part of what led me to investigate the distinction --and so to my more overall dcbst vs. dcbf use questions after proving dcbf would not be sufficient for a fix to the specific boot issue.]

Unless the initialization of save_trap_init ends up with the wrong contents for openfirmware it would appear that the exception vectors are tracked correctly by the above code. But the above does assume that the openfirmware vectors are unchanged after save_trap_init is initialized: there is no attempt at tracking of any potential updates to the openfirmware exception vectors.

I would infer then that the exception that DDB reports happened after ofw_restore_trap_vec(save_trap_of) was executed: That is when FreeBSD's exception vectors are again in place. But a stack pointer into the kernel stack is not then in place in any register (based on DDB's register dump): stack handling is messed up already by the point of the reported exception. And that may actually be why an illegal instruction at address zero was reached: an incorrect stack context used to get an address to execute at.

===
Mark Millard
markmi at dsl-only.net

On Sep 24, 2014, at 8:36 AM, Nathan Whitehorn wrote:

There shouldn't be any exceptions at that point, nested or otherwise. What I suspect is happening is that Open Firmware has turned them on for some bizarre reason, taken one, and ended up in the kernel's handlers but with the Open Firmware environment. Saving and restoring the OF interrupt vectors would be a possible solution; flattening the device tree in loader so that the kernel doesn't call Open Firmware at all would be another. I think Justin may have tried the first at some point.
-Nathan

On 09/24/14 02:04, Mark Millard wrote:
> Now that I've had a kernel/boot crash with a successful DDB bt and show registers (a different submittal) it makes for a good comparison/contrast with what DDB reports for this "before copyright" crash.
>
> Something unique to the "before copyright" context is...
>
> No registers are reported to have values that point into the range between tmpstk and esym.
>
> In other words: There is no valid stack pointer reported as far as I can tell. r1 has the value 0 instead of holding a valid stack address. tmpstk=0xbd7000 and esym=0xbdb000 (example for one of my WITH_DEBUG_FILES= and options DDB and GDB builds of 10.1-BETA2). That at least gives a ball park on the range to expect for pointing into the stack even with some build variation.
>
> It leaves me wondering if the DDB report is for a nested exception handling. That could explain why lr points to u_trap+0x10 and srr0 points to k_trap+0x28 when normally srr0 would point to the failing instruction (or the instruction after) and lr to where that routine would normally return to.
>
> The register values that are reported for my 10.1-BETA2 builds that crash before the copyright notice are:
>
> r0: 0
> r1: 0
> r2: 0xc81538 vop_unlock_desc
> r3: 0xd18868
> r4: 0x894b58
> r5: 0
> r6: 0xc1dee0 M_AUDITBSM
> r7: 0xe3f818 ofw_real_mode
> r8: 0x1
> r9: 0xe0f580 __pcpu
> r10: 0x1c35ec0
> r11: 0
> r12: 0x10000000
> r13: 0xdbb290 thread0 (Note: another submittal has this mistyped as 0xdbb290.)
> r14-r19: all 0
> r20: 0x10c1000
> r21: 0x4
> r22: 0x180abd4
> r23: 0x1803a28
> r24: 0xc000000000008760
> r25: 0xcc89b8 smp_no...
> r26: 0xcea108 ofw_rend...
> r27: 0x894b58 ofwcall+0xa8
> r28: 0x894b58 ofwcall+0xa8
> r29: 2400022
> r30: 9000000000001032
> r31: 0xbb7d38
>
> srr0: 0x102720 k_trap+0x28
> srr1: 9000000000001032
> lr: 0x1026f0 u_trap+0x10
> ctr: 0xff846d78
> cr: 2000deb0
> xer: 0
> dar: f...d50 (lots of f's)
> dsisr: 42000000
>
>
> ===
> Mark Millard
> markmi at dsl-only.net
>
> On Sep 20, 2014, at 3:42 PM, Mark Millard wrote:
>
> [I corrected the SSR0 in the subject to be SRR0.]
>
> I did miss a register in my list (it matched the shown r30 value). And it turns out to probably be very important to interpreting what the "show registers" is reporting:
>
> SRR1: 0x9000000000001032
>
> But bits 43-46 of SRR1 are supposed to indicate which type of Program Exception, using a single binary 1 to do so. No such 1's are present.
>
> Illegal instruction would have been bit 44 being 1. (PowerPC has the upper bit numbered zero and increases from there.)
>
> So the ddb "show registers" is apparently not reporting the status as of when the "stopped at 0 illegal instruction 0" happened. Thus other things are also likely not from that exact time frame.
>
>
> And I misinterpreted the LR value status: The LR value was just left over from the restore_kernsrs returning when it finished. Execution then flowed into k_trap. Nothing unusual involved.
>
> ===
> Mark Millard
> markmi at dsl-only.net

From owner-freebsd-ppc@FreeBSD.ORG Thu Sep 25 10:46:25 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3401B7E3 for ; Thu, 25 Sep 2014 10:46:25 +0000 (UTC) Received: from asp.reflexion.net (outbound-242.asp.reflexion.net [69.84.129.242]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 646258D9 for ; Thu, 25 Sep 2014 10:46:24 +0000 (UTC) Received: (qmail 23389 invoked from network); 25 Sep 2014 10:46:23 -0000 Received: from unknown (HELO mail-cs-02.app.dca.reflexion.local) (10.81.19.2) by 0 (rfx-qmail) with SMTP; 25 Sep 2014 10:46:23 -0000 Received: by mail-cs-02.app.dca.reflexion.local (Reflexion email security v7.30.7) with SMTP; Thu, 25 Sep 2014 06:46:23 -0400 (EDT) Received: (qmail 3415 invoked from network); 25 Sep 2014 10:46:20 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (DHE-RSA-AES256-SHA encrypted) SMTP; 25 Sep 2014 10:46:20 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-98-246-178-138.hsd1.or.comcast.net [98.246.178.138]) by iron2.pdx.net (Postfix) with ESMTPSA id EE6351C402C; Thu, 25 Sep 2014 03:46:17 -0700 (PDT) Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: lr=u_trap+0x10 and srr0=k_trap+0x28 for "stopped at 0 illegal instruction 0" before-copyright hang on PowerMac G5's From: Mark Millard In-Reply-To: <1C02D0D4-14B8-465F-B493-4D3A64E4C35C@dsl-only.net> Date: Thu, 25 Sep 2014 03:46:18 -0700 Message-Id: References: <1118046C-0FF7-49FC-82DA-DB9A7A310991@dsl-only.net> <2ED3DB50-B985-4382-8FF2-3B44E7E65453@dsl-only.net> 
<6D729F43-662A-429E-9503-0148EC3250B1@dsl-only.net> <72535F89-3942-45A6-B351-7F746209ED9F@dsl-only.net> <0703EF26-6E33-4446-9273-BBFD0CB72893@dsl-only.net> <37575F94-763C-43BF-8DD9-F648F4A7C09F@dsl-only.net> <5422E513.6010806@freebsd.org> <1C02D0D4-14B8-465F-B493-4D3A64E4C35C@dsl-only.net> To: Nathan Whitehorn X-Mailer: Apple Mail (2.1878.6) Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: Justin Hibbits , FreeBSD PowerPC ML X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Sep 2014 10:46:25 -0000

One source code oddity that I notice is the following mixed use of ofw_real_mode: always tested vs. never tested (#if 0 ... #endif) ...

> /*
>  * Saved SPRG0-3 from OpenFirmware. Will be restored prior to the callback.
>  */
> register_t      ofw_sprg0_save;
>
> static __inline void
> ofw_sprg_prepare(void)
> {
>         if (ofw_real_mode)
>                 return;
>
>         /*
>          * Assume that interrupt are disabled at this point, or
>          * SPRG1-3 could be trashed
>          */
>         __asm __volatile("mfsprg0 %0\n\t"
>                          "mtsprg0 %1\n\t"
>                          "mtsprg1 %2\n\t"
>                          "mtsprg2 %3\n\t"
>                          "mtsprg3 %4\n\t"
>                          : "=&r"(ofw_sprg0_save)
>                          : "r"(ofmsr[1]),
>                          "r"(ofmsr[2]),
>                          "r"(ofmsr[3]),
>                          "r"(ofmsr[4]));
> }
>
> static __inline void
> ofw_sprg_restore(void)
> {
> #if 0
>         if (ofw_real_mode)
>                 return;
> #endif
>
>         /*
>          * Note that SPRG1-3 contents are irrelevant. They are scratch
>          * registers used in the early portion of trap handling when
>          * interrupts are disabled.
>          *
>          * PCPU data cannot be used until this routine is called !
>          */
>         __asm __volatile("mtsprg0 %0" :: "r"(ofw_sprg0_save));
> }

It would seem that for ofw_real_mode != 0 that ofw_sprg_prepare would never set up ofw_sprg0_save (via mfsprg0) for the later ofw_sprg_restore's always-executed mtsprg0 that is based on ofw_sprg0_save.

register_t seems to trace back to __int64_t --and that would leave ofw_sprg0_save initialized to zero as a global and that would have to be okay as the SPRG0 value to restore in such a case. (I have not tracked down what any of the per-processor values for SPRG0 are/should-be.)

===
Mark Millard
markmi at dsl-only.net
>=20 >=20 >=20 >=20 >=20 > =3D=3D=3D > Mark Millard > markmi@dsl-only.net >=20 > On Sep 18, 2014, at 8:57 PM, Mark Millard wrote: >=20 > I modified DDB to automatically "show registers" even at the early = "before Copyright" crash time. The end of this note will show the = /usr/src/sys/ddb/db_script.c diff for the hack. While I also had DDB bt, = the bt does not actually print a back trace for this context. (It might = for others.) >=20 > The registers give interesting context despite the lack of a back = trace. I do not know if it will be sufficient to be of much immediate = help if someone used the information to start looking at the problem. >=20 > I'll start with register lr: 0x1026f0 u_trap+0x10. >=20 > /usr/src/sys/powerpc/aim/trap_subr64.S has: >=20 > s_trap: > bf 17,k_trap /* branch if PSL_PR is false = */ > GET_CPUINFO(%r1) > u_trap: > ld %r1,PC_CURPCB(%r1) > mr %r27,%r28 /* Save LR, r29 */ > mtsprg2 %r29 > bl restore_kernsrs /* enable kernel mapping */ > mfsprg2 %r29 > mr %r28,%r27 >=20 > /* > * Now the common trap catching code. > */ > k_trap: > FRAME_SETUP(PC_TEMPSAVE) > /* Call C interrupt dispatcher: */ > trapagain: >=20 > and so this appears to indicate a pending return to execute the = "mfsprg2 %r29" after "bl restore_kernsrs", which indicates that = restore_kernsrs should be active. >=20 > But register srr0 indicates: 0x102720 k_trap+0x28. (So apparently in = FRAME_SETUP(PC_TEMPSAVE) someplace.) >=20 > So it appears to me that the processor got to the k_trap code during = the supposed restore_kernsrs time frame. (But I'm no expert at these = sorts of things or for the processor.) 
>=20 > I'll list the other register values: >=20 > r0: 0 > r1: 0 > r2: 0xc1be80 M_AUDITBSM > r3: 0xb16138 > r4: 0x8926e8 .ofwcall+0xa8 > r5: 0 > r6: 0xbb5f90 > r7: 0xe3d118 ofw_real_mode > r8: 0x1 > r9: 0xe0ce80 __pcpu > r10: 0x1c35ec9 > r11: 0 > r12: 0x10000000 > r13: db890 thread0 > r14-r19: all 0 > r20: 0x10bc000 > r21: 0x4 > r22: 0x1801db4 > r23: 0x1803a28 > r24: 0xc000000000008760 > r25: 0xcc6908 smp_no_rendevous_barrier > r26: 0xec79e0 ofw_rendezvous_dispatch (yep one has v and the other zv) > r27: 0x8926e8 .ofwcall+0xa8 > r28: 0x8926e8 .ofwcall+0xa8 (yep: same value) > r29: 0x24000022 > r30: 0x9000000000001032 > r31: 0xc7f488 vop_unlock_desc >=20 > ctr: 0xff846d78 > cr: 0x2000d7b0 > xer: 0 > dar: 0xfffffffffffffd50 > dsisr: 0x42000000 >=20 > (Hopefully this manual transcription from the screen display is = complete --and also accurate for what it does present.) >=20 >=20 >=20 >=20 > The personal HACK to /usr/src/sys/ddb/db_script.c's = db_script_kdbenter(...) to have it show registers and try bt... >=20 > $ cd /usr/src/sys/ddb/ > $ svnlite diff . > Index: db_script.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- db_script.c (revision 271610) > +++ db_script.c (working copy) > @@ -319,10 +319,25 @@ > { > char scriptname[DB_MAXSCRIPTNAME]; > =20 > + /* HACK!!! : Additional lines to force a basic default script to = exist. > + * Will dump information even if ddb input is not available for = early crash. > + * Used to get more information about PowerMac G5 "before Copyright" = hangs. 
> + */ > + struct ddb_script *dsp =3D = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT); > + if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; = bt"); > + > snprintf(scriptname, sizeof(scriptname), "%s.%s", > DB_SCRIPT_KDBENTER_PREFIX, eventname); > if (db_script_exec(scriptname, 0) =3D=3D ENOENT) > (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); > + > + /* HACK!!! : Additional lines to always use the default script, > + * even if scriptname existed and was executed. > + * Will dump information even if ddb input is not available for = early crash. > + * Used to get more information about PowerMac G5 "before Copyright" = hangs. > + */ > + else > + (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); > } > =20 > /*- >=20 >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 > On Sep 16, 2014, at 9:28 PM, Mark Millard = wrote: >=20 > In part I sent directly to you because of a past exchange (July-27) = where you had written: >=20 >> Nathan and I both speculate that it's >> dropping into Open Firmware (we make extensive use of OFW), and then >> messing something up, taking a page fault or something. >=20 > The specific text that I report and its uniformity when it is produced = seems to add a little information beyond a speculated "page fault or = something" and so might eventually help a little. As I understand the = text it is reporting execution reaching address zero without any prior = un-handled exceptions or other such that would stop it. A corrupted = stack (pointer) so a bad return address or some such? I'd guess there = are no explicit jumps to address zero so I expect that indirection is = likely involved, with the content for the indirection messed up. >=20 > I really wish that I had a logic analyzer configuration for this. I've = not found a way to make the failing context visible so far and the extra = way of looking at things might have helped. 
>
> ===
> Mark Millard
> markmi@dsl-only.net
>
> On Sep 16, 2014, at 8:28 PM, Justin Hibbits wrote:
>
> Hi Mark,
>
> I see this on my G5, and I think it's due to the amount of RAM in the machine. More than 4 GB seems to confuse Open Firmware when called by FreeBSD. There is some effort to remove the need for the callbacks but thus far it's not far along. The good news is that after it boots it's solid except when switching vtys, but earlier this year or last year I added a sysctl hack to disable the call into Open Firmware on vty switch (don't recall offhand and not at my computer right now, but if you grep the sysctl output for reset and ofw you can find it).
>
> -Justin
>
> On Sep 16, 2014 8:01 PM, "Mark Millard" wrote:
> I've now spent time with rebooting and power-off/power-on for all 3 PowerMac G5's (one PowerMac7,2 and two PowerMac11,2's) and all 3 get the
>
>> GDB: no debug ports present
>> KDB: debugger backends: DDB
>> KDB: current backend: DDB
>> [ thread pid -1 tid 1006665719 ]
>> Stopped at 0: illegal instruction 0
>> db>
>
> when they fail just before the Copyright notice would normally be displayed. None fail any earlier. At that spot none have failed any other way. It is the same SSD in all 3. (Happens with other SSD's as well.) Overall there is a mix of Radeon and NVIDIA display boards. Besides the SSD use and RAM upgrades the rest is stock equipment. syscons used, not vt. (I've yet to try vt.)
>
> Seeing a failure after the Copyright notice has been fairly rare in all my experiments from when I started last April or so. The ones that I've noted had Data Storage Interrupt reported. So far no examples of the above have been reported after the Copyright notice. So I'd guess that they are separate issues.
Of course it seems that only in the last few = days would I have seen the above sort of thing if it did happen after = the Copyright notice: The prior history does not count for judgements = about that. >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 > On Sep 16, 2014, at 8:15 AM, Mark Millard wrote: >=20 > Using 10.1-BETA1 I added "options DDB" and "options GDB" to = powerpc64's GENERIC64. (I also used WITH_DEBUG_FILES=3D, WITHOUT_CLANG=3D,= and WITH_DEBUG=3D in /etc/make.conf.) So buildworld, kernel was = basically just set up to have more of a debugging context around = (including for any ports builds). >=20 > The result was new information about the PowerMac G5 boot hangups: The = screen is no longer blank when the G5 is hung up without there being a = Copyright notice yet. It says... >=20 >> GDB: no debug ports present >> KDB: debugger backends: DDB >> KDB: current backend: DDB >> [ thread pid -1 tid 1006665719 ] >> Stopped at 0: illegal instruction 0 >> db> >=20 > (I had no ability to input at that point.) Normally the Copyright = notice would have displayed instead of "[...]" and what follows. (I do = not claim to have all the spacing, capitalization, and such correct = above.) >=20 > That text is constant from hang to hang when it hangs just before it = would normally output the Copyright notice: The numbers do not vary, = much less the other text. It has never failed until after the two KDB = messages are present. So far I've only tested one PowerMac G5, booting = over and over for a few hours. >=20 >=20 >=20 > (I do not claim to be set up for remote kernel debugging. I just = decided to let GDB go along for the ride when I added DDB.) 
>=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 >=20 >=20 >=20 >=20 >=20 From owner-freebsd-ppc@FreeBSD.ORG Thu Sep 25 20:19:40 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9DE3523B for ; Thu, 25 Sep 2014 20:19:40 +0000 (UTC) Received: from asp.reflexion.net (outbound-242.asp.reflexion.net [69.84.129.242]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A1A23A48 for ; Thu, 25 Sep 2014 20:19:38 +0000 (UTC) Received: (qmail 21733 invoked from network); 25 Sep 2014 20:19:36 -0000 Received: from unknown (HELO mail-cs-03.app.dca.reflexion.local) (10.81.19.3) by 0 (rfx-qmail) with SMTP; 25 Sep 2014 20:19:36 -0000 Received: by mail-cs-03.app.dca.reflexion.local (Reflexion email security v7.30.7) with SMTP; Thu, 25 Sep 2014 16:19:36 -0400 (EDT) Received: (qmail 14895 invoked from network); 25 Sep 2014 20:17:52 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (DHE-RSA-AES256-SHA encrypted) SMTP; 25 Sep 2014 20:17:52 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-98-246-178-138.hsd1.or.comcast.net [98.246.178.138]) by iron2.pdx.net (Postfix) with ESMTPSA id 9CEB11C4057; Thu, 25 Sep 2014 13:17:47 -0700 (PDT) Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: lr=u_trap+0x10 and srr0=k_trap+0x28 for "stopped at 0 illegal instruction 0" before-copyright hang on PowerMac G5's From: Mark Millard In-Reply-To: Date: Thu, 25 Sep 2014 13:17:50 -0700 Message-Id: <0DF8A9EC-C81C-4E15-9420-6831BA7D5F8E@dsl-only.net> References: <1118046C-0FF7-49FC-82DA-DB9A7A310991@dsl-only.net> <2ED3DB50-B985-4382-8FF2-3B44E7E65453@dsl-only.net> 
<6D729F43-662A-429E-9503-0148EC3250B1@dsl-only.net> <72535F89-3942-45A6-B351-7F746209ED9F@dsl-only.net> <0703EF26-6E33-4446-9273-BBFD0CB72893@dsl-only.net> <37575F94-763C-43BF-8DD9-F648F4A7C09F@dsl-only.net> <5422E513.6010806@freebsd.org> <1C02D0D4-14B8-465F-B493-4D3A64E4C35C@dsl-only.net> To: Nathan Whitehorn X-Mailer: Apple Mail (2.1878.6) Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: Justin Hibbits , FreeBSD PowerPC ML X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Sep 2014 20:19:40 -0000

The "before copyright" hang/exception is during the first openfirmware "peer" after "quiesce". The ofw_restore_trap_vec(save_trap_init) completes fine, the ofwcall(args) is made but it does not return normally.

Ignoring the ofwcall's from before quiesce, the sequence of ofwcall's is:

quiesce
finddevice
parent
getprop
getprop
getprop
finddevice
getprop
instance-to-package
getproplen
finddevice
getprop
getprop
peer

And when the boot fails before the copyright that ofwcall for peer ends up resulting in the register dump with no register pointing to the kernel's normal stack area.

I still have no clue what is happening during peer. ofw_restore_trap_vec(save_trap_init) is being called and is returning before ofwcall is used. For all I know some uses of peer could require not being quiesce'd in order for peer to be reliable.

In the form of my display indicating what executed the text reported ends in:

^

where the ^ indicates the stage that last completed in the call sequence inside openfirmware_core. This information is displayed by the

x/s ofw_name_history

in the automatically created default script for DDB.
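For readers who want to see the recording idea in isolation, here is a minimal user-space sketch of it. It is not the kernel hack itself: the names hist, hist_begin, hist_stage and hist_demo, and the small HIST_SIZE, are hypothetical and chosen only for illustration. It assumes, as the hack does, that each call starts a record by overwriting the previous stage marker with '<', copies at most 20 characters of the service name, and then leaves a one-character marker slot that later stages overwrite.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HIST_SIZE     32   /* hypothetical; the hack described here uses 256 */
#define HIST_NAME_CAP 20   /* per-name copy limit, as in the hack */

static char hist[HIST_SIZE + 1];  /* extra byte stays '\0' so it prints as a string */
static size_t hist_pos;           /* index of the current marker slot */

/* Clear the ring and return to the start. */
static void hist_reset(void)
{
    memset(hist, 0, sizeof(hist));
    hist_pos = 0;
}

/* Step the write position forward by one, wrapping at the end of the ring. */
static void hist_advance(void)
{
    hist_pos = (hist_pos + 1) % HIST_SIZE;
}

/* Start a record: '<' replaces the old marker, then the name, '>', and '@'. */
static void hist_begin(const char *name)
{
    int i;

    assert(name != NULL);
    hist[hist_pos] = '<';
    for (i = 0; *name != '\0' && i != HIST_NAME_CAP; i++) {
        hist_advance();
        hist[hist_pos] = *name++;
    }
    hist_advance();
    hist[hist_pos] = '>';
    hist_advance();
    hist[hist_pos] = '@';
}

/* Overwrite the marker slot to show which stage completed last. */
static void hist_stage(char marker)
{
    hist[hist_pos] = marker;
}

/* One completed call followed by one call that stops after stage '^'. */
static const char *hist_demo(void)
{
    hist_reset();
    hist_begin("quiesce");
    hist_stage('!');       /* call ran to completion */
    hist_begin("peer");
    hist_stage('^');       /* last stage reached before the model "hang" */
    return hist;
}
```

Because the next call's '<' lands on the previous call's final marker, completed calls appear as bare <name> records and only the last record carries a stage marker.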
I read the sequence backwards from the end marker (here ^), following the wraparound if there is that much text and if I care to go back that far.

FreeBSD FBSDG5M1 10.1-BETA2 FreeBSD 10.1-BETA2 #11 r271944M: Thu Sep 25 12:14:05 PDT 2014 root@FBSDG5M1:/usr/obj/usr/src/sys/GENERIC64 powerpc

My current hacks to get this information are:

Index: /usr/src/sys/ddb/db_script.c
===================================================================
--- /usr/src/sys/ddb/db_script.c (revision 271944)
+++ /usr/src/sys/ddb/db_script.c (working copy)
@@ -319,10 +319,25 @@
 {
     char scriptname[DB_MAXSCRIPTNAME];
 
+    /* HACK!!! : Additional lines to force a basic default script to exist.
+     * Will dump information even if ddb input is not available for early crash.
+     * Used to get more information about PowerMac G5 "before Copyright" hangs.
+     */
+    struct ddb_script *dsp = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT);
+    if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; bt; x/s ofw_name_history");
+
     snprintf(scriptname, sizeof(scriptname), "%s.%s",
         DB_SCRIPT_KDBENTER_PREFIX, eventname);
     if (db_script_exec(scriptname, 0) == ENOENT)
         (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
+
+    /* HACK!!! : Additional lines to always use the default script,
+     * even if scriptname existed and was executed.
+     * Will dump information even if ddb input is not available for early crash.
+     * Used to get more information about PowerMac G5 "before Copyright" hangs.
+     */
+    else
+        (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
 }
 
 /*-
Index: /usr/src/sys/powerpc/conf/GENERIC64
===================================================================
--- /usr/src/sys/powerpc/conf/GENERIC64 (revision 271944)
+++ /usr/src/sys/powerpc/conf/GENERIC64 (working copy)
@@ -76,6 +76,8 @@
 # Debugging support. Always need this:
 options KDB # Enable kernel debugger support.
 options KDB_TRACE # Print a stack trace for a panic.
+options DDB
+options GDB
 
 # Make an SMP-capable kernel by default
 options SMP # Symmetric MultiProcessor Kernel
Index: /usr/src/sys/powerpc/ofw/ofw_machdep.c
===================================================================
--- /usr/src/sys/powerpc/ofw/ofw_machdep.c (revision 271944)
+++ /usr/src/sys/powerpc/ofw/ofw_machdep.c (working copy)
@@ -324,6 +324,12 @@
     openfirmware(&args);
 }
 
+/* Part of HACK to have record of ofw call names */
+#define ofw_name_history_record_size 256
+char ofw_name_history[ofw_name_history_record_size+1] = {}; /* Initially: automatically '\0' filled */
+char * ofw_name_history_pos = ofw_name_history;
+/* End Part of HACK */
+
 static int
 openfirmware_core(void *args)
 {
@@ -330,6 +336,42 @@
     int result;
     register_t oldmsr;
 
+    { /* HACK to have record of ofw call names */
+        struct argtype_prefix {
+            cell_t name;
+        };
+
+        char *name = (char*) (uintptr_t) (((struct argtype_prefix*)args)->name);
+ 
+        int i;
+
+        *ofw_name_history_pos = '<';
+
+        for(i=0; (*name) && i!=20; i++) {
+            ofw_name_history_pos++;
+            if (ofw_name_history_pos == &ofw_name_history[ofw_name_history_record_size]) {
+                ofw_name_history_pos = ofw_name_history;
+            }
+            *ofw_name_history_pos = *name;
+
+            name++;
+        }
+
+        ofw_name_history_pos++;
+        if (ofw_name_history_pos == &ofw_name_history[ofw_name_history_record_size]) {
+            ofw_name_history_pos = ofw_name_history;
+        }
+        *ofw_name_history_pos = '>';
+
+        ofw_name_history_pos++;
+        if (ofw_name_history_pos == &ofw_name_history[ofw_name_history_record_size]) {
+            ofw_name_history_pos = ofw_name_history;
+        }
+        *ofw_name_history_pos = '@';
+
+        ofw_name_history[ofw_name_history_record_size] = '\0'; /* Paranoia */
+    } /* HACK end */
+
     /*
      * Turn off exceptions - we really don't want to end up
      * anywhere unexpected with PCPU set to something strange
@@ -337,14 +379,22 @@
      */
     oldmsr = intr_disable();
 
+    *ofw_name_history_pos = '#'; /* HACK */
+
     ofw_sprg_prepare();
 
+    *ofw_name_history_pos = '$'; /* HACK */
+
     /* Save trap vectors */
     ofw_save_trap_vec(save_trap_of);
 
+    *ofw_name_history_pos = '%'; /* HACK */
+
     /* Restore initially saved trap vectors */
     ofw_restore_trap_vec(save_trap_init);
 
+    *ofw_name_history_pos = '^'; /* HACK */
+
 #if defined(AIM) && !defined(__powerpc64__)
     /*
      * Clear battable[] translations
@@ -357,13 +407,21 @@
 
     result = ofwcall(args);
 
+    *ofw_name_history_pos = '&'; /* HACK */
+
     /* Restore trap vecotrs */
     ofw_restore_trap_vec(save_trap_of);
 
+    *ofw_name_history_pos = '*'; /* HACK */
+
     ofw_sprg_restore();
 
+    *ofw_name_history_pos = '~'; /* HACK */
+
     intr_restore(oldmsr);
 
+    *ofw_name_history_pos = '!'; /* HACK */
+
     return (result);
 }
 

===
Mark Millard
markmi at dsl-only.net

On Sep 25, 2014, at 3:46 AM, Mark Millard wrote:

One source code oddity that I notice is the following mixed use of ofw_real_mode: always tested vs. never tested (#if 0 ... #endif) ...

> /*
>  * Saved SPRG0-3 from OpenFirmware. Will be restored prior to the callback.
>  */
> register_t ofw_sprg0_save;
>
> static __inline void
> ofw_sprg_prepare(void)
> {
>     if (ofw_real_mode)
>         return;
>
>     /*
>      * Assume that interrupt are disabled at this point, or
>      * SPRG1-3 could be trashed
>      */
>     __asm __volatile("mfsprg0 %0\n\t"
>         "mtsprg0 %1\n\t"
>         "mtsprg1 %2\n\t"
>         "mtsprg2 %3\n\t"
>         "mtsprg3 %4\n\t"
>         : "=&r"(ofw_sprg0_save)
>         : "r"(ofmsr[1]),
>         "r"(ofmsr[2]),
>         "r"(ofmsr[3]),
>         "r"(ofmsr[4]));
> }
>
> static __inline void
> ofw_sprg_restore(void)
> {
> #if 0
>     if (ofw_real_mode)
>         return;
> #endif
>
>     /*
>      * Note that SPRG1-3 contents are irrelevant. They are scratch
>      * registers used in the early portion of trap handling when
>      * interrupts are disabled.
>      *
>      * PCPU data cannot be used until this routine is called !
>      */
>     __asm __volatile("mtsprg0 %0" :: "r"(ofw_sprg0_save));
> }

It would seem that for ofw_real_mode != 0, ofw_sprg_prepare would never set up ofw_sprg0_save (via mfsprg0) for the later ofw_sprg_restore's always-executed mtsprg0 that is based on ofw_sprg0_save.

register_t seems to trace back to __int64_t --and that would leave ofw_sprg0_save initialized to zero as a global and that would have to be okay as the SPRG0 value to restore in such a case. (I have not tracked down what any of the per-processor values for SPRG0 are/should-be.)

===
Mark Millard
markmi at dsl-only.net

On Sep 25, 2014, at 2:12 AM, Mark Millard wrote:

The register dump that has no kernel stack addresses in any registers does have register contents suggesting an ofwcall use, matching up reasonably with the code I looked at that is related to ofwcall. ofwcall is only reached via openfirmware_core from what I can tell. (If there are other paths into openfirmware than via ofwcall then the register dump suggests that they are not in use around the crash.)

And openfirmware_core has logic for exception vector swapping, going both directions:

> static int
> openfirmware_core(void *args)
> {
>     int result;
>     register_t oldmsr;
>
>     /*
>      * Turn off exceptions - we really don't want to end up
>      * anywhere unexpected with PCPU set to something strange
>      * or the stack pointer wrong.
>      */
>     oldmsr = intr_disable();
>
>     ofw_sprg_prepare();
>
>     /* Save trap vectors */
>     ofw_save_trap_vec(save_trap_of);
>
>     /* Restore initially saved trap vectors */
>     ofw_restore_trap_vec(save_trap_init);
>
> #if defined(AIM) && !defined(__powerpc64__)
>     /*
>      * Clear battable[] translations
>      */
>     if (!(cpu_features & PPC_FEATURE_64))
>         __asm __volatile("mtdbatu 2, %0\n"
>             "mtdbatu 3, %0" : : "r" (0));
>     isync();
> #endif
>
>     result = ofwcall(args);
>
>     /* Restore trap vecotrs */
>     ofw_restore_trap_vec(save_trap_of);
>
>     ofw_sprg_restore();
>
>     intr_restore(oldmsr);
>
>     return (result);
> }

In turn openfirmware_core is used only by ofw_rendezvous_dispatch and in turn that is used only by openfirmware: only PCPU_GET(cpuid) == 0 does the above. save_trap_init is initialized by powerpc_init using ofw_save_trap_vec.

[Note that ofw_restore_trap_vec uses __syncicache which does not use dcbf after the bcopy but instead uses dcbst: That is part of what led my investigation into the distinction --and so to my more overall dcbst vs. dcbf use questions after proving dcbf would not be sufficient for a fix to the specific boot issue.]

Unless the initialization of save_trap_init ends up with the wrong contents for openfirmware it would appear that the exception vectors are kept properly tracked by the above code. But the above does assume that the openfirmware vectors are unchanged after save_trap_init is initialized: there is no attempt at tracking of any potential updates to the openfirmware exception vectors.
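The "no tracking of updates" assumption can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: the byte arrays stand in for the real vector area, and every name here (vectors, snap_init, snap_kernel, fw_updates_vectors, model_fw_call, model_demo) is hypothetical rather than taken from the kernel. It only mirrors the ordering used in openfirmware_core: save the kernel's vectors, reinstate the one-time snapshot, call the firmware, put the kernel's vectors back.

```c
#include <assert.h>
#include <string.h>

#define VEC_BYTES 8  /* hypothetical size; stands in for the real vector area */

static char vectors[VEC_BYTES];     /* models the live exception-vector memory */
static char snap_init[VEC_BYTES];   /* plays the role of save_trap_init */
static char snap_kernel[VEC_BYTES]; /* plays the role of save_trap_of */
static char fw_saw;                 /* what the model firmware found installed */

/* Model firmware: note the installed vector, then update it for "next time". */
static void fw_updates_vectors(char *live)
{
    fw_saw = live[0];
    live[0] = 'N';
}

/* Same ordering as openfirmware_core: save, swap in snapshot, call, swap back. */
static void model_fw_call(void (*fw)(char *))
{
    assert(fw != NULL);
    memcpy(snap_kernel, vectors, VEC_BYTES);  /* keep the kernel's vectors */
    memcpy(vectors, snap_init, VEC_BYTES);    /* reinstate the startup snapshot */
    fw(vectors);                              /* firmware may rewrite its vectors */
    memcpy(vectors, snap_kernel, VEC_BYTES);  /* firmware's rewrite is discarded */
}

/* Returns what the firmware finds installed on its second call. */
static char model_demo(void)
{
    memset(vectors, 'F', VEC_BYTES);          /* firmware's vectors at startup */
    memcpy(snap_init, vectors, VEC_BYTES);    /* one-time snapshot */
    memset(vectors, 'K', VEC_BYTES);          /* kernel installs its own vectors */
    model_fw_call(fw_updates_vectors);        /* firmware writes 'N' */
    model_fw_call(fw_updates_vectors);        /* second call: 'N' or 'F'? */
    return fw_saw;
}
```

In this model the second call finds the original startup contents, not what the firmware wrote during the first call, which is the behavior the paragraph above describes. Whether the real Open Firmware ever updates its vectors after startup is not something this sketch can show.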
I would infer then that after ofw_restore_trap_vec(save_trap_of) is = executed is when the exception that DDB reports happened: That is when = FreeBSD's exception vectors are again in place. But a stack pointer into = the kernel stack is not then in place in any register (based on DDB's = register dump): stack handling is messed up already by the point of the = reported exception. And that may actually be why an illegal instruction = at address zero was reached: an incorrect stack context used to get an = address to execute at. =3D=3D=3D Mark Millard markmi at dsl-only.net On Sep 24, 2014, at 8:36 AM, Nathan Whitehorn wrote: There shouldn't be any exceptions at that point, nested or otherwise. = What I suspect is happening is that Open Firmware has turned them on for = some bizarre reason, taken one, and ended up in the kernel's handlers = but with the Open Firmware environment. Saving and restoring the OF = interrupt vectors would be a possible solution; flattening the device = tree in loader so that the kernel doesn't call Open Firmware at all = would be another. I think Justin may have tried the first at some point. -Nathan On 09/24/14 02:04, Mark Millard wrote: > Now that I've had a kernel/boot crash with a successful DDB bt and = show registers (a different submittal) it makes for a good = comparison/contrast with what DDB reports for this "before copyright" = crash. >=20 > Something unique to the "before copyright" context is... >=20 > No registers are reported to have values that point into the range = between tmpstk and esym. >=20 > In other words: There is no valid stack pointer reported as far as I = can tell. r1 has the value 0 instead of being a handling a valid stack = address. tmpstk=3D0xbd7000 and esym=3D0xbdb000 (example for one of my = WITH_DEBUG_FILES=3D and options DDB and GDB builds of 10.1-BETA2). That = at least gives a ball park on the range to expect for pointing into the = stack even with some build variation. 
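The range check described above can be written down as a small sketch. It assumes the stack area is the half-open range [tmpstk, esym) with the example bounds given in this message; the helper names and the choice of sample registers (r1, r3, r4, r9, r13 and r31 from the 10.1-BETA2 register dump quoted in this message) are for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True if addr lies in the half-open range [lo, hi). */
static int in_range(uint64_t addr, uint64_t lo, uint64_t hi)
{
    assert(lo <= hi);
    return addr >= lo && addr < hi;
}

/* Count how many of the n register values point into [lo, hi). */
static size_t count_in_range(const uint64_t *regs, size_t n,
    uint64_t lo, uint64_t hi)
{
    size_t i, hits = 0;

    for (i = 0; i < n; i++) {
        if (in_range(regs[i], lo, hi))
            hits++;
    }
    return hits;
}

/* r1, r3, r4, r9, r13 and r31 as transcribed in the 10.1-BETA2 dump. */
static const uint64_t sample_regs[] = {
    0x0, 0xd18868, 0x894b58, 0xe0f580, 0xdbb290, 0xbb7d38
};
```

With tmpstk=0xbd7000 and esym=0xbdb000, count_in_range(sample_regs, 6, 0xbd7000, 0xbdb000) gives the number of sampled registers that could be stack pointers.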
>=20 > It leaves me wondering if the DDB report is for a nested exception = handling. That could explain why lr points to u_trap+0x10 and srr0 = points to k_trap+0x28 when normally srr0 would point to the the failing = instruction (or the instruction after) and lr to where that routine = would normally return to. >=20 > The register values that are reported for my 10.1-BETA2 builds that = crash before the copyright notice are: >=20 > r0: 0 > r1: 0 > r2: 0xc81538 vop_unlock_desc > r3: 0xd18868 > r4: 0x894b58 > r5: 0 > r6: 0xc1dee0 M_AUDITBSM > r7: 0xe3f818 ofw_real_mode > r8: 0x1 > r9: 0xe0f580 __pcpu > r10: 0x1c35ec0 > r11: 0 > r12: 0x10000000 > r13: 0xdbb290 thread0 (Note: another submittal has this mistyped as = 0xdbb290.) > r14-r19: all 0 > r20: 0x10c1000 > r21: 0x4 > r22: 0x180abd4 > r23: 0x1803a28 > r24: 0xc000000000008760 > r25: 0xcc89b8 smp_no... > r26: 0xcea108 ofw_rend... > r27: 0x894b58 ofwcall+0xa8 > r28: 0x894b58 ofwcall+0xa8 > r29: 2400022 > r30: 9000000000001032 > r31: 0xbb7d38 >=20 > srr0: 0x102720 k_trap+0x28 > srr1: 9000000000001032 > lr: 0x1026f0 u_trap+0x10 > ctr: 0xff846d78 > cr: 2000deb0 > xer: 0 > dar: f...d50 (lots of f's) > dsisr: 42000000 >=20 >=20 >=20 >=20 >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 > On Sep 20, 2014, at 3:42 PM, Mark Millard = wrote: >=20 > [I corrected the SSR0 in the subject to be SRR0.] >=20 > I did miss a register in my list (it matched the shown r30 value). And = it turns out to probably be very important to interpreting what the = "show registers" is reporting: >=20 > SRR1: 0x9000000000001032 >=20 > But bits 43-46 of SRR1 are supposed to indicate which type of Program = Exception, using a single binary 1 to so. No such 1's are present. >=20 > Illegal instruction would have been bit 44 being 1. (PowerPC has the = upper bit numbered zero and increases from there.) 
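The bit arithmetic above can be checked with a short sketch. It assumes the 64-bit convention stated in the message (bit 0 is the most significant bit, bits 43-46 carry the Program Exception cause, bit 44 means illegal instruction); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mask for one bit of a 64-bit register in IBM numbering (bit 0 is the MSB). */
static uint64_t ibm_bit64(unsigned bit)
{
    assert(bit <= 63);
    return (uint64_t)1 << (63 - bit);
}

/* SRR1 bits 43..46, right-justified; 0 means no cause bit is set. */
static unsigned srr1_program_cause(uint64_t srr1)
{
    return (unsigned)((srr1 >> (63 - 46)) & 0xF);
}

/* Nonzero if the illegal-instruction cause bit (bit 44) is set. */
static int srr1_is_illegal_instruction(uint64_t srr1)
{
    return (srr1 & ibm_bit64(44)) != 0;
}
```

For the reported value 0x9000000000001032 the cause field is zero: the only bits set are IBM bits 0, 3, 51, 58, 59 and 62, none of which fall in 43-46.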
>=20 > So the ddb "show registers" is apparently not reporting the status as = of when the "stopped at 0 illegal instruction 0" happened. Thus other = things are also likely not from that exact time frame. >=20 >=20 >=20 > And I misinterpreted the LR value status: The LR value was just left = over from the restore_kernsrs returning when it finished. Execution then = flowed into k_trap. Nothing unusual involved. >=20 >=20 >=20 >=20 >=20 > =3D=3D=3D > Mark Millard > markmi@dsl-only.net >=20 > On Sep 18, 2014, at 8:57 PM, Mark Millard wrote: >=20 > I modified DDB to automatically "show registers" even at the early = "before Copyright" crash time. The end of this note will show the = /usr/src/sys/ddb/db_script.c diff for the hack. While I also had DDB bt, = the bt does not actually print a back trace for this context. (It might = for others.) >=20 > The registers give interesting context despite the lack of a back = trace. I do not know if it will be sufficient to be of much immediate = help if someone used the information to start looking at the problem. >=20 > I'll start with register lr: 0x1026f0 u_trap+0x10. >=20 > /usr/src/sys/powerpc/aim/trap_subr64.S has: >=20 > s_trap: > bf 17,k_trap /* branch if PSL_PR is false = */ > GET_CPUINFO(%r1) > u_trap: > ld %r1,PC_CURPCB(%r1) > mr %r27,%r28 /* Save LR, r29 */ > mtsprg2 %r29 > bl restore_kernsrs /* enable kernel mapping */ > mfsprg2 %r29 > mr %r28,%r27 >=20 > /* > * Now the common trap catching code. > */ > k_trap: > FRAME_SETUP(PC_TEMPSAVE) > /* Call C interrupt dispatcher: */ > trapagain: >=20 > and so this appears to indicate a pending return to execute the = "mfsprg2 %r29" after "bl restore_kernsrs", which indicates that = restore_kernsrs should be active. >=20 > But register srr0 indicates: 0x102720 k_trap+0x28. (So apparently in = FRAME_SETUP(PC_TEMPSAVE) someplace.) >=20 > So it appears to me that the processor got to the k_trap code during = the supposed restore_kernsrs time frame. 
(But I'm no expert at these = sorts of things or for the processor.) >=20 > I'll list the other register values: >=20 > r0: 0 > r1: 0 > r2: 0xc1be80 M_AUDITBSM > r3: 0xb16138 > r4: 0x8926e8 .ofwcall+0xa8 > r5: 0 > r6: 0xbb5f90 > r7: 0xe3d118 ofw_real_mode > r8: 0x1 > r9: 0xe0ce80 __pcpu > r10: 0x1c35ec9 > r11: 0 > r12: 0x10000000 > r13: db890 thread0 > r14-r19: all 0 > r20: 0x10bc000 > r21: 0x4 > r22: 0x1801db4 > r23: 0x1803a28 > r24: 0xc000000000008760 > r25: 0xcc6908 smp_no_rendevous_barrier > r26: 0xec79e0 ofw_rendezvous_dispatch (yep one has v and the other zv) > r27: 0x8926e8 .ofwcall+0xa8 > r28: 0x8926e8 .ofwcall+0xa8 (yep: same value) > r29: 0x24000022 > r30: 0x9000000000001032 > r31: 0xc7f488 vop_unlock_desc >=20 > ctr: 0xff846d78 > cr: 0x2000d7b0 > xer: 0 > dar: 0xfffffffffffffd50 > dsisr: 0x42000000 >=20 > (Hopefully this manual transcription from the screen display is = complete --and also accurate for what it does present.) >=20 >=20 >=20 >=20 > The personal HACK to /usr/src/sys/ddb/db_script.c's = db_script_kdbenter(...) to have it show registers and try bt... >=20 > $ cd /usr/src/sys/ddb/ > $ svnlite diff . > Index: db_script.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- db_script.c (revision 271610) > +++ db_script.c (working copy) > @@ -319,10 +319,25 @@ > { > char scriptname[DB_MAXSCRIPTNAME]; > =20 > + /* HACK!!! : Additional lines to force a basic default script to = exist. > + * Will dump information even if ddb input is not available for = early crash. > + * Used to get more information about PowerMac G5 "before Copyright" = hangs. 
> + */ > + struct ddb_script *dsp =3D = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT); > + if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; = bt"); > + > snprintf(scriptname, sizeof(scriptname), "%s.%s", > DB_SCRIPT_KDBENTER_PREFIX, eventname); > if (db_script_exec(scriptname, 0) =3D=3D ENOENT) > (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); > + > + /* HACK!!! : Additional lines to always use the default script, > + * even if scriptname existed and was executed. > + * Will dump information even if ddb input is not available for = early crash. > + * Used to get more information about PowerMac G5 "before Copyright" = hangs. > + */ > + else > + (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); > } > =20 > /*- >=20 >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 > On Sep 16, 2014, at 9:28 PM, Mark Millard = wrote: >=20 > In part I sent directly to you because of a past exchange (July-27) = where you had written: >=20 >> Nathan and I both speculate that it's >> dropping into Open Firmware (we make extensive use of OFW), and then >> messing something up, taking a page fault or something. >=20 > The specific text that I report and its uniformity when it is produced = seems to add a little information beyond a speculated "page fault or = something" and so might eventually help a little. As I understand the = text it is reporting execution reaching address zero without any prior = un-handled exceptions or other such that would stop it. A corrupted = stack (pointer) so a bad return address or some such? I'd guess there = are no explicit jumps to address zero so I expect that indirection is = likely involved, with the content for the indirection messed up. >=20 > I really wish that I had a logic analyzer configuration for this. I've = not found a way to make the failing context visible so far and the extra = way of looking at things might have helped. 
>
>
>
>
> ===
> Mark Millard
> markmi@dsl-only.net
>
> On Sep 16, 2014, at 8:28 PM, Justin Hibbits wrote:
>
> Hi mark,
>
> I see this on my G5, and I think it's due to the amount of RAM in the machine. More than 4gb seems to confuse open firmware when called by FreeBSD. There is some effort to remove the need of the callbacks but thus far it's not far along. The good news is that after it boots it's solid except when switching vtys, but earlier this year or last year I added a sysctl hack to disable the call into open firmware on vty switch (don't recall offhand and not at my computer right now, but if you grep the sysctl output for reset and ofw you can find it).
>
> -Justin
>
> On Sep 16, 2014 8:01 PM, "Mark Millard" wrote:
> I've now spent time with rebooting and power-off/power-on for all 3 PowerMac G5's (one PowerMac7,2 and two PowerMac11,2's) and all 3 get the
>
>> GDB: no debug ports present
>> KDB: debugger backends: DDB
>> KDB: current backend: DDB
>> [ thread pid -1 tid 1006665719 ]
>> Stopped at 0: illegal instruction 0
>> db>
>
> when they fail just before the Copyright notice would normally be displayed. None fail any earlier. At that spot none have failed any other way. It is the same SSD in all 3. (Happens with other SSD's as well.) Overall there is a mix of Radeon and NVIDIA display boards. Besides the SSD use and RAM upgrades the rest is stock equipment. syscons used, not vt. (I've yet to try vt.)
>
> Seeing a failure after the Copyright notice has been fairly rare in all my experiments from when I started last April or so. The ones that I've noted had Data Storage Interrupt reported. So far no examples of the above have been reported after the Copyright notice. So I'd guess that they are separate issues.
Of course it seems that only in the last few = days would I have seen the above sort of thing if it did happen after = the Copyright notice: The prior history does not count for judgements = about that. >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 > On Sep 16, 2014, at 8:15 AM, Mark Millard wrote: >=20 > Using 10.1-BETA1 I added "options DDB" and "options GDB" to = powerpc64's GENERIC64. (I also used WITH_DEBUG_FILES=3D, WITHOUT_CLANG=3D,= and WITH_DEBUG=3D in /etc/make.conf.) So buildworld, kernel was = basically just set up to have more of a debugging context around = (including for any ports builds). >=20 > The result was new information about the PowerMac G5 boot hangups: The = screen is no longer blank when the G5 is hung up without there being a = Copyright notice yet. It says... >=20 >> GDB: no debug ports present >> KDB: debugger backends: DDB >> KDB: current backend: DDB >> [ thread pid -1 tid 1006665719 ] >> Stopped at 0: illegal instruction 0 >> db> >=20 > (I had no ability to input at that point.) Normally the Copyright = notice would have displayed instead of "[...]" and what follows. (I do = not claim to have all the spacing, capitalization, and such correct = above.) >=20 > That text is constant from hang to hang when it hangs just before it = would normally output the Copyright notice: The numbers do not vary, = much less the other text. It has never failed until after the two KDB = messages are present. So far I've only tested one PowerMac G5, booting = over and over for a few hours. >=20 >=20 >=20 > (I do not claim to be set up for remote kernel debugging. I just = decided to let GDB go along for the ride when I added DDB.) 
>=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 >=20 >=20 >=20 >=20 >=20 From owner-freebsd-ppc@FreeBSD.ORG Thu Sep 25 21:09:04 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BD5C4C1B for ; Thu, 25 Sep 2014 21:09:04 +0000 (UTC) Received: from d.mail.sonic.net (d.mail.sonic.net [64.142.111.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 93189FEC for ; Thu, 25 Sep 2014 21:09:04 +0000 (UTC) Received: from aurora.physics.berkeley.edu (aurora.Physics.Berkeley.EDU [128.32.117.67]) (authenticated bits=0) by d.mail.sonic.net (8.14.9/8.14.9) with ESMTP id s8PL8tHA003153 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Thu, 25 Sep 2014 14:08:55 -0700 Message-ID: <54248467.4050900@freebsd.org> Date: Thu, 25 Sep 2014 14:08:55 -0700 From: Nathan Whitehorn User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.0 MIME-Version: 1.0 To: Mark Millard Subject: Re: lr=u_trap+0x10 and srr0=k_trap+0x28 for "stopped at 0 illegal instruction 0" before-copyright hang on PowerMac G5's References: <1118046C-0FF7-49FC-82DA-DB9A7A310991@dsl-only.net> <2ED3DB50-B985-4382-8FF2-3B44E7E65453@dsl-only.net> <6D729F43-662A-429E-9503-0148EC3250B1@dsl-only.net> <72535F89-3942-45A6-B351-7F746209ED9F@dsl-only.net> <0703EF26-6E33-4446-9273-BBFD0CB72893@dsl-only.net> <37575F94-763C-43BF-8DD9-F648F4A7C09F@dsl-only.net> <5422E513.6010806@freebsd.org> <1C02D0D4-14B8-465F-B493-4D3A64E4C35C@dsl-only.net> <0DF8A9EC-C81C-4E15-9420-6831BA7D5F8E@dsl-only.net> In-Reply-To: <0DF8A9EC-C81C-4E15-9420-6831BA7D5F8E@dsl-only.net> X-Sonic-CAuth: 
UmFuZG9tSVbF5MVs0UxQKkX/oWeYrjK+vugZ1Wk4Y1FWoYisuNKFCSiukDi5OT5/o8xkc8VmLF/Qt0gA85k65l51LViLce7TZkz9nIiiURI= X-Sonic-ID: C;LHbkKPhE5BGeZQDu5Qupew== M;jBIQKfhE5BGeZQDu5Qupew== X-Spam-Flag: Unknown X-Sonic-Spam-Details: not scanned (too big) by cerberusd Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: Justin Hibbits , FreeBSD PowerPC ML X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Sep 2014 21:09:04 -0000 Can you comment out the call to quiesce? It may not be necessary on your system. -Nathan On 09/25/14 13:17, Mark Millard wrote: > The "before copyright" hang/exception is during the first openfirmware > "peer" after "quiesce". The ofw_restore_trap_vec(save_trap_init) > completes fine, the ofwcall(args) is made but it does not return normally. > > Ignoring the ofwcall's from before quiesce, the sequence of ofwcall's is: > > quiesce > finddevice > parent > getprop > getprop > getprop > finddevice > getprop > instance-to-package > getproplen > finddevice > getprop > getprop > peer > > And when the boot fails before the copyright that ofwcall for peer > ends up resulting in the register dump with no register pointing to > the kernel's normal stack area. > > I still have no clue what is happening during peer. > ofw_restore_trap_vec(save_trap_init) is being called and is returning > before ofwcall is used. For all I know some uses of peer could require > not being quiesce'd in order for peer to be reliable. > > In the form of my display indicating what executed the text reported > ends in: > > ^ > > where the ^ indicates the stage that last completed in the call > sequence inside openfirmware_core. 
This information is displayed by the > > x/s ofw_name_history > > in the automatically created default script for DDB. I read the > sequence backwards from the end marker (here ^), following the > wraparound if there is that much text and if I care to go back that far. > > FreeBSD FBSDG5M1 10.1-BETA2 FreeBSD 10.1-BETA2 #11 r271944M: Thu Sep > 25 12:14:05 PDT 2014 root@FBSDG5M1:/usr/obj/usr/src/sys/GENERIC64 powerpc > > My current hacks to get this information are: > > Index: /usr/src/sys/ddb/db_script.c > =================================================================== > --- /usr/src/sys/ddb/db_script.c(revision 271944) > +++ /usr/src/sys/ddb/db_script.c(working copy) > @@ -319,10 +319,25 @@ > { > char scriptname[DB_MAXSCRIPTNAME]; > > > +/* HACK!!! : Additional lines to force a basic default script to exist. > +* Will dump information even if ddb input is not available for early > crash. > +* Used to get more information about PowerMac G5 "before Copyright" > hangs. > +*/ > +struct ddb_script *dsp = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT); > +if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; > bt; x/s ofw_name_history"); > + > snprintf(scriptname, sizeof(scriptname), "%s.%s", > DB_SCRIPT_KDBENTER_PREFIX, eventname); > if (db_script_exec(scriptname, 0) == ENOENT) > (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); > + > +/* HACK!!! : Additional lines to always use the default script, > +* even if scriptname existed and was executed. > +* Will dump information even if ddb input is not available for early > crash. > +* Used to get more information about PowerMac G5 "before Copyright" > hangs. 
> +*/ > +else > +(void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); > } > > > /*- > Index: /usr/src/sys/powerpc/conf/GENERIC64 > =================================================================== > --- /usr/src/sys/powerpc/conf/GENERIC64(revision 271944) > +++ /usr/src/sys/powerpc/conf/GENERIC64(working copy) > @@ -76,6 +76,8 @@ > # Debugging support. Always need this: > options KDB# Enable kernel debugger support. > options KDB_TRACE# Print a stack trace for a panic. > +options DDB > +options GDB > > > # Make an SMP-capable kernel by default > options SMP# Symmetric MultiProcessor Kernel > Index: /usr/src/sys/powerpc/ofw/ofw_machdep.c > =================================================================== > --- /usr/src/sys/powerpc/ofw/ofw_machdep.c(revision 271944) > +++ /usr/src/sys/powerpc/ofw/ofw_machdep.c(working copy) > @@ -324,6 +324,12 @@ > openfirmware(&args); > } > > > +/* Part of HACK to have record of ofw call names */ > +#define ofw_name_history_record_size 256 > +char ofw_name_history[ofw_name_history_record_size+1] = {}; /* > Initially: automatically '\0' filled */ > +char * ofw_name_history_pos = ofw_name_history; > +/* End Part of HACK */ > + > static int > openfirmware_core(void *args) > { > @@ -330,6 +336,42 @@ > intresult; > register_toldmsr; > > > +{ /* HACK to have record of ofw call names */ > +struct argtype_prefix { > +cell_t name; > +}; > + > +char *name = (char*) (uintptr_t) (((struct argtype_prefix*)args)->name); > + > +int i; > + > +*ofw_name_history_pos = '<'; > + > +for(i=0; (*name) && i!=20; i++) { > +ofw_name_history_pos++; > +if (ofw_name_history_pos == > &ofw_name_history[ofw_name_history_record_size]) { > +ofw_name_history_pos = ofw_name_history; > +} > +*ofw_name_history_pos = *name; > + > +name++; > +} > + > +ofw_name_history_pos++; > +if (ofw_name_history_pos == > &ofw_name_history[ofw_name_history_record_size]) { > +ofw_name_history_pos = ofw_name_history; > +} > +*ofw_name_history_pos = '>'; > + > +ofw_name_history_pos++; 
> +if (ofw_name_history_pos == > &ofw_name_history[ofw_name_history_record_size]) { > +ofw_name_history_pos = ofw_name_history; > +} > +*ofw_name_history_pos = '@'; > + > +ofw_name_history[ofw_name_history_record_size] = '\0'; /* Paranoia */ > +} /* HACK end */ > + > /* > * Turn off exceptions - we really don't want to end up > * anywhere unexpected with PCPU set to something strange > @@ -337,14 +379,22 @@ > */ > oldmsr = intr_disable(); > > > +*ofw_name_history_pos = '#'; /* HACK */ > + > ofw_sprg_prepare(); > > > +*ofw_name_history_pos = '$'; /* HACK */ > + > /* Save trap vectors */ > ofw_save_trap_vec(save_trap_of); > > > +*ofw_name_history_pos = '%'; /* HACK */ > + > /* Restore initially saved trap vectors */ > ofw_restore_trap_vec(save_trap_init); > > > +*ofw_name_history_pos = '^'; /* HACK */ > + > #if defined(AIM) && !defined(__powerpc64__) > /* > * Clear battable[] translations > @@ -357,13 +407,21 @@ > > > result = ofwcall(args); > > > +*ofw_name_history_pos = '&'; /* HACK */ > + > /* Restore trap vecotrs */ > ofw_restore_trap_vec(save_trap_of); > > > +*ofw_name_history_pos = '*'; /* HACK */ > + > ofw_sprg_restore(); > > > +*ofw_name_history_pos = '~'; /* HACK */ > + > intr_restore(oldmsr); > > > +*ofw_name_history_pos = '!'; /* HACK */ > + > return (result); > } > > > > > > === > Mark Millard > markmi at dsl-only.net > > On Sep 25, 2014, at 3:46 AM, Mark Millard > wrote: > > One source code oddity that I notice is the following mixed use of > ofw_real_mode: always tested vs. never tested (#if 0 ... #endif) ... > >> /* >> * Saved SPRG0-3 from OpenFirmware. Will be restored prior to the >> callback. 
>> */ >> register_t ofw_sprg0_save; >> >> static __inline void >> ofw_sprg_prepare(void) >> { >> if (ofw_real_mode) >> return; >> >> /* >> * Assume that interrupt are disabled at this point, or >> * SPRG1-3 could be trashed >> */ >> __asm __volatile("mfsprg0 %0\n\t" >> "mtsprg0 %1\n\t" >> "mtsprg1 %2\n\t" >> "mtsprg2 %3\n\t" >> "mtsprg3 %4\n\t" >> : "=&r"(ofw_sprg0_save) >> : "r"(ofmsr[1]), >> "r"(ofmsr[2]), >> "r"(ofmsr[3]), >> "r"(ofmsr[4])); >> } >> >> static __inline void >> ofw_sprg_restore(void) >> { >> #if 0 >> if (ofw_real_mode) >> return; >> #endif >> >> /* >> * Note that SPRG1-3 contents are irrelevant. They are scratch >> * registers used in the early portion of trap handling when >> * interrupts are disabled. >> * >> * PCPU data cannot be used until this routine is called ! >> */ >> __asm __volatile("mtsprg0 %0" :: "r"(ofw_sprg0_save)); >> } > > It would seem that for ofw_real_mode != 0 that ofw_sprg_prepare would > never set up ofw_sprg0_save (via mfsprg0) for the later > ofw_sprg_restore's always-executed mtsprg0 that is based on > ofw_sprg0_save. > > register_t seems to trace back to __int64_t --and that would leave > ofw_sprg0_save initialized to zero as a global and that would have to > be okay as the SPRG0 value to restore in such a case. (I have not > tracked down what any of the per-processor values for SPRG0 > are/should-be.) > > > > > > > === > Mark Millard > markmi at dsl-only.net > > On Sep 25, 2014, at 2:12 AM, Mark Millard > wrote: > > The register dump that has no kernel stack addresses in any registers > does have register contents suggesting a ofwcall use, matching up > reasonably with the code I looked at that is related to ofwcall. > ofwcall is only reached via openfirmware_core from what I can tell. > (If there are other paths into openfirmware than via ofwcall then the > register dump suggests that they are not in use around the crash.) 
>
> And openfirmware_core has logic for exception vector swapping, going
> both directions:
>
>> static int
>> openfirmware_core(void *args)
>> {
>>     int result;
>>     register_t oldmsr;
>>
>>     /*
>>      * Turn off exceptions - we really don't want to end up
>>      * anywhere unexpected with PCPU set to something strange
>>      * or the stack pointer wrong.
>>      */
>>     oldmsr = intr_disable();
>>
>>     ofw_sprg_prepare();
>>
>>     /* Save trap vectors */
>>     ofw_save_trap_vec(save_trap_of);
>>
>>     /* Restore initially saved trap vectors */
>>     ofw_restore_trap_vec(save_trap_init);
>>
>> #if defined(AIM) && !defined(__powerpc64__)
>>     /*
>>      * Clear battable[] translations
>>      */
>>     if (!(cpu_features & PPC_FEATURE_64))
>>         __asm __volatile("mtdbatu 2, %0\n"
>>             "mtdbatu 3, %0" : : "r" (0));
>>     isync();
>> #endif
>>
>>     result = ofwcall(args);
>>
>>     /* Restore trap vecotrs */
>>     ofw_restore_trap_vec(save_trap_of);
>>
>>     ofw_sprg_restore();
>>
>>     intr_restore(oldmsr);
>>
>>     return (result);
>> }
>
> In turn openfirmware_core is used only by ofw_rendezvous_dispatch and
> in turn that is used only by openfirmware: only PCPU_GET(cpuid) == 0
> does the above. save_trap_init is initialized by powerpc_init using
> ofw_save_trap_vec.
>
> [Note that ofw_restore_trap_vec uses __syncicache which does not use
> dcbf after the bcopy but instead uses dcbst: That is part of what led
> my investigation into the distinction --and so to my more overall
> dcbst vs. dcbf use questions after proving dcbf would not be
> sufficient for a fix to the specific boot issue.]
>
> Unless the initialization of save_trap_init ends up with the wrong
> contents for openfirmware it would appear that the exception vectors
> are tracked correctly by the above code. But the above does assume that
> the openfirmware vectors are unchanged after save_trap_init is
> initialized: there is no attempt at tracking of any potential updates
> to the openfirmware exception vectors.
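A minimal sketch of the vector swapping described above, written as a user-space simulation rather than kernel code. Only the names save_trap_init and save_trap_of come from the quoted source; the arrays, sim_boot, sim_openfirmware_core, fw_updates_vector, and the integer values are hypothetical stand-ins chosen for illustration. It models the stated assumption: because save_trap_init is captured only once, a change that Open Firmware makes to its own vectors during one call is not seen on the next call.

```c
#include <string.h>

#define NVEC 4  /* arbitrary size for the simulated vector area */

static int trap_vectors[NVEC];    /* stands in for the live exception vectors */
static int save_trap_init[NVEC];  /* firmware vectors, captured once at boot */
static int save_trap_of[NVEC];    /* vectors in place just before each call */

/* Arbitrary marker values (assumptions for the simulation only). */
enum { FW_VEC = 1, KERNEL_VEC = 2, FW_UPDATED_VEC = 99 };

static void
save_vec(int *dst)
{
    memcpy(dst, trap_vectors, sizeof(trap_vectors));
}

static void
restore_vec(const int *src)
{
    memcpy(trap_vectors, src, sizeof(trap_vectors));
}

/* Boot: firmware vectors are live, are captured once, then the kernel installs its own. */
static void
sim_boot(void)
{
    int i;

    for (i = 0; i < NVEC; i++)
        trap_vectors[i] = FW_VEC;
    save_vec(save_trap_init);
    for (i = 0; i < NVEC; i++)
        trap_vectors[i] = KERNEL_VEC;
}

/* Same swap order as the quoted openfirmware_core(); fw_call stands in for ofwcall(). */
static int
sim_openfirmware_core(int (*fw_call)(void))
{
    int result;

    save_vec(save_trap_of);        /* keep the kernel's vectors */
    restore_vec(save_trap_init);   /* put the boot-time firmware vectors back */
    result = fw_call();
    restore_vec(save_trap_of);     /* kernel vectors in place again */
    return (result);
}

/* A simulated firmware call that reports the vector it sees, then changes it. */
static int
fw_updates_vector(void)
{
    int seen = trap_vectors[0];

    trap_vectors[0] = FW_UPDATED_VEC;
    return (seen);
}
```

In this model, calling sim_boot() and then sim_openfirmware_core(fw_updates_vector) twice returns the boot-time value both times: the update made during the first call is discarded when the kernel's vectors are restored.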
>
> I would infer then that the exception that DDB reports happened after
> ofw_restore_trap_vec(save_trap_of) is executed: That is when
> FreeBSD's exception vectors are again in place. But a stack pointer
> into the kernel stack is not then in place in any register (based on
> DDB's register dump): stack handling is messed up already by the point
> of the reported exception. And that may actually be why an illegal
> instruction at address zero was reached: an incorrect stack context
> used to get an address to execute at.
>
>
>
>
> ===
> Mark Millard
> markmi at dsl-only.net
>
> On Sep 24, 2014, at 8:36 AM, Nathan Whitehorn wrote:
>
> There shouldn't be any exceptions at that point, nested or otherwise.
> What I suspect is happening is that Open Firmware has turned them on
> for some bizarre reason, taken one, and ended up in the kernel's
> handlers but with the Open Firmware environment. Saving and restoring
> the OF interrupt vectors would be a possible solution; flattening the
> device tree in loader so that the kernel doesn't call Open Firmware at
> all would be another. I think Justin may have tried the first at some
> point.
> -Nathan
>
> On 09/24/14 02:04, Mark Millard wrote:
>> Now that I've had a kernel/boot crash with a successful DDB bt and
>> show registers (a different submittal) it makes for a good
>> comparison/contrast with what DDB reports for this "before copyright"
>> crash.
>>
>> Something unique to the "before copyright" context is...
>>
>> No registers are reported to have values that point into the range
>> between tmpstk and esym.
>>
>> In other words: There is no valid stack pointer reported as far as I
>> can tell. r1 has the value 0 instead of holding a valid
>> stack address. tmpstk=0xbd7000 and esym=0xbdb000 (example for one of
>> my WITH_DEBUG_FILES= and options DDB and GDB builds of 10.1-BETA2).
>> That at least gives a ball park on the range to expect for pointing
>> into the stack even with some build variation.
>>
>> It leaves me wondering if the DDB report is for nested exception
>> handling. That could explain why lr points to u_trap+0x10 and srr0
>> points to k_trap+0x28 when normally srr0 would point to the
>> failing instruction (or the instruction after) and lr to where that
>> routine would normally return to.
>>
>> The register values that are reported for my 10.1-BETA2 builds that
>> crash before the copyright notice are:
>>
>> r0: 0
>> r1: 0
>> r2: 0xc81538 vop_unlock_desc
>> r3: 0xd18868
>> r4: 0x894b58
>> r5: 0
>> r6: 0xc1dee0 M_AUDITBSM
>> r7: 0xe3f818 ofw_real_mode
>> r8: 0x1
>> r9: 0xe0f580 __pcpu
>> r10: 0x1c35ec0
>> r11: 0
>> r12: 0x10000000
>> r13: 0xdbb290 thread0 (Note: another submittal has this mistyped as
>> 0xdbb290.)
>> r14-r19: all 0
>> r20: 0x10c1000
>> r21: 0x4
>> r22: 0x180abd4
>> r23: 0x1803a28
>> r24: 0xc000000000008760
>> r25: 0xcc89b8 smp_no...
>> r26: 0xcea108 ofw_rend...
>> r27: 0x894b58 ofwcall+0xa8
>> r28: 0x894b58 ofwcall+0xa8
>> r29: 2400022
>> r30: 9000000000001032
>> r31: 0xbb7d38
>>
>> srr0: 0x102720 k_trap+0x28
>> srr1: 9000000000001032
>> lr: 0x1026f0 u_trap+0x10
>> ctr: 0xff846d78
>> cr: 2000deb0
>> xer: 0
>> dar: f...d50 (lots of f's)
>> dsisr: 42000000
>>
>>
>>
>>
>>
>>
>> ===
>> Mark Millard
>> markmi at dsl-only.net
>>
>> On Sep 20, 2014, at 3:42 PM, Mark Millard wrote:
>>
>> [I corrected the SSR0 in the subject to be SRR0.]
>>
>> I did miss a register in my list (it matched the shown r30 value).
>> And it turns out to probably be very important to interpreting what
>> the "show registers" is reporting:
>>
>> SRR1: 0x9000000000001032
>>
>> But bits 43-46 of SRR1 are supposed to indicate which type of Program
>> Exception, using a single binary 1 to do so. No such 1's are present.
>>
>> Illegal instruction would have been bit 44 being 1.
(PowerPC has the >> upper bit numbered zero and increases from there.) >> >> So the ddb "show registers" is apparently not reporting the status as >> of when the "stopped at 0 illegal instruction 0" happened. Thus other >> things are also likely not from that exact time frame. >> >> >> >> And I misinterpreted the LR value status: The LR value was just left >> over from the restore_kernsrs returning when it finished. Execution >> then flowed into k_trap. Nothing unusual involved. >> >> >> >> >> >> === >> Mark Millard >> markmi@dsl-only.net >> >> On Sep 18, 2014, at 8:57 PM, Mark Millard > > wrote: >> >> I modified DDB to automatically "show registers" even at the early >> "before Copyright" crash time. The end of this note will show the >> /usr/src/sys/ddb/db_script.c diff for the hack. While I also had DDB >> bt, the bt does not actually print a back trace for this context. (It >> might for others.) >> >> The registers give interesting context despite the lack of a back >> trace. I do not know if it will be sufficient to be of much immediate >> help if someone used the information to start looking at the problem. >> >> I'll start with register lr: 0x1026f0 u_trap+0x10. >> >> /usr/src/sys/powerpc/aim/trap_subr64.S has: >> >> s_trap: >> bf 17,k_trap /* branch if PSL_PR is false */ >> GET_CPUINFO(%r1) >> u_trap: >> ld %r1,PC_CURPCB(%r1) >> mr %r27,%r28 /* Save LR, r29 */ >> mtsprg2 %r29 >> bl restore_kernsrs /* enable kernel mapping */ >> mfsprg2 %r29 >> mr %r28,%r27 >> >> /* >> * Now the common trap catching code. >> */ >> k_trap: >> FRAME_SETUP(PC_TEMPSAVE) >> /* Call C interrupt dispatcher: */ >> trapagain: >> >> and so this appears to indicate a pending return to execute the >> "mfsprg2 %r29" after "bl restore_kernsrs", which indicates that >> restore_kernsrs should be active. >> >> But register srr0 indicates: 0x102720 k_trap+0x28. (So apparently in >> FRAME_SETUP(PC_TEMPSAVE) someplace.) 
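The two symbolized addresses above can be cross-checked with simple arithmetic. The sketch below is an illustration, not part of the original report: the helper names insn_addr and label_base are hypothetical, and it assumes that each of the six lines between the u_trap: and k_trap: labels in the quoted listing assembles to exactly one 4-byte instruction with no padding in between.

```c
#include <stdint.h>

#define PPC_INSN_BYTES 4u  /* PowerPC instructions are a fixed 4 bytes */

/* Address of the n-th instruction (0-based) following a label. */
static uint64_t
insn_addr(uint64_t label, unsigned int n)
{
    return (label + (uint64_t)n * PPC_INSN_BYTES);
}

/* Recover a label's address from a "label+offset" symbolization. */
static uint64_t
label_base(uint64_t addr, uint64_t offset)
{
    return (addr - offset);
}

/* Values reported by "show registers" in the message above. */
static const uint64_t reported_lr   = 0x1026f0;  /* u_trap+0x10 */
static const uint64_t reported_srr0 = 0x102720;  /* k_trap+0x28 */

/*
 * 0-based positions in the quoted listing, counting from u_trap
 * (assumption: one instruction per listed line).
 * ld=0, mr=1, mtsprg2=2, bl=3, mfsprg2=4, mr=5, then k_trap.
 */
enum { MFSPRG2_INDEX = 4, U_TRAP_INSN_COUNT = 6 };
```

Under those assumptions u_trap is 0x1026e0, the instruction after the bl is at 0x1026f0 (the reported lr), and k_trap comes out as 0x1026f8 both from u_trap plus six instructions and from srr0 minus 0x28, so the two symbolizations agree with each other.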
>> >> So it appears to me that the processor got to the k_trap code during >> the supposed restore_kernsrs time frame. (But I'm no expert at these >> sorts of things or for the processor.) >> >> I'll list the other register values: >> >> r0: 0 >> r1: 0 >> r2: 0xc1be80 M_AUDITBSM >> r3: 0xb16138 >> r4: 0x8926e8 .ofwcall+0xa8 >> r5: 0 >> r6: 0xbb5f90 >> r7: 0xe3d118 ofw_real_mode >> r8: 0x1 >> r9: 0xe0ce80 __pcpu >> r10: 0x1c35ec9 >> r11: 0 >> r12: 0x10000000 >> r13: db890 thread0 >> r14-r19: all 0 >> r20: 0x10bc000 >> r21: 0x4 >> r22: 0x1801db4 >> r23: 0x1803a28 >> r24: 0xc000000000008760 >> r25: 0xcc6908 smp_no_rendevous_barrier >> r26: 0xec79e0 ofw_rendezvous_dispatch (yep one has v and the other zv) >> r27: 0x8926e8 .ofwcall+0xa8 >> r28: 0x8926e8 .ofwcall+0xa8 (yep: same value) >> r29: 0x24000022 >> r30: 0x9000000000001032 >> r31: 0xc7f488 vop_unlock_desc >> >> ctr: 0xff846d78 >> cr: 0x2000d7b0 >> xer: 0 >> dar: 0xfffffffffffffd50 >> dsisr: 0x42000000 >> >> (Hopefully this manual transcription from the screen display is >> complete --and also accurate for what it does present.) >> >> >> >> >> The personal HACK to /usr/src/sys/ddb/db_script.c's >> db_script_kdbenter(...) to have it show registers and try bt... >> >> $ cd /usr/src/sys/ddb/ >> $ svnlite diff . >> Index: db_script.c >> =================================================================== >> --- db_script.c(revision 271610) >> +++ db_script.c(working copy) >> @@ -319,10 +319,25 @@ >> { >> char scriptname[DB_MAXSCRIPTNAME]; >> >> +/* HACK!!! : Additional lines to force a basic default script to exist. >> +* Will dump information even if ddb input is not available for early >> crash. >> +* Used to get more information about PowerMac G5 "before Copyright" >> hangs. 
>> +*/ >> +struct ddb_script *dsp = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT); >> +if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; >> bt"); >> + >> snprintf(scriptname, sizeof(scriptname), "%s.%s", >> DB_SCRIPT_KDBENTER_PREFIX, eventname); >> if (db_script_exec(scriptname, 0) == ENOENT) >> (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); >> + >> +/* HACK!!! : Additional lines to always use the default script, >> +* even if scriptname existed and was executed. >> +* Will dump information even if ddb input is not available for early >> crash. >> +* Used to get more information about PowerMac G5 "before Copyright" >> hangs. >> +*/ >> +else >> +(void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); >> } >> >> /*- >> >> >> >> === >> Mark Millard >> markmi at dsl-only.net >> >> On Sep 16, 2014, at 9:28 PM, Mark Millard > > wrote: >> >> In part I sent directly to you because of a past exchange (July-27) >> where you had written: >> >>> Nathan and I both speculate that it's >>> dropping into Open Firmware (we make extensive use of OFW), and then >>> messing something up, taking a page fault or something. >> >> The specific text that I report and its uniformity when it is >> produced seems to add a little information beyond a speculated "page >> fault or something" and so might eventually help a little. As I >> understand the text it is reporting execution reaching address zero >> without any prior un-handled exceptions or other such that would stop >> it. A corrupted stack (pointer) so a bad return address or some such? >> I'd guess there are no explicit jumps to address zero so I expect >> that indirection is likely involved, with the content for the >> indirection messed up. >> >> I really wish that I had a logic analyzer configuration for this. >> I've not found a way to make the failing context visible so far and >> the extra way of looking at things might have helped. 
>>
>>
>>
>>
>> ===
>> Mark Millard
>> markmi@dsl-only.net
>>
>> On Sep 16, 2014, at 8:28 PM, Justin Hibbits wrote:
>>
>> Hi mark,
>>
>> I see this on my G5, and I think it's due to the amount of RAM in the
>> machine. More than 4gb seems to confuse open firmware when called by
>> FreeBSD. There is some effort to remove the need of the callbacks but
>> thus far it's not far along. The good news is that after it boots
>> it's solid except when switching vtys, but earlier this year or last
>> year I added a sysctl hack to disable the call into open firmware on
>> vty switch (don't recall offhand and not at my computer right now,
>> but if you grep the sysctl output for reset and ofw you can find it).
>>
>> -Justin
>>
>> On Sep 16, 2014 8:01 PM, "Mark Millard" wrote:
>>
>> I've now spent time with rebooting and power-off/power-on for all
>> 3 PowerMac G5's (one PowerMac7,2 and two PowerMac11,2's) and all
>> 3 get the
>>
>>> GDB: no debug ports present
>>> KDB: debugger backends: DDB
>>> KDB: current backend: DDB
>>> [ thread pid -1 tid 1006665719 ]
>>> Stopped at 0: illegal instruction 0
>>> db>
>>
>> when they fail just before the Copyright notice would normally be
>> displayed. None fail any earlier. At that spot none have failed
>> any other way. It is the same SSD in all 3. (Happens with other
>> SSD's as well.) Overall there is a mix of Radeon and NVIDIA
>> display boards. Besides the SSD use and RAM upgrades the rest is
>> stock equipment. syscons used, not vt. (I've yet to try vt.)
>>
>> Seeing a failure after the Copyright notice has been fairly rare
>> in all my experiments from when I started last April or so. The
>> ones that I've noted had Data Storage Interrupt reported. So far
>> no examples of the above have been reported after the Copyright
>> notice. So I'd guess that they are separate issues.
Of course it >> seems that only in the last few days would I have seen the above >> sort of thing if it did happen after the Copyright notice: The >> prior history does not count for judgements about that. >> >> === >> Mark Millard >> markmi at dsl-only.net >> >> On Sep 16, 2014, at 8:15 AM, Mark Millard > > wrote: >> >> Using 10.1-BETA1 I added "options DDB" and "options GDB" to >> powerpc64's GENERIC64. (I also used WITH_DEBUG_FILES=, >> WITHOUT_CLANG=, and WITH_DEBUG= in /etc/make.conf.) So >> buildworld, kernel was basically just set up to have more of a >> debugging context around (including for any ports builds). >> >> The result was new information about the PowerMac G5 boot >> hangups: The screen is no longer blank when the G5 is hung up >> without there being a Copyright notice yet. It says... >> >>> GDB: no debug ports present >>> KDB: debugger backends: DDB >>> KDB: current backend: DDB >>> [ thread pid -1 tid 1006665719 ] >>> Stopped at 0: illegal instruction 0 >>> db> >> >> (I had no ability to input at that point.) Normally the Copyright >> notice would have displayed instead of "[...]" and what follows. >> (I do not claim to have all the spacing, capitalization, and such >> correct above.) >> >> That text is constant from hang to hang when it hangs just before >> it would normally output the Copyright notice: The numbers do not >> vary, much less the other text. It has never failed until after >> the two KDB messages are present. So far I've only tested one >> PowerMac G5, booting over and over for a few hours. >> >> >> >> (I do not claim to be set up for remote kernel debugging. I just >> decided to let GDB go along for the ride when I added DDB.) 
>> >> === >> Mark Millard >> markmi at dsl-only.net >> >> >> >> >> >> > > > > From owner-freebsd-ppc@FreeBSD.ORG Thu Sep 25 21:41:47 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7D5775B1 for ; Thu, 25 Sep 2014 21:41:47 +0000 (UTC) Received: from asp.reflexion.net (outbound-242.asp.reflexion.net [69.84.129.242]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5B67266D for ; Thu, 25 Sep 2014 21:41:45 +0000 (UTC) Received: (qmail 3511 invoked from network); 25 Sep 2014 21:41:44 -0000 Received: from unknown (HELO mail-cs-01.app.dca.reflexion.local) (10.81.19.1) by 0 (rfx-qmail) with SMTP; 25 Sep 2014 21:41:44 -0000 Received: by mail-cs-01.app.dca.reflexion.local (Reflexion email security v7.30.7) with SMTP; Thu, 25 Sep 2014 17:41:44 -0400 (EDT) Received: (qmail 18190 invoked from network); 25 Sep 2014 21:41:35 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (DHE-RSA-AES256-SHA encrypted) SMTP; 25 Sep 2014 21:41:35 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-98-246-178-138.hsd1.or.comcast.net [98.246.178.138]) by iron2.pdx.net (Postfix) with ESMTPSA id BC8421C402B; Thu, 25 Sep 2014 14:41:29 -0700 (PDT) Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: lr=u_trap+0x10 and srr0=k_trap+0x28 for "stopped at 0 illegal instruction 0" before-copyright hang on PowerMac G5's From: Mark Millard In-Reply-To: <54248467.4050900@freebsd.org> Date: Thu, 25 Sep 2014 14:41:29 -0700 Message-Id: References: <1118046C-0FF7-49FC-82DA-DB9A7A310991@dsl-only.net> <2ED3DB50-B985-4382-8FF2-3B44E7E65453@dsl-only.net> <6D729F43-662A-429E-9503-0148EC3250B1@dsl-only.net> 
<72535F89-3942-45A6-B351-7F746209ED9F@dsl-only.net> <0703EF26-6E33-4446-9273-BBFD0CB72893@dsl-only.net> <37575F94-763C-43BF-8DD9-F648F4A7C09F@dsl-only.net> <5422E513.6010806@freebsd.org> <1C02D0D4-14B8-465F-B493-4D3A64E4C35C@dsl-only.net> <0DF8A9EC-C81C-4E15-9420-6831BA7D5F8E@dsl-only.net> <54248467.4050900@freebsd.org> To: Nathan Whitehorn X-Mailer: Apple Mail (2.1878.6) Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: Justin Hibbits , FreeBSD PowerPC ML X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Sep 2014 21:41:47 -0000 The first boot after make -8 kernel without quiesce also died during = peer, I'd guess the same one. Looks like quiesce does not matter for the issue. (But it is handy for = identifying which peer fails.) =3D=3D=3D Mark Millard markmi at dsl-only.net On Sep 25, 2014, at 2:08 PM, Nathan Whitehorn wrote: Can you comment out the call to quiesce? It may not be necessary on your = system. -Nathan On 09/25/14 13:17, Mark Millard wrote: > The "before copyright" hang/exception is during the first openfirmware = "peer" after "quiesce". The ofw_restore_trap_vec(save_trap_init) = completes fine, the ofwcall(args) is made but it does not return = normally. >=20 > Ignoring the ofwcall's from before quiesce, the sequence of ofwcall's = is: >=20 > quiesce > finddevice > parent > getprop > getprop > getprop > finddevice > getprop > instance-to-package > getproplen > finddevice > getprop > getprop > peer >=20 > And when the boot fails before the copyright that ofwcall for peer = ends up resulting in the register dump with no register pointing to the = kernel's normal stack area. >=20 > I still have no clue what is happening during peer. 
= ofw_restore_trap_vec(save_trap_init) is being called and is returning = before ofwcall is used. For all I know some uses of peer could require = not being quiesce'd in order for peer to be reliable. >=20 > In the form of my display indicating what executed the text reported = ends in: >=20 > ^ >=20 > where the ^ indicates the stage that last completed in the call = sequence inside openfirmware_core. This information is displayed by the >=20 > x/s ofw_name_history >=20 > in the automatically created default script for DDB. I read the = sequence backwards from the end marker (here ^), following the = wraparound if there is that much text and if I care to go back that far. >=20 > FreeBSD FBSDG5M1 10.1-BETA2 FreeBSD 10.1-BETA2 #11 r271944M: Thu Sep = 25 12:14:05 PDT 2014 root@FBSDG5M1:/usr/obj/usr/src/sys/GENERIC64 = powerpc >=20 > My current hacks to get this information are: >=20 > Index: /usr/src/sys/ddb/db_script.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/sys/ddb/db_script.c (revision 271944) > +++ /usr/src/sys/ddb/db_script.c (working copy) > @@ -319,10 +319,25 @@ > { > char scriptname[DB_MAXSCRIPTNAME]; > =20 > + /* HACK!!! : Additional lines to force a basic default script to = exist. > + * Will dump information even if ddb input is not available for = early crash. > + * Used to get more information about PowerMac G5 "before Copyright" = hangs. > + */ > + struct ddb_script *dsp =3D = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT); > + if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; = bt; x/s ofw_name_history"); > + > snprintf(scriptname, sizeof(scriptname), "%s.%s", > DB_SCRIPT_KDBENTER_PREFIX, eventname); > if (db_script_exec(scriptname, 0) =3D=3D ENOENT) > (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); > + > + /* HACK!!! 
: Additional lines to always use the default script, > + * even if scriptname existed and was executed. > + * Will dump information even if ddb input is not available for = early crash. > + * Used to get more information about PowerMac G5 "before Copyright" = hangs. > + */ > + else > + (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); > } > =20 > /*- > Index: /usr/src/sys/powerpc/conf/GENERIC64 > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/sys/powerpc/conf/GENERIC64 (revision 271944) > +++ /usr/src/sys/powerpc/conf/GENERIC64 (working copy) > @@ -76,6 +76,8 @@ > # Debugging support. Always need this: > options KDB # Enable kernel debugger support. > options KDB_TRACE # Print a stack trace for a panic. > +options DDB > +options GDB > =20 > # Make an SMP-capable kernel by default > options SMP # Symmetric MultiProcessor Kernel > Index: /usr/src/sys/powerpc/ofw/ofw_machdep.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/sys/powerpc/ofw/ofw_machdep.c (revision 271944) > +++ /usr/src/sys/powerpc/ofw/ofw_machdep.c (working copy) > @@ -324,6 +324,12 @@ > openfirmware(&args); > } > =20 > +/* Part of HACK to have record of ofw call names */ > +#define ofw_name_history_record_size 256 > +char ofw_name_history[ofw_name_history_record_size+1] =3D {}; /* = Initially: automatically '\0' filled */ > +char * ofw_name_history_pos =3D ofw_name_history; > +/* End Part of HACK */ > + > static int > openfirmware_core(void *args) > { > @@ -330,6 +336,42 @@ > int result; > register_t oldmsr; > =20 > + { /* HACK to have record of ofw call names */ > + struct argtype_prefix { > + cell_t name; > + }; > + > + char *name =3D (char*) 
(uintptr_t) (((struct = argtype_prefix*)args)->name); > +=20 > + int i; > + > + *ofw_name_history_pos =3D '<'; > + > + for(i=3D0; (*name) && i!=3D20; i++) { > + ofw_name_history_pos++; > + if (ofw_name_history_pos =3D=3D = &ofw_name_history[ofw_name_history_record_size]) { > + ofw_name_history_pos =3D ofw_name_history; > + } > + *ofw_name_history_pos =3D *name; > + > + name++; > + } > + > + ofw_name_history_pos++; > + if (ofw_name_history_pos =3D=3D = &ofw_name_history[ofw_name_history_record_size]) { > + ofw_name_history_pos =3D ofw_name_history; > + } > + *ofw_name_history_pos =3D '>'; > + > + ofw_name_history_pos++; > + if (ofw_name_history_pos =3D=3D = &ofw_name_history[ofw_name_history_record_size]) { > + ofw_name_history_pos =3D ofw_name_history; > + } > + *ofw_name_history_pos =3D '@'; > + > + ofw_name_history[ofw_name_history_record_size] =3D '\0'; /* Paranoia = */ > + } /* HACK end */ > + > /* > * Turn off exceptions - we really don't want to end up > * anywhere unexpected with PCPU set to something strange > @@ -337,14 +379,22 @@ > */ > oldmsr =3D intr_disable(); > =20 > + *ofw_name_history_pos =3D '#'; /* HACK */ > + > ofw_sprg_prepare(); > =20 > + *ofw_name_history_pos =3D '$'; /* HACK */ > + > /* Save trap vectors */ > ofw_save_trap_vec(save_trap_of); > =20 > + *ofw_name_history_pos =3D '%'; /* HACK */ > + > /* Restore initially saved trap vectors */ > ofw_restore_trap_vec(save_trap_init); > =20 > + *ofw_name_history_pos =3D '^'; /* HACK */ > + > #if defined(AIM) && !defined(__powerpc64__) > /* > * Clear battable[] translations > @@ -357,13 +407,21 @@ > =20 > result =3D ofwcall(args); > =20 > + *ofw_name_history_pos =3D '&'; /* HACK */ > + > /* Restore trap vecotrs */ > ofw_restore_trap_vec(save_trap_of); > =20 > + *ofw_name_history_pos =3D '*'; /* HACK */ > + > ofw_sprg_restore(); > =20 > + *ofw_name_history_pos =3D '~'; /* HACK */ > + > intr_restore(oldmsr); > =20 > + *ofw_name_history_pos =3D '!'; /* HACK */ > + > return (result); > } >=20 >=20 >=20 
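The history-recording hack in the diff above can be tried outside the kernel. What follows is a minimal user-space sketch, assuming the same layout as the patch: a fixed 256-byte wraparound buffer, names truncated to 20 characters, and a single stage-marker slot that the next call's '<' overwrites. The names struct history, history_init, history_begin, and history_stage are hypothetical and appear in no FreeBSD source. It only illustrates how the buffer comes to read as a run of <name> entries ending in the last completed stage marker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HISTORY_SIZE 256 /* same record size as the patch */

/* Hypothetical user-space stand-in for ofw_name_history and its cursor. */
struct history {
    char buf[HISTORY_SIZE + 1]; /* final byte is never written, stays '\0' */
    size_t pos;                 /* index of the most recently written byte */
};

static void history_init(struct history *h)
{
    memset(h, 0, sizeof(*h));
}

/* Step the cursor forward, wrapping to the start of the buffer. */
static void history_advance(struct history *h)
{
    h->pos = (h->pos + 1) % HISTORY_SIZE;
}

/*
 * Record "<name>" and leave the cursor on a marker slot holding '@'.
 * The '<' lands on the previous call's marker slot, as in the patch.
 */
static void history_begin(struct history *h, const char *name)
{
    size_t i;

    h->buf[h->pos] = '<';
    for (i = 0; name[i] != '\0' && i < 20; i++) {
        history_advance(h);
        h->buf[h->pos] = name[i];
    }
    history_advance(h);
    h->buf[h->pos] = '>';
    history_advance(h);
    h->buf[h->pos] = '@';
}

/* Overwrite the marker slot with the stage that last completed. */
static void history_stage(struct history *h, char marker)
{
    h->buf[h->pos] = marker;
}
```

Calling history_begin(&h, "quiesce"), history_stage(&h, '!'), history_begin(&h, "peer"), and history_stage(&h, '^') in that order leaves the buffer reading <quiesce><peer>^, which is the shape of the text described for the failing boot.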
>=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 > On Sep 25, 2014, at 3:46 AM, Mark Millard = wrote: >=20 > One source code oddity that I notice is the following mixed use of = ofw_real_mode: always tested vs. never tested (#if 0 ... #endif) ... >=20 >> /* >> * Saved SPRG0-3 from OpenFirmware. Will be restored prior to the = callback. >> */ >> register_t ofw_sprg0_save; >>=20 >> static __inline void >> ofw_sprg_prepare(void) >> { >> if (ofw_real_mode) >> return; >>=20 >> /* >> * Assume that interrupt are disabled at this point, or >> * SPRG1-3 could be trashed >> */ >> __asm __volatile("mfsprg0 %0\n\t" >> "mtsprg0 %1\n\t" >> "mtsprg1 %2\n\t" >> "mtsprg2 %3\n\t" >> "mtsprg3 %4\n\t" >> : "=3D&r"(ofw_sprg0_save) >> : "r"(ofmsr[1]), >> "r"(ofmsr[2]), >> "r"(ofmsr[3]), >> "r"(ofmsr[4])); >> } >> =20 >> static __inline void >> ofw_sprg_restore(void) >> { >> #if 0 >> if (ofw_real_mode) >> return; >> #endif >>=20 >> /* >> * Note that SPRG1-3 contents are irrelevant. They are = scratch >> * registers used in the early portion of trap handling when >> * interrupts are disabled. >> * >> * PCPU data cannot be used until this routine is called ! >> */ >> __asm __volatile("mtsprg0 %0" :: "r"(ofw_sprg0_save)); >> } >=20 > It would seem that for ofw_real_mode !=3D 0 that ofw_sprg_prepare = would never set up ofw_sprg0_save (via mfsprg0) for the later = ofw_sprg_restore's always-executed mtsprg0 that is based on = ofw_sprg0_save. >=20 > register_t seems to trace back to __int64_t --and that would leave = ofw_sprg0_save initialized to zero as a global and that would have to be = okay as the SPRG0 value to restore in such a case. (I have not tracked = down what any of the per-processor values for SPRG0 are/should-be.) 
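The asymmetry described above can be modeled in user space. This is only a sketch under stated assumptions: plain variables stand in for the SPRG0 register, ofw_sprg0_save, and ofw_real_mode; SPRG1-3 are left out; and every model_* name is hypothetical. It shows the consequence of the early return in the prepare step when the restore step has no matching check. It makes no claim about what value SPRG0 should hold on real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hardware register and the kernel globals. */
static uint64_t model_sprg0;      /* models the SPRG0 register */
static uint64_t model_sprg0_save; /* models ofw_sprg0_save */
static int model_real_mode;       /* models ofw_real_mode */

static void model_sprg_prepare(uint64_t firmware_sprg0)
{
    if (model_real_mode)
        return;                     /* early return: nothing is saved */
    model_sprg0_save = model_sprg0; /* models mfsprg0 */
    model_sprg0 = firmware_sprg0;   /* models mtsprg0 */
}

static void model_sprg_restore(void)
{
    /* No real-mode check, matching the "#if 0" in the quoted code. */
    model_sprg0 = model_sprg0_save; /* models mtsprg0 */
}

/* Run one prepare/restore pair and return the resulting SPRG0 value. */
static uint64_t model_round_trip(int real_mode, uint64_t kernel_sprg0,
    uint64_t firmware_sprg0)
{
    model_real_mode = real_mode;
    model_sprg0 = kernel_sprg0;
    model_sprg0_save = 0;           /* zero-initialized global */
    model_sprg_prepare(firmware_sprg0);
    model_sprg_restore();
    return model_sprg0;
}
```

With real_mode set to 0 the kernel value survives the round trip; with real_mode set to 1 the model ends with SPRG0 equal to the zero-initialized save slot, which is the case the paragraph above asks about.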
>=20 >=20 >=20 >=20 >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 > On Sep 25, 2014, at 2:12 AM, Mark Millard = wrote: >=20 > The register dump that has no kernel stack addresses in any registers = does have register contents suggesting a ofwcall use, matching up = reasonably with the code I looked at that is related to ofwcall. ofwcall = is only reached via openfirmware_core from what I can tell. (If there = are other paths into openfirmware than via ofwcall then the register = dump suggests that they are not in use around the crash.) >=20 > And openfirmware_core has logic for exception vector swapping, going = both directions: >=20 >> static int >> openfirmware_core(void *args) >> { >> int result; >> register_t oldmsr; >> =20 >> /* >> * Turn off exceptions - we really don't want to end up >> * anywhere unexpected with PCPU set to something strange >> * or the stack pointer wrong. >> */ >> oldmsr =3D intr_disable(); >> =20 >> ofw_sprg_prepare(); >> =20 >> /* Save trap vectors */ >> ofw_save_trap_vec(save_trap_of); >> =20 >> /* Restore initially saved trap vectors */ >> ofw_restore_trap_vec(save_trap_init); >> =20 >> #if defined(AIM) && !defined(__powerpc64__) >> /* >> * Clear battable[] translations >> */ >> if (!(cpu_features & PPC_FEATURE_64)) >> __asm __volatile("mtdbatu 2, %0\n" >> "mtdbatu 3, %0" : : "r" (0)); >> isync(); >> #endif >>=20 >> result =3D ofwcall(args); >>=20 >> /* Restore trap vecotrs */ >> ofw_restore_trap_vec(save_trap_of); >>=20 >> ofw_sprg_restore(); >>=20 >> intr_restore(oldmsr); >>=20 >> return (result); >> } >=20 > In turn openfirmware_core is used only by ofw_rendezvous_dispatch and = in turn that is used only by openfirmware: only PCPU_GET(cpuid) =3D=3D 0 = does the above. save_trap_init is initialized by powerpc_init using = ofw_save_trap_vec. 
>=20 > [Note that ofw_restore_trap_vec uses __syncicache which does not use = dcbf after the bcopy but instead uses dcbst: That is part of what led = my investigation into the distinction --and so to my more overall dcbst = vs. dcbf use questions after proving dcbf would not be sufficient for a = fix to the specific boot issue.] >=20 > Unless the initialization of save_trap_init ends up with the wrong = contents for openfirmware it would appear that the exception vectors are = tracked properly by the above code. But the above does assume that the = openfirmware vectors are unchanged after save_trap_init is initialized: = there is no attempt at tracking any potential updates to the = openfirmware exception vectors. >=20 > I would infer then that the exception that DDB reports happened after = ofw_restore_trap_vec(save_trap_of) is executed: That is when = FreeBSD's exception vectors are again in place. But a stack pointer into = the kernel stack is not then in place in any register (based on DDB's = register dump): stack handling is messed up already by the point of the = reported exception. And that may actually be why an illegal instruction = at address zero was reached: an incorrect stack context used to get an = address to execute at. >=20 >=20 >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 > On Sep 24, 2014, at 8:36 AM, Nathan Whitehorn wrote: >=20 > There shouldn't be any exceptions at that point, nested or otherwise. = What I suspect is happening is that Open Firmware has turned them on for = some bizarre reason, taken one, and ended up in the kernel's handlers = but with the Open Firmware environment. Saving and restoring the OF = interrupt vectors would be a possible solution; flattening the device = tree in loader so that the kernel doesn't call Open Firmware at all = would be another. I think Justin may have tried the first at some point. 
> -Nathan >=20 > On 09/24/14 02:04, Mark Millard wrote: >> Now that I've had a kernel/boot crash with a successful DDB bt and = show registers (a different submittal) it makes for a good = comparison/contrast with what DDB reports for this "before copyright" = crash. >>=20 >> Something unique to the "before copyright" context is... >>=20 >> No registers are reported to have values that point into the range = between tmpstk and esym. >>=20 >> In other words: There is no valid stack pointer reported as far as I = can tell. r1 has the value 0 instead of holding a valid stack = address. tmpstk=3D0xbd7000 and esym=3D0xbdb000 (example for one of my = WITH_DEBUG_FILES=3D and options DDB and GDB builds of 10.1-BETA2). That = at least gives a ballpark on the range to expect for pointing into the = stack even with some build variation. >>=20 >> It leaves me wondering if the DDB report is for nested exception = handling. That could explain why lr points to u_trap+0x10 and srr0 = points to k_trap+0x28 when normally srr0 would point to the failing = instruction (or the instruction after) and lr to where that routine = would normally return to. >>=20 >> The register values that are reported for my 10.1-BETA2 builds that = crash before the copyright notice are: >>=20 >> r0: 0 >> r1: 0 >> r2: 0xc81538 vop_unlock_desc >> r3: 0xd18868 >> r4: 0x894b58 >> r5: 0 >> r6: 0xc1dee0 M_AUDITBSM >> r7: 0xe3f818 ofw_real_mode >> r8: 0x1 >> r9: 0xe0f580 __pcpu >> r10: 0x1c35ec0 >> r11: 0 >> r12: 0x10000000 >> r13: 0xdbb290 thread0 (Note: another submittal has this mistyped as = 0xdbb290.) >> r14-r19: all 0 >> r20: 0x10c1000 >> r21: 0x4 >> r22: 0x180abd4 >> r23: 0x1803a28 >> r24: 0xc000000000008760 >> r25: 0xcc89b8 smp_no... >> r26: 0xcea108 ofw_rend... 
>> r27: 0x894b58 ofwcall+0xa8 >> r28: 0x894b58 ofwcall+0xa8 >> r29: 2400022 >> r30: 9000000000001032 >> r31: 0xbb7d38 >>=20 >> srr0: 0x102720 k_trap+0x28 >> srr1: 9000000000001032 >> lr: 0x1026f0 u_trap+0x10 >> ctr: 0xff846d78 >> cr: 2000deb0 >> xer: 0 >> dar: f...d50 (lots of f's) >> dsisr: 42000000 >>=20 >>=20 >>=20 >>=20 >>=20 >>=20 >> =3D=3D=3D >> Mark Millard >> markmi at dsl-only.net >>=20 >> On Sep 20, 2014, at 3:42 PM, Mark Millard = wrote: >>=20 >> [I corrected the SSR0 in the subject to be SRR0.] >>=20 >> I did miss a register in my list (it matched the shown r30 value). = And it turns out to probably be very important to interpreting what the = "show registers" is reporting: >>=20 >> SRR1: 0x9000000000001032 >>=20 >> But bits 43-46 of SRR1 are supposed to indicate which type of Program = Exception, using a single binary 1 to do so. No such 1's are present. >>=20 >> Illegal instruction would have been bit 44 being 1. (PowerPC has the = upper bit numbered zero and increases from there.) >>=20 >> So the ddb "show registers" is apparently not reporting the status as = of when the "stopped at 0 illegal instruction 0" happened. Thus other = things are also likely not from that exact time frame. >>=20 >>=20 >>=20 >> And I misinterpreted the LR value status: The LR value was just left = over from the restore_kernsrs returning when it finished. Execution then = flowed into k_trap. Nothing unusual involved. >>=20 >>=20 >>=20 >>=20 >>=20 >> =3D=3D=3D >> Mark Millard >> markmi@dsl-only.net >>=20 >> On Sep 18, 2014, at 8:57 PM, Mark Millard = wrote: >>=20 >> I modified DDB to automatically "show registers" even at the early = "before Copyright" crash time. The end of this note will show the = /usr/src/sys/ddb/db_script.c diff for the hack. While I also had DDB bt, = the bt does not actually print a back trace for this context. (It might = for others.) >>=20 >> The registers give interesting context despite the lack of a back = trace.
I do not know if it will be sufficient to be of much immediate = help if someone used the information to start looking at the problem. >>=20 >> I'll start with register lr: 0x1026f0 u_trap+0x10. >>=20 >> /usr/src/sys/powerpc/aim/trap_subr64.S has: >>=20 >> s_trap: >> bf 17,k_trap /* branch if PSL_PR is false = */ >> GET_CPUINFO(%r1) >> u_trap: >> ld %r1,PC_CURPCB(%r1) >> mr %r27,%r28 /* Save LR, r29 */ >> mtsprg2 %r29 >> bl restore_kernsrs /* enable kernel mapping */ >> mfsprg2 %r29 >> mr %r28,%r27 >>=20 >> /* >> * Now the common trap catching code. >> */ >> k_trap: >> FRAME_SETUP(PC_TEMPSAVE) >> /* Call C interrupt dispatcher: */ >> trapagain: >>=20 >> and so this appears to indicate a pending return to execute the = "mfsprg2 %r29" after "bl restore_kernsrs", which indicates that = restore_kernsrs should be active. >>=20 >> But register srr0 indicates: 0x102720 k_trap+0x28. (So apparently in = FRAME_SETUP(PC_TEMPSAVE) someplace.) >>=20 >> So it appears to me that the processor got to the k_trap code during = the supposed restore_kernsrs time frame. (But I'm no expert at these = sorts of things or for the processor.) 
>>=20 >> I'll list the other register values: >>=20 >> r0: 0 >> r1: 0 >> r2: 0xc1be80 M_AUDITBSM >> r3: 0xb16138 >> r4: 0x8926e8 .ofwcall+0xa8 >> r5: 0 >> r6: 0xbb5f90 >> r7: 0xe3d118 ofw_real_mode >> r8: 0x1 >> r9: 0xe0ce80 __pcpu >> r10: 0x1c35ec9 >> r11: 0 >> r12: 0x10000000 >> r13: db890 thread0 >> r14-r19: all 0 >> r20: 0x10bc000 >> r21: 0x4 >> r22: 0x1801db4 >> r23: 0x1803a28 >> r24: 0xc000000000008760 >> r25: 0xcc6908 smp_no_rendevous_barrier >> r26: 0xec79e0 ofw_rendezvous_dispatch (yep one has v and the other = zv) >> r27: 0x8926e8 .ofwcall+0xa8 >> r28: 0x8926e8 .ofwcall+0xa8 (yep: same value) >> r29: 0x24000022 >> r30: 0x9000000000001032 >> r31: 0xc7f488 vop_unlock_desc >>=20 >> ctr: 0xff846d78 >> cr: 0x2000d7b0 >> xer: 0 >> dar: 0xfffffffffffffd50 >> dsisr: 0x42000000 >>=20 >> (Hopefully this manual transcription from the screen display is = complete --and also accurate for what it does present.) >>=20 >>=20 >>=20 >>=20 >> The personal HACK to /usr/src/sys/ddb/db_script.c's = db_script_kdbenter(...) to have it show registers and try bt... >>=20 >> $ cd /usr/src/sys/ddb/ >> $ svnlite diff . >> Index: db_script.c >> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >> --- db_script.c (revision 271610) >> +++ db_script.c (working copy) >> @@ -319,10 +319,25 @@ >> { >> char scriptname[DB_MAXSCRIPTNAME]; >> =20 >> + /* HACK!!! : Additional lines to force a basic default script to = exist. >> + * Will dump information even if ddb input is not available for = early crash. >> + * Used to get more information about PowerMac G5 "before = Copyright" hangs. 
>> + */ >> + struct ddb_script *dsp =3D = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT); >> + if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show = registers; bt"); >> + >> snprintf(scriptname, sizeof(scriptname), "%s.%s", >> DB_SCRIPT_KDBENTER_PREFIX, eventname); >> if (db_script_exec(scriptname, 0) =3D=3D ENOENT) >> (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); >> + >> + /* HACK!!! : Additional lines to always use the default script, >> + * even if scriptname existed and was executed. >> + * Will dump information even if ddb input is not available for = early crash. >> + * Used to get more information about PowerMac G5 "before = Copyright" hangs. >> + */ >> + else >> + (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); >> } >> =20 >> /*- >>=20 >>=20 >>=20 >> =3D=3D=3D >> Mark Millard >> markmi at dsl-only.net >>=20 >> On Sep 16, 2014, at 9:28 PM, Mark Millard = wrote: >>=20 >> In part I sent directly to you because of a past exchange (July-27) = where you had written: >>=20 >>> Nathan and I both speculate that it's >>> dropping into Open Firmware (we make extensive use of OFW), and then >>> messing something up, taking a page fault or something. >>=20 >> The specific text that I report and its uniformity when it is = produced seems to add a little information beyond a speculated "page = fault or something" and so might eventually help a little. As I = understand the text it is reporting execution reaching address zero = without any prior un-handled exceptions or other such that would stop = it. A corrupted stack (pointer) so a bad return address or some such? = I'd guess there are no explicit jumps to address zero so I expect that = indirection is likely involved, with the content for the indirection = messed up. >>=20 >> I really wish that I had a logic analyzer configuration for this. = I've not found a way to make the failing context visible so far and the = extra way of looking at things might have helped. 
>>=20 >>=20 >>=20 >>=20 >> =3D=3D=3D >> Mark Millard >> markmi@dsl-only.net >>=20 >> On Sep 16, 2014, at 8:28 PM, Justin Hibbits = wrote: >>=20 >> Hi Mark, >>=20 >> I see this on my G5, and I think it's due to the amount of RAM in the = machine. More than 4 GB seems to confuse Open Firmware when called by = FreeBSD. There is some effort to remove the need for the callbacks but = thus far it's not far along. The good news is that after it boots it's = solid except when switching vtys, but earlier this year or last year I = added a sysctl hack to disable the call into Open Firmware on vty switch = (don't recall offhand and not at my computer right now, but if you grep = the sysctl output for reset and ofw you can find it). >>=20 >> -Justin >>=20 >> On Sep 16, 2014 8:01 PM, "Mark Millard" wrote: >> I've now spent time with rebooting and power-off/power-on for all 3 = PowerMac G5's (one PowerMac7,2 and two PowerMac11,2's) and all 3 get the >>=20 >>> GDB: no debug ports present >>> KDB: debugger backends: DDB >>> KDB: current backend: DDB >>> [ thread pid -1 tid 1006665719 ] >>> Stopped at 0: illegal instruction 0 >>> db> >>=20 >> when they fail just before the Copyright notice would normally be = displayed. None fail any earlier. At that spot none have failed any = other way. It is the same SSD in all 3. (Happens with other SSD's as = well.) Overall there is a mix of Radeon and NVIDIA display boards. = Besides the SSD use and RAM upgrades the rest is stock equipment. scons = used, not vt. (I've yet to try vt.) >>=20 >> Seeing a failure after the Copyright notice has been fairly rare in = all my experiments from when I started last April or so. The ones that = I've noted had Data Storage Interrupt reported. So far no examples of = the above have been reported after the Copyright notice. So I'd guess = that they are separate issues. 
Of course it seems that only in the last = few days would I have seen the above sort of thing if it did happen = after the Copyright notice: The prior history does not count for = judgements about that. >>=20 >> =3D=3D=3D >> Mark Millard >> markmi at dsl-only.net >>=20 >> On Sep 16, 2014, at 8:15 AM, Mark Millard = wrote: >>=20 >> Using 10.1-BETA1 I added "options DDB" and "options GDB" to = powerpc64's GENERIC64. (I also used WITH_DEBUG_FILES=3D, WITHOUT_CLANG=3D,= and WITH_DEBUG=3D in /etc/make.conf.) So buildworld, kernel was = basically just set up to have more of a debugging context around = (including for any ports builds). >>=20 >> The result was new information about the PowerMac G5 boot hangups: = The screen is no longer blank when the G5 is hung up without there being = a Copyright notice yet. It says... >>=20 >>> GDB: no debug ports present >>> KDB: debugger backends: DDB >>> KDB: current backend: DDB >>> [ thread pid -1 tid 1006665719 ] >>> Stopped at 0: illegal instruction 0 >>> db> >>=20 >> (I had no ability to input at that point.) Normally the Copyright = notice would have displayed instead of "[...]" and what follows. (I do = not claim to have all the spacing, capitalization, and such correct = above.) >>=20 >> That text is constant from hang to hang when it hangs just before it = would normally output the Copyright notice: The numbers do not vary, = much less the other text. It has never failed until after the two KDB = messages are present. So far I've only tested one PowerMac G5, booting = over and over for a few hours. >>=20 >>=20 >>=20 >> (I do not claim to be set up for remote kernel debugging. I just = decided to let GDB go along for the ride when I added DDB.) 
>>=20 >> =3D=3D=3D >> Mark Millard >> markmi at dsl-only.net >>=20 >>=20 >>=20 >>=20 >>=20 >>=20 >=20 >=20 >=20 >=20 From owner-freebsd-ppc@FreeBSD.ORG Fri Sep 26 03:41:00 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id DAE9D27D; Fri, 26 Sep 2014 03:41:00 +0000 (UTC) Received: from mail-qc0-x22a.google.com (mail-qc0-x22a.google.com [IPv6:2607:f8b0:400d:c01::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8853BF50; Fri, 26 Sep 2014 03:41:00 +0000 (UTC) Received: by mail-qc0-f170.google.com with SMTP id c9so6159459qcz.1 for ; Thu, 25 Sep 2014 20:40:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:subject:message-id:mime-version:content-type; bh=c+mHIlKO+fCjdIvgFJgO5wfac7BK08uWkNWf3HmKEdQ=; b=Eg1SPc6jD6H1xUbhiMVNzAMQHilyxw+sljOXKDkgRfXKtpkLHu8EqMjU+4I90UgjXk w/P5rZ2MMffB0BpQh18k0eOcsjuNBgHUPLsE9F1FxtpJq5Xc91iDDPUZvcj2CC0k8hs3 voma3tfVke2RoeVrvmlzgtDd4F9DTy8D+a8EsByXg+/F5wX3zLqszDjnK7vVe4iSVhfs 1tg9T60fviLVxniI+GEjhqFs8jVEzgUd8JuvdhH0HjKhZJg7X89ZZ3p6U20lNIIgj2v9 DwDSHB7X31WvwIfA2Br1ZapivcSo5wuZ/h3tIYoEpvXus8rt5DGJqA/VBOYaF/duyo6h w5kw== X-Received: by 10.224.32.138 with SMTP id c10mr11780687qad.1.1411702858643; Thu, 25 Sep 2014 20:40:58 -0700 (PDT) Received: from zhabar.attlocal.net (107-222-186-3.lightspeed.sntcca.sbcglobal.net. 
[107.222.186.3]) by mx.google.com with ESMTPSA id l46sm3745705qgd.27.2014.09.25.20.40.57 for (version=SSLv3 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 25 Sep 2014 20:40:57 -0700 (PDT) Date: Thu, 25 Sep 2014 20:40:52 -0700 From: Justin Hibbits To: FreeBSD Current , FreeBSD PowerPC ML Subject: Boot failure with r272146 Message-ID: <20140925204052.6f4c1d60@zhabar.attlocal.net> X-Mailer: Claws Mail 3.10.1 (GTK+ 2.24.22; powerpc64-portbld-freebsd11.0) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="MP_/eteFzjDwiVP0p.yFwS3DpcI" X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Sep 2014 03:41:01 -0000 --MP_/eteFzjDwiVP0p.yFwS3DpcI Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Content-Disposition: inline With r272146 my SATA controller fails to attach, preventing the kernel from mounting root. I've attached a log of as much as dconschat would allow. The relevant portion is pcib10: atapci0: mem 0xfa402000-0xfa403fff at device 12.0 on pci10 pcib1: failed to reserve resource for pcib10 pcib10: failed to allocate initial I/O port window (0-0xffffffff,0x10) atapci0: 0x10 bytes of rid 0x20 res 4 failed (0, 0xffffffffffffffff). atapci0: unable to map interrupt device_attach: atapci0 attach returned 6 pcib10: allocated memory range (0xfa400000-0xfa400fff) for rid 10 of pci1:3:14:0 atapci0: mem 0xfa402000-0xfa403fff at device 12.0 on pci10 pcib1: failed to reserve resource for pcib10 pcib10: failed to allocate initial I/O port window (0-0xffffffff,0x10) atapci0: 0x10 bytes of rid 0x20 res 4 failed (0, 0xffffffffffffffff). 
atapci0: unable to map interrupt device_attach: atapci0 attach returned 6 ata0: mem 0xfa404000-0xfa407fff at device 13.0 on pci10 ofw_pci mapdev: start fa404000, len 16384 ata0: unable to allocate interrupt device_attach: ata0 attach returned 6 It works fine with r271697 kernel (latest I have booting). I haven't yet tried bisecting. Hardware is a PowerMac G5 (last generation). - Justin --MP_/eteFzjDwiVP0p.yFwS3DpcI Content-Type: application/octet-stream; name=kernel_boot.fail Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename=kernel_boot.fail W2Rjb25zIGNvbm5lY3RlZF0NCmUgMzIsIGJhc2UgMHhmYTIwMDAwMCwgc2l6ZSAyMSwgbWVtb3J5 IGRpc2FibGVkDQpwY2liODogYWxsb2NhdGVkIG1lbW9yeSByYW5nZSAoMHhmYTIwMDAwMC0weGZh M2ZmZmZmKSBmb3IgcmlkIDEwIG9mIHBjaTE6MjoxNTowDQpnZW0wOiA8QXBwbGUgU2hhc3RhIEdN QUMgRXRoZXJuZXQ+IG1lbSAweGZhMjAwMDAwLTB4ZmEzZmZmZmYgYXQgZGV2aWNlIDE1LjAgb24g cGNpOA0KcGNpYjg6IHNsb3QgMTUgSU5UQSBpcyByb3V0ZWQgdG8gaXJxIDQNCm9md19wY2kgbWFw ZGV2OiBzdGFydCBmYTIwMDAwMCwgbGVuIDIwOTcxNTINCmdlbTA6IGludmFsaWQgTUFDIGFkZHJl c3MNCmRldmljZV9hdHRhY2g6IGdlbTAgYXR0YWNoIHJldHVybmVkIDYNCnBjaWI5OiA8T0ZXIFBD SS1QQ0kgYnJpZGdlPiBhdCBkZXZpY2UgOC4wIG9uIHBjaTENCnBjaWI5OiAgIGRvbWFpbiAgICAg ICAgICAgIDENCnBjaWI5OiAgIHNlY29uZGFyeSBidXMgICAgIDENCnBjaWI5OiAgIHN1Ym9yZGlu YXRlIGJ1cyAgIDENCnBjaWI5OiAgIG1lbW9yeSBkZWNvZGUgICAgIDB4ODAwMDAwMDAtMHg4MDBm ZmZmZg0KcGNpOTogPE9GVyBQQ0kgYnVzPiBvbiBwY2liOQ0KcGNpOTogZG9tYWluPTEsIHBoeXNp Y2FsIGJ1cz0xDQpmb3VuZC0+CXZlbmRvcj0weDEwNmIsIGRldj0weDAwNGYsIHJldmlkPTB4MDAN Cglkb21haW49MSwgYnVzPTEsIHNsb3Q9NywgZnVuYz0wDQoJY2xhc3M9ZmYtMDAtMDAsIGhkcnR5 cGU9MHgwMCwgbWZkZXY9MA0KCWNtZHJlZz0weDAwMDYsIHN0YXRyZWc9MHgwMjAwLCBjYWNoZWxu c3o9MTYgKGR3b3JkcykNCglsYXR0aW1lcj0weDEwICg0ODAgbnMpLCBtaW5nbnQ9MHgwMCAoMCBu cyksIG1heGxhdD0weDAwICgwIG5zKQ0KCW1hcFsxMF06IHR5cGUgTWVtb3J5LCByYW5nZSAzMiwg YmFzZSAweDgwMDAwMDAwLCBzaXplIDE5LCBlbmFibGVkDQpwY2liOTogYWxsb2NhdGVkIG1lbW9y eSByYW5nZSAoMHg4MDAwMDAwMC0weDgwMDdmZmZmKSBmb3IgcmlkIDEwIG9mIHBjaTE6MTo3OjAN 
CmZvdW5kLT4JdmVuZG9yPTB4MTAzMywgZGV2PTB4MDAzNSwgcmV2aWQ9MHg0Mw0KCWRvbWFpbj0x LCBidXM9MSwgc2xvdD0xMSwgZnVuYz0wDQoJY2xhc3M9MGMtMDMtMTAsIGhkcnR5cGU9MHgwMCwg bWZkZXY9MQ0KCWNtZHJlZz0weDAwMDAsIHN0YXRyZWc9MHgwMjEwLCBjYWNoZWxuc3o9MTYgKGR3 b3JkcykNCglsYXR0aW1lcj0weDEwICg0ODAgbnMpLCBtaW5nbnQ9MHgwMSAoMjUwIG5zKSwgbWF4 bGF0PTB4MmEgKDEwNTAwIG5zKQ0KCWludHBpbj1hLCBpcnE9MA0KCXBvd2Vyc3BlYyAyICBzdXBw b3J0cyBEMCBEMSBEMiBEMyAgY3VycmVudCBEMA0KCW1hcFsxMF06IHR5cGUgTWVtb3J5LCByYW5n ZSAzMiwgYmFzZSAweDgwMDgyMDAwLCBzaXplIDEyLCBtZW1vcnkgZGlzYWJsZWQNCnBjaWI5OiBh bGxvY2F0ZWQgbWVtb3J5IHJhbmdlICgweDgwMDgyMDAwLTB4ODAwODJmZmYpIGZvciByaWQgMTAg b2YgcGNpMToxOjExOjANCmZvdW5kLT4JdmVuZG9yPTB4MTAzMywgZGV2PTB4MDAzNSwgcmV2aWQ9 MHg0Mw0KCWRvbWFpbj0xLCBidXM9MSwgc2xvdD0xMSwgZnVuYz0xDQoJY2xhc3M9MGMtMDMtMTAs IGhkcnR5cGU9MHgwMCwgbWZkZXY9MA0KCWNtZHJlZz0weDAwMDAsIHN0YXRyZWc9MHgwMjEwLCBj YWNoZWxuc3o9MTYgKGR3b3JkcykNCglsYXR0aW1lcj0weDEwICg0ODAgbnMpLCBtaW5nbnQ9MHgw MSAoMjUwIG5zKSwgbWF4bGF0PTB4MmEgKDEwNTAwIG5zKQ0KCWludHBpbj1iLCBpcnE9MA0KCXBv d2Vyc3BlYyAyICBzdXBwb3J0cyBEMCBEMSBEMiBEMyAgY3VycmVudCBEMA0KCW1hcFsxMF06IHR5 cGUgTWVtb3J5LCByYW5nZSAzMiwgYmFzZSAweDgwMDgxMDAwLCBzaXplIDEyLCBtZW1vcnkgZGlz YWJsZWQNCnBjaWI5OiBhbGxvY2F0ZWQgbWVtb3J5IHJhbmdlICgweDgwMDgxMDAwLTB4ODAwODFm ZmYpIGZvciByaWQgMTAgb2YgcGNpMToxOjExOjENCmZvdW5kLT4JdmVuZG9yPTB4MTAzMywgZGV2 PTB4MDBlMCwgcmV2aWQ9MHgwNA0KCWRvbWFpbj0xLCBidXM9MSwgc2xvdD0xMSwgZnVuYz0yDQoJ Y2xhc3M9MGMtMDMtMjAsIGhkcnR5cGU9MHgwMCwgbWZkZXY9MA0KCWNtZHJlZz0weDAwMDQsIHN0 YXRyZWc9MHgwMjEwLCBjYWNoZWxuc3o9MTYgKGR3b3JkcykNCglsYXR0aW1lcj0weDEwICg0ODAg bnMpLCBtaW5nbnQ9MHgxMCAoNDAwMCBucyksIG1heGxhdD0weDIyICg4NTAwIG5zKQ0KCWludHBp bj1jLCBpcnE9MA0KCXBvd2Vyc3BlYyAyICBzdXBwb3J0cyBEMCBEMSBEMiBEMyAgY3VycmVudCBE MA0KCW1hcFsxMF06IHR5cGUgTWVtb3J5LCByYW5nZSAzMiwgYmFzZSAweDgwMDgwMDAwLCBzaXpl ICA4LCBtZW1vcnkgZGlzYWJsZWQNCnBjaWI5OiBhbGxvY2F0ZWQgbWVtb3J5IHJhbmdlICgweDgw MDgwMDAwLTB4ODAwODAwZmYpIGZvciByaWQgMTAgb2YgcGNpMToxOjExOjINCm1hY2lvMDogPFNo 
YXN0YSBJL08gQ29udHJvbGxlcj4gbWVtIDB4ODAwMDAwMDAtMHg4MDA3ZmZmZiBhdCBkZXZpY2Ug Ny4wIG9uIHBjaTkNCm9md19wY2kgbWFwZGV2OiBzdGFydCA4MDAwMDAwMCwgbGVuIDUyNDI4OA0K bWFjZ3BpbzA6IDxNYWNJTyBHUElPIENvbnRyb2xsZXI+IG1lbSAweDUwLTB4OGEgb24gbWFjaW8w DQptYWNncGlvMDogPGdwaW8sIHNtdS1pbnRlcnJ1cHQ+IGdwaW8gMTMgaXJxIDQ4IChubyBkcml2 ZXIgYXR0YWNoZWQpDQptYWNncGlvMDogPGdwaW8sIHByb2dyYW1tZXItc3dpdGNoPiBncGlvIDEy IGlycSA0NyAobm8gZHJpdmVyIGF0dGFjaGVkKQ0KbWFjZ3BpbzA6IDxncGlvLCBjaGlwLWZhdWx0 PiBncGlvIDE0IGlycSA0OSAobm8gZHJpdmVyIGF0dGFjaGVkKQ0KbWFjZ3BpbzA6IDxncGlvLCBz bGV3aW5nLWRvbmU+IGdwaW8gNTYgaXJxIDkxIChubyBkcml2ZXIgYXR0YWNoZWQpDQptYWNncGlv MDogPGdwaW8sIG1sYi1nb29kPiBncGlvIDE5IChubyBkcml2ZXIgYXR0YWNoZWQpDQptYWNncGlv MDogPGdwaW8sIHZkbmFwMD4gZ3BpbyAyMCAobm8gZHJpdmVyIGF0dGFjaGVkKQ0KbWFjZ3BpbzA6 IDxncGlvLCB0aW1lYmFzZS1lbmFibGU+IGdwaW8gMzggaXJxIDczIChubyBkcml2ZXIgYXR0YWNo ZWQpDQptYWNncGlvMDogPGdwaW8sIGRpZy1ody1yZXNldC1jPiBncGlvIDkgKG5vIGRyaXZlciBh dHRhY2hlZCkNCm1hY2dwaW8wOiA8Z3BpbywgY29kZWMtZXJyb3ItaXJxPiBncGlvIDUwIGlycSA4 NSAobm8gZHJpdmVyIGF0dGFjaGVkKQ0KbWFjZ3BpbzA6IDxncGlvLCBjb2RlYy1jbG9jay1tdXg+ IGdwaW8gNDkgKG5vIGRyaXZlciBhdHRhY2hlZCkNCm1hY2dwaW8wOiA8Z3BpbywgbGluZWluLWRl dGVjdD4gZ3BpbyA0MiBpcnEgNzcgKG5vIGRyaXZlciBhdHRhY2hlZCkNCnNjYzA6IDxaaWxvZyBa ODUzMCBkdWFsIGNoYW5uZWwgU0NDPiBtZW0gMHgxMzAwMC0weDEzZmZmLDB4ODQwMC0weDg0ZmYs MHg4NTAwLTB4ODVmZiwweDg2MDAtMHg4NmZmLDB4ODcwMC0weDg3ZmYgaXJxIDIzLDE3LDE4LDI0 LDE5LDIwIG9uIG1hY2lvMA0Kc2NjMDogcmVzZXR0aW5nIGhhcmR3YXJlDQp1YXJ0MDogPHo4NTMw LCBjaGFubmVsIEE+IG9uIHNjYzANCnVhcnQwOiBmYXN0IGludGVycnVwdA0KdWFydDE6IDx6ODUz MCwgY2hhbm5lbCBCPiBvbiBzY2MwDQp1YXJ0MTogZmFzdCBpbnRlcnJ1cHQNCnNjYzA6IGZhc3Qg aW50ZXJydXB0DQppaWNoYjE6IDxLZXl3ZXN0IEkyQyBjb250cm9sbGVyPiBtZW0gMHgxODAwMC0w eDE4ZmZmIGlycSAyNyBvbiBtYWNpbzANCmlpY2hiMTogUmV2aXNpb246IEExDQppaWNidXMxOiA8 T0ZXIEkyQyBidXM+IG9uIGlpY2hiMQ0Kb255eDA6IDxUZXhhcyBJbnN0cnVtZW50cyBQQ00zMDUy IEF1ZGlvIENvZGVjPiBhdCBhZGRyIDB4OGMgb24gaWljYnVzMQ0KaWljYnVzMTogPHVua25vd24g 
Y2FyZD4gYXQgYWRkciAweDI0DQpwY20wOiA8QXBwbGUgSTJTIEF1ZGlvIENvbnRyb2xsZXI+IG1l bSAweDEwMDAwLTB4MTBmZmYsMHg4MDAwLTB4ODBmZiwweDgxMDAtMHg4MWZmIGlycSAyOCwxMSwx MiwzMCwxNSwxNiBvbiBtYWNpbzANCm9oY2kwOiA8TkVDIHVQRCA5MjEwIFVTQiBjb250cm9sbGVy PiBtZW0gMHg4MDA4MjAwMC0weDgwMDgyZmZmIGlycSA3MCBhdCBkZXZpY2UgMTEuMCBvbiBwY2k5 DQpvZndfcGNpIG1hcGRldjogc3RhcnQgODAwODIwMDAsIGxlbiA0MDk2DQp1c2J1czAgb24gb2hj aTANCm9oY2kwOiB1c2JwZjogQXR0YWNoZWQNCm9oY2kxOiA8TkVDIHVQRCA5MjEwIFVTQiBjb250 cm9sbGVyPiBtZW0gMHg4MDA4MTAwMC0weDgwMDgxZmZmIGlycSA3MCBhdCBkZXZpY2UgMTEuMSBv biBwY2k5DQpvZndfcGNpIG1hcGRldjogc3RhcnQgODAwODEwMDAsIGxlbiA0MDk2DQp1c2J1czEg b24gb2hjaTENCm9oY2kxOiB1c2JwZjogQXR0YWNoZWQNCmVoY2kwOiA8TkVDIHVQRCA3MjAxMDAg VVNCIDIuMCBjb250cm9sbGVyPiBtZW0gMHg4MDA4MDAwMC0weDgwMDgwMGZmIGlycSA3MCBhdCBk ZXZpY2UgMTEuMiBvbiBwY2k5DQpvZndfcGNpIG1hcGRldjogc3RhcnQgODAwODAwMDAsIGxlbiAy NTYNCnVzYnVzMjogRUhDSSB2ZXJzaW9uIDEuMA0KdXNidXMyIG9uIGVoY2kwDQplaGNpMDogdXNi cGY6IEF0dGFjaGVkDQpwY2liMTA6IDxPRlcgUENJLVBDSSBicmlkZ2U+IGF0IGRldmljZSA5LjAg b24gcGNpMQ0KcGNpYjEwOiAgIGRvbWFpbiAgICAgICAgICAgIDENCnBjaWIxMDogICBzZWNvbmRh cnkgYnVzICAgICAzDQpwY2liMTA6ICAgc3Vib3JkaW5hdGUgYnVzICAgMw0KcGNpYjEwOiAgIG1l bW9yeSBkZWNvZGUgICAgIDB4ZmE0MDAwMDAtMHhmYTRmZmZmZg0KcGNpMTA6IDxPRlcgUENJIGJ1 cz4gb24gcGNpYjEwDQpwY2kxMDogZG9tYWluPTEsIHBoeXNpY2FsIGJ1cz0zDQpmb3VuZC0+CXZl bmRvcj0weDExNjYsIGRldj0weDAyNDAsIHJldmlkPTB4MDANCglkb21haW49MSwgYnVzPTMsIHNs b3Q9MTIsIGZ1bmM9MA0KCWNsYXNzPTAxLTAxLThmLCBoZHJ0eXBlPTB4MDAsIG1mZGV2PTENCglj bWRyZWc9MHgwMDA2LCBzdGF0cmVnPTB4MDIyMCwgY2FjaGVsbnN6PTAgKGR3b3JkcykNCglsYXR0 aW1lcj0weDEwICg0ODAgbnMpLCBtaW5nbnQ9MHgwMCAoMCBucyksIG1heGxhdD0weDAwICgwIG5z KQ0KCW1hcFsxMF06IHR5cGUgSS9PIFBvcnQsIHJhbmdlIDMyLCBiYXNlIDAsIHNpemUgIDMsIHBv cnQgZGlzYWJsZWQNCgltYXBbMTRdOiB0eXBlIEkvTyBQb3J0LCByYW5nZSAzMiwgYmFzZSAwLCBz aXplICAyLCBwb3J0IGRpc2FibGVkDQoJbWFwWzE4XTogdHlwZSBJL08gUG9ydCwgcmFuZ2UgMzIs IGJhc2UgMCwgc2l6ZSAgMywgcG9ydCBkaXNhYmxlZA0KCW1hcFsxY106IHR5cGUgSS9PIFBvcnQs 
IHJhbmdlIDMyLCBiYXNlIDAsIHNpemUgIDIsIHBvcnQgZGlzYWJsZWQNCgltYXBbMjBdOiB0eXBl IEkvTyBQb3J0LCByYW5nZSAzMiwgYmFzZSAwLCBzaXplICA0LCBwb3J0IGRpc2FibGVkDQoJbWFw WzI0XTogdHlwZSBNZW1vcnksIHJhbmdlIDMyLCBiYXNlIDB4ZmE0MDIwMDAsIHNpemUgMTMsIGVu YWJsZWQNCnBjaWIxMDogYWxsb2NhdGVkIG1lbW9yeSByYW5nZSAoMHhmYTQwMjAwMC0weGZhNDAz ZmZmKSBmb3IgcmlkIDI0IG9mIHBjaTE6MzoxMjowDQpmb3VuZC0+CXZlbmRvcj0weDEwNmIsIGRl dj0weDAwNTAsIHJldmlkPTB4MDANCglkb21haW49MSwgYnVzPTMsIHNsb3Q9MTMsIGZ1bmM9MA0K CWNsYXNzPWZmLTAwLTAwLCBoZHJ0eXBlPTB4MDAsIG1mZGV2PTANCgljbWRyZWc9MHgwMDA0LCBz dGF0cmVnPTB4ODIwMCwgY2FjaGVsbnN6PTE2IChkd29yZHMpDQoJbGF0dGltZXI9MHgyMCAoOTYw IG5zKSwgbWluZ250PTB4MDAgKDAgbnMpLCBtYXhsYXQ9MHgwMCAoMCBucykNCgltYXBbMTBdOiB0 eXBlIE1lbW9yeSwgcmFuZ2UgMzIsIGJhc2UgMHhmYTQwNDAwMCwgc2l6ZSAxNCwgbWVtb3J5IGRp c2FibGVkDQpwY2liMTA6IGFsbG9jYXRlZCBtZW1vcnkgcmFuZ2UgKDB4ZmE0MDQwMDAtMHhmYTQw N2ZmZikgZm9yIHJpZCAxMCBvZiBwY2kxOjM6MTM6MA0KZm91bmQtPgl2ZW5kb3I9MHgxMDZiLCBk ZXY9MHgwMDUyLCByZXZpZD0weDAwDQoJZG9tYWluPTEsIGJ1cz0zLCBzbG90PTE0LCBmdW5jPTAN CgljbGFzcz0wYy0wMC0xMCwgaGRydHlwZT0weDAwLCBtZmRldj0wDQoJY21kcmVnPTB4MDAwMCwg c3RhdHJlZz0weDAyOTAsIGNhY2hlbG5zej0xNiAoZHdvcmRzKQ0KCWxhdHRpbWVyPTB4ZjggKDc0 NDAgbnMpLCBtaW5nbnQ9MHgwYyAoMzAwMCBucyksIG1heGxhdD0weDE4ICg2MDAwIG5zKQ0KCWlu dHBpbj1hLCBpcnE9MA0KCXBvd2Vyc3BlYyAyICBzdXBwb3J0cyBEMCBEMSBEMiBEMyAgY3VycmVu dCBEMA0KCW1hcFsxMF06IHR5cGUgTWVtb3J5LCByYW5nZSAzMiwgYmFzZSAweGZhNDAwMDAwLCBz aXplIDEyLCBtZW1vcnkgZGlzYWJsZWQNCnBjaWIxMDogYWxsb2NhdGVkIG1lbW9yeSByYW5nZSAo MHhmYTQwMDAwMC0weGZhNDAwZmZmKSBmb3IgcmlkIDEwIG9mIHBjaTE6MzoxNDowDQphdGFwY2kw OiA8U2VydmVyV29ya3MgSzIgU0FUQTE1MCBjb250cm9sbGVyPiBtZW0gMHhmYTQwMjAwMC0weGZh NDAzZmZmIGF0IGRldmljZSAxMi4wIG9uIHBjaTEwDQpwY2liMTogZmFpbGVkIHRvIHJlc2VydmUg cmVzb3VyY2UgZm9yIHBjaWIxMA0KcGNpYjEwOiBmYWlsZWQgdG8gYWxsb2NhdGUgaW5pdGlhbCBJ L08gcG9ydCB3aW5kb3cgKDAtMHhmZmZmZmZmZiwweDEwKQ0KYXRhcGNpMDogMHgxMCBieXRlcyBv ZiByaWQgMHgyMCByZXMgNCBmYWlsZWQgKDAsIDB4ZmZmZmZmZmZmZmZmZmZmZikuDQphdGFwY2kw 
OiB1bmFibGUgdG8gbWFwIGludGVycnVwdA0KZGV2aWNlX2F0dGFjaDogYXRhcGNpMCBhdHRhY2gg cmV0dXJuZWQgNg0KYXRhMDogPFNoYXN0YSBLYXVhaSBBVEEgQ29udHJvbGxlcj4gbWVtIDB4ZmE0 MDQwMDAtMHhmYTQwN2ZmZiBhdCBkZXZpY2UgMTMuMCBvbiBwY2kxMA0Kb2Z3X3BjaSBtYXBkZXY6 IHN0YXJ0IGZhNDA0MDAwLCBsZW4gMTYzODQNCmF0YTA6IHVuYWJsZSB0byBhbGxvY2F0ZSBpbnRl cnJ1cHQNCmRldmljZV9hdHRhY2g6IGF0YTAgYXR0YWNoIHJldHVybmVkIDYNCmZ3b2hjaTA6IHZl bmRvcj0xMDZiLCBkZXY9NTINCmZ3b2hjaTA6IHZlbmRvcj0xMDZiLCBkZXY9NTINCmZ3b2hjaTA6 IDwxMzk0IE9wZW4gSG9zdCBDb250cm9sbGVyIEludGVyZmFjZT4gbWVtIDB4ZmE0MDAwMDAtMHhm YTQwMGZmZiBpcnEgMzkgYXQgZGV2aWNlIDE0LjAgb24gcGNpMTANCm9md19wY2kgbWFwZGV2OiBz dGFydCBmYTQwMDAwMCwgbGVuIDQwOTYNCmZ3b2hjaTA6IE9IQ0kgdmVyc2lvbiAxLjAgKFJPTT0w KQ0KZndvaGNpMDogTm8uIG9mIElzb2Nocm9ub3VzIGNoYW5uZWxzIGlzIDguDQpmd29oY2kwOiBF VUk2NCAwMDoxNDo1MTpmZjpmZTozMzpjYTpiNg0KZndvaGNpMDogaW52YWxpZCBzcGVlZCA3IChm aXhlZCB0byAzKS4NCmZ3b2hjaTA6IFBoeSAxMzk0YSBhdmFpbGFibGUgUzgwMCwgMyBwb3J0cy4N CmZ3b2hjaTA6IExpbmsgUzgwMCwgbWF4X3JlYyA0MDk2IGJ5dGVzLg0KZmlyZXdpcmUwOiA8SUVF RTEzOTQoRmlyZVdpcmUpIGJ1cz4gb24gZndvaGNpMA0KZGNvbnNfY3JvbTA6IDxkY29ucyBjb25m aWd1cmF0aW9uIFJPTT4gb24gZmlyZXdpcmUwDQpkY29uc19jcm9tMDogYnVzX2FkZHIgMHgxNzI0 MDAwDQpmd2UwOiA8RXRoZXJuZXQgb3ZlciBGaXJlV2lyZT4gb24gZmlyZXdpcmUwDQppZl9md2Uw OiBGYWtlIEV0aGVybmV0IGFkZHJlc3M6IDAyOjE0OjUxOjMzOmNhOmI2DQpmd2UwOiBicGYgYXR0 YWNoZWQNCmZ3ZTA6IEV0aGVybmV0IGFkZHJlc3M6IDAyOjE0OjUxOjMzOmNhOmI2DQpzYnAwOiA8 U0JQLTIvU0NTSSBvdmVyIEZpcmVXaXJlPiBvbiBmaXJld2lyZTANCmZ3b2hjaTA6IEluaXRpYXRl IGJ1cyByZXNldA0KZndvaGNpMDogZndvaGNpX2ludHJfY29yZTogQlVTIHJlc2V0DQpmd29oY2kw OiBmd29oY2lfaW50cl9jb3JlOiBub2RlX2lkPTB4MDAwMDAwMDAsIFNlbGZJRCBDb3VudD0xLCBu b24gQ1lDTEVNQVNURVIgbW9kZQ0Kc211MDogPEFwcGxlIFN5c3RlbSBNYW5hZ2VtZW50IFVuaXQ+ IG9uIG9md2J1czANCnNtdTA6IEZhbjogRFJJVkUgQkFZIEEgSU5UQUtFIHR5cGU6IDANCnNtdTA6 IEZhbjogQkFDS1NJREUgdHlwZTogMA0Kc211MDogRmFuOiBDUFUgQSBJTlRBS0UgdHlwZTogMA0K c211MDogRmFuOiBDUFUgQiBJTlRBS0UgdHlwZTogMA0Kc211MDogRmFuOiBDUFUgQSBFWEhBVVNU 
IHR5cGU6IDANCnNtdTA6IEZhbjogQ1BVIEIgRVhIQVVTVCB0eXBlOiAwDQpzbXUwOiBGYW46IEVY UEFOU0lPTiBTTE9UUyBJTlRBS0UgdHlwZTogMA0Kc211MDogcmVnaXN0ZXJlZCBhcyBhIHRpbWUt b2YtZGF5IGNsb2NrIChyZXNvbHV0aW9uIDEwMDB1cywgYWRqdXN0bWVudCAwLjAwMDUwMDAwMHMp DQppaWNoYjI6IDxTTVUgSTJDIGNvbnRyb2xsZXI+IG9uIHNtdTANCmlpY2J1czI6IDxPRlcgSTJD IGJ1cz4gb24gaWljaGIyDQpzbXVzYXQwOiA8U01VIFNhdGVsbGl0ZSBTZW5zb3JzPiBhdCBhZGRy IDB4YjAgb24gaWljYnVzMg0KaWljYnVzMjogPHVua25vd24gY2FyZD4gYXQgYWRkciAweGQ0DQpp aWNoYjM6IDxTTVUgSTJDIGNvbnRyb2xsZXI+IG9uIHNtdTANCmlpY2J1czM6IDxPRlcgSTJDIGJ1 cz4gb24gaWljaGIzDQpwcm9jZnMgcmVnaXN0ZXJlZA0KVGltZWNvdW50ZXIgInRpbWViYXNlIiBm cmVxdWVuY3kgMzMzMzMzMzMgSHogcXVhbGl0eSAwDQpFdmVudCB0aW1lciAiZGVjcmVtZW50ZXIi IGZyZXF1ZW5jeSAzMzMzMzMzMyBIeiBxdWFsaXR5IDEwMDANClRpbWVjb3VudGVycyB0aWNrIGV2 ZXJ5IDEuMDAwIG1zZWMNCnZsYW46IGZpcmV3aXJlMDogMyBub2RlcywgbWF4aG9wIDw9IDIgY2Fi bGUgSVJNIGlybSgyKSANCmluaXRpYWxpemVkLCB1c2luZyBoYXNoIHRhYmxlcyB3aXRoIGNoYWlu aW5nDQp0Y3BfaW5pdDogbmV0LmluZXQudGNwLnRjYmhhc2hzaXplIGF1dG8gdHVuZWQgdG8gMTMx MDcyDQpsbzA6IGJwZiBhdHRhY2hlZA0KbWF4NjY5MDA6IDIgc2Vuc29ycyBkZXRlY3RlZC4NCm1h eDY2OTAwOiBTZW5zb3JzDQptYXg2NjkwMDogTG9jYXRpb24gOiBCQUNLU0lERSBJRDogNg0KbWF4 NjY5MDA6IExvY2F0aW9uIDogS09ESUFLIERJT0RFIElEOiA3DQptYXg2NjkwMTogMiBzZW5zb3Jz IGRldGVjdGVkLg0KbWF4NjY5MDE6IFNlbnNvcnMNCm1heDY2OTAxOiBMb2NhdGlvbiA6IFRVTk5F TCBJRDogMQ0KbWF4NjY5MDE6IExvY2F0aW9uIDogVFVOTkVMIEhFQVRTSU5LIElEOiAyDQpiZ2Ux OiBsaW5rIHN0YXRlIGNoYW5nZWQgdG8gVVAKW2Rjb25zY2hhdCBleGl0aW5nLi4uXQo= --MP_/eteFzjDwiVP0p.yFwS3DpcI-- From owner-freebsd-ppc@FreeBSD.ORG Fri Sep 26 13:59:06 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7BA1D170; Fri, 26 Sep 2014 13:59:06 +0000 (UTC) Received: from mho-01-ewr.mailhop.org (mho-03-ewr.mailhop.org [204.13.248.66]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 
bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4B271E6C; Fri, 26 Sep 2014 13:59:05 +0000 (UTC) Received: from [73.34.117.227] (helo=ilsoft.org) by mho-01-ewr.mailhop.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.72) (envelope-from ) id 1XXW35-0006VN-8G; Fri, 26 Sep 2014 13:59:04 +0000 Received: from [172.22.42.240] (revolution.hippie.lan [172.22.42.240]) by ilsoft.org (8.14.9/8.14.9) with ESMTP id s8QDx2JU005485; Fri, 26 Sep 2014 07:59:02 -0600 (MDT) (envelope-from ian@FreeBSD.org) X-Mail-Handler: Dyn Standard SMTP by Dyn X-Originating-IP: 73.34.117.227 X-Report-Abuse-To: abuse@dyndns.com (see http://www.dyndns.com/services/sendlabs/outbound_abuse.html for abuse reporting information) X-MHO-User: U2FsdGVkX1+MdRM3i6ybimFyBtngM1LE X-Authentication-Warning: paranoia.hippie.lan: Host revolution.hippie.lan [172.22.42.240] claimed to be [172.22.42.240] Subject: Re: Boot failure with r272146 From: Ian Lepore To: Justin Hibbits In-Reply-To: <20140925204052.6f4c1d60@zhabar.attlocal.net> References: <20140925204052.6f4c1d60@zhabar.attlocal.net> Content-Type: multipart/mixed; boundary="=-CpXzZj/3atU4iGQ0CvjO" Date: Fri, 26 Sep 2014 07:59:01 -0600 Message-ID: <1411739941.66615.257.camel@revolution.hippie.lan> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Cc: FreeBSD Current , FreeBSD PowerPC ML X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Sep 2014 13:59:06 -0000 --=-CpXzZj/3atU4iGQ0CvjO Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit On Thu, 2014-09-25 at 20:40 -0700, Justin Hibbits wrote: > With r272146 my SATA controller fails to attach, preventing the kernel > from mounting root. I've attached a log of as much as dconschat would > allow. 
The relevant portion is pcib10:
>
> atapci0: <ServerWorks K2 SATA150 controller> mem 0xfa402000-0xfa403fff
> at device 12.0 on pci10 pcib1: failed to reserve resource for pcib10
> pcib10: failed to allocate initial I/O port window (0-0xffffffff,0x10)
> atapci0: 0x10 bytes of rid 0x20 res 4 failed (0, 0xffffffffffffffff).
> atapci0: unable to map interrupt
> device_attach: atapci0 attach returned 6
>
> pcib10: allocated memory range (0xfa400000-0xfa400fff) for rid 10 of
> pci1:3:14:0 atapci0: <ServerWorks K2 SATA150 controller> mem
> 0xfa402000-0xfa403fff at device 12.0 on pci10 pcib1: failed to reserve
> resource for pcib10 pcib10: failed to allocate initial I/O port window
> (0-0xffffffff,0x10) atapci0: 0x10 bytes of rid 0x20 res 4 failed (0,
> 0xffffffffffffffff). atapci0: unable to map interrupt
> device_attach: atapci0 attach returned 6
> ata0: <Shasta Kauai ATA Controller> mem 0xfa404000-0xfa407fff at device
> 13.0 on pci10 ofw_pci mapdev: start fa404000, len 16384
> ata0: unable to allocate interrupt
> device_attach: ata0 attach returned 6
>
>
> It works fine with r271697 kernel (latest I have booting). I haven't
> yet tried bisecting.
>
> Hardware is a PowerMac G5 (last generation).
>
> - Justin

Ooops, I think a paste-o in my r272109 caused it. See if this fixes it.

-- Ian

--=-CpXzZj/3atU4iGQ0CvjO
Content-Disposition: inline; filename="ofw_pcibus_fix.diff"
Content-Type: text/x-patch; name="ofw_pcibus_fix.diff"; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Index: sys/powerpc/ofw/ofw_pcibus.c
===================================================================
--- sys/powerpc/ofw/ofw_pcibus.c (revision 272109)
+++ sys/powerpc/ofw/ofw_pcibus.c (working copy)
@@ -201,7 +201,7 @@ ofw_pcibus_enum_devtree(device_t dev, u_int domain
          * resource list.
          */
         if (dinfo->opd_dinfo.cfg.intpin == 0)
-            ofw_bus_intr_to_rl(dev, node, &dinfo->opd_dinfo.resources);
+            ofw_bus_intr_to_rl(dev, child, &dinfo->opd_dinfo.resources);
     }
 }

--=-CpXzZj/3atU4iGQ0CvjO--

From owner-freebsd-ppc@FreeBSD.ORG Fri Sep 26 14:40:14 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 885CD8CA; Fri, 26 Sep 2014 14:40:14 +0000 (UTC) Received: from mail-lb0-x236.google.com (mail-lb0-x236.google.com [IPv6:2a00:1450:4010:c04::236]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 97C30368; Fri, 26 Sep 2014 14:40:13 +0000 (UTC) Received: by mail-lb0-f182.google.com with SMTP id z11so2790334lbi.41 for ; Fri, 26 Sep 2014 07:40:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=b74g8tESPrTXpd7weVSAxofU5M3A+h6wxFuW1NDfDf0=; b=K9k15wUnbqtX7Z9CGDKVovCNECuljsd/2mNUNATfToZYZ2BoqpRxhMS05VGThbDlhZ 4/KFEjvhKHVtJ3azw5J3UuaW6k5HJSL56F+9MryGhkVc7HIRa22VogelG96h02yttZjb sGzJ0ktk3x+WOhLlMpONKBu6ok2bJsVHVdBmQ1efOCoYh6TgF8Wlgns7o1ERS0gXJ32R WU1iCSGrxhdfHA/RzU0lLKLdv2sZ/EIjEsovSm0xQFH6++W0At1vqoZf5OT8yr/VWjQU Vfs6c5XIG/v09cSja6u058Nm93A5uHh30KX2SlJklWIMwcn6EUyiv4fOw6Flkm26+pYt 8jcg== MIME-Version: 1.0 X-Received: by 10.152.170.167 with SMTP id an7mr3510119lac.94.1411742409879; Fri, 26 Sep 2014 07:40:09 -0700 (PDT) Received: by 10.25.15.29 with HTTP; Fri, 26 Sep 2014 07:40:09 -0700 (PDT) Received: by 10.25.15.29 with HTTP; Fri, 26 Sep 2014 07:40:09 -0700 (PDT) In-Reply-To: <1411739941.66615.257.camel@revolution.hippie.lan> References: <20140925204052.6f4c1d60@zhabar.attlocal.net>
<1411739941.66615.257.camel@revolution.hippie.lan> Date: Fri, 26 Sep 2014 07:40:09 -0700 Message-ID: Subject: Re: Boot failure with r272146 From: Justin Hibbits To: Ian Lepore Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: FreeBSD Current , FreeBSD PowerPC ML X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Sep 2014 14:40:14 -0000 That fixed it, thanks! -Justin On Sep 26, 2014 6:59 AM, "Ian Lepore" wrote: > On Thu, 2014-09-25 at 20:40 -0700, Justin Hibbits wrote: > > With r272146 my SATA controller fails to attach, preventing the kernel > > from mounting root. I've attached a log of as much as dconschat would > > allow. The relevant portion is pcib10: > > > > atapci0: mem 0xfa402000-0xfa403fff > > at device 12.0 on pci10 pcib1: failed to reserve resource for pcib10 > > pcib10: failed to allocate initial I/O port window (0-0xffffffff,0x10) > > atapci0: 0x10 bytes of rid 0x20 res 4 failed (0, 0xffffffffffffffff). > > atapci0: unable to map interrupt > > device_attach: atapci0 attach returned 6 > > > > pcib10: allocated memory range (0xfa400000-0xfa400fff) for rid 10 of > > pci1:3:14:0 atapci0: mem > > 0xfa402000-0xfa403fff at device 12.0 on pci10 pcib1: failed to reserve > > resource for pcib10 pcib10: failed to allocate initial I/O port window > > (0-0xffffffff,0x10) atapci0: 0x10 bytes of rid 0x20 res 4 failed (0, > > 0xffffffffffffffff). atapci0: unable to map interrupt > > device_attach: atapci0 attach returned 6 > > ata0: mem 0xfa404000-0xfa407fff at device > > 13.0 on pci10 ofw_pci mapdev: start fa404000, len 16384 > > ata0: unable to allocate interrupt > > device_attach: ata0 attach returned 6 > > > > > > It works fine with r271697 kernel (latest I have booting). I haven't > > yet tried bisecting. 
> > > > Hardware is a PowerMac G5 (last generation). > > > > - Justin > > Ooops, I think a paste-o in my r272109 caused it. See if this fixes it. > > -- Ian > > > > > Index: sys/powerpc/ofw/ofw_pcibus.c > =================================================================== > --- sys/powerpc/ofw/ofw_pcibus.c (revision 272109) > +++ sys/powerpc/ofw/ofw_pcibus.c (working copy) > @@ -201,7 +201,7 @@ ofw_pcibus_enum_devtree(device_t dev, u_int domain > * resource list. > */ > if (dinfo->opd_dinfo.cfg.intpin == 0) > - ofw_bus_intr_to_rl(dev, node, > &dinfo->opd_dinfo.resources); > + ofw_bus_intr_to_rl(dev, child, > &dinfo->opd_dinfo.resources); > } > } > > > From owner-freebsd-ppc@FreeBSD.ORG Fri Sep 26 15:17:24 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 10DCD529; Fri, 26 Sep 2014 15:17:24 +0000 (UTC) Received: from mho-01-ewr.mailhop.org (mho-03-ewr.mailhop.org [204.13.248.66]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D5758A34; Fri, 26 Sep 2014 15:17:23 +0000 (UTC) Received: from [73.34.117.227] (helo=ilsoft.org) by mho-01-ewr.mailhop.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.72) (envelope-from ) id 1XXXGr-0000Lt-Kv; Fri, 26 Sep 2014 15:17:21 +0000 Received: from [172.22.42.240] (revolution.hippie.lan [172.22.42.240]) by ilsoft.org (8.14.9/8.14.9) with ESMTP id s8QFHKYa005634; Fri, 26 Sep 2014 09:17:20 -0600 (MDT) (envelope-from ian@FreeBSD.org) X-Mail-Handler: Dyn Standard SMTP by Dyn X-Originating-IP: 73.34.117.227 X-Report-Abuse-To: abuse@dyndns.com (see http://www.dyndns.com/services/sendlabs/outbound_abuse.html for abuse reporting information) X-MHO-User: U2FsdGVkX18VybkOjjGaUPFt38DkMj3x X-Authentication-Warning: paranoia.hippie.lan: 
Host revolution.hippie.lan [172.22.42.240] claimed to be [172.22.42.240] Subject: Re: Boot failure with r272146 From: Ian Lepore To: Justin Hibbits In-Reply-To: References: <20140925204052.6f4c1d60@zhabar.attlocal.net> <1411739941.66615.257.camel@revolution.hippie.lan> Content-Type: text/plain; charset="us-ascii" Date: Fri, 26 Sep 2014 09:17:19 -0600 Message-ID: <1411744639.66615.270.camel@revolution.hippie.lan> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit Cc: FreeBSD Current , FreeBSD PowerPC ML X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Sep 2014 15:17:24 -0000 On Fri, 2014-09-26 at 07:40 -0700, Justin Hibbits wrote: > That fixed it, thanks! > > -Justin Fix committed as r272181, sorry for the glitch. -- Ian > On Sep 26, 2014 6:59 AM, "Ian Lepore" wrote: > > > On Thu, 2014-09-25 at 20:40 -0700, Justin Hibbits wrote: > > > With r272146 my SATA controller fails to attach, preventing the kernel > > > from mounting root. I've attached a log of as much as dconschat would > > > allow. The relevant portion is pcib10: > > > > > > atapci0: mem 0xfa402000-0xfa403fff > > > at device 12.0 on pci10 pcib1: failed to reserve resource for pcib10 > > > pcib10: failed to allocate initial I/O port window (0-0xffffffff,0x10) > > > atapci0: 0x10 bytes of rid 0x20 res 4 failed (0, 0xffffffffffffffff). > > > atapci0: unable to map interrupt > > > device_attach: atapci0 attach returned 6 > > > > > > pcib10: allocated memory range (0xfa400000-0xfa400fff) for rid 10 of > > > pci1:3:14:0 atapci0: mem > > > 0xfa402000-0xfa403fff at device 12.0 on pci10 pcib1: failed to reserve > > > resource for pcib10 pcib10: failed to allocate initial I/O port window > > > (0-0xffffffff,0x10) atapci0: 0x10 bytes of rid 0x20 res 4 failed (0, > > > 0xffffffffffffffff). 
atapci0: unable to map interrupt > > > device_attach: atapci0 attach returned 6 > > > ata0: mem 0xfa404000-0xfa407fff at device > > > 13.0 on pci10 ofw_pci mapdev: start fa404000, len 16384 > > > ata0: unable to allocate interrupt > > > device_attach: ata0 attach returned 6 > > > > > > > > > It works fine with r271697 kernel (latest I have booting). I haven't > > > yet tried bisecting. > > > > > > Hardware is a PowerMac G5 (last generation). > > > > > > - Justin > > > > Ooops, I think a paste-o in my r272109 caused it. See if this fixes it. > > > > -- Ian > > > > > > > > > > Index: sys/powerpc/ofw/ofw_pcibus.c > > =================================================================== > > --- sys/powerpc/ofw/ofw_pcibus.c (revision 272109) > > +++ sys/powerpc/ofw/ofw_pcibus.c (working copy) > > @@ -201,7 +201,7 @@ ofw_pcibus_enum_devtree(device_t dev, u_int domain > > * resource list. > > */ > > if (dinfo->opd_dinfo.cfg.intpin == 0) > > - ofw_bus_intr_to_rl(dev, node, > > &dinfo->opd_dinfo.resources); > > + ofw_bus_intr_to_rl(dev, child, > > &dinfo->opd_dinfo.resources); > > } > > } > > > > > > > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" From owner-freebsd-ppc@FreeBSD.ORG Sat Sep 27 05:18:45 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 58404F1A for ; Sat, 27 Sep 2014 05:18:45 +0000 (UTC) Received: from asp.reflexion.net (outbound-242.asp.reflexion.net [69.84.129.242]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7FCB7C7D for ; Sat, 27 Sep 2014 05:18:44 +0000 (UTC) Received: (qmail 26147 
invoked from network); 27 Sep 2014 05:18:43 -0000 Received: from unknown (HELO mail-cs-01.app.dca.reflexion.local) (10.81.19.1) by 0 (rfx-qmail) with SMTP; 27 Sep 2014 05:18:43 -0000 Received: by mail-cs-01.app.dca.reflexion.local (Reflexion email security v7.30.7) with SMTP; Sat, 27 Sep 2014 01:18:43 -0400 (EDT) Received: (qmail 25556 invoked from network); 27 Sep 2014 05:18:39 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (DHE-RSA-AES256-SHA encrypted) SMTP; 27 Sep 2014 05:18:39 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-98-246-178-138.hsd1.or.comcast.net [98.246.178.138]) by iron2.pdx.net (Postfix) with ESMTPSA id 90D421C4056; Fri, 26 Sep 2014 22:18:32 -0700 (PDT) Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: lr=u_trap+0x10 and srr0=k_trap+0x28 for "stopped at 0 illegal instruction 0" before-copyright hang on PowerMac G5's From: Mark Millard In-Reply-To: <34AA4542-56A7-453E-A00E-868EE352C96C@dsl-only.net> Date: Fri, 26 Sep 2014 22:18:34 -0700 Message-Id: <7008CDAA-2DA2-419F-9BEC-AD823ECBFCCC@dsl-only.net> References: <1118046C-0FF7-49FC-82DA-DB9A7A310991@dsl-only.net> <2ED3DB50-B985-4382-8FF2-3B44E7E65453@dsl-only.net> <6D729F43-662A-429E-9503-0148EC3250B1@dsl-only.net> <72535F89-3942-45A6-B351-7F746209ED9F@dsl-only.net> <0703EF26-6E33-4446-9273-BBFD0CB72893@dsl-only.net> <37575F94-763C-43BF-8DD9-F648F4A7C09F@dsl-only.net> <5422E513.6010806@freebsd.org> <1C02D0D4-14B8-465F-B493-4D3A64E4C35C@dsl-only.net> <0DF8A9EC-C81C-4E15-9420-6831BA7D5F8E@dsl-only.net> <54248467.4050900@freebsd.org> <34AA4542-56A7-453E-A00E-868EE352C96C@dsl-only.net> To: Nathan Whitehorn X-Mailer: Apple Mail (2.1878.6) Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: Justin Hibbits , FreeBSD PowerPC ML X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 
2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 27 Sep 2014 05:18:45 -0000

The first send of this was big enough for the moderator to be involved. So I canceled and am sending with less history included.

[I'll note that I seem to have trouble typing 0xdbb290 vs. 0xbdd290. The actual value is 0xdbb290. The references to the incorrect typing should say 0xbdd290, which is the wrong value. But I've had both types of references listing the wrong text... in various notes.]

===
Mark Millard
markmi@dsl-only.net

On Sep 26, 2014, at 10:11 PM, Mark Millard wrote:

The openfirmware peer crash (i.e., the before Copyright notice crash) happens during/just-after the MMU setup and the peer ofwcall is the first ofwcall where pmap_bootstrapped is non-zero at the time. In other words: the very first ofwcall in the new context fails.

And this failure involves some of the same code area that I got a backtrace for and reported as a separate crash (with the trace listed). As a reminder, that backtrace, which has a different failure point, was:

.pvo_vaddr_compare+0x14, instruction ld r0, r4, 0x58 [or ld r0,88(r4) in an alternate notation]
.pvo_tree_RB_FIND+0x38
.moea64_dev_direct_mapped+0x90
.pmap_dev_direct_mapped+0x84 ("_dev" was missing in earlier note)
.bs_remap_earlyboot+0x6c
.moea64_late_bootstrap+0x178
.moea64_bootstrap_native+0x120
.pmap_bootstrap+0xac
.powerpc_init+0x514
btext+0xa8

As for the sequence of ofwcall's that I reported: starting at the last OF_finddevice before the OF_instance_to_package that I reported in the sequence of ofwcall's from quiesce until the crash...
moea64_late_bootstrap does

    chosen = OF_finddevice("/chosen");
    if (chosen != -1 && OF_getprop(chosen, "mmu", &mmui, 4) != -1) {
        mmu = OF_instance_to_package(mmui);
        if (mmu == -1 || (sz = OF_getproplen(mmu, "translations")) == -1)
            sz = 0;
        if (sz > 6144 /* tmpstksz - 2 KB headroom */)
            panic("moea64_bootstrap: too many ofw translations");

        if (sz > 0)
            moea64_add_ofw_mappings(mmup, mmu, sz);
    }

with moea64_add_ofw_mappings called. Then... moea64_add_ofw_mappings does...

    bzero(translations, sz);
    OF_getprop(OF_finddevice("/"), "#address-cells", &acells,
        sizeof(acells));
    if (OF_getprop(mmu, "translations", trans_cells, sz) == -1)
        panic("moea64_bootstrap: can't get ofw translations");

And it is the next ofwcall after that last OF_getprop that fails. (It happens to be a peer request.) Adding a dump of the pmap_bootstrapped value with the ofwcall name in my hack for reporting things about the crash confirmed that peer ofwcall as the first with pmap_bootstrapped non-zero.

I will note here that it is somewhat later than the above code that pvo_vaddr_compare ends up executing via bs_remap_earlyboot. That earlier moea64_late_bootstrap code continues after the } from the first if above with:

    /*
     * Calculate the last available physical address.
     */
    for (i = 0; phys_avail[i + 2] != 0; i += 2)
        ;
    Maxmem = powerpc_btop(phys_avail[i + 1]);

    /*
     * Initialize MMU and remap early physical mappings
     */
    MMU_CPU_BOOTSTRAP(mmup,0);
    mtmsr(mfmsr() | PSL_DR | PSL_IR);
    pmap_bootstrapped++;
    bs_remap_earlyboot();

(and more). I've not found the peer call yet but it may well be after the pvo_vaddr_compare shown above as far as execution order goes.

===
Mark Millard
markmi at dsl-only.net

On Sep 25, 2014, at 2:41 PM, Mark Millard wrote:

The first boot after make -8 kernel without quiesce also died during peer, I'd guess the same one.

Looks like quiesce does not matter for the issue. (But it is handy for identifying which peer fails.)
===
Mark Millard
markmi at dsl-only.net

On Sep 25, 2014, at 2:08 PM, Nathan Whitehorn wrote:

Can you comment out the call to quiesce? It may not be necessary on your system.
-Nathan

On 09/25/14 13:17, Mark Millard wrote:
> The "before copyright" hang/exception is during the first openfirmware "peer" after "quiesce". The ofw_restore_trap_vec(save_trap_init) completes fine, the ofwcall(args) is made but it does not return normally.
>
> Ignoring the ofwcall's from before quiesce, the sequence of ofwcall's is:
>
> quiesce
> finddevice
> parent
> getprop
> getprop
> getprop
> finddevice
> getprop
> instance-to-package
> getproplen
> finddevice
> getprop
> getprop
> peer
>
> And when the boot fails before the copyright that ofwcall for peer ends up resulting in the register dump with no register pointing to the kernel's normal stack area.
>
> I still have no clue what is happening during peer. ofw_restore_trap_vec(save_trap_init) is being called and is returning before ofwcall is used. For all I know some uses of peer could require not being quiesce'd in order for peer to be reliable.
>
> In the form of my display indicating what executed the text reported ends in:
>
> ^
>
> where the ^ indicates the stage that last completed in the call sequence inside openfirmware_core. This information is displayed by the
>
> x/s ofw_name_history
>
> in the automatically created default script for DDB. I read the sequence backwards from the end marker (here ^), following the wraparound if there is that much text and if I care to go back that far.
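The read-back procedure described above (start at the ^ marker and follow the wraparound) can be sketched in C. This is only a minimal user-space illustration under stated assumptions, not the kernel hack from the thread: HIST_SIZE, hist_putc(), hist_record() and hist_linearize() are hypothetical names, and the "<name>" record format is assumed from the '<' that the hack writes before each call name.

```c
#include <assert.h>
#include <string.h>

#define HIST_SIZE 32    /* hypothetical size; the hack in the thread uses 256 */

static char hist[HIST_SIZE + 1];    /* zero-filled; the extra byte stays '\0' */
static size_t hist_pos;             /* index of the next byte to write */

/* Append one character, wrapping to the start when the end is reached. */
static void hist_putc(char c)
{
    hist[hist_pos] = c;
    hist_pos = (hist_pos + 1) % HIST_SIZE;
}

/* Record "<name>" (assumed format) and place the '^' end marker after it. */
static void hist_record(const char *name)
{
    hist_putc('<');
    while (*name != '\0')
        hist_putc(*name++);
    hist_putc('>');
    /* Write the marker without advancing, so the next record overwrites it. */
    hist[hist_pos] = '^';
}

/*
 * Copy the history into out[] oldest byte first, so the '^' marker ends up
 * last. out must hold at least HIST_SIZE + 1 bytes.
 */
static void hist_linearize(char *out)
{
    size_t n = 0;

    assert(out != NULL);
    /* The oldest byte sits just after the marker; walk once around the ring. */
    for (size_t i = 1; i <= HIST_SIZE; i++) {
        char c = hist[(hist_pos + i) % HIST_SIZE];
        if (c != '\0')      /* skip bytes that were never written */
            out[n++] = c;
    }
    out[n] = '\0';
}
```

Once the buffer has wrapped, the oldest record is partly overwritten, so the linearized text can begin in the middle of a name. That matches the note above about following the wraparound only as far back as one cares to go.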
>
> FreeBSD FBSDG5M1 10.1-BETA2 FreeBSD 10.1-BETA2 #11 r271944M: Thu Sep 25 12:14:05 PDT 2014 root@FBSDG5M1:/usr/obj/usr/src/sys/GENERIC64 powerpc
>
> My current hacks to get this information are:
>
> Index: /usr/src/sys/ddb/db_script.c
> ===================================================================
> --- /usr/src/sys/ddb/db_script.c (revision 271944)
> +++ /usr/src/sys/ddb/db_script.c (working copy)
> @@ -319,10 +319,25 @@
>  {
>      char scriptname[DB_MAXSCRIPTNAME];
>  
> +    /* HACK!!! : Additional lines to force a basic default script to exist.
> +     * Will dump information even if ddb input is not available for early crash.
> +     * Used to get more information about PowerMac G5 "before Copyright" hangs.
> +     */
> +    struct ddb_script *dsp = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT);
> +    if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; bt; x/s ofw_name_history");
> +
>      snprintf(scriptname, sizeof(scriptname), "%s.%s",
>          DB_SCRIPT_KDBENTER_PREFIX, eventname);
>      if (db_script_exec(scriptname, 0) == ENOENT)
>          (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
> +
> +    /* HACK!!! : Additional lines to always use the default script,
> +     * even if scriptname existed and was executed.
> +     * Will dump information even if ddb input is not available for early crash.
> +     * Used to get more information about PowerMac G5 "before Copyright" hangs.
> + */ > + else > + (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); > } > =20 > /*- > Index: /usr/src/sys/powerpc/conf/GENERIC64 > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/sys/powerpc/conf/GENERIC64 (revision 271944) > +++ /usr/src/sys/powerpc/conf/GENERIC64 (working copy) > @@ -76,6 +76,8 @@ > # Debugging support. Always need this: > options KDB # Enable kernel debugger support. > options KDB_TRACE # Print a stack trace for a panic. > +options DDB > +options GDB > =20 > # Make an SMP-capable kernel by default > options SMP # Symmetric MultiProcessor Kernel > Index: /usr/src/sys/powerpc/ofw/ofw_machdep.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/sys/powerpc/ofw/ofw_machdep.c (revision 271944) > +++ /usr/src/sys/powerpc/ofw/ofw_machdep.c (working copy) > @@ -324,6 +324,12 @@ > openfirmware(&args); > } > =20 > +/* Part of HACK to have record of ofw call names */ > +#define ofw_name_history_record_size 256 > +char ofw_name_history[ofw_name_history_record_size+1] =3D {}; /* = Initially: automatically '\0' filled */ > +char * ofw_name_history_pos =3D ofw_name_history; > +/* End Part of HACK */ > + > static int > openfirmware_core(void *args) > { > @@ -330,6 +336,42 @@ > int result; > register_t oldmsr; > =20 > + { /* HACK to have record of ofw call names */ > + struct argtype_prefix { > + cell_t name; > + }; > + > + char *name =3D (char*) (uintptr_t) (((struct = argtype_prefix*)args)->name); > +=20 > + int i; > + > + *ofw_name_history_pos =3D '<'; > + > + for(i=3D0; (*name) && i!=3D20; i++) { > + ofw_name_history_pos++; > + if (ofw_name_history_pos =3D=3D = &ofw_name_history[ofw_name_history_record_size]) { > + 
ofw_name_history_pos =3D ofw_name_history; > + } > + *ofw_name_history_pos =3D *name; > + > + name++; > + } > + > + ofw_name_history_pos++; > + if (ofw_name_history_pos =3D=3D = &ofw_name_history[ofw_name_history_record_size]) { > + ofw_name_history_pos =3D ofw_name_history; > + } > + *ofw_name_history_pos =3D '>'; > + > + ofw_name_history_pos++; > + if (ofw_name_history_pos =3D=3D = &ofw_name_history[ofw_name_history_record_size]) { > + ofw_name_history_pos =3D ofw_name_history; > + } > + *ofw_name_history_pos =3D '@'; > + > + ofw_name_history[ofw_name_history_record_size] =3D '\0'; /* Paranoia = */ > + } /* HACK end */ > + > /* > * Turn off exceptions - we really don't want to end up > * anywhere unexpected with PCPU set to something strange > @@ -337,14 +379,22 @@ > */ > oldmsr =3D intr_disable(); > =20 > + *ofw_name_history_pos =3D '#'; /* HACK */ > + > ofw_sprg_prepare(); > =20 > + *ofw_name_history_pos =3D '$'; /* HACK */ > + > /* Save trap vectors */ > ofw_save_trap_vec(save_trap_of); > =20 > + *ofw_name_history_pos =3D '%'; /* HACK */ > + > /* Restore initially saved trap vectors */ > ofw_restore_trap_vec(save_trap_init); > =20 > + *ofw_name_history_pos =3D '^'; /* HACK */ > + > #if defined(AIM) && !defined(__powerpc64__) > /* > * Clear battable[] translations > @@ -357,13 +407,21 @@ > =20 > result =3D ofwcall(args); > =20 > + *ofw_name_history_pos =3D '&'; /* HACK */ > + > /* Restore trap vecotrs */ > ofw_restore_trap_vec(save_trap_of); > =20 > + *ofw_name_history_pos =3D '*'; /* HACK */ > + > ofw_sprg_restore(); > =20 > + *ofw_name_history_pos =3D '~'; /* HACK */ > + > intr_restore(oldmsr); > =20 > + *ofw_name_history_pos =3D '!'; /* HACK */ > + > return (result); > } >=20 >=20 >=20 >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 > On Sep 25, 2014, at 3:46 AM, Mark Millard = wrote: >=20 > One source code oddity that I notice is the following mixed use of = ofw_real_mode: always tested vs. never tested (#if 0 ... #endif) ... 
>=20 >> /* >> * Saved SPRG0-3 from OpenFirmware. Will be restored prior to the = callback. >> */ >> register_t ofw_sprg0_save; >>=20 >> static __inline void >> ofw_sprg_prepare(void) >> { >> if (ofw_real_mode) >> return; >>=20 >> /* >> * Assume that interrupt are disabled at this point, or >> * SPRG1-3 could be trashed >> */ >> __asm __volatile("mfsprg0 %0\n\t" >> "mtsprg0 %1\n\t" >> "mtsprg1 %2\n\t" >> "mtsprg2 %3\n\t" >> "mtsprg3 %4\n\t" >> : "=3D&r"(ofw_sprg0_save) >> : "r"(ofmsr[1]), >> "r"(ofmsr[2]), >> "r"(ofmsr[3]), >> "r"(ofmsr[4])); >> } >> =20 >> static __inline void >> ofw_sprg_restore(void) >> { >> #if 0 >> if (ofw_real_mode) >> return; >> #endif >>=20 >> /* >> * Note that SPRG1-3 contents are irrelevant. They are = scratch >> * registers used in the early portion of trap handling when >> * interrupts are disabled. >> * >> * PCPU data cannot be used until this routine is called ! >> */ >> __asm __volatile("mtsprg0 %0" :: "r"(ofw_sprg0_save)); >> } >=20 > It would seem that for ofw_real_mode !=3D 0 that ofw_sprg_prepare = would never set up ofw_sprg0_save (via mfsprg0) for the later = ofw_sprg_restore's always-executed mtsprg0 that is based on = ofw_sprg0_save. >=20 > register_t seems to trace back to __int64_t --and that would leave = ofw_sprg0_save initialized to zero as a global and that would have to be = okay as the SPRG0 value to restore in such a case. (I have not tracked = down what any of the per-processor values for SPRG0 are/should-be.) >=20 >=20 >=20 >=20 >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 > On Sep 25, 2014, at 2:12 AM, Mark Millard = wrote: >=20 > The register dump that has no kernel stack addresses in any registers = does have register contents suggesting a ofwcall use, matching up = reasonably with the code I looked at that is related to ofwcall. ofwcall = is only reached via openfirmware_core from what I can tell. 
(If there = are other paths into openfirmware than via ofwcall then the register = dump suggests that they are not in use around the crash.) >=20 > And openfirmware_core has logic for exception vector swapping, going = both directions: >=20 >> static int >> openfirmware_core(void *args) >> { >> int result; >> register_t oldmsr; >> =20 >> /* >> * Turn off exceptions - we really don't want to end up >> * anywhere unexpected with PCPU set to something strange >> * or the stack pointer wrong. >> */ >> oldmsr =3D intr_disable(); >> =20 >> ofw_sprg_prepare(); >> =20 >> /* Save trap vectors */ >> ofw_save_trap_vec(save_trap_of); >> =20 >> /* Restore initially saved trap vectors */ >> ofw_restore_trap_vec(save_trap_init); >> =20 >> #if defined(AIM) && !defined(__powerpc64__) >> /* >> * Clear battable[] translations >> */ >> if (!(cpu_features & PPC_FEATURE_64)) >> __asm __volatile("mtdbatu 2, %0\n" >> "mtdbatu 3, %0" : : "r" (0)); >> isync(); >> #endif >>=20 >> result =3D ofwcall(args); >>=20 >> /* Restore trap vecotrs */ >> ofw_restore_trap_vec(save_trap_of); >>=20 >> ofw_sprg_restore(); >>=20 >> intr_restore(oldmsr); >>=20 >> return (result); >> } >=20 > In turn openfirmware_core is used only by ofw_rendezvous_dispatch and = in turn that is used only by openfirmware: only PCPU_GET(cpuid) =3D=3D 0 = does the above. save_trap_init is initialized by powerpc_init using = ofw_save_trap_vec. >=20 > [Note that ofw_restore_trap_vec uses __syncicache which does not use = dcbf after the bcopy but instead uses dcbst: That is part of what lead = my investigation into the distinction --and so to my more overall dcbst = vs. dcbf use questions after proving dcbf would not be sufficient for a = fix to the specific boot issue.] >=20 > Unless the initialization of save_trap_init ends up with the wrong = contents for openfirmware it would appear that the exception vectors are = kept tracking by the above code. 
But the above does assume that the = openfirmware vectors are unchanged after save_trap_init is initialized: = there is no attempt at tracking of any potential updates to the = openfirmware exception vectors. >=20 > I would infer then that after ofw_restore_trap_vec(save_trap_of) is = executed is when the exception that DDB reports happened: That is when = FreeBSD's exception vectors are again in place. But a stack pointer into = the kernel stack is not then in place in any register (based on DDB's = register dump): stack handling is messed up already by the point of the = reported exception. And that may actually be why an illegal instruction = at address zero was reached: an incorrect stack context used to get an = address to execute at. >=20 >=20 >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 > On Sep 24, 2014, at 8:36 AM, Nathan Whitehorn wrote: >=20 > There shouldn't be any exceptions at that point, nested or otherwise. = What I suspect is happening is that Open Firmware has turned them on for = some bizarre reason, taken one, and ended up in the kernel's handlers = but with the Open Firmware environment. Saving and restoring the OF = interrupt vectors would be a possible solution; flattening the device = tree in loader so that the kernel doesn't call Open Firmware at all = would be another. I think Justin may have tried the first at some point. > -Nathan >=20 > On 09/24/14 02:04, Mark Millard wrote: >> Now that I've had a kernel/boot crash with a successful DDB bt and = show registers (a different submittal) it makes for a good = comparison/contrast with what DDB reports for this "before copyright" = crash. >>=20 >> Something unique to the "before copyright" context is... >>=20 >> No registers are reported to have values that point into the range = between tmpstk and esym. >>=20 >> In other words: There is no valid stack pointer reported as far as I = can tell. r1 has the value 0 instead of being a handling a valid stack = address. 
tmpstk=0xbd7000 and esym=0xbdb000 (example for one of my WITH_DEBUG_FILES= and options DDB and GDB builds of 10.1-BETA2). That at least gives a ballpark on the range to expect for pointing into the stack even with some build variation.
>>
>> It leaves me wondering if the DDB report is for a nested exception handling. That could explain why lr points to u_trap+0x10 and srr0 points to k_trap+0x28 when normally srr0 would point to the failing instruction (or the instruction after) and lr to where that routine would normally return to.
>>
>> The register values that are reported for my 10.1-BETA2 builds that crash before the copyright notice are:
>>
>> r0: 0
>> r1: 0
>> r2: 0xc81538 vop_unlock_desc
>> r3: 0xd18868
>> r4: 0x894b58
>> r5: 0
>> r6: 0xc1dee0 M_AUDITBSM
>> r7: 0xe3f818 ofw_real_mode
>> r8: 0x1
>> r9: 0xe0f580 __pcpu
>> r10: 0x1c35ec0
>> r11: 0
>> r12: 0x10000000
>> r13: 0xdbb290 thread0 (Note: another submittal has this mistyped as 0xbdd290.)
>> r14-r19: all 0
>> r20: 0x10c1000
>> r21: 0x4
>> r22: 0x180abd4
>> r23: 0x1803a28
>> r24: 0xc000000000008760
>> r25: 0xcc89b8 smp_no...
>> r26: 0xcea108 ofw_rend...
>> r27: 0x894b58 ofwcall+0xa8 >> r28: 0x894b58 ofwcall+0xa8 >> r29: 2400022 >> r30: 9000000000001032 >> r31: 0xbb7d38 >>=20 >> srr0: 0x102720 k_trap+0x28 >> srr1: 9000000000001032 >> lr: 0x1026f0 u_trap+0x10 >> ctr: 0xff846d78 >> cr: 2000deb0 >> xer: 0 >> dar: f...d50 (lots of f's) >> dsisr: 42000000 >>=20 >>=20 >>=20 >>=20 >>=20 >>=20 >> =3D=3D=3D >> Mark Millard >> markmi at dsl-only.net From owner-freebsd-ppc@FreeBSD.ORG Sat Sep 27 06:57:10 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 40F62336 for ; Sat, 27 Sep 2014 06:57:10 +0000 (UTC) Received: from asp.reflexion.net (outbound-242.asp.reflexion.net [69.84.129.242]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8337777F for ; Sat, 27 Sep 2014 06:57:08 +0000 (UTC) Received: (qmail 26742 invoked from network); 27 Sep 2014 06:57:07 -0000 Received: from unknown (HELO mail-cs-01.app.dca.reflexion.local) (10.81.19.1) by 0 (rfx-qmail) with SMTP; 27 Sep 2014 06:57:07 -0000 Received: by mail-cs-01.app.dca.reflexion.local (Reflexion email security v7.30.7) with SMTP; Sat, 27 Sep 2014 02:57:07 -0400 (EDT) Received: (qmail 17465 invoked from network); 27 Sep 2014 06:56:00 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (DHE-RSA-AES256-SHA encrypted) SMTP; 27 Sep 2014 06:56:00 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-98-246-178-138.hsd1.or.comcast.net [98.246.178.138]) by iron2.pdx.net (Postfix) with ESMTPSA id C922A1C402B; Fri, 26 Sep 2014 23:55:59 -0700 (PDT) Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: lr=u_trap+0x10 and srr0=k_trap+0x28 for "stopped at 0 illegal instruction 0" 
before-copyright hang on PowerMac G5's From: Mark Millard In-Reply-To: <7008CDAA-2DA2-419F-9BEC-AD823ECBFCCC@dsl-only.net> Date: Fri, 26 Sep 2014 23:55:59 -0700 Message-Id: <2E98A886-36FF-4B68-B729-F2143339E1DE@dsl-only.net> References: <1118046C-0FF7-49FC-82DA-DB9A7A310991@dsl-only.net> <2ED3DB50-B985-4382-8FF2-3B44E7E65453@dsl-only.net> <6D729F43-662A-429E-9503-0148EC3250B1@dsl-only.net> <72535F89-3942-45A6-B351-7F746209ED9F@dsl-only.net> <0703EF26-6E33-4446-9273-BBFD0CB72893@dsl-only.net> <37575F94-763C-43BF-8DD9-F648F4A7C09F@dsl-only.net> <5422E513.6010806@freebsd.org> <1C02D0D4-14B8-465F-B493-4D3A64E4C35C@dsl-only.net> <0DF8A9EC-C81C-4E15-9420-6831BA7D5F8E@dsl-only.net> <54248467.4050900@freebsd.org> <34AA4542-56A7-453E-A00E-868EE352C96C@dsl-only.net> <7008CDAA-2DA2-419F-9BEC-AD823ECBFCCC@dsl-only.net> To: Nathan Whitehorn X-Mailer: Apple Mail (2.1878.6) Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: Justin Hibbits , FreeBSD PowerPC ML X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 27 Sep 2014 06:57:10 -0000

According to my adjusted dumping: At the "before Copyright"/ofwcall-for-peer crash ofw_real_mode==0.

And that does turn off exception vector save/restore:

__inline void
ofw_save_trap_vec(char *save_trap_vec)
{
    if (!ofw_real_mode)
        return;

    bcopy((void *)EXC_RST, save_trap_vec, EXC_LAST - EXC_RST);
}

static __inline void
ofw_restore_trap_vec(char *restore_trap_vec)
{
    if (!ofw_real_mode)
        return;

    bcopy(restore_trap_vec, (void *)EXC_RST, EXC_LAST - EXC_RST);
    __syncicache(EXC_RSVD, EXC_LAST - EXC_RSVD);
}

So now it is clear to me how FreeBSD's exception vectors could be involved in a context that does not have FreeBSD's environment in place. (Finally!)
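The effect of that guard can be modeled in user space. The following is only an illustrative sketch, not kernel code: the 16-byte vectors array stands in for the EXC_RST..EXC_LAST range, and the 'F'/'K' fill bytes, save_trap_vec(), restore_trap_vec() and vectors_seen_by_firmware() are made-up names for the illustration. With ofw_real_mode==0 both helpers return early, so the simulated firmware call runs with the kernel's vectors still in place.

```c
#include <string.h>

#define VEC_SIZE 16                      /* arbitrary stand-in for EXC_LAST - EXC_RST */

static char vectors[VEC_SIZE];           /* stand-in for the low-memory vector area */
static char save_trap_init[VEC_SIZE];    /* firmware's vectors, saved early in boot */
static char save_trap_of[VEC_SIZE];      /* kernel's vectors, saved around each call */
static int ofw_real_mode;                /* model of the kernel flag of the same name */

static void
save_trap_vec(char *dst)
{
    if (!ofw_real_mode)
        return;                          /* guard: nothing is saved */
    memcpy(dst, vectors, VEC_SIZE);
}

static void
restore_trap_vec(const char *src)
{
    if (!ofw_real_mode)
        return;                          /* guard: nothing is restored */
    memcpy(vectors, src, VEC_SIZE);
}

/*
 * Simulate boot plus one firmware call and report whose vectors are
 * installed while the "firmware" runs: 'F' (firmware) or 'K' (kernel).
 */
static char
vectors_seen_by_firmware(int real_mode)
{
    char seen;

    ofw_real_mode = real_mode;
    memset(save_trap_init, 0, VEC_SIZE);
    memset(save_trap_of, 0, VEC_SIZE);

    memset(vectors, 'F', VEC_SIZE);      /* firmware owns the vectors at entry */
    save_trap_vec(save_trap_init);       /* early boot: remember firmware vectors */
    memset(vectors, 'K', VEC_SIZE);      /* kernel installs its own handlers */

    save_trap_vec(save_trap_of);         /* around the call: save kernel vectors */
    restore_trap_vec(save_trap_init);    /* ... and put firmware vectors back */
    seen = vectors[0];                   /* what is in place during the call */
    restore_trap_vec(save_trap_of);      /* after the call: kernel vectors again */

    return (seen);
}
```

In this model vectors_seen_by_firmware(1) yields 'F' (firmware vectors swapped in for the call) while vectors_seen_by_firmware(0) yields 'K' (kernel vectors left in place), which matches the ofw_real_mode==0 situation described above.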

For powerpc64/GENERIC64 it should also then establish OFW_STD_32BIT:

boolean_t
OF_bootstrap()
{
    boolean_t status = FALSE;

    if (openfirmware_entry != NULL) {
        if (ofw_real_mode) {
            status = OF_install(OFW_STD_REAL, 0);
        } else {
#ifdef __powerpc64__
            status = OF_install(OFW_STD_32BIT, 0);
#else
            status = OF_install(OFW_STD_DIRECT, 0);
#endif
        }

This seems to be like OFW_STD_REAL in what it sets up: ofw_real_methods.

static ofw_def_t ofw_real = {
    OFW_STD_REAL,
    ofw_real_methods,
    0
};
OFW_DEF(ofw_real);

static ofw_def_t ofw_32bit = {
    OFW_STD_32BIT,
    ofw_real_methods,
    0
};
OFW_DEF(ofw_32bit);

ofw_real_mode is used to figure out the context when it matters from what I can tell so far.

Just to experiment, to be sure, I temporarily hacked ofw_save_trap_vec and ofw_restore_trap_vec to ignore ofw_real_mode so they would be effective at exception vector swapping.

As I guessed it still hangs before the copyright notice. (Without getting to DDB so no dump information is displayed.)

===
Mark Millard
markmi at dsl-only.net

On Sep 26, 2014, at 10:18 PM, Mark Millard wrote:

The first send of this was big enough for the moderator to be involved. So I canceled and am sending with less history included.

[I'll note that I seem to have trouble typing 0xdbb290 vs. 0xbdd290. The actual value is 0xdbb290. The references to the incorrect typing should say 0xbdd290, which is the wrong value. But I've had both types of references listing the wrong text... in various notes.]

===
Mark Millard
markmi@dsl-only.net

On Sep 26, 2014, at 10:11 PM, Mark Millard wrote:

The openfirmware peer crash (i.e., the before Copyright notice crash) happens during/just-after the MMU setup and the peer ofwcall is the first ofwcall where pmap_bootstrapped is non-zero at the time. In other words: the very first ofwcall in the new context fails.

And this failure involves some of the same code area that I got a backtrace for and reported as a separate crash (with the trace listed). As a reminder, that backtrace, which has a different failure point, was:

.pvo_vaddr_compare+0x14, instruction ld r0, r4, 0x58 [or ld r0,88(r4) in an alternate notation]
.pvo_tree_RB_FIND+0x38
.moea64_dev_direct_mapped+0x90
.pmap_dev_direct_mapped+0x84 ("_dev" was missing in earlier note)
.bs_remap_earlyboot+0x6c
.moea64_late_bootstrap+0x178
.moea64_bootstrap_native+0x120
.pmap_bootstrap+0xac
.powerpc_init+0x514
btext+0xa8

As for the sequence of ofwcall's that I reported: starting at the last OF_finddevice before the OF_instance_to_package that I reported in the sequence of ofwcall's from quiesce until the crash...

moea64_late_bootstrap does

    chosen = OF_finddevice("/chosen");
    if (chosen != -1 && OF_getprop(chosen, "mmu", &mmui, 4) != -1) {
        mmu = OF_instance_to_package(mmui);
        if (mmu == -1 ||
            (sz = OF_getproplen(mmu, "translations")) == -1)
            sz = 0;
        if (sz > 6144 /* tmpstksz - 2 KB headroom */)
            panic("moea64_bootstrap: too many ofw translations");

        if (sz > 0)
            moea64_add_ofw_mappings(mmup, mmu, sz);
    }

with moea64_add_ofw_mappings called. Then...

moea64_add_ofw_mappings does...

    bzero(translations, sz);
    OF_getprop(OF_finddevice("/"), "#address-cells", &acells,
        sizeof(acells));
    if (OF_getprop(mmu, "translations", trans_cells, sz) == -1)
        panic("moea64_bootstrap: can't get ofw translations");

And it is the next ofwcall after that last OF_getprop that fails. (It happens to be a peer request.) Adding a dump of the pmap_bootstrapped value with the ofwcall name in my hack for reporting things about the crash confirmed that peer ofwcall as the first with pmap_bootstrapped non-zero.

I will note here that it is somewhat later than the above code that pvo_vaddr_compare ends up executing via bs_remap_earlyboot.
That earlier moea64_late_bootstrap code continues after the } from the first if above with:

    /*
     * Calculate the last available physical address.
     */
    for (i = 0; phys_avail[i + 2] != 0; i += 2)
        ;
    Maxmem = powerpc_btop(phys_avail[i + 1]);

    /*
     * Initialize MMU and remap early physical mappings
     */
    MMU_CPU_BOOTSTRAP(mmup,0);
    mtmsr(mfmsr() | PSL_DR | PSL_IR);
    pmap_bootstrapped++;
    bs_remap_earlyboot();

(and more). I've not found the peer call yet but it may well be after the pvo_vaddr_compare shown above as far as execution order goes.

===
Mark Millard
markmi at dsl-only.net

On Sep 25, 2014, at 2:41 PM, Mark Millard wrote:

The first boot after make -8 kernel without quiesce also died during peer, I'd guess the same one.

Looks like quiesce does not matter for the issue. (But it is handy for identifying which peer fails.)

===
Mark Millard
markmi at dsl-only.net

On Sep 25, 2014, at 2:08 PM, Nathan Whitehorn wrote:

Can you comment out the call to quiesce? It may not be necessary on your system.
-Nathan

On 09/25/14 13:17, Mark Millard wrote:
> The "before copyright" hang/exception is during the first openfirmware "peer" after "quiesce". The ofw_restore_trap_vec(save_trap_init) completes fine, the ofwcall(args) is made but it does not return normally.
>
> Ignoring the ofwcall's from before quiesce, the sequence of ofwcall's is:
>
> quiesce
> finddevice
> parent
> getprop
> getprop
> getprop
> finddevice
> getprop
> instance-to-package
> getproplen
> finddevice
> getprop
> getprop
> peer
>
> And when the boot fails before the copyright that ofwcall for peer ends up resulting in the register dump with no register pointing to the kernel's normal stack area.
>
> I still have no clue what is happening during peer. ofw_restore_trap_vec(save_trap_init) is being called and is returning before ofwcall is used. For all I know some uses of peer could require not being quiesce'd in order for peer to be reliable.
>=20 > In the form of my display indicating what executed the text reported = ends in: >=20 > ^ >=20 > where the ^ indicates the stage that last completed in the call = sequence inside openfirmware_core. This information is displayed by the >=20 > x/s ofw_name_history >=20 > in the automatically created default script for DDB. I read the = sequence backwards from the end marker (here ^), following the = wraparound if there is that much text and if I care to go back that far. >=20 > FreeBSD FBSDG5M1 10.1-BETA2 FreeBSD 10.1-BETA2 #11 r271944M: Thu Sep = 25 12:14:05 PDT 2014 root@FBSDG5M1:/usr/obj/usr/src/sys/GENERIC64 = powerpc >=20 > My current hacks to get this information are: >=20 > Index: /usr/src/sys/ddb/db_script.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/sys/ddb/db_script.c (revision 271944) > +++ /usr/src/sys/ddb/db_script.c (working copy) > @@ -319,10 +319,25 @@ > { > char scriptname[DB_MAXSCRIPTNAME]; > =20 > + /* HACK!!! : Additional lines to force a basic default script to = exist. > + * Will dump information even if ddb input is not available for = early crash. > + * Used to get more information about PowerMac G5 "before Copyright" = hangs. > + */ > + struct ddb_script *dsp =3D = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT); > + if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; = bt; x/s ofw_name_history"); > + > snprintf(scriptname, sizeof(scriptname), "%s.%s", > DB_SCRIPT_KDBENTER_PREFIX, eventname); > if (db_script_exec(scriptname, 0) =3D=3D ENOENT) > (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); > + > + /* HACK!!! : Additional lines to always use the default script, > + * even if scriptname existed and was executed. > + * Will dump information even if ddb input is not available for = early crash. 
> + * Used to get more information about PowerMac G5 "before Copyright" = hangs. > + */ > + else > + (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0); > } > =20 > /*- > Index: /usr/src/sys/powerpc/conf/GENERIC64 > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/sys/powerpc/conf/GENERIC64 (revision 271944) > +++ /usr/src/sys/powerpc/conf/GENERIC64 (working copy) > @@ -76,6 +76,8 @@ > # Debugging support. Always need this: > options KDB # Enable kernel debugger support. > options KDB_TRACE # Print a stack trace for a panic. > +options DDB > +options GDB > =20 > # Make an SMP-capable kernel by default > options SMP # Symmetric MultiProcessor Kernel > Index: /usr/src/sys/powerpc/ofw/ofw_machdep.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/sys/powerpc/ofw/ofw_machdep.c (revision 271944) > +++ /usr/src/sys/powerpc/ofw/ofw_machdep.c (working copy) > @@ -324,6 +324,12 @@ > openfirmware(&args); > } > =20 > +/* Part of HACK to have record of ofw call names */ > +#define ofw_name_history_record_size 256 > +char ofw_name_history[ofw_name_history_record_size+1] =3D {}; /* = Initially: automatically '\0' filled */ > +char * ofw_name_history_pos =3D ofw_name_history; > +/* End Part of HACK */ > + > static int > openfirmware_core(void *args) > { > @@ -330,6 +336,42 @@ > int result; > register_t oldmsr; > =20 > + { /* HACK to have record of ofw call names */ > + struct argtype_prefix { > + cell_t name; > + }; > + > + char *name =3D (char*) (uintptr_t) (((struct = argtype_prefix*)args)->name); > +=20 > + int i; > + > + *ofw_name_history_pos =3D '<'; > + > + for(i=3D0; (*name) && i!=3D20; i++) { > + ofw_name_history_pos++; > + if 
(ofw_name_history_pos =3D=3D = &ofw_name_history[ofw_name_history_record_size]) { > + ofw_name_history_pos =3D ofw_name_history; > + } > + *ofw_name_history_pos =3D *name; > + > + name++; > + } > + > + ofw_name_history_pos++; > + if (ofw_name_history_pos =3D=3D = &ofw_name_history[ofw_name_history_record_size]) { > + ofw_name_history_pos =3D ofw_name_history; > + } > + *ofw_name_history_pos =3D '>'; > + > + ofw_name_history_pos++; > + if (ofw_name_history_pos =3D=3D = &ofw_name_history[ofw_name_history_record_size]) { > + ofw_name_history_pos =3D ofw_name_history; > + } > + *ofw_name_history_pos =3D '@'; > + > + ofw_name_history[ofw_name_history_record_size] =3D '\0'; /* Paranoia = */ > + } /* HACK end */ > + > /* > * Turn off exceptions - we really don't want to end up > * anywhere unexpected with PCPU set to something strange > @@ -337,14 +379,22 @@ > */ > oldmsr =3D intr_disable(); > =20 > + *ofw_name_history_pos =3D '#'; /* HACK */ > + > ofw_sprg_prepare(); > =20 > + *ofw_name_history_pos =3D '$'; /* HACK */ > + > /* Save trap vectors */ > ofw_save_trap_vec(save_trap_of); > =20 > + *ofw_name_history_pos =3D '%'; /* HACK */ > + > /* Restore initially saved trap vectors */ > ofw_restore_trap_vec(save_trap_init); > =20 > + *ofw_name_history_pos =3D '^'; /* HACK */ > + > #if defined(AIM) && !defined(__powerpc64__) > /* > * Clear battable[] translations > @@ -357,13 +407,21 @@ > =20 > result =3D ofwcall(args); > =20 > + *ofw_name_history_pos =3D '&'; /* HACK */ > + > /* Restore trap vecotrs */ > ofw_restore_trap_vec(save_trap_of); > =20 > + *ofw_name_history_pos =3D '*'; /* HACK */ > + > ofw_sprg_restore(); > =20 > + *ofw_name_history_pos =3D '~'; /* HACK */ > + > intr_restore(oldmsr); > =20 > + *ofw_name_history_pos =3D '!'; /* HACK */ > + > return (result); > } >=20 >=20 >=20 >=20 >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net >=20 From owner-freebsd-ppc@FreeBSD.ORG Sat Sep 27 07:47:06 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: 
from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CE7DDB41 for ; Sat, 27 Sep 2014 07:47:06 +0000 (UTC) Received: from asp.reflexion.net (outbound-242.asp.reflexion.net [69.84.129.242]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 670F4B68 for ; Sat, 27 Sep 2014 07:47:05 +0000 (UTC) Received: (qmail 30526 invoked from network); 27 Sep 2014 07:47:04 -0000 Received: from unknown (HELO mail-cs-03.app.dca.reflexion.local) (10.81.19.3) by 0 (rfx-qmail) with SMTP; 27 Sep 2014 07:47:04 -0000 Received: by mail-cs-03.app.dca.reflexion.local (Reflexion email security v7.30.7) with SMTP; Sat, 27 Sep 2014 03:47:04 -0400 (EDT) Received: (qmail 18419 invoked from network); 27 Sep 2014 07:47:04 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (DHE-RSA-AES256-SHA encrypted) SMTP; 27 Sep 2014 07:47:04 -0000 X-No-Relay: not in my network Received: from [192.168.1.8] (c-98-246-178-138.hsd1.or.comcast.net [98.246.178.138]) by iron2.pdx.net (Postfix) with ESMTPSA id 2AED31C402B for ; Sat, 27 Sep 2014 00:47:03 -0700 (PDT) From: Mark Millard Subject: backtrace information from the 2nd(?) most common boot crash place on PowerMac G5's: just after real memory = ... (... 
MB) Message-Id: Date: Sat, 27 Sep 2014 00:47:02 -0700 To: FreeBSD PowerPC ML Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) X-Mailer: Apple Mail (2.1878.6) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 27 Sep 2014 07:47:07 -0000

The following includes backtrace information from the 2nd most common boot crash place in the boot message sequence on PowerMac G5's: just after it reports real memory = ... (... MB).

Classically it reports data storage interrupt here and it did again. But more is dumped in my current configuration than before.

FreeBSD FBSDG5M1 10.1-BETA2 FreeBSD 10.1-BETA2 #16 r271944M: Fri Sep 26 23:01:54 PDT 2014 root@FBSDG5M1:/usr/obj/usr/src/sys/GENERIC64 powerpc

but with options DDB and GDB in GENERIC64, WITH_DEBUG_FILES=, WITHOUT_CLANG=, WITH_DEBUG= in /etc/make.conf. Also: DDB hacked to dump various things automatically so it happens during early boot crashes/hangs.

The information reported was...

fatal kernel trap
exception = 0x300 (data storage interrupt)
virtual address = 0x75e0000
dsisr = 0x42000000
curthread = 0xdbc290
pid = 0, comm =

srr0: 0x885608 .moea64_zero_page+0x1ac (a dcbz r0,r10)
lr: 0x8ba31c .pmap_zero_page+0x7c
ctr: 0x88545c .moea64_zero_page

0x8ba318: .pmap_zero_page+0x78
0x84167c: .kmem_back+0x2d0
0x8417fc: .kmem_malloc+0x7c
0x840dc4: .vm_ksubmap_init+0x8c
0x882130: .cpu_startup+0x10c
0x4d9c10: .mi_startup+0x10c
btext+0xbc (???)

r0: 0x1
r1: 0xc000000000008740
r2: 0xd19468
r3: 0xe4d3a8 mmu_kernel_obj
r4: 0xc000000002bfc290
r5: 0xc7dfa0 mmu_zero_page_desc
r6: 0xc000000000063af8
r7: 0x2
r8: 0xe0c310 vm_phys_free_queues
r9: 0x80 dbsize+0xc
r10: 0x7f5e0000
r11: 0x80 dbsize+0xc
r12: 0x24042042
r13: 0xdbc290 thread0
r14-r19: all 0
r20: 0x10c2000
r21: 0x4
r22: 0x163f000
r23: 0xc0000000d03fd000
r24: 0x3800
r25: 0x262
r26: 0x400000000000000
r27: 0xe4d3a8 mmu_kernel_obj
r28: 0xc000000002bfc290
r29: 0xc000000002bfc290 (yes: again)
r30: 0x75e0000
r31: 0xc000000000008740

cr: 0x44042044
xer: 0

(I did not write down srr1. Drat.)

===
Mark Millard
markmi at dsl-only.net

From owner-freebsd-ppc@FreeBSD.ORG Sat Sep 27 10:51:41 2014 Return-Path: Delivered-To: freebsd-ppc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id EC32428F for ; Sat, 27 Sep 2014 10:51:41 +0000 (UTC) Received: from asp.reflexion.net (outbound-242.asp.reflexion.net [69.84.129.242]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 28412D66 for ; Sat, 27 Sep 2014 10:51:39 +0000 (UTC) Received: (qmail 24398 invoked from network); 27 Sep 2014 10:51:38 -0000 Received: from unknown (HELO mail-cs-03.app.dca.reflexion.local) (10.81.19.3) by 0 (rfx-qmail) with SMTP; 27 Sep 2014 10:51:38 -0000 Received: by mail-cs-03.app.dca.reflexion.local (Reflexion email security v7.30.7) with SMTP; Sat, 27 Sep 2014 06:51:38 -0400 (EDT) Received: (qmail 7517 invoked from network); 27 Sep 2014 10:51:34 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (DHE-RSA-AES256-SHA encrypted) SMTP; 27 Sep 2014 10:51:34 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-98-246-178-138.hsd1.or.comcast.net
[98.246.178.138]) by iron2.pdx.net (Postfix) with ESMTPSA id C34391C402B; Sat, 27 Sep 2014 03:51:31 -0700 (PDT) Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: lr=u_trap+0x10 and srr0=k_trap+0x28 for "stopped at 0 illegal instruction 0" before-copyright hang on PowerMac G5's From: Mark Millard In-Reply-To: <2E98A886-36FF-4B68-B729-F2143339E1DE@dsl-only.net> Date: Sat, 27 Sep 2014 03:51:32 -0700 Message-Id: References: <1118046C-0FF7-49FC-82DA-DB9A7A310991@dsl-only.net> <2ED3DB50-B985-4382-8FF2-3B44E7E65453@dsl-only.net> <6D729F43-662A-429E-9503-0148EC3250B1@dsl-only.net> <72535F89-3942-45A6-B351-7F746209ED9F@dsl-only.net> <0703EF26-6E33-4446-9273-BBFD0CB72893@dsl-only.net> <37575F94-763C-43BF-8DD9-F648F4A7C09F@dsl-only.net> <5422E513.6010806@freebsd.org> <1C02D0D4-14B8-465F-B493-4D3A64E4C35C@dsl-only.net> <0DF8A9EC-C81C-4E15-9420-6831BA7D5F8E@dsl-only.net> <54248467.4050900@freebsd.org> <34AA4542-56A7-453E-A00E-868EE352C96C@dsl-only.net> <7008CDAA-2DA2-419F-9BEC-AD823ECBFCCC@dsl-only.net> <2E98A886-36FF-4B68-B729-F2143339E1DE@dsl-only.net> To: Nathan Whitehorn X-Mailer: Apple Mail (2.1878.6) Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: Justin Hibbits , FreeBSD PowerPC ML X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 27 Sep 2014 10:51:42 -0000

I found the backtrace for the OF_peer call that leads to the "before copyright"/ofwcall-for-peer hang/crash in ofwcall. This happens to be the first ofwcall with pmap_bootstrapped!=0, which may be the biggest issue involved (for what it implies).

.OF_peer+0x8c
.powermac_smp_first_cpu+0x3c (OF_peer(0) below)
.platform_smp_first_cpu+0x78
.cpu_mp_setmaxid+0x2c (via .mpt_fc_els_reply_handler+0x2e68 that is not explicitly listed)
.mp_setmaxid+0x14
.mi_startup+0x10c
btext+0xbc

The source code involved is:

static int
powermac_smp_first_cpu(platform_t plat, struct cpuref *cpuref)
{
    char buf[8];
    phandle_t cpu, dev, root;
    int res;

    root = OF_peer(0);

    dev = OF_child(root);
    while (dev != 0) {
        res = OF_getprop(dev, "name", buf, sizeof(buf));
        if (res > 0 && strcmp(buf, "cpus") == 0)
            break;
        dev = OF_peer(dev);
    }
    if (dev == 0) {
        /*
         * psim doesn't have a name property on the /cpus node,
         * but it can be found directly
         */
        dev = OF_finddevice("/cpus");
        if (dev == -1)
            return (ENOENT);
    }

    cpu = OF_child(dev);

    while (cpu != 0) {
        res = OF_getprop(cpu, "device_type", buf, sizeof(buf));
        if (res > 0 && strcmp(buf, "cpu") == 0)
            break;
        cpu = OF_peer(cpu);
    }
    if (cpu == 0)
        return (ENOENT);

    return (powermac_smp_fill_cpuref(cpuref, cpu));
}

To check if the peer use is special I temporarily made OF_peer cache the node 0 result so only the first such call uses ofwcall. (The above is not the first such call.) The expectation is that the OF_child should then fail. And it does. So peer is not special: it is just whichever ofwcall argument type happens to be the first after pmap_bootstrapped!=0 that gets the problem.

===
Mark Millard
markmi at dsl-only.net

On Sep 26, 2014, at 11:55 PM, Mark Millard wrote:

According to my adjusted dumping: At the "before Copyright"/ofwcall-for-peer crash ofw_real_mode==0.
And that does turn off exception vector save/restore:

__inline void
ofw_save_trap_vec(char *save_trap_vec)
{
    if (!ofw_real_mode)
        return;

    bcopy((void *)EXC_RST, save_trap_vec, EXC_LAST - EXC_RST);
}

static __inline void
ofw_restore_trap_vec(char *restore_trap_vec)
{
    if (!ofw_real_mode)
        return;

    bcopy(restore_trap_vec, (void *)EXC_RST, EXC_LAST - EXC_RST);
    __syncicache(EXC_RSVD, EXC_LAST - EXC_RSVD);
}

So now it is clear to me how FreeBSD's exception vectors could be involved in a context that does not have FreeBSD's environment in place. (Finally!)

For powerpc64/GENERIC64 it should also then establish OFW_STD_32BIT:

boolean_t
OF_bootstrap()
{
    boolean_t status = FALSE;

    if (openfirmware_entry != NULL) {
        if (ofw_real_mode) {
            status = OF_install(OFW_STD_REAL, 0);
        } else {
#ifdef __powerpc64__
            status = OF_install(OFW_STD_32BIT, 0);
#else
            status = OF_install(OFW_STD_DIRECT, 0);
#endif
        }

This seems to be like OFW_STD_REAL in what it sets up: ofw_real_methods.

static ofw_def_t ofw_real = {
    OFW_STD_REAL,
    ofw_real_methods,
    0
};
OFW_DEF(ofw_real);

static ofw_def_t ofw_32bit = {
    OFW_STD_32BIT,
    ofw_real_methods,
    0
};
OFW_DEF(ofw_32bit);

ofw_real_mode is used to figure out the context when it matters from what I can tell so far.

Just to experiment to be sure I temporarily hacked in ignoring ofw_real_mode in ofw_save_trap_vec and ofw_restore_trap_vec so they would be effective at exception vector swapping.

As I guessed it still hangs before the copyright notice. (Without getting to DDB so no dump information is displayed.)

===
Mark Millard
markmi at dsl-only.net

On Sep 26, 2014, at 10:18 PM, Mark Millard wrote:

The first send of this was big enough for the moderator to be involved. So I canceled and am sending with less history included.

[I'll note that I seem to have trouble typing 0xdbb290 vs. 0xbdd290. The actual value is 0xdbb290. The references to the incorrect typing should say 0xbdd290, which is the wrong value.
But I've had both types of references listing the wrong text... in various notes.]

===
Mark Millard
markmi@dsl-only.net

On Sep 26, 2014, at 10:11 PM, Mark Millard wrote:

The openfirmware peer crash (i.e., the before Copyright notice crash) happens during/just-after the MMU setup and the peer ofwcall is the first ofwcall where pmap_bootstrapped is non-zero at the time. In other words: the very first ofwcall in the new context fails.

And this failure involves some of the same code area that I got a backtrace for and reported as a separate crash (with the trace listed). As a reminder for that backtrace that has a different failure point:

.pvo_vaddr_compare+0x14, instruction ld r0, r4, 0x58 [or ld r0,88(r4) in an alternate notation]
.pvo_tree_RB_FIND+0x38
.moea64_dev_direct_mapped+0x90
.pmap_dev_direct_mapped+0x84 ("_dev" was missing in earlier note)
.bs_remap_earlyboot+0x6c
.moea64_late_bootstrap+0x178
.moea64_bootstrap_native+0x120
.pmap_bootstrap+0xac
.powerpc_init+0x514
btext+0xa8

As for the sequence of ofwcall's that I reported: starting at the last OF_finddevice before the OF_instance_to_package that I reported in the sequence of ofwcall's from quiesce until the crash...

moea64_late_bootstrap does

    chosen = OF_finddevice("/chosen");
    if (chosen != -1 && OF_getprop(chosen, "mmu", &mmui, 4) != -1) {
        mmu = OF_instance_to_package(mmui);
        if (mmu == -1 ||
            (sz = OF_getproplen(mmu, "translations")) == -1)
            sz = 0;
        if (sz > 6144 /* tmpstksz - 2 KB headroom */)
            panic("moea64_bootstrap: too many ofw translations");

        if (sz > 0)
            moea64_add_ofw_mappings(mmup, mmu, sz);
    }

with moea64_add_ofw_mappings called. Then... moea64_add_ofw_mappings does...

    bzero(translations, sz);
    OF_getprop(OF_finddevice("/"), "#address-cells", &acells,
        sizeof(acells));
    if (OF_getprop(mmu, "translations", trans_cells, sz) == -1)
        panic("moea64_bootstrap: can't get ofw translations");

And it is the next ofwcall after that last OF_getprop that fails.
(It happens to be a peer request.) Adding a dump of the pmap_bootstrapped value with the ofwcall name in my hack for reporting things about the crash confirmed that peer ofwcall as the first with pmap_bootstrapped non-zero.

I will note here that it is somewhat later than the above code that pvo_vaddr_compare ends up executing via bs_remap_earlyboot. That earlier moea64_late_bootstrap code continues after the } from the first if above with:

    /*
     * Calculate the last available physical address.
     */
    for (i = 0; phys_avail[i + 2] != 0; i += 2)
        ;
    Maxmem = powerpc_btop(phys_avail[i + 1]);

    /*
     * Initialize MMU and remap early physical mappings
     */
    MMU_CPU_BOOTSTRAP(mmup,0);
    mtmsr(mfmsr() | PSL_DR | PSL_IR);
    pmap_bootstrapped++;
    bs_remap_earlyboot();

(and more). I've not found the peer call yet but it may well be after the pvo_vaddr_compare shown above as far as execution order goes.

===
Mark Millard
markmi at dsl-only.net

On Sep 25, 2014, at 2:41 PM, Mark Millard wrote:

The first boot after make -8 kernel without quiesce also died during peer, I'd guess the same one.

Looks like quiesce does not matter for the issue. (But it is handy for identifying which peer fails.)

===
Mark Millard
markmi at dsl-only.net

On Sep 25, 2014, at 2:08 PM, Nathan Whitehorn wrote:

Can you comment out the call to quiesce? It may not be necessary on your system.
-Nathan

On 09/25/14 13:17, Mark Millard wrote:
> The "before copyright" hang/exception is during the first openfirmware "peer" after "quiesce". The ofw_restore_trap_vec(save_trap_init) completes fine, the ofwcall(args) is made but it does not return normally.
>
> Ignoring the ofwcall's from before quiesce, the sequence of ofwcall's is:
>
> quiesce
> finddevice
> parent
> getprop
> getprop
> getprop
> finddevice
> getprop
> instance-to-package
> getproplen
> finddevice
> getprop
> getprop
> peer
>
> And when the boot fails before the copyright that ofwcall for peer ends up resulting in the register dump with no register pointing to the kernel's normal stack area.
>
> I still have no clue what is happening during peer. ofw_restore_trap_vec(save_trap_init) is being called and is returning before ofwcall is used. For all I know some uses of peer could require not being quiesce'd in order for peer to be reliable.
>
> In the form of my display indicating what executed the text reported ends in:
>
> ^
>
> where the ^ indicates the stage that last completed in the call sequence inside openfirmware_core. This information is displayed by the
>
> x/s ofw_name_history
>
> in the automatically created default script for DDB. I read the sequence backwards from the end marker (here ^), following the wraparound if there is that much text and if I care to go back that far.
>
> FreeBSD FBSDG5M1 10.1-BETA2 FreeBSD 10.1-BETA2 #11 r271944M: Thu Sep 25 12:14:05 PDT 2014 root@FBSDG5M1:/usr/obj/usr/src/sys/GENERIC64 powerpc
>
> My current hacks to get this information are:
>
> Index: /usr/src/sys/ddb/db_script.c
> ===================================================================
> --- /usr/src/sys/ddb/db_script.c (revision 271944)
> +++ /usr/src/sys/ddb/db_script.c (working copy)
> @@ -319,10 +319,25 @@
>  {
>      char scriptname[DB_MAXSCRIPTNAME];
>
> +    /* HACK!!! : Additional lines to force a basic default script to exist.
> +     * Will dump information even if ddb input is not available for early crash.
> +     * Used to get more information about PowerMac G5 "before Copyright" hangs.
> +     */
> +    struct ddb_script *dsp = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT);
> +    if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; bt; x/s ofw_name_history");
> +
>      snprintf(scriptname, sizeof(scriptname), "%s.%s",
>          DB_SCRIPT_KDBENTER_PREFIX, eventname);
>      if (db_script_exec(scriptname, 0) == ENOENT)
>          (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
> +
> +    /* HACK!!! : Additional lines to always use the default script,
> +     * even if scriptname existed and was executed.
> +     * Will dump information even if ddb input is not available for early crash.
> +     * Used to get more information about PowerMac G5 "before Copyright" hangs.
> +     */
> +    else
> +        (void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
>  }
>
>  /*-
> Index: /usr/src/sys/powerpc/conf/GENERIC64
> ===================================================================
> --- /usr/src/sys/powerpc/conf/GENERIC64 (revision 271944)
> +++ /usr/src/sys/powerpc/conf/GENERIC64 (working copy)
> @@ -76,6 +76,8 @@
>  # Debugging support. Always need this:
>  options KDB # Enable kernel debugger support.
>  options KDB_TRACE # Print a stack trace for a panic.
> +options DDB
> +options GDB
>
>  # Make an SMP-capable kernel by default
>  options SMP # Symmetric MultiProcessor Kernel
> Index: /usr/src/sys/powerpc/ofw/ofw_machdep.c
> ===================================================================
> --- /usr/src/sys/powerpc/ofw/ofw_machdep.c (revision 271944)
> +++ /usr/src/sys/powerpc/ofw/ofw_machdep.c (working copy)
> @@ -324,6 +324,12 @@
>      openfirmware(&args);
>  }
>
> +/* Part of HACK to have record of ofw call names */
> +#define ofw_name_history_record_size 256
> +char ofw_name_history[ofw_name_history_record_size+1] = {}; /* Initially: automatically '\0' filled */
> +char * ofw_name_history_pos = ofw_name_history;
> +/* End Part of HACK */
> +
>  static int
>  openfirmware_core(void *args)
>  {
> @@ -330,6 +336,42 @@
>      int result;
>      register_t oldmsr;
>
> +    { /* HACK to have record of ofw call names */
> +        struct argtype_prefix {
> +            cell_t name;
> +        };
> +
> +        char *name = (char*) (uintptr_t) (((struct argtype_prefix*)args)->name);
> +
> +        int i;
> +
> +        *ofw_name_history_pos = '<';
> +
> +        for(i=0; (*name) && i!=20; i++) {
> +            ofw_name_history_pos++;
> +            if (ofw_name_history_pos == &ofw_name_history[ofw_name_history_record_size]) {
> +                ofw_name_history_pos = ofw_name_history;
> +            }
> +            *ofw_name_history_pos = *name;
> +
> +            name++;
> +        }
> +
> +        ofw_name_history_pos++;
> +        if (ofw_name_history_pos == &ofw_name_history[ofw_name_history_record_size]) {
> +            ofw_name_history_pos = ofw_name_history;
> +        }
> +        *ofw_name_history_pos = '>';
> +
> +        ofw_name_history_pos++;
> +        if (ofw_name_history_pos == &ofw_name_history[ofw_name_history_record_size]) {
> +            ofw_name_history_pos = ofw_name_history;
> +        }
> +        *ofw_name_history_pos = '@';
> +
> +        ofw_name_history[ofw_name_history_record_size] = '\0'; /* Paranoia */
> +    } /* HACK end */
> +
>      /*
>       * Turn off exceptions - we really don't want to end up
>       * anywhere unexpected with PCPU set to something strange
> @@ -337,14 +379,22 @@
>       */
>      oldmsr = intr_disable();
>
> +    *ofw_name_history_pos = '#'; /* HACK */
> +
>      ofw_sprg_prepare();
>
> +    *ofw_name_history_pos = '$'; /* HACK */
> +
>      /* Save trap vectors */
>      ofw_save_trap_vec(save_trap_of);
>
> +    *ofw_name_history_pos = '%'; /* HACK */
> +
>      /* Restore initially saved trap vectors */
>      ofw_restore_trap_vec(save_trap_init);
>
> +    *ofw_name_history_pos = '^'; /* HACK */
> +
>  #if defined(AIM) && !defined(__powerpc64__)
>      /*
>       * Clear battable[] translations
> @@ -357,13 +407,21 @@
>
>      result = ofwcall(args);
>
> +    *ofw_name_history_pos = '&'; /* HACK */
> +
>      /* Restore trap vecotrs */
>      ofw_restore_trap_vec(save_trap_of);
>
> +    *ofw_name_history_pos = '*'; /* HACK */
> +
>      ofw_sprg_restore();
>
> +    *ofw_name_history_pos = '~'; /* HACK */
> +
>      intr_restore(oldmsr);
>
> +    *ofw_name_history_pos = '!'; /* HACK */
> +
>      return (result);
>  }
>
>
>
>
>
> ===
> Mark Millard
> markmi at dsl-only.net
>
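
For anyone who wants to experiment with the call-name history idea from the patch above outside the kernel, here is a minimal userland sketch. It is not the kernel code: the names hist_reset, hist_record, hist_mark and hist_text are hypothetical, and it uses an index instead of the pointer the patch uses. It is only meant to show how the record wraps and why each new '<' replaces the previous call's last stage marker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userland stand-in for the ofw_name_history hack. */
#define HIST_SIZE     256   /* same record size as the patch */
#define HIST_NAME_MAX 20    /* same name-length cap as the patch */

static char hist[HIST_SIZE + 1];  /* final byte always stays '\0' */
static size_t hist_pos;           /* index of the most recently written byte */

/* Clear the record and start again at the beginning. */
void hist_reset(void)
{
    memset(hist, 0, sizeof(hist));
    hist_pos = 0;
}

/* Step to the next slot, wrapping at the end of the record. */
static void hist_advance(void)
{
    hist_pos = (hist_pos + 1) % HIST_SIZE;
}

/*
 * Record "<name>@". The '<' is written at the current slot, so it
 * overwrites whatever stage marker the previous call left there.
 */
void hist_record(const char *name)
{
    int i;

    assert(name != NULL);
    hist[hist_pos] = '<';
    for (i = 0; name[i] != '\0' && i != HIST_NAME_MAX; i++) {
        hist_advance();
        hist[hist_pos] = name[i];
    }
    hist_advance();
    hist[hist_pos] = '>';
    hist_advance();
    hist[hist_pos] = '@';
}

/* Overwrite the current slot with a stage marker such as '#' or '^'. */
void hist_mark(char stage)
{
    hist[hist_pos] = stage;
}

/* NUL-terminated view of the record, like "x/s ofw_name_history". */
const char *hist_text(void)
{
    return hist;
}
```

With this sketch, recording "quiesce", marking '!', then recording "peer" and marking '^' leaves the text <quiesce><peer>^ in the buffer: the '!' from the completed quiesce call is replaced by the '<' of the next record, and the trailing '^' is the last stage reached.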