From: Nathan Whitehorn
Date: Wed, 18 Feb 2015 07:35:00 -0800
To: freebsd-ppc@freebsd.org
Subject: Re: Fixing powerpc64 /boot/loader's kernel page handing: suggestions?
Message-ID: <54E4B124.9040006@freebsd.org>
List-Id: Porting FreeBSD to the PowerPC

Thanks for diagnosing this! The syncicache spans the whole kernel out of laziness. As you note, it isn't appropriate.
If there are more instances of this kind of thing, then it might make sense to try to make ld emit only one PT_LOAD program section as a long-term solution. I'll look into that soon.
-Nathan

On 02/10/15 17:16, Mark Millard wrote:
> Context:
>
> Unfortunately this takes me a bit to describe...
>
> powerpc64 FreeBSD 10.1-??? variants on a PowerMac G5 Quad-Core, built on the same machine. I expect the issue applies to some plain powerpc contexts as well as some other powerpc64 contexts. An example context where my issue occurs is:
>
>> 10.1-RELEASE-p5
>> 10.1-RELEASE-p5
>> FreeBSD FBSDG5M1 10.1-RELEASE-p5 FreeBSD 10.1-RELEASE-p5 #0 r277808M: Fri Jan 30 00:58:33 PST 2015 root@FBSDG5M1:/usr/obj/usr/home/markmi/src_10_1_releng/sys/GENERIC64vtsc powerpc
>
> But I also get it for various vintages of 10.1-STABLE (and 11.0-CURRENT). I use 10.1-RELEASE-p5 here because I happen to have a build that avoids the problem, I know what to set to regenerate that build, and I know at least one thing to turn on to create the problem.
>
>> root@FBSDG5M1:/usr/home/markmi/src_10_1_releng # more sys/powerpc/conf/GENERIC64vtsc
>> include GENERIC64
>> ident GENERIC64vtsc
>>
>> nooptions PS3 #Sony Playstation 3 HACK!!! to allow sc
>>
>> options DDB # HACK!!! to dump early crash info (but 11.0-CURRENT already has it)
>> options GDB # HACK!!! ...
>> options VERBOSE_SYSINIT # VERBOSE_SYSINIT blocks direct booting for my 10.1-RELEASE-p5 variants: Crashes when the loader is in __syncicache doing dcbst's.
>> options BOOTVERBOSE=1
>> options BOOTHOWTO=RB_VERBOSE
>> #options KTR
>> #options KTR_MASK=KTR_TRAP
>> #options KTR_CPUMASK=0xF
>> #options KTR_VERBOSE
>>
>> # HACK!!! to allow sc for 2560x1440 display on Radeon X1950 that vt historically mishandled during booting
>> device sc
>> #device kbdmux # HACK: already listed by vt
>> options SC_OFWFB # OFW frame buffer
>> options SC_DFLT_FONT # compile font in
>> makeoptions SC_DFLT_FONT=cp437
>>
>>
>> # Disable extra checking typically used for FreeBSD 11.0-CURRENT:
>> nooptions DEADLKRES #Enable the deadlock resolver
>> nooptions INVARIANTS #Enable calls of extra sanity checking
>> nooptions INVARIANT_SUPPORT #Extra sanity checks of internal structures, required by INVARIANTS
>> nooptions WITNESS #Enable checks to detect deadlocks and cycles
>> nooptions WITNESS_SKIPSPIN #Don't run witness on spinlocks for speed
>> nooptions MALLOC_DEBUG_MAXZONES # Separate malloc(9) zones
>
> I temporarily extended the ELF_VERBOSE code [and added other printf's] so that it also reports on non-PT_LOADs (which are otherwise skipped). What it reports for booting various 10.1-??? kernel builds is the sequence:
>
> PT_PHDR
> PT_INTERP
> PT_LOAD (for .text)
> (using archsw.arch_copyin then kern_pread)
> Address range example: 0x100000-0xbe017b
>
> PT_LOAD (for .data)
> (using kern_pread)
> Address range for the same example: 0xbf0000-0xea4b7f
> PT_DYNAMIC
> PT_GNU_STACK
> symtab
> strtab
> Final address for the same example: 0x1114baf
>
> The issue happens when there are such unreferenced pages where I indicated. In the build I started this investigation with, commenting out VERBOSE_SYSINIT in GENERIC64vtsc (listed earlier) makes the unreferenced pages disappear, while with VERBOSE_SYSINIT there are such pages (holding the rest of the context constant). But this is not the only way to get such unreferenced pages. For example, my 10.1-STABLE build has unreferenced pages but does not have VERBOSE_SYSINIT (yet).
>
> When there are unreferenced pages between the two PT_LOADs, those pages do not get archsw.arch_copyin or kern_pread handling. (kern_pread in turn uses archsw.arch_readin.)
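
A rough way to check a given pair of PT_LOAD ranges for such pages is sketched below in C. Everything here is an assumption for illustration and not taken from the loader: 4 KiB pages, mappings rounded outward to page boundaries, inclusive end addresses, and the hypothetical names seg_range and unmapped_pages_between.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x1000ULL	/* assumption: 4 KiB pages */

/* Hypothetical descriptor for one loaded range; 'end' is inclusive. */
struct seg_range {
	uint64_t start;
	uint64_t end;
};

/* Round an address down to the start of its page. */
static uint64_t
page_trunc(uint64_t addr)
{
	return (addr & ~(SKETCH_PAGE_SIZE - 1));
}

/*
 * Count the whole pages that lie after the last page touched by 'lo'
 * and before the first page touched by 'hi'.  These are the pages that
 * neither range would cause to be claimed or mapped.
 */
uint64_t
unmapped_pages_between(const struct seg_range *lo, const struct seg_range *hi)
{
	uint64_t first_free, next_used;

	assert(lo->start <= lo->end && hi->start <= hi->end);
	assert(lo->end < hi->start);

	first_free = page_trunc(lo->end) + SKETCH_PAGE_SIZE;
	next_used = page_trunc(hi->start);
	if (next_used <= first_free)
		return (0);
	return ((next_used - first_free) / SKETCH_PAGE_SIZE);
}
```

Fed the example ranges quoted above (0x100000-0xbe017b and 0xbf0000-0xea4b7f), this returns 15 under the 4 KiB assumption; with a different page size or rounding rule the count would differ.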
>
> For my PowerMac G5 Quad-Core context those archsw.arch_ routines end up being ofw_copyin and ofw_readin. Those routines in turn call ofw_memmap, which includes doing:
>
>>     if (OF_call_method("claim", memory, 3, 1, destp, dlen, 0, &addr)
>>         == -1) {
>>             printf("ofw_mapmem: physical claim failed\n");
>>             return (ENOMEM);
>>     }
>>
>>     /*
>>      * We only do virtual memory management when real_mode is false.
>>      */
>>     if (real_mode == 0) {
>>             if (OF_call_method("claim", mmu, 3, 1, destp, dlen, 0, &addr)
>>                 == -1) {
>>                     printf("ofw_mapmem: virtual claim failed\n");
>>                     return (ENOMEM);
>>             }
>>
>>             if (OF_call_method("map", mmu, 4, 0, destp, destp, dlen, 0)
>>                 == -1) {
>>                     printf("ofw_mapmem: map failed\n");
>>                     return (ENOMEM);
>>             }
>>     }
>
> and during load-time this is what programs the PowerPC to have the PTEG entries (and whatever else) that instructions like dcbst require (since MSR[DR]=1).
>
> The crash is at the first dcbst in the __syncicache execution that references the missing pages. (It seems unlikely that there is any other usage of those pages.) The crash reports missing PTEG entries (DSISR for IV 0x300). (Apple's openfirmware word .registers shows the recorded register status from the crash. After the crash the PowerMac is in Apple's context, not FreeBSD's.)
>
> The __syncicache use results from the following:
>
>> int
>> ppc64_ofw_elf_loadfile(char *filename, u_int64_t dest,
>>     struct preloaded_file **result)
>> {
>>     int r;
>>
>>     r = __elfN(loadfile)(filename, dest, result);
>>     if (r != 0)
>>             return (r);
>>
>>     /*
>>      * No need to sync the icache for modules: this will
>>      * be done by the kernel after relocation.
>>      */
>>     if (!strcmp((*result)->f_type, "elf kernel"))
>>             __syncicache((void *) (*result)->f_addr, (*result)->f_size);
>>     return (0);
>> }
>
> (powerpc has a similar sequence with __syncicache, as I remember.) For some reason the __syncicache usage is set up to span into or beyond the .data segment, not just the .text one. I do not know why.
>
> __elfN(loadfile)'s interface is not designed to return multiple address ranges. It returns one range that spans both PT_LOAD ranges (.text and .data) and any unreferenced pages between them. (In fact it spans even more afterwards, as I remember.)
>
>
> Questions:
>
> Does anyone have a clue about why the __syncicache use is set up to span into .data (and more) and not just .text --and is willing to explain a little?
>
>
> As far as solution directions go: this looks like a subject area appropriate to general FreeBSD use, based on the available evidence. A local personal hack does not seem appropriate. So...
>
>
> A) Should the link of the kernel be producing a kernel with unreferenced pages between the two PT_LOADs (between .text and .data)? Is the proper fix to prevent those pages from existing in linked kernels?
>
> vs.
>
> B) Is it okay for those unreferenced pages to be there between the two PT_LOADs? If yes...
>
> B1) Should something like the ofw_memmap activity be forced on those otherwise unreferenced pages so that the later __syncicache use can stay as it is?
>
> vs.
>
> B2) Should the unreferenced pages be skipped by making separate __syncicache calls for each PT_LOAD (.text segment and then .data segment and beyond(?))?
>
> vs.
>
> B3) Should only the .text segment be spanned by the __syncicache use? Some other more specific range that avoids those unreferenced pages?
>
>
> It would appear that all but (A) involve changing the interface provided by __elfN(loadfile) and/or the interfaces it uses: the fix does not appear well localized. (A) may have its own such issues but in other code or files that I've not looked at.
>
>
> ===
> Mark Millard
> markmi at dsl-only.net
>
> _______________________________________________
> freebsd-ppc@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-ppc
> To unsubscribe, send any mail to "freebsd-ppc-unsubscribe@freebsd.org"
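
For reference, options (B2) and (B3) could look roughly like the sketch below. It is only an illustration under assumptions, not loader code: as noted above, __elfN(loadfile) does not return per-segment ranges, so struct loaded_seg, sync_loaded_segments, and record_sync are hypothetical names. record_sync merely stands in for __syncicache so that the loop can be exercised on a host machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one PT_LOAD range as the loader placed it. */
struct loaded_seg {
	uint64_t addr;	/* load address */
	uint64_t size;	/* bytes actually copied in */
	int	 exec;	/* nonzero if the segment holds instructions */
};

/* Callback standing in for the real cache-sync routine. */
typedef void (*sync_fn)(uint64_t addr, uint64_t len);

/* Host-side stand-in: just totals the bytes it was asked to sync. */
uint64_t recorded_bytes;

void
record_sync(uint64_t addr, uint64_t len)
{
	(void)addr;
	recorded_bytes += len;
}

/*
 * Sync each loaded range on its own, so pages lying between segments
 * are never touched (B2).  With exec_only set, only segments marked
 * executable are synced (B3).  Returns the number of ranges synced.
 */
size_t
sync_loaded_segments(const struct loaded_seg *segs, size_t nsegs,
    int exec_only, sync_fn sync)
{
	size_t i, n = 0;

	assert(segs != NULL || nsegs == 0);
	assert(sync != NULL);

	for (i = 0; i < nsegs; i++) {
		if (segs[i].size == 0)
			continue;
		if (exec_only && !segs[i].exec)
			continue;
		sync(segs[i].addr, segs[i].size);
		n++;
	}
	return (n);
}
```

Either variant still needs the loader to carry a per-segment list out of __elfN(loadfile), which is the non-local interface change mentioned in the quoted message.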