From owner-freebsd-ppc@freebsd.org Sun Apr 21 05:08:07 2019 Return-Path: Delivered-To: freebsd-ppc@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9FD1F158262F for ; Sun, 21 Apr 2019 05:08:07 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic302-20.consmr.mail.gq1.yahoo.com (sonic302-20.consmr.mail.gq1.yahoo.com [98.137.68.146]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6D9C28BBBA for ; Sun, 21 Apr 2019 05:08:06 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: l8S9YhkVM1mPFUdUgrfc0CxGPWCcsABTURO_wJcNwGFX824jWounLd6UW2IQrNt RT9PegmsK3EYrACDa4Zlu6_h4btYgD3RAC2G90XfZ67bEqNx5WpokA6yaiwc0atdWvpfKmJ62IaH MUpHnZ5iJ1ABzZUnXRGcgPsrNcur7.ctXFJlnu9tpWdfgMKC3JXHtiTrs89wZczOUqL1HxbLEy92 88YHieTbYTGcQElx0FLfVayJlbK0oS9wYWZfZRbQbrhIhrW7KBfYnukDXL1uCgbdyVfuvdIu.Lou kvO3UcKB0VJ8PxC7oUE5mPRQgl659I1ltWlheERcidaLXtvtqAFgEdh0e9ciQLhXozyNK7u1xtLF WsXprOI7lEB2nCIWGj3IILmwg8flSZFF5CEfiPZgpbWoHceZh0wzTlZu3AO.gg6TT.hvXlMwNE_r lPJtoNfU5NT.2LDnEOu5Fa7Wcn6qdSx9GsRTN6rWfpPdpmDY9Q19II8z3QENwOThgRBf_9bNkfFu EkAOuh8N4fHWraLzoFKL0S_1pctbqkTebcm9sq7iLpbiB3bX9YWjtsbMzd33AC0Mjo2rYRriyftI OgBG3bakBmBBdBjZP6bOT09ON71MjHiAo6gHObNxqbn.Kkk.M0SYcNBgq2PF20qjBJ2Hl3Pku7t3 4OT2Hhup7d2QBqVdKZ3lybY_Zs3gYJ6nPZU0pw.UC2bxNC3fXy0bIzDoRkiSejTpCHdhXf7dlXAP UTH91jAekxn4vfrmCnScGwNBx82rC3nUtzmnMr5x3aUpVd6ybl8EzWhp.R0O8UC0brCtpvY92Yqz 9uB_rQRiynkJhU_y3JBwi0O4glSA.sqmuC72LhR0UGwyyYefkh4zILrVhnHEKsPyQ5ztfnRyogkY WPugMP08RCQtYoVP0NwlXF9XRg9W20_O50OUOGbWzrsMCipd4Ewgph2KXNUoyFUIW0KPs_DZpfPH YmqE8ss4bUnWzVy1CaRJvL5b_HHF_fw1hXlkWZ_gbVwU7p3kvTmW.JUuimnARHYeB.UX17rqqkLY PM2vFAVmZuPdNJ9cHnFNngqIMGsnfUImgvMnwVptzRB3eBfbmjTnbcC55bujlcQRfE1PiESyCBGk d9Ed4FG_XvIBXeYD1jsmb414L38c- Received: from sonic.gate.mail.ne1.yahoo.com by sonic302.consmr.mail.gq1.yahoo.com with HTTP; Sun, 21 Apr 2019 05:07:59 +0000 Received: from c-76-115-7-162.hsd1.or.comcast.net (EHLO [192.168.1.103]) ([76.115.7.162]) by smtp409.mail.gq1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID d1b27d599f889d9946884148b1f22cf6; Sun, 21 Apr 2019 05:07:55 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.8\)) Subject: Re: Under usefdt=1 type of context for G5 "4 core" (system total): unload then boot /boot/kernel/kernel fails very early with: panic: Translation map has incorrect cell count (260/256) From: Mark Millard In-Reply-To: Date: Sat, 20 Apr 2019 22:07:54 -0700 Cc: FreeBSD PowerPC ML , Justin Hibbits , Nathan Whitehorn Content-Transfer-Encoding: quoted-printable Message-Id: <56F7B1A9-3095-486B-9BE4-7602484984DF@yahoo.com> References: <9A6549AE-46FB-41B9-8F15-9E1AE3E16E2C@yahoo.com> To: "Mark L. Mil." X-Mailer: Apple Mail (2.3445.104.8) X-Rspamd-Queue-Id: 6D9C28BBBA X-Spamd-Bar: - X-Spamd-Result: default: False [-1.07 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DKIM_TRACE(0.00)[yahoo.com:+]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FREEMAIL_TO(0.00)[yahoo.com]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:36647, ipnet:98.137.64.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-0.87)[-0.870,0]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; NEURAL_SPAM_SHORT(0.68)[0.678,0]; NEURAL_HAM_LONG(-0.91)[-0.906,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(0.54)[ip: (1.10), ipnet: 98.137.64.0/21(0.92), asn: 36647(0.73), country: US(-0.06)]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[146.68.137.98.list.dnswl.org : 127.0.5.0] X-BeenThere: freebsd-ppc@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Porting FreeBSD to the PowerPC List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 21 Apr 2019 05:08:07 -0000 [Why there were 256*4 bytes instead of 260*4 bytes is now known: the conversion to fdt truncated the translations property to 1024 bytes, no longer a multiple of 5.] On 2019-Jan-29, at 16:30, Mark Millard wrote: > On 2019-Jan-28, at 23:42, Mark Millard wrote: >=20 >> [The loader here has been modified to always do the usefdt=3D1 code. >> The modern VM_MAX_KERNEL_ADDRESS is in the kernel.] >>=20 >> In trying to boot for other experiments I tried at the loader prompt: >>=20 >> unload >> load /boot/kernel/kernel >> boot >>=20 >> and: >>=20 >> unload >> boot -v /boot/kernel/kernel >>=20 >> Such combinations produced (typed from a picture of the screen, >> the missing text was really missing): >>=20 >> GDB: no debug ports present >> KDB: debugger backends: ddb >> KDB: current backend: ddb >> panic: Translation map has incorrect cell count (260/256) >=20 > Note that 260 is a multiple of 5 and 4 but 256 is not a multiple > of 5. Being a multiple of 5 is important for "acells=3D=3D2" = contexts. >=20 > Going through some of the code, first noting some context: >=20 > struct ofw_map { > cell_t om_va; > cell_t om_len; > uint64_t om_pa; > cell_t om_mode; > }; >=20 > This has 4 fields but one appears to be bigger than cell_t > if I understand right. This gets into if: >=20 > OF_getencprop(OF_finddevice("/"), "#address-cells", &acells, > sizeof(acells)); >=20 > indicates 2 cells for om_pa (via acells=3D=3D2) or just 1 cell. So > when acells=3D=3D2 there are 5 cells per, otherwise there are 4. >=20 > Now to the routine in question: >=20 > static void > moea64_add_ofw_mappings(mmu_t mmup, phandle_t mmu, size_t sz) > { > struct ofw_map translations[sz/(4*sizeof(cell_t))]; /*>=3D 4 = cells per */ >=20 > For acells=3D=3D2 the above potentially over allocates translations = some. > Later it turns out that if it does that may prevent out of bounds > writing into translations . >=20 >=20 > pcell_t acells, trans_cells[sz/sizeof(cell_t)]; >=20 > It turns out that sz/sizeof(cell_t) at this point ended up as a > multiple of 4 but not of 5, despite acells=3D=3D2 being the case. > Later this leads to out of bounds accesses for trans_cells . >=20 > struct pvo_entry *pvo; > register_t msr; > vm_offset_t off; > vm_paddr_t pa_base; > int i, j; >=20 > bzero(translations, sz); > OF_getencprop(OF_finddevice("/"), "#address-cells", &acells, > sizeof(acells)); >=20 > This ended up with acells=3D=3D2 --so 5 cells per translation entry. >=20 > if (OF_getencprop(mmu, "translations", trans_cells, sz) =3D=3D = -1) > panic("moea64_bootstrap: can't get ofw translations"); >=20 > For acells=3D=3D2: sz/sizeof(cell_t) not being a multiple of 5 means > either some unused trans_cell entries or access outside the > trans_cells array in the loop below, depending on the loop test > used. >=20 > CTR0(KTR_PMAP, "moea64_add_ofw_mappings: translations"); > sz /=3D sizeof(cell_t); >=20 > sz here ended up as 256, not a multiple of 5. >=20 > for (i =3D 0, j =3D 0; i < sz; j++) { >=20 > The i < sz test is testing the first of the 4 or 5 being > below sz instead of testing the last of the 4 or 5. This > leads to out of bounds accesses into trans_cells below for > the 256 with acells=3D=3D2 case. It also means that the last > translations[j] ends up with some garbage content. >=20 > translations[j].om_va =3D trans_cells[i++]; > translations[j].om_len =3D trans_cells[i++]; > translations[j].om_pa =3D trans_cells[i++]; > if (acells =3D=3D 2) { > translations[j].om_pa <<=3D 32; > translations[j].om_pa |=3D trans_cells[i++]; > } > translations[j].om_mode =3D trans_cells[i++]; > } >=20 > The loop test for the above loop just looks wrong to me for avoid > bad accesses. >=20 > KASSERT(i =3D=3D sz, ("Translations map has incorrect cell = count (%d/%zd)", > i, sz)); >=20 > Having sz [ the original sz/sizeof(cell_t) ] not be a multiple of 5 > for acells=3D=3D2 would fail this test even if the loop stopped = without > going out of bounds on trans_cells (so i=3D=3D255 instead of i=3D=3D260)= . >=20 > For the original sz figure having sz/sizeof(cell_t) =3D=3D 256, the = original > sz figure was not a multiple of 5 [presuming sizeof(cell_t) is not a > multiple of 5]. >=20 > My guess is that the original sz value was wrong after the unload and > boot (reloading) and the loop structure should avoid going out of = bounds > on trans_cells anyway. >=20 >> cpuid =3D 0 >> time =3D 1 >> KDB: stack backtrace: >> 0xc000000000f9e kdb_backtrace+0x60 >> 0xc000000000f9e vpanic+0x258 >> 0xc000000000f9e panic+0x3c >> 0xc000000000f9e moea64_late_bootstrap+0x2c0 >> 0xc000000000f9e moea64_bootstrap_native+0x20c >> 0xc000000000f9e pmap_bootstrap_0xc8 >> 0xc000000000f9e powerpc_init+0x440 >> 0xc000000000f9efc0: at=20 >>=20 >> (And that is all.) >=20 >=20 >=20 > The below tried to fix the loop to avoid out of bounds > trans_cells accesses and to allow my context to not panic > for the "multiple of 4 but not of 5" issue for the unload > then reload/boot sequence. The bias was for the code > changes to be easy to follow, given the earlier commentary. >=20 > # svnlite diff /usr/src/sys/powerpc/aim/mmu_oea64.c | more = Index: = /usr/src/sys/powerpc/aim/mmu_oea64.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/sys/powerpc/aim/mmu_oea64.c (revision 341836) > +++ /usr/src/sys/powerpc/aim/mmu_oea64.c (working copy) > @@ -502,7 +502,7 @@ > register_t msr; > vm_offset_t off; > vm_paddr_t pa_base; > - int i, j; > + int i, istep, j; >=20 > bzero(translations, sz); > OF_getencprop(OF_finddevice("/"), "#address-cells", &acells, > @@ -512,7 +512,8 @@ >=20 > CTR0(KTR_PMAP, "moea64_add_ofw_mappings: translations"); > sz /=3D sizeof(cell_t); > - for (i =3D 0, j =3D 0; i < sz; j++) { > + istep =3D (acells =3D=3D 2) ? 5 : 4; > + for (i =3D 0, j =3D 0; i+istep-1 < sz; j++) { > translations[j].om_va =3D trans_cells[i++]; > translations[j].om_len =3D trans_cells[i++]; > translations[j].om_pa =3D trans_cells[i++]; > @@ -522,7 +523,7 @@ > } > translations[j].om_mode =3D trans_cells[i++]; > } > - KASSERT(i =3D=3D sz, ("Translations map has incorrect cell = count (%d/%zd)", > + KASSERT(i+sz%istep =3D=3D sz, ("Translations map has incorrect = cell count (%d/%zd)", > i, sz)); >=20 > sz =3D j; >=20 > I do not claim the KASSERT should officially change but I expect that = the > loop should. The original rejection by a debug build was tied to the conversion to fdt truncating the translation property: if (proplen > 1024) { proplen =3D 1024; } in add_node_to_fdt in stand/powerpc/ofw/ofwfdt.c . This changed a 1040=3D=3D208*5 total to a 1024=3D=3D256*4 total. (1024 is not a multiple of 5.) So the problem goes away when the truncation logic is removed. Still, the truncation did expose some coding problems in the translation map extraction, such as out of bounds access for such a truncated case. =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar)