From owner-freebsd-ppc@freebsd.org Fri May 10 22:55:50 2019
From: Mark Millard
Date: Fri, 10 May 2019 15:45:29 -0700
Subject: Re: 970/PowerMac G5 cpudep_ap_bootstrap slb-related hangup *solved* . . .
To: Justin Hibbits
Cc: FreeBSD PowerPC ML, Dennis Clarke, Nathan Whitehorn
Message-Id: <541CEE9E-9DF5-4287-BE92-460A2CEA9597@yahoo.com>
In-Reply-To: <241A2C8D-2E9D-4E2E-8A78-3E4A17F0C46A@yahoo.com>
References: <2E7A0894-E5B0-4776-95F2-76B7EE0EE93C@yahoo.com> <241A2C8D-2E9D-4E2E-8A78-3E4A17F0C46A@yahoo.com>
List-Id: Porting FreeBSD to the PowerPC

[world and kernel built, installed. Then boot-tested repeatedly.]

On 2019-May-10, at 14:02, Mark Millard wrote:

> [For head -r347463 I'll still have to have lib/libc/powerpc64/string/strcmp.S
> patched to avoid cmpb instructions. No other patches.]
>
> On 2019-May-10, at 13:11, Mark Millard wrote:
>
>> On 2019-May-10, at 12:38, Justin Hibbits wrote:
>>
>>> Hi Mark,
>>>
>>> On Fri, May 10, 2019 at 6:23 AM Mark Millard wrote:
>>>>
>>>> [Having removed all my prior investigatory material, I include
>>>> a svnlite diff that I've booted based on, a comparatively
>>>> minimal diff from the head -r347003 that I started from.]
>>>>
>>>> On 2019-May-10, at 02:15, Mark Millard wrote:
>>>>
>>>>> [This continues a prior message, but I choose a new subject
>>>>> text for the testing that showed the kind of material working.]
>>>>>
>>>>> I have the slbtrap/handle_kernel_slb_spill working instead
>>>>> of hanging up when it has an slb-miss (as well as when there
>>>>> is no miss).
>>>>>
>>>>> In /usr/src/sys/powerpc/aim/mp_cpudep.c I moved the
>>>>> 970 code for HID0 and HID1 from cpudep_ap_setup, code
>>>>> that looks like,
>>>>>
>>>>>         /* Set HIOR to 0 */
>>>>>         __asm __volatile("mtspr 311,%0" :: "r"(0));
>>>>>         powerpc_sync();
>>>>>
>>>>>         /*
>>>>>          * The 970 has strange rules about how to update HID registers.
>>>>>          * See Table 2-3, 970MP manual
>>>>>          *
>>>>>          * Note: HID4 and HID5 restored already in
>>>>>          * cpudep_ap_early_bootstrap()
>>>>>          */
>>>>>
>>>>>         __asm __volatile("mtasr %0; sync" :: "r"(0));
>>>>> #ifdef __powerpc64__
>>>>>         __asm __volatile(" \
>>>>>             sync; isync; \
>>>>>             mtspr %1, %0; \
>>>>>             mfspr %0, %1; mfspr %0, %1; mfspr %0, %1; \
>>>>>             mfspr %0, %1; mfspr %0, %1; mfspr %0, %1; \
>>>>>             sync; isync"
>>>>>             :: "r"(bsp_state[0]), "K"(SPR_HID0));
>>>>>         __asm __volatile("sync; isync; \
>>>>>             mtspr %1, %0; mtspr %1, %0; sync; isync"
>>>>>             :: "r"(bsp_state[1]), "K"(SPR_HID1));
>>>>> #else
>>>>>         __asm __volatile(" \
>>>>>             ld %0,0(%2); \
>>>>>             sync; isync; \
>>>>>             mtspr %1, %0; \
>>>>>             mfspr %0, %1; mfspr %0, %1; mfspr %0, %1; \
>>>>>             mfspr %0, %1; mfspr %0, %1; mfspr %0, %1; \
>>>>>             sync; isync"
>>>>>             : "=r"(reg) : "K"(SPR_HID0), "b"(bsp_state));
>>>>>         __asm __volatile("ld %0, 8(%2); sync; isync; \
>>>>>             mtspr %1, %0; mtspr %1, %0; sync; isync"
>>>>>             : "=r"(reg) : "K"(SPR_HID1), "b"(bsp_state));
>>>>> #endif
>>>>>
>>>>>         powerpc_sync();
>>>>>
>>>>> I moved it to cpudep_ap_early_bootstrap, just before the
>>>>> code for HID4 and HID5, and I commented out 2 #if/endif lines:
>>>>>
>>>>> void
>>>>> cpudep_ap_early_bootstrap(void)
>>>>> {
>>>>> //#ifndef __powerpc64__
>>>>>         register_t reg;
>>>>> //#endif
>>>>>
>>>>>         switch (mfpvr() >> 16) {
>>>>>         case IBM970:
>>>>>         case IBM970FX:
>>>>>         case IBM970MP:
>>>>>         >.>.> INSERT CODE HERE <.<.<.
>>>>>
>>>>>         /* Restore HID4 and HID5, which are necessary for the MMU */
>>>>>
>>>>> #ifdef __powerpc64__
>>>>>         mtspr(SPR_HID4, bsp_state[2]); powerpc_sync(); isync();
>>>>>         mtspr(SPR_HID5, bsp_state[3]); powerpc_sync(); isync();
>>>>> #else
>>>>>         __asm __volatile("ld %0, 16(%2); sync; isync; \
>>>>>             mtspr %1, %0; sync; isync;"
>>>>>             : "=r"(reg) : "K"(SPR_HID4), "b"(bsp_state));
>>>>>         __asm __volatile("ld %0, 24(%2); sync; isync; \
>>>>>             mtspr %1, %0; sync; isync;"
>>>>>             : "=r"(reg) : "K"(SPR_HID5), "b"(bsp_state));
>>>>> #endif
>>>>>         powerpc_sync();
>>>>>         break;
>>>>> . . .
>>>>>
>>>>> This does the initialization before cpudep_ap_bootstrap,
>>>>> instead of after.
>>>>>
>>>>> With things then sufficiently initialized for PSL_IR|PSL_DR
>>>>> code to do things like pcpup->pc_curthread->td_pcb->
>>>>> that sometimes have slb misses, it boots fine,
>>>>> loading into the slb as needed. No more checkstop status
>>>>> (or whatever it was).
>>>>>
>>>>> I do not know if non-970 contexts should have similar
>>>>> changes in the ordering of initializations or not.
>>>>> But, clearly, the 970 family members do need such.
>>>>>
>>>>> I'm not claiming that other material from other notes
>>>>> that I sent out should be ignored, only that the above
>>>>> changes the observed failing behavior, and so is a big
>>>>> gain all by itself. And it is simple to do without
>>>>> other investigations that might be involved in the
>>>>> more overall context.
>>>>
>>>> Of course, whitespace details may not be well preserved
>>>> below. (The commenting out of the two #if/#endif lines
>>>> was unnecessary and is not done in the below.)
>>>>
>>>
>>>
>>> Good sleuthing.
>>>
>>> I think the whole diff could be reduced to just moving the HIOR. Can
>>> you give r347463 a shot? It's the reduced diff of just moving HIOR.
>>> If that's not sufficient, then I can move the HID0/HID1
>>> initializations, but they didn't look relevant for early boot
>>> stability when I reviewed.
>>
>> I can try later today.
>>
>> I'll note that the bsp does not use the relative ordering
>> the ap's use for HID0 and HID1 vs. code analogous to
>> cpudep_ap_bootstrap as far as I could tell: it does HID0
>> earlier and makes no HID1 assignments at all (depending
>> on openfirmware or the loader to have given appropriate
>> assignments).
>>
>> (OpenFirmware does not seem to do much for configuring the
>> ap's, just the bsp. Depending on defaults is more of an
>> issue for the ap's.)
>>
>> Also, some HID0 and HID1 points to consider:
>>
>> HID0 controls the TBR behavior, and mftb() is in use in the
>> slb replacement code:
>>
>> bit 18 is: tb_ctrl Enable time-base counting when the processor is stopped.
>>
>> bit 19 is: ext_tb_en External time-base enable. With:
>> • 0 Use TBEN input as enable. TB is clocked at 1/8 of the full processor frequency.
>> • 1 Use TBEN input to clock time base (external clock).
>>
>> (I've seen other material claiming 1/16th instead of 1/8th.)
>>
>> There is also:
>>
>> bit 32 is: en_mck Enable external machine check interrupts (preferred state equals ‘1’).
>>
>> HID1 has (note the "must be 1 for proper functioning" example):
>>
>> bit 5 is: en_ic Enable instruction cache (must be ‘1’ for proper functioning).
>>
>> bit 10 is: en_if_cach Enable instruction fetch cacheability control. With:
>> • 0 All instruction fetch accesses are treated as cache inhibited regardless of
>>     the state of the I bit in the page table.
>> • 1 Instruction fetch cacheability is controlled by the state of the I bit in the
>>     page table (preferred state).
>>
>> (I'll not list other cache/link-stack/tablewalks related material. There
>> are some with "preferred state equals '1'".)
>>
>>
>> I do not see why either of HID0 or HID1 has a reason to be later than
>> where I put them (relative to other activities). Why do you want them
>> to be later?
>
>
> My test will still have changes to allow world to operate
> on the 970MP (by avoiding cmpb instructions):
>
> # svnlite diff /mnt/usr/src/
> Index: /mnt/usr/src/lib/libc/powerpc64/string/strcmp.S
> ===================================================================
> --- /mnt/usr/src/lib/libc/powerpc64/string/strcmp.S     (revision 347463)
> +++ /mnt/usr/src/lib/libc/powerpc64/string/strcmp.S     (working copy)
> @@ -88,9 +88,16 @@
>  .Lstrcmp_compare_by_word:
>         ld      %r5,0(%r3)      /* Load double words. */
>         ld      %r6,0(%r4)
> -       xor     %r8,%r8,%r8     /* %r8 <- Zero. */
> +       lis     %r8,32639       /* 0x7f7f */
> +       ori     %r8,%r8,32639   /* 0x7f7f7f7f */
> +       rldimi  %r8,%r8,32,0    /* 0x7f7f7f7f'7f7f7f7f */
>         xor     %r0,%r5,%r6     /* Check if double words are different. */
> -       cmpb    %r7,%r5,%r8     /* Check if double words contain zero. */
> +       /* Check for zero vs. not bytes: */
> +       and     %r9,%r5,%r8     /* 0x00->0x00, 0x80->0x00, other->ms-bit-in-byte==0 */
> +       add     %r9,%r9,%r8     /* ->0x7f, ->0x7f, ->ms-bit-in-byte==1 */
> +       nor     %r7,%r9,%r5     /* ->0x80, ->0x00, ->ms-bit-in-byte==0 */
> +       andc    %r7,%r7,%r8     /* ->0x80, ->0x00, ->0x00 */
> +       /* sort of like cmpb %r7,%r5,%r8 for %r8 being zero */
>
>         /*
>          * If double words are different or contain zero,
> @@ -104,7 +111,12 @@
>         ldu     %r5,8(%r3)      /* Load double words. */
>         ldu     %r6,8(%r4)
>         xor     %r0,%r5,%r6     /* Check if double words are different. */
> -       cmpb    %r7,%r5,%r8     /* Check if double words contain zero. */
> +       /* Check for zero vs. not bytes: */
> +       and     %r9,%r5,%r8     /* 0x00->0x00, 0x80->0x00, other->ms-bit-in-byte==0 */
> +       add     %r9,%r9,%r8     /* ->0x7f, ->0x7f, ->ms-bit-in-byte==1 */
> +       nor     %r7,%r9,%r5     /* ->0x80, ->0x00, ->ms-bit-in-byte==0 */
> +       andc    %r7,%r7,%r8     /* ->0x80, ->0x00, ->0x00 */
> +       /* sort of like cmpb %r7,%r5,%r8 for %r8 being zero */
>
>         /*
>          * If double words are different or contain zero,

I built world and kernel (gcc 4.2.1 toolchain based),
installed, and booted a bunch of times. No obvious
problems.

I'll note that the kernel build is based on:

# more /mnt/usr/src/sys/powerpc/conf/GENERIC64-dcons
include GENERIC64
ident   GENERIC64-dcons

options GDB

device dcons
device dcons_crom

In order to have firewire-dcons based access to boot
output, not that it ever hung up. Screen display
output can stop before messages actually do. In such
cases, firewire-dcons use seems to show all the output.

I did my own builds, in part because any official
build artifacts would be based on using cmpb instructions
in world (that the 970 family does not have). The other
part was having firewire-dcons in place in case of problems.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
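
[Editorial sketch, not part of the original message.] The and/add/nor/andc
sequence in the strcmp.S diff above can be checked outside of assembly.
What follows is a minimal C model of those four instructions, assuming
64-bit unsigned arithmetic; the names zero_byte_mask and LOW7_MASK are
hypothetical and do not appear in the FreeBSD sources.

```c
#include <stdint.h>

/* 0x7f in every byte; the patch builds this in %r8 with lis/ori/rldimi. */
#define LOW7_MASK UINT64_C(0x7f7f7f7f7f7f7f7f)

/*
 * Hypothetical helper (illustration only, not FreeBSD code).
 * Returns 0x80 in each byte position where x holds a 0x00 byte,
 * and 0x00 in every other byte position.
 */
static uint64_t
zero_byte_mask(uint64_t x)
{
	uint64_t t;

	/* and: clear the top bit of each byte, so each byte is <= 0x7f. */
	t = x & LOW7_MASK;
	/*
	 * add: each byte becomes at most 0xfe, so no carry crosses a
	 * byte boundary. The top bit ends up set exactly when the low
	 * 7 bits of the original byte were nonzero.
	 */
	t = t + LOW7_MASK;
	/* nor: top bit set only if it is clear in both t and x. */
	t = ~(t | x);
	/* andc: keep just the top bit of each byte. */
	return (t & ~LOW7_MASK);
}
```

For instance, zero_byte_mask(0x1100334455667788) yields
0x0080000000000000, marking the one zero byte. Note that cmpb sets
matching bytes to 0xff whereas this sequence produces 0x80, which is
presumably why the patch comment says "sort of like cmpb".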