Date: Sun, 07 Feb 2016 03:06:21 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 206990] powerpc (32-bit), projects/clang380-import vs. 11.0-CURRENT's sendsig: need to avoid signal delivery trashing the stack and so causing SIGSEGV
Message-ID: <bug-206990-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206990

            Bug ID: 206990
           Summary: powerpc (32-bit), projects/clang380-import vs.
                    11.0-CURRENT's sendsig: need to avoid signal delivery
                    trashing the stack and so causing SIGSEGV
           Product: Base System
           Version: 11.0-CURRENT
          Hardware: ppc
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: markmi@dsl-only.net

The observed problem:

For a TARGET_ARCH=powerpc clang 3.8.0 based buildworld installation:
attempting "make -j 6 buildworld" (run on 4 powerpc cores) eventually gets
a segmentation fault. (More details later.) "make buildworld" does not
fault.

(The example hardware currently in use is a Quad Core PowerMac G5 but not
with a 64-bit buildworld.)

(This is with the content of sys/powerpc/powerpc/sigcode32.S -r295186 in
place so that that part of the signal delivery maintains the modulo 16 byte
stack/frame alignment for the handler. clang 3.8.0 sometimes generates code
that depends on the alignment in ways gcc 4.2.1's code does not.)

I used ktrace/kdump commands of the structure:

    ktrace -di -f /usr/obj/make.out -t cs -p ???
    kdump -E -f /usr/obj/make.out -p ??? > /var/tmp/make_ktrace_sigsegv_??.txt

to investigate the context of the SIGSEGV's.

Example results (showing the lines that are always the same at the end for
the failing process, apart from address and timestamp variations):

 65158 make     0.205791 PSIG  SIGCHLD caught handler=0x180aae0 mask=0x0 code=CLD_EXITED
 65158 make     0.205822 CALL  write(0x3,0x189e914,0x1)
 65158 make     0.205847 RET   write 1
 65158 make     0.205869 CALL  sigreturn(0xffffbb50)
 65158 make     0.205923 RET   sigreturn JUSTRETURN
 65158 make     0.205962 PSIG  SIGSEGV SIG_DFL code=SEGV_MAPERR

   599 make     5.552305 PSIG  SIGCHLD caught handler=0x180aae0 mask=0x0 code=CLD_EXITED
   599 make     5.552323 CALL  write(0x3,0x189e914,0x1)
   599 make     5.552337 RET   write 1
   599 make     5.552347 CALL  sigreturn(0xffffbb30)
   599 make     5.552358 RET   sigreturn JUSTRETURN
   599 make     5.552381 PSIG  SIGSEGV SIG_DFL code=SEGV_MAPERR

 75728 make     4.141097 PSIG  SIGCHLD caught handler=0x180aae0 mask=0x0 code=CLD_EXITED
 75728 make     4.141116 CALL  write(0x3,0x189e914,0x1)
 75728 make     4.141154 RET   write 1
 75728 make     4.141349 CALL  sigreturn(0xffffbaa0)
 75728 make     4.141366 RET   sigreturn JUSTRETURN
 75728 make     4.141404 PSIG  SIGSEGV SIG_DFL code=SEGV_MAPERR

 12195 make    27.213277 PSIG  SIGCHLD caught handler=0x180aae0 mask=0x0 code=CLD_EXITED
 12195 make    27.213322 CALL  write(0x3,0x189e914,0x1)
 12195 make    27.213346 RET   write 1
 12195 make    27.213361 CALL  sigreturn(0xffffb1e0)
 12195 make    27.213383 RET   sigreturn JUSTRETURN
 12195 make    27.213418 PSIG  SIGSEGV SIG_DFL code=SEGV_MAPERR

 50545 make    80.255162 PSIG  SIGCHLD caught handler=0x180aae0 mask=0x0 code=CLD_EXITED
 50545 make    80.255192 CALL  write(0x3,0x189e914,0x1)
 50545 make    80.255219 RET   write 1
 50545 make    80.255241 CALL  sigreturn(0xffffafa0)
 50545 make    80.255265 RET   sigreturn JUSTRETURN
 50545 make    80.255317 PSIG  SIGSEGV SIG_DFL code=SEGV_MAPERR

Every example SIGSEGV from the "make -j 6 buildworld" attempts was like
that. Which instance of make failed varied, and so did where in make the
fault happened.

The "-E" elapsed times give a solid clue to there being variability in when
the fault happens: it is not some local property of specific code.

I'll use some script log file sizes as another indication of variability.
I've sorted them:

    2942664
    3304207
    3342660
    3474585
    3941983

So spanning from 2.9 MBytes to 3.9 MBytes. I've since gotten a few with
less and some with more.

The cause:

Comparing clang 3.8.0 generated code for TARGET_ARCH=powerpc to gcc 4.2.1
generated code. . .

clang 3.8.0 based Str_Match preamble (from make):

0x181a4a8 <Str_Match>:      mflr    r0
0x181a4ac <Str_Match+4>:    stw     r31,-4(r1)    # Clang's frame pointer (r31)
                                                  # saved before stack pointer changed.
0x181a4b0 <Str_Match+8>:    stw     r0,4(r1)      # lr saved before stack pointer changed.
0x181a4b4 <Str_Match+12>:   stwu    r1,-32(r1)    # Stack pointer finally saved and
                                                  # changed.
0x181a4b8 <Str_Match+16>:   mr      r31,r1        # r31 is the frame pointer under clang.
0x181a4bc <Str_Match+20>:   stw     r30,24(r31)

gcc 4.2.1 based Str_Match preamble:

0x1819cb8 <Str_Match>:      mflr    r0
0x1819cbc <Str_Match+4>:    stwu    r1,-32(r1)    # Stack pointer saved and changed first.
0x1819cc0 <Str_Match+8>:    stw     r31,28(r1)    # r31 saved after stack pointer changed.
0x1819cc4 <Str_Match+12>:   mr      r31,r3        # gcc 4.2.1 does not reserve
                                                  # r31 for use as a frame pointer.
0x1819cc8 <Str_Match+16>:   stw     r30,24(r1)
0x1819ccc <Str_Match+20>:   stw     r0,36(r1)     # lr saved after stack pointer changed.

Picking a different example for postamble code, showing just clang 3.8.0's
code:

0x1801b8c <Buf_AddBytes+104>:   lwz     r30,24(r31)
0x1801b90 <Buf_AddBytes+108>:   lwz     r29,20(r31)
0x1801b94 <Buf_AddBytes+112>:   lwz     r28,16(r31)
0x1801b98 <Buf_AddBytes+116>:   lwz     r27,12(r31)
0x1801b9c <Buf_AddBytes+120>:   lwz     r26,8(r31)
0x1801ba0 <Buf_AddBytes+124>:   addi    r1,r1,32      # Stack pointer adjusted first
0x1801ba4 <Buf_AddBytes+128>:   lwz     r0,4(r1)
0x1801ba8 <Buf_AddBytes+132>:   lwz     r31,-4(r1)    # Then Frame Pointer load happens
                                                      # "outside" the new stack range.
0x1801bac <Buf_AddBytes+136>:   mtlr    r0
0x1801bb0 <Buf_AddBytes+140>:   blr

In other words: clang 3.8.0's generated 32-bit powerpc code is based on
there being a safe scratch area below the stack ("below" by memory
address). This is similar to the 224 byte "red zone" area that 32-bit AIX
powerpc and 32-bit Darwin powerpc use.

But sendsig(sig_t, ksiginfo_t*, sigset_t*) in
sys/powerpc/powerpc/exec_machdep.c only maintains such a scratch area for
64-bit code contexts, where it uses the "288 byte scratch region below the
stack" that 64-bit Darwin and the like use.

So on 32-bit powerpc (and lib32?) sendsig sometimes overwrites the stored
frame pointer value before the matching "lwz r31,-4(r1)" happens. That
leads to later segmentation faults after the "lwz r31,-4(r1)".

Note: Other than "wasting" some bytes temporarily, having a "red zone" like
scratch area is compatible with gcc 4.2.1 style code as well.

A fix?. . .

I'm testing "make -j 6 buildworld" on the G5 now based on the following
proof-of-concept patch. It is still running and has gotten much farther
than all prior attempts. But it will be some time before the G5 and a G4
test are complete.

Index: /usr/src/sys/powerpc/powerpc/exec_machdep.c
===================================================================
--- /usr/src/sys/powerpc/powerpc/exec_machdep.c (revision 295351)
+++ /usr/src/sys/powerpc/powerpc/exec_machdep.c (working copy)
@@ -155,6 +155,31 @@
         ksi->ksi_info.si_addr = (void *)((tf->exc == EXC_DSI) ?
             tf->dar : tf->srr0);
 
+/*
+ * clang 3.8.0+ for TARGET_ARCH=powerpc (32bit) generates the likes of
+ * "stw r31, -4(r1)", placing its frame pointer (r31) where the stack
+ * pointer does not yet reach. It may well at times put even more out
+ * there before adjusting the stack pointer.
+ *
+ * clang also generates "lwz r31, -4(r1)" after incrementing r1 during
+ * the return sequence: again there is a time during which the frame
+ * pointer storage is outside where the stack pointer reaches.
+ *
+ * Without a "scratch region below the stack" that is respected for
+ * signal delivery the frame pointer value is sometimes trashed and
+ * that leads to later segmentation faults. ("Below" by memory
+ * address viewpoint.)
+ *
+ * Using the AIX/Darwin 224 Byte "red-zone" rule for TARGET_ARCH=powerpc
+ * here is compatible with gcc 4.2.1's code generation that moves the stack
+ * pointer first. (But it does then waste some bytes temporarily), So
+ * have TARGET_ARCH=powerpc be similar to TARGET_ARCH=powerpc64 in its
+ * use of a "scratch region below the stack".
+ *
+ * 224 avoids changing the 16-byte alignment property.
+ */
+#define PPC32_SSCRATCH 224
+
 #ifdef COMPAT_FREEBSD32
         if (SV_PROC_FLAG(p, SV_ILP32)) {
                 siginfo_to_siginfo32(&ksi->ksi_info, &siginfo32);
@@ -162,7 +187,7 @@
                 code = siginfo32.si_code;
                 sfp = (caddr_t)&sf32;
                 sfpsize = sizeof(sf32);
-                rndfsize = ((sizeof(sf32) + 15) / 16) * 16;
+                rndfsize = PPC32_SSCRATCH + ((sizeof(sf32) + 15) / 16) * 16;
 
                 /*
                  * Save user context
@@ -191,9 +216,11 @@
                  */
                 rndfsize = 288 + ((sizeof(sf) + 47) / 48) * 48;
 #else
-                rndfsize = ((sizeof(sf) + 15) / 16) * 16;
+                rndfsize = PPC32_SSCRATCH + ((sizeof(sf) + 15) / 16) * 16;
 #endif
 
+#undef PPC32_SSCRATCH
+
                 /*
                  * Save user context
                  */