From owner-freebsd-toolchain@freebsd.org Thu Feb 18 22:29:38 2016 Return-Path: Delivered-To: freebsd-toolchain@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 343E4AADD91 for ; Thu, 18 Feb 2016 22:29:38 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-211-153.reflexion.net [208.70.211.153]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id EAE4CA76 for ; Thu, 18 Feb 2016 22:29:37 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 5885 invoked from network); 18 Feb 2016 22:29:31 -0000 Received: from unknown (HELO mail-cs-02.app.dca.reflexion.local) (10.81.19.2) by 0 (rfx-qmail) with SMTP; 18 Feb 2016 22:29:31 -0000 Received: by mail-cs-02.app.dca.reflexion.local (Reflexion email security v7.80.0) with SMTP; Thu, 18 Feb 2016 17:29:25 -0500 (EST) Received: (qmail 7085 invoked from network); 18 Feb 2016 22:29:24 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with SMTP; 18 Feb 2016 22:29:24 -0000 X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network X-No-Relay: not in my network Received: from [192.168.1.8] (c-76-115-7-162.hsd1.or.comcast.net [76.115.7.162]) by iron2.pdx.net (Postfix) with ESMTPSA id 34AF6B1E002; Thu, 18 Feb 2016 14:29:26 -0800 (PST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Subject: Re: I've submitted 207175 for a clang 3.8.0 va_list handling problem for powerpc [fpr use also tested] From: Mark Millard In-Reply-To: <8EB46124-3335-4643-8C64-16DA56D481F5@dsl-only.net> Date: Thu, 18 Feb 2016 14:29:29 -0800 Cc: Nathan Whitehorn , FreeBSD Toolchain , FreeBSD PowerPC ML Content-Transfer-Encoding: quoted-printable Message-Id: <8C40A5D7-C0B8-4142-89D4-228017C446CE@dsl-only.net> References: <20160214192903.GA96697@vlakno.cz> <70B405C4-E1AC-4F35-9786-051FDA2F8BE7@dsl-only.net> <20160215191100.GA17387@vlakno.cz> <3A260EC5-E69A-4980-8F74-C04395F4E5F4@dsl-only.net> <20160215201800.GA20796@vlakno.cz> <74577A87-3006-43A9-9EAB-F51D946B6245@dsl-only.net> <28FF474D-2109-4605-8B2B-C5374CBCCF42@dsl-only.net> <8EB46124-3335-4643-8C64-16DA56D481F5@dsl-only.net> To: Roman Divacky X-Mailer: Apple Mail (2.2104) X-BeenThere: freebsd-toolchain@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Maintenance of FreeBSD's integrated toolchain List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Feb 2016 22:29:38 -0000 On 2016-Feb-17, at 9:23 PM, Mark Millard wrote: >=20 > My fpr related notes/worries about the fix were wrong. >=20 > I finally got some time to look at this again and I see that I somehow = missed the following code when I looked before: >=20 > // The calling convention either uses 1-2 GPRs or 1 FPR. > Address NumRegsAddr =3D Address::invalid(); > if (isInt || IsSoftFloatABI) { > NumRegsAddr =3D Builder.CreateStructGEP(VAList, 0, = CharUnits::Zero(), "gpr"); > } else { > NumRegsAddr =3D Builder.CreateStructGEP(VAList, 1, = CharUnits::One(), "fpr"); > } >=20 > So the >=20 > Builder.CreateStore(Builder.getInt8(8), NumRegsAddr); >=20 > in Case 2 is tracking gpr vs. fpr usage contexts as it should. Also: >=20 > llvm::Value *NumRegs =3D Builder.CreateLoad(NumRegsAddr, = "numUsedRegs"); >=20 > // "Align" the register count when TY is i64. > if (isI64 || (isF64 && IsSoftFloatABI)) { > NumRegs =3D Builder.CreateAdd(NumRegs, Builder.getInt8(1)); > NumRegs =3D Builder.CreateAnd(NumRegs, Builder.getInt8((uint8_t) = ~1U)); > } >=20 > llvm::Value *CC =3D > Builder.CreateICmpULT(NumRegs, Builder.getInt8(8), "cond"); >=20 > is using the same bounds check figure (8) for gpr and fpr. >=20 > Apparently that common bound is one reason that the clang numbering is = not the same as the ABI document's numbering: clang's numbering allows = using the same figure for both contexts. (Given the prior alignment for = isI64 (or isF64 with IsSoftFloatABI).) >=20 > Sorry for the prior noise about fpr. >=20 > It is still true that DOUBLE_OR_FLOAT is untested so far. >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net I finally got some time to apply to some basic testing involving double = as well (for involving fpr use). . . No problems with exceptions. Looking at the memory contents at various = stages in gdb looks good. va_list's gpr, fpr, overflow_arg_area changes = as its va_args use progresses look good. Values extracted by va_args use = looks good. Both default and -O2. The added > Builder.CreateStore(Builder.getInt8(8), NumRegsAddr); passes my checks. I've not observed any problems from buildworld = materials, unlike when that line is missing. [Note: I run with the signal delivery modified to have a "red zone" to = deal with other aspects of clang 3.8.0 code generation that are not ABI = compliant for when the stack pointer is moved. Having a "red zone" is = still operationally correct for an ABI compliant code generation, it = just temporarily wastes more bytes. Also: the kernel was built with gcc = 4.2.1 but world was built with clang 3.8.0.] =3D=3D=3D Mark Millard markmi at dsl-only.net . . . [bad fpr related material omitted] . . . On 2016-Feb-16, at 2:45 AM, Mark Millard wrote: I used: > Index: /usr/src/contrib/llvm/tools/clang/lib/CodeGen/TargetInfo.cpp > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/contrib/llvm/tools/clang/lib/CodeGen/TargetInfo.cpp = (revision 295601) > +++ /usr/src/contrib/llvm/tools/clang/lib/CodeGen/TargetInfo.cpp = (working copy) > @@ -3569,6 +3569,8 @@ > { > CGF.EmitBlock(UsingOverflow); >=20 > + Builder.CreateStore(Builder.getInt8(8), NumRegsAddr); > + > // Everything in the overflow area is rounded up to a size of at = least 4. > CharUnits OverflowAreaAlign =3D CharUnits::fromQuantity(4); as my test change. (Get evidence of operation before potential cleanup = of the duplicated 8's.) After a full buildworld/installworld based on the updated compiler. . . My simple example of the problem no longer fails. "ls -l -n /" now works. "svnlite update -r295601 /usr/src" now works. So whatever you want to do for the details of any submitted code, the = basics of the change do avoid the SEGVs and allow these programs to = work. =3D=3D=3D Mark Millard markmi at dsl-only.net On 2016-Feb-15, at 4:27 PM, Mark Millard wrote: On 2016-Feb-15, at 1:20 PM, Mark Millard wrote: >=20 > On 2016-Feb-15, at 12:18 PM, Roman Divacky = wrote: >>=20 >> On Mon, Feb 15, 2016 at 12:17:50PM -0800, Mark Millard wrote: >>> On 2016-Feb-15, at 11:11 AM, Roman Divacky = wrote: >>>>=20 >>>> Mark, I believe you're right. What do you think about this patch? >>>>=20 >>>> Index: tools/clang/lib/CodeGen/TargetInfo.cpp >>>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>>> --- tools/clang/lib/CodeGen/TargetInfo.cpp (revision 260852) >>>> +++ tools/clang/lib/CodeGen/TargetInfo.cpp (working copy) >>>> @@ -3599,6 +3599,8 @@ >>>> { >>>> CGF.EmitBlock(UsingOverflow); >>>>=20 >>>> + Builder.CreateStore(Builder.getInt8(11), NumRegsAddr); >>>> + >>>> // Everything in the overflow area is rounded up to a size of at = least 4. >>>> CharUnits OverflowAreaAlign =3D CharUnits::fromQuantity(4); >>>>=20 >>>>=20 >>>> Can you test it? >>>=20 >>> It may be later today before I can start the the test process. >>>=20 >>> While your change is not wrong as presented, it does seem to be = based on the ABI document's numbering with the range 3 <=3D gr <12, = where 3 <=3D gr < 11 cover r3-r10 use and gr=3D11 implies overflow stack = area use. (gr being the ABI documents name.) >>>=20 >>> The clang code generation that I saw while analyzing the problem and = the clang source that you had me look at did not use that numbering. = Instead it seems to be based on 0 <=3D gpr < 9, where 0 <=3D gpr < 8 = cover r3-r10 use and gpr=3D8 implies overflow stack area use. (gpr being = what gdb showed me as I remember.) In other words: clang counts the = number of "parameter registers" already in use as it goes along instead = of tracking register numbers that have been used. >>>=20 >>> So assigning any value that appears to be positive and >=3D 8 should = work, such as: >>>=20 >>> Builder.CreateStore(Builder.getInt8(8), NumRegsAddr); >>=20 >> Can you check what number gcc uses? We want to be interoperable with = gcc. >>=20 >> Anyway, thanks for testing! >>=20 >> Roman >=20 > I'll do that check of gcc 4.2.1 code generation before starting the = test later today. >=20 > But if the clang numbering is different in gcc 4.2.1 then far more = than just adding a >=20 >> Builder.CreateStore(Builder.getInt8(?), NumRegsAddr) >=20 >=20 > for some "?" would need to be involved in the changes in order to = reach compatibility. >=20 >=20 > I'll note that for clang 3.8.0 the actual comparison instruction = generated is of the form >=20 >> cmplwi r?,7 >=20 >=20 > for some r?, such as r5 or r4, and the conditional branch generated is = a bgt instruction. >=20 > =3D=3D=3D > Mark Millard > markmi at dsl-only.net gcc 4.2.1 generates comparison instructions for va_arg of the form: cmplwi cr7,r0,8 and the conditional branch generated is a "bge cr7, . . ." instruction. So the same number range is in use by both compilers: They are = compatible for the bounds checks for reg vs. overflow for how they = count, equality inclusion/exclusion matching up with the specific number = (8 vs. 7) used to make things the same overall. Other aspects of the code generation distinctions would take me time to = analyze. It will be a while before I will be looking at other points. =3D=3D=3D Mark Millard markmi at dsl-only.net