Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 16 Feb 2016 05:51:51 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        Roman Divacky <rdivacky@vlakno.cz>
Cc:        Nathan Whitehorn <nwhitehorn@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: I've submitted 207175 for a clang 3.8.0 va_list handling problem for powerpc
Message-ID:  <28FF474D-2109-4605-8B2B-C5374CBCCF42@dsl-only.net>
In-Reply-To: <74577A87-3006-43A9-9EAB-F51D946B6245@dsl-only.net>
References:  <F6846682-10F7-4D0D-A691-ED8D4366805C@dsl-only.net> <20160214192903.GA96697@vlakno.cz> <70B405C4-E1AC-4F35-9786-051FDA2F8BE7@dsl-only.net> <D7D536A4-68B6-4506-BDFB-8C2C41E1C958@dsl-only.net> <20160215191100.GA17387@vlakno.cz> <3A260EC5-E69A-4980-8F74-C04395F4E5F4@dsl-only.net> <20160215201800.GA20796@vlakno.cz> <D702187C-6B8C-42AC-855D-45A570FDB0FA@dsl-only.net> <D3351840-02B5-4696-8163-D90A6E039E4C@dsl-only.net> <74577A87-3006-43A9-9EAB-F51D946B6245@dsl-only.net>

next in thread | previous in thread | raw e-mail | index | archive | help
By the way: Nothing tested or seen so far checks DOUBLE_OR_FLOAT =
handling.

That involves fr (fpr in va_list in clang terms) instead of gr/gpr. =
fr/fpr has its own independent count and bound for using floating point =
registers vs. using the overflow area. There is also condition register =
bit 6 that indicates if floating point is involved overall or not.

Ultimately which of gpr vs. fpr and which bound (if the numbers are =
distinct in value) depends on the type specified in va_arg =
(SIMPLE_ARG/LONG_LONG vs. DOUBLE_OR_FLOAT status).

This may mean that the fix is an improvement for some types of usage but =
not a complete update: It is wrong for DOUBLE_OR_FLOAT instances of =
var_arg as stands. fpr would need to be involved instead. For world I =
expect it is fairly generally an improvement.

Also if the condition register indicates floating point is involved =
overall then there is likely management/handling of floating point state =
(for context switching management). (If it indicates no floating point =
involvement then there might be optimizations possible.)

=3D=3D=3D
Mark Millard
markmi at dsl-only.net

On 2016-Feb-16, at 2:45 AM, Mark Millard <markmi@dsl-only.net> wrote:

I used:

> Index: /usr/src/contrib/llvm/tools/clang/lib/CodeGen/TargetInfo.cpp
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> --- /usr/src/contrib/llvm/tools/clang/lib/CodeGen/TargetInfo.cpp	=
(revision 295601)
> +++ /usr/src/contrib/llvm/tools/clang/lib/CodeGen/TargetInfo.cpp	=
(working copy)
> @@ -3569,6 +3569,8 @@
>   {
>     CGF.EmitBlock(UsingOverflow);
>=20
> +    Builder.CreateStore(Builder.getInt8(8), NumRegsAddr);
> +
>     // Everything in the overflow area is rounded up to a size of at =
least 4.
>     CharUnits OverflowAreaAlign =3D CharUnits::fromQuantity(4);

as my test change. (Get evidence of operation before potential cleanup =
of the duplicated 8's.)

After a full buildworld/installworld based on the updated compiler. . .

My simple example of the problem no longer fails.

"ls -l -n /" now works.

"svnlite update -r295601 /usr/src" now works.

So whatever you want to do for the details of any submitted code, the =
basics of the change do avoid the SEGVs and allow these programs to =
work.


=3D=3D=3D
Mark Millard
markmi at dsl-only.net

On 2016-Feb-15, at 4:27 PM, Mark Millard <markmi@dsl-only.net> wrote:

On 2016-Feb-15, at 1:20 PM, Mark Millard <markmi at dsl-only.net> wrote:
>=20
> On 2016-Feb-15, at 12:18 PM, Roman Divacky <rdivacky at vlakno.cz> =
wrote:
>>=20
>> On Mon, Feb 15, 2016 at 12:17:50PM -0800, Mark Millard wrote:
>>> On 2016-Feb-15, at 11:11 AM, Roman Divacky <rdivacky at vlakno.cz> =
wrote:
>>>>=20
>>>> Mark, I believe you're right. What do you think about this patch?
>>>>=20
>>>> Index: tools/clang/lib/CodeGen/TargetInfo.cpp
>>>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>>>> --- tools/clang/lib/CodeGen/TargetInfo.cpp	(revision 260852)
>>>> +++ tools/clang/lib/CodeGen/TargetInfo.cpp	(working copy)
>>>> @@ -3599,6 +3599,8 @@
>>>> {
>>>> CGF.EmitBlock(UsingOverflow);
>>>>=20
>>>> +    Builder.CreateStore(Builder.getInt8(11), NumRegsAddr);
>>>> +
>>>> // Everything in the overflow area is rounded up to a size of at =
least 4.
>>>> CharUnits OverflowAreaAlign =3D CharUnits::fromQuantity(4);
>>>>=20
>>>>=20
>>>> Can you test it?
>>>=20
>>> It may be later today before I can start the the test process.
>>>=20
>>> While your change is not wrong as presented, it does seem to be =
based on the ABI document's numbering with the range 3 <=3D gr <12, =
where 3 <=3D gr < 11 cover r3-r10 use and gr=3D11 implies overflow stack =
area use. (gr being the ABI documents name.)
>>>=20
>>> The clang code generation that I saw while analyzing the problem and =
the clang source that you had me look at did not use that numbering. =
Instead it seems to be based on 0 <=3D gpr < 9, where 0 <=3D gpr < 8 =
cover r3-r10 use and gpr=3D8 implies overflow stack area use. (gpr being =
what gdb showed me as I remember.) In other words: clang counts the =
number of "parameter registers" already in use as it goes along instead =
of tracking register numbers that have been used.
>>>=20
>>> So assigning any value that appears to be positive and >=3D 8 should =
work, such as:
>>>=20
>>> Builder.CreateStore(Builder.getInt8(8), NumRegsAddr);
>>=20
>> Can you check what number gcc uses? We want to be interoperable with =
gcc.
>>=20
>> Anyway, thanks for testing!
>>=20
>> Roman
>=20
> I'll do that check of gcc 4.2.1 code generation before starting the =
test later today.
>=20
> But if the clang numbering is different in gcc 4.2.1 then far more =
than just adding a
>=20
>> Builder.CreateStore(Builder.getInt8(?), NumRegsAddr)
>=20
>=20
> for some "?" would need to be involved in the changes in order to =
reach compatibility.
>=20
>=20
> I'll note that for clang 3.8.0 the actual comparison instruction =
generated is of the form
>=20
>> cmplwi  r?,7
>=20
>=20
> for some r?, such as r5 or r4, and the conditional branch generated is =
a bgt instruction.
>=20
> =3D=3D=3D
> Mark Millard
> markmi at dsl-only.net

gcc 4.2.1 generates comparison instructions for va_arg of the form:

cmplwi  cr7,r0,8

and the conditional branch generated is a "bge cr7, . . ." instruction.

So the same number range is in use by both compilers: They are =
compatible for the bounds checks for reg vs. overflow for how they =
count, equality inclusion/exclusion matching up with the specific number =
(8 vs. 7) used to make things the same overall.

Other aspects of the code generation distinctions would take me time to =
analyze. It will be a while before I will be looking at other points.


=3D=3D=3D
Mark Millard
markmi at dsl-only.net







Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?28FF474D-2109-4605-8B2B-C5374CBCCF42>