Date:      Sat, 2 Nov 2013 01:07:47 +0100
From:      Dimitry Andric <dim@FreeBSD.org>
To:        "C. Bergström" <cbergstrom@pathscale.com>
Cc:        Alexey Dokuchaev <danfe@nsu.ru>, hackers@freebsd.org, cfe-commits@cs.uiuc.edu
Subject:   Re: SSE2 intrinsics: gcc46 vs. clang contradiction
Message-ID:  <A7CD3ADC-872C-40D7-B48A-C0C1A8FA885A@FreeBSD.org>
In-Reply-To: <52742115.9010404@pathscale.com>
References:  <20131101124645.GA73456@regency.nsu.ru> <20131101154320.GA11359@regency.nsu.ru> <52742115.9010404@pathscale.com>

On 01 Nov 2013, at 22:45, C. Bergström <cbergstrom@pathscale.com> wrote:
> On 11/ 1/13 10:43 PM, Alexey Dokuchaev wrote:
>> On Fri, Nov 01, 2013 at 07:46:45PM +0700, Alexey Dokuchaev wrote:
>>> What adds to confusion, in their manual [1] Intel spells them differently
>>> themselves: first, in the table, it says:
>>>
>>>   _mm_movpi64_epi64		Move		MOVDQ2Q
>>>               ^^^^^
>>>
>>> Then later, when they describe what it does, it says:
>>>
>>>   __m128i _mm_movpi64_pi64(__m64 a)
>>>                       ^^^^
>>>   Moves the 64 bits of a to the lower 64 bits of the result, zeroing the
>>>   upper bits.
>> Microsoft (http://msdn.microsoft.com/en-us/library/has3d153(v=vs.90).aspx)
>> defines these two:
>>
>>   _mm_movepi64_pi64		MOVDQ2Q			Move
>>   _mm_movpi64_epi64		MOVQ2DQ			Move
>>
>> That is:
>>
>>   __m64 _mm_movepi64_pi64 (__m128i a);
>>   MOVDQ2Q
>>   r0 := a0 ;
>>
>>   __m128i _mm_movpi64_epi64 (__m64 a);
>>   MOVDQ2Q
>>   r0 := a0 ; r1 := 0X0 ;
>>
>> Cf. Intel's:
>>
>>   _mm_movepi64_pi64		Move			MOVDQ2Q
>>   _mm_movpi64_epi64		Move			MOVDQ2Q
>>
>>   __m64 _mm_movepi64_pi64(__m128i a)
>>   Returns the lower 64 bits of a as an __m64 type:	R0 := a0
>>
>>   __m128i _mm_movpi64_pi64(__m64 a)
>>   Moves the 64 bits of a to the lower 64 bits
>>   of the result, zeroing the upper bits:		R0 := a0, R1 = 0X0
>>
>> Assuming that both documents correctly assign instructions to function
>> names (bonus clue: it also makes them symmetrical), then _mm_movpi64_pi64
>> is indeed a typo and Clang's header is wrong, while GCC's is correct: it
>> should read _mm_movpi64_epi64(), not _mm_movpi64_pi64().
> Why isn't this being asked on the clang or llvm mailing list? Wouldn't
> this impact upstream as well?

Indeed, so redirecting to the cfe-commits list.  It looks like this
incorrect function name has been in emmintrin.h since clang r61443 (by
andersca).  Basically, we need the typo fixed as follows:

Index: tools/clang/lib/Headers/emmintrin.h
===================================================================
--- tools/clang/lib/Headers/emmintrin.h (revision 193039)
+++ tools/clang/lib/Headers/emmintrin.h (working copy)
@@ -1366,7 +1366,7 @@ _mm_movepi64_pi64(__m128i __a)
 }

 static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
-_mm_movpi64_pi64(__m64 __a)
+_mm_movpi64_epi64(__m64 __a)
 {
   return (__m128i){ (long long)__a, 0 };
 }
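
To illustrate what the two intrinsics discussed above are documented to do,
here is a minimal sketch, not part of the patch. It assumes an x86 target
with MMX and SSE2 enabled and a header that already provides the corrected
name _mm_movpi64_epi64(); the helper names widen_mmx() and narrow_sse() are
made up for this example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the MMX __m64 type */
#include <stdint.h>
#include <string.h>

/* __m64 -> __m128i: the low 64 bits take the input, the upper 64 are zeroed. */
static void widen_mmx(uint32_t hi, uint32_t lo, uint64_t out[2])
{
    __m64 a = _mm_set_pi32((int)hi, (int)lo);  /* second argument is the low half */
    __m128i r = _mm_movpi64_epi64(a);          /* corrected name, as in GCC's header */
    _mm_storeu_si128((__m128i *)out, r);       /* out[0] = low half, out[1] = high half */
    _mm_empty();                               /* clear MMX state before returning */
}

/* __m128i -> __m64: returns only the low 64 bits of the input. */
static uint64_t narrow_sse(const uint64_t in[2])
{
    __m128i a = _mm_loadu_si128((const __m128i *)in);
    __m64 r = _mm_movepi64_pi64(a);
    uint64_t out;
    memcpy(&out, &r, sizeof out);              /* copy the 8 bytes out of the __m64 */
    _mm_empty();
    return out;
}
```

With a header that still carries the old spelling, the call in widen_mmx()
should fail to compile, which is the mismatch against gcc46 reported earlier
in the thread.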

Is this OK?

-Dimitry

