Date:      Fri, 1 Nov 2013 22:43:20 +0700
From:      Alexey Dokuchaev <danfe@nsu.ru>
To:        hackers@freebsd.org
Subject:   Re: SSE2 intrinsics: gcc46 vs. clang contradiction
Message-ID:  <20131101154320.GA11359@regency.nsu.ru>
In-Reply-To: <20131101124645.GA73456@regency.nsu.ru>
References:  <20131101124645.GA73456@regency.nsu.ru>

On Fri, Nov 01, 2013 at 07:46:45PM +0700, Alexey Dokuchaev wrote:
> What adds to confusion, in their manual [1] Intel spells them differently
> themselves: first, in the table, it says:
> 
>   _mm_movpi64_epi64		Move		MOVDQ2Q
>               ^^^^^
> 
> Then later, when they describe what it does, it says:
> 
>   __m128i _mm_movpi64_pi64(__m64 a)
>                       ^^^^
>   Moves the 64 bits of a to the lower 64 bits of the result, zeroing the
>   upper bits.

Microsoft (http://msdn.microsoft.com/en-us/library/has3d153(v=vs.90).aspx)
defines these two:

  _mm_movepi64_pi64		MOVDQ2Q			Move
  _mm_movpi64_epi64		MOVQ2DQ			Move

That is:

  __m64 _mm_movepi64_pi64 (__m128i a);
  MOVDQ2Q
  r0 := a0 ;

  __m128i _mm_movpi64_epi64 (__m64 a);
  MOVQ2DQ
  r0 := a0 ; r1 := 0X0 ;

Cf. Intel's:

  _mm_movepi64_pi64		Move			MOVDQ2Q
  _mm_movpi64_epi64		Move			MOVDQ2Q

  __m64 _mm_movepi64_pi64(__m128i a)
  Returns the lower 64 bits of a as an __m64 type:	R0 := a0

  __m128i _mm_movpi64_pi64(__m64 a)
  Moves the 64 bits of a to the lower 64 bits
  of the result, zeroing the upper bits:		R0 := a0, R1 = 0X0

Assuming both documents assign instructions to function names correctly
(bonus clue: that also makes the two names symmetrical), _mm_movpi64_pi64
is indeed a typo in Intel's manual. Clang's header, which uses that
spelling, is wrong, while GCC's is correct: the function should be named
_mm_movpi64_epi64(), not _mm_movpi64_pi64().
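
To make the symmetry concrete, here is a minimal sketch of the pair in use.
It assumes an x86 target with SSE2 and a header that spells the widening
move _mm_movpi64_epi64 (as GCC's does). The helper names widen() and
narrow() are made up for illustration and do not come from any header.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the MMX __m64 type */
#include <stdint.h>
#include <string.h>

/* __m64 -> __m128i: low 64 bits copied, upper 64 bits zeroed. */
void widen(uint64_t in, uint64_t out[2])
{
    __m64 a;
    memcpy(&a, &in, sizeof a);            /* build the 64-bit MMX value */
    __m128i r = _mm_movpi64_epi64(a);     /* r0 := a0 ; r1 := 0 */
    _mm_storeu_si128((__m128i *)out, r);  /* out[0] = low half, out[1] = high half */
    _mm_empty();                          /* clear MMX state in case MMX registers were used */
}

/* __m128i -> __m64: returns the low 64 bits, discards the rest. */
uint64_t narrow(uint64_t lo, uint64_t hi)
{
    uint64_t in[2] = { lo, hi };
    uint64_t out;
    __m128i a = _mm_loadu_si128((const __m128i *)in);
    __m64 r = _mm_movepi64_pi64(a);       /* r0 := a0 */
    memcpy(&out, &r, sizeof out);
    _mm_empty();                          /* clear MMX state in case MMX registers were used */
    return out;
}
```

With a header that only offers the _mm_movpi64_pi64 spelling, widen() will
not compile as written, which is exactly the contradiction discussed in
this thread. On 32-bit x86 the build may need -msse2.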

./danfe


