Date: Fri, 1 Nov 2013 22:43:20 +0700
From: Alexey Dokuchaev
To: hackers@freebsd.org
Subject: Re: SSE2 intrinsics: gcc46 vs. clang contradiction
Message-ID: <20131101154320.GA11359@regency.nsu.ru>
References: <20131101124645.GA73456@regency.nsu.ru>
In-Reply-To: <20131101124645.GA73456@regency.nsu.ru>

On Fri, Nov 01, 2013 at 07:46:45PM +0700, Alexey Dokuchaev wrote:
> What adds to confusion, in their manual [1] Intel spells them differently
> themselves: first, in the table, it says:
>
>     _mm_movpi64_epi64    Move    MOVDQ2Q
>                 ^^^^^
>
> Then later, when they describe what it does, it says:
>
>     __m128i _mm_movpi64_pi64(__m64 a)
>                         ^^^^
>     Moves the 64 bits of a to the lower 64 bits of the result, zeroing the
>     upper bits.

Microsoft (http://msdn.microsoft.com/en-us/library/has3d153(v=vs.90).aspx)
defines these two:

    _mm_movepi64_pi64    MOVDQ2Q    Move
    _mm_movpi64_epi64    MOVQ2DQ    Move

That is:

    __m64 _mm_movepi64_pi64 (__m128i a);
    MOVDQ2Q
    r0 := a0 ;

    __m128i _mm_movpi64_epi64 (__m64 a);
    MOVQ2DQ
    r0 := a0 ; r1 := 0X0 ;

Cf. Intel's:

    _mm_movepi64_pi64    Move    MOVDQ2Q
    _mm_movpi64_epi64    Move    MOVDQ2Q

    __m64 _mm_movepi64_pi64(__m128i a)
    Returns the lower 64 bits of a as an __m64 type:
    R0 := a0

    __m128i _mm_movpi64_pi64(__m64 a)
    Moves the 64 bits of a to the lower 64 bits of the result, zeroing the
    upper bits:
    R0 := a0, R1 = 0X0

Assuming that both documents correctly assign instructions to function names
(bonus clue: it also makes them symmetrical), then _mm_movpi64_pi64 is indeed
a typo and Clang's header is wrong, while GCC's is correct: it should read
_mm_movpi64_epi64(), not _mm_movpi64_pi64().

./danfe
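To make the symmetry argument concrete, here is a minimal portable sketch of what the two intrinsics are documented to do (r0 := a0 for one; r0 := a0, r1 := 0 for the other). It does not use the real <emmintrin.h> types: model_m64, model_m128i and the model_* functions are hypothetical stand-ins introduced only for illustration, so that the semantics can be checked on any compiler regardless of which spelling its header provides.

```c
#include <stdint.h>

/* Hypothetical stand-ins for __m64 and __m128i; NOT the real vector types. */
typedef struct { uint64_t q; } model_m64;            /* one 64-bit lane      */
typedef struct { uint64_t lo, hi; } model_m128i;     /* lanes a0 (lo), a1 (hi) */

/* Models _mm_movepi64_pi64 (MOVDQ2Q): r0 := a0.
 * Returns the lower 64 bits of the 128-bit value. */
static inline model_m64 model_movepi64_pi64(model_m128i a)
{
    model_m64 r = { a.lo };
    return r;
}

/* Models _mm_movpi64_epi64 (MOVQ2DQ): r0 := a0; r1 := 0.
 * Places the 64-bit value in the low lane and zeroes the upper lane. */
static inline model_m128i model_movpi64_epi64(model_m64 a)
{
    model_m128i r = { a.q, 0 };
    return r;
}
```

In this model, narrowing a 128-bit value and widening the result again preserves the low lane and clears the high one, which is the pairing the names _mm_movepi64_pi64 / _mm_movpi64_epi64 suggest.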