Date:      Mon, 14 Jul 2014 09:20:55 -0700
From:      Justin Hibbits <jrh29@alumni.cwru.edu>
To:        Alexey Dokuchaev <danfe@nsu.ru>
Cc:        powerpc@freebsd.org
Subject:   Re: How to convert SSEish _mm_set1_ps() into AltiVec correctly?
Message-ID:  <CAHSQbTAG8rJbfyYG-FaQjuVm0ZYWAOLN6UcY-ycM%2Byuw0OvESw@mail.gmail.com>
In-Reply-To: <20140714154224.GA28612@regency.nsu.ru>
References:  <20140714154224.GA28612@regency.nsu.ru>

On Mon, Jul 14, 2014 at 8:42 AM, Alexey Dokuchaev <danfe@nsu.ru> wrote:
> Hi there,
>
> I'm a bit confused about how to convert the _mm_set1_ps() [1] SSE function
> into its AltiVec equivalent.  To start with, I need to set all four floats
> of a vector to the same value.  So far, I've come up with two versions that
> work with GCC or Clang, but I want code that works with any compiler and is
> technically correct (works not just by accident).
>
> On PowerPC, there are two altivec.h files provided by GCC 4.2 and Clang:
>
>     /usr/include/clang/3.4.1/altivec.h
>     /usr/include/gcc/4.2/altivec.h
>
> The problem is that they are substantially different (read: offer different
> APIs).  For Clang, I can simply write something like this:
>
>     union {
>         /* anonymous struct, so f1..f4 map to four distinct lanes */
>         struct { float f1, f2, f3, f4; };
>         vector float f;
>     } foo;
>
>     foo.f = vec_splats(42.f);
>     // all f1, f2, f3, f4 are 42.f now
>
> But this does not work with GCC: it simply does not offer vec_splats(float);
> however, I can do this:
>
>     float init = 42.f;
>     foo.f = vec_ld(0, &init);
>
> And it will set all four components to 42.f.  Yet this is technically wrong,
> as apparently I'm supposed to pass an entire array of floats; e.g. if built
> with Clang, all floats are "nan".  Let's change the code to this:
>
>     float init[4] = { 42.f };
>     foo.f = vec_ld(0, init);
>
> This works with both compilers, but I'm not sure if it is correct.  Can any
> of our AltiVec experts give me some hint here?  Thanks.
>
> ./danfe
>
> [1] http://msdn.microsoft.com/en-us/library/vstudio/2x1se8ha%28v=vs.100%29.aspx
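
On the vec_ld() variants above: as far as I know, vec_ld() always reads a full 16 bytes starting at the address rounded down to a 16-byte boundary, so loading from a lone float picks up whatever happens to sit next to it. Also, { 42.f } only sets element 0 of the array; the remaining elements are zero. A minimal sketch of a load-based version, assuming a 16-byte-aligned temporary with every lane filled in explicitly (load_set1 and v4f are names made up for this example, and the non-AltiVec branch is only there so the sketch builds on other targets):

```c
#include <string.h>

#if defined(__ALTIVEC__)
#include <altivec.h>
typedef vector float v4f;
#else
/* Fallback for non-AltiVec targets: GCC/Clang generic vector type. */
typedef float v4f __attribute__((vector_size(16)));
#endif

/* Hypothetical helper: broadcast x through an aligned four-float array. */
static v4f load_set1(float x)
{
    /* All four lanes are written out; the array is 16-byte aligned
     * because vec_ld() ignores the low four bits of the address. */
    float tmp[4] __attribute__((aligned(16))) = { x, x, x, x };
#if defined(__ALTIVEC__)
    return vec_ld(0, tmp);
#else
    v4f v;
    memcpy(&v, tmp, sizeof(v));   /* plain copy stands in for the load */
    return v;
#endif
}
```

Calling load_set1(42.0f) should then give a vector whose four lanes are all 42.0f, without depending on what lies next to a single float in memory.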

I just tried the following:

vector float a = (vector float){42.0f};
vector float b = vec_splat(a, 0);

I haven't done anything more than compile-test it, but it builds with
both GCC and Clang.  GCC emits vspltw, while Clang emits vperm.
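
Wrapped up as a helper, the two lines above might look like the sketch below. This is only a sketch: set1_ps, store4 and v4f are names invented here, I am assuming the compound literal accepts a non-constant value the same way it accepts 42.0f, and the #else branch uses GCC/Clang generic vectors purely so the file also builds on machines without AltiVec.

```c
#include <string.h>

#if defined(__ALTIVEC__)
#include <altivec.h>
typedef vector float v4f;

/* Put x in lane 0 (other lanes are zero), then copy lane 0 to all lanes. */
static v4f set1_ps(float x)
{
    v4f a = (v4f){ x };
    return vec_splat(a, 0);
}
#else
/* Fallback for non-AltiVec targets: GCC/Clang generic vector type. */
typedef float v4f __attribute__((vector_size(16)));

static v4f set1_ps(float x)
{
    return (v4f){ x, x, x, x };
}
#endif

/* Copy the four lanes into a plain array; memcpy avoids type punning. */
static void store4(float out[4], v4f v)
{
    memcpy(out, &v, sizeof(v));
}
```

To check the result at run time rather than just compile-testing, call store4(out, set1_ps(42.0f)) and compare each of the four elements of out against 42.0f.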

- Justin
