Date:      Tue, 23 Oct 2012 11:47:43 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Peter Wemm <peter@wemm.org>, freebsd-arch@freebsd.org
Subject:   Re: using SSE2 in kernel C code (improving AES-NI module)
Message-ID:  <20121023084743.GQ35915@deviant.kiev.zoral.com.ua>
In-Reply-To: <20121023070417.GD1563@funkthat.com>
References:  <20121019233833.GS1967@funkthat.com> <20121020054847.GB35915@deviant.kiev.zoral.com.ua> <20121020171124.GU1967@funkthat.com> <CAGE5yCoM92rU7Ca7C7_x=3vXW%2BqO9Zc0uQhPURuMbstPDvq9yg@mail.gmail.com> <20121021024726.GA1563@funkthat.com> <20121021061011.GG35915@deviant.kiev.zoral.com.ua> <20121023070417.GD1563@funkthat.com>

On Tue, Oct 23, 2012 at 12:04:17AM -0700, John-Mark Gurney wrote:
> Konstantin Belousov wrote this message on Sun, Oct 21, 2012 at 09:10 +0300:
> > On Sat, Oct 20, 2012 at 07:47:26PM -0700, John-Mark Gurney wrote:
> > > Peter Wemm wrote this message on Sat, Oct 20, 2012 at 11:10 -0700:
> > > > Or, another option.. do something like genassym or the many other
> > > > kernel build tools.  aicasm builds and runs a userland tool to
> > > > generate something to build into the kernel.  With sufficient
> > > > cross-contamination safeguards I wonder if something similar might be
> > > > able to be done here.
> > >
> > > Well, looks like I may have this working...  Turns out I can't name the file
> > > .s otherwise config puts it in SFILES which causes all sorts of problems..
> > > So, I went w/ .nos, does any one else have any suggestions?
> > >
> > > how does this look to people:
> > > aesni_wrap2.nos                 optional aesni                             \
> > >         dependency      "$S/crypto/aesni/aesni_wrap2.c"                    \
> > >         compile-with    "${CC} -O3 -fPIC -S -o aesni_wrap2.nos $S/crypto/aesni/aesni_wrap2.c" \
> > >         no-obj no-implicit-rule before-depend                              \
> > >         clean           "aesni_wrap2.nos"
> > > aesni_wrap2.o                   optional aesni                             \
> > >         dependency      "aesni_wrap2.nos"                                  \
> > >         compile-with    "${NORMAL_S} aesni_wrap2.nos"                      \
> > >         no-implicit-rule                                                   \
> > >         clean           "aesni_wrap2.o"
> > >
> > > We'll have to do something similar in the module Makefile, but that is
> > > easier...
> > >=20
> > > Also, I thought we had a better way to note that some devices depend
> > > upon others than just throwing a depend error...  If you include aesni
> > > w/o crypto, you get error about missing cryptodev_if.h...
> > >
> > Hm, if such a thing is possible, why do you need to compile through the
> > .S at all? All you need is to specify the special compiler flags,
> > including -msse and -msse2.
>
> Thanks, I managed to get it down to one...
>=20
> > Note, you should not need -fPIC, at least for amd64. I would suggest using
> > -O2, as well as trying to honour the -g settings.
>
> If I don't do -fpic I get:
> aesni_wrap2.o:(.eh_frame+0x20): relocation truncated to fit: R_X86_64_32 against `.text'
>
> when linking the kernel...  If you can explain to me how to get rid of
> this error, I'll do it..
Yes, because you need -mcmodel=kernel on amd64, but -fPIC on i386.
This is why I suggested using CFLAGS, which takes care of it in a single
place.

It would be a huge PITA to duplicate the per-arch kernel compilation flags
in some obscure place. The best would be to edit the CFLAGS in place,
if possible (I do not know make well enough to judge). A second possible
way is to add some variable like CFLAGS_SSE in a centralized place.
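
A minimal sketch of that second option, assuming the variable would sit
next to the other kernel flags (for example in sys/conf/kern.mk) and
assuming the xmmintrin.h problem mentioned below comes from -nostdinc in
the kernel CFLAGS. The name CFLAGS_SSE is only the suggestion above, not
something that exists in the tree:

```make
# Hypothetical fragment; CFLAGS_SSE is an assumed name, not an existing variable.
# Reuse the kernel CFLAGS so per-arch flags (-mcmodel=kernel and friends)
# stay defined in one place.
#   :N-nostdinc   drops -nostdinc so the compiler's own headers are found
#                 (assumption: this is what hides xmmintrin.h)
#   -msse -msse2  come last so they should override any earlier
#                 -mno-sse style flags
CFLAGS_SSE=	${CFLAGS:N-nostdinc} -msse -msse2
```

The compile-with line for aesni_wrap2.o could then pass ${CFLAGS_SSE}
instead of spelling out -fpic by hand, and the -O3 substitution could be
chained onto the same expansion, e.g. ${CFLAGS_SSE:C/^-O2$/-O3/}.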

>
> > Most likely, you can put the ${CFLAGS} on the command line, followed
> > by -msse -msse2.
>
> I can't use CFLAGS because it removes access to the xmmintrin.h header
> file...  It looks like an option is to use:
> -fpic ${OPTFLAGS:C/^-O2$/-O3/} ${DEBUG}
>
> In my testing, -O2 is significantly slower, hence the bump to -O3:
> x O2.txt
> + O3.txt
>     N           Min           Max        Median           Avg        Stddev
> x  20     1741.3491      1754.987     1752.9267     1751.5602     3.5616947
> +  20      2223.217     2244.4501     2242.7028     2240.3183     5.7020691
> Difference at 95.0% confidence
>         488.758 +/- 3.04271
>         27.9042% +/- 0.173715%
>         (Student's t, pooled s = 4.75391)
>
> Those are MB/sec...
I think that -O3 compiler output has to be validated manually, due to
the high-risk optimizations. Anyway, if it works there, great.

>
> Index: files.amd64
> ===================================================================
> --- files.amd64	(revision 241041)
> +++ files.amd64	(working copy)
> @@ -137,6 +137,11 @@
>  crypto/aesni/aeskeys_amd64.S	optional aesni
>  crypto/aesni/aesni.c		optional aesni
>  crypto/aesni/aesni_wrap.c	optional aesni
> +aesni_wrap2.o			optional aesni				   \
> +	dependency	"$S/crypto/aesni/aesni_wrap2.c"			   \
> +	compile-with    "${CC} -c -fpic ${COPTFLAGS:C/^-O2$/-O3/} ${DEBUG} -o aesni_wrap2.o $S/crypto/aesni/aesni_wrap2.c" \
> +	no-implicit-rule						   \
> +	clean           "aesni_wrap2.o"
>  crypto/blowfish/bf_enc.c	optional	crypto | ipsec
>  crypto/des/des_enc.c		optional	crypto | ipsec | netsmb
>  crypto/via/padlock.c		optional	padlock
>
>
> I still need to fix up i386, and will let people review a full patch
> to address both arches before committing...
>=20
> -- 
>   John-Mark Gurney				Voice: +1 415 225 5579
>
>      "All that I will do, has been done, All that I have, has not."
