Date:      Mon, 14 Mar 2016 01:02:20 +0100
From:      Dimitry Andric <dim@FreeBSD.org>
To:        Steve Kargl <sgk@troutmask.apl.washington.edu>
Cc:        freebsd-toolchain@freebsd.org
Subject:   Re: clang gets numerical underflow wrong, please fix.
Message-ID:  <A70D119A-514A-4949-9BCB-CA344650BDB5@FreeBSD.org>
In-Reply-To: <20160313201004.GA26343@troutmask.apl.washington.edu>
References:  <20160313182521.GA25361@troutmask.apl.washington.edu> <74970883-FE44-47C0-BDA0-92DB0723398A@FreeBSD.org> <20160313201004.GA26343@troutmask.apl.washington.edu>

On 13 Mar 2016, at 21:10, Steve Kargl <sgk@troutmask.apl.washington.edu> wrote:
> On Sun, Mar 13, 2016 at 09:03:57PM +0100, Dimitry Andric wrote:
...
>> So it's storing the intermediate result in a double, for some reason.
>> The fnstsw will then result in zero, since there was no underflow at
>> that point.
>>
>> I will submit a bug for this upstream, thanks for the report.

Submitted upstream as: https://llvm.org/bugs/show_bug.cgi?id=26931


> Thanks for the quick reply.  But, it must be using an 80-bit
> extended double instead of a double for storage.  This variation
>
> #include <fenv.h>
> #include <stdio.h>
>
> int
> main(void)
> {
>   int i;
> //   float x = 1.f;
>   double x = 1.;
>   i = 0;
>   feclearexcept(FE_ALL_EXCEPT);
>   do {
>      x /= 2;
>      i++;
>   } while(!fetestexcept(FE_UNDERFLOW));
>   if (fetestexcept(FE_UNDERFLOW)) printf("FE_UNDERFLOW: ");
>   printf("x = %e after %d iterations\n", x, i);
>
>   return 0;
> }
>
> yields
>
> % cc -O -o z b.c -lm && ./z
> FE_UNDERFLOW: x = 0.000000e+00 after 16435 iterations
>
> It should be 1075 iterations.
>
> Note, there is a similar issue with OVERFLOW.  The upshot is
> that clang on current is probably miscompiling libm.

With this example, I also get different results from gcc (4.8.5),
depending on the optimization level:

$ gcc -O underflow-iter.c -o underflow-iter-gcc -lm
$ ./underflow-iter-gcc
FE_UNDERFLOW: x = 0.000000e+00 after 1075 iterations
$ gcc -O2 underflow-iter.c -o underflow-iter-gcc -lm
$ ./underflow-iter-gcc
FE_UNDERFLOW: x = 0.000000e+00 after 16435 iterations

Similar for the overflow case:

$ gcc -O overflow-iter.c -o overflow-iter-gcc -lm
$ ./overflow-iter-gcc
FE_OVERFLOW: x = inf after 1024 iterations
$ gcc -O2 overflow-iter.c -o overflow-iter-gcc -lm
$ ./overflow-iter-gcc
FE_OVERFLOW: x = inf after 16384 iterations
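
For reference, both pairs of counts are what the floating-point formats
themselves predict.  The sketch below is only an illustration and not
code from this thread: the helper names are made up, and the 53-bit
significand used for the x87 case is an assumption (that the x87 runs in
53-bit precision mode while keeping the extended exponent range, which
is how FreeBSD/i386 is usually described).

```c
#include <stdio.h>

/*
 * Hypothetical helpers, not part of the test programs in this thread.
 * mant_dig, min_exp and max_exp follow the <float.h> conventions
 * (e.g. DBL_MANT_DIG, DBL_MIN_EXP, DBL_MAX_EXP).
 */

/* Halvings of 1.0 until the result is first inexact, i.e. underflows. */
int
halvings_until_underflow(int mant_dig, int min_exp)
{
	/*
	 * The smallest subnormal is 2^(min_exp - mant_dig); every halving
	 * down to it is exact, and the next one is not.
	 */
	return (mant_dig - min_exp + 1);
}

/* Doublings of 1.0 until the result exceeds the largest finite value. */
int
doublings_until_overflow(int max_exp)
{
	/* The largest finite power of two is 2^(max_exp - 1). */
	return (max_exp);
}

void
print_expected_counts(void)
{
	/* IEEE double: 53-bit significand, min_exp -1021, max_exp 1024. */
	printf("double: underflow %d, overflow %d\n",
	    halvings_until_underflow(53, -1021),
	    doublings_until_overflow(1024));

	/* Assumed x87 setup: 53-bit significand, extended exponent range. */
	printf("x87:    underflow %d, overflow %d\n",
	    halvings_until_underflow(53, -16381),
	    doublings_until_overflow(16384));
}
```

Under those assumptions the first line gives the 1075 and 1024 seen with
-O, and the second gives the 16435 and 16384 seen with -O2, which would
fit the value staying in an x87 register for the whole loop.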

Are we depending on some sort of subtle undefined behavior here?  With
-O, the 'main loop' of the overflow test (note the fadd and the test
against $8) becomes:

.L3:
	fld1
	fstpl	24(%esp)
	movl	$0, %ebx
.L8:
	fldl	24(%esp)
	fld	%st(0)
	faddp	%st, %st(1)
	fstpl	24(%esp)
	addl	$1, %ebx
	fnstsw %ax
	movl	%eax, %esi
	movl	__has_sse, %eax
	testl	%eax, %eax
	je	.L4
	cmpl	$2, %eax
	jne	.L5
	call	__test_sse
	testl	%eax, %eax
	je	.L5
.L4:
	stmxcsr 44(%esp)
	jmp	.L6
.L5:
	movl	$0, 44(%esp)
.L6:
	orl	44(%esp), %esi
	testl	$8, %esi
	je	.L8

With -O2, it becomes:

.L3:
	fld1
	xorl	%ebx, %ebx
.L12:
	fadd	%st(0), %st
	addl	$1, %ebx
	fnstsw %ax
	testl	%edx, %edx
	movl	%eax, %esi
	je	.L10
	cmpl	$2, %edx
	je	.L27
.L9:
	xorl	%eax, %eax
.L8:
	orl	%eax, %esi
	andl	$8, %esi
	je	.L12

So it switches from faddp followed by fstpl (a store to the stack) to a
direct fadd of %st(0) and %st, with no store in between.  I assume that
keeps the value in the internal 80-bit format?  Gcc also manages to move
the __has_sse stuff further down in the function, but that does not
really affect the result.

-Dimitry

