Date:      Fri, 1 Mar 2019 12:45:29 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        Konstantin Belousov <kib@freebsd.org>
Cc:        bde@freebsd.org, freebsd-hackers Hackers <freebsd-hackers@freebsd.org>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]
Message-ID:  <679402FF-907C-43AF-B18C-8C9CC857D7A6@yahoo.com>
In-Reply-To: <20190301112717.GW2420@kib.kiev.ua>
References:  <D3D7E9F4-9A5E-4320-B3C8-EC5CEF4A2764@yahoo.com> <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua>



On 2019-Mar-1, at 03:27, Konstantin Belousov <kib@freebsd.org> wrote:

> On Thu, Feb 28, 2019 at 11:39:06PM -0800, Mark Millard wrote:
>> [The new, trial code also has truncation occurring.]
> In fact no, I do not think it is.
>
>> An example observation of diff_scaled having an overflowed
>> value was:
>>
>> scale_factor            == 0x80da2067ac
>> scale_factor*freq overflows unsigned, 64 bit representation.
>> tim_offset              ==   0x3da0eaeb
>> tim_cnt                 ==   0x42dea3c4
>> tim_diff                ==    0x53db8d9
>> For reference:                0x1fc9d43 == 0xffffffffffffffffull/scale_factor
>> scaled_diff       == 0xA353A5BF3FF780CC (truncated to 64 bits)
>>
>> But for the new, trial code:
>>
>> 0x80da2067ac is 40 bits
>>   0x53db8d9 is 27 bits
>> So 67 bits, more than 63. Then:
>>
>>   x
>> == (0x80da2067ac>>32) * 0x53db8d9
>> == 0x80 * 0x53db8d9
>> == 0x29EDC6C80
>>
>>   x>>32
>> == 0x2
>>
>>   x<<32
>> == 0x9EDC6C8000000000 (limited to 64 bits)
>> Note the truncation of: 0x29EDC6C8000000000.
> Right, this is how the patch is supposed to work.  Note that the overflow
> bits 'lost' due to overflow of the left shift are the same bits that are
> used to increment bt->sec:
> 	bt->sec += x >> 32;
> So the 2 seconds are accounted for.

Good to know.

>>
>> Thus the "bintime_addx(bt, x << 32)" is still
>> based on a truncated value.
>>
>
> I must admit that a 2 second interval where the timehands were
> not updated is too much.  This might be the real cause of all the ppc
> troubles.  I tried to see if the overflow case is possible on amd64,
> and did not get a single case of the '> 63' branch executed during the
> /usr/tests/lib/libc run.
>
> Actually, the same overflow-prone code exists in libc, so below is the
> updated patch:
> - I added __predict_false()
> - libc multiplication is also done separately for high-order bits.
> (ffclock counterpart is still pending).
>
> diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c
> index 3749e0473af..a14576988ff 100644
> --- a/lib/libc/sys/__vdso_gettimeofday.c
> +++ b/lib/libc/sys/__vdso_gettimeofday.c
> @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$");
> #include <sys/time.h>
> #include <sys/vdso.h>
> #include <errno.h>
> +#include <limits.h>
> +#include <strings.h>
> #include <time.h>
> #include <machine/atomic.h>
> #include "libc_private.h"
> @@ -62,6 +64,7 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
> {
> 	struct vdso_timehands *th;
> 	uint32_t curr, gen;
> +	uint64_t scale, x;
> 	u_int delta;
> 	int error;
>
> @@ -78,7 +81,14 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs)
> 			continue;
> 		if (error != 0)
> 			return (error);
> -		bintime_addx(bt, th->th_scale * delta);
> +		scale = th->th_scale;
> +		if (__predict_false(fls(scale) + fls(delta) > 63)) {
> +			x = (scale >> 32) * delta;
> +			scale &= UINT_MAX;
> +			bt->sec += x >> 32;
> +			bintime_addx(bt, x << 32);
> +		}
> +		bintime_addx(bt, scale * delta);
> 		if (abs)
> 			bintime_add(bt, &th->th_boottime);
>
> diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
> index 2656fb4d22f..be75781e000 100644
> --- a/sys/kern/kern_tc.c
> +++ b/sys/kern/kern_tc.c
> @@ -355,13 +355,22 @@ void
> binuptime(struct bintime *bt)
> {
> 	struct timehands *th;
> -	u_int gen;
> +	uint64_t scale, x;
> +	u_int delta, gen;
>
> 	do {
> 		th = timehands;
> 		gen = atomic_load_acq_int(&th->th_generation);
> 		*bt = th->th_offset;
> -		bintime_addx(bt, th->th_scale * tc_delta(th));
> +		scale = th->th_scale;
> +		delta = tc_delta(th);
> +		if (__predict_false(fls(scale) + fls(delta) > 63)) {
> +			x = (scale >> 32) * delta;
> +			scale &= UINT_MAX;
> +			bt->sec += x >> 32;
> +			bintime_addx(bt, x << 32);
> +		}
> +		bintime_addx(bt, scale * delta);
> 		atomic_thread_fence_acq();
> 	} while (gen == 0 || gen != th->th_generation);
> }
> @@ -388,13 +397,22 @@ void
> bintime(struct bintime *bt)
> {
> 	struct timehands *th;
> -	u_int gen;
> +	uint64_t scale, x;
> +	u_int delta, gen;
>
> 	do {
> 		th = timehands;
> 		gen = atomic_load_acq_int(&th->th_generation);
> 		*bt = th->th_bintime;
> -		bintime_addx(bt, th->th_scale * tc_delta(th));
> +		scale = th->th_scale;
> +		delta = tc_delta(th);
> +		if (__predict_false(fls(scale) + fls(delta) > 63)) {
> +			x = (scale >> 32) * delta;
> +			scale &= UINT_MAX;
> +			bt->sec += x >> 32;
> +			bintime_addx(bt, x << 32);
> +		}
> +		bintime_addx(bt, scale * delta);
> 		atomic_thread_fence_acq();
> 	} while (gen == 0 || gen != th->th_generation);
> }

Thanks.

I'll note that powerpc64 seems to use:

int
fls(int mask)
{
        int bit;

        if (mask == 0)
                return (0);
        for (bit = 1; mask != 1; bit++)
                mask = (unsigned int)mask >> 1;
        return (bit);
}

from /usr/src/sys/libkern/fls.c (unless I missed
finding an alternate machine-dependent substitution
somewhere). So lots of looping.

I've no clue whether this contributes to why, with your
prior patch, the powerpc64 spent most of its time
being non-responsive, leaving me unable to do anything
effective except escape into and use ddb. (I had to
power off to shut down, and I've had to boot from other
media to repair things after such.)

I expect I need to figure out how to investigate
the machinery underlying binuptime in order
to determine what produces the large deltas.

One point from my general understanding is that
keeping the tbrs reasonably in sync across the
2 sockets (2 cores per socket) in my context is
problematic/racy. (Not that I know the details
at this point.)

FYI:

# sysctl kern.timecounter
kern.timecounter.tc.timebase.quality: 0
kern.timecounter.tc.timebase.frequency: 33333333
kern.timecounter.tc.timebase.counter: 1831468476
kern.timecounter.tc.timebase.mask: 4294967295
kern.timecounter.stepwarnings: 0
kern.timecounter.alloweddeviation: 5
kern.timecounter.hardware: timebase
kern.timecounter.choice: timebase(0) dummy(-1000000)
kern.timecounter.tick: 1
kern.timecounter.fast_gettime: 1

At around 33.3 MHz, 2^32 counts take about 128
seconds, so that is roughly the wrap interval.

I do know that the tbr is 64 bits on powerpc64 -- and
is truncated to 32 bits for this use:

static unsigned
decr_get_timecount(struct timecounter *tc)
{
        return (mftb());
}

in /usr/src/sys/powerpc/powerpc/clock.c .

I'll keep trying to get evidence.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



