From owner-freebsd-hackers@freebsd.org Sat Mar 2 14:25:32 2019 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0D63B15258BB; Sat, 2 Mar 2019 14:25:32 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 56F636A093; Sat, 2 Mar 2019 14:25:31 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id x22EPMdu096463 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Sat, 2 Mar 2019 16:25:25 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua x22EPMdu096463 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id x22EPLVr096434; Sat, 2 Mar 2019 16:25:21 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 2 Mar 2019 16:25:21 +0200 From: Konstantin Belousov To: Bruce Evans Cc: Mark Millard , freebsd-hackers Hackers , FreeBSD PowerPC ML Subject: Re: powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed] Message-ID: <20190302142521.GE68879@kib.kiev.ua> References: <20190228145542.GT2420@kib.kiev.ua> <20190228150811.GU2420@kib.kiev.ua> <962D78C3-65BE-40C1-BB50-A0088223C17B@yahoo.com> <28C2BB0A-3DAA-4D18-A317-49A8DD52778F@yahoo.com> <20190301112717.GW2420@kib.kiev.ua> <20190302043936.A4444@besplex.bde.org> <20190301194217.GB68879@kib.kiev.ua> <20190302071425.G5025@besplex.bde.org> <20190302105140.GC68879@kib.kiev.ua> <20190302225513.W3408@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190302225513.W3408@besplex.bde.org> User-Agent: Mutt/1.11.3 (2019-02-01) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FORGED_GMAIL_RCVD,FREEMAIL_FROM, NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Mar 2019 14:25:32 -0000 On Sun, Mar 03, 2019 at 12:03:18AM +1100, Bruce Evans wrote: > On Sat, 2 Mar 2019, Konstantin Belousov wrote: > > > On Sat, Mar 02, 2019 at 07:38:20AM +1100, Bruce Evans wrote: > >> On Fri, 1 Mar 2019, Konstantin Belousov wrote: > >>> + scale = th->th_scale; > >>> + delta = tc_delta(th); > >>> + if (__predict_false(th->th_scale_bits + fls(delta) > 63)) { > >> > >> Better, but shouldn't be changed (and the bug that causes the large intervals > >> remains unlocated), and if it is changed then it should use: > >> > >> if (delta >= th->th_large_delta) > >> > >>> @@ -1464,6 +1483,11 @@ tc_windup(struct bintime *new_boottimebin) > >>> scale += (th->th_adjustment / 1024) * 2199; > >>> scale /= th->th_counter->tc_frequency; > >>> th->th_scale = scale * 2; > >>> +#ifdef _LP64 > >>> + th->th_scale_bits = ffsl(th->th_scale); > >>> +#else > >>> + th->th_scale_bits = ffsll(th->th_scale); > >>> +#endif > >> > >> th->th_large_delta = ((uint64_t)1 << 63) / scale; > > > > So I am able to reproduce it with some surprising ease on HPET running > > on Haswell. > > So what is the cause of it? Maybe the tickless code doesn't generate > fake clock ticks right. Or it is just a library bug. The kernel has > to be slightly real-time to satisfy the requirement of 1 update per. > Applications are further from being real-time. But isn't it enough > for the kernel to ensure that the timehands cycle more than once per > second? No, I entered ddb as you suggested. > > > I looked at the generated code for libc which still uses ffsll() on 32bit, > > due to the ABI issues. At least clang generates two BSF instructions for > > this code, so I think that forking vdso_timehands ABI for this is not > > reasonable right now. > > > > diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c > > index 3749e0473af..3c3c71207bd 100644 > > --- a/lib/libc/sys/__vdso_gettimeofday.c > > +++ b/lib/libc/sys/__vdso_gettimeofday.c > > @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$"); > > #include > > #include > > #include > > +#include > > +#include > > #include > > #include > > #include "libc_private.h" > > @@ -62,7 +64,8 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs) > > { > > struct vdso_timehands *th; > > uint32_t curr, gen; > > - u_int delta; > > + uint64_t scale, x; > > + u_int delta, scale_bits; > > int error; > > > > do { > > @@ -78,7 +81,19 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs) > > continue; > > if (error != 0) > > return (error); > > - bintime_addx(bt, th->th_scale * delta); > > + scale = th->th_scale; > > +#ifdef _LP64 > > + scale_bits = ffsl(scale); > > +#else > > + scale_bits = ffsll(scale); > > +#endif > > I see that there is an ABI problem adding th_large_delta. > > > + if (__predict_false(scale_bits + fls(delta) > 63)) { > > + x = (scale >> 32) * delta; > > + scale &= UINT_MAX; > > Should be UINT32_MAX or better 0xffffffff. Ok. > > > + bt->sec += x >> 32; > > + bintime_addx(bt, x << 32); > > + } > > + bintime_addx(bt, scale * delta); > > if (abs) > > bintime_add(bt, &th->th_boottime); > > I don't changing this at all this. binuptime() was carefully written > to not need so much 64-bit arithmetic. > > If this pessimization is allowed, then it can also handle a 64-bit > deltas. Using the better kernel method: > > if (__predict_false(delta >= th->th_large_delta)) { > bt->sec += (scale >> 32) * (delta >> 32); > x = (scale >> 32) * (delta & 0xffffffff); > bt->sec += x >> 32; > bintime_addx(bt, x << 32); > x = (scale & 0xffffffff) * (delta >> 32); > bt->sec += x >> 32; > bintime_addx(bt, x << 32); > bintime_addx(bt, (scale & 0xffffffff) * > (delta & 0xffffffff)); > } else > bintime_addx(bt, scale * (delta & 0xffffffff)); This only makes sense if delta is extended to uint64_t, which requires the pass over timecounters. > > I just noticed that there is a 64 x 32 -> 64 bit multiplication in the > current method. This can be changed to do expicit 32 x 32 -> 64 bit > multiplications and fix the overflow problem at small extra cost on > 32-bit arches: > > /* 32-bit arches did the next multiplication implicitly. */ > x = (scale >> 32) * delta; > /* > * And they did the following shifts and most of the adds > * implicitly too. Except shifting x left by 32 lost the > * seconds part that the next line handles. The next line > * is the only extra cost for them. > */ > bt->sec += x >> 32; > bintime_addx(bt, (x << 32) + (scale & 0xffffffff) * delta); Ok, what about the following. diff --git a/lib/libc/sys/__vdso_gettimeofday.c b/lib/libc/sys/__vdso_gettimeofday.c index 3749e0473af..cfe3d96d001 100644 --- a/lib/libc/sys/__vdso_gettimeofday.c +++ b/lib/libc/sys/__vdso_gettimeofday.c @@ -32,6 +32,8 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include +#include #include #include #include "libc_private.h" @@ -62,7 +64,8 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs) { struct vdso_timehands *th; uint32_t curr, gen; - u_int delta; + uint64_t scale, x; + u_int delta, scale_bits; int error; do { @@ -78,7 +81,19 @@ binuptime(struct bintime *bt, struct vdso_timekeep *tk, int abs) continue; if (error != 0) return (error); - bintime_addx(bt, th->th_scale * delta); + scale = th->th_scale; +#ifdef _LP64 + scale_bits = ffsl(scale); +#else + scale_bits = ffsll(scale); +#endif + if (__predict_false(scale_bits + fls(delta) > 63)) { + x = (scale >> 32) * delta; + scale &= 0xffffffff; + bt->sec += x >> 32; + bintime_addx(bt, x << 32); + } + bintime_addx(bt, scale * delta); if (abs) bintime_add(bt, &th->th_boottime); diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c index 2656fb4d22f..2e28f872229 100644 --- a/sys/kern/kern_tc.c +++ b/sys/kern/kern_tc.c @@ -72,6 +72,7 @@ struct timehands { struct timecounter *th_counter; int64_t th_adjustment; uint64_t th_scale; + uint64_t th_large_delta; u_int th_offset_count; struct bintime th_offset; struct bintime th_bintime; @@ -351,17 +352,44 @@ fbclock_getmicrotime(struct timeval *tvp) } while (gen == 0 || gen != th->th_generation); } #else /* !FFCLOCK */ + +static void +bintime_helper(struct bintime *bt, uint64_t *scale, u_int delta) +{ + uint64_t x; + + x = (*scale >> 32) * delta; + *scale &= 0xffffffff; + bt->sec += x >> 32; + bintime_addx(bt, x << 32); +} + void binuptime(struct bintime *bt) { struct timehands *th; - u_int gen; + uint64_t scale; + u_int delta, gen; do { th = timehands; gen = atomic_load_acq_int(&th->th_generation); *bt = th->th_offset; - bintime_addx(bt, th->th_scale * tc_delta(th)); + scale = th->th_scale; + delta = tc_delta(th); +#ifdef _LP64 + /* Avoid overflow for scale * delta. */ + if (__predict_false(th->th_large_delta <= delta)) + bintime_helper(bt, &scale, delta); + bintime_addx(bt, scale * delta); +#else + /* + * Also avoid (uint64_t, uint32_t) -> uint64_t + * multiplication on 32bit arches. + */ + bintime_helper(bt, &scale, delta); + bintime_addx(bt, (u_int)scale * delta); +#endif atomic_thread_fence_acq(); } while (gen == 0 || gen != th->th_generation); } @@ -388,13 +416,28 @@ void bintime(struct bintime *bt) { struct timehands *th; - u_int gen; + uint64_t scale; + u_int delta, gen; do { th = timehands; gen = atomic_load_acq_int(&th->th_generation); *bt = th->th_bintime; - bintime_addx(bt, th->th_scale * tc_delta(th)); + scale = th->th_scale; + delta = tc_delta(th); +#ifdef _LP64 + /* Avoid overflow for scale * delta. */ + if (__predict_false(th->th_large_delta <= delta)) + bintime_helper(bt, &scale, delta); + bintime_addx(bt, scale * delta); +#else + /* + * Also avoid (uint64_t, uint32_t) -> uint64_t + * multiplication on 32bit arches. + */ + bintime_helper(bt, &scale, delta); + bintime_addx(bt, (u_int)scale * delta); +#endif atomic_thread_fence_acq(); } while (gen == 0 || gen != th->th_generation); } @@ -1464,6 +1507,7 @@ tc_windup(struct bintime *new_boottimebin) scale += (th->th_adjustment / 1024) * 2199; scale /= th->th_counter->tc_frequency; th->th_scale = scale * 2; + th->th_large_delta = ((uint64_t)1 << 63) / scale; /* * Now that the struct timehands is again consistent, set the new