Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 19 Nov 2018 13:57:08 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        Andriy Gapon <avg@freebsd.org>, "Rodney W. Grimes" <rgrimes@freebsd.org>,  Allan Jude <allanjude@freebsd.org>, Warner Losh <imp@freebsd.org>,  src-committers <src-committers@freebsd.org>, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r340450 - head/sys/sys
Message-ID:  <CANCZdfp-e2UQ=xPt_kfCcqzWsYsUaaP%2BDYyQf4uy574-VuQ4kA@mail.gmail.com>
In-Reply-To: <20181120023823.M1428@besplex.bde.org>
References:  <CANCZdfqr=R-MCpDEGWDqGYJbcQ46Hqw7PMrVinAsYTRPjLjJPA@mail.gmail.com> <201811190104.wAJ14CaE059062@pdx.rh.CN85.dnsmgr.net> <CANCZdfoGKKcL70ESKow=hfNABpO=5%2BQtUYcNmpt4gReRkeUvrA@mail.gmail.com> <5e227743-6463-d395-f2ba-da8d4ba248ca@FreeBSD.org> <CANCZdfo4=TknaHR0%2B3fTCDzKO_EHF0Vfmkooi92ZaGEnVfq2bA@mail.gmail.com> <20181120005944.B1050@besplex.bde.org> <20181120014951.J1250@besplex.bde.org> <20181120023823.M1428@besplex.bde.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, Nov 19, 2018 at 9:05 AM Bruce Evans <brde@optusnet.com.au> wrote:

> On Tue, 20 Nov 2018, Bruce Evans wrote:
>
> > On Tue, 20 Nov 2018, Bruce Evans wrote:
> >
> >> On Mon, 19 Nov 2018, Warner Losh wrote:
> >>
> >>> On Mon, Nov 19, 2018 at 12:31 AM Andriy Gapon <avg@freebsd.org> wrote:
> > ...
> > I found my test program.
> >
> >>> But I think I understand the problem now.
> >>>
> >>> mstosbt(1000) is overflowing with my change, but not without because
> we're
> >>> adding 2^32 to a number that's ~900 away from overflowing changing a
> very
> >>
> >> This value is just invalid.  Negative values are obviously invalid.
> >> Positive
> >> values of more than 1 second are invalid.  In the old version, values of
> >> precisely 1 second don't overflow since the scale factor is rounded
> down;
> >> the result is just 1 unit lower than 1 second.  Overflow (actually
> >> truncation)
> >> occurs at 1 unit above 1 second.  E.g., sbttoms(mstosbt(1000)) was 999
> and
> >> sbttoms(mstosbt(1001)) was 0.  Now in my fixed version,
> >> sbttoms(mstosbt(1000))
> >> is truncated to 0 and sbttoms(mstosbt(1001)) is truncated to 1.
> >>
> >> The test program also showed that in the old version, all valid args
> except
> >> 0 are reduced by precisely 1 by conversion to sbt and back.
> >>
> >>> large sleep of 1 second to a tiny sleep of 0 (which is 1ms at the
> default
> >>> Hz). I think we get this in the mlx code because there's a
> msleep(1000) in
> >>> at least one place. msleep(1001) would have failed before my change.
> Now I
> >>> think msleep(999) works, but msleep(1000) fails. Since the code is
> waiting
> >>> a second for hardware to initialize, a 1ms instead is likely to catch
> the
> >>> hardware before it's finished. I think this will cause problems no
> matter
> >>> cold or not, since it's all pause_sbt() under the covers and a delay
> of 0
> >>> is still 0 either way.
> >>
> >> Bug in the mlx code.  The seconds part must be converted separately, as
> in
> >> tstosbt().
> >
> > mlx doesn't seem to use sbt directly, or even msleep(), so the bug does
> > seem to be central.
> >
> > mlx actually uses mtx_sleep() with timo = hz().   mtx_sleep() is actually
> > an obfuscated macro wrapping _sleep().  The conversion to sbt is done by
> > the macro and is sloppy.  It multiplies timo by tick_sbt.
> >
> > tick_sbt is SBT_1S / hz, so the sbt is SBT_1S / hz * hz which is SBT_1S
> > reduced a little.  This is not affected by the new code, and it still
> isn't
> > converted back to 1 second in ms, us or ns.  Even if it is converted back
> > and then forth to sbt using the new code, it remains less than 1 second
> as
> > an sbt so shouldn't cause any new problems.
>
> Here are all uses of these functions in kern outside of sys/time.h:
>
> XX arm/arm/mpcore_timer.c:      sc->et.et_min_period = nstosbt(20);
>
> OK since 20 ns is less than 1 second.
>
> XX cddl/compat/opensolaris/sys/kcondvar.h:      return
> (cv_timedwait_sbt(cvp, mp, nstosbt(tim), nstosbt(res), 0));
> XX cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c:
> pause_sbt("dmu_tx_delay", nstosbt(wakeup),
> XX cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c:
>  nstosbt(zfs_delay_resolution_ns), C_ABSOLUTE);
> XX cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c:    sbintime_t sleep =
> nstosbt((zilog->zl_last_lwb_latency * pct) / 100);
> XX compat/linuxkpi/common/include/linux/delay.h:
> pause_sbt("lnxsleep", mstosbt(ms), 0, C_HARDCLOCK);
> XX compat/linuxkpi/common/src/linux_hrtimer.c:
> nstosbt(hrtimer->expires), nstosbt(hrtimer->precision), 0);
> XX compat/linuxkpi/common/src/linux_hrtimer.c:
> callout_reset_sbt(&hrtimer->callout, nstosbt(ktime_to_ns(time)),
> XX compat/linuxkpi/common/src/linux_hrtimer.c:      nstosbt(nsec),
> hrtimer_call_handler, hrtimer, 0);
> XX compat/linuxkpi/common/src/linux_hrtimer.c:
> callout_reset_sbt(&hrtimer->callout, nstosbt(ktime_to_ns(interval)),
> XX compat/linuxkpi/common/src/linux_hrtimer.c:
> nstosbt(hrtimer->precision), hrtimer_call_handler, hrtimer, 0);
> XX compat/linuxkpi/common/src/linux_schedule.c: ret =
> -pause_sbt("lnxsleep", mstosbt(ms), 0, C_HARDCLOCK | C_CATCH);
>
> All of the above might are broken unless their timeout arg is restricted to
> less than 1 second.
>
> Also, at least the sleep and pause calls in the above probably have a
> style bug in knowing about sbt's at all.  More important functions
> like msleep() hide the use of sbt's and use fuzzier scale factors like
> tick_sbt.
>

Yes. It's for these users I'm fixing the >= 1s cases.


> XX dev/iicbus/nxprtc.c:                 pause_sbt("nxpotp", mstosbt(100),
> mstosbt(10), 0);
> XX dev/iicbus/nxprtc.c:         pause_sbt("nxpbat", mstosbt(100), 0, 0);
>
> OK since 100 ms is less than 1 second.
>
> XX kern/kern_sysctl.c:  sb = ustosbt(tt);
> XX kern/kern_sysctl.c:  sb = mstosbt(tt);
>
> Recently added bugs.  The args is supplied by applications so can be far
> above
> 1 second.
>
> Also, these functions have a bogus API that takes uint64_t args.  sbt's
> can't represent that high, and the API doesn't support failure, so callers
> have the burden of passing valid values.  It is easiest to only support
> args
> up to 1 second and require the caller to handle seconds.
>

It's just as easy to cope.


> Lots of itimer and select() code handle the corresponding problem for
> timevals and timespecs almost correctly.  Timevals and timespecs already
> have the seconds part separate and itimer and select() syscalls do validity
> checking on the fractional seconds, so there is no problem converting the
> fractional sections to an sbt.  Howver, the seconds part has type time_t
> and this can be int64_t, which sbt's cannot represent.  Also, POSIX doesn't
> permit failure for seconds values that are representable by time_t.  The
> almost correct handling is to split up large seconds values in some cases
> and break POSIX conformance by silently reducing the seconds value in
> other cases.  The reduction is usually to INT32_MAX / 2.  This allows
> adding 2 limited values but it is not obvious that this is enough since
> rounding up twice or just adding 2 more would give overflow.
>

Right, that's beyond the scope of what I'm fixing.


> XX kern/subr_rtc.c:                     sbt = nstosbt(waitns);
>
> This is correct, since waitns is the nanoseconds part of a timespec.
>

Yup.

https://reviews.freebsd.org/D18051

Contains the changes. They address all the actual problems you've raised,
but I'm going to disagree that 'ull' is bogus. For FreeBSD, it's fine and
it's a lot more readable than the alternatives.

Warner



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfp-e2UQ=xPt_kfCcqzWsYsUaaP%2BDYyQf4uy574-VuQ4kA>