Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 31 May 2011 09:48:26 -0700
From:      Artem Belevich <art@freebsd.org>
To:        Andriy Gapon <avg@freebsd.org>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: ZFS: arc_reclaim_thread running 100%, 8.1-RELEASE, LBOLT related
Message-ID:  <BANLkTinQkS36SQKpZhsavx3C_ad838DG=g@mail.gmail.com>
In-Reply-To: <4DE50811.5060606@FreeBSD.org>
References:  <0EFD28CD-F2E9-4AE2-B927-1D327EC99DB9@bitgravity.com> <BANLkTikVq0-En7=4Dy_dTf=tM55Cqou_mw@mail.gmail.com> <4DE50811.5060606@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, May 31, 2011 at 8:24 AM, Andriy Gapon <avg@freebsd.org> wrote:
>>> =A0 =A0 =A0 =A0 65 =A0 =A0 =A0 =A0 nsec =3D (hrtime_t)ts.tv_sec * NANOS=
EC + ts.tv_nsec;
>>
>> Yup. This would indeed overflow in ~106.75 days.
>
> Have you referred to the LBOLT above?
> gethrtime() should have several hundred years before overflowing.

hrtime_t is 64-bit. NANOSEC=3D1000000000.

When it's time to use LBOLT, we further multiply number of seconds by HZ:
>>         41 #define LBOLT   ((gethrtime() * hz) / NANOSEC)

In the end we want 64-bit scalar to hold number of seconds times 10e12.
0x7fffffffffffffff/1000000000000 =3D 9223372  # number of seconds before
signed overflow
9223372/(24*60*60) --> 106   # .. or about 106 days

>> The side effect is that it limits bolt resolution to hz units. With
>> HZ=3D100, that will be 10ms. Whether it's good enough or too coarse I
>
> Nope, I think you did your your math wrong here.
> As shown above it limits the resolution to hz ticks, i.e. hz * 1/hz secon=
ds :)

Point taken.

>
>> have no idea. Perhaps we can compromise and update lbolt in
>> microseconds. That should give us few hundred years until the
>> overflow.
>
> Well, we can either use the ticks variable, since we are not switching to=
 tickless
> mode yet. =A0But we would need to make it 64-bit to avoid early overflows=
.
> Or, perhaps, to be somewhat future-friendly we could do approximately wha=
t
> OpenSolaris [upstream :-)] code does:
>
> gethrtime() / (NANOSEC / hz)
>
> Or, given that NANOSEC is constant and hz is invariant, we could apply ex=
tended
> invariant division by multiplication approach to get precise and fast res=
ult
> without overflowing. =A0But likely that's an overkill here. =A0Though we =
definitely
> should pre-calculate, store and re-use (NANOSEC / hz) value just like Ope=
nSolaris
> does it.

This should work.

>
>>> clock_t will still need the typedef'ed to 64bit to still address the l2=
arc usage of LBOLT.
>
> Hm, I see that OpenSolaris defines clock_t to long, while we use int32_t.
> So, I think that it means two things:
> - i386 OpenSolaris (if it exists) should be affected by the problem as we=
ll
> - we should not use our native clock_t definition for ported OpenSolaris =
code
>
> Maybe we should fix our clock_t to be something wider at least for 64-bit
> platforms. =A0But I am not prepared to discuss this.

>
> To summarize:
> 1. We must fix ddi_get_lbolt*() to avoid unnecessary overflow in multipli=
cation

Agreed.

> 2. We could fix 31-bit clock_t overflow by using Solaris-compatible defin=
ition of
> it (long), but that still won't help on i386. =A0Maybe we should bring up=
 this issue
> to the attention of upstream ZFS developers. =A0Universally using ddi_get=
_lbolt64()
> seems like a safer bet.

FYI, we've already changed clock_t for opensolaris code to int64_t in
r218169 regardless of $MACHINE.

--Artem



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?BANLkTinQkS36SQKpZhsavx3C_ad838DG=g>