Date:      Fri, 25 Jan 2013 10:12:21 +0000
From:      Bob Bishop <rb@gid.co.uk>
To:        Marin Atanasov Nikolov <dnaeon@gmail.com>
Cc:        ml-freebsd-stable <freebsd-stable@freebsd.org>, John <john@theusgroup.com>
Subject:   Re: Spontaneous reboots on Intel i5 and FreeBSD 9.0
Message-ID:  <D04EDC9B-27DD-422C-97B8-103B30DAC97D@gid.co.uk>
In-Reply-To: <CAJ-UWtT8pFn86OMpPG47ryKN%2B%2B=1KfaQX3JtbCLuu_kByvtMzA@mail.gmail.com>
References:  <CAJ-UWtSANRMsOqwW9rJ6Eebta6=AiHeNO6fhPO0mhYhZiMmn4A@mail.gmail.com> <op.wq3zxn038527sy@ronaldradial.versatec.local> <alpine.BSF.2.00.1301180758460.96418@wonkity.com> <1358527685.32417.237.camel@revolution.hippie.lan> <20130118173602.GA76438@neutralgood.org> <alpine.BSF.2.00.1301181313560.1604@wonkity.com> <CAJ-UWtRRfCKg9GBR_ppvtjvJGadiOXMXBFBpX7tAvLEXDoZHQg@mail.gmail.com> <20130119201914.84B761CB@server.theusgroup.com> <CAJ-UWtR%2Bymv_%2BxpLcw01r9r=ym6gMh%2BHt4KfTabWQXXcAv5Ydw@mail.gmail.com> <CAJ-UWtT8pFn86OMpPG47ryKN%2B%2B=1KfaQX3JtbCLuu_kByvtMzA@mail.gmail.com>

Hi,

On 25 Jan 2013, at 09:29, Marin Atanasov Nikolov wrote:

> Hello again :)
>
> Here's my update on these spontaneous reboots, less than a week after
> updating to stable/9.
>
> For the first two days the system ran fine with no reboots, so I
> thought that this update had actually fixed it, but I was wrong.
>
> The reboots are still happening and there is still no clear evidence of the
> root cause. What I did so far:
>
> * Ran disks tests -- looking good
> * Ran memtest -- looking good
> * Replaced power cables
> * Ran UPS tests -- looking good
> * Checked for any bad capacitors -- none found
> * Removed all ZFS snapshots
>
> There is also one more machine connected to the same UPS, so if it was a
> UPS issue I'd expect that the other one reboots too, but that's not the
> case.
>
> Now that I've excluded the hardware part of this problem

Have you done anything to rule out the machine's power supply?

> I started looking
> again into the software side, and this time in particular -- ZFS.
>
> I'm running FreeBSD 9.1-STABLE #1 r245686 on an Intel i5 with 8 GB of memory.
>
> A quick look at top(1) showed lots of memory usage by ARC and my available
> free memory dropping fast. I've made a screenshot, which you can see at the
> link below:
>
> * http://users.unix-heaven.org/~dnaeon/top-zfs-arc.jpg
>
> So I went to the FreeBSD Wiki and started reading the ZFS Tuning Guide [1],
> but honestly at the end I was not sure which parameters I need to
> increase/decrease and to what values.
>
> Here's some info about my current parameters.
>=20
>    % sysctl vm.kmem_size_max
>    vm.kmem_size_max: 329853485875
>
>    % sysctl vm.kmem_size
>    vm.kmem_size: 8279539712
>
>    % sysctl vfs.zfs.arc_max
>    vfs.zfs.arc_max: 7205797888
>
>    % sysctl kern.maxvnodes
>    kern.maxvnodes: 206227
>
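
One thing the sysctl figures quoted above do show: vfs.zfs.arc_max sits exactly 1 GiB below vm.kmem_size, so the ARC is allowed to grow to most of kernel memory. A minimal Python sketch of that arithmetic follows. It uses only the two values quoted above; the variable names are mine, and it says nothing about how the kernel arrived at those values.

```python
# Values copied from the sysctl output quoted above.
GIB = 1024 ** 3
kmem_size = 8279539712   # vm.kmem_size
arc_max = 7205797888     # vfs.zfs.arc_max

# Kernel memory left over if the ARC grows to its configured ceiling.
headroom = kmem_size - arc_max
# Share of kernel memory the ARC is allowed to take.
arc_fraction = arc_max / kmem_size

print(f"headroom: {headroom / GIB:.2f} GiB")            # headroom: 1.00 GiB
print(f"arc_max / kmem_size: {arc_fraction:.1%}")       # arc_max / kmem_size: 87.0%
```
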
> There's one script at the ZFSTuningGuide which calculates kernel memory
> utilization, and for me these values are listed below:
>
>    TEXT=22402749, 21.3649 MB
>    DATA=4896264192, 4669.44 MB
>    TOTAL=4918666941, 4690.81 MB
>
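
As a sanity check on the script output quoted above: the MB figures are consistent with dividing the byte counts by 1024*1024, and TOTAL is simply TEXT plus DATA. A small Python sketch, using only numbers from this message (the comparison against vm.kmem_size is my own addition, not something the wiki script reports):

```python
# Byte counts copied from the quoted script output.
MIB = 1024 * 1024
text_bytes = 22402749
data_bytes = 4896264192
total_bytes = text_bytes + data_bytes

# vm.kmem_size from the sysctl output quoted earlier in this message.
kmem_size = 8279539712

for label, value in (("TEXT", text_bytes), ("DATA", data_bytes), ("TOTAL", total_bytes)):
    print(f"{label}={value}, {value / MIB:.2f} MB")

# How much of vm.kmem_size the reported kernel memory represents.
kmem_share = total_bytes / kmem_size
print(f"TOTAL / vm.kmem_size: {kmem_share:.1%}")
```
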
> While looking for ZFS tuning I've also stumbled upon this thread in the
> FreeBSD Forums [2], where the OP describes behaviour similar to what I am
> already experiencing, so I'm quite worried now that the reason for these
> crashes is ZFS.
>
> Before jumping into any change to the kernel parameters (vm.kmem_size,
> vm.kmem_size_max, kern.maxvnodes, vfs.zfs.arc_max) I'd like to hear any
> feedback from people that have already done such optimizations on their ZFS
> systems.
>
> Could you please share what are the optimal values for these parameters on
> a system with 8 GB of memory? Is there a way to calculate these values or is
> it just a "test-and-see-which-fits-better" way of doing this?
>
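
On the "is there a way to calculate these values" question, nothing in this thread gives a formula. Purely to illustrate the mechanics of capping the ARC, here is a hypothetical Python helper that turns a chosen share of physical memory into a /boot/loader.conf line for vfs.zfs.arc_max. The function name is made up, and the 50% share in the example is an arbitrary placeholder, not a recommendation.

```python
def arc_max_loader_line(phys_mem_bytes: int, arc_share: float) -> str:
    """Format a loader.conf line capping the ARC at a share of physical memory.

    arc_share is an assumption supplied by the caller; this thread does not
    establish what a good value is.
    """
    if not 0 < arc_share < 1:
        raise ValueError("arc_share must be between 0 and 1 (exclusive)")
    arc_max = int(phys_mem_bytes * arc_share)
    return f'vfs.zfs.arc_max="{arc_max}"'


# Example: an 8 GiB machine with a placeholder 50% cap.
print(arc_max_loader_line(8 * 1024 ** 3, 0.5))
```

Whatever value is chosen would still need the "test and see" step: watch the ARC size and free memory under the real workload after a reboot.
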
> Thanks and regards,
> Marin
>
> [1]: https://wiki.freebsd.org/ZFSTuningGuide
> [2]: http://forums.freebsd.org/showthread.php?t=9143
>
>
> On Sun, Jan 20, 2013 at 3:44 PM, Marin Atanasov Nikolov <dnaeon@gmail.com> wrote:
>
>>
>> On Sat, Jan 19, 2013 at 10:19 PM, John <john@theusgroup.com> wrote:
>>
>>>> At 03:00am I can see that periodic(8) runs, but I don't see what could
>>>> have taken so much of the free memory. I'm also running this system on ZFS
>>>> and have daily rotating ZFS snapshots created - currently the number of ZFS
>>>> snapshots is > 1000, and I'm not sure if that could be causing this. Here's
>>>> a list of the periodic(8) daily scripts that run at 03:00am.
>>>>
>>>> % ls -1 /etc/periodic/daily
>>>> 800.scrub-zfs
>>>>=20
>>>> % ls -1 /usr/local/etc/periodic/daily
>>>> 402.zfSnap
>>>> 403.zfSnap_delete
>>>
>>> On a couple of my zfs machines, I've found running a scrub along with other
>>> high file system users to be a problem.  I therefore run scrub from cron and
>>> schedule it so it doesn't overlap with periodic.
>>>
>>> I also found on a machine with an i3 and 4G ram that overlapping scrubs and
>>> snapshot destroy would cause the machine to grind to the point of being
>>> non-responsive. This was not a problem when the machine was new, but became
>>> one as the pool got larger (dedup is off and the pool is at 45% capacity).
>>>
>>> I use my own zfs management script and it prevents snapshot destroys from
>>> overlapping scrubs, and with a lockfile it prevents a new destroy from being
>>> initiated when an old one is still running.
>>>
>>> zfSnap has its -S switch to prevent actions during a scrub which you should
>>> use if you haven't already.
>>>
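
The lockfile idea John describes can be sketched in a few lines. This is not his script (the thread doesn't show it); it is a minimal Python illustration that uses flock(2) so the lock disappears automatically if the holder exits. The function name and the lock path are made up for the example, and the actual destroy step is left as a placeholder.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Yield True if the lock was obtained, False if another run holds it."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        try:
            # Non-blocking: fail immediately instead of queueing behind
            # a destroy that is still running.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False
            return
        yield True
    finally:
        # Closing the descriptor releases the lock if we held it.
        os.close(fd)


# Hypothetical lock location; a real script would pick its own path.
lock_path = os.path.join(tempfile.gettempdir(), "snapshot-destroy.lock")

with exclusive_lock(lock_path) as got_lock:
    if got_lock:
        print("lock acquired, safe to start the snapshot destroy here")
    else:
        print("previous destroy still running, skipping this run")
```

The same guard could wrap a scrub check as well, so a destroy is skipped both when an older destroy is still running and when a scrub is in progress.
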
>> Hi John,
>>
>> Thanks for the hints. It's been a long time since I set up zfSnap, and I've
>> just checked the configuration: I am using the "-s -S" flags, so there
>> should be no overlapping.
>>
>> Meanwhile I've updated to 9.1-RELEASE, but then I hit an issue when trying
>> to reboot the system (which appears to be discussed a lot in a separate
>> thread).
>>
>> Then I've updated to stable/9, so at least the reboot issue is now
>> solved. Since updating to stable/9 I've been monitoring the system's memory
>> usage and so far it's been pretty stable, so I'll keep an eye on whether the
>> update to stable/9 has actually fixed this strange issue.
>>
>> Thanks again,
>> Marin
>>
>>> Since making these changes, a machine that would have to be rebooted several
>>> times a week has now been up 61 days.
>>>
>>> John Theus
>>> TheUs Group
>>>
>>
>> --
>> Marin Atanasov Nikolov
>>
>> dnaeon AT gmail DOT com
>> http://www.unix-heaven.org/
>>
>
> --
> Marin Atanasov Nikolov
>
> dnaeon AT gmail DOT com
> http://www.unix-heaven.org/
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>


--
Bob Bishop          +44 (0)118 940 1243
rb@gid.co.uk    fax +44 (0)118 940 1295
             mobile +44 (0)783 626 4518







