Date:      Fri, 25 Jan 2013 12:26:47 +0200
From:      Marin Atanasov Nikolov <dnaeon@gmail.com>
To:        Bob Bishop <rb@gid.co.uk>
Cc:        ml-freebsd-stable <freebsd-stable@freebsd.org>, John <john@theusgroup.com>
Subject:   Re: Spontaneous reboots on Intel i5 and FreeBSD 9.0
Message-ID:  <CAJ-UWtTNBLku6bLJiVw%2BjoU2Q02%2BiTYy_HfhpD_A4_g8YDf9uw@mail.gmail.com>
In-Reply-To: <D04EDC9B-27DD-422C-97B8-103B30DAC97D@gid.co.uk>
References:  <CAJ-UWtSANRMsOqwW9rJ6Eebta6=AiHeNO6fhPO0mhYhZiMmn4A@mail.gmail.com> <op.wq3zxn038527sy@ronaldradial.versatec.local> <alpine.BSF.2.00.1301180758460.96418@wonkity.com> <1358527685.32417.237.camel@revolution.hippie.lan> <20130118173602.GA76438@neutralgood.org> <alpine.BSF.2.00.1301181313560.1604@wonkity.com> <CAJ-UWtRRfCKg9GBR_ppvtjvJGadiOXMXBFBpX7tAvLEXDoZHQg@mail.gmail.com> <20130119201914.84B761CB@server.theusgroup.com> <CAJ-UWtR%2Bymv_%2BxpLcw01r9r=ym6gMh%2BHt4KfTabWQXXcAv5Ydw@mail.gmail.com> <CAJ-UWtT8pFn86OMpPG47ryKN%2B%2B=1KfaQX3JtbCLuu_kByvtMzA@mail.gmail.com> <D04EDC9B-27DD-422C-97B8-103B30DAC97D@gid.co.uk>

On Fri, Jan 25, 2013 at 12:12 PM, Bob Bishop <rb@gid.co.uk> wrote:

> Hi,
>
> On 25 Jan 2013, at 09:29, Marin Atanasov Nikolov wrote:
>
> > Hello again :)
> >
> > Here's my update on these spontaneous reboots, less than a week after
> > I updated to stable/9.
> >
> > For the first two days the system ran fine with no reboots, so I
> > thought that the update had actually fixed it, but I was wrong.
> >
> > The reboots are still happening, and there is still no clear evidence of
> > the root cause. What I have done so far:
> >
> > * Ran disk tests -- looking good
> > * Ran memtest -- looking good
> > * Replaced power cables
> > * Ran UPS tests -- looking good
> > * Checked for any bad capacitors -- none found
> > * Removed all ZFS snapshots
> >
> > There is also one more machine connected to the same UPS, so if it was a
> > UPS issue I'd expect that the other one reboots too, but that's not the
> > case.
> >
> > Now that I've excluded the hardware part of this problem
>
> Have you done anything to rule out the machine's power supply?
>
>

Hi,

Yes, it's a brand new one.

Regards,
Marin



> > I started looking
> > again into the software side, and this time in particular -- ZFS.
> >
> > I'm running FreeBSD 9.1-STABLE #1 r245686 on an Intel i5 with 8 GB of
> > memory.
> >
> > A quick look at top(1) showed the ARC using a lot of memory and my
> > free memory dropping fast. I've made a screenshot, which you can see at
> > the link below:
> >
> > * http://users.unix-heaven.org/~dnaeon/top-zfs-arc.jpg
> >
> > So I went to the FreeBSD Wiki and started reading the ZFS Tuning Guide [1],
> > but honestly at the end I was not sure which parameters I need to
> > increase/decrease and to what values.
> >
> > Here's some info about my current parameters.
> >
> >    % sysctl vm.kmem_size_max
> >    vm.kmem_size_max: 329853485875
> >
> >    % sysctl vm.kmem_size
> >    vm.kmem_size: 8279539712
> >
> >    % sysctl vfs.zfs.arc_max
> >    vfs.zfs.arc_max: 7205797888
> >
> >    % sysctl kern.maxvnodes
> >    kern.maxvnodes: 206227
> >
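
For reference, here is a small sketch of the arithmetic behind the values
quoted above. It converts the byte counts to MiB and shows how much of
vm.kmem_size is left once the ARC grows to vfs.zfs.arc_max. The helper
names (to_mib, pct, headroom) are made up for illustration, and the numbers
are simply the sysctl output pasted in by hand. With these values the
difference works out to exactly 1 GiB (1073741824 bytes).

```shell
#!/bin/sh
# Illustrative helpers only; names are hypothetical, not part of FreeBSD.

# Convert a byte count to whole MiB (integer division, rounds down).
to_mib() {
    echo $(( $1 / 1048576 ))
}

# Integer percentage of $1 relative to $2 (rounds down).
pct() {
    echo $(( $1 * 100 / $2 ))
}

# Bytes of $1 left over once $2 has been subtracted.
headroom() {
    echo $(( $1 - $2 ))
}

# Values copied from the sysctl output quoted in this message.
kmem_size=8279539712
arc_max=7205797888

echo "vm.kmem_size:    $(to_mib "$kmem_size") MiB"
echo "vfs.zfs.arc_max: $(to_mib "$arc_max") MiB ($(pct "$arc_max" "$kmem_size")% of kmem_size)"
echo "headroom:        $(headroom "$kmem_size" "$arc_max") bytes"
```

On a live system the two variables could be filled with
`sysctl -n vm.kmem_size` and `sysctl -n vfs.zfs.arc_max` instead of the
hard-coded numbers.
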
> > The ZFSTuningGuide includes a script which calculates kernel memory
> > utilization; on my system it reports the following values:
> >
> >    TEXT=22402749, 21.3649 MB
> >    DATA=4896264192, 4669.44 MB
> >    TOTAL=4918666941, 4690.81 MB
> >
> > While looking for ZFS tuning advice I've also stumbled upon this thread in
> > the FreeBSD Forums [2], where the OP describes behaviour similar to what I
> > am experiencing, so I'm quite worried now that the reason for these
> > crashes is ZFS.
> >
> > Before jumping into any change to the kernel parameters (vm.kmem_size,
> > vm.kmem_size_max, kern.maxvnodes, vfs.zfs.arc_max) I'd like to hear any
> > feedback from people that have already done such optimizations on their
> > ZFS systems.
> >
> > Could you please share what the optimal values for these parameters are
> > on a system with 8 GB of memory? Is there a way to calculate these values,
> > or is it just a "test-and-see-which-fits-better" way of doing this?
> >
> > Thanks and regards,
> > Marin
> >
> > [1]: https://wiki.freebsd.org/ZFSTuningGuide
> > [2]: http://forums.freebsd.org/showthread.php?t=9143
> >
> >
> > On Sun, Jan 20, 2013 at 3:44 PM, Marin Atanasov Nikolov <dnaeon@gmail.com> wrote:
> >
> >>
> >>
> >>
> >> On Sat, Jan 19, 2013 at 10:19 PM, John <john@theusgroup.com> wrote:
> >>
> >>>> At 03:00 I can see that periodic(8) runs, but I don't see what could have
> >>>> taken so much of the free memory. I'm also running this system on ZFS and
> >>>> have daily rotating ZFS snapshots created. Currently there are more than
> >>>> 1000 ZFS snapshots, and I'm not sure if that could be causing this. Here's
> >>>> a list of the periodic(8) daily scripts that run at 03:00.
> >>>>
> >>>> % ls -1 /etc/periodic/daily
> >>>> 800.scrub-zfs
> >>>>
> >>>> % ls -1 /usr/local/etc/periodic/daily
> >>>> 402.zfSnap
> >>>> 403.zfSnap_delete
> >>>
> >>> On a couple of my zfs machines, I've found running a scrub along with other
> >>> heavy file system users to be a problem.  I therefore run scrub from cron and
> >>> schedule it so it doesn't overlap with periodic.
> >>>
> >>> I also found on a machine with an i3 and 4 GB of RAM that overlapping scrubs
> >>> and snapshot destroys would cause the machine to grind to the point of being
> >>> non-responsive. This was not a problem when the machine was new, but became
> >>> one as the pool got larger (dedup is off and the pool is at 45% capacity).
> >>>
> >>> I use my own zfs management script and it prevents snapshot destroys from
> >>> overlapping scrubs, and with a lockfile it prevents a new destroy from being
> >>> initiated when an old one is still running.
> >>>
> >>> zfSnap has its -S switch to prevent actions during a scrub, which you should
> >>> use if you haven't already.
> >>>
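
The lockfile approach described above could look roughly like the sketch
below. This is not John's actual script, which was not posted; it is a
minimal illustration under a few assumptions: the lock is a directory
created with mkdir (which either succeeds or fails atomically), the lock
path and function names are hypothetical, and a running scrub is detected
by looking for the text "scrub in progress" in `zpool status` output.

```shell
#!/bin/sh
# Hypothetical lock location; override via the LOCKDIR environment variable.
LOCKDIR="${LOCKDIR:-/tmp/zfs-destroy.lock}"

# mkdir fails if the directory already exists, so it doubles as a lock.
acquire_lock() {
    mkdir "$LOCKDIR" 2>/dev/null
}

release_lock() {
    rmdir "$LOCKDIR" 2>/dev/null
}

# $1 is the text of `zpool status`; succeeds if a scrub appears to be running.
# Assumption: the status text contains "scrub in progress" during a scrub.
scrub_running() {
    case "$1" in
        *"scrub in progress"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Run a command only if no scrub is active and no earlier run holds the lock.
# $1 is the `zpool status` text, the remaining arguments are the command.
run_exclusive() {
    status_text=$1
    shift
    if scrub_running "$status_text"; then
        echo "scrub in progress, skipping" >&2
        return 1
    fi
    if ! acquire_lock; then
        echo "previous run still active, skipping" >&2
        return 1
    fi
    "$@"
    rc=$?
    release_lock
    return $rc
}

# Example (pool and snapshot names are placeholders):
# run_exclusive "$(zpool status tank)" zfs destroy tank/data@2013-01-01
```

One limitation of this sketch: if the script is killed before release_lock
runs, the lock directory stays behind and has to be removed by hand.
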
> >>>
> >> Hi John,
> >>
> >> Thanks for the hints. It has been a long time since I set up zfSnap, so
> >> I've just checked the configuration: I am using the "-s -S" flags, so
> >> there should be no overlapping.
> >>
> >> Meanwhile I've updated to 9.1-RELEASE, but then I hit an issue when trying
> >> to reboot the system (which appears to be discussed a lot in a separate
> >> thread).
> >>
> >> Then I updated to stable/9, so at least the reboot issue is now
> >> solved. Since updating to stable/9 I've been monitoring the system's memory
> >> usage and so far it's been pretty stable, so I'll keep an eye on whether the
> >> update to stable/9 has actually fixed this strange issue.
> >>
> >> Thanks again,
> >> Marin
> >>
> >>
> >>> Since making these changes, a machine that would have to be rebooted several
> >>> times a week has now been up 61 days.
> >>>
> >>> John Theus
> >>> TheUs Group
> >>>
> >>
> >>
> >>
> >> --
> >> Marin Atanasov Nikolov
> >>
> >> dnaeon AT gmail DOT com
> >> http://www.unix-heaven.org/
> >>
> >
> >
> >
> > --
> > Marin Atanasov Nikolov
> >
> > dnaeon AT gmail DOT com
> > http://www.unix-heaven.org/
> > _______________________________________________
> > freebsd-stable@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
> >
>
>
> --
> Bob Bishop          +44 (0)118 940 1243
> rb@gid.co.uk    fax +44 (0)118 940 1295
>              mobile +44 (0)783 626 4518
>
>
>
>
>
>


-- 
Marin Atanasov Nikolov

dnaeon AT gmail DOT com
http://www.unix-heaven.org/


