Date:      Mon, 13 Jul 2015 10:29:40 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        Christopher Forgeron <csforgeron@gmail.com>
Cc:        FreeBSD Stable Mailing List <freebsd-stable@freebsd.org>,  FreeBSD Filesystems <freebsd-fs@freebsd.org>
Subject:   Re: FreeBSD 10.1 Memory Exhaustion
Message-ID:  <CAJ-Vmom58SjgOG7HYPE4MVaB=XPaEkx_OTYgvOTHxwqGnTxtug@mail.gmail.com>
In-Reply-To: <CAB2_NwCngPqFH4q-YZk00RO_aVF9JraeSsVX3xS0z5EV3YGa1Q@mail.gmail.com>
References:  <CAB2_NwCngPqFH4q-YZk00RO_aVF9JraeSsVX3xS0z5EV3YGa1Q@mail.gmail.com>

hi,

With that much storage and that many snapshots, I do think you need
more than 96GB of RAM in the box. I'm hoping someone doing active ZFS
work can comment.

I don't think the ZFS code is completely "memory usage" safe. The
"old" Sun suggestion from when I started using ZFS was "if your server
panics due to running out of memory with ZFS, buy more memory."

That said, it doesn't look like there's a leak anywhere - those
dumps show at least 32 GiB on each box tied up in ZFS data buffers
alone.
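
One way to ballpark that figure from your vmstat -z dumps (assuming the
comma-separated field layout that 10.x prints) is something like:

    # sum SIZE * USED over the ZFS I/O buffer zones, report in GiB
    vmstat -z | awk -F'[:,]' '/zio_(data_)?buf/ { t += $2 * $4 }
        END { printf "%.1f GiB in zio buffers\n", t / 2^30 }'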

Try tuning the ARC down a little?
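
For example, a lower cap in /boot/loader.conf (the 64 GiB value is only an
illustration, not a recommendation for your workload):

    # cap the ARC well below the 96 GiB of physical RAM
    vfs.zfs.arc_max=68719476736    # 64 GiB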



-adrian


On 13 July 2015 at 04:48, Christopher Forgeron <csforgeron@gmail.com> wrote:
>
>
> TL;DR Summary: I can run FreeBSD out of memory quite consistently, and it's
> not a TSO/mbuf exhaustion issue. It's quite possible that ZFS is the
> culprit, but shouldn't the pager be able to handle aggressive memory
> requests in a low-memory situation gracefully, without needing custom tuning
> of ZFS / VM?
>
>
> Hello,
>
> I've been dealing with some instability in my 10.1-RELEASE and
> 10.1-STABLE r282701M machines for the last few months.
>
> These machines are NFS/iSCSI storage machines, running on Dell M610x or
> similar hardware: 96 GiB of memory, 10 GbE network cards, dual Xeon
> processors - fairly beefy stuff.
>
> Initially I thought it was more issues with TSO / jumbo mbufs, as I had this
> problem last year. I had thought that this was properly resolved, but
> setting my MTU to 1500 and turning off TSO did give me a bit more
> stability. Currently all my machines are set this way.
>
> Crashes were usually represented by loss of network connectivity, and the
> ctld daemon scrolling messages across the screen at full speed about lost
> connections.
>
> All of this did seem like more network stack problems, but with each crash
> I'd be able to learn a bit more.
>
> Usually there was nothing of any use in the logfile, but every now and then
> I'd get this:
>
> Jun  3 13:02:04 san0 kernel: WARNING: 172.16.0.97
> (iqn.1998-01.com.vmware:esx5a-3387a188): failed to allocate memory
> Jun  3 13:02:04 san0 kernel: WARNING: icl_pdu_new: failed to allocate 80
> bytes
> Jun  3 13:02:04 san0 kernel: WARNING: 172.16.0.97
> (iqn.1998-01.com.vmware:esx5a-3387a188): failed to allocate memory
> Jun  3 13:02:04 san0 kernel: WARNING: icl_pdu_new: failed to allocate 80
> bytes
> Jun  3 13:02:04 san0 kernel: WARNING: 172.16.0.97
> (iqn.1998-01.com.vmware:esx5a-3387a188): failed to allocate memory
> ---------
> Jun  4 03:03:09 san0 kernel: WARNING: icl_pdu_new: failed to allocate 80
> bytes
> Jun  4 03:03:09 san0 kernel: WARNING: icl_pdu_new: failed to allocate 80
> bytes
> Jun  4 03:03:09 san0 kernel: WARNING: 172.16.0.97
> (iqn.1998-01.com.vmware:esx5a-3387a188): failed to allocate memory
> Jun  4 03:03:09 san0 kernel: WARNING: 172.16.0.97
> (iqn.1998-01.com.vmware:esx5a-3387a188): connection error; dropping
> connection
> Jun  4 03:03:09 san0 kernel: WARNING: 172.16.0.97
> (iqn.1998-01.com.vmware:esx5a-3387a188): connection error; dropping
> connection
> Jun  4 03:03:10 san0 kernel: WARNING: 172.16.0.97
> (iqn.1998-01.com.vmware:esx5a-3387a188): waiting for CTL to terminate tasks,
> 1 remaining
> Jun  4 06:04:27 san0 syslogd: kernel boot file is /boot/kernel/kernel
>
> So, knowing that it seemed to be running out of memory, I started leaving
> 'vmstat 5' running on a console, to see what it was displaying
> during the crash.
>
> It was always the same thing:
>
>  0 0 0   1520M  4408M    15   0   0   0    25  19   0   0 21962 1667 91390  0 33 67
>  0 0 0   1520M  4310M     9   0   0   0     2  15   3   0 21527 1385 95165  0 31 69
>  0 0 0   1520M  4254M     7   0   0   0    14  19   0   0 17664 1739 72873  0 18 82
>  0 0 0   1520M  4145M     2   0   0   0     0  19   0   0 23557 1447 96941  0 36 64
>  0 0 0   1520M  4013M     4   0   0   0    14  19   0   0 4288  490 34685   0 72 28
>  0 0 0   1520M  3885M     2   0   0   0     0  19   0   0 11141 1038 69242  0 52 48
>  0 0 0   1520M  3803M    10   0   0   0    14  19   0   0 24102 1834 91050  0 33 67
>  0 0 0   1520M  8192B     2   0   0   0     2  15   1   0 19037 1131 77470  0 45 55
>  0 0 0   1520M  8192B     0  22   0   0     2   0   6   0  146   82  578   0  0 100
>  0 0 0   1520M  8192B     1   0   0   0     0   0   0   0  130   40  510   0  0 100
>  0 0 0   1520M  8192B     0   0   0   0     0   0   0   0  143   40  501   0  0 100
>  0 0 0   1520M  8192B     0   0   0   0     0   0   0   0  201   62  660   0  0 100
>  0 0 0   1520M  8192B     0   0   0   0     0   0   0   0  101   28  404   0  0 100
>  0 0 0   1520M  8192B     0   0   0   0     0   0   0   0   97   27  398   0  0 100
>  0 0 0   1520M  8192B     0   0   0   0     0   0   0   0   93   28  377   0  0 100
>  0 0 0   1520M  8192B     0   0   0   0     0   0   0   0   92   27  373   0  0 100
>
>
>  I'd go from a decent amount of free memory to suddenly having none. Vmstat
> would stop outputting, console commands would hang, etc. The whole system
> would be useless.
>
> Looking into this, I came across a similar issue:
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199189
>
> I started increasing vm.v_free_min, and it helped - my crashes went from
> being ~every 6 hours to every few days.
>
> Currently I'm running with vm.v_free_min=1254507 - that's (1254507 * 4 KiB),
> or 4.78 GiB of reserve. The vmstat above is of a machine with that setting
> still running down to 8192 B of free memory.
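>
> As a quick sanity check of that figure (assuming the usual 4 KiB page
> size reported by sysctl hw.pagesize):
>
>         $ echo "1254507 * 4096" | bc
>         5138460672          # ~4.78 GiB held back as the free-page floor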
>
> I have two issues here:
>
> 1) I don't think I should ever be able to run the system into the ground on
> memory. Deny me new memory until the pager can free more.
> 2) Setting 'min' doesn't really mean 'min', as it can obviously go below that
> threshold.
>
>
> I have plenty of local UFS swap (non-ZFS drives).
>
>  Adrian requested that I output a few more diagnostic items, and this is
> what I'm running on a console now, in a loop (a shell-loop sketch follows):
>
>         vmstat
>         netstat -m
>         vmstat -z
>         sleep 1
>
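> (Roughly the equivalent of a small shell loop like the one below; the log
> file name is only an example:)
>
>         while true; do
>                 date
>                 vmstat
>                 netstat -m
>                 vmstat -z
>                 sleep 1
>         done | tee /var/tmp/memdiag.log
>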
> The output of four crashes is attached here, as they can be a bit long. Let
> me know if that's not a good way to report them. They will each start
> mid-way through a vmstat -z output, as that's as far back as my terminal
> buffer allows.
>
>
>
> Now, I have a good idea of the conditions that are causing this: ZFS
> Snapshots, run by cron, during times of high ZFS writes.
>
> The crashes are all nearly on the hour, as that's when crontab triggers my
> python scripts to make new snapshots and delete old ones.
>
> My average FreeBSD machine has ~30 ZFS datasets, with each pool having ~20
> TiB used. These all need to snapshot on the hour.
>
> By staggering the snapshots by a few minutes, I have been able to reduce
> crashing from every other day to perhaps once a week if I'm lucky - but if I
> start moving a lot of data around, I can cause daily crashes again.
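>
> (For illustration only - hypothetical crontab entries staggering a
> snapshot script across the hour; the script path and dataset groups here
> are made up:)
>
>         # /etc/crontab
>         0  *  *  *  *  root  /usr/local/bin/zfs_snapshot.py tank/group-a
>         5  *  *  *  *  root  /usr/local/bin/zfs_snapshot.py tank/group-b
>         10 *  *  *  *  root  /usr/local/bin/zfs_snapshot.py tank/group-c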
>
> It's looking to be the memory demand of snapshotting lots of ZFS datasets at
> the same time while accepting a lot of write traffic.
>
> Now perhaps the answer is 'don't do that', but I feel that FreeBSD should be
> robust enough to handle this. I don't mind tuning for now to
> reduce/eliminate this, but others shouldn't run into this pain just because
> they heavily load their machines - there must be a way of avoiding this
> condition.
>
> Here are the contents of my /boot/loader.conf and sysctl.conf, to show my
> minimal tuning to make this problem a little more bearable:
>
> /boot/loader.conf
> vfs.zfs.arc_meta_limit=49656727553
> vfs.zfs.arc_max=91489280512
>
> /etc/sysctl.conf
> vm.v_free_min=1254507
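>
> (For reference, something like the following shows how close the ARC sits
> to its cap, and how close the free-page count sits to the floor; sysctl
> names as they appear on 10.x:)
>
>         sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c_max
>         sysctl vm.stats.vm.v_free_count vm.v_free_min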
>
>
> Any suggestions/help is appreciated.
>
> Thank you.
>


