Message-ID: <4AD2E118.2050202@fsn.hu>
Date: Mon, 12 Oct 2009 09:56:08 +0200
From: Attila Nagy
To: Pawel Jakub Dawidek
Cc: freebsd-fs@FreeBSD.org
Subject: Re: ARC size constantly shrinks, then ZFS slows down extremely
In-Reply-To: <20091008160718.GB2134@garage.freebsd.pl>
References: <4AC1E540.9070001@fsn.hu> <4AC5B2C7.2000200@fsn.hu> <20091002184526.GA1660@garage.freebsd.pl> <4ACDA5EA.2010600@fsn.hu> <4ACDDED0.2070707@fsn.hu> <20091008160718.GB2134@garage.freebsd.pl>

Pawel Jakub Dawidek wrote:
> On Thu, Oct 08, 2009 at 02:45:04PM +0200, Attila Nagy wrote:
>
>> Attila Nagy wrote:
>>
>>> Hello,
>>>
>>> Pawel Jakub Dawidek wrote:
>>>
>>>> On Fri, Oct 02, 2009 at 09:59:03AM +0200, Attila Nagy wrote:
>>>>
>>>>> Backing out this change from the 8-STABLE kernel:
>>>>> http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902
>>>>>
>>>>> makes it survive about half and hour of IMAP searching. Of course
>>>>> only time will tell whether this helps in the long run, but so far
>>>>> 10/10 tries succeeded to kill the machine with this method...
>>>>>
>>>> Could you try this patch:
>>>>
>>>> http://people.freebsd.org/~pjd/patches/arc.c.4.patch
>>>>
>>> It seems (after running for two days) that this fixes my problem. And
>>> I see that Kip has came out with a similar version (which I couldn't
>>> yet test, but hope that will also do).
>>>
>> It seems that I was a little bit quick regarding this.
>> The machine just stopped with this:
>> last pid: 32358;  load averages:  0.01,  0.04,  0.12    up 2+06:33:56  14:36:25
>> 114 processes: 1 running, 112 sleeping, 1 zombie
>> CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
>> Mem: 536M Active, 63M Inact, 393M Wired, 8K Cache, 111M Buf
>> Swap: 4096M Total, 15M Used, 4081M Free
>>
>>   PID USERNAME THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
>> 24025 root       1  44    0  3932K   992K vmwait 0   6:06  0.00% zpool
>> 84190 root       1  44    0  4700K  1592K CPU1   1   4:17  0.00% top
>> 99029 root       1  44    0  4132K  1212K nanslp 1   3:53  0.00% gstat
>> 26317 root       1  44    0  1528K   352K piperd 1   3:38  0.00% readproctitl
>> 49143 125        4  45    0 12248K  3788K sigwai 0   2:50  0.00% milter-greyl
>> 39969 root       1  44    0  1536K   516K vmwait 0   2:50  0.00% supervise
>> 40241 root       1  44    0  1536K   516K vmwait 0   2:47  0.00% supervise
>> 44633 root       1  44    0  1536K   512K vmwait 0   2:43  0.00% supervise
>> 43434 root       1  44    0  1536K   516K vmwait 0   2:43  0.00% supervise
>> 50575 root       1  44    0  1536K   516K vmwait 0   2:42  0.00% supervise
>> 45510 root       1  44    0  1536K   512K vmwait 0   2:42  0.00% supervise
>> 58146 60         1  44    0   264M  8828K pfault 0   2:32  0.00% imapd
>> 47526 389        6  44    0 92688K  2296K ucond  1   1:29  0.00% slapd
>>  5417 root       1  44    0  9396K  1680K pfault 1   1:26  0.00% sshd
>> 13147 root       1  44    0  3340K   860K vmwait 1   0:45  0.00% syslogd
>> 92597 root       1  44    0  9396K  1676K pfault 1   0:39  0.00% sshd
>> 26437 125        1  44    0  6924K  1700K vmwait 0   0:33  0.00% qmgr
>>
>> The above top was refreshing, but every other stuff on different ssh
>> consoles (like a running zpool iostat and gstat) was frozen.
>> Even top stopped when I have resized the window.
>>
>
> Please try Kip's patch that was committed, it changes priorities a bit,
> which should help.
>
My i386 machine is still alive after two days of uptime. With your patch
it also lived for about two days, so I can't say, at least for now, that
it's OK.

The amd64 machine has started to lose ARC memory again. See these graphs:
http://people.fsn.hu/~bra/freebsd/20091012-zfs-arcsize/zfs_mem-week.png
http://people.fsn.hu/~bra/freebsd/20091012-zfs-arcsize/memory-week.png

Your patch was active between October 7 and 9. You can see that the ARC
size stayed roughly constant during that period. On October 9 I installed
Kip's modification, and the ARC size started to decrease.

BTW, previously (before October 7) I had set the ARC minimum size to
10-15GB (I can't remember the exact value), but now the machine runs with
the defaults (only the maximum size is set):
vfs.zfs.arc_min: 3623878656
vfs.zfs.arc_max: 28991029248

As you can see, there is plenty of memory. This machine uses UFS as well
(and writes to it heavily). Maybe that is what affects the ARC size, by
caching a lot of data?
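
For readers who want the two limits above in more familiar units, here is a minimal sketch that parses sysctl-style "name: value" lines and prints them in GiB. The helper name parse_sysctl and the embedded sample text are assumptions made for illustration; on a live system the text would come from the output of sysctl.

```python
# Hypothetical helper (not part of any FreeBSD tool): parse "name: value"
# lines as printed by sysctl and convert the byte counts to GiB.
GIB = 1024 ** 3


def parse_sysctl(text):
    """Return {name: int_value} for lines shaped like 'name: 12345'."""
    values = {}
    for line in text.splitlines():
        name, sep, raw = line.partition(":")
        raw = raw.strip()
        if sep and raw.isdigit():
            values[name.strip()] = int(raw)
    return values


# Sample input copied from the message above.
sample = """vfs.zfs.arc_min: 3623878656
vfs.zfs.arc_max: 28991029248"""

limits = parse_sysctl(sample)
for name, size in limits.items():
    print(f"{name} = {size / GIB:.3f} GiB")
```

With the values quoted above, arc_max works out to 27 GiB and arc_min to 3.375 GiB, i.e. exactly one eighth of arc_max.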