From owner-freebsd-fs@FreeBSD.ORG  Mon Feb 25 17:00:11 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Content-Type: text/plain; charset=windows-1252
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
Subject: Re: Improving ZFS performance for large directories
From: Kevin Day <toasty@dragondata.com>
In-Reply-To: <5124AC69.6010709@FreeBSD.org>
Date: Mon, 25 Feb 2013 11:00:10 -0600
Content-Transfer-Encoding: quoted-printable
Message-Id: <237DCD81-5CAB-466B-8BF4-543D195FA545@dragondata.com>
References: <19DB8F4A-6788-44F6-9A2C-E01DEA01BED9@dragondata.com>
 <CAJjvXiE+8OMu_yvdRAsWugH7W=fhFW7bicOLLyjEn8YrgvCwiw@mail.gmail.com>
 <F4420A8C-FB92-4771-B261-6C47A736CF7F@dragondata.com>
 <20130201192416.GA76461@server.rulingia.com>
 <19E0C908-79F1-43F8-899C-6B60F998D4A5@dragondata.com>
 <5124AC69.6010709@FreeBSD.org>
To: Andriy Gapon <avg@FreeBSD.org>
X-Mailer: Apple Mail (2.1499)
Cc: FreeBSD Filesystems <freebsd-fs@FreeBSD.org>


On Feb 20, 2013, at 4:58 AM, Andriy Gapon <avg@FreeBSD.org> wrote:

> on 19/02/2013 22:10 Kevin Day said the following:
>> Timing doing an "ls" in large directories 20 times, the first is the slowest,
>> then all subsequent listings are roughly the same. There doesn't appear to be any
>> gain after 20 repetitions
>
> I think that the above could be related to the below
>
>> 	vfs.zfs.arc_meta_limit                  16398159872
>> 	vfs.zfs.arc_meta_used                   16398120264
>


Doing some more testing...

After a fresh reboot, without the SSD cache, an ls(1) in a large
directory is pretty fast. After the system has been running for an hour
or so, it gets progressively slower. I can kill all other activity on
the system and it's still slow. I reboot, and it's back to normal.
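
For anyone who wants to reproduce the timing, here is a minimal sketch in
Python. It is only an approximation of the test: the function name
time_listings is made up for illustration, and os.scandir() just walks the
directory entries, so it does not do the sorting or any per-file stat(2)
calls that a particular ls(1) invocation might.

```python
import os
import time


def time_listings(path, repetitions=20):
    """Return the wall-clock seconds taken by each full listing of path.

    Hypothetical helper: it iterates every directory entry once per
    repetition, which approximates a plain ls(1) without sorting.
    """
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        with os.scandir(path) as entries:
            for _entry in entries:  # touch every entry in the directory
                pass
        timings.append(time.perf_counter() - start)
    return timings
```

Running time_listings("/some/large/directory") right after a reboot and
again an hour later would give two lists of 20 timings to compare; the
path here is a placeholder.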

On an idle system, I watched gstat(8) during the ls(1): the drives sit
at basically 100% busy while it runs, reading far more data than I'd
think necessary to read a directory. top(1) shows the "zfskern" kernel
process burning a lot of CPU during that time too. Is it possible we're
hitting a bug or sub-optimal access pattern once arc_meta_limit is
reached? For example, if something that was just read doesn't get kept
in the ARC metadata cache, it might have to re-read the same data many
times just to iterate through the directory.
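
To watch how close the metadata cache is to its limit over time, something
like the following Python sketch could be used. It assumes the two sysctl
names quoted earlier in this thread (vfs.zfs.arc_meta_used and
vfs.zfs.arc_meta_limit) exist on the running system; other ZFS versions
may name them differently. The helper names parse_sysctl, arc_meta_fill
and read_arc_meta are made up for illustration.

```python
import subprocess


def parse_sysctl(text):
    """Parse 'name: value' or 'name value' lines into a dict of ints."""
    values = {}
    for line in text.splitlines():
        parts = line.replace(":", " ").split()
        if len(parts) == 2 and parts[1].isdigit():
            values[parts[0]] = int(parts[1])
    return values


def arc_meta_fill(values):
    """Return arc_meta_used as a fraction of arc_meta_limit."""
    return values["vfs.zfs.arc_meta_used"] / values["vfs.zfs.arc_meta_limit"]


def read_arc_meta():
    """Query the live values with sysctl(8); assumes these OIDs exist."""
    out = subprocess.run(
        ["sysctl", "vfs.zfs.arc_meta_used", "vfs.zfs.arc_meta_limit"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sysctl(out)


# The figures quoted above, parsed offline rather than read from a live system.
quoted = """
vfs.zfs.arc_meta_limit                  16398159872
vfs.zfs.arc_meta_used                   16398120264
"""
sample = parse_sysctl(quoted)
headroom = sample["vfs.zfs.arc_meta_limit"] - sample["vfs.zfs.arc_meta_used"]
```

With the quoted figures the headroom comes out to 39608 bytes, so the
metadata cache is effectively full. Logging arc_meta_fill(read_arc_meta())
next to the ls(1) timings would show whether the slowdown starts when the
limit is reached.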

I've been hesitant to increase the ARC size because we've only got 64GB
of memory here and I can't add any more. The processes running on the
system need a fair chunk of RAM themselves, so I'm trying to figure out
how we can either upgrade this motherboard to something newer or reduce
our memory usage. I've got a feeling I'm going to need to do this, but
since this is a non-commercial project it's kinda hard to spend that
much money on it. :)

-- Kevin