From owner-freebsd-hackers@FreeBSD.ORG  Sat Oct 25 07:48:17 2008
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 610C6106566C
	for <freebsd-hackers@freebsd.org>; Sat, 25 Oct 2008 07:48:17 +0000 (UTC)
	(envelope-from danny@cs.huji.ac.il)
Received: from cs1.cs.huji.ac.il (cs1.cs.huji.ac.il [132.65.16.10])
	by mx1.freebsd.org (Postfix) with ESMTP id 16ED18FC1F
	for <freebsd-hackers@freebsd.org>; Sat, 25 Oct 2008 07:48:17 +0000 (UTC)
	(envelope-from danny@cs.huji.ac.il)
Received: from sunfire.cs.huji.ac.il ([132.65.16.80])
	by cs1.cs.huji.ac.il with esmtp
	id 1Ktdsd-000B7U-5M; Sat, 25 Oct 2008 09:48:15 +0200
X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.2
To: Dan Nelson <dnelson@allantgroup.com>
In-reply-to: <20081024150916.GB41283@dan.emsphone.com> 
References: <E1KtIbt-000PhA-HW@cs1.cs.huji.ac.il>
	<20081024150916.GB41283@dan.emsphone.com>
Comments: In-reply-to Dan Nelson <dnelson@allantgroup.com>
	message dated "Fri, 24 Oct 2008 10:09:16 -0500."
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sat, 25 Oct 2008 09:48:15 +0200
From: Danny Braniss <danny@cs.huji.ac.il>
Message-ID: <E1Ktdsd-000B7U-5M@cs1.cs.huji.ac.il>
Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: zfs & waiting on zio->io_cv 
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 25 Oct 2008 07:48:17 -0000

> In the last episode (Oct 24), Danny Braniss said:
> > there is a big delay (probably more than 1 sec.) when doing simple tasks
> > on this zfs, like ls(1), or 'zfs list', long enough to hit ^T
> > and get the same [zio->io_cv)], any hints?
> > 
> > store-01# zfs list
> > (hitting ^T)load: 0.00  cmd: zfs 88376 [zio->io_cv)] 0.00u 0.00s 0% 1672k
> > (hitting ^T)load: 0.00  cmd: zfs 88376 [zio->io_cv)] 0.00u 0.00s 0% 1684k
> > NAME              USED  AVAIL  REFER  MOUNTPOINT
> > h                 472G  11.2T    23K  /h
> > h/home            466G  11.2T   466G  /h/home
> > h/home@23-10-08    54K      -   466G  -
> > h/root             18K  11.2T    18K  /h/root
> > h/src              18K  11.2T    18K  /h/src
> > h/system         5.64G  11.2T  5.64G  /h/system
> 
> That's sort of the equivalent to waiting in "biord" on a UFS
> filesystem, I think.  ZFS is just waiting for the disk to return a
> block.  If you happen to do something during the window where ZFS is
> committing its transaction group, it has to wait until the sync
> finishes.  If some other process is doing a lot of writes, or you only
> have one disk in your zpool, or your pool is close to full, it may take
> a couple seconds to sync.
> 
> There are a couple of things you can try to improve interactive
> performance.  Raising zfs's arc_max is the easiest to do, and will let
> ZFS cache more stuff, increasing the likelihood that an "ls" will be
> able to read from cache instead of having to go to disk.  Setting it at
> 1/4 your physical RAM is probably as high as you can go without causing
> panics.
> 
> Raising txg_time ( in /sys/cddl/.../zfs/txg.c ) from 5 to
> say 30 will tell zfs to sync less often, which can be a win if you
> don't actually do that much writing.  With a single spindle, it may
> take a substantial fraction of a second just to sync a tiny txg due to
> the number of copies of metadata ZFS writes for redundancy.
> 
> If you do a lot of writing, lowering zfs_vdev_max_pending ( in
> /sys/cddl/.../zfs/vdev_queue.c ) from 35 down to 16 or less will reduce
> the number of simultaneous I/Os ZFS will try to send to each disk,
> which will let your reads compete a little better with other I/O.  On
> ATA or SATA disks, you might want to set it to 2.
> 
ok, I forgot to mention a small detail: the machine is a quad core with 8GB
of main memory, and the disks are 14x1TB connected via a PERC/RAID5.
tests show that disk access is quite fast, over 200MB/s.
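
For reference, the "1/4 of physical RAM" suggestion quoted above works out
as follows for 8GB. This is only a rough sketch: the helper name is made up
for illustration, and it assumes the limit is set through the
vfs.zfs.arc_max loader tunable in /boot/loader.conf.

```python
# Rough sketch: size vfs.zfs.arc_max at 1/4 of physical RAM.
# The 8 GiB figure is this machine's RAM; arc_max_loader_line is a
# hypothetical helper, not part of any FreeBSD tool.

GIB = 1024 ** 3


def arc_max_loader_line(ram_bytes, fraction=0.25):
    """Return a /boot/loader.conf line capping the ARC at `fraction` of RAM."""
    arc_max = int(ram_bytes * fraction)
    return 'vfs.zfs.arc_max="%d"' % arc_max


print(arc_max_loader_line(8 * GIB))
```

The printed line would go into /boot/loader.conf and take effect after a
reboot; whether 2GB is actually safe on this box is something to test, not
a given.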

the 'delays' are seen when the machine is totally idle (it's not in
production yet) and has been up for some time. btw, I can't reproduce the
'delay', so I think it has to do with caching.

I guess this beast needs some tuning; are there any tools out there
to monitor/tune ZFS?
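
One thing that could be watched while testing the caching theory is the ARC
hit ratio. The sketch below assumes the ARC counters are exported under
kstat.zfs.misc.arcstats (hits and misses) on this FreeBSD version and that
they are read with sysctl(8); the sample text is invented, not taken from
this machine, and the function names are hypothetical.

```python
# Sketch: compute the ARC hit ratio from `sysctl kstat.zfs.misc.arcstats`
# style output. SAMPLE below is made-up data for illustration only.


def parse_arcstats(text):
    """Parse 'name: value' sysctl lines into a dict keyed by the last name component."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line
        value = value.strip()
        if value.isdigit():
            stats[name.strip().rsplit(".", 1)[-1]] = int(value)
    return stats


def hit_ratio(stats):
    """Fraction of ARC lookups served from cache (0.0 if no lookups yet)."""
    total = stats["hits"] + stats["misses"]
    return stats["hits"] / total if total else 0.0


SAMPLE = """\
kstat.zfs.misc.arcstats.hits: 900
kstat.zfs.misc.arcstats.misses: 100
"""

print("ARC hit ratio: %.1f%%" % (100 * hit_ratio(parse_arcstats(SAMPLE))))
```

A low ratio right after the machine has sat idle, followed by a high one on
the second 'zfs list', would be consistent with the delay being a cold-cache
read.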

thanks,
	danny


> -- 
> 	Dan Nelson
> 	dnelson@allantgroup.com