From owner-freebsd-fs@FreeBSD.ORG Tue Jun 28 01:08:31 2011
Date: Mon, 27 Jun 2011 18:08:22 -0700
From: Jeremy Chadwick <jdc@koitsu.dyndns.org>
To: George Sanders
Cc: freebsd-fs@freebsd.org
Subject: Re: Improving old-fashioned UFS2 performance with lots of inodes...
Message-ID: <20110628010822.GA41399@icarus.home.lan>
In-Reply-To: <1309217450.43651.YahooMailRC@web120014.mail.ne1.yahoo.com>

On Mon, Jun 27, 2011 at 04:30:50PM -0700, George Sanders wrote:
> I have a very old-fashioned file server running a 12-disk raid6 array
> on a 3ware 9650SE.  2TB disks, so the size comes out to 18TB.
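As a sanity check on that 18TB figure: RAID6 spends two of the twelve disks on parity, and vendors sell decimal terabytes while the OS reports binary TiB. A quick sketch of the arithmetic (the per-disk byte count is an assumption for a nominal "2TB" drive):

```shell
# Rough usable-capacity math for a 12-disk RAID6 of "2TB" drives.
data_disks=$((12 - 2))                  # RAID6 uses two disks' worth of parity
bytes_per_disk=2000000000000            # a "2TB" drive is 2 * 10^12 bytes
total_bytes=$((data_disks * bytes_per_disk))
tib=$((total_bytes / 1099511627776))    # convert to binary TiB (2^40 bytes)
echo "${tib} TiB usable"                # comes out to 18, matching the "18TB" above
```

The "missing" 2TB between 10 x 2TB and 18TB is purely the decimal-vs-binary unit difference, not overhead from the controller.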
> 
> I newfs the raw device with:
> 
> newfs -i 65535 /dev/xxx
> 
> and I would consider jumping to 131072 ... that way my fsck should not
> take any longer than it would with a smaller disk, since there are not
> any more total inodes.
> 
> BUT ...
> 
> with over 100 million inodes on the filesystem, things go slow.  Overall
> throughput is fine, and I have no complaints there, but doing any kind
> of operations with the files is quite slow.  Building a file list with
> rsync, or doing a cp, or an ln -s of a big dir tree, etc.
> 
> Let's assume that the architecture is not changing ... it's going to be
> FreeBSD 8.x, using UFS2, and raid6 on actual spinning (7200rpm) disks.
> 
> What can I do to speed things up ?
> 
> Right now I have these in my loader.conf:
> 
> kern.maxdsiz="4096000000"        # for fsck
> vm.kmem_size="1610612736"        # for big rsyncs
> vm.kmem_size_max="1610612736"    # for big rsyncs

On what exact OS version?  Please don't say "8.2" -- I need to know
whether it's 8.2-RELEASE, 8.2-STABLE, or something else.  You said "8.x"
above, which is too vague.

If it's 8.2-STABLE, you should not be tuning vm.kmem_size_max at all,
and you probably don't need to tune vm.kmem_size either.  I also do not
understand how vm.kmem_size would affect rsync, since rsync is a
userland application.  I imagine you'd want to adjust kern.maxdsiz and
kern.dfldsiz (default dsiz) instead.

> and I also set:
> 
> vfs.ufs.dirhash_maxmem=64000000

This tunable provides memory for hashing a single directory that holds
a huge number of files; AFAIK it does not apply to "large directory
structures" (as in directories within directories within directories).
It's obvious you're just tinkering with random sysctls hoping to gain
performance without really understanding what they do.  :-)

To see if you even need to increase it, try "sysctl -a | grep
vfs.ufs.dirhash" and compare dirhash_mem against dirhash_maxmem, as
well as dirhash_lowmemcount.

> but that's it.
> 
> What bugs me is, the drives have 64M cache, and the 3ware controller
> has 224 MB (or so), but the system itself has 64 GB of RAM ... is
> there no way to use the RAM to increase performance ?  I don't see a
> way to actually throw hardware resources at UFS2, other than faster
> disks, which are uneconomical for this application ...
> 
> Yes, 3ware write cache is turned on, and storsave is set to "balanced".
> 
> Is there anything that can be done ?

The only thing I can think of on short notice is to have multiple
filesystems (volumes) instead of one large 18TB one.  This is pretty
common in the commercial filer world.

Regarding system RAM and UFS2: I have no idea; Kirk might have to
comment on that.  You could "make use" of system RAM for cache (the ZFS
ARC) if you were using ZFS instead of native UFS2.

However, if the system has 64GB of RAM, you need to ask yourself why it
has that amount of RAM in the first place.  For example, if the machine
runs mysqld and is tuned to use a large amount of memory, you don't
really "have" 64GB of RAM to play with, and you wouldn't want mysqld
and some filesystem caching model fighting over memory (e.g.
paging/swapping).

Overall, my opinion is that you're making absolutely humongous
filesystems and expecting performance, fsck times, etc. to be just like
they would be for a 16MB filesystem.  That isn't the case at all.  ZFS
may be more what you're looking for, especially since you want to use
system memory as a large filesystem/content cache.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, US  |
| Making life hard for others since 1977.              PGP 4BD6C0CB  |
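If the ZFS route suggested above is taken, the memory-contention concern (mysqld vs. filesystem cache fighting over the 64GB) is usually handled by capping the ARC with a loader tunable. A hypothetical loader.conf fragment for a FreeBSD 8.x box -- the 16G cap is an illustrative value, not a recommendation, and would need sizing against the actual mysqld footprint:

```
# /boot/loader.conf -- hypothetical values, size to your workload
vfs.zfs.arc_max="16G"    # cap the ZFS ARC so mysqld keeps its share of the 64GB
```

With the cap in place, ZFS still uses RAM as the large filesystem/content cache George is asking for, but within an explicit budget instead of competing with applications for it.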