Date: Mon, 4 Mar 2013 21:32:49 -0800
From: Jeremy Chadwick
To: Ben Morrow
Subject: Re: ZFS "stalls" -- and maybe we should be talking about defaults?
Message-ID: <20130305053249.GA38107@icarus.home.lan>
Cc: freebsd-stable@freebsd.org

On Tue, Mar 05, 2013 at 05:05:47AM +0000, Ben Morrow wrote:
> Quoth Karl Denninger:
> >
> > Note that the machine is not booting from ZFS -- it is booting from and
> > has its swap on a UFS 2-drive mirror (handled by the disk adapter; looks
> > like a single "da0" drive to the OS) and that drive stalls as well when
> > it freezes. It's definitely a kernel thing when it happens as the OS
> > would otherwise not have locked (just I/O to the user partitions) -- but
> > it does.
>
> Is it still the case that mixing UFS and ZFS can cause problems, or were
> they all fixed? I remember a while ago (before the arc usage monitoring
> code was added) there were a number of reports of serious problems
> running an rsync from UFS to ZFS.

This problem still exists on stable/9. The behaviour manifests itself
as fairly bad performance (I cannot remember if stalling or if just
throughput rates were awful). I can only speculate as to what the root
cause is, but my guess is that it has something to do with the two
caching systems (UFS vs. ZFS ARC) fighting over large sums of memory.

The advice I've given people in the past is: if you do a LOT of I/O
between UFS and ZFS on the same box, it's time to move to 100% ZFS.
That said, I still do not recommend ZFS for a root filesystem (this
biting people still happens even today), and swap-on-ZFS is a huge
no-no.

I will note that I myself use pure UFS+SU (not SUJ) for my main OS
installation (that means /, swap, /var, /tmp, and /usr) on a dedicated
SSD, while everything else is ZFS raidz1 (no dedup, no compression;
won't ever enable these until that thread priority problem is fixed on
FreeBSD).

However, when I was migrating from gmirror+UFS+SU to ZFS, I witnessed
what I described in my 1st and 2nd paragraphs. What userland utilities
were used (rsync vs. cp) made no difference; the problem is in the
kernel.

Footnote about this thread:

This thread contains all sorts of random pieces of information about
systems, with very little actual detail in them (barring the symptoms,
which are always useful to know!).

For example, just because your machine has 8 cores and 12GB of RAM
doesn't mean jack squat if some software in the kernel is designed
"oddly". Reworded: throwing more hardware at a problem solves nothing.

The most useful thing (for me) that I found was deep within the thread,
a few words along the lines of "De-dup isn't used". What about
compression, and if it's *ever* been enabled on the filesystem (even if
not presently enabled)? It matters. All this matters.

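For anyone wanting to include that detail in a report, here is a
minimal sketch (not a definitive tool) that collects the compression,
compressratio and dedup properties for every dataset in a pool via
"zfs get" and flags the datasets worth mentioning. The function names,
the pool name "tank" and the sample data are all made up for
illustration. The idea that a compressratio above 1.00x hints at
compression having been enabled at some point is an assumption: it
only holds while compressed blocks are still on disk.

```python
import subprocess

# Properties worth reporting alongside the symptoms.
PROPS = "compression,compressratio,dedup"


def read_props(pool):
    """Return raw `zfs get` output for every dataset in the pool (needs ZFS)."""
    cmd = ["zfs", "get", "-H", "-r", "-o", "name,property,value", PROPS, pool]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def summarize(raw):
    """Parse tab-separated name/property/value lines into {dataset: {prop: value}}."""
    datasets = {}
    for line in raw.splitlines():
        if not line.strip():
            continue
        name, prop, value = line.split("\t")
        datasets.setdefault(name, {})[prop] = value
    return datasets


def flag(datasets):
    """List datasets with compression or dedup on, or a compressratio above 1.00x."""
    flagged = []
    for name, props in sorted(datasets.items()):
        ratio = float(props.get("compressratio", "1.00x").rstrip("x"))
        if (props.get("compression", "off") != "off"
                or props.get("dedup", "off") != "off"
                or ratio > 1.0):
            flagged.append(name)
    return flagged


# Hypothetical sample output, invented for illustration (not from a real system).
SAMPLE = (
    "tank\tcompression\toff\n"
    "tank\tcompressratio\t1.00x\n"
    "tank\tdedup\toff\n"
    "tank/home\tcompression\toff\n"
    "tank/home\tcompressratio\t1.37x\n"
    "tank/home\tdedup\toff\n"
    "tank/backup\tcompression\tlzjb\n"
    "tank/backup\tcompressratio\t1.00x\n"
    "tank/backup\tdedup\toff\n"
)

if __name__ == "__main__":
    # On a real machine, replace SAMPLE with read_props("tank") or your pool name.
    print(flag(summarize(SAMPLE)))
```

On a live system, pass the output of read_props() to summarize()
instead of SAMPLE, and paste the flagged datasets (with their property
values) into your report.
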
I see lots of end-users talking about these problems, but (barring
Steven) literally no "kernel people" who are "in the know" about ZFS
mentioning how said users can get them (devs) info that can help track
this down. Those devs live on freebsd-fs@ and freebsd-hackers@, and not
too many read freebsd-stable@.

Step back for a moment and look at this anti-KISS configuration:

- Hardware RAID controller involved (Areca 1680ix)
- Hardware RAID controller has its own battery-backed cache (2GB)
- Therefore arcmsr(4) is involved -- revision of driver/OS build
  matters here, ditto with firmware version
- 4 disks are involved, models unknown
- Disks are GPT and are *partitioned*, and ZFS refers to the partitions
  not the raw disk -- this matters (honest, it really does; the ZFS
  code handles things differently with raw disks)
- Providers are GELI-encrypted

Now ask yourself if any dev is really going to tackle this one given
the above mess.

My advice would be to get rid of the hardware RAID (go with Intel
ICHxx or ESBx on-board with AHCI), use raw disks for ZFS (if 4096-byte
sector disks use the gnop(8) method, which is a one-time thing), and
get rid of GELI. If you can reproduce the problem there 100% of the
time, awesome, it's a clean/clear setup for someone to help
investigate.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |