Date: Wed, 20 Mar 2013 21:45:57 -0700
From: Jeremy Chadwick
To: quartz@sneakertech.com
Cc: freebsd-questions@freebsd.org
Subject: Re: ZFS question
Message-ID: <20130321044557.GA15977@icarus.home.lan>

(Please keep me CC'd as I'm not subscribed to -questions)

Lots to say about this.

1. freebsd-fs is the proper list for filesystem-oriented questions of
this sort, especially for ZFS.

2. The issue you've described is experienced by some, and **not**
experienced by just as many (if not more), so please keep that in mind.
Each person's situation/environment/issue has to be treated as unique.

3. You haven't provided any useful details, even in your follow-up post
here:

http://lists.freebsd.org/pipermail/freebsd-questions/2013-March/249958.html

All you've provided is a "general overview" with no technical details
and no actual data. You need to provide that data verbatim. You need to
provide:

- Contents of /boot/loader.conf
- Contents of /etc/sysctl.conf
- Output from "zpool status"
- Output from "zpool get all"
- Output from "zfs get all"
- Output from "dmesg" (probably the most important)
- Output from "sysctl vfs.zfs kstat.zfs"

I particularly tend to assist with disk-level problems, so if this turns
out to be a disk-level issue (and NOT a controller or controller driver
issue), I can help quite a bit with that.

4. I would **not** suggest rolling back to 9.0. That recommendation
solves nothing -- if there is truly a bug/livelock issue, then it needs
to be tracked down. By rolling back, if there is an issue, you're
effectively ensuring it'll never get investigated or fixed, which means
you can probably expect to see this in 9.2, 9.3, or even 10.x onward.

If you can't deal with the instability, or don't have the
time/cycles/interest to help track it down, that's perfectly okay too:
my recommendation is to go back to UFS (there's no shame in that).

Else, as always, I strongly recommend running stable/9 (keep reading).

5. stable/9 (a.k.a.
FreeBSD 9.1-STABLE) recently (~5 days ago) MFC'd an Illumos ZFS feature
solely to help debug/troubleshoot this exact type of situation: the ZFS
deadman thread. Reference materials for what that is:

http://svnweb.freebsd.org/base?view=revision&revision=248369
http://svnweb.freebsd.org/base?view=revision&revision=247265
https://www.illumos.org/issues/3246

The purpose of this feature (enabled by default) is to induce a kernel
panic when ZFS I/O stalls/hangs for unexpectedly long periods of time
(configurable via vfs.zfs.deadman_synctime).

Once the panic happens (assuming your system is configured with a slice
dedicated to swap (ZFS-backed swap = bad bad bad) and use of
dumpdev="auto" in rc.conf), upon reboot the system should extract the
crash dump from swap and save it into /var/crash. At that point kernel
developers on the -fs list can help tell you *exactly* what to do with
kgdb(1) that can shed some light on what happened/where the issue may
lie.

All that's assuming that the issue truly is ZFS waiting for I/O and not
something else (like ZFS internally spinning hard in its own code).

Good luck, and let us know how you want to proceed.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |
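
The items requested in point 3 can be gathered in one pass. What follows
is only a minimal sketch under stated assumptions: the report file name
(zfs-report.txt), the ZFS_REPORT variable, and the "===" section headers
are illustrative choices, not anything the list requires. The files and
commands themselves are exactly the ones listed in point 3, and each is
recorded verbatim, including any error output.

```shell
#!/bin/sh
# Sketch: collect the diagnostics from point 3 into a single text file.
# ZFS_REPORT and the default file name are assumptions for illustration.
out="${ZFS_REPORT:-zfs-report.txt}"

# Print a file verbatim under a header, or note that it is absent.
show_file() {
    echo "=== $1 ==="
    if [ -r "$1" ]; then
        cat "$1"
    else
        echo "(not present or not readable)"
    fi
    echo
}

# Run a command under a header; stderr is kept so errors are visible too.
show_cmd() {
    echo "=== $* ==="
    "$@" 2>&1 || echo "(command failed or is unavailable)"
    echo
}

{
    show_file /boot/loader.conf
    show_file /etc/sysctl.conf
    show_cmd zpool status
    show_cmd zpool get all
    show_cmd zfs get all
    show_cmd dmesg
    show_cmd sysctl vfs.zfs kstat.zfs
} > "$out"

echo "Wrote $out"
```

Run it as root on the affected machine and attach or paste the resulting
file unedited, so that the data reaches the list verbatim.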