From owner-freebsd-fs@FreeBSD.ORG Mon Jun  7 12:19:56 2010
Date: Mon, 7 Jun 2010 05:19:54 -0700
From: Jeremy Chadwick
To: Martin Simmons
Cc: freebsd-fs@freebsd.org
Subject: Re: zfs i/o error, no driver error

On Mon, Jun 07, 2010 at 12:12:16PM +0100, Martin Simmons wrote:
> >>>>> On Mon, 7 Jun 2010 02:08:50 -0700, Jeremy Chadwick said:
> >
> > I'm still trying to figure out why people do this.
>
> Maybe because the ZFS Best Practices Guide suggests it?  ("Run zpool scrub on
> a regular basis to identify data integrity problems...")
>
> It makes sense to detect errors when there is still a healthy mirror, rather
> than waiting until two drives are failing :-)

The official quote from the ZFS Best Practices Guide[1] is:

"Run zpool scrub on a regular basis to identify data integrity problems.
If you have consumer-quality drives, consider a weekly scrubbing schedule.
If you have datacenter-quality drives, consider a monthly scrubbing
schedule."

The first sentence of that paragraph seems reasonable; the concept being:
do this process often so that you catch potential data-threatening errors
before your entire pool explodes.  Cool, I can accept that, but it gets us
into a discussion about how often this is actually necessary (keep reading
for more on that).

However, the second part of the paragraph -- total rubbish.
"Datacenter-quality drives?"  Oh, I think they mean "enterprise-grade
drives", which really don't offer much more than high-end consumer-grade
drives at this point in time[2].  One of the key points of ZFS's creation
was to provide a reliable filesystem using cheap disks[3][4].

The only thing I can find in the ZFS Administration Guide[5] is this:

"The simplest way to check your data integrity is to initiate an explicit
scrubbing of all data within the pool.  This operation traverses all the
data in the pool once and verifies that all blocks can be read.  Scrubbing
proceeds as fast as the devices allow, though the priority of any I/O
remains below that of normal operations.  This operation might negatively
impact performance, though the file system should remain usable and nearly
as responsive while the scrubbing occurs."

"Performing routine scrubbing also guarantees continuous I/O to all disks
on the system.  Routine scrubbing has the side effect of preventing power
management from placing idle disks in low-power mode.  If the system is
generally performing I/O all the time, or if power consumption is not a
concern, then this issue can safely be ignored."
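As an aside, the mechanics of the above are trivial.  A minimal sketch,
assuming a pool named "tank" (the pool name is only a placeholder):

  # Start a scrub; it runs in the background, with its I/O prioritised
  # below normal operations per the quote above.
  zpool scrub tank

  # Check progress and whether any read/write/checksum errors have
  # turned up so far.
  zpool status -v tank

A cron job running the first command weekly or monthly is all the
"regular basis" the guide is asking for.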
What's confusing about the guide's description is the claim that pool
verification is done by "verifying that all blocks can be read".  Doesn't
that happen anyway when a standard read operation comes down the pipe for
a file?  What I'm getting at is that there's no explanation (that I can
find) which states why scrubbing regularly "ensures" anything, other than
allowing a person to see an error sooner rather than later.

Which brings us to the topic of scrub interval...

This exact question was asked on the ZFS OpenSolaris list[6] in late 2008,
and nobody there provided any concrete evidence either.  The closest thing
to evidence is this:

"...in normal operation, ZFS only checks data as it's read back from the
disks.  If you don't periodically scrub, errors that happen over time
won't be caught until I next read that actual data, which might be
inconvenient if it's a long time since the initial data was written."

The topic of scrub intervals was also brought up a month later[7].
Someone said:

"We did a study on re-write scrubs which showed that once per year was a
good interval for modern, enterprise-class disks.  However, ZFS does a
read-only scrub, so you might want to scrub more often."

The first part conflicts with what the guide recommends (I'd also like to
see the results of that study!), while the last half of the paragraph
makes no sense ("because it reads, do it more often!").  So if you take
the first sentence and apply it to what the ZFS Best Practices Guide says,
you come out with... "scrub consumer-grade disks every 6 months".

In the same thread, we have this quote from a different person:

"Even that is probably more frequent than necessary.  I'm sure somebody
has done the MTTDL math.  IIRC, the big win is doing any scrubbing at all.
The difference between scrubbing every 2 weeks and every 2 months may be
negligible.  (IANAMathematician tho)"

So the justification seems, well, unjustified.  It's almost as if, because
the filesystem is new, there's an underlying sense of paranoia, so
everyone scrubs often.  I understand the "pre-emptive" argument, just not
the technical argument.

So how often do *I* scrub our pools?  Rarely.  I tend to look at SMART
stats much more aggressively ("uh oh, uncorrected sector, better
scrub..."), or I scrub if the system feels sluggish on I/O or if cron jobs
take far longer than they should.

> > It's important to remember that scrubs are *highly* intensive on both
> > the system itself as well as on all pool members.  Disk I/O activity is
> > very heavy during a scrub; it's not considered "normal use".
>
> Is it worse than a full backup?  I guess scrub does read all drives, but OTOH
> backup will typically read all data non-linearly, which adds a different kind
> of stress.

I'd guess it'd depend greatly on the type of backup.  I'd imagine that a
ZFS snapshot (non-incremental) + zfs send would be less intensive than a
scrub, and an incremental snapshot even less so.  I'd imagine
rsync/tar/cp/etc. would be somewhere in-between.  I don't use ZFS
snapshots myself because I don't know if they've stabilised on FreeBSD.
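For anyone who wants to experiment with that approach anyway, a rough
sketch of what I mean by snapshot + send (the dataset name "tank/data" and
the snapshot names are placeholders made up for illustration):

  # Full backup: snapshot the dataset, then serialise the snapshot to a
  # file (or pipe it over ssh to another box).
  zfs snapshot tank/data@full-20100607
  zfs send tank/data@full-20100607 > /backup/tank-data-full.zfs

  # Incremental backup: take a later snapshot and send only the delta
  # between the two snapshots.
  zfs snapshot tank/data@incr-20100614
  zfs send -i tank/data@full-20100607 tank/data@incr-20100614 \
      > /backup/tank-data-incr.zfs

The incremental send only has to read blocks that changed between the two
snapshots, which is why I'd expect it to be even less intensive than the
full send.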
[1]: http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Storage_Pools
[2]: http://lists.freebsd.org/pipermail/freebsd-fs/2010-May/008508.html
[3]: http://blogs.sun.com/bonwick/entry/zfs_end_to_end_data
[4]: http://www.sun.com/software/solaris/zfs_lc_preso.pdf
[5]: http://docs.sun.com/app/docs/doc/819-5461/gbbwa?l=en&a=view
[6]: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg20995.html
[7]: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg21728.html
[8]: http://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSPeriodicScrubbing

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |