From owner-freebsd-questions@FreeBSD.ORG Thu Aug 21 14:25:55 2014
Subject: Re: some ZFS questions
From: Paul Kraus
Date: Thu, 21 Aug 2014 10:17:36 -0400
To: Scott Bennett
Cc: freebsd-questions@freebsd.org
In-Reply-To: <201408211007.s7LA7YGd002430@sdf.org>

On Aug 21, 2014, at 6:07, Scott Bennett wrote:

> Paul Kraus wrote:
>> If I had two (or more) vdevs on each device (and I *have* done that
>> when I needed to), I would have issued the first zpool replace
>> command, waited for it to complete, and then issued the other. If I
>> had more than one drive fail, I would have handled the replacement of
>> BOTH drives on one zpool first and then moved on to the second. This
>> is NOT because I want to be nice and easy on my drives :-), it is
>> simply because I expect that running the two operations in parallel
>> will be slower than running them in series, mainly because large
>> seeks are slower than short seeks.
>
> My concern was over a slightly different possible case, namely, a
> hard failure of a component drive (e.g., it makes ugly noises, doesn't
> spin, and/or doesn't show up as a device recognized as such by the
> OS). In that case, either one has to physically connect a replacement
> device or a spare is already on-line. A spare would automatically be
> grabbed by a pool for reconstruction, so I wanted to know whether the
> situation under discussion would result in automatically initiated
> rebuilds of both pools at once.

If the hot spare devices are listed in each zpool, then yes, they would
come online as the zpools have a device fail. But ... it is possible to
have one zpool with a failed vdev and the other not; it depends on the
device's failure mode. Bad blocks can cause ZFS to mark a vdev bad and
in need of replacement, while a vdev also on that physical device may
not have any bad blocks. In the case of a complete failure, both zpools
would start resilvering *if* the hot spares were listed in both. The
way around this would be to have the spare device ready but NOT list it
in the lower priority zpool. After the first resilver completes,
manually do the zpool replace on the second.
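A rough sketch of that asymmetric-spare arrangement, assuming the two
pools sit on the p1 and p2 partitions of the same disks and that da8 is
the standby disk (all pool and device names here are hypothetical):

    # spare listed only in the higher-priority pool, so only this pool
    # resilvers automatically when a disk dies
    zpool add tank1 spare da8p1

    # watch the automatic resilver and wait for it to finish
    zpool status tank1

    # then manually rebuild the lower-priority pool onto the standby disk
    zpool replace tank2 da3p2 da8p2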
>> A zpool replace is not a simple copy from the failing device to the
>> new one, it is a rebuild of the data on the new device, so if the
>> device fails completely it just keeps rebuilding. The example in my
>> blog was of a drive that just went offline with no warning. I put the
>> new drive in the same physical slot (I did not have any open slots)
>> and issued the resilver command.
>
> Okay. However, now you bring up another possible pitfall. Are ZFS's
> drives address- or name-dependent? All of the drives I expect to use
> will be external drives. At least four of the six will be connected
> via USB 3.0. The other two may be connected via USB 3.0, Firewire 400,
> or eSATA. In any case, their device names in /dev will most likely
> change from one boot to another.

ZFS uses the header written to the device to identify it. Note that
this was not always the case, and *if* you have a zfs cache file you
*may* run into device renaming issues. I have not seen any, but I am
also particularly paranoid about exporting devices before moving them
around. I have seen too many stories of lost zpools due to this many
years ago on the ZFS list.

>> Tune vfs.zfs.arc_max in /boot/loader.conf
>
> That looks like a huge help. While initially loading a file system
> or zvol, would there be any advantage to setting primarycache to
> "metadata", as opposed to leaving it set to the default value of
> "all"?

I do not know, but I'm sure the folks on the ZFS list who know much
more than I do will have opinions :-) (There is a minimal sketch of
both settings at the end of this message.)

>> If I had less than 4GB of RAM I would limit the ARC to 1/2 RAM,
>> unless this were solely a fileserver, in which case I would watch how
>> much memory I needed outside ZFS and set the ARC to slightly less
>> than that. Take a look at the recommendations at
>> https://wiki.freebsd.org/ZFSTuningGuide for low RAM situations.
>
> Will do. Hmm...I see again the recommendation to increase KVA_PAGES
> from 260 to 512. I worry about that because the i386 kernel says at
> boot that it ignores all real memory above ~2.9 GB. A bit farther
> along, during the early messages preserved and available via dmesg(1),
> it says,
>
> real memory  = 4294967296 (4096 MB)
> avail memory = 3132100608 (2987 MB)

On the FreeBSD VM I am running with only 1 GB of memory I did not do
any tuning and ZFS seems to be working fine. It is a mail store, but
not a very large one (only about 130 GB of email). Performance is not a
consideration on this VM; it is archival.

--
Paul Kraus
paul@kraus-haus.org
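A minimal sketch of the ARC and primarycache settings mentioned above
(the ARC size and the dataset name are placeholders, not
recommendations; size the ARC for your own RAM and workload):

    # /boot/loader.conf -- cap the ARC at boot
    vfs.zfs.arc_max="512M"

    # per-dataset: keep only metadata in the ARC (the default is "all")
    zfs set primarycache=metadata tank/data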