Date:      Thu, 21 Aug 2014 10:17:36 -0400
From:      Paul Kraus <paul@kraus-haus.org>
To:        Scott Bennett <bennett@sdf.org>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: some ZFS questions
Message-ID:  <E32981BF-613C-49D9-B7F8-18B9A99C0141@kraus-haus.org>
In-Reply-To: <201408211007.s7LA7YGd002430@sdf.org>
References:  <201408070816.s778G9ug015988@sdf.org> <40AF5B49-80AF-4FE2-BA14-BFF86164EAA8@kraus-haus.org> <201408211007.s7LA7YGd002430@sdf.org>

On Aug 21, 2014, at 6:07, Scott Bennett <bennett@sdf.org> wrote:

> Paul Kraus <paul@kraus-haus.org> wrote:

>> If I had two (or more) vdevs on each device (and I *have* done that
>> when I needed to), I would have issued the first zpool replace
>> command, waited for it to complete, and then issued the other. If I
>> had more than one drive fail, I would have handled the replacement
>> of BOTH drives on one zpool first and then moved on to the second.
>> This is NOT because I want to be nice and easy on my drives :-), it
>> is simply because I expect that running the two operations in
>> parallel will be slower than running them in series, for the major
>> reason that large seeks are slower than short seeks.
>
>     My concern was over a slightly different possible case, namely, a
> hard failure of a component drive (e.g., makes ugly noises, doesn't
> spin, and/or doesn't show up as a device recognized as such by the
> OS).  In that case, either one has to physically connect a replacement
> device or a spare is already on-line.  A spare would automatically be
> grabbed by a pool for reconstruction, so I wanted to know whether the
> situation under discussion would result in automatically initiated
> rebuilds of both pools at once.

If the hot spare devices are listed in each zpool, then yes, they would
come online as the zpools have a device fail. But … it is possible to
have one zpool with a failed vdev and the other not. It depends on the
device's failure mode. Bad blocks can cause ZFS to mark a vdev bad and
in need of replacement, while a vdev also on that physical device may
not have bad blocks. In the case of a complete failure, both zpools
would start resilvering *if* the hot spares were listed in both. The
way around this would be to have the spare device ready but NOT list it
in the lower priority zpool. After the first resilver completes,
manually do the zpool replace on the second.
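
For illustration only (the pool names "tank1"/"tank2" and the daN
partition names below are made up, not from this thread), that sequence
would look roughly like:

    # list the spare partition only in the higher priority pool
    zpool add tank1 spare da6p1

    # when a drive dies, tank1 resilvers onto da6p1 on its own;
    # once that resilver finishes, replace the failed partition in the
    # lower priority pool by hand, using the other partition of the
    # spare drive
    zpool replace tank2 da2p2 da6p2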

>> A zpool replace is not a simple copy from the failing device to the
>> new one, it is a rebuild of the data on the new device, so if the
>> device fails completely it just keeps rebuilding. The example in my
>> blog was of a drive that just went offline with no warning. I put
>> the new drive in the same physical slot (I did not have any open
>> slots) and issued the resilver command.
>
>     Okay.  However, now you bring up another possible pitfall.  Are
> ZFS's drives address- or name-dependent?  All of the drives I expect
> to use will be external drives.  At least four of the six will be
> connected via USB 3.0.  The other two may be connected via USB 3.0,
> Firewire 400, or eSATA.  In any case, their device names in /dev will
> most likely change from one boot to another.

ZFS uses the header written to the device to identify it. Note that
this was not always the case, and *if* you have a ZFS cache file you
*may* run into device renaming issues. I have not seen any, but I am
also particularly paranoid about never moving devices around without
exporting them first. Years ago on the ZFS list I saw too many stories
of zpools lost that way.
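
The usual precaution looks like this (the pool name is hypothetical):

    # export the pool before powering down or re-cabling the drives
    zpool export tank

    # after reconnecting them (the /dev names may well have changed),
    # ZFS re-identifies the disks by their on-disk labels
    zpool import tank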

>> Tune vfs.zfs.arc_max in /boot/loader.conf
>
>     That looks like a huge help.  While initially loading a file
> system or zvol, would there be any advantage to setting primarycache
> to "metadata", as opposed to leaving it set to the default value of
> "all"?

I do not know, but I'm sure the folks on the ZFS list who know much
more than I do will have opinions :-)
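
If you want to experiment with it, the property is easy to flip per
dataset and to put back afterwards (the dataset name below is just an
example):

    # cache only metadata in the ARC while doing the initial bulk load
    zfs set primarycache=metadata tank/backup

    # return to the default once the load is done
    zfs set primarycache=all tank/backup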

>> If I had less than 4GB of RAM I would limit the ARC to 1/2 of RAM,
>> unless this were solely a fileserver; in that case I would watch how
>> much memory I needed outside ZFS and set the ARC to slightly less
>> than that. Take a look at the recommendations for low RAM situations
>> here: https://wiki.freebsd.org/ZFSTuningGuide
>
>     Will do.  Hmm...I see again the recommendation to increase
> KVA_PAGES from 260 to 512.  I worry about that because the i386
> kernel says at boot that it ignores all real memory above ~2.9 GB.  A
> bit farther along, during the early messages preserved and available
> via dmesg(1), it says,
>
> real memory  = 4294967296 (4096 MB)
> avail memory = 3132100608 (2987 MB)

On the FreeBSD VM I am running with only 1 GB of memory, I did not do
any tuning and ZFS seems to be working fine. It is a mail store, but
not a very large one (only about 130 GB of email). Performance is not
a consideration on this VM; it is archival.
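
For what it is worth, capping the ARC is a one-line change (the 512M
value below is only a placeholder, not a recommendation from this
thread):

    # /boot/loader.conf -- limit the maximum ARC size
    vfs.zfs.arc_max="512M"

After a reboot you can watch the actual ARC size with
"sysctl kstat.zfs.misc.arcstats.size".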

--
Paul Kraus
paul@kraus-haus.org



