Date:      Wed, 11 Oct 2017 14:05:12 +0100
From:      Kate Dawson <k4t@3msg.es>
To:        freebsd-questions@freebsd.org
Subject:   FreeBSD ZFS file server with SSD HDD
Message-ID:  <20171011130512.GE24374@apple.rat.burntout.org>

Hi,

Currently running a FreeBSD NFS server with a zpool comprising:

12 x 1TB hard disk drives arranged as pairs of mirrors in a stripe set (RAID 10).

An additional 2 x 960GB SSDs have been added. These two SSDs are partitioned,
with a small partition being used as a ZIL (SLOG) device and a larger
partition arranged as L2ARC cache.

Additionally, the host has 64GB RAM and 16 CPU cores (AMD Opteron, 2GHz).

A dataset from the pool is exported via NFS to a number of Debian
GNU/Linux hosts running the Xen hypervisor. These run several disk-image-based
virtual machines.
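
The export is done with the ZFS sharenfs property, roughly as follows (the
dataset name and network are placeholders, and the options are simply passed
through to mountd):

    zfs set sharenfs="-maproot=root -network=192.0.2.0/24" tank/vm-images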

In general use, the FreeBSD NFS host sees very little read I/O, which is to be
expected, as the RAM cache (ARC) and the L2ARC are designed to minimise the
amount of read load on the disks.
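
If it's useful, the counters I can pull to back this up are the arcstats
sysctls, e.g.:

    sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses
    sysctl kstat.zfs.misc.arcstats.l2_hits kstat.zfs.misc.arcstats.l2_misses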

However, we're starting to see high load (mostly I/O wait) on the Linux
virtualisation hosts and virtual machines, with kernel timeouts occurring,
resulting in crashes and instability.

I believe this may be due to the limited number of random write IOPS available
from the zpool backing the NFS export.

I can get sequential writes and reads to and from the NFS server at speeds
that approach the maximum the network provides (currently 1Gb/s with jumbo
frames, and I could increase this by bonding multiple interfaces together).
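
If it came to that, the bonding I have in mind is a lagg(4) LACP aggregation,
roughly the following in /etc/rc.conf (interface names and addressing are
placeholders, not the real ones):

    ifconfig_igb0="up mtu 9000"
    ifconfig_igb1="up mtu 9000"
    cloned_interfaces="lagg0"
    ifconfig_lagg0="laggproto lacp laggport igb0 laggport igb1 192.0.2.10/24"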

However, day-to-day usage does not show network utilisation anywhere near
this maximum.

If I look at the output of `zpool iostat -v tank 1` I see that every
five seconds or so, the number of write operations goes to > 2k.

I think this shows that I'm hitting the limit that the spinning disks
can provide under this workload.
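
The per-disk check I have in mind to confirm this is gstat on the FreeBSD side
(and iostat from sysstat on the Linux hosts); sustained %busy near 100 on the
mirror members during those bursts would, I think, point at the spindles:

    # physical providers only, 1-second interval
    gstat -p -I 1s
    # on the Debian hosts: per-device await/%util
    iostat -x 1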

As a cost-effective way to improve this (rather than replacing the whole
chassis), I was considering replacing the 1TB HDDs with 1TB SSDs for the
improved IOPS.
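
If I go that way, I assume the swap can be done in place, one disk at a time,
letting each resilver finish before moving on (device names here are purely
illustrative):

    # replace one mirror member with its SSD counterpart, then wait
    zpool replace tank da0 da12
    zpool status tank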

I wonder if there are any opinions within the community here on:

1. What metrics can I gather to confirm that disk write I/O is the bottleneck?

2. Whether the proposed solution will have the desired effect, that is, a
decrease in the I/O wait on the GNU/Linux virtualisation hosts.

I hope this is a well-formed question.

Regards,

Kate Dawson
--
"The introduction of a coordinate system to geometry is an act of violence"
