Date:      Fri, 20 Jun 2014 11:28:16 -0400
From:      Rich <rincebrain@gmail.com>
To:        Graham Allan <allan@physics.umn.edu>
Cc:        freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: Large ZFS arrays?
Message-ID:  <CAOeNLuo-m-_hu5TdnG_njsArYHhQOTsVsjknbsumbO7_p8LvPQ@mail.gmail.com>
In-Reply-To: <53A44A23.6050604@physics.umn.edu>
References:  <1402846139.4722.352.camel@btw.pki2.com> <53A44A23.6050604@physics.umn.edu>

Just FYI, a lot of people who do this use sas2ircu/sas3ircu for
scripting it rather than sg3_utils, though the latter is more powerful
if you have enough of the SAS spec to play with...
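
A minimal sketch of scripting it that way, in Python: the controller
number, the enclosure:slot value and the way the sas2ircu output is used
are assumptions to check against your own setup, not something taken
from a real deployment.

#!/usr/bin/env python
# Sketch: wrap sas2ircu to dump controller/enclosure info and blink a
# drive's identify LED.  Assumes sas2ircu is in PATH and that controller 0
# and enclosure 2, slot 5 exist -- adjust for your hardware.
import subprocess

def sas2ircu(*args):
    """Run sas2ircu with the given arguments and return its stdout as text."""
    out = subprocess.check_output(["sas2ircu"] + list(args))
    return out.decode("utf-8", "replace")

if __name__ == "__main__":
    print(sas2ircu("LIST"))               # enumerate LSI controllers
    print(sas2ircu("0", "DISPLAY"))       # enclosure/slot/serial info for controller 0
    sas2ircu("0", "LOCATE", "2:5", "ON")  # identify LED on enclosure 2, slot 5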

- Rich

On Fri, Jun 20, 2014 at 10:50 AM, Graham Allan <allan@physics.umn.edu> wrote:
> On 6/15/2014 10:28 AM, Dennis Glatting wrote:
>>
>> Anyone built a large ZFS infrastructures (PB size) and care to share
>> words of wisdom?
>
>
> This is a bit of a late response but I wanted to put in our "me too" before
> I forget...
>
> We have about 500TB of storage on ZFS at present, and plan to add 600TB more
> later this summer, mostly in similar arrangements to what I've seen
> discussed already - using Supermicro 847 JBOD chassis and a mixture of Dell
> R710/R720 head nodes, with LSI 9200-8e HBAs. One R720 has four 847 chassis
> attached, and a couple of R710s just have a single chassis. We originally
> installed one HBA in the R720 for each chassis but hit some deadlock problems
> at one point, which were resolved by daisy-chaining the chassis from a single
> HBA. I had a feeling it was maybe related to kern/177536, but I'm not really sure.
>
> We've been running FreeBSD 9.1 on all the production nodes, though I've long
> wanted to (and am now beginning to) set up a reasonable long-term testing
> box where we could check out some of the kernel patches or tuning
> suggestions which come up. We're also beginning to test the 9.3 release
> for the next set of servers.
>
> We built all these conservatively with each chassis as a separate pool, each
> having four 10-drive raidz2 vdevs, a couple of spares, a cheapish L2ARC SSD
> and a mirrored pair of ZIL SSDs (maybe unnecessary to mirror this these
> days?). I was using the Intel 24GB SLC drive for the ZIL; I'll need to
> choose something new for future pools.
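
As a rough illustration of that layout, here is a small Python sketch
that assembles the corresponding zpool create command line; the pool
name and all of the da device numbers are made up, so treat it as a
template for the shape of the pool rather than the actual build.

#!/usr/bin/env python
# Sketch of the pool layout described above: four 10-drive raidz2 vdevs,
# two hot spares, a mirrored log (ZIL) pair and a single cache (L2ARC)
# device.  The pool name and device names are hypothetical.
import subprocess

pool = "tank"                               # hypothetical pool name
data = ["da%d" % i for i in range(40)]      # 40 data drives, da0..da39
spares = ["da40", "da41"]                   # hot spares
log_mirror = ["da42", "da43"]               # SLC SSDs for the ZIL
cache = ["da44"]                            # L2ARC SSD

cmd = ["zpool", "create", pool]
for i in range(0, 40, 10):                  # one raidz2 vdev per 10 drives
    cmd += ["raidz2"] + data[i:i + 10]
cmd += ["spare"] + spares
cmd += ["log", "mirror"] + log_mirror
cmd += ["cache"] + cache

print(" ".join(cmd))                        # review the command first
# subprocess.check_call(cmd)                # uncomment to actually create the pool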
>
> It would be interesting to hear a little about experiences with the drives
> used... For our first "experimental" chassis we used 3TB Seagate desktop
> drives - cheap but not the best choice; 18 months later they are dropping
> like flies (luckily we can risk some cheapness here as most of our data can
> be re-transferred from other sites if needed). Another chassis has 2TB WD
> RE4 enterprise drives (no problems), and four others have 3TB and 4TB WD
> "Red" NAS drives... which are another "slightly risky" selection but so far
> have been very solid (in some casual discussion, a WD field engineer seemed
> to feel these would be fine for both ZFS and Hadoop use).
>
> Tracking drives for failures and replacements was a big issue for us. One of
> my co-workers wrote a nice Perl script which periodically harvests all the
> data from the chassis (via sg3_utils) and stores the mappings of chassis
> slots, da devices, drive labels, etc. into a database. It also understands
> the layout of the 847 chassis and labels the drives for us according to some
> rules we made up - a prefix for the pool name, then "f" or "b" for the
> front/back of the chassis, then the slot number. Finally, it has some
> controls to turn the chassis drive identify lights on or off. There might be
> other ways to do all this but we didn't find any, so it's been incredibly
> useful for us.
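
Something along those lines, sketched in Python rather than Perl; the
sg_ses call for the identify light, the label pattern and the sqlite
storage are assumptions about how such a script might look, not a
description of the actual one.

#!/usr/bin/env python
# Sketch of a drive-tracking helper in the spirit described above: build
# labels like "<pool>f<slot>" / "<pool>b<slot>", keep the slot/device
# mapping in a small database, and toggle a slot's identify light.
import sqlite3
import subprocess

def slot_label(pool_prefix, slot, front=True):
    # e.g. pool prefix "zp1", front slot 7 -> "zp1f7"
    return "%s%s%d" % (pool_prefix, "f" if front else "b", slot)

def identify(ses_dev, slot_index, on=True):
    # sg_ses (from sg3_utils) can set or clear the 'ident' bit for an
    # enclosure element; the index numbering depends on the enclosure.
    flag = "--set=ident" if on else "--clear=ident"
    subprocess.check_call(["sg_ses", "--index=%d" % slot_index, flag, ses_dev])

def store(dbpath, rows):
    # rows: iterable of (label, da_device, enclosure, slot, serial) tuples
    db = sqlite3.connect(dbpath)
    db.execute("""CREATE TABLE IF NOT EXISTS drives
                  (label TEXT, da TEXT, enclosure TEXT, slot INTEGER, serial TEXT)""")
    db.executemany("INSERT INTO drives VALUES (?,?,?,?,?)", rows)
    db.commit()

if __name__ == "__main__":
    # hypothetical example: front slot 7 of pool "zp1" is da12 behind ses0
    rows = [(slot_label("zp1", 7), "da12", "ses0", 7, "SERIAL1234")]
    store("/var/db/drivemap.sqlite", rows)
    identify("/dev/ses0", 7, on=True)   # blink the slot so a human can find it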
>
> As far as performance goes we've been pretty happy. Some of these get
> relatively hammered by NFS I/O from cluster compute jobs (maybe ~1200
> processes on 100 nodes) and they have held up much better than our RHEL NFS
> servers using Fibre Channel RAID storage. We've also performed a few bulk
> transfers between Hadoop and ZFS (using distcp with an NFS destination) and
> saw sustained 5 Gbps write speeds (which really surprised me).
>
> I think that's all I've got for now.
>
> Graham
> --
> -------------------------------------------------------------------------
> Graham Allan
> School of Physics and Astronomy - University of Minnesota
> -------------------------------------------------------------------------
>


