Date:      Wed, 9 Sep 2015 08:20:24 -0400 (EDT)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Outback Dingo <outbackdingo@gmail.com>
Cc:        Mark Saad <nonesuch@longcount.org>, freebsd-fs@freebsd.org,  Rakshith Venkatesh <vrock28@gmail.com>,  Jordan Hubbard <jordanhubbard@icloud.com>
Subject:   Re: CEPH + FreeBSD
Message-ID:  <488408636.4345946.1441801224232.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <CAKYr3zyPxM0oKhgsNoFmkFxn6AFkekS2=ptV7Z6CicTzZvXCdw@mail.gmail.com>
References:  <CANw0z%2BVhYCPNWrjByXLf8yO9wA0sc05_8eVJsM-McjcGNU9KQg@mail.gmail.com> <CANw0z%2BXrwK=6y%2BLpoiewc_eLDBYB5UZ5XpU6-YP0-K2FKwSa5w@mail.gmail.com> <A19FDEB5-1DEF-4EBF-8E9E-A1AD4688F1AA@icloud.com> <100306673.40344407.1441279047901.JavaMail.zimbra@uoguelph.ca> <1564D4FA-9BE1-4E37-8E91-F14A009D6B62@icloud.com> <838814506.1858817.1441577912291.JavaMail.zimbra@uoguelph.ca> <F5F89A87-AD10-4CD3-BF56-854EEBB4C121@longcount.org> <CAKYr3zyPxM0oKhgsNoFmkFxn6AFkekS2=ptV7Z6CicTzZvXCdw@mail.gmail.com>

Outback Dingo wrote:
> On Wed, Sep 9, 2015 at 11:31 AM, Mark Saad <nonesuch@longcount.org> wrote:
>
> > All
> >  What about LeoFS? It's in ports and has an S3 object store and NFS
> > support out of the box.
> >
> >
> > http://www.freshports.org/databases/leofs/
> > http://leo-project.net
>
>
> LeoFS supports NFSv3 and does not have a lock manager....
>
I doubt lack of a lock manager is an issue for what I want to do, since the NFSv4.1
metadata server (just a regular NFSv4.1 server that can give out layouts for
reading/writing the data directly on the data servers) handles the locking. It is
actually much easier to keep track of the locking in the NFSv4.1 server and not have
to worry about locking on the underlying cluster FS. All I intend to do with an NFSv3
server on the data server(s) is Read/Write RPCs. Everything else is handled via the
NFSv4.1 metadata server.
(The original RFC required use of NFSv4.1 read/write ops on the data servers,
 but a new layout type called flex files supports NFSv3 Read/Write for the data servers.)
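
(As a rough illustration of that division of labour, here is a toy Python sketch. It
is not taken from any NFS implementation: the class names, the get_layout() method and
the placement rule are all hypothetical, and real layouts and file handles are far
richer. It only shows the idea that locking state lives in the metadata server while
the data servers do nothing but reads and writes.)

```python
from dataclasses import dataclass, field


@dataclass
class DataServer:
    """Toy data server: it only services Read/Write and knows nothing about locks."""
    blocks: dict = field(default_factory=dict)

    def write(self, fh, offset, data):
        buf = self.blocks.setdefault(fh, bytearray())
        end = offset + len(data)
        if len(buf) < end:
            buf.extend(b"\0" * (end - len(buf)))  # zero-fill any gap
        buf[offset:end] = data
        return len(data)

    def read(self, fh, offset, length):
        return bytes(self.blocks.get(fh, b"")[offset:offset + length])


@dataclass
class MetadataServer:
    """Toy metadata server: tracks locks and hands out layouts."""
    data_servers: list
    locks: dict = field(default_factory=dict)

    def lock(self, path, owner):
        # The first owner to ask holds the lock; everyone else is refused.
        return self.locks.setdefault(path, owner) == owner

    def unlock(self, path, owner):
        if self.locks.get(path) == owner:
            del self.locks[path]

    def get_layout(self, path):
        # Hypothetical placement rule: choose a data server from the path bytes.
        index = sum(path.encode()) % len(self.data_servers)
        return index, "fh:" + path


mds = MetadataServer([DataServer(), DataServer()])
if mds.lock("/export/file", "client1"):
    index, fh = mds.get_layout("/export/file")
    mds.data_servers[index].write(fh, 0, b"hello")   # data goes straight to the DS
    mds.unlock("/export/file", "client1")
```

(The point of the sketch is only that the data server code never has to look at the
lock table.)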

The key issue for me is whether or not it has a VFS interface to a POSIX-like
file system (via FUSE or ???). At a quick glance at the web page, I don't see
any mention of this.
Why? Well, simply the fact that I am looking at extending the current kernel-based
     NFSv4.1 server to support pNFS. Obviously, there are other ways an NFSv4.1/pNFS
     server can be built (userland NFS-Ganesha that is on Linux, for example), but
     that isn't what I'm interested in doing.

Btw, I took a quick look at MooseFS and it does seem to have this and could be an
     alternative to glusterFS. It isn't an object store and only appears to have a
     single metadata server, which might be a limitation for the long term?
     It sounds like MooseFS uses a custom protocol for the chunk/data
     servers and I don't feel like trying to define yet another layout type, so I
     think I would need to add a partial NFSv3 server to the chunk/data servers.

     I will be looking more closely at both glusterFS and MooseFS soon.

If there are yet more of these cluster object stores that you think might be worth
considering, feel free to mention them. (I thought I had looked at most of them, but
hadn't noticed MooseFS, so...)

Thanks for all the comments, rick

>
>
> >
> >
> >
> > ---
> > Mark Saad | nonesuch@longcount.org
> >
> > > On Sep 6, 2015, at 6:18 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> > >
> > > Jordan Hubbard wrote:
> > >>
> > >>> On Sep 3, 2015, at 4:17 AM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> > >>>
> > >>> Slightly off topic but, btw, there is a port of GlusterFS and those folks
> > >>> do seem interested in seeing it brought "up to speed". I am not sure how
> > >>> mature it is at this point, but it has been known to build on amd64. (I
> > >>> don't have an amd64 machine, so I haven't gotten around to
> > >>> building/testing it, but I do plan to try and use it as a basis for a
> > >>> pNFS server, if I can figure out how to get the FH info out of it.
> > >>> I'm working on that;-)
> > >>
> > >> There are at least two distributed (multi-node) object stores for FreeBSD
> > >> that I know of.
> > >>
> > >> One is glusterfs, for which I’m not even really clear on the status of the
> > >> ports.  I don’t see any glusterfs port in the master branch of
> > >> https://github.com/freebsd/freebsd-ports (or
> > >> https://github.com/freebsd/freebsd-ports/tree/branches/2015Q3 for that
> > >> matter).
> > >>
> > >> Our FreeNAS ports tree (https://github.com/freenas/ports), in which we have a
> > >> bit more latitude to add and curate our own ports, has both a net/glusterfs
> > >> and sysutils/glusterfs, from separate sources (looks like we need to clean
> > >> things up) - net/glusterfs lists craig001@lerwick.hopto.org as the
> > >> MAINTAINER and is at version 3.6.2.  The sysutils/glusterfs port lists
> > >> bapt@FreeBSD.org as the MAINTAINER and is at version 20140811.
> > >>
> > >> I’m not really sure about the provenance since we were simply evaluating
> > >> glusterfs for a while and may have pulled in interim versions from those
> > >> sources, but obviously it would be best to have an official maintainer and
> > >> someone in the FreeBSD project actually curating a glusterfs port so that
> > >> all users of FreeBSD can use it.  It would also be fairly key to your own
> > >> efforts, assuming you decide to pursue glusterfs as a foundation technology
> > >> for pNFS.
> > >>
> > >> The other object store, which is pretty mature and is currently leading the
> > >> pack (of two :) ) for inclusion into FreeNAS is RiakCS from Basho.  There is
> > >> a port currently in databases/riak but it’s pretty out of date at version
> > >> 1.4.12 (the current version is 2.0.1, with 2.0 being a major upgrade of
> > >> RiakCS).
> > >>
> > >> We are very interested in investigating various ways of shimming RiakCS to
> > >> NFS, using RiakCS as a back-end store.   Is that something you’d be amenable
> > >> to discussing?   I’d be happy to send you an amd64 architecture machine to
> > >> develop on. :)
> > > Hmm. From a quick look at their web page (I looked once before as well), I don't
> > > think RiakCS has what I need to do pNFS in a reasonable (for me) amount of effort.
> > > Two things that glusterFS has that I am hoping to use (and I don't think RiakCS has
> > > either of these) are:
> > > - A Fuse file system interface which allows the kernel nfsd to access the store as
> > >  a file system, so that it can provide the metadata services (NFS without the
> > >  reads/writes).
> > > - A userland NFSv3 server in each node which will allow the node to act as a data
> > >  server.
> > >
> > > If I am wrong and RiakCS does support a VFS file system interface (via Fuse or ???),
> > > then please correct me. With that, it might be a reasonable alternative.
> > > I'll admit I've spent a little time looking at the glusterFS sources and haven't yet
> > > solved the problem of how to generate the file handles I need, but that sounds trivial
> > > compared with an entire Fuse and/or VFS file system interface, I think?
> > >
> > > In general, using a cloud object store to implement a pNFS server is a *mis*use of
> > > the technology, imho. I think it may be possible with glusterFS, since that technology
> > > seems to be based on a cluster file system, which is what a pNFS server can also use.
> > >
> > > I think there would be a lot of work involved in mapping a POSIX file system onto the
> > > Riak database and then exporting that via NFS, etc. It might also be more practical to
> > > do this via a userland NFS service than the kernel based one currently in FreeBSD.
> > > (glusterFS is starting to use the NFS-ganesha server, but I believe it is pretty
> > > Linux specific, so I doubt it would be useful for Riak running on FreeBSD?)
> > >
> > > rick
> > >
> > >> - Jordan
> > > _______________________________________________
> > > freebsd-fs@freebsd.org mailing list
> > > https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> >
>


