Date:      Mon, 15 Dec 2014 20:31:17 -0500 (EST)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Mehmet Erol Sanliturk <m.e.sanliturk@gmail.com>
Cc:        freebsd-net@freebsd.org, Gerrit Kühn <gerrit.kuehn@aei.mpg.de>
Subject:   Re: compiling on nfs directories
Message-ID:  <1877801167.13336057.1418693477745.JavaMail.root@uoguelph.ca>
In-Reply-To: <CAOgwaMtky7a62tn3Q+vsWZObM9NDVE-tR4iqvxqaLSvxTKrWkQ@mail.gmail.com>

Mehmet Erol Sanliturk wrote:
> On Mon, Dec 15, 2014 at 3:42 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> >
> > Mehmet Erol Sanliturk wrote:
> > > On Mon, Dec 15, 2014 at 12:59 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> > > >
> > > > Mehmet Erol Sanliturk wrote:
> > > > > On Mon, Dec 15, 2014 at 1:24 AM, Gerrit Kühn <gerrit.kuehn@aei.mpg.de> wrote:
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I ran into some weird issue here last week: I have an NFS server
> > > > > > for storage and diskless booting (PXE / NFS root) running under
> > > > > > FreeBSD. The clients are running Gentoo Linux. Some time ago, I
> > > > > > replaced the server, going from an HDD-based storage array (ZFS)
> > > > > > under FreeBSD 8.3 to an SSD-based array under FreeBSD 10-stable
> > > > > > (as of February this year - I know this needs updates).
> > > > > >
> > > > > > Only now have I recognized that this somehow appears to have
> > > > > > broken some of my Gentoo ebuilds, which do not install cleanly
> > > > > > anymore. They complain about "soiled libtool library files found"
> > > > > > and "insecure RUNPATHs" in the installation stage of shared libs.
> > > > > >
> > > > > > I was not able to find any useful solution for this on the net so
> > > > > > far. However, I was able to verify that this is somehow an issue
> > > > > > with the NFS server by plugging a USB drive into the diskless
> > > > > > clients and mounting it as /var/tmp/portage (the directory
> > > > > > structure where Gentoo's ebuilds are compiled). This makes the
> > > > > > error messages go away, and everything works again (like it did
> > > > > > before the server update).
> > > > > >
> > > > > > Are there any suggestions about what might be causing this and
> > > > > > how to fix it?
> > > > > >
> > > > > >
> > > > > > cu
> > > > > >   Gerrit
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > With respect to the information given in your message, my pure
> > > > > guess is the following:
> > > > >
> > > > > When a client generates a file on the NFS server, it assumes that
> > > > > everything has been written into the file. The next step (reading
> > > > > the generated file) starts, BUT the file has NOT been completely
> > > > > written to disk yet, so an incomplete file is read, which causes
> > > > > errors on the client.
> > > > >
> > > > Well, not exactly. The NFS client chooses whether or not the written
> > > > data must be committed to stable storage (disk) right away via a flag
> > > > argument on the write. It is up to the client to keep track of what
> > > > has been written and, if the FILE_SYNC flag wasn't set, it must do a
> > > > separate Commit RPC to force the data to stable storage on the server.
> > > > It is also up to the NFS client to keep track of the file's size while
> > > > it is being grown, since the NFS server's size may be smaller until
> > > > the data gets written to the server.
> > > > Also, note that he didn't see the problem with FreeBSD 8.3, which
> > > > would have been following the same rules on the server as 10.1.
> > > >
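(To make the stable-write point above concrete: a rough, untested sketch from
the application side. The path is made up, and it assumes the common client
behaviour that an O_SYNC open turns each write into a stable write, or a
write followed by a Commit.)

/*
 * Hedged sketch: force stable writes from the application side.
 * Assumes /mnt/nfs/scratch.dat is on an NFS mount (hypothetical path)
 * and that the client maps O_SYNC to stable (committed) writes.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	const char msg[] = "written synchronously\n";
	int fd;

	/* With O_SYNC, write(2) should not return until the data is on
	 * stable storage on the server (FILE_SYNC, or write + Commit). */
	fd = open("/mnt/nfs/scratch.dat", O_WRONLY | O_CREAT | O_SYNC, 0644);
	if (fd == -1) {
		perror("open");
		return (1);
	}
	if (write(fd, msg, strlen(msg)) == -1)
		perror("write");
	close(fd);
	return (0);
}
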
> > > > What I suspect might cause this is one of two things:
> > > > 1 - The modify time of the file is now changing at a time the Linux
> > > >     client doesn't expect, due to changes in ZFS or maybe TOD clock
> > > >     resolution. (At one time, the TOD clock was only at a resolution
> > > >     of 1 sec, so the client wouldn't see the modify time change often.
> > > >     I think it is now at a much higher resolution, but I would have
> > > >     to look at the code/test to be sure; see the sketch after this
> > > >     list for one way to check.)
> > > > 2 - I think you mention this one later in your message, in that the
> > > >     build might be depending on file locking. If this is the case,
> > > >     trying NFSv4, which does better file locking, might fix the
> > > >     problem.
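(On point 1: one untested way to check the timestamp resolution the FreeBSD
server is using is the vfs.timestamp_precision sysctl, read here from C just
for illustration; "sysctl vfs.timestamp_precision" from the shell shows the
same thing.)

/*
 * Rough sketch: read vfs.timestamp_precision on the FreeBSD server.
 * 0 means file timestamps are kept at 1-second granularity; larger
 * values give sub-second precision.
 */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

int
main(void)
{
	int prec;
	size_t len = sizeof(prec);

	if (sysctlbyname("vfs.timestamp_precision", &prec, &len, NULL, 0) == -1) {
		perror("sysctlbyname");
		return (1);
	}
	printf("vfs.timestamp_precision = %d\n", prec);
	return (0);
}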
> > > >
> > > > Gerrit, I would suggest that you do "nfsstat -m" on the Linux client,
> > > > to see what the mount options are. The Linux client might be using
> > > > NFSv4 already.
> > > > Also, avoid "soft, intr" especially if you are using NFSv4, since
> > > > these can cause slow server response to result in a failure of a
> > > > read/write when it shouldn't fail, due to timeout or interruption by
> > > > a signal.
> > > >
> > > > If you could find out more about what causes the specific build
> > > > failure on the Linux side, that might help.
> > > > If you can reproduce a build failure quickly/easily, you can capture
> > > > packets via "tcpdump -s 0 -w <file> host <client-hostname>" on the
> > > > server and then look at it in wireshark to see what the server is
> > > > replying when the build failure occurs. (I don't mind looking at a
> > > > packet trace if it is relatively small, if you email it to me as an
> > > > attachment.)
> > > >
> > > > Good luck with it, rick
> > > > ps: I am not familiar with the Linux mount options, but if it has
> > > >     stuff like "nocto", you could try those.
> > > >
> > > > > In the FreeBSD NFS server there is NOT (or I was NOT able to find)
> > > > > a facility to store written data immediately to disk.
> > > > >
> > > > > The NFS server collects data up to a point (a number of bytes) and
> > > > > then writes it to disk (whether the NFS server is busy or not
> > > > > during this phase is not important). With this structure, tasks in
> > > > > which one program writes a small number of bytes to be read by
> > > > > another program cannot be handled by an NFS server alone.
> > > > >
> > > > > I did not try "locking in the NFS server": if this route is taken,
> > > > > then it is necessary to adjust the clients to wait until the NFS
> > > > > server has removed the lock before they can continue. (Each such
> > > > > read requires a waiting loop, without generating an error message
> > > > > about unavailable data and terminating.)
> > > > >
> > > > > In the Linux NFS server there is an option to write the received
> > > > > data to disk immediately. This improves the above situation
> > > > > considerably but does not completely solve the problem (because
> > > > > during reads, data in the cache is NOT merged with the data on
> > > > > disk).
> > > > >
> > > > >
> > > > > Another MAJOR problem is that the NFS server does NOT merge data in
> > > > > the cache with data on disk during reads: this defect makes the NFS
> > > > > server useless for, let's say, "real time" applications used
> > > > > concurrently or singly by the clients without using another
> > > > > "server" within the NFS server.
> > > > >
> > > > >
> > > > > In your case, during software builds, a step uses the previously
> > > > > generated files: on a local disk, writing and reading are
> > > > > sequential, in the sense that written data is found during reading.
> > > > > On an NFS server this is not the case.
> > > > >
> > > > >
> > > > > To my knowledge, obtained from messages on the FreeBSD mailing
> > > > > lists, a way to read data immediately after it is written to the
> > > > > NFS server is NOT available.
> > > > >
> > > > >
> > > > >
> > > > > Thank you very much.
> > > > >
> > > > > Mehmet Erol Sanliturk
> > > > > _______________________________________________
> > > > >
> > >
> > >
> > >
> > >
> > >
> > > When a C program is written to be used in an NFS environment, some
> > > possibilities can be used to synchronize writes and reads between the
> > > programs despite the unsolved "cached data" problem.
> > >
> > > When ready-made programs are used, such as "make" or "ld", there is no
> > > such choice.
> > >
> > > I am using Pascal programs, so there are no such facilities.
> > >
> > > The solution may be to improve the NFS server and client modules to
> > > use cached data during reads:
> > >
> > > If end of file is reached: before sending an EOF signal, check whether
> > > there is data in the cache or not.
> > >     If there is data in the cache: continue reading from the cache up
> > >     to its end,
> > >     else send an EOF signal.
> > >
> > > (For random-access files there is also a need to look at the cached
> > > values.)
> > >
> > Well, the FreeBSD NFS client (and most others) does extensive data
> > caching and will read data from the client cache whenever possible. NFS
> > performance without client caching is pretty terrible.
> >
> > The problem (which has existed since NFS was first developed in about
> > 1985) is that NFS does not provide a cache coherency protocol, so when
> > multiple clients write data to a file concurrently, there is no
> > guarantee that a client read will get the most up-to-date data. There
> > has been something called close-to-open (cto) consistency adopted,
> > which says that a client will read data written by another client after
> > the writing client has closed the file. (Most NFS clients only
> > implement this "approximately", since they depend on seeing the modify
> > time change to determine this. That may not happen when multiple
> > modifications occur in the same time-of-day clock tick, or when clients
> > cache the file's attributes and use a stale cached modify time. Turning
> > off client attribute caching improves this, but also results in a
> > performance hit, due to the extra Getattr RPCs done.)
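(If it helps to see the close-to-open contract spelled out: a rough, untested
sketch with a made-up path. The only guarantee is that a reader which opens
the file after the writer has closed it sees the written data.)

/*
 * Rough sketch of close-to-open consistency (hypothetical path).
 * The writer runs on one client, the reader on another; the reader
 * must open(2) the file after the writer's close(2).
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void
writer(void)				/* run on client A */
{
	int fd = open("/mnt/nfs/shared.dat",
	    O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd == -1) {
		perror("open");
		return;
	}
	(void)write(fd, "hello\n", 6);
	close(fd);			/* dirty data is pushed to the server */
}

static void
reader(void)				/* run on client B, after A's close */
{
	char buf[64];
	ssize_t n;
	int fd = open("/mnt/nfs/shared.dat", O_RDONLY);	/* open revalidates */

	if (fd == -1) {
		perror("open");
		return;
	}
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		fwrite(buf, 1, (size_t)n, stdout);
	close(fd);
}

int
main(int argc, char **argv)
{
	if (argc > 1 && strcmp(argv[1], "write") == 0)
		writer();
	else
		reader();
	return (0);
}
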
> >
> > The current consensus within the NFS community (driven by the Linux
> > client implementation) is to only provide data consistency among
> > multiple clients when byte range locking is used on the file.
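(A rough, untested sketch of what that looks like in practice, with a made-up
path. Acquiring the byte range lock with fcntl(2) is, as I understand it, what
cues the client to flush and revalidate its cached data for the file.)

/*
 * Sketch: serialize access to a shared file with a byte range lock
 * (hypothetical path).  F_SETLKW blocks until the lock is granted.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	struct flock fl;
	char buf[128];
	ssize_t n;
	int fd;

	fd = open("/mnt/nfs/shared.dat", O_RDWR);
	if (fd == -1) {
		perror("open");
		return (1);
	}

	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;		/* exclusive lock ...      */
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;			/* ... over the whole file */
	if (fcntl(fd, F_SETLKW, &fl) == -1) {
		perror("fcntl(F_SETLKW)");
		close(fd);
		return (1);
	}

	/* Reads and writes done while the lock is held see the updates
	 * made by other clients under their locks. */
	if ((n = read(fd, buf, sizeof(buf))) == -1)
		perror("read");

	fl.l_type = F_UNLCK;		/* release the lock */
	(void)fcntl(fd, F_SETLK, &fl);
	close(fd);
	return (0);
}
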
> >
> > I'm not sure if this was what you were referring to. (It is true that
> > NFS is not and cannot be a POSIX compliant file system, due to its
> > design.)
> >
> > "make" can often be confused when the modify time isn't updated when
> > expected.
> >
> > If an application running on FreeBSD wants to ensure that data is
> > written to stable storage on the server, the application can call
> > fsync(2).
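(For completeness, a minimal untested sketch of that; the path is made up.)

/*
 * Write through the client cache, then fsync(2) so the client pushes
 * the dirty data (and the Commit) and the server has it on stable
 * storage before anything else depends on it.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	const char msg[] = "build artifact contents\n";
	int fd;

	fd = open("/mnt/nfs/obj/output.o", O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1) {
		perror("open");
		return (1);
	}
	if (write(fd, msg, strlen(msg)) == -1)
		perror("write");

	if (fsync(fd) == -1)		/* force data to stable storage */
		perror("fsync");

	close(fd);
	return (0);
}
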
> >
> > > Since the above modification requires knowledge of the internal
> > > structure of the NFS server, and perhaps the NFS client, I am not able
> > > to supply any patch. I am also not able to estimate its implementation
> > > difficulty.
> > >
> > > My opinion is that the above modification would be a wonderful
> > > improvement for the NFS system in FreeBSD, because it would behave
> > > just like a local data store, usable for "real time" data processing
> > > tasks. In the present structure, this is NOT possible with the NFS
> > > client and server alone.
> > >
> > Many years ago, I implemented a cache coherency protocol for NFS called
> > NQNFS. No one used it (at least not much) and it never caught on.
> > Most care about NFS performance, and data coherency has never been a
> > priority for most users, from what I've seen.
> >
> > rick
> >
>
> With respect to the information given in
>
> The Design and Implementation of the FreeBSD Operating System,
> by Marshall Kirk McKusick, George V. Neville-Neil, and Robert N.M. Watson,
> Second Edition, p. 559:
>
> NQNFS has been removed from FreeBSD as of version 5.
>
> It was available in version 4.11:
>
> http://svnweb.freebsd.org/base/release/4.11.0/sys/nfs/nqnfs.h?view=markup
>
> (1) Is there a newer version of NQNFS other than the above which is
>     available? A link would be very good, if it is available.
>
> (2) Are there other systems which are using NQNFS in their current
>     distributions?
>
Not that I know of. It's long gone dead and buried...

rick

>
> >
> > >
> > > Thank you very much.
> > >
> > >
> > > Mehmet Erol Sanliturk
> > >
> >
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to
> "freebsd-net-unsubscribe@freebsd.org"


