Date:      Fri, 17 Jul 2015 15:31:59 -0400 (EDT)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Graham Allan <allan@physics.umn.edu>
Cc:        Ahmed Kamal via freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: Linux NFSv4 clients are getting (bad sequence-id error!)
Message-ID:  <184170291.10949389.1437161519387.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <20150716235022.GF32479@physics.umn.edu>
References:  <684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca> <5594B008.10202@freebsd.org> <1022558302.2863702.1435838360534.JavaMail.zimbra@uoguelph.ca> <CANzjMX5eN1FsnHMf6KGZe_b3vwxxF=dy3fJUHxeGO4BXuNzfPA@mail.gmail.com> <791936587.3443190.1435873993955.JavaMail.zimbra@uoguelph.ca> <CANzjMX427XNQJ1o6Wh2CVy1LF1ivspGcfNeRCmv%2BOyApK2UhJg@mail.gmail.com> <CANzjMX5xyUz6OkMKS4O-MrV2w58YT9ricOPLJWVtAR5Ci-LMew@mail.gmail.com> <20150716235022.GF32479@physics.umn.edu>

Graham Allan wrote:
> I'm curious how things are going for you with this?
> 
> Reading your thread did pique my interest since we have a lot of
> Scientific Linux (RHEL clone) boxes with FreeBSD NFSv4 servers. I meant
> to glance through our logs for signs of the same issue, but today I
> started investigating a machine which appeared to have hung processes,
> high rpciod load, and high traffic to the NFS server. Of course it is
> exactly this issue.
> 
> The affected machine is running SL5 though most of our server nodes are
> now SL6. I can see errors from most of them but the SL6 systems appear
> less affected - I see a stream of the sequence-id errors in their logs but
> things in general keep working. The one SL5 machine I'm looking at
> has a single sequence-id error in today's logs, but then goes into a
> stream of "state recovery failed" then "Lock reclaim failed". It's
> probably partly related to the particular workload on this machine.
> 
> I would try switching our SL6 machines to NFS 4.1 to see if the
> behaviour changes, but 4.1 isn't supported by our 9.3 servers (is it in
> 10.1?).
> 
Btw, I've done some testing against a fairly recent Fedora and haven't seen
the problem. If either of you guys could load a recent Fedora on a test client
box, it would be interesting to see if it suffers from this. (My experience is
that the Fedora distros have more up to date Linux NFS clients.)

rick

> At the NFS servers, most of the sysctl settings are already tuned
> from defaults. eg tcp.highwater=100000, vfs.nfsd.tcpcachetimeo=300,
> 128-256 nfs kernel threads.
> 
> Graham
> 
> On Fri, Jul 03, 2015 at 01:21:00AM +0200, Ahmed Kamal via freebsd-fs wrote:
> > > PS: Today (after adjusting tcp.highwater) I didn't get any screaming
> > > reports from users about hung vnc sessions. So maybe, just maybe, Linux
> > > clients are able to somehow recover from these bad sequence messages. I
> > > could still see the bad sequence error message in the logs, though.
> > > 
> > > Why isn't the highwater tunable set to something better by default? I mean
> > > this server is certainly not under a high or unusual load (it's only 40 PCs
> > > mounting from it).
> > 
> > > On Fri, Jul 3, 2015 at 1:15 AM, Ahmed Kamal
> > > <email.ahmedkamal@googlemail.com> wrote:
> > 
> > > Thanks all .. I understand now we're doing the "right thing" .. Although
> > > if mounting keeps wedging, I will have to solve it somehow! Either using
> > > Xin's patch .. or Upgrading RHEL to 6.x and using NFS4.1.
> > >
> > > > Regarding Xin's patch, is it possible to build the patched nfsd code as
> > > > a kernel module? I'm looking to minimize my delta to upstream.
> > > >
> > > > Also, would adopting Xin's patch and hiding it behind a
> > > > kern.nfs.allow_linux_broken_client be an option (I'm probably not the
> > > > last person on earth to hit this)?
> > >
> > > Thanks a lot for all the help!
> > >
> > > On Thu, Jul 2, 2015 at 11:53 PM, Rick Macklem <rmacklem@uoguelph.ca>
> > > wrote:
> > >
> > >> Ahmed Kamal wrote:
> > > >> > Appreciating the fruitful discussion! Can someone please explain to
> > > >> > me what would happen in the current situation (Linux client doing this
> > > >> > skip-by-1 thing, and FreeBSD not doing it)? What is the effect of
> > > >> > that?
> > > >> Well, as you've seen, the Linux client doesn't function correctly
> > > >> against the FreeBSD server (and probably others that don't support
> > > >> this "skip-by-1" case).
> > > >>
> > > >> > What do users see? Any chances of data loss?
> > > >> Hmm. Mostly it will cause Opens to fail, but I can't guess what the
> > > >> Linux client behaviour is after receiving NFS4ERR_BAD_SEQID. You're
> > > >> the guy observing it.
> > >>
> > >> >
> > > >> > Also, I find it strange that netapp have acknowledged this is a bug on
> > > >> > their side, which has been fixed since then!
> > > >> Yea, I think Netapp screwed up. For some reason their server allowed
> > > >> this, then was fixed to not allow it, and then someone decided that
> > > >> was broken and reversed it.
> > > >>
> > > >> > I also find it strange that I'm the first to hit this :) Is no one
> > > >> > running nfs4 yet!
> > > >> >
> > > >> Well, it seems to be slowly catching on. I suspect that the Linux client
> > > >> mounting a Netapp is the most common use of it. Since it appears that
> > > >> they flip-flopped w.r.t. whose bug this is, it has probably persisted.
> > >>
> > > >> It may turn out that the Linux client has been fixed or it may turn out
> > > >> that most servers allowed this "skip-by-1" even though David Noveck (one
> > > >> of the main authors of the protocol) seems to agree with me that it
> > > >> should not be allowed.
> > > >>
> > > >> It is possible that others have bumped into this, but it wasn't isolated
> > > >> (I wouldn't have guessed it, so it was good you pointed to the RedHat
> > > >> discussion) and they worked around it by reverting to NFSv3 or similar.
> > > >> The protocol is rather complex in this area and changed completely for
> > > >> NFSv4.1, so many have also probably moved on to NFSv4.1 where this
> > > >> won't be an issue. (NFSv4.1 uses sessions to provide exactly-once RPC
> > > >> semantics and doesn't use these seqid fields.)
> > >>
> > >> This is all just mho, rick
> > >>
> > > >> > On Thu, Jul 2, 2015 at 1:59 PM, Rick Macklem <rmacklem@uoguelph.ca>
> > > >> > wrote:
> > >> >
> > >> > > Julian Elischer wrote:
> > >> > > > On 7/2/15 9:09 AM, Rick Macklem wrote:
> > > >> > > > > I am going to post to nfsv4@ietf.org to see what they say.
> > > >> > > > > Please let me know if Xin Li's patch resolves your problem,
> > > >> > > > > even though I don't believe it is correct except for the
> > > >> > > > > UINT32_MAX case. Good luck with it, rick
> > > >> > > > and please keep us all in the loop as to what they say!
> > > >> > > >
> > > >> > > > the general N+2 bit sounds like bullshit to me.. it's always N+1
> > > >> > > > in a number field that has a bit of slack at wrap time (probably
> > > >> > > > due to some ambiguity in the original spec).
> > >> > > >
> > > >> > > Actually, since N is the lock op already done, N + 1 is the next
> > > >> > > lock operation in order. Since lock ops need to be strictly
> > > >> > > ordered, allowing N + 2 (which means N + 2 would be done before
> > > >> > > N + 1) makes no sense.
> > > >> > >
> > > >> > > I think the author of the RFC meant that N + 2 or greater fails,
> > > >> > > but it was poorly worded.
> > > >> > >
> > > >> > > I will pass along whatever I get from nfsv4@ietf.org. (There is an
> > > >> > > archive of it somewhere, but I can't remember where.;-)
> > >> > >
> > >> > > rick
> > >> > > _______________________________________________
> > >> > > freebsd-fs@freebsd.org mailing list
> > >> > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > >> > > To unsubscribe, send any mail to
> > >> > > "freebsd-fs-unsubscribe@freebsd.org"
> > >> > >
> > >> >
> > >>
> > >
> > >
> 
> --
> -------------------------------------------------------------------------
> Graham Allan - allan@physics.umn.edu - gta@umn.edu - (612) 624-5040
> School of Physics and Astronomy - University of Minnesota
> -------------------------------------------------------------------------
> 
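
For anyone skimming the quoted thread: the ordering rule argued over above
(N is the last lock/open op done for an owner, N + 1 is the only acceptable
next one, N + 2 is rejected) can be sketched roughly as below. This is only
an illustration and not the FreeBSD nfsd code. The names seqid_verdict and
check_seqid are made up for the example. Treating a repeat of N as a
retransmission is my understanding of NFSv4.0 and is not something stated
in the thread. Wraparound is left as plain unsigned 32-bit arithmetic, which
is an assumption; the thread only says the UINT32_MAX case is special.

```c
#include <stdint.h>

/* Hypothetical verdicts for a received NFSv4.0 owner seqid. */
enum seqid_verdict {
	SEQID_NEXT,	/* received == N + 1: next op in order, do it */
	SEQID_REPLAY,	/* received == N: assumed retransmission of last op */
	SEQID_BAD	/* anything else, e.g. N + 2: NFS4ERR_BAD_SEQID */
};

/*
 * last_done is N, the seqid of the last op completed for this owner.
 * The (uint32_t) cast makes the increment wrap modulo 2^32; how the
 * wrap should really be handled is an assumption in this sketch.
 */
static enum seqid_verdict
check_seqid(uint32_t last_done, uint32_t received)
{
	if (received == (uint32_t)(last_done + 1))
		return (SEQID_NEXT);
	if (received == last_done)
		return (SEQID_REPLAY);
	return (SEQID_BAD);
}
```

With N = 10, a request carrying 11 is accepted, 10 is treated as a replay,
and 12 (the "skip-by-1" case) gets the bad sequence-id verdict.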


