FreeBSD Mail Archives

Date:      Thu, 16 Jul 2015 18:50:22 -0500
From:      Graham Allan <allan@physics.umn.edu>
To:        Ahmed Kamal via freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: Linux NFSv4 clients are getting (bad sequence-id error!)
Message-ID:  <20150716235022.GF32479@physics.umn.edu>
In-Reply-To: <CANzjMX5xyUz6OkMKS4O-MrV2w58YT9ricOPLJWVtAR5Ci-LMew@mail.gmail.com>
References:  <684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca> <CANzjMX7xKBvnzJhQhB_ZrUnyE2m_FJXXy4fm_RFnuZfBDyDm2A@mail.gmail.com> <55947C6E.5060409@delphij.net> <1491630362.2785531.1435799383802.JavaMail.zimbra@uoguelph.ca> <5594B008.10202@freebsd.org> <1022558302.2863702.1435838360534.JavaMail.zimbra@uoguelph.ca> <CANzjMX5eN1FsnHMf6KGZe_b3vwxxF=dy3fJUHxeGO4BXuNzfPA@mail.gmail.com> <791936587.3443190.1435873993955.JavaMail.zimbra@uoguelph.ca> <CANzjMX427XNQJ1o6Wh2CVy1LF1ivspGcfNeRCmv%2BOyApK2UhJg@mail.gmail.com> <CANzjMX5xyUz6OkMKS4O-MrV2w58YT9ricOPLJWVtAR5Ci-LMew@mail.gmail.com>

I'm curious how things are going for you with this?

Reading your thread did pique my interest since we have a lot of
Scientific Linux (RHEL clone) boxes with FreeBSD NFSv4 servers. I meant
to glance through our logs for signs of the same issue, but today I
started investigating a machine which appeared to have hung processes,
high rpciod load, and high traffic to the NFS server. Of course it is
exactly this issue.

The affected machine is running SL5, though most of our server nodes are
now SL6. I can see errors from most of them but the SL6 systems appear
less affected - I see a stream of the sequence-id errors in their logs but
things in general keep working. The one SL5 machine I'm looking at
has a single sequence-id error in today's logs, but then goes into a
stream of "state recovery failed" then "Lock reclaim failed". It's
probably partly related to the particular workload on this machine.

I would try switching our SL6 machines to NFS 4.1 to see if the
behaviour changes, but 4.1 isn't supported by our 9.3 servers (is it in
10.1?).

On the NFS servers, most of the sysctl settings are already tuned away
from the defaults, e.g. tcp.highwater=100000, vfs.nfsd.tcpcachetimeo=300,
and 128-256 nfs kernel threads.

Graham

On Fri, Jul 03, 2015 at 01:21:00AM +0200, Ahmed Kamal via freebsd-fs wrote:
> PS: Today (after adjusting tcp.highwater) I didn't get any screaming
> reports from users about hung vnc sessions. So maybe, just maybe, linux
> clients are able to somehow recover from these bad sequence messages. I
> could still see the bad sequence error message in logs though
> 
> Why isn't the highwater tunable set to something better by default ? I mean
> this server is certainly not under a high or unusual load (it's only 40 PCs
> mounting from it)
> 
> On Fri, Jul 3, 2015 at 1:15 AM, Ahmed Kamal <email.ahmedkamal@googlemail.com> wrote:
> 
> > Thanks all .. I understand now we're doing the "right thing" .. Although
> > if mounting keeps wedging, I will have to solve it somehow! Either using
> > Xin's patch .. or Upgrading RHEL to 6.x and using NFS4.1.
> >
> > Regarding Xin's patch, is it possible to build the patched nfsd code, as a
> > kernel module ? I'm looking to minimize my delta to upstream.
> >
> > Also would adopting Xin's patch and hiding it behind a
> > kern.nfs.allow_linux_broken_client be an option (I'm probably not the last
> > person on earth to hit this) ?
> >
> > Thanks a lot for all the help!
> >
> > On Thu, Jul 2, 2015 at 11:53 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> >
> >> Ahmed Kamal wrote:
> >> > Appreciating the fruitful discussion! Can someone please explain to me,
> >> > what would happen in the current situation (linux client doing this
> >> > skip-by-1 thing, and freebsd not doing it) ? What is the effect of that?
> >> Well, as you've seen, the Linux client doesn't function correctly against
> >> the FreeBSD server (and probably others that don't support this
> >> "skip-by-1" case).
> >>
> >> > What do users see? Any chances of data loss?
> >> Hmm. Mostly it will cause Opens to fail, but I can't guess what the Linux
> >> client behaviour is after receiving NFS4ERR_BAD_SEQID. You're the guy
> >> observing it.
> >>
> >> >
> >> > Also, I find it strange that netapp have acknowledged this is a bug on
> >> > their side, which has been fixed since then!
> >> Yea, I think Netapp screwed up. For some reason their server allowed this,
> >> then was fixed to not allow it and then someone decided that was broken
> >> and reversed it.
> >>
> >> > I also find it strange that I'm the first to hit this :) Is no one
> >> > running nfs4 yet!
> >> >
> >> Well, it seems to be slowly catching on. I suspect that the Linux client
> >> mounting a Netapp is the most common use of it. Since it appears that they
> >> flip flopped w.r.t. whose bug this is, it has probably persisted.
> >>
> >> It may turn out that the Linux client has been fixed or it may turn out
> >> that most servers allowed this "skip-by-1" even though David Noveck (one
> >> of the main authors of the protocol) seems to agree with me that it should
> >> not be allowed.
> >>
> >> It is possible that others have bumped into this, but it wasn't isolated
> >> (I wouldn't have guessed it, so it was good you pointed to the RedHat
> >> discussion) and they worked around it by reverting to NFSv3 or similar.
> >> The protocol is rather complex in this area and changed completely for
> >> NFSv4.1, so many have also probably moved onto NFSv4.1 where this won't
> >> be an issue. (NFSv4.1 uses sessions to provide exactly once RPC semantics
> >> and doesn't use these seqid fields.)
> >>
> >> This is all just mho, rick
> >>
> >> > On Thu, Jul 2, 2015 at 1:59 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> >> >
> >> > > Julian Elischer wrote:
> >> > > > On 7/2/15 9:09 AM, Rick Macklem wrote:
> >> > > > > I am going to post to nfsv4@ietf.org to see what they say. Please
> >> > > > > let me know if Xin Li's patch resolves your problem, even though I
> >> > > > > don't believe it is correct except for the UINT32_MAX case. Good
> >> > > > > luck with it, rick
> >> > > > and please keep us all in the loop as to what they say!
> >> > > >
> >> > > > the general N+2 bit sounds like bullshit to me.. it's always N+1 in a
> >> > > > number field that has a bit of slack at wrap time (probably due to
> >> > > > some ambiguity in the original spec).
> >> > > >
> >> > > Actually, since N is the lock op already done, N + 1 is the next lock
> >> > > operation in order. Since lock ops need to be strictly ordered,
> >> > > allowing N + 2 (which means N + 2 would be done before N + 1) makes
> >> > > no sense.
> >> > >
> >> > > I think the author of the RFC meant that N + 2 or greater fails, but
> >> > > it was poorly worded.
> >> > >
> >> > > I will pass along whatever I get from nfsv4@ietf.org. (There is an
> >> > > archive of it somewhere, but I can't remember where. ;-)
> >> > >
> >> > > rick
> >> > > _______________________________________________
> >> > > freebsd-fs@freebsd.org mailing list
> >> > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> >> > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> >> > >
> >> >
> >>
> >
> >

-- 
-------------------------------------------------------------------------
Graham Allan - allan@physics.umn.edu - gta@umn.edu - (612) 624-5040
School of Physics and Astronomy - University of Minnesota
-------------------------------------------------------------------------
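
For readers following the seqid argument in the quoted thread (after
lock op N, only N + 1 is acceptable, and N + 2 must fail), here is a
minimal sketch of that ordering check. It is an illustration only, not
the FreeBSD nfsd code or Xin Li's patch. The function names and the
string results are hypothetical, treating a repeated seqid as a replay
is an assumption about retransmit handling, and the plain 32-bit
wraparound is also an assumption: the thread mentions a UINT32_MAX
special case without spelling out its exact rule.

```python
UINT32_MAX = 0xFFFFFFFF  # seqid is an unsigned 32-bit counter


def next_seqid(last: int) -> int:
    """Return the one seqid accepted after `last`.

    Assumption: simple modulo-2**32 wraparound. The real wrap rule at
    UINT32_MAX may differ; the thread does not spell it out.
    """
    return (last + 1) & UINT32_MAX


def classify_seqid(last: int, received: int) -> str:
    """Classify a received seqid against the last one processed (N)."""
    if received == next_seqid(last):
        return "ok"          # N + 1: the next operation in strict order
    if received == last:
        return "replay"      # assumed: retransmission of the last request
    return "bad_seqid"       # anything else, including N + 2 ("skip-by-1")


if __name__ == "__main__":
    for received in (5, 6, 7):
        print(received, classify_seqid(5, received))
```

Under this strict reading, a client that skips ahead by one after N
lands in the "bad_seqid" branch, which corresponds to the
NFS4ERR_BAD_SEQID errors discussed above.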

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150716235022.GF32479>
