Date:      Fri, 12 Mar 2004 19:29:39 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        freebsd-hackers@freebsd.org
Subject:   NFS client side bug fixes that probably apply to FBsd-4 and 5.
Message-ID:  <200403130329.i2D3TdCv044838@apollo.backplane.com>

    I believe that #1 and #2 apply to FreeBSD-4.x and might apply to 5.x,
    and #3 probably applies to 5.x and might apply to 4.x.  #1 and #2 may
    or may not apply to 5.x depending on how you handle software interrupts,
    but I expect they might due to thread switching in the mutex code and
    preemption.

    Generally the symptoms of these bugs are a locked up NFS mount but an
    otherwise working system.

    The window of opportunity is fairly small for these in 4.x and they seem
    to have been around for a long time, but I expect it should be possible
    to trip over them occasionally even in 4.x.   I can trip them in DFly
    within an hour due to the larger window of opportunity in DFly (due in
    part to additional thread switches in the pru_send code).  I don't know
    about 5.x.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

dillon      2004/03/12 19:13:53 PST

DragonFly src repository

  Modified files:
    sys/vfs/nfs          nfs.h nfs_socket.c 
  Log:
  Fix a bunch of NFS races.  These races existed in FreeBSD 4.x but are more
  likely to occur now due to the additional thread switching that DragonFly
  performs when doing things like sending UDP packets.  Three bugs are
  being fixed:
  
  * nfs_request() adds the request to the nfs_timer queue before doing initial
    processing (e.g. transmission) of the request.  The initial transmission of
    the request will race between nfs_request and nfs_timer, potentially causing
    the congestion window calculation (nm_sent) to be bumped twice instead of
    once.  This eventually closes the congestion window permanently and
    causes the NFS mount to freeze.  (Additionally the request could be
    transmitted twice unnecessarily, also fixed).
  
  * Updates to rep->r_flags and nmp->nm_sent were not being properly protected
    against nfs_timer due to splsoftclock() being released too early.  All
    such accesses are now protected.
  
  * nfs_reply() depends on nfs_rcvlock() to do an interlock check to see if the
    request has already been replied to, but nfs_rcvlock() only does this if it
    cannot immediately get the receiver lock.  The problem is that the NFS
    code in between request transmission and nfs_reply() can block, potentially
    allowing a reply to be returned to another nfsiod.  The NFS receiver winds
    up getting stuck waiting for a reply that has already been returned.
    nfs_rcvlock() now unconditionally checks to see if the reply has already
    occurred before entering the loop.
  
  Revision  Changes    Path
  1.6       +9 -8      src/sys/vfs/nfs/nfs.h
  1.14      +37 -14    src/sys/vfs/nfs/nfs_socket.c
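
The effect of the first fix on the congestion accounting can be shown with a small userland model. This is only a sketch and not the committed patch: the struct, flag and function names below are hypothetical stand-ins for the kernel's request and mount state, the two "paths" run one after the other instead of racing, and the flag guard stands in for whatever ordering and spl protection the real change uses.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's per-request and per-mount state. */
#define MODEL_R_SENT     0x01   /* request is currently counted in nm_sent */
#define MODEL_CWND_SCALE 256    /* window cost of one outstanding request */

struct model_req   { int r_flags; };
struct model_mount { int nm_sent; };

/* Racy accounting: every caller charges the window unconditionally. */
static void charge_unguarded(struct model_mount *nmp, struct model_req *rep)
{
    rep->r_flags |= MODEL_R_SENT;
    nmp->nm_sent += MODEL_CWND_SCALE;
}

/* Guarded accounting: charge only if the request is not already counted. */
static void charge_guarded(struct model_mount *nmp, struct model_req *rep)
{
    if (rep->r_flags & MODEL_R_SENT)
        return;
    rep->r_flags |= MODEL_R_SENT;
    nmp->nm_sent += MODEL_CWND_SCALE;
}

/* A reply credits the window once, however often the send was charged. */
static void credit_on_reply(struct model_mount *nmp, struct model_req *rep)
{
    if (rep->r_flags & MODEL_R_SENT) {
        rep->r_flags &= ~MODEL_R_SENT;
        nmp->nm_sent -= MODEL_CWND_SCALE;
    }
    assert(nmp->nm_sent >= 0);
}

/*
 * Model one request whose initial send is accounted for by both the
 * request path and the timer path, and which is then answered.
 * Returns the window space still charged afterwards (0 = nothing leaked).
 */
static int leaked_after_one_request(int guarded)
{
    struct model_mount nmp = { 0 };
    struct model_req rep = { 0 };
    void (*charge)(struct model_mount *, struct model_req *) =
        guarded ? charge_guarded : charge_unguarded;

    charge(&nmp, &rep);          /* request path accounts for the send */
    charge(&nmp, &rep);          /* timer path accounts for it again */
    credit_on_reply(&nmp, &rep); /* reply arrives */
    return nmp.nm_sent;
}
```

In this model leaked_after_one_request(0) leaves one request's worth of window charged forever, while leaked_after_one_request(1) returns the window to zero. Repeat the unguarded case often enough and the window never reopens, which matches the frozen mount described in the log.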


http://www.dragonflybsd.org/cvsweb/src/sys/vfs/nfs/nfs.h.diff?r1=1.5&r2=1.6&f=h
http://www.dragonflybsd.org/cvsweb/src/sys/vfs/nfs/nfs_socket.c.diff?r1=1.13&r2=1.14&f=h
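
The third fix in the log above (the nfs_rcvlock() interlock) can be modelled the same way. Again this is a hedged sketch with hypothetical names, not the kernel code: the sleep is replaced by a return value, and the only point is where the "reply already arrived" check sits relative to the lock test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for the model; the kernel uses errno values. */
enum model_rcv_result {
    MODEL_GOT_LOCK,         /* caller now owns the receive lock */
    MODEL_ALREADY_REPLIED,  /* reply is in hand, no lock needed */
    MODEL_WOULD_SLEEP       /* real code would sleep waiting for the lock */
};

struct model_mount { int rcv_locked; };
struct model_req   { const void *r_mrep; };  /* non-NULL once replied */

/* Old behaviour: the reply check happens only when the lock is contended. */
static enum model_rcv_result
rcvlock_check_when_contended(struct model_mount *nmp, struct model_req *rep)
{
    assert(nmp != NULL && rep != NULL);
    if (nmp->rcv_locked) {
        if (rep->r_mrep != NULL)
            return MODEL_ALREADY_REPLIED;
        return MODEL_WOULD_SLEEP;
    }
    nmp->rcv_locked = 1;
    return MODEL_GOT_LOCK;
}

/* New behaviour: check for an existing reply before looking at the lock. */
static enum model_rcv_result
rcvlock_check_always(struct model_mount *nmp, struct model_req *rep)
{
    assert(nmp != NULL && rep != NULL);
    if (rep->r_mrep != NULL)
        return MODEL_ALREADY_REPLIED;
    if (nmp->rcv_locked)
        return MODEL_WOULD_SLEEP;
    nmp->rcv_locked = 1;
    return MODEL_GOT_LOCK;
}

/*
 * Scenario from the log: the requester blocked after transmitting, another
 * nfsiod received the reply on its behalf, and the receive lock is free
 * again by the time the requester asks for it.
 */
static enum model_rcv_result late_requester(int check_always)
{
    static const char reply[] = "reply";
    struct model_mount nmp = { 0 };
    struct model_req rep = { reply };

    return check_always ? rcvlock_check_always(&nmp, &rep)
                        : rcvlock_check_when_contended(&nmp, &rep);
}
```

With the old ordering late_requester(0) takes the lock and would go on to wait for a reply that has already been delivered. With the unconditional check late_requester(1) reports that the reply is already there.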



