From owner-freebsd-fs@FreeBSD.ORG Wed Sep 19 06:17:05 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 63988106566C for ; Wed, 19 Sep 2012 06:17:05 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id D676F8FC15 for ; Wed, 19 Sep 2012 06:17:04 +0000 (UTC) Received: from skuns.kiev.zoral.com.ua (localhost [127.0.0.1]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id q8J6HCUA003288; Wed, 19 Sep 2012 09:17:12 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5) with ESMTP id q8J6H0sA042206; Wed, 19 Sep 2012 09:17:00 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5/Submit) id q8J6Gxrf042205; Wed, 19 Sep 2012 09:16:59 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 19 Sep 2012 09:16:59 +0300 From: Konstantin Belousov To: Rick Macklem Message-ID: <20120919061659.GS37286@deviant.kiev.zoral.com.ua> References: <20120918085941.GZ37286@deviant.kiev.zoral.com.ua> <21418398.765673.1347975294365.JavaMail.root@erie.cs.uoguelph.ca> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="2CmXnuBWhlSJqbcw" Content-Disposition: inline In-Reply-To: <21418398.765673.1347975294365.JavaMail.root@erie.cs.uoguelph.ca> User-Agent: Mutt/1.5.21 (2010-09-15) X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.0 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: FS List Subject: Re: testing/review of atomic export update patch X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Sep 2012 06:17:05 -0000 --2CmXnuBWhlSJqbcw Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Sep 18, 2012 at 09:34:54AM -0400, Rick Macklem wrote: > Konstantin Belousov wrote: > > On Mon, Sep 17, 2012 at 05:32:44PM -0400, Rick Macklem wrote: > > > Konstantin Belousov wrote: > > > > On Sun, Sep 16, 2012 at 05:41:25PM -0400, Rick Macklem wrote: > > > > > Hi, > > > > > > > > > > There is a simple patch at: > > > > > http://people.freebsd.org/~rmacklem/atomic-export.patch > > > > > that can be applied to a kernel + mountd, so that the new > > > > > nfsd can be suspended by mountd while the exports are being > > > > > reloaded. It adds a new "-S" flag to mountd to enable this. > > > > > (This avoids the long standing bug where clients receive ESTALE > > > > > replies to RPCs while mountd is reloading exports.) > > > > > > > > This looks simple, but also somewhat worrisome. What would happen > > > > if the mountd crashes after nfsd suspension is requested, but > > > > before > > > > resume was performed ? > > > > > > > > Might be, mountd should check for suspended nfsd on start and > > > > unsuspend > > > > it, if some flag is specified ? > > > Well, I think that happens with the patch as it stands. > > > > > > suspend is done if the "-S" option is specified, but that is a no op > > > if it is already suspended. The resume is done no matter what flags > > > are provided, so mountd will always try and do a "resume". > > > --> get_exportlist() is always called when mountd is started up and > > > it does the resume unconditionally when it completes. > > > If mountd repeatedly crashes before completing get_exportlist() > > > when it is started up, the exports will be all messed up, so > > > having the nfsd threads suspended doesn't seem so bad for this > > > case (which hopefully never happens;-). > > > > > > Both suspend and resume are just no ops for unpatched kernels. > > > > > > Maybe the comment in front of "resume" should explicitly explain > > > this, instead of saying resume is harmless to do under all > > > conditions? > > > > > > Thanks for looking at it, rick > > I see. > >=20 > > My another note is that there is no any protection against parallel > > instances of suspend/resume happen. For instance, one thread could set > > suspend_nfsd =3D 1 and be descheduled, while another executes resume > > code sequence meantime. Then it would see suspend_nfsd !=3D 0, while > > nfsv4rootfs_lock not held, and tries to unlock it. It seems that > > nfsv4_unlock would silently exit. The suspending thread resumes, > > and obtains the lock. You end up with suspend_nfsd =3D=3D 0 but lock he= ld. > Yes. I had assumed that mountd would be the only thing using these syscal= ls > and it is single threaded. (The syscalls can only be done by root for the > obvious reasons.;-) >=20 > Maybe the following untested version of the syscalls would be better, sin= ce > they would allow multiple concurrent calls to either suspend or resume. > (There would still be an indeterminate case if one thread called resume > concurrently with another few calling suspend, but that is unavoidable, > I think?) >=20 > Again, thanks for the comments, rick > --- untested version of syscalls --- > } else if ((uap->flag & NFSSVC_SUSPENDNFSD) !=3D 0) { > NFSLOCKV4ROOTMUTEX(); > if (suspend_nfsd =3D=3D 0) { > /* Lock out all nfsd threads */ > igotlock =3D 0; > while (igotlock =3D=3D 0 && suspend_nfsd =3D=3D 0) { > igotlock =3D nfsv4_lock(&nfsv4rootfs_lock, 1, > NULL, NFSV4ROOTLOCKMUTEXPTR, NULL); > } > suspend_nfsd =3D 1; > } > NFSUNLOCKV4ROOTMUTEX(); > error =3D 0; > } else if ((uap->flag & NFSSVC_RESUMENFSD) !=3D 0) { > NFSLOCKV4ROOTMUTEX(); > if (suspend_nfsd !=3D 0) { > nfsv4_unlock(&nfsv4rootfs_lock, 0); > suspend_nfsd =3D 0; > } > NFSUNLOCKV4ROOTMUTEX(); > error =3D 0; > } =46rom the cursory look, this variant is an improvement, mostly by taking the interlock before testing suspend_nfsd, and using the while loop. Is it possible to also make the sleep for the lock interruptible ? So that blocked mountd could be killed by a signal ? --2CmXnuBWhlSJqbcw Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (FreeBSD) iEYEARECAAYFAlBZY1sACgkQC3+MBN1Mb4jzjgCfVE5TsFuaN7NItix9xLNCMjam eKkAn2IsumdW+ckxb4xAGXZorptD5njG =J1Hs -----END PGP SIGNATURE----- --2CmXnuBWhlSJqbcw--