Date: Thu, 15 May 2003 09:37:06 -0400 (EDT)
From: Robert Watson <robert@fledge.watson.org>
To: "Andrew P. Lentvorski, Jr."
Cc: Don Lewis, alfred@FreeBSD.org, current@FreeBSD.org
Subject: Re: rpc.lockd spinning; much breakage

On Thu, 15 May 2003, Andrew P. Lentvorski, Jr. wrote:

> > It looks like rpc.statd on the client needs to remember that it
> > requested the lock,
>
> That's not the purpose of rpc.statd.  rpc.statd is only recording
> locks for server/client crash recovery.  It should not get involved in
> cancel message problems.  NFS is supposed to be stateless.  Any state
> required for locking is solely the responsibility of rpc.lockd.
> Putting the state in rpc.statd just pushes the problem around without
> getting rid of it.

Er, yes.  That's what I meant to write: rpc.lockd. :-)

> In fact, as currently written, I'm pretty sure that rpc.statd does not
> work correctly anyway.

I'm still making my way through the XNFS spec, and will take a look at
rpc.statd when I'm done.  rpc.lockd is becoming a lot clearer to me with
deeper reading of the spec :-).

> > ... It's not clear to me how that should be accomplished: perhaps
> > when it tries to wake up the process and discovers it is missing, it
> > should do the cleanup then, or if the lock attempt is aborted early
> > due to a signal, a further message should be sent from the kernel to
> > the userland rpc.lockd to notify it that the lock instance is no
> > longer of interest.  Note that if we're only using the pid to
> > identify a process, not a pid and some sort of generation number,
> > there's the potential for pid reuse and a resulting race.
>
> One solution would be for the client kernel to maintain all locks
> (UFS, NFS, SMB, whatever) in one area/data structure and then delegate
> the appropriate specific actions as the signals come in.
>
> Another alternative is that rpc.lockd must register a kevent for every
> process which requests a lock in NFS so that it gets notified if the
> process gets terminated.
>
> I have no idea which would be the better/easier solution.  freebsd-fs
> has been notably silent on this issue.

I suspect the "easier" solution is to continue to work with rpc.lockd in
its current structure and adapt it to be aware of the additional events
of interest.
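For reference, the kevent(2) registration Andrew describes would look
roughly like the following.  This is only a sketch: watch_locking_pid()
and the lock_req cookie are names I've made up for illustration, and
the daemon's main loop would still have to collect the NOTE_EXIT events
and cancel the matching lock requests.

#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

/*
 * Ask the kernel to tell us when a process that has an NFS lock
 * request outstanding exits.  "kq" is the daemon's existing kqueue;
 * "lock_req" identifies the pending request so the event handler can
 * find it again when the NOTE_EXIT event is delivered.
 */
int
watch_locking_pid(int kq, pid_t pid, void *lock_req)
{
	struct kevent kev;

	EV_SET(&kev, pid, EVFILT_PROC, EV_ADD | EV_ONESHOT,
	    NOTE_EXIT, 0, lock_req);

	/*
	 * If the process has already exited, kevent() fails with
	 * ESRCH, so the caller can at least detect the "died before we
	 * registered" race and abort the request immediately.
	 */
	return (kevent(kq, &kev, 1, NULL, 0, NULL));
}

Note that this does nothing for the pid reuse race mentioned above: by
the time kevent() runs, the pid may already name a different process.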
I believe, incidentally, that an open() grabbing an exclusive lock can
be interrupted by a signal while the process keeps running, in which
case the locking attempt is aborted but the process hasn't died.  So
even a kevent indicating the death of the process isn't sufficient.  I
need to test this assumption, but it strikes me as pretty likely.

One issue that concerns me is races -- registering for a kevent after
getting a request assumes that you can get it registered before the
process dies.  On SMP systems, this introduces a natural race.

I think it sounds like there are two parts to a solution:

(1) The kernel notifies rpc.lockd when a process aborts a lock attempt,
    which permits rpc.lockd to handle one of two cases:

    (a) The abort arrived before the lock was granted by the server, in
        which case we need to abort or release the distributed lock
        attempt -- I don't know, but am guessing, that these are the
        same operation due to the potential for "crossing in flight"
        with a response.

    (b) The abort arrived after the lock was granted by the server but
        before the kernel was notified of the grant, in which case we
        release the lock.

I think it would also be useful to handle:

(2) When no process is available to accept a lock response, that lock
    should be immediately released.

I'm still getting a grasp on the details of rpc.lockd, so I'm not too
clear on how much state is carried around internally.

In terms of the kernel side, it would not be hard to add an additional
case to the error handling for tsleep() to pick up EINTR, in which case
we stuff another message into the fifo reporting an abort by the
process.  In fact, we might be able to reuse the "unlock request" by
simply writing the same locking message back into the fifo with
F_UNLCK, and just make sure rpc.lockd knows that if it gets an unlock
while the lock event is in progress, it should do the right thing.  A
rough sketch of what I have in mind is attached below my signature.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories
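The sketch, for concreteness.  tsleep(9) and the struct flock fields
are real interfaces; LOCKD_MSG here is a cut-down stand-in for the real
fifo message (which also carries the file handle, credentials, and so
on), and lockd_fifo_write() is a made-up name for the existing "write a
message down the fifo" step in the NFS client lock path:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/fcntl.h>

/* Cut-down stand-in for the message rpc.lockd reads from the fifo. */
typedef struct lockd_msg {
	pid_t		lm_pid;		/* requesting process */
	struct flock	lm_fl;		/* the lock being requested */
} LOCKD_MSG;

/* Stand-in for the existing routine that queues a message to the fifo. */
extern int	lockd_fifo_write(LOCKD_MSG *msg);

/*
 * Wait for rpc.lockd's answer after the lock request has been written
 * to the fifo.  If a signal interrupts the wait, rewrite the pending
 * request into the fifo as an unlock so rpc.lockd can cancel or
 * release it rather than leaving the server-side lock orphaned.
 */
static int
nfs_lock_wait(LOCKD_MSG *msg, void *wchan)
{
	int error;

	error = tsleep(wchan, PCATCH | PUSER, "lockd", 20 * hz);
	if (error == EINTR || error == ERESTART) {
		msg->lm_fl.l_type = F_UNLCK;
		(void)lockd_fifo_write(msg);
	}
	return (error);
}

This keeps the message format unchanged, so rpc.lockd only needs the
new rule that an F_UNLCK arriving for a request still in progress means
"abort or release, whichever the protocol state calls for".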