From owner-freebsd-bugs@FreeBSD.ORG Thu Sep 4 16:00:27 2003
Delivered-To: freebsd-bugs@hub.freebsd.org
Resent-Date: Thu, 4 Sep 2003 16:00:24 -0700 (PDT)
Resent-Message-Id: <200309042300.h84N0OkO054628@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Jonathan Lennox
Message-Id: <200309042254.h84MsxdA041659@cnr.cs.columbia.edu>
Date: Thu, 4 Sep 2003 18:54:59 -0400 (EDT)
From: Jonathan Lennox
To: FreeBSD-gnats-submit@FreeBSD.org
X-Send-Pr-Version: 3.113
Subject: kern/56461: FreeBSD client rpc.lockd incompatible with Linux server rpc.lockd
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: Jonathan Lennox
List-Id: Bug reports
X-List-Received-Date: Thu, 04 Sep 2003 23:00:27 -0000

>Number:         56461
>Category:       kern
>Synopsis:       FreeBSD client rpc.lockd incompatible with Linux server rpc.lockd
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Thu Sep 04 16:00:24 PDT 2003
>Closed-Date:
>Last-Modified:
>Originator:     Jonathan Lennox
>Release:        FreeBSD 5.1-RELEASE-p2 i386
>Organization:
Columbia University Computer Science
>Environment:
System: FreeBSD cnr.cs.columbia.edu 5.1-RELEASE-p2 FreeBSD 5.1-RELEASE-p2 #0: Wed Aug 27 22:24:11 EDT 2003 lennox@cnr.cs.columbia.edu:/usr/obj/usr/src/sys/CNR i386
>Description:
Linux's implementation of NFS NLM locks is buggy: it doesn't support
lock cookies longer than 8 bytes in size.  See the comment on the
definition of 'struct nlm_cookie': "NLM cookies. Technically they can
be 1K, Nobody uses over 8 bytes however."

Unfortunately, this is actually "nobody" except FreeBSD 5.x, which uses
16-byte cookies.  As a result, any attempt by a FreeBSD client to lock
an NFS-mounted file from a Linux server results in the process on the
FreeBSD client hanging, unkillably.

Getting this fixed in Linux will probably be difficult -- after all, it
doesn't inconvenience *Linux* users.
Moreover, since this hasn't been fixed as of Linux 2.6-test, any
server-side fix is going to take a *long* time to be reliably deployed.

As such, I'm afraid that in order to have successful interoperation
with Linux NFS servers, the FreeBSD NFS lock client code needs to be
modified to send only 8-byte NLM cookies.

The patch I've attached below is a quick-and-dirty fix, as recommended
by Dan Nelson on freebsd-hackers on 29 April 2003.  However, it loses
functionality, since the protection against PID recycling is disabled.

A proper fix would be either to somehow compress all three pieces of
information -- pid, pid_start, and msg_seq -- into eight bytes
(difficult); maintain an in-kernel table mapping an eight-byte sequence
number to lockd_msg_ident; or find some other, smaller way of defending
against pid recycling.
>How-To-Repeat:
Make sure rpc.lockd and rpc.statd are running.
NFS-mount a filesystem from a Linux fileserver.
flock() the file.
Observe the flock()ing process hanging.
Notice that not even kill -9 will kill the process.
>Fix:
Apply the following patch, and rebuild rpc.lockd and your kernel.

--- nfs_lock.h.orig	Thu Sep  4 18:11:45 2003
+++ nfs_lock.h	Thu Sep  4 18:12:17 2003
@@ -49,12 +49,10 @@
 /*
  * This structure is used to uniquely identify the process which originated
  * a particular message to lockd.  A sequence number is used to differentiate
- * multiple messages from the same process.  A process start time is used to
- * detect the unlikely, but possible, event of the recycling of a pid.
+ * multiple messages from the same process.
  */
 struct lockd_msg_ident {
 	pid_t		pid;		/* The process ID. */
-	struct timeval	pid_start;	/* Start time of process id */
 	int		msg_seq;	/* Sequence number of message */
 };
 
--- nfs_lock.c.orig	Thu Sep  4 18:11:50 2003
+++ nfs_lock.c	Thu Sep  4 18:14:45 2003
@@ -117,7 +117,6 @@
 		p->p_nlminfo->pid_start = p->p_stats->p_start;
 		timevaladd(&p->p_nlminfo->pid_start, &boottime);
 	}
-	msg.lm_msg_ident.pid_start = p->p_nlminfo->pid_start;
 	msg.lm_msg_ident.msg_seq = ++(p->p_nlminfo->msg_seq);
 
 	msg.lm_fl = *fl;
@@ -257,8 +256,8 @@
 	 */
 	if (targetp->p_nlminfo == NULL ||
 	    ((ansp->la_msg_ident.msg_seq != -1) &&
-	      (timevalcmp(&targetp->p_nlminfo->pid_start,
-			&ansp->la_msg_ident.pid_start, !=) ||
+	      (/*timevalcmp(&targetp->p_nlminfo->pid_start,
+			&ansp->la_msg_ident.pid_start, !=) || */
 	       targetp->p_nlminfo->msg_seq != ansp->la_msg_ident.msg_seq))) {
 		PROC_UNLOCK(targetp);
 		return (EPIPE);
>Release-Note:
>Audit-Trail:
>Unformatted:
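
As a rough check of the cookie sizes mentioned in the Description, the
sketch below models lockd_msg_ident with fixed-width fields.  It
assumes the i386 layout, where pid_t, int, and both members of struct
timeval are 32 bits wide.  The names timeval32, ident_before, and
ident_after are hypothetical stand-ins for illustration, not the
kernel's own definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct timeval on i386: two 32-bit fields. */
struct timeval32 {
	int32_t tv_sec;
	int32_t tv_usec;
};

/* Models lockd_msg_ident before the patch (assumed i386 field widths). */
struct ident_before {
	int32_t pid;			/* stands in for pid_t */
	struct timeval32 pid_start;	/* start time of the process */
	int32_t msg_seq;		/* sequence number of the message */
};

/* Models lockd_msg_ident after the patch: pid_start removed. */
struct ident_after {
	int32_t pid;
	int32_t msg_seq;
};

size_t ident_before_size(void) { return sizeof(struct ident_before); }
size_t ident_after_size(void) { return sizeof(struct ident_after); }

/* Aborts if the modelled sizes differ from the ones in the report. */
void check_cookie_sizes(void)
{
	assert(ident_before_size() == 16);	/* too large for an 8-byte cookie */
	assert(ident_after_size() == 8);	/* fits an 8-byte cookie */
}
```

Calling check_cookie_sizes() from a small main() should pass silently
under these assumptions.  On a platform with a wider time_t the first
figure would differ, so this only illustrates the i386 case reported
here.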