Date: Tue, 13 Dec 2011 02:46:36 +0000
From: John
To: freebsd-fs@freebsd.org, freebsd-current@freebsd.org
Subject: multihomed nfs server - NLM lock failure on additional interfaces
Message-ID: <20111213024636.GA47103@FreeBSD.org>

Hi Folks,

I have a 9-prerelease system where I've been testing nfs/zfs. The system had been working quite well until I moved the server to a multihomed configuration.

Given the following:

    nfsd: master (nfsd)
    nfsd: server (nfsd)
    /usr/sbin/rpcbind -h 10.24.6.38 -h 172.1.1.2 -h 172.21.201.1 -h 172.21.202.1 -h 172.21.203.1 -h 172.21.204.1 -h 172.21.205.1 -h 10.24.6.34 -h 10.24.6.33
    /usr/sbin/mountd -r -l -h 10.24.6.38 -h 172.1.1.2 -h 172.21.201.1 -h 172.21.202.1 -h 172.21.203.1 -h 172.21.204.1 -h 172.21.205.1 -h 10.24.6.34 -h 10.24.6.33
    /usr/sbin/rpc.statd -h 10.24.6.38 -h 172.1.1.2 -h 172.21.201.1 -h 172.21.202.1 -h 172.21.203.1 -h 172.21.204.1 -h 172.21.205.1 -h 10.24.6.34 -h 10.24.6.33
    /usr/sbin/rpc.lockd -h 10.24.6.38 -h 172.1.1.2 -h 172.21.201.1 -h 172.21.202.1 -h 172.21.203.1 -h 172.21.204.1 -h 172.21.205.1 -h 10.24.6.34 -h 10.24.6.33

10.24.6.38 is the default interface on 1G. The 172 nets are 10G, connected to compute systems.

    ifconfig_bce0=' inet 10.24.6.38 netmask 255.255.0.0 -rxcsum -txcsum'       _c='physical addr which never changes'
    ifconfig_bce1=' inet 172.1.1.2 netmask 255.255.255.0'                      _c='physical addr on crossover cable'
    ifconfig_cxgb2='inet 172.21.21.129 netmask 255.255.255.0'                  _c='physical backside 10g compute net'
    ifconfig_cxgb3='inet 172.21.201.1 netmask 255.255.255.0 mtu 9000'          _c='physical backside 10g compute net'
    ifconfig_cxgb6='inet 172.21.202.1 netmask 255.255.255.0 mtu 9000'          _c='physical backside 10g compute net'
    ifconfig_cxgb8='inet 172.21.203.1 netmask 255.255.255.0 mtu 9000'          _c='physical backside 10g compute net'
    ifconfig_cxgb4='inet 172.21.204.1 netmask 255.255.255.0 mtu 9000'          _c='physical backside 10g compute net'
    ifconfig_cxgb0='inet 172.21.205.1 netmask 255.255.255.0 mtu 9000'          _c='physical backside 10g compute net'

10.24.6.34 and 10.24.6.33 are alias addresses for the system.

    Destination        Gateway            Flags    Refs      Use  Netif Expire
    default            10.24.0.1          UGS         0     1049   bce0

The server works correctly (and quite well) for both udp & tcp mounts. Basically, all nfs traffic is great!

However, locking only works for clients connected to the 10.24.6.38 interface.
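
A quick sanity check on the listing above is to compare the addresses given to the daemons with -h against the addresses on the ifconfig lines. The following is a minimal Python sketch of that comparison, not a diagnosis of the lock failure. The names DAEMON_FLAGS, RC_CONF, bound_addresses, configured_addresses and compare are made up for illustration, and the two strings are copied from the listing with the _c comments dropped.

```python
import re

# -h arguments shared by rpcbind, mountd, rpc.statd and rpc.lockd
# (copied from the process listing; assumed identical for all four).
DAEMON_FLAGS = (
    "-h 10.24.6.38 -h 172.1.1.2 -h 172.21.201.1 -h 172.21.202.1 "
    "-h 172.21.203.1 -h 172.21.204.1 -h 172.21.205.1 "
    "-h 10.24.6.34 -h 10.24.6.33"
)

# ifconfig_* lines from the listing, without the _c comment variables.
RC_CONF = """\
ifconfig_bce0=' inet 10.24.6.38 netmask 255.255.0.0 -rxcsum -txcsum'
ifconfig_bce1=' inet 172.1.1.2 netmask 255.255.255.0'
ifconfig_cxgb2='inet 172.21.21.129 netmask 255.255.255.0'
ifconfig_cxgb3='inet 172.21.201.1 netmask 255.255.255.0 mtu 9000'
ifconfig_cxgb6='inet 172.21.202.1 netmask 255.255.255.0 mtu 9000'
ifconfig_cxgb8='inet 172.21.203.1 netmask 255.255.255.0 mtu 9000'
ifconfig_cxgb4='inet 172.21.204.1 netmask 255.255.255.0 mtu 9000'
ifconfig_cxgb0='inet 172.21.205.1 netmask 255.255.255.0 mtu 9000'
"""

IPV4 = r"(\d+\.\d+\.\d+\.\d+)"


def bound_addresses(flags):
    """Return the set of IPv4 addresses that follow a -h flag."""
    return set(re.findall(r"-h\s+" + IPV4, flags))


def configured_addresses(rc_conf):
    """Return the set of IPv4 addresses that follow the 'inet' keyword."""
    return set(re.findall(r"\binet\s+" + IPV4, rc_conf))


def compare(flags, rc_conf):
    """Return (configured but not bound, bound but not configured)."""
    bound = bound_addresses(flags)
    configured = configured_addresses(rc_conf)
    return sorted(configured - bound), sorted(bound - configured)


if __name__ == "__main__":
    unbound, unconfigured = compare(DAEMON_FLAGS, RC_CONF)
    print("on an interface but not passed with -h:", unbound)
    print("passed with -h but not on an ifconfig line:", unconfigured)
```

With the listing as posted, the first list contains only 172.21.21.129 (cxgb2) and the second contains the two aliases, 10.24.6.34 and 10.24.6.33, which is expected since aliases are not on the ifconfig lines shown. Locking fails on addresses that are in the -h list, so this check alone does not explain the problem.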

Tcpdump captures from a good and a bad run:

    http://www.freebsd.org/~jwd/lockgood.pcap
    http://www.freebsd.org/~jwd/lockbad.pcap

Basically, the clients (both FreeBSD & Linux) query the server's rpcbind for the address of the NLM, which is returned correctly. In the good run, the NLM is then called. In the bad run, it is not.

I've started digging through code, but I do not claim to be an rpc expert. If anyone has suggestions I would appreciate any pointers.

Thanks!
John