From: Michael Weiser
To: Sean McNeil
Cc: freebsd-current@freebsd.org
Date: Sun, 18 Apr 2004 15:11:31 +0200
Subject: Re: nfs server issues
Message-ID: <20040418131131.GA660@weiser.dinsnail.net>
In-Reply-To: <1080953041.51638.11.camel@server.mcneil.com>

On Fri, Apr 02, 2004 at 04:44:02PM -0800, Sean McNeil wrote:

> Bingo! It looks like a problem with checksum offloading:
> ifconfig re0 -rxcsum -txcsum
> and now it no longer hangs. Good call! The NIC in question is:

Yesterday I realised that I have the same problem here with a -CURRENT
server and a linux-2.6.5 client. Reading works fine, but when writing big
files the NFS server gradually locks up after about 10MB have been
written. First the transfer blocks, but ssh and other services continue
to work. Later on the machine locks up severely: it is still pingable,
but the filesystem seems to be stuck somewhere. It starts with an ls
hanging when run on the server in the directory the client is writing
to; later the ssh session itself locks up.

> re0: port 0xa400-0xa4ff mem
> 0xdf004000-0xdf0040ff irq 12 at device 11.0 on pci1

The client has a VIA Rhine II onboard NIC and the server a 3Com 3c905B,
configured as follows:

xl0: flags=8843 mtu 1500
        options=9
        inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
        inet6 fe80::246:3ed:fe38:ea5c%xl0 prefixlen 64 scopeid 0x1
        inet6 fec0::1:1 prefixlen 112
        ether 00:50:04:38:ea:5c
        media: Ethernet autoselect (100baseTX)
        status: active

The first thing I tried was switching off receive checksum offloading,
but that didn't change anything. What does the VLAN_MTU option actually
do?

Then I tried switching back and forth between NFSv3 and NFSv2 as well as
TCP and UDP transport. No effect either.

Then I booted FreeSBIE-1.0 on the client and mounted the same filesystem
off the server. With that it actually worked fine and gave realistic
throughput with both UDP and TCP. But when I set the same rsize and wsize
values (8192/8192) as on Linux, the server got stuck again after 10MB.

After that I tried lowering the rsize/wsize on Linux as well.
With 1024/1024 and 2048/2048 there's no lockup, but throughput is only
~60KB/s and the server seems to sync every single write. With 4096/4096
and 8192/8192 I get the lockups again.

Can anyone give me a hint which option to tune to get this working
reliably? I'm fairly new to FreeBSD and lack the necessary insight to
debug this at the kernel/gdb level, but I'd be happy to give it a try if
someone gave me a starting point.

Thanks in advance for any insights into this one.
-- 
Micha
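
The symbolic names that ifconfig normally prints after flags= and
options= are missing from the xl0 output quoted in this message, so only
the hex values (8843 and 9) remain. As a rough aid for reading them, the
sketch below decodes those two values. The bit tables are a small subset
of the IFF_* and IFCAP_* constants as I recall them from FreeBSD's
<net/if.h>; treat them as an assumption and check them against the
header on the system in question. The helper name decode_bits is made up
for this example. For what it's worth, VLAN_MTU is generally understood
to mean that the driver can accept frames slightly larger than the
normal MTU so that an 802.1Q tag fits.

```python
# Subset of interface flag bits, assumed to match FreeBSD's <net/if.h>.
IFF_FLAGS = {
    0x0001: "UP",
    0x0002: "BROADCAST",
    0x0040: "RUNNING",
    0x0800: "SIMPLEX",
    0x8000: "MULTICAST",
}

# Subset of interface capability bits (the "options=" field), same assumption.
IFCAP_OPTIONS = {
    0x01: "RXCSUM",
    0x02: "TXCSUM",
    0x08: "VLAN_MTU",
}


def decode_bits(value_hex, table):
    """Return the names of all bits set in a hex string such as '8843'.

    Bits that are set but not present in the table are reported as a
    single trailing hex string so nothing is silently dropped.
    """
    value = int(value_hex, 16)
    names = [name for bit, name in sorted(table.items()) if value & bit]
    known = 0
    for bit in table:
        known |= bit
    unknown = value & ~known
    if unknown:
        names.append(hex(unknown))
    return names


# Values taken from the xl0 output quoted in the message.
print("flags  :", ",".join(decode_bits("8843", IFF_FLAGS)))
print("options:", ",".join(decode_bits("9", IFCAP_OPTIONS)))
```

If the tables are right, options=9 has the receive checksum bit set and
the transmit checksum bit clear, which fits the message's mention of
turning off only receive checksum offloading on xl0.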