From owner-freebsd-current@FreeBSD.ORG Sat Nov 1 12:34:40 2003
Message-ID: <3FA41782.8FB1DFF8@mindspring.com>
From: Terry Lambert
To: Michal Mertl
References: <20031029183808.M99053@prg.traveller.cz> <200310300804.58296.sam@errno.com> <20031031151312.Y55560@prg.traveller.cz>
cc: Barney Wolff
cc: current@freebsd.org
Subject: Re: jumbograms (& em) & nfs a no go
Date: Sat, 01 Nov 2003 20:34:40 -0000
X-Original-Date: Sat, 01 Nov 2003 12:28:50 -0800

Michal Mertl wrote:
> On Fri, 31 Oct 2003, Terry Lambert wrote:
> > Michal Mertl wrote:
> > > I then left one computer at 4.9 and upgraded the other to 5.0. When I
> > > mount a partition from 5.0 machine I found out, that copying reliably
> > > works only from 5.0 to 4.9.
> > > The other way around I see messages 'em0:
> > > discard oversize frame (ether type 800 flags 3 len 67582 > max 6014)' on
> > > 5.0 and the copying stalls. On 4.9 machine I later see 'nfs server
> > > 10.0.0.2:/usr: not responding'. The interface is stuck for some time - can
> > > be revived by changing mtu back to 1500 and down/up sequence.
> >
> > Implies the sending host is not honoring the MTU restriction when
> > deciding whether or not to frag packets.
>
> Can you suggest what to do to find out what's really happening? I thought
> nfsd network part was mostly userland thus the same as ftpd (or better
> netperf) and should work.

No. Traditionally (except in Linux), nfsd is a userland thing that
calls a system call and never returns to user space. It exists in
order to provide a process context for use by blocking calls in the
kernel, specifically for use by tsleep(), wakeup(), and so on.

In more modern UNIX systems, it's a kernel thread, and has no user
space existence at all, or, on systems that will permit NFS to be
turned off, and don't have the ability/desire to hang the kernel
on/off state off the existence/nonexistence of active exports, it's
a stub that tells the kernel to run the kernel thread(s).

The easiest way to find out what's happening is to grep the BSD
sources for where the message is coming from, and then work back to
understand the code paths that permitted something that's 67582
bytes in size to get there in the first place.

Not looking at the absolutely newest sources, from memory, my
original comment was based on "it had to come from the driver that
way". This may or may not be a valid assumption, but it's at least
a starting hypothesis with a high probability.

BTW: It should be impossible to send jumbograms > 9K in size, since
the jumbogram specification requires that to be the top end limit.
However, it also requires that Tigon2 and Intel Gigabit ethernet
adapters be able to negotiate >1500 MTU's, up to the jumbogram size,
between them, and the Intel hardware never cooperated when I was
trying to get it to autonegotiate, and I always ended up falling
back to a 1500 MTU, unless I forced the issue with ifconfig.

I think at this point, you are going to have to look at the sources;
IMO, it's a problem in some code that calls the ether_output()
function directly with too large a packet, and since NFS doesn't
manually implement TCP, that's not it.

Hmmm. Is this maybe UDP? If so, the easiest fix is "don't use UDP";
FreeBSD's UDP fragment reassembly code sucks anyway, and gives an
excellent means of implementing a DoS attack on the target system's
available mbufs.

If it's UDP, and you insist on it working, you might want to make
sure that the packet goes through the UDP fragmentation and NFS
rsize/wsize limitation code.

-- Terry