Date: Mon, 2 Nov 2009 10:37:24 -0500 (EST)
From: Rick Macklem
To: Olaf Seibert
Cc: danny@cs.huji.ca.il, dfr@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: 8.0-RC1 NFS client timeout issue

On Mon, 2 Nov 2009, Olaf Seibert wrote:

>> Although I think the patch does avoid sending the request on the
>> partially closed connection, it doesn't fix the "real problem",
>> so I don't know if it is worth testing?
>
> Well, I tested it anyway, just in case. It seems to work fine for me, so
> far.
>
Yes, I think the patch is ok, but it doesn't completely resolve the
reconnect issue. It's good to hear that it helps for your case.

> I don't see your extra RSTs either. Maybe that is because in my case the
> client used a different port number for the new connection. (Usually,
> this is controlled by the TCP option SO_REUSEADDR from ).
>
For my packet trace, it is using different port#s. The problem is that,
for some reason, it sends the RST from the new port# instead of the
port# for the old connection just closed via soclose().

I don't know why you don't see the extra RSTs, but consider yourself
lucky, since you should be ok without them. (It may simply be that your
server isn't Solaris10 --> a different TCP stack in it.) Do you happen
to know what your server is?

>> I'm hoping that the "Help TCP Wizards..." thread I just started
>> on freebsd-current comes up with something.
>>
>> At least I can reproduce the problem now. (For some reason, I have
>> to reboot the Solaris10 server before the problem appears for me.
>> I can't think why this matters, but that's networking for you:-)
>
> Maybe it depends on server load or something. This particular server is
> a central file server at a university, it may have some more pressure to
> terminate unused connections.
>
Or type of server (ie. not Solaris10). It definitely depends upon
timing in the client. (I'm about to try introducing a 1sec delay
before the soconnect() call and see if that makes the RSTs go away.
Not much of a fix, but...)

I now recall that I ran into a similar problem (although I didn't dig
into the packet traces then) when testing my Mac OS X 10 client, which
uses essentially the reconnect code from Mac OS X 10.4 Tiger. I "fixed"
it by adding a 1sec delay before the reconnect.

Thanks for helping with testing,
rick