Date:      Wed, 27 Jul 2011 07:50:41 -0400
From:      Gary Palmer <gpalmer@freebsd.org>
To:        Paul Keusemann <pkeusem@visi.com>
Cc:        freebsd-net@freebsd.org
Subject:   Re: Debugging dropped shell connections over a VPN
Message-ID:  <20110727115041.GE1339@in-addr.com>
In-Reply-To: <4E2F08E4.2070100@visi.com>
References:  <4E159C5A.5090702@visi.com> <13D65A4C-F874-4970-A070-AA0392416680@mac.com> <4E1C9FEA.2080608@visi.com> <20110720201502.GA37199@in-addr.com> <4E2EAAD7.6040906@visi.com> <20110726130549.GD1339@in-addr.com> <4E2F08E4.2070100@visi.com>

On Tue, Jul 26, 2011 at 01:35:16PM -0500, Paul Keusemann wrote:
> On 07/26/11 08:05, Gary Palmer wrote:
> >On Tue, Jul 26, 2011 at 06:53:59AM -0500, Paul Keusemann wrote:
> >>Again, sorry for the sluggish response.
> >>
> >>On 07/20/11 15:15, Gary Palmer wrote:
> >>>On Tue, Jul 12, 2011 at 02:26:34PM -0500, Paul Keusemann wrote:
> >>>>On 07/07/11 14:39, Chuck Swiger wrote:
> >>>>>On Jul 7, 2011, at 4:45 AM, Paul Keusemann wrote:
> >>>>>>My setup is something like this:
> >>>>>>- My local network is a mix of AIX, HP-UX, Linux, FreeBSD and Solaris
> >>>>>>machines running various OS versions.
> >>>>>>- My gateway / firewall  machine is running FreeBSD-8.1-RELEASE-p1 
> >>>>>>with
> >>>>>>ipfw, nat and racoon for the firewall and VPN.
> >>>>>>
> >>>>>>The problem is that rlogin, ssh and telnet connections over the VPN 
> >>>>>>get
> >>>>>>dropped after some period of inactivity.
> >>>>>You're probably getting NAT timeouts against the VPN connection if it 
> >>>>>is
> >>>>>left idle.  racoon ought to have a config setting called natt_keepalive
> >>>>>which sends periodic keepalives-- see whether that's disabled.
> >>>>>
> >>>>>Regards,
> >>>>Thanks for the suggestions Chuck, sorry it's taken so long to respond
> >>>>but I had to reconfigure and rebuild my kernel to enable IPSEC_NAT_T in
> >>>>order to try this out.
> >>>>
> >>>>One thing that I did not explicitly mention before is that I am routing
> >>>>a network over the VPN.
> >>>Hi Paul,
> >>>
> >>>Even if you are not being NAT'd on the VPN there may be a firewall (or
> >>>other active network component like a load balancer) with an
> >>>overflowing state table somewhere at the remote end.  We see this
> >>>frequently where I work with customer networks and the 
> >>>firewall/VPN/network
> >>>admin denies that its a time out issue so there is likely some device in
> >>>the network that has a state table and if the connection is idle for a
> >>>few minutes it gets dropped.
> >>Hmmm,  this seems likely.  Have you had any luck in finding the culprit
> >>and resolving the problem?
> >Unfortunately no.  We know the problem exists but as a vendor we have
> >very little success in getting the customer to identify the problematic
> >device inside their network as it only seems to affect our connections
> >to them when we are helping them with problems, so there is almost
> >always something more important going on and the timeout issue gets put
> >on the back burner and forgotten.  We've worked around it in some
> >places by using the ssh 'ServerAliveInterval' directive to make ssh
> >send packets and keep the session open even if we're idle, but that
> >doesn't always work.
> 
> OK, I found the ClientAliveInterval, and ClientAliveCountMax setting in 
> the ssh_config man page.  I assume these are what you are referring to.  
> I tried setting ClientAliveInterval to 15 seconds with 
> ClientAliveCountMax set to 3 and this seems to help.  I've only tried 
> this a couple of times but I have seen an ssh session stay alive for 
> over an hour.  The bad news is that the sessions are still getting 
> dropped, at least now I know when it happens.  Now I'm getting the 
> following message:
> 
>     Received disconnect from 10.64.20.69: 2: Timeout, your session not 
> responding.
> 
> From a quick perusal of the openssh source, it is not obvious whether 
> this message is coming from the client or the server side.   Initially, 
> because the keep alive timer is a server side setting, I assumed the 
> message was coming from the server side but if the session is not 
> responding how is the message getting to the client?  If it is a client 
> side problem, then I have much more flexibility to fix.  All I can do is 
> whine about server side problems.


Hi Paul,

ServerAliveInterval is actually a client setting.  For example, putting
this in your ~/.ssh/config file

host *
	ServerAliveInterval 15

will set the client to ping the server every 15 seconds and try to
keep the connection alive.  You can replace '*' with a specific host
name or pattern if you want to be more targeted in your configuration.
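
As a rough sketch of a more targeted setup (the host pattern
'*.example.com' is a made-up placeholder, and the values are only
illustrative, not a recommendation):

host *.example.com
	ServerAliveInterval 15
	ServerAliveCountMax 3

ServerAliveCountMax is the number of unanswered keepalives the client
tolerates before it gives up, so with these values the client should
drop the connection itself after roughly 45 seconds without a reply
from the server.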

I've never played with the server side settings for various reasons.

Regards,

Gary


