Date:      Sat, 25 May 1996 19:33:39 -0500 (CDT)
From:      "Karl Denninger, MCSNet" <karl@mcs.com>
To:        davidg@Root.COM
Cc:        hackers@FreeBSD.ORG
Subject:   Re: Grrr.. is this is a FreeBSD problem (TIME_WAIT again)
Message-ID:  <m0uNTm7-000IDOC@venus.mcs.com>
In-Reply-To: <199605252237.PAA23150@Root.COM> from "David Greenman" at May 25, 96 03:37:47 pm

> >If the caller and callee are on DIFFERENT machines, I get no stale sockets.
> >This is reliable even if there are tens of new connections per minute.
> >
> >If the caller and callee are on the SAME machine, I get sockets in TIME_WAIT
> >for 2 minutes each (grrrr) which, if the traffic is heavy enough, eventually
> >blocks new connections for a few minutes until they clear up.  None of the 
> >sockets in TIME_WAIT has output or input pending; both counts show zero.
> >
> >This is a serious problem!
> 
>    Based on what you've said thus far, it's working as it is supposed to.
> There is a good discussion of the 2MSL wait ("TIME_WAIT") in "TCP/IP
> Illustrated Volume 1", page 242, by W. Richard Stevens. Depending on how
> your program handles its ports/connections, you might be able to use the
> SO_REUSEADDR socket option to avoid the problem. See page 244.
> 
> -DG
> 
> David Greenman
> Core-team/Principal Architect, The FreeBSD Project

I understand the purpose behind the 2MSL wait, but my understanding was that
this was imposed for non-cleanly closed connections to prevent a collision
(and possibly delivering data to the "wrong" client).

The problem is two-fold:

1)	The TIME_WAIT sockets are no big deal in and of themselves (I have
	lots of Mbuf resources on the machine under consideration).

2)	BUT, once a dozen or so of these sockets are sitting in
	TIME_WAIT, a NEW connection trying to reach the server (on the
	"connect()" call) gets blocked until one of the slots frees up!

For a high-volume transaction server this is murderous, as it means that the
processing limit is (maximum outstanding TIME_WAIT sockets / 2) transactions
per minute, since each socket sits in TIME_WAIT for 2 minutes.

That's awfully conservative.... and a problem.
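
The ceiling described above is simple arithmetic, and can be sketched as
follows.  The helper name and its parameters are hypothetical, introduced
only for illustration; the slot count is whatever limit the system actually
imposes, which is not established here.

```c
#include <assert.h>

/* Hypothetical helper: upper bound on new connections per minute when at
 * most `max_slots` sockets may sit in TIME_WAIT at once and each one
 * lingers there for `wait_minutes` (the 2MSL period). */
double max_transactions_per_minute(int max_slots, double wait_minutes)
{
    assert(max_slots >= 0 && wait_minutes > 0.0);
    return (double)max_slots / wait_minutes;
}
```

With the "dozen or so" slots mentioned earlier and a 2-minute wait,
max_transactions_per_minute(12, 2.0) gives a bound of 6 transactions per
minute.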

I am being very careful to ensure that (1) all the data is out of the pipe
before I close it and (2) the stream is correctly closed in both server and
client before either calls exit(). I would think that the system could
(should?) immediately release the socket under these circumstances to
prevent the blocking condition, or that the blocking condition shouldn't
exist in the first place.  It also doesn't explain why I don't get the same
condition when I have the server and client on different machines.
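
A minimal sketch of the kind of orderly close described in this paragraph,
assuming a connected stream socket: half-close the sending side, drain
whatever the peer still sends until EOF, then close.  The function name is
hypothetical and this is not the actual RADIUS code.  Note that under the
2MSL rule cited above, whichever side closes first still passes through
TIME_WAIT even when the close is clean.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: signal "no more data" to the peer, read until the
 * peer's EOF so nothing is left in the pipe, then release the descriptor.
 * Returns 0 on a clean EOF, -1 if any step failed. */
int close_after_drain(int fd)
{
    char buf[512];
    ssize_t n;
    int rc = 0;

    assert(fd >= 0);

    if (shutdown(fd, SHUT_WR) == -1) {
        rc = -1;                    /* could not half-close */
    } else {
        for (;;) {
            n = read(fd, buf, sizeof buf);
            if (n > 0)
                continue;           /* discard trailing data */
            if (n == 0)
                break;              /* peer finished sending */
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            rc = -1;                /* real read error */
            break;
        }
    }
    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```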

I can't use UDP because the data going across the link is encrypted with a
stream cipher that is position dependent (thus, a lost or corrupted packet
would lead to the cipher getting hosed).  

The application in question is a custom RADIUS hack which talks to our
authentication database.  RADIUS spawns off a new child (gack!) for each
request, which also plays hell with me.  Since these children are
asynchronous clients, I cannot use one connection for all of them due to
sequencing problems: each transaction may require multiple packets to be
delivered, depending on the particulars of the authentication request, and
muxing them into a persistent socket link could easily lead to confusion
between sessions.

What I don't understand is why a *new* caller, trying to connect FROM a
different source port to the same destination (with plenty of "listen" 
slots open on the server side) would block under these conditions.

That is contrary to my expectations and understanding of how the socket 
layer works.

SO_REUSEADDR doesn't do me any good, as I'm not trying to reuse a tuple 
(again, I am allowing the system to grab a random port on the client side).  
I would not expect to run into the problem I'm seeing here unless I ran out 
of available Mbuf resources OR if the "random" port happened to duplicate 
one which is in the TIME_WAIT state (which isn't happening).
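
For reference, a rough sketch of a client socket set up the way this
paragraph describes: SO_REUSEADDR is set, and the kernel is left to choose
the local port.  The function name is hypothetical, and binding to port 0 on
the loopback address is an assumption standing in for the same-machine case;
the real client may simply call connect() without binding at all.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket with SO_REUSEADDR set and let
 * the kernel pick the local port (bind to port 0 on loopback).  Returns
 * the descriptor, or -1 on error; the chosen port goes in *port_out. */
int make_client_socket(unsigned short *port_out)
{
    struct sockaddr_in local;
    socklen_t len = sizeof local;
    int on = 1;
    int fd;

    assert(port_out != NULL);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    /* Allow binding a local address still held by a TIME_WAIT socket. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) == -1)
        goto fail;

    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    local.sin_port = htons(0);      /* 0 = kernel chooses the port */
    if (bind(fd, (struct sockaddr *)&local, sizeof local) == -1)
        goto fail;

    /* Read back which port the kernel actually assigned. */
    if (getsockname(fd, (struct sockaddr *)&local, &len) == -1)
        goto fail;

    *port_out = ntohs(local.sin_port);
    return fd;

fail:
    close(fd);
    return -1;
}
```

Since the option only affects which local addresses may be bound, it would
not be expected to change anything when every connection already gets a
fresh kernel-chosen port.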

How do you get around this if you need to have many (dozens or more) TCP
connections made to a given client (and dropped) in a short period of time?

It sounds to me like the goal is impossible to meet.

--
Karl Denninger (karl@MCS.Net)| MCSNet - The Finest Internet Connectivity
Modem: [+1 312 248-0900]     | T1 from $600 monthly; speeds to DS-3 available
Voice: [+1 312 803-MCS1]     | 21 Chicagoland POPs, ISDN, 28.8, much more
Fax: [+1 312 248-9865]       | Email to "info@mcs.net" WWW: http://www.mcs.net/
ISDN - Get it here TODAY!    | Home of Chicago's only FULL Clarinet feed!


