Date:      Thu, 7 Sep 2006 00:18:56 -0500 (CDT)
From:      Mike Silbersack <silby@silby.com>
To:        Gleb Smirnoff <glebius@FreeBSD.org>
Cc:        cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/sys/netinet in_pcb.c tcp_subr.c tcp_timer.c tcp_var.h
Message-ID:  <20060907000939.J12826@odysseus.silby.com>
In-Reply-To: <20060906150129.GT40020@FreeBSD.org>
References:  <200609061356.k86DuZ0w016069@repoman.freebsd.org> <20060906091204.B6691@odysseus.silby.com> <20060906143204.GQ40020@FreeBSD.org> <20060906093553.L6691@odysseus.silby.com> <20060906150129.GT40020@FreeBSD.org>


On Wed, 6 Sep 2006, Gleb Smirnoff wrote:

> I think we should free the oldest tcptw entry in the case where we can't
> find the local endpoint. We can tell definitely that we can't find one
> only in in_pcbbind_setup() in the "do {} while (in_pcblookup_local)" cycle,
> where EADDRNOTAVAIL is returned. We can't definitely tell this in
> in_pcblookup_local() since we don't know whether tried port is the
> last one.
>
> The oldest tcptw entry can be taken simply from the ordered list, like
> tcp_timer_2msl_tw() does this.

That's something along the lines of what I was thinking.  However, I think 
it'll be slightly more complex than taking just the oldest entry from the 
list.  We could have time_wait states that are for sockets such as 
remoteip:ephemeralport <-> localip:80 and also 
localip:ephemeralport  <-> remoteip:80.  We'd have to find one of the ones 
of the second type to recycle.
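To make the distinction concrete, here is a minimal sketch of walking an age-ordered list and skipping entries of the first type. Everything in it is an assumption for illustration: `struct tw_entry` is a simplified stand-in (not the kernel's `struct tcptw`), `tw_pick_recyclable()` is a hypothetical helper, and "local port falls in the ephemeral range" is only a rough proxy for "this is one of our outbound connections".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified time_wait entry. The real kernel structure
 * holds far more state; only the fields needed here are modelled.
 */
struct tw_entry {
	uint16_t	 local_port;
	uint16_t	 remote_port;
	struct tw_entry	*next;		/* next entry, oldest to newest */
};

/*
 * Walk the list from the oldest entry and return the first one whose
 * local port lies in [first, last], i.e. an entry that is actually
 * holding one of our ephemeral ports. Entries such as
 * remoteip:ephemeralport <-> localip:80 are skipped, because freeing
 * them would not release a local ephemeral port.
 * Returns NULL if no such entry exists.
 */
static struct tw_entry *
tw_pick_recyclable(struct tw_entry *oldest, uint16_t first, uint16_t last)
{
	struct tw_entry *tw;

	assert(first <= last);
	for (tw = oldest; tw != NULL; tw = tw->next) {
		if (tw->local_port >= first && tw->local_port <= last)
			return (tw);
	}
	return (NULL);
}
```

Note that this is still a linear walk, so on a box dominated by inbound time_wait states it could scan a long way before finding a candidate.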

I think I know why my implementation went so wrong - I was testing the 
case where I had http_load (or was it apachebench?) connecting to apache 
on another machine.  The case I was trying to solve was where the http 
benchmark tool created all the time_wait sockets on the client, thereby 
preventing new connections from being made.  In that case, the heuristic 
would (probably) recycle the first socket it came upon, and be done.  In 
your case, it would recycle the first socket it came upon, but it would be 
one of the remoteip:ephemeralport <-> localip:80 sockets, which wouldn't 
help it at all.  Does that sound like what was happening?

(I haven't reviewed the code, and I'm speaking from memory, so I apologize 
if I have the details slightly off.)

> However, I don't like the idea of "finding" the free port at all. This
> makes connect()/bind() performance depend on the number of busy endpoints.
> Shouldn't we make an algorithm, where free endpoints are stored, and
> we don't need to _find_ one, we could just _take_ one?

That's an interesting question.  I guess right now the assumption is that 
you have 65535 ports, and very few of them are used, so it's cheaper to 
guess and see if one isn't used.  You, on the other hand, seem to have a 
large number in use, so things are quite different.  I guess you could 
make a port freelist.  That would also solve the problem of randomized 
ephemeral ports causing a port to be reused too quickly.  I'd be happy to 
review any such patch you could come up with in this area.
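As a rough illustration of the "take rather than find" idea, a freelist could be a FIFO ring of free ports: taking is O(1), and a released port goes to the tail, so it is not handed out again until every other free port has been used. This is only a sketch under stated assumptions: the names (`port_freelist`, `pfl_init`, `pfl_take`, `pfl_release`) and the 49152-65535 range are made up for the example, it keeps a single global list rather than one per local address, and it ignores locking.

```c
#include <assert.h>
#include <stdint.h>

#define	PORT_FIRST	49152u		/* assumed start of ephemeral range */
#define	PORT_COUNT	16384u		/* 49152..65535 inclusive */

/* Hypothetical FIFO freelist of ephemeral ports, stored as a ring. */
struct port_freelist {
	uint16_t ring[PORT_COUNT];
	uint32_t head;			/* index of the next port to hand out */
	uint32_t count;			/* number of free ports in the ring */
};

/* Fill the ring with every port in the range; all ports start free. */
static void
pfl_init(struct port_freelist *fl)
{
	uint32_t i;

	for (i = 0; i < PORT_COUNT; i++)
		fl->ring[i] = (uint16_t)(PORT_FIRST + i);
	fl->head = 0;
	fl->count = PORT_COUNT;
}

/* Take a free port in O(1). Returns 0 if no port is free. */
static uint16_t
pfl_take(struct port_freelist *fl)
{
	uint16_t port;

	if (fl->count == 0)
		return (0);
	port = fl->ring[fl->head];
	fl->head = (fl->head + 1) % PORT_COUNT;
	fl->count--;
	return (port);
}

/* Return a port to the tail, so it is reused as late as possible. */
static void
pfl_release(struct port_freelist *fl, uint16_t port)
{
	uint32_t tail;

	assert(fl->count < PORT_COUNT);
	tail = (fl->head + fl->count) % PORT_COUNT;
	fl->ring[tail] = port;
	fl->count++;
}
```

The obvious cost is that ports come out in a predictable order, so a real patch would need some way to keep the randomization property (shuffling the initial ring, for instance) while still delaying reuse.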

> M> With this code removed, are you not seeing the web frontends delaying new
> M> connections when they can't find a free port to use?
>
> No. We monitor the amount of entries in tcptw zone. It is the same
> as before. So the periodic cycle purges tcptw states at the same
> rate as in_pcblookup_local() was, except that it does this consuming
> less CPU time.

Ok, so you weren't actually running out of ephemeral ports like I was in 
the http benchmark tool scenario then.

Mike "Silby" Silbersack


