Date:      Mon, 3 Mar 2008 23:37:19 -0600 (CST)
From:      Mike Silbersack <silby@silby.com>
To:        Fernando Gont <fernando@gont.com.ar>
Cc:        Rui Paulo <rpaulo@fnop.net>, freebsd-net@freebsd.org, Kevin Oberman <oberman@es.net>
Subject:   Re: Ephemeral port range (patch) 
Message-ID:  <20080303232430.Q43305@odysseus.silby.com>
In-Reply-To: <200803031454.m23EsVeZ006812@venus.xmundo.net>
References:  <Your message of "Sat, 01 Mar 2008 11:34:27 -0200." <200803011338.m21DcY9Z026418@venus.xmundo.net> <20080301224217.33F0A45047@ptavv.es.net> <200803020034.m220YJ6t018608@venus.xmundo.net> <20080303002815.U37933@odysseus.silby.com> <200803031454.m23EsVeZ006812@venus.xmundo.net>


On Mon, 3 Mar 2008, Fernando Gont wrote:

> (Shame on me... somehow your mail got stuck in my queue, and I didn't respond 
> to it).

No sweat, I've taken far longer to reply to your e-mails!

> While I haven't looked much at the scheme proposed by Amit, I think there's a 
> "flaw" with the algorithm: IP IDs need to be unique for {source IP, dest IP, 
> Protocol}. And the algorithm still keeps a *global* IP ID. That means you'll 
> cycle through the whole IP ID space when you probably didn't need to.

That is true.  I think we have a time/space tradeoff here, with Amit's 
algorithm taking more memory and less time than a hash-based algorithm. 
But I haven't benchmarked one against the other, so it is possible that a 
double-hash might win in both categories.

I think Robert Watson said something about investigating the issue of IP 
IDs more in the near future.  What I'd like to see (if possible) is that 
we use Amit's algorithm until we've established a connection with a host, 
then switch to per-IP state and just use linear IP IDs.  That would seem 
to provide the least overhead for high speed connections.
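
The hybrid idea above can be sketched roughly as follows. This is only an 
illustration, not FreeBSD code: the class and method names are hypothetical, 
and plain randomization stands in for Amit's algorithm, which is not 
reproduced here.

```python
import secrets


class HybridIpIdAllocator:
    """Random IP IDs for unknown peers, linear per-peer IDs once established."""

    def __init__(self):
        # Hypothetical per-destination state: remote IP -> next 16-bit IP ID.
        self._next_id = {}

    def mark_established(self, remote_ip):
        # Start each peer's counter at a random point so that one peer cannot
        # infer the IDs being sent to another.
        self._next_id.setdefault(remote_ip, secrets.randbelow(1 << 16))

    def forget(self, remote_ip):
        # Drop the per-peer state once the last connection goes away.
        self._next_id.pop(remote_ip, None)

    def next_ip_id(self, remote_ip):
        if remote_ip not in self._next_id:
            # No established connection yet. Stand-in for the global scheme
            # (assumption: simple randomization instead of Amit's algorithm).
            return secrets.randbelow(1 << 16)
        ip_id = self._next_id[remote_ip]
        # Linear increment with 16-bit wraparound: cheap on the fast path.
        self._next_id[remote_ip] = (ip_id + 1) & 0xFFFF
        return ip_id
```

The per-peer dictionary is the memory cost of this approach; the fast path 
for an established peer is a lookup and an increment.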

> That said, at least theoretically speaking, one could argue that there 
> shouldn't be a problem with simply randomizing the IP ID number. For 
> connection-oriented protocols, you should be doing PMTUD, and thus will not 
> care about the IP ID. If your packets are doing fragmentation, then on links 
> with large bandwidth-delay products you're already in trouble. For 
> connection-less transport protocols (e.g., UDP), while they usually do not 
> implement PMTUD, they also do not implement flow-control or congestion 
> control. So you are either sending data to a local system (e.g., in a LAN), 
> or you probably shouldn't be sending data that fast (and then you shouldn't 
> have problems with trivially randomizing the IP ID).

I have attempted to make that argument before, and it did not go over well 
with most people.  :)

I think the counter-argument was primarily centered around UDP NFS, which, 
as you pointed out, is almost always a losing case.

>> The double-hash concept sounds pretty good, but there's a major problem 
>> with it.  If an application does a bind() to get a local port before doing 
>> a connect(), you don't know the remote IP or the remote port.
>
> Yes, this is described in Section 3.5 of our I-D 
> (http://www.ietf.org/internet-drafts/draft-ietf-tsvwg-port-randomization-01.txt). 
> Our take is that in that scenario you could simply randomize the local port. 
> (i.e., implement the double-hash scheme, and fall back to trivial 
> randomization when you face this scenario).

Doh, I will try to read the ENTIRE paper next time before commenting.
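
For readers following along, here is a minimal sketch of a double-hash port 
selection with the fallback described in the quoted text. It is loosely 
modeled on the idea discussed in this thread, not copied from the draft: the 
port range, table size, hash function (keyed BLAKE2b), and all names are 
assumptions made for illustration.

```python
import hashlib
import secrets

MIN_PORT, MAX_PORT = 49152, 65535    # assumed ephemeral range
NUM_PORTS = MAX_PORT - MIN_PORT + 1
TABLE_LEN = 16                       # assumed size of the counter table

KEY_F = secrets.token_bytes(16)      # secret key for the offset hash
KEY_G = secrets.token_bytes(16)      # secret key for the table-index hash
counters = [0] * TABLE_LEN


def _keyed_hash(key, *fields):
    """Keyed hash of the given fields, returned as an integer."""
    data = "|".join(str(f) for f in fields).encode()
    digest = hashlib.blake2b(data, key=key, digest_size=8).digest()
    return int.from_bytes(digest, "big")


def select_port(local_ip, remote_ip=None, remote_port=None, in_use=frozenset()):
    """Pick a free local port, or return None if the range is exhausted."""
    if remote_ip is None or remote_port is None:
        # bind() before connect(): the remote end is unknown, so fall back
        # to trivial randomization (random start, then linear probing).
        start = secrets.randbelow(NUM_PORTS)
        for i in range(NUM_PORTS):
            port = MIN_PORT + (start + i) % NUM_PORTS
            if port not in in_use:
                return port
        return None

    # Double hash: one hash gives a per-destination offset, the other picks
    # which counter in the table this destination shares.
    offset = _keyed_hash(KEY_F, local_ip, remote_ip, remote_port)
    index = _keyed_hash(KEY_G, local_ip, remote_ip, remote_port) % TABLE_LEN
    for _ in range(NUM_PORTS):
        port = MIN_PORT + (offset + counters[index]) % NUM_PORTS
        counters[index] += 1
        if port not in in_use:
            return port
    return None
```

Calling select_port("10.0.0.1", "10.0.0.2", 80) twice yields consecutive 
ports (modulo the range), while select_port("10.0.0.1") takes the random 
fallback path.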

>> There's a related "feature" in the BSD TCP stack that all local ports are 
>> considered equal; even for applications that do a connect() call and 
>> specify a remote IP/port, we do not let them use the same local port to two 
>> different remote IPs at the same time.  This puts a limit on the total 
>> number of outgoing connections that one machine can have.
>
> mmm... I see. So this could limit the number of outgoing connections to about 
> (ephemeral_ports/TIME_WAIT). Any objections against changing this? At least 
> for outgoing connections (i.e., non-listening sockets), this shouldn't be the 
> case. I'd be interested in working on this issue...

I don't think anyone is actively working on that problem, so you won't be 
stepping on anyone's toes by looking into it.  Bring on the patches!
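
To make the limitation concrete, here is a toy comparison of the two 
policies. Both classes are hypothetical models for illustration and do not 
reflect the actual kernel data structures.

```python
class PortOnlyTable:
    """Models the behavior described above: a local port can be held by
    only one outgoing connection at a time, whatever the remote end is."""

    def __init__(self):
        self._ports = set()

    def try_connect(self, local_ip, local_port, remote_ip, remote_port):
        if local_port in self._ports:
            return False          # port already taken, even for a new remote
        self._ports.add(local_port)
        return True


class FourTupleTable:
    """Models the proposed change: only the full 4-tuple must be unique."""

    def __init__(self):
        self._conns = set()

    def try_connect(self, local_ip, local_port, remote_ip, remote_port):
        key = (local_ip, local_port, remote_ip, remote_port)
        if key in self._conns:
            return False          # exact same connection already exists
        self._conns.add(key)
        return True
```

With the first table, a second connect from local port 50000 to a different 
remote IP is refused; with the second, it succeeds, so the number of 
outgoing connections is no longer capped by the size of the port range.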

There's a piece of low-hanging fruit also in that area - we add incoming 
connections to the local port hash table, even though it seems unlikely 
that you are going to receive a connection from 1.1.1.1:50000->1.1.1.2:80 
and then connect from 1.1.1.2:80->1.1.1.1:50000.  Those unnecessary 
additions to the local port hash table would be nice to remove if you're 
investigating the related issues.

One thing you may or may not have noticed is that FreeBSD keeps TIME_WAIT 
sockets in a separate zone which has a limited size, so you will not have to 
worry too much about them clogging up all ephemeral ports.

-Mike


