Date:      Wed, 6 Feb 2019 15:18:44 +0000
From:      David King <king.c.david@googlemail.com>
To:        Paul <devgs@ukr.net>
Cc:        freebsd-net@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: Request for more intelligent local port allocation algorithm
Message-ID:  <CAGiBYGnZWrN2=U=VDbsn-t3ivOyUcaFL65oTT2CUjM_ExG0rUA@mail.gmail.com>
In-Reply-To: <1549461051.318520353.gg4fwwj8@frv39.fwdcdn.com>
References:  <1549461051.318520353.gg4fwwj8@frv39.fwdcdn.com>

Just to add to this: if anyone is doing some work on outbound TCP
connections, could they also have a look at this bug?
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210726

Thanks!

On Wed, 6 Feb 2019 at 15:15, Paul <devgs@ukr.net> wrote:

> Hi dev team,
>
> It's no secret that when an application establishes a new TCP connection
> without first binding the socket to a specific local interface address,
> the OS handles that automatically. Unfortunately there is a catch: local
> port allocation follows different logic (1) when the socket is bound
> before connect() versus (2) when it is not. When in_pcb_lport() allocates
> a port, it checks whether candidate ports are free using
> in_pcblookup_local(), and the behaviour is as follows:
>
> (1) Bound, i.e. laddr is assigned a specific address:
>     The port is considered occupied only if there is a PCB that matches
>     both laddr and lport.
>
> (2) Not bound, i.e. laddr == INADDR_ANY:
>     The port is considered occupied if there is any PCB that matches
>     lport alone. This means that, in order to allocate a port, none of
>     the available local addresses may have it allocated, even though this
>     requirement is ridiculous, since we are allocating only one PCB.
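
The two rules above can be sketched as a toy model. This is only an
illustration of the behaviour as described in the quoted message, not the
kernel code: the function name port_is_occupied and the set-of-tuples PCB
table are made up for the example, and the real lookup has to consider more
than this.

```python
from typing import Set, Tuple

INADDR_ANY = "0.0.0.0"

# Hypothetical stand-in for the PCB table: a set of (laddr, lport) pairs.
PcbTable = Set[Tuple[str, int]]


def port_is_occupied(pcbs: PcbTable, laddr: str, lport: int) -> bool:
    """Toy version of the occupancy check described in the message."""
    if laddr == INADDR_ANY:
        # Scenario (2): a PCB on ANY local address with this port blocks it.
        return any(port == lport for _, port in pcbs)
    # Scenario (1): only a PCB with the same address AND port blocks it.
    return (laddr, lport) in pcbs


pcbs = {("10.0.0.1", 50000)}
print(port_is_occupied(pcbs, "10.0.0.2", 50000))   # bound to another address
print(port_is_occupied(pcbs, INADDR_ANY, 50000))   # not bound
```

With a single PCB on 10.0.0.1:50000, the bound lookup for 10.0.0.2 reports
the port as free, while the unbound lookup reports it as occupied.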
>
> Looking through the code, it seems that (2) is due to the fact that
> tcp_connect() first allocates the port, indirectly through the call to
> in_pcbbind(), and only then allocates the actual local address, also
> indirectly, through the call to in_pcbconnect_setup(), which in turn calls
> in_pcbladdr(). So, presumably, in order to guarantee that
> in_pcbconnect_setup() will not fail, we make sure the port is free on the
> whole range of local addresses, no matter which one of them is actually
> selected by in_pcbladdr()?
>
> In the real world, this creates serious problems for servers that have a
> lot of outgoing connections, for example an nginx proxy with a lot of open
> HTTP/2 connections. To avoid this limitation we have created workarounds
> within the nginx config as well as within our own software, basically by
> having 50 local addresses and only following scenario (1). Alas, all of
> the built-in Unix utilities, as well as other software, always follow
> scenario (2). As a result, given a large number of connections, there may
> be points in time when every port in the range is occupied on at least one
> local address.
>
> Even worse is the outcome of such a condition: while in_pcb_lport() walks
> the range of possible port numbers, making a myriad of calls to
> in_pcblookup_local(), some kind of important lock is held within the
> kernel. So important that it leads to a complete lockup of the system.
> Even direct terminal access is unavailable: it is not responsive. The more
> calls to connect() through scenario (2) there are, the longer it takes the
> system to unfreeze. In some circumstances, the only option is a hard
> reset.
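
For readers who want to follow scenario (1) from userland, a minimal sketch
of the bind-before-connect workaround mentioned above might look like this.
It is an assumption about how such a workaround could be written, not the
poster's actual code: make_connector and connect_from_pool are hypothetical
names, and the round-robin choice of local address is just one possible
policy.

```python
import itertools
import socket
from typing import Callable, Iterable, Tuple


def make_connector(local_addrs: Iterable[str]) -> Callable[[Tuple[str, int]], socket.socket]:
    """Return a connect function that binds to a pool address first."""
    pool = itertools.cycle(list(local_addrs))  # round-robin over the pool

    def connect_from_pool(remote: Tuple[str, int]) -> socket.socket:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # Bind to a specific local address; port 0 asks the kernel to
            # choose the port, so laddr is already set when it does so.
            sock.bind((next(pool), 0))
            sock.connect(remote)
        except OSError:
            sock.close()
            raise
        return sock

    return connect_from_pool
```

A caller would build the connector once with its pool of local addresses
and then call it with each (host, port) destination instead of calling
connect() on an unbound socket.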
>
> Is it possible to somehow update the code that connects via scenario (2)
> to enable more intelligent port allocation, for example by allocating the
> local address and port simultaneously?
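
To make the request concrete, here is a toy comparison of the two
allocation orders. It only models the situation described in the quoted
message; pick_port_any, pick_port_for_addr, the addresses, and the tiny port
range are all invented for the illustration, and a real kernel change would
have to deal with details this sketch ignores.

```python
from typing import Iterable, Optional, Set, Tuple

# Hypothetical stand-in for the PCB table: a set of (laddr, lport) pairs.
PcbTable = Set[Tuple[str, int]]


def pick_port_any(pcbs: PcbTable, ports: Iterable[int]) -> Optional[int]:
    """Port first, address later: the port must be unused on every address."""
    used = {port for _, port in pcbs}
    for port in ports:
        if port not in used:
            return port
    return None  # whole range looks exhausted


def pick_port_for_addr(pcbs: PcbTable, laddr: str,
                       ports: Iterable[int]) -> Optional[int]:
    """Address first, then port: only (laddr, port) pairs are compared."""
    for port in ports:
        if (laddr, port) not in pcbs:
            return port
    return None


ports = range(50000, 50004)
# One busy address holds every port in the (tiny) range.
pcbs = {("10.0.0.1", p) for p in ports}

print(pick_port_any(pcbs, ports))                    # no port found
print(pick_port_for_addr(pcbs, "10.0.0.2", ports))   # first port is usable
```

In this model the port-first search scans the whole range and finds nothing,
while choosing the address first lets the idle address use the very first
port.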


