Date:      Sun, 04 Nov 2012 06:40:04 -0800
From:      Manfred Antar <null@pozo.com>
To:        Andre Oppermann <andre@freebsd.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: weird network problems on current since 10/28/2012
Message-ID:  <201211041440.qA4Ee9tF001680@pozo.com>
In-Reply-To: <50967453.5090503@freebsd.org>
References:  <201211031740.qA3HeqVX001622@pozo.com> <CAJ-VmomAR8N8ovhC7La3ttG=7Qu_%2BVwD30tPxFBpzC37eg9CHA@mail.gmail.com> <201211040113.qA41DfLn001577@pozo.com> <50964FBB.4010600@andric.com> <d125c29193960f0932391c161dca233e.squirrel@webmail.kim.net> <50967453.5090503@freebsd.org>

At 05:57 AM 11/4/2012, you wrote:
>On 04.11.2012 13:11, Kim Culhan wrote:
>>On Sun, November 4, 2012 6:21 am, Dimitry Andric wrote:
>>>On 2012-11-04 02:13, Manfred Antar wrote:
>>>>At 03:29 PM 11/3/2012, Adrian Chadd wrote:
>>>>>On 3 November 2012 10:40, Manfred Antar <null@pozo.com> wrote:
>>>>>>i have problem connecting to freebsd box on local network since last sunday.
>>>>>>the last kernel that works:
>>>>>>   FreeBSD 10.0-CURRENT #0: Sun Oct 28 12:14:38 PDT 2012
>>>>>>anything after that, sometimes i can connect, other times just hangs.
>>>>>>any network connection hangs ===== pop httpd ssh etc etc.
>>>>>>anyone have any ideas ?
>>>>>>i can checkout different sources and see if i can locate the changes that cause
>>>>>>this.
>>>>>
>>>>>Please do!
>>>...
>>>>Here is what I found doing :
>>>>setenv CVSROOT /usr/home/ncvs
>>>>
>>>>cvs co -D"October 28, 2012 12:14:38 PDT" sys
>>>>
>>>>A kernel from that time works fine.
>>>>
>>>>doing:
>>>>
>>>>cvs up -D"October 28, 2012 13:14:38 PDT" sys                    1 hour later
>>>>the following files were changed:
>>>>sys/netinet/tcp_input.c
>>>>sys/netinet/tcp_timer.c
>>>>sys/netinet/tcp_var.h
>>>>
>>>>Building a kernel from these new files is when the problem starts.
>>>
>>>So, your problems seem to have been introduced by this commit by Andre:
>>>
>>>    http://svn.freebsd.org/changeset/base/242266
>>>
>>>    Increase the initial CWND to 10 segments as defined in IETF TCPM
>>>    draft-ietf-tcpm-initcwnd-05. It explains why the increased initial
>>>    window improves the overall performance of many web services without
>>>    risking congestion collapse.
>>>
>>>    As long as it remains a draft it is placed under a sysctl marking it
>>>    as experimental:
>>>     net.inet.tcp.experimental.initcwnd10 = 1
>>>    When it becomes an official RFC soon the sysctl will be changed to
>>>    the RFC number and moved to net.inet.tcp.
>>>
>>>    This implementation differs from the RFC draft in that it is a bit
>>>    more conservative in the case of packet loss on SYN or SYN|ACK because
>>>    we haven't reduced the default RTO to 1 second yet.  Also the restart
>>>    window isn't yet increased as allowed.  Both will be adjusted with
>>>    upcoming changes.
>>>
>>>    It is enabled by default.  In Linux it is enabled since kernel 3.0.
>>>
>>>After the commit, there was a small discussion thread on svn-src-head@
>>>about the possible problems with the approach.  Maybe you are
>>>experiencing those?
>>>
>>>As the commit message says, you should be able to turn the feature off
>>>using:
>>>
>>>    sysctl net.inet.tcp.experimental.initcwnd10=0
>>>
>>>Can you please try that, and see if the problems go away?
>>
>>FWIW this did not make the problem go away on 2 machines.
>
>Yes, this very much looks like the same problem as in PR/173309.
>
>Please try the attached patch.  It fixes the connection hang issue.
>There may be a second issue, which I am currently debugging based on the
>feedback from Fabian Keil.
>
>-- 
>Andre
>
>Index: tcp_input.c
>===================================================================
>--- tcp_input.c (revision 242494)
>+++ tcp_input.c (working copy)
>@@ -2650,10 +2652,12 @@
>
>                SOCKBUF_LOCK(&so->so_snd);
>                if (acked > so->so_snd.sb_cc) {
>+                       tp->snd_wnd -= so->so_snd.sb_cc;
>                        sbdrop_locked(&so->so_snd, (int)so->so_snd.sb_cc);
>                        ourfinisacked = 1;
>                } else {
>                        sbdrop_locked(&so->so_snd, acked);
>+                       tp->snd_wnd -= acked;
>                        ourfinisacked = 0;
>                }
>                /* NB: sowwakeup_locked() does an implicit unlock. */
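
For anyone following along, here is a small user-space model of what the two added lines in the patch appear to do: when an ACK covers data in the send buffer, the send window is reduced by the same number of bytes that are dropped from the buffer. This is only a sketch. struct model_conn and model_ack() are made-up names for illustration, not kernel code, and the assert is a guard added for the model, not something the patch contains.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two values the patch touches;
 * this is NOT the kernel's struct tcpcb or struct sockbuf. */
struct model_conn {
	uint32_t snd_wnd;	/* send window, in bytes */
	uint32_t sb_cc;		/* bytes currently queued in the send buffer */
};

/*
 * Model of the patched branch: drop the acknowledged bytes from the
 * send buffer and shrink snd_wnd by the same amount.  Returns 1 when
 * more was acked than was buffered (the kernel code sets ourfinisacked
 * in that case), otherwise 0.
 */
static int
model_ack(struct model_conn *c, uint32_t acked)
{
	int finacked = (acked > c->sb_cc);
	uint32_t dropped = finacked ? c->sb_cc : acked;

	/* Model-only assumption: the window never goes below zero. */
	assert(c->snd_wnd >= dropped);

	c->snd_wnd -= dropped;	/* the line the patch adds */
	c->sb_cc -= dropped;	/* what sbdrop_locked() does to the buffer */
	return (finacked);
}
```

With snd_wnd = 65535 and 1000 bytes buffered, an ACK for 400 bytes leaves snd_wnd at 65135 and 600 bytes buffered; a following ACK for 601 bytes empties the buffer, leaves snd_wnd at 64535, and returns 1.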

This patch improves the connection issue: connection attempts (ssh, pop) no longer hang.
Connecting still seems to take longer than it should, but in the end the connection goes through.
I can capture a tcpdump and put it at http://pozo.com/tcpdump/tpdump.txt if that will help.
I'll let it run for about half an hour.
Manfred





