Date:      Mon, 11 Aug 2014 17:35:52 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        hiren panchasara <hiren.panchasara@gmail.com>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, John Baldwin <jhb@freebsd.org>, Jeremiah Lott <jlott@averesystems.com>
Subject:   Re: zero window and persist timer not set
Message-ID:  <CAJ-VmokMnH7Z6uTg7WY=Z4MLgemcmKKhfm_TUqtUmGxgwZy3eA@mail.gmail.com>
In-Reply-To: <CALCpEUEB8qAM5PEuVq03GQnnG-hUkAXwi=iFXZk9CptuGtZZug@mail.gmail.com>
References:  <CANG7ib_DS7NczR4Jonz35RUaT3SCVcickrOcZ4MF--_%2B1BYnxQ@mail.gmail.com> <201408111720.18544.jhb@freebsd.org> <CALCpEUEB8qAM5PEuVq03GQnnG-hUkAXwi=iFXZk9CptuGtZZug@mail.gmail.com>

Sweet, I can trigger this at home when doing high connection rate TCP tests.

Lemme give this a go tonight/tomorrow and see if it changes the behaviour.

Thanks! And yes, please do file a PR!


-a



On 11 August 2014 17:05, hiren panchasara <hiren.panchasara@gmail.com> wrote:
> On Mon, Aug 11, 2014 at 2:20 PM, John Baldwin <jhb@freebsd.org> wrote:
>> On Wednesday, August 06, 2014 5:25:38 pm Jeremiah Lott wrote:
>>> Hello,
>>>
>>> We've been seeing a problem where a tcp connection is stuck in a zero
>>> window condition and even though the client has opened more window space,
>>> our FreeBSD box never sends any more.  After some analysis it appears that
>>> the FreeBSD box is not sending zero window probes, because the persist
>>> timer did not get set (we can see in kgdb that the tcpcb shows 0 window,
>>> there is data in the socket buffer, but the persist timer is not active).
>>>
>>> After looking over the code for a while, I think I see the problem.  When
>>> tcp_output chooses to send a packet, it never arms the persist timer.  This
>>> causes a problem in the following scenario:
>>>
>>> 1.   A --> B: packet containing enough data to fill the window
>>> 2.   B --> A: ACK for #1 + new data (0 window advertisement)
>>> 3.   A --> B: ACK for #2, 0 len packet
>>>
>>> In this case, A will not activate the persist timer, because it chose to
>>> send a packet.  Unless tcp_output is called for some other reason (delayed
>>> ack timer, another input packet from B, socket syscall), A will not send
>>> zero window probes.  I was finally able to recreate this condition by
>>> setting a very small window and running programs that send very specific
>>> sequences of packets without calling recv (purposefully forcing a zero
>>> window condition).  Here is a packet capture that shows the sequence:
>>>
>>> A == 10.2.15.69 == FreeBSD 9.2
>>> B == 10.2.14.61 == FreeBSD 8.2
>>>
>>> 16:19:49.664790 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [S], seq
>>> 2362665163, win 4300, options [mss 1460,nop,wscale 6,sackOK,TS val 88804503
>>> ecr 0], length 0
>>> 16:19:49.664821 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [S.], seq
>>> 3306387947, ack 2362665164, win 65535, options [mss 1460,nop,wscale
>>> 6,sackOK,TS val 1605043666 ecr 88804503], length 0
>>> 16:19:49.664859 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [.], ack 1,
>>> win 67, options [nop,nop,TS val 88804503 ecr 1605043666], length 0
>>> 16:19:49.664921 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq
>>> 1:101, ack 1, win 67, options [nop,nop,TS val 88804503 ecr 1605043666],
>>> length 100
>>> 16:19:49.665137 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [P.], seq
>>> 1:3001, ack 101, win 2046, options [nop,nop,TS val 1605043666 ecr
>>> 88804503], length 3000
>>> 16:19:49.665208 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq
>>> 101:1321, ack 1449, win 45, options [nop,nop,TS val 88804503 ecr
>>> 1605043666], length 1220
>>> 16:19:49.666195 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [.], seq
>>> 1321:2769, ack 3001, win 21, options [nop,nop,TS val 88804504 ecr
>>> 1605043666], length 1448
>>> 16:19:49.666205 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [.], ack
>>> 2769, win 2004, options [nop,nop,TS val 1605043667 ecr 88804503], length 0
>>> 16:19:49.666207 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq
>>> 2769:2771, ack 3001, win 21, options [nop,nop,TS val 88804504 ecr
>>> 1605043666], length 2
>>> 16:19:49.667183 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [.], seq
>>> 2771:4219, ack 3001, win 21, options [nop,nop,TS val 88804505 ecr
>>> 1605043667], length 1448
>>> 16:19:49.667190 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [.], seq
>>> 3001:4345, ack 4219, win 1982, options [nop,nop,TS val 1605043668 ecr
>>> 88804504], length 1344
>>> 16:19:49.667193 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq
>>> 4219:4221, ack 3001, win 21, options [nop,nop,TS val 88804505 ecr
>>> 1605043667], length 2
>>> 16:19:49.766487 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq
>>> 4221:4321, ack 4345, win 0, options [nop,nop,TS val 88804605 ecr
>>> 1605043668], length 100
>>> 16:19:49.766499 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [.], ack
>>> 4321, win 1980, options [nop,nop,TS val 1605043768 ecr 88804505], length 0
>>>
>>> The important packets are the last four:
>>>
>>> 1. A --> B: length 1344, fills the remaining window
>>> 2. B --> A: length 2, does not ack additional data, delayed ack timer is set
>>> 3. B --> A: length 100, acks #1, immediate ack (delayed ack timer
>>> cancelled, tcp_output called with ACKNOW)
>>> 4. A --> B: length 0, acks #2 and #3; because a packet is sent, tcp_output
>>> does not activate the persist timer.
>>>
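
A quick sanity check on the "fills the remaining window" step, as a sketch only. tcpdump prints the raw 16-bit window field, and both SYNs in the capture carry wscale 6, so B's "win 21" should mean 21 << 6 = 1344 bytes, which matches the 1344-byte segment A sent. The helper names below are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes advertised by a raw 16-bit window field under a given window
 * scale shift.  The shift is limited to 14 by the window scale option.
 */
static uint32_t
scaled_window(uint16_t raw_win, unsigned wscale)
{
	assert(wscale <= 14);
	return ((uint32_t)raw_win << wscale);
}

/*
 * Bytes the sender may still put on the wire: the right edge of the
 * offered window (ack + window) minus what has already been sent.
 * The signed cast keeps the comparison safe across sequence wraparound.
 */
static uint32_t
usable_window(uint32_t ack, uint32_t win_bytes, uint32_t snd_nxt)
{
	uint32_t right_edge = ack + win_bytes;

	if ((int32_t)(right_edge - snd_nxt) > 0)
		return (right_edge - snd_nxt);
	return (0);
}
```

With the numbers from the capture, usable_window(3001, scaled_window(21, 6), 3001) gives 1344, and after B's "ack 4345, win 0" the usable window is 0.
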
>>> I would normally expect A to begin sending zero-window probes, but (since
>>> it didn't activate the persist timer) it does not.  Using kgdb, I can see
>>> that the persist timer is not set, only the keep timer is set.  This is
>>> kgdb on "A":
>>>
>>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_nxt
>>> $5 = 3306392292
>>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_max
>>> $6 = 3306392292
>>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_una
>>> $7 = 3306392292
>>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_wnd
>>> $8 = 0
>>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_cwnd
>>> $9 = 4380
>>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_rexmt->c_flags
>>> $11 = 16
>>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_persist->c_flags
>>> $12 = 16
>>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_keep->c_flags
>>> $13 = 22
>>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_2msl->c_flags
>>> $14 = 16
>>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_delack->c_flags
>>> $15 = 16
>>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->t_inpcb->inp_socket.so_snd.sb_cc
>>> $16 = 1656
>>>
>>> There is zero window, data in the socket buffer, and the persist timer is
>>> not set.
>>>
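
For anyone reading the kgdb output, here is a rough way to decode it. The flag values below are an assumption, taken from my recollection of sys/sys/callout.h in the FreeBSD 9 era (CALLOUT_ACTIVE 0x0002, CALLOUT_PENDING 0x0004, CALLOUT_RETURNUNLOCKED 0x0010); check them against the tree you are debugging. Under that assumption, 16 is an idle timer and 22 is one that is active and pending. The second helper checks that A's initial sequence number from the SYN-ACK plus the relative 4345 in the capture lands on the snd_nxt that kgdb printed. Both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag bits; verify against sys/sys/callout.h in your tree. */
#define CALLOUT_ACTIVE          0x0002
#define CALLOUT_PENDING         0x0004
#define CALLOUT_RETURNUNLOCKED  0x0010

/* True when the callout is both marked active and queued to fire. */
static bool
timer_is_armed(int c_flags)
{
	int armed = CALLOUT_ACTIVE | CALLOUT_PENDING;

	return ((c_flags & armed) == armed);
}

/*
 * Absolute sequence number for a tcpdump relative sequence number.
 * Unsigned arithmetic wraps modulo 2^32, as TCP sequence space does.
 */
static uint32_t
abs_seq(uint32_t isn, uint32_t rel)
{
	return (isn + rel);
}
```

timer_is_armed(22) is true for tt_keep and timer_is_armed(16) is false for the other four timers. abs_seq(3306387947, 4345) is 3306392292, the value shown for snd_nxt, snd_max and snd_una, so nothing is outstanding and there is no reason for the rexmt timer to be running either.
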
>>> My proposed fix follows.  If you send a 0-length packet, but there is data
>>> in the socket buffer, and neither the rexmt nor the persist timer is already
>>> set, then activate the persist timer.
>>>
>>> --- sys/netinet/tcp_output.c    (revision 269644)
>>> +++ sys/netinet/tcp_output.c    (working copy)
>>> @@ -1290,7 +1290,12 @@
>>>                                 tp->t_rxtshift = 0;
>>>                         }
>>>                         tcp_timer_activate(tp, TT_REXMT, tp->t_rxtcur);
>>> -               }
>>> +               } else if (len == 0 && so->so_snd.sb_cc &&
>>> +                          !tcp_timer_active(tp, TT_REXMT) &&
>>> +                          !tcp_timer_active(tp, TT_PERSIST)) {
>>> +                       tp->t_rxtshift = 0;
>>> +                       tcp_setpersist(tp);
>>> +               }
>>>
>>>         } else {
>>>                 /*
>>>                  * Persist case, update snd_max but since we are in
>>>
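
The condition added by the new else-if branch can be modelled in userland like this. It is a minimal sketch of just that predicate, not of tcp_output: struct conn_model and should_arm_persist are hypothetical stand-ins for the tcpcb/socket state and the tcp_timer_active() checks that the diff uses.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the few pieces of state the new branch
 * reads; this is not the kernel's struct tcpcb.
 */
struct conn_model {
	size_t sb_cc;           /* bytes queued in the send socket buffer */
	bool   rexmt_active;    /* retransmit timer already running */
	bool   persist_active;  /* persist timer already running */
};

/*
 * Mirrors the patch's test: a zero-length segment was just sent, data
 * is still waiting in the send buffer, and no timer exists that would
 * ever call tcp_output again for it.
 */
static bool
should_arm_persist(const struct conn_model *c, size_t len)
{
	return (len == 0 && c->sb_cc != 0 &&
	    !c->rexmt_active && !c->persist_active);
}
```

Feeding it the state from the kgdb session (sb_cc 1656, neither timer running, len 0) returns true, which is the case the patch is meant to catch. A data segment, an empty send buffer, or an already running timer all return false.
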
>>> Let me know any comments.  Thanks,
>>
>> I think your patch is correct, but please file this as a bug report so we can
>> hopefully wrangle another person to review this.
>
> Looks okay to me as well.
>
> cheers,
> Hiren
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"


