Date:      Fri, 02 Feb 2007 08:12:39 -0600
From:      Dave Baukus <david.baukus@us.fujitsu.com>
To:        freebsd-net@freebsd.org
Cc:        Dave Baukus <david.baukus@us.fujitsu.com>
Subject:   Re: ETIMEDOUT bug
Message-ID:  <45C346D7.4090305@us.fujitsu.com>
In-Reply-To: <45C2765C.7010708@us.fujitsu.com>
References:  <45C2765C.7010708@us.fujitsu.com>

I realized, late last night, that I was wrong on a few
details concerning this bug:

1.) The retransmit timer does not keep firing without
being restarted.

2.) ip_output() must return ENOBUFS (TCP_MAXRXTSHIFT + 1) times
to the same, non-transmitting TCP.

3.) Given a TCP as described below, when tcp_output() uses ENOBUFS
to blindly start the retransmit timer then tp->t_rxtshift will be
falsely incremented and never cleared.

Thus the bug manifests itself because, for a TCP that never
transmits, nothing ever clears tp->t_rxtshift. This allows
tp->t_rxtshift to slowly count up to TCP_MAXRXTSHIFT;
once TCP_MAXRXTSHIFT is exceeded, tcp_timer_rexmt() will
kill the poor innocent TCP.
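
To make the counting concrete, here is a minimal user-space sketch of
the sequence described above. It is a toy model, not kernel code:
struct toy_tcpcb and the toy_* functions are hypothetical stand-ins I
made up for illustration, and only TCP_MAXRXTSHIFT (12) and the field
name t_rxtshift come from the real stack. It assumes each ENOBUFS arms
the timer once and that nothing ever resets the shift count.

```c
#include <assert.h>

#define TCP_MAXRXTSHIFT 12      /* same value as in the real stack */

/* Hypothetical, stripped-down stand-in for struct tcpcb. */
struct toy_tcpcb {
    int t_rxtshift;             /* retransmit backoff counter */
    int rexmt_armed;            /* 1 while the retransmit timer is pending */
    int dropped;                /* 1 once the connection has been killed */
};

/* Models the ENOBUFS branch at the end of tcp_output(). */
static void toy_output_enobufs(struct toy_tcpcb *tp)
{
    if (!tp->rexmt_armed)
        tp->rexmt_armed = 1;    /* timer started for a pure ACK */
}

/* Models the part of tcp_timer_rexmt() that matters here. */
static void toy_timer_rexmt(struct toy_tcpcb *tp)
{
    assert(tp->rexmt_armed);    /* a timer can only fire if it was armed */
    tp->rexmt_armed = 0;        /* it is not restarted: nothing to send */
    if (++tp->t_rxtshift > TCP_MAXRXTSHIFT)
        tp->dropped = 1;        /* application would see ETIMEDOUT */
}

/* Counts the ENOBUFS events needed to kill a TCP that never transmits. */
static int toy_enobufs_until_drop(void)
{
    struct toy_tcpcb tp = { 0, 0, 0 };
    int n = 0;

    while (!tp.dropped) {
        toy_output_enobufs(&tp);
        toy_timer_rexmt(&tp);   /* no ACK ever arrives to clear t_rxtshift */
        n++;
    }
    return n;
}
```

Calling toy_enobufs_until_drop() from a main() returns 13, that is,
TCP_MAXRXTSHIFT + 1, which matches point 2 above.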

On 02/01/07 17:23, Dave Baukus wrote:
> There is a bug in tcp_output() in at least FreeBSD 6.1
> that causes a perfectly good TCP to be dropped by its
> retransmit timer; the application receives ETIMEDOUT.
> 
> Consider a TCP that never transmits (the receive end of the ttcp
> utility is an example). While the TCP is established,
> snd_max == snd_una == snd_nxt == (iss + 1) and the retransmit
> timer should never be started. If the retransmit timer is started,
> it is never stopped by tcp_input()/tcp_output() because
> snd_max == snd_una == snd_nxt (always). Once started, the
> timer continues its count up until tp->t_rxtshift == 12 and
> the connection that never transmitted gets falsely killed.
> 
> The bug is to blindly rely on the return value of ip_output().
> If ip_output() returns ENOBUFS then the retransmit timer is
> activated:
> 
>  From the end of tcp_output():
> out:
> SOCKBUF_UNLOCK_ASSERT(&so->so_snd);    /* Check gotos. */
> if (error == ENOBUFS) {
>         if (!callout_active(tp->tt_rexmt) &&
>             !callout_active(tp->tt_persist))
>                 callout_reset(tp->tt_rexmt, tp->t_rxtcur,
>                     tcp_timer_rexmt, tp);
>         tp->snd_cwnd = tp->t_maxseg;
>         return (0);
> }
> 
> My simple-minded fix would be not to start the retransmit timer here;
> if tcp_output() wanted to time this transmit, it would have started
> the timer up above.
> 
> This ETIMEDOUT problem is easily recreated on any old machine
> using a single slow Ethernet device and the ttcp test utility.
> First, fire up a couple of ttcp receivers. Second, flood the same
> interface with enough ttcp transmitters to cause the driver's transmit
> ring and interface queue to back up. Eventually, one of the ttcp
> receivers will get ENOBUFS from ip_output() and the retransmit
> timer will be wrongly activated for a pure ACK segment.
> 
> I was able to do it with the following on FreeBSD 6.1:
> 
> box1:
> ttcp -s -l 16384 -p 9444 -v -b 128000 -r
> ttcp -s -l 16384 -p 9445 -v -b 128000 -r
> ttcp -s -n 6553600 -l 4096 -p 9446 -v -b 128000 -t 192.168.222.13
> ttcp -s -n 9999999 -l 333  -p 9447 -v -b 128000 -t 192.168.222.13
> ttcp -s -n 9999999 -l 8192  -p 9448 -v -b 128000 -t 192.168.222.13
> ttcp -s -n 9999999 -l 333  -p 9449 -v -b 128000 -t 192.168.222.13
> ttcp -s -n 9999999 -l 8192  -p 9450 -v -b 128000 -t 192.168.222.13
> 
> box2:
> ttcp -s -n 6553600 -l 8192 -p 9444 -v -b 128000 -t  192.168.222.222
> ttcp -s -n 9999999 -l 128  -p 9445 -v -b 128000  -t  192.168.222.222
> ttcp -s -l 16384 -p 9446 -v -b 128000 -r
> ttcp -s -l 16384 -p 9447 -v -b 128000 -r
> ttcp -s -l 16384 -p 9448 -v -b 128000 -r
> ttcp -s -l 16384 -p 9449 -v -b 128000 -r
> ttcp -s -l 16384 -p 9450 -v -b 128000 -r
> 
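
The fix proposed in the quoted message (do not start the retransmit
timer in the ENOBUFS path) can be compared against the current
behaviour with a small user-space sketch. This is only a toy model
under my own assumptions: struct toy_conn, the enobufs_policy enum and
the toy_* functions are hypothetical names invented for illustration,
not kernel interfaces, and the model assumes the timer fires once per
ENOBUFS event with nothing else touching t_rxtshift.

```c
#include <assert.h>
#include <stdbool.h>

#define TCP_MAXRXTSHIFT 12      /* same value as in the real stack */

/* Hypothetical policies for the ENOBUFS branch of tcp_output(). */
enum enobufs_policy {
    ARM_ON_ENOBUFS,             /* behaviour quoted from FreeBSD 6.1 */
    LEAVE_TIMER_ALONE           /* the proposed fix */
};

/* Hypothetical, stripped-down connection state. */
struct toy_conn {
    int  t_rxtshift;            /* retransmit backoff counter */
    bool rexmt_armed;           /* retransmit timer pending? */
    bool dropped;               /* connection killed by the timer? */
};

/* Tail of tcp_output() after ip_output() returned ENOBUFS. */
static void toy_output_tail(struct toy_conn *c, enum enobufs_policy p)
{
    if (p == ARM_ON_ENOBUFS && !c->rexmt_armed)
        c->rexmt_armed = true;
    /* LEAVE_TIMER_ALONE: only the normal send path may arm the timer. */
}

/* Retransmit timer expiry; does nothing if the timer was never armed. */
static void toy_rexmt_expire(struct toy_conn *c)
{
    if (!c->rexmt_armed)
        return;
    c->rexmt_armed = false;
    if (++c->t_rxtshift > TCP_MAXRXTSHIFT)
        c->dropped = true;
}

/* True if a pure receiver is still alive after `events` ENOBUFS errors. */
static bool toy_receiver_survives(enum enobufs_policy p, int events)
{
    struct toy_conn c = { 0, false, false };

    assert(events >= 0);
    for (int i = 0; i < events && !c.dropped; i++) {
        toy_output_tail(&c, p);
        toy_rexmt_expire(&c);
    }
    return !c.dropped;
}
```

In this model a receiver under ARM_ON_ENOBUFS survives 12 ENOBUFS
events but not 13, while under LEAVE_TIMER_ALONE it survives any
number of them because the timer is never armed for a pure ACK.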

-- 
Dave Baukus
    david.baukus@us.fujitsu.com
    972-479-2491

    Fujitsu Network Communications
          Richardson, Texas
                  USA


