Date:      Mon, 19 Jun 2006 23:45:05 +0200 (CEST)
From:      Staffan Ulfberg <staffan@ulfberg.se>
To:        FreeBSD-gnats-submit@FreeBSD.org
Cc:        Staffan Ulfberg <staffan@ulfberg.se>
Subject:   kern/99188: FIN in same packet as duplicate ACK is lost
Message-ID:  <200606192145.k5JLj5Hx090785@multivac.fatburen.org>
Resent-Message-ID: <200606192150.k5JLoJcH006398@freefall.freebsd.org>


>Number:         99188
>Category:       kern
>Synopsis:       FIN in same packet as duplicate ACK is lost
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jun 19 21:50:18 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator:     Staffan Ulfberg
>Release:        FreeBSD 5.5-PRERELEASE i386
>Organization:
Harmonicode AB 
>Environment:
System: FreeBSD multivac.fatburen.org 5.5-PRERELEASE FreeBSD 5.5-PRERELEASE #5: Mon Jun 19 02:16:36 CEST 2006 staffanu@multivac.fatburen.org:/usr/obj/usr/src/sys/MULTIVAC i386


	
>Description:

When the remote end of a socket sends a FIN, the FIN is lost when the
following criteria are met:

	- The packet contains no other data.

	- The packet also contains an ACK for a sequence number that
	  has already been acked.

	- There is un-acked data on the way to the client.

(If the net.inet.tcp.rfc3042 sysctl is cleared, the FIN is not dropped
if this is the first or second duplicate ACK.)

The code that checks the conditions above is there to trigger a
resend to the remote end when a duplicate ACK is received.  It turns
out that this code also drops the packet, probably since it doesn't
contain any data anyway.

There is one existing exception: if the packet contains a window
update, it is not handled as a duplicate ACK, in order not to miss the
window update.

I believe a FIN should be handled the same way.  That is, handling the
FIN is more important than handling the duplicate ACK.
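
The condition described above can be sketched as a small stand-alone
model.  This is only an illustration, not the kernel code: the struct
and function names are made up for this example, seq_leq() is my own
stand-in for the kernel's SEQ_LEQ() macro, and the further check for
outstanding un-acked data inside the branch is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TH_FIN 0x01 /* FIN bit in the TCP flags byte */

/* Hypothetical, simplified views of an incoming segment and the connection. */
struct segment { uint32_t ack; uint32_t win; int len; uint8_t flags; };
struct conn    { uint32_t snd_una; uint32_t snd_wnd; };

/* Wraparound-safe "a <= b" on 32-bit sequence numbers. */
static bool seq_leq(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) <= 0;
}

/* Original test: old ACK, no data, unchanged window => duplicate ACK. */
static bool dupack_branch_orig(const struct conn *c, const struct segment *s)
{
    return seq_leq(s->ack, c->snd_una) && s->len == 0 && s->win == c->snd_wnd;
}

/* Proposed test: the same, but a segment carrying a FIN is excluded. */
static bool dupack_branch_fixed(const struct conn *c, const struct segment *s)
{
    return dupack_branch_orig(c, s) && !(s->flags & TH_FIN);
}

/* Usage: a bare FIN that re-acks old data with an unchanged window. */
static void dupack_example(void)
{
    struct conn c = { .snd_una = 1000, .snd_wnd = 65535 };
    struct segment fin = { .ack = 1000, .win = 65535, .len = 0, .flags = TH_FIN };

    assert(dupack_branch_orig(&c, &fin));   /* original: taken as a dup ACK */
    assert(!dupack_branch_fixed(&c, &fin)); /* proposed: falls through */
}
```

A plain duplicate ACK without FIN still matches both predicates, so the
fast-retransmit behaviour for ordinary duplicate ACKs is unchanged in
this model.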

>How-To-Repeat:

Compile and run the following two programs.  The first one is the
server and runs on FreeBSD.  Compile with g++:

#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <assert.h>
#include <stdlib.h>
#include <ctype.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <string.h>

int main()
{
  int listenFd;
  if ((listenFd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) == -1)
  {
    fprintf(stderr, "Error creating socket: %s\n", strerror(errno));
    exit(1);
  }

  int trueValue = 1;
  setsockopt(listenFd,
             SOL_SOCKET, SO_REUSEADDR, &trueValue, sizeof(trueValue));

  sockaddr_in listenAddress;
  memset(&listenAddress, 0, sizeof(listenAddress));
  listenAddress.sin_family = AF_INET;
  listenAddress.sin_port = htons(5000);
  listenAddress.sin_addr.s_addr = htonl(INADDR_ANY);
  if (bind(listenFd, (struct sockaddr *) &listenAddress,
           sizeof(listenAddress)))
  {
    fprintf(stderr, "Bind failed: %s.\n", strerror(errno));
    exit(1);
  }
  if (listen(listenFd, 1) == -1)
  {
    fprintf(stderr, "Listen failed: %s.\n", strerror(errno));
    exit(1);
  }

  sockaddr_in clientSockAddress;
  socklen_t l = sizeof(struct sockaddr_in);
  int clientFd = accept(listenFd, (struct sockaddr *) &clientSockAddress, &l);
  if (clientFd == -1)
  {
    fprintf(stderr, "accept() failed: %s.\n", strerror(errno));
    exit(1);
  }

  char buf[10000] = {0};  // contents do not matter; avoid sending uninitialized memory
  for (int i = 0; i < 100; i++)
  {
    int r = send(clientFd, buf, 1360, 0);
    if (r == -1)
    {
      fprintf(stderr, "send() failed: %s.\n", strerror(errno));
      exit(1);
    }
    else if (r != 1360)
    {
      fprintf(stderr, "send() returned %d\n", r);
    }
  }
  printf("sent everything\n");

  // read() returns 0 once the client's FIN has been processed.
  while (read(clientFd, buf, sizeof(buf)) > 0)
    printf("read something\n");

  printf("closed.\n");

  return 0;
}

The second one is the client and runs on Windows XP SP2:

#include <stdio.h>
#include <stdlib.h>
#include <winsock2.h>

int main()
{
  WORD wVersionRequested;
  WSADATA wsaData;
  int err;
  
  wVersionRequested = MAKEWORD(2, 2);
  err = WSAStartup(wVersionRequested, &wsaData);
  if (err != 0) {
    fprintf(stderr, "error starting winsock\n");
    exit(1);
  }

  struct sockaddr_in addr = {0};
  addr.sin_family = AF_INET;
  addr.sin_port = htons(5000);
  addr.sin_addr.s_addr = inet_addr("172.22.32.206");
  if (addr.sin_addr.s_addr == INADDR_NONE) {
    fprintf(stderr, "bad remote address\n");
    exit(1);
  }
  
  SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
  if (s == INVALID_SOCKET)
  {
    fprintf(stderr, "could not create socket");
    exit(1);
  }
  
  int r = connect(s, (struct sockaddr *) &addr, sizeof addr);
  if (r == SOCKET_ERROR)
  {
    closesocket(s);
    fprintf(stderr, "connect failed\n");
    exit(1);
  }

  char buf[10000];
  for (int i = 0; i < 99; i++)
  {
    int r = recv(s, buf, 1360, 0);
    if (r == SOCKET_ERROR)
    {
      closesocket(s);
      fprintf(stderr, "read error\n");
      exit(1);
    }
  }
  shutdown(s, SD_SEND);

  Sleep(1000);

  closesocket(s);

  WSACleanup();
  return 0;
}

Compile on Windows using "cl close.cpp wsock32.lib", if the file name
is "close.cpp".

What happens is that the client connects to a server that intends to
send 100 messages of 1360 bytes each.  After 99 messages, the client
sends a FIN by calling shutdown() on the outgoing half of its socket.
After this, the server never learns that the client has closed its
side: the read() in the server never returns 0.  (Compare netstat on
the Windows machine and on the FreeBSD machine.)

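The test only works on a relatively slow link.  For the bug to be
triggered, the last ACK the client sent before the FIN must be for the
same sequence number as the ACK carried in the FIN packet itself.  In
other words, no new data may arrive at the client between its last ACK
and its FIN, so that the FIN packet looks like a duplicate ACK.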

>Fix:

I made the following patch to tcp_input.c.  It is modeled on a patch
made years ago to fix a problem where window updates were discarded:

multivac# diff -u tcp_input.c.orig tcp_input.c 
--- tcp_input.c.orig    Thu Mar 30 16:03:34 2006
+++ tcp_input.c Mon Jun 19 02:06:14 2006
@@ -1952,7 +1952,8 @@
        case TCPS_TIME_WAIT:
                KASSERT(tp->t_state != TCPS_TIME_WAIT, ("timewait"));
                if (SEQ_LEQ(th->th_ack, tp->snd_una)) {
-                       if (tlen == 0 && tiwin == tp->snd_wnd) {
+                       if (tlen == 0 && tiwin == tp->snd_wnd && 
+                           !(thflags & TH_FIN)) {
                                tcpstat.tcps_rcvdupack++;
                                /*
                                 * If we have outstanding data (other than

/ Staffan
>Release-Note:
>Audit-Trail:
>Unformatted:


