Date:      Tue, 3 May 2016 15:41:37 -0700
From:      Dieter BSD <dieterbsd@gmail.com>
To:        freebsd-hackers@freebsd.org
Subject:   TCP problems
Message-ID:  <CAA3ZYrBEPvz9ZrLp2p4_91ynPVhOeAj0Cb4vvu-jO0HQ6=UU8w@mail.gmail.com>

I have suddenly started seeing TCP problems on a machine "G":
running FreeBSD 10.1
Gigabyte UD5 amd64
2 Ethernet controllers, re0 and ue0:

re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xb000-0xb0ff mem 0xfe600000-0xfe600fff,0xd0000000-0xd0003fff irq 16 at device 0.0 on pci6
re0: Using 1 MSI-X message
re0: turning off MSI enable bit.
re0: Chip rev. 0x4c000000
re0: MAC rev. 0x00000000
rgephy0: <RTL8251 1000BASE-T media interface> PHY 1 on miibus0

ue0 is a Siig USB-to-Ethernet adapter (chipset: AX88179)

Problem 1: bind(2) fails
Problem 2: copying large files via Ethernet results in data corruption

1) Bind:

C program containing:

  bzero(&server, sizeof(struct sockaddr_in));
  server.sin_family=AF_INET;
  server.sin_port=htons((unsigned short)port_number);
  (void) memcpy((char*)&server.sin_addr, (char*)host->h_addr, sizeof(server.sin_addr));

  return_code = socket(PF_INET, SOCK_STREAM, 0);
  if (return_code == -1) {
    fprintf(stderr, "%s: ERROR ", argv[0]);
    perror("socket() failed");
    fflush(stderr);
    exit(-1);
  }
  fd = return_code;

  return_code = bind(fd, (struct sockaddr*)&server, sizeof(server));
  if (return_code == -1) {
    fprintf(stderr, "%s: ERROR ", argv[0]);
    perror("bind() failed");
    fflush(stderr);
    exit(-1);
  }

gives: ERROR bind() failed: Can't assign requested address

The same binary has been working perfectly on another machine (running 8.2)
for years.  A UDP version of the program is working ok.  Rebooting didn't help.
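
"Can't assign requested address" is the message for EADDRNOTAVAIL, which
bind(2) returns when the requested address is not available on the local
machine. One way to narrow this down might be to print the exact address
being bound and compare it against the addresses configured on re0 and ue0.
The following is only a minimal sketch: try_bind() is a hypothetical helper,
not part of the original program, and it assumes IPv4.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original program): try to bind a
 * TCP socket to addr:port. Returns 0 on success, or the errno value
 * from socket(2) or bind(2) on failure. On a bind failure it also
 * prints the address that was rejected.
 */
static int
try_bind(struct in_addr addr, unsigned short port)
{
	struct sockaddr_in sin;
	char text[INET_ADDRSTRLEN];
	int fd, error = 0;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	sin.sin_addr = addr;

	fd = socket(PF_INET, SOCK_STREAM, 0);
	if (fd == -1)
		return (errno);

	if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) == -1) {
		error = errno;
		if (inet_ntop(AF_INET, &addr, text, sizeof(text)) == NULL)
			strcpy(text, "?");
		fprintf(stderr, "bind(%s:%u): %s\n", text,
		    (unsigned)port, strerror(error));
	}
	close(fd);
	return (error);
}
```

Calling try_bind() once with the address copied from host->h_addr and once
with htonl(INADDR_ANY) would show whether the failure depends on the
specific address or affects every TCP bind on machine G.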

2) Data corruption:

rcp large file from machine T (running 8.2) to machine G (10.1)
rcp the file back from G to T
compare the two copies of the file on machine T to verify integrity

This worked fine until yesterday.  Now suddenly most large files have data
corruption, thus cmp(1) fails.  The first difference occurs at various
places in the file.
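
By default cmp(1) reports only the first difference. To see whether the
damage comes in runs of a particular size or alignment, every differing
region can be listed. A minimal sketch, assuming both copies are on local
disk; diff_runs() is a hypothetical helper, not an existing tool:

```c
#include <stdio.h>

/*
 * Hypothetical helper: compare two files byte by byte and print the
 * start offset and length of every run of differing bytes. Returns the
 * number of runs, or -1 if either file cannot be opened. If one file
 * is shorter, the tail of the longer file counts as differing.
 */
static long
diff_runs(const char *path_a, const char *path_b)
{
	FILE *a, *b;
	long long offset = 0, run_start = 0;
	long runs = 0;
	int ca, cb, in_run = 0;

	a = fopen(path_a, "rb");
	b = fopen(path_b, "rb");
	if (a == NULL || b == NULL) {
		if (a != NULL)
			fclose(a);
		if (b != NULL)
			fclose(b);
		return (-1);
	}

	for (;;) {
		ca = getc(a);
		cb = getc(b);
		if (ca == EOF && cb == EOF)
			break;
		if (ca != cb) {
			if (!in_run) {		/* a new run starts here */
				in_run = 1;
				run_start = offset;
				runs++;
			}
		} else if (in_run) {		/* the current run just ended */
			printf("offset %lld: %lld bytes differ\n",
			    run_start, offset - run_start);
			in_run = 0;
		}
		offset++;
	}
	if (in_run)				/* run extends to end of file */
		printf("offset %lld: %lld bytes differ\n",
		    run_start, offset - run_start);

	fclose(a);
	fclose(b);
	return (runs);
}
```

Run lengths or start offsets that cluster around a fixed size may help
separate a network-path problem from a disk or memory problem, though
that interpretation is only a guess.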

Both machines have 2 gigabit Ethernet controllers (2 separate networks).
Both networks have the problem.  I have also tried different SATA disks
on different disk controllers.  Both machines are amd64 and have ECC memory.
Cables are factory-made Cat6 or Cat7, 25 feet or shorter.  Netgear gigabit
switches.  I tried using ftp instead of rcp.  Rebooting didn't help.

machine T:
nfe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8210b<RXCSUM,TXCSUM,VLAN_MTU,TSO4,WOL_MAGIC,LINKSTATE>
        media: Ethernet autoselect (1000baseT <full-duplex,flowcontrol,rxpause,txpause>)
bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
        media: Ethernet autoselect (1000baseT <full-duplex,flowcontrol,rxpause,txpause>)

machine G:
re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
        media: Ethernet autoselect (1000baseT <full-duplex>)
ue0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8000b<RXCSUM,TXCSUM,VLAN_MTU,LINKSTATE>
        media: Ethernet autoselect (1000baseT <full-duplex>)

I tried disabling checksum offload with ifconfig -rxcsum -txcsum.
Machine T seems happy, but networking on machine G stopped working,
so I had to turn them back on.  (problem #3?)

Small files (2-4 KB) and things like telnet/rsh seem to work fine.

It appears that *something* broke yesterday, probably something in
machine G.  But what?  Hardware?  Software?


