Date:      Mon, 9 May 2016 11:49:54 -0700
From:      Dieter BSD <dieterbsd@gmail.com>
To:        freebsd-hackers@freebsd.org
Subject:   Re: TCP problems
Message-ID:  <CAA3ZYrCAiqzFWX24qXWJbSPaWEbuv5mG3xH%2B4bk7ZDEXtaod2Q@mail.gmail.com>

Larry suggests:
> Have you tried bumping the MTU on the interfaces to JUMBO frames?
> 9000 or whatever max is?

Easy enough to try, but 2 of the 4 max out at 1500.  I suppose I
could rewire the networks to get the 2 that allow 9000 on the
same wire.  But packet size seems unlikely to have anything to do
with bind failing.  And MTU=1500 is really supposed to work.

I set ue0 to *smaller* mtu (500, 250, 100) and still got data corruption,
along with
  rcp: lost connection
  Data connection: Operation timed out.  (ftp)

More on ue0's MTU below.

Mark suggests:
> Sounds like you may have fried the nic on the G box

Which is why I tried both networks.  Seems unlikely that both
re0 *and* ue0 would fail with the same symptoms.  Seems unlikely
that bad nics would have anything to do with bind failing?

Thank you both for your suggestions.
-------------------
re0:

New problem.  One network got some strange ping times for a while:

64 bytes from machine_G_re0: icmp_seq=2 ttl=64 time=0.355 ms
64 bytes from machine_G_re0: icmp_seq=3 ttl=64 time=2001.209 ms
64 bytes from machine_G_re0: icmp_seq=4 ttl=64 time=2001.219 ms
64 bytes from machine_G_re0: icmp_seq=5 ttl=64 time=1000.728 ms
64 bytes from machine_G_re0: icmp_seq=6 ttl=64 time=0.229 ms
64 bytes from machine_G_re0: icmp_seq=7 ttl=64 time=2001.091 ms
64 bytes from machine_G_re0: icmp_seq=8 ttl=64 time=2001.129 ms
64 bytes from machine_G_re0: icmp_seq=9 ttl=64 time=1000.643 ms
64 bytes from machine_G_re0: icmp_seq=10 ttl=64 time=0.149 ms
64 bytes from machine_G_re0: icmp_seq=11 ttl=64 time=2001.207 ms
64 bytes from machine_G_re0: icmp_seq=12 ttl=64 time=2001.211 ms
64 bytes from machine_G_re0: icmp_seq=13 ttl=64 time=1000.726 ms

64 bytes from machine_T_bge0: icmp_seq=0 ttl=64 time=423.415 ms
64 bytes from machine_T_bge0: icmp_seq=1 ttl=64 time=14491.793 ms
64 bytes from machine_T_bge0: icmp_seq=2 ttl=64 time=13490.387 ms
64 bytes from machine_T_bge0: icmp_seq=3 ttl=64 time=12489.373 ms
64 bytes from machine_T_bge0: icmp_seq=4 ttl=64 time=11488.635 ms
64 bytes from machine_T_bge0: icmp_seq=5 ttl=64 time=10487.481 ms
64 bytes from machine_T_bge0: icmp_seq=6 ttl=64 time=9486.493 ms
64 bytes from machine_T_bge0: icmp_seq=7 ttl=64 time=8485.567 ms
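
One way to read the machine_T_bge0 numbers above: each reported time is
about one second shorter than the one before it.  If ping was sending at
its default one-second interval (an assumption; the command line is not
shown), then send time plus round-trip time is nearly constant, which
would suggest the replies for icmp_seq 1 through 7 all arrived in one
burst rather than each being delayed separately.  A minimal Python sketch
of that arithmetic, using only the values quoted above:

```python
# Round-trip times (ms) copied from the machine_T_bge0 output above,
# keyed by icmp_seq.
rtt_ms = {
    1: 14491.793,
    2: 13490.387,
    3: 12489.373,
    4: 11488.635,
    5: 10487.481,
    6: 9486.493,
    7: 8485.567,
}

# Assumption: ping sent one request per second (its default interval).
INTERVAL_S = 1.0


def arrival_times(rtts, interval_s=INTERVAL_S):
    """Return {seq: seconds after seq 0 was sent at which the reply arrived}."""
    return {seq: seq * interval_s + ms / 1000.0 for seq, ms in rtts.items()}


arrivals = arrival_times(rtt_ms)
spread_s = max(arrivals.values()) - min(arrivals.values())

for seq, t in sorted(arrivals.items()):
    print(f"icmp_seq={seq}: reply arrived at about t={t:.3f} s")
print(f"spread between earliest and latest arrival: {spread_s * 1000:.1f} ms")
```

Under that assumption the estimated arrival times sit within a few
milliseconds of each other.  The same arithmetic on the machine_G_re0
numbers puts icmp_seq 4, 5 and 6 together, and 8, 9 and 10 together.
This only describes the pattern; it does not say where the packets were
being held.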

Powered machine G down overnight and re0 mostly recovered.
Still have the bind problem.  Does bind have anything to do
with the Ethernet hardware or device drivers?  I'm guessing no.

No clue as to why re0 was causing data corruption, or why the
data corruption went away (it went away before the power down,
so the power cycle isn't what fixed it).  Also no clue about what
caused the long ping times, which went away after the power down.

-------------------
ue0:

Noticed that netstat was reporting input errors for ue0.
And ue0 input was where the data corruption was happening.
Sent data from machine A with 10Mbps Ethernet.  Netstat
did not report any input errors on ue0 and there was no data
corruption.

So ue0 can handle gigabit data rate, but gets input errors if
packets arrive too frequently.

# ifconfig ue0 media 100baseTX-FDX
fixed the input error problem and the data corruption problem,
at the expense of making it even slower.

Max data rate seen (before lowering to 100Mbps) was about 35 MB/s
which is said to be the effective rate of USB2.

usbconfig says:
ugen0.3: <AX88179 ASIX Elec. Corp.> at usbus0, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=ON (124mA)

so I guess it really is running at USB3 speed.

The chip performs a lot better for tweaktown:
http://www.tweaktown.com/reviews/7243/vantec-cb-u300gna-usb-3-gigabit-network-adapter-review/index.html
(Vantec CB-U300GNA with the same Asix AX88179 chip)
"full duplex gigabit with 952 Mbps consistently across the chart"
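
To put the two throughput figures on the same scale, here is a small
Python sketch of the unit conversion.  It assumes MB means 10^6 bytes and
ignores protocol overhead.  The 480 Mbps value is the nominal USB 2.0
high-speed signaling rate, added here for comparison; it is not a
measurement from this thread.

```python
def mbytes_per_s_to_mbits_per_s(mbytes_per_s):
    """Convert megabytes/second to megabits/second (1 byte = 8 bits)."""
    return mbytes_per_s * 8


observed_mbps = mbytes_per_s_to_mbits_per_s(35)  # max rate seen on ue0
review_mbps = 952                                # figure quoted from the review
usb2_nominal_mbps = 480                          # assumption: USB 2.0 high-speed rate

print(f"observed on ue0:      {observed_mbps} Mbps")
print(f"quoted in the review: {review_mbps} Mbps")
print(f"fraction of review:   {observed_mbps / review_mbps:.0%}")
print(f"fraction of USB 2.0:  {observed_mbps / usb2_nominal_mbps:.0%}")
```

35 MB/s works out to 280 Mbps, which is below the nominal USB 2.0 rate
and well under a third of the reviewed figure, so the observed ceiling
looks more like a USB2-class limit than a gigabit one, even though
usbconfig reports spd=SUPER.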

Asix AX88179 chip:
http://www.asix.com.tw/products.php?op=pItemdetail&PItemID=131;71;112
"Supports Jumbo frame up to 4KB"

But ifconfig rejects any value > 1500:

# ifconfig ue0 mtu 4000
ifconfig: ioctl (set mtu): Invalid argument
# ifconfig ue0 mtu 1501
ifconfig: ioctl (set mtu): Invalid argument

A quick look at the driver code didn't find an MTU limit.  (But did in other
Ethernet drivers.)  Looks to me like axge(4) doesn't support a large MTU.

IIRC, one should set ifconfig -rxcsum -txcsum to get maximum data
integrity (at the expense of using more cpu).  If the cpu were doing
the checksums it should catch and correct the data corruption I'm
getting since the corruption appears to be happening inside the
Asix AX88179 chip.  But:

# ifconfig ue0 -rxcsum
results in no Ethernet traffic
# ifconfig ue0 -txcsum
seems to work ok.  (including no data corruption)

Why am I not getting any Ethernet traffic with -rxcsum?  I can see that
some controllers might not have the hardware to support rxcsum, but it
seems to me that -rxcsum and -txcsum should always work?

# ifconfig re0 -rxcsum -txcsum
seems to work ok.  (including no data corruption)
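
For reference, this is roughly the check the cpu performs when checksum
offload is turned off: a minimal Python sketch of the 16-bit
ones'-complement Internet checksum in the style of RFC 1071.  The payload
and the flipped bit are made up for illustration.  Note that the checksum
only detects a damaged packet; recovery comes from the packet being
dropped and TCP retransmitting it.  A 16-bit sum can also miss some
multi-bit errors.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into 16 bits
    return ~total & 0xFFFF


payload = b"hypothetical payload bytes"  # made-up data, for illustration only
good = internet_checksum(payload)

corrupted = bytearray(payload)
corrupted[5] ^= 0x04                     # flip a single bit, as faulty hardware might
bad = internet_checksum(bytes(corrupted))

print(f"checksum of original:  0x{good:04x}")
print(f"checksum of corrupted: 0x{bad:04x}")
print("corruption detected" if good != bad else "corruption NOT detected")
```

A receiver that recomputes the sum over the data plus the transmitted
checksum gets zero for an intact packet, which is the property the last
test-style check below would rely on:
internet_checksum(payload + good.to_bytes(2, "big")) == 0.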

Is Asix AX88179 still the only USB to gigabit Ethernet chip?


