Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 04 Jan 2014 10:38:47 -0800
From:      Peter Wemm <peter@wemm.org>
To:        freebsd-net@freebsd.org
Cc:        Gavin Atkinson <gavin@FreeBSD.org>, andre@freebsd.org, Peter Wemm <peter@freebsd.org>
Subject:   Long-haul problems - connections stuck in slow start
Message-ID:  <52C85537.7080307@wemm.org>

next in thread | raw e-mail | index | archive | help
This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--36bQ1sTSD7NQcDKVhvmAqAkjfWOnAOJmH
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

We're seeing some unfortunate misbehavior with tcp over an intercontinent=
al
link.

eg: fetching a 30GB http file from various package mirrors by a remote:
us-west(ISC) -> london(BME)
bd93e71c-cae4-44fd-943c-d1a88dbf6c6d.tar  0% of   29 GB  961 kBps 09h03m^=
C
us-east(NYI) -> london(BME)
bd93e71c-cae4-44fd-943c-d1a88dbf6c6d.tar  0% of   29 GB 1070 kBps 08h08m^=
C
us-west(YSV) -> london(BME)
bd93e71c-cae4-44fd-943c-d1a88dbf6c6d.tar  0% of   29 GB   14 kBps 590h22m=
^C

Spot the one we're concerned about...

Ping times for the three (in order):
round-trip min/avg/max/std-dev =3D 144.330/144.532/144.797/0.157 ms
round-trip min/avg/max/std-dev =3D 79.650/79.965/80.488/0.287 ms
round-trip min/avg/max/std-dev =3D 148.588/153.292/155.688/2.903 ms

The problem pair is worth showing some detail on:
16 bytes from ..:206a::1001:10, icmp_seq=3D4 hlim=3D55 time=3D148.588 ms
16 bytes from ..:206a::1001:10, icmp_seq=3D5 hlim=3D55 time=3D155.140 ms
16 bytes from ..:206a::1001:10, icmp_seq=3D6 hlim=3D55 time=3D149.443 ms
16 bytes from ..:206a::1001:10, icmp_seq=3D7 hlim=3D55 time=3D155.688 ms
16 bytes from ..:206a::1001:10, icmp_seq=3D8 hlim=3D55 time=3D148.630 ms
16 bytes from ..:206a::1001:10, icmp_seq=3D9 hlim=3D55 time=3D155.486 ms
It appears that there are two packet paths between the endpoints that hav=
e
either ~148ms or ~155ms.  I've done some longer samples and they're fairl=
y
consistent clusters.

All four machines talk to each other.

Here's where it gets interesting.  On the sender at us-west(YSV), I see t=
his:
net.inet.tcp.hostcache.list:
IP address    SSTRESH    RTT RTTVAR     CWND HITS
us-west(ISC)    59521    5ms    1ms    16845 15055031
eu-west(BME)     7343  150ms    2ms    13501 3433775
us-east(NYI)   530489  100ms   37ms    16681 43043786

The ssthresh is very low for the problematic ysv<->bme pair.

When I do a tcpdump, I see the sender fire off 7343 bytes of data, then s=
top
and wait for acks.  It's completely ignoring the receiver's window state.=

It appears stuck in slowstart mode.

Some other data:
Proto Recv-Q Send-Q Local Address  Foreign Address        (state)
tcp6       0 1047852 2001:19:2.443   2001:41c8:.24490 ESTABLISHED

(netstat -x, sorry about the wrap)
Proto Recv-Q Send-Q Local Address          Foreign Address        R-MBUF
S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-B=
MAX
  rexmt persist    keep    2msl  delack rcvtime
tcp6       0 1048152 2001:1900:2254:2.443   2001:41c8:112:83.24490      0=

 374      0    373  65688 1049580      1   2048      0 1420800 525504
8396640    0.43    0.00 7199.93    0.00    0.00    0.06

The "interesting" parts of -x:
rexmt persist    keep    2msl  delack rcvtime
 0.43    0.00 7199.93    0.00    0.00    0.06

-T
Proto Rexmit OOORcv 0-win  Local Address   Foreign Address
tcp6   54161      0      0 2001:192.443   2001:41:83.24490
note retransmits(!)

Some tcpcb fields that caught my eye for the connection:
  snd_wnd =3D 1048576,
  snd_cwnd =3D 5712,
  t_srtt =3D 6391,
  t_rttvar =3D 903,
  t_rxtshift =3D 0,
  t_rttmin =3D 30,
  t_rttbest =3D 4903,
  t_rttupdated =3D 220095,
  max_sndwnd =3D 1048576,
  snd_cwnd_prev =3D 4284,
  snd_ssthresh_prev =3D 2856,
  snd_recover_prev =3D 1397053524,
  t_sndzerowin =3D 0,
  t_badrxtwin =3D 584273259,
  snd_limited =3D 0 '\0',
  t_rttlow =3D 150,
I've stored some dumps of the tcpcb at
  http://people.freebsd.org/~peter/tcpcb.txt
Note that some in the tcpcb.txt file also have
  snd_limited =3D 2 '\002',

Over the last few days I've tried things like turning off sack, tso, the
various rfc knobs etc.  I believe they're all back to normal now.

There's small ~15 second tcpdump sample of the sender side and the receiv=
er
side at: http://people.freebsd.org/~peter/send.cap.gz and
http://people.freebsd.org/~peter/recv.cap.gz
Both ends were ntp synced.  The dumps have no sensitive data.

For amusement, I just tried this, with roughly 1 second in between:
peter@bme:~ %	scp pkg-ysv:k.gz /tmp
k.gz              100%   25MB   5.0MB/s   00:05
peter@bme:~ %	scp pkg-ysv:k.gz /tmp
k.gz                0%  960KB  20.3KB/s   41:29 ETA^C

There was no pre-existing hostcache state between those two endpoints for=

the first run.  At the end, this was created in the hostcache:
IP address   SSTRESH   RTT  RTTVAR BANDWIDTH     CWND
213.138..       5952 165ms    21ms         0     8688
All connections went slow after that.  Note that the ssh test was over ip=
v4
- the rest above is on ipv6.  However, we're seeing the same weird stuff
with http over ipv4 as well between the same two endpoints.

It was pointed out to me that this has come up before, eg: misc/173859
I know we've seen this at work as well.

A few days earlier we were pushing ~45MB/sec (bytes, not bits) between th=
ese
endpoints. Out of the blue it crashed to ~10KB/sec.  Why can't it get out=
 of
slow-start?  Is it even stuck in slow-start like I think?  Is the 148-155=
ms
bimodal rtt the problem?

Any insight would be greatly appreciated.  (please don't drop me from cc:=
)
--=20
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6F=
JV
UTF-8: for when a ' just won\342\200\231t do.


--36bQ1sTSD7NQcDKVhvmAqAkjfWOnAOJmH
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlLIVTsACgkQFRKuUnJ3cX9djgCfa8OuEF6AwFnlsyA1YT849YdN
rsUAnA6X1BMs3Rsdgh5yUTkVubFquIuM
=k83d
-----END PGP SIGNATURE-----

--36bQ1sTSD7NQcDKVhvmAqAkjfWOnAOJmH--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?52C85537.7080307>