Date:      Wed, 11 Jul 2007 20:38:40 +0200
From:      "Heiko Wundram (Beenic)" <wundram@beenic.net>
To:        freebsd-questions@freebsd.org
Subject:   Re: Some hosting weirdness...
Message-ID:  <200707112038.40928.wundram@beenic.net>
In-Reply-To: <CA922F3A-FFF5-4AC8-9479-77A514D7BEED@secure-computing.net>
References:  <AF5B51BD-997A-45AE-84C6-41B2D1798632@secure-computing.net> <200707111440.47637.wundram@beenic.net> <CA922F3A-FFF5-4AC8-9479-77A514D7BEED@secure-computing.net>

On Wednesday 11 July 2007 20:14:59 Eric F Crist wrote:
> Well, I performed a tcpdump as you suggested, and my mss is exactly
> 1460, not the 1452 you suggest.  What does this mean?

As your server's uplink is (most probably) an Ethernet cable, your MSS is
correct at 1460 (= 1500 bytes MTU for Ethernet - 40 bytes IP+TCP header).

When a TCP connection is established, a three-way handshake takes place. The
host opening the connection sends a SYN packet which contains "his" Maximum
Segment Size; in this case that is the customer opening a website on your
server. Your host then sends a confirming SYN/ACK packet to open your side
of the two-way connection, which contains your MSS. This makes two values for
Maximum Segment Size (the remote one and yours), and the smaller one is
chosen as the Maximum Segment Size of the connection. Thus, if the customer
sends a SYN packet with an MSS of 1452 and you send back a SYN/ACK with an MSS
of 1460, the MSS for the connection is negotiated at 1452 (which both hosts
should stick to).
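
The arithmetic and the "smaller one wins" rule above can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, and the 40-byte figure assumes IPv4 and TCP headers without options.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_for_mtu(mtu):
    """MSS a host would advertise for a link with the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def effective_mss(client_mss, server_mss):
    """Segment size both sides end up using: the smaller advertised value."""
    return min(client_mss, server_mss)


print(mss_for_mtu(1500))          # Ethernet: 1460
print(mss_for_mtu(1492))          # a link with a 1492-byte MTU: 1452
print(effective_mss(1452, 1460))  # 1452
```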

The following TCP dump of a connection request to a host (sadly a Linux=20
box ;-)) should clear any confusion:

root@beenic01:/home/heiko# tcpdump -vv -i eth0 port 80 and host hnvr-4db2ebb3.pool.einsundeins.de
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes

--- SYN packet from my dialup (MSS of 1452, I'm on DSL)
20:22:26.329522 IP (tos 0x0, ttl  52, id 8003, offset 0, flags [DF], proto:
TCP (6), length: 64) hnvr-4db2ebb3.pool.einsundeins.de.64905 >
mail.beenic.net.www: S, cksum 0xd2b5 (correct), 1315765383:1315765383(0) win
65535 <mss 1452,nop,wscale 0,nop,nop,timestamp 2442717 0,sackOK,eol>
---

--- SYN/ACK from server (MSS of 1460, is on 100Mbit Ethernet)
20:22:26.331590 IP (tos 0x0, ttl  64, id 0, offset 0, flags [DF], proto: TCP
(6), length: 44) mail.beenic.net.www >
hnvr-4db2ebb3.pool.einsundeins.de.64905: S, cksum 0x421a (correct),
1939516734:1939516734(0) ack 1315765384 win 5840 <mss 1460>
---

--- Some connection setup
20:22:26.395813 IP (tos 0x0, ttl  52, id 8004, offset 0, flags [DF], proto:
TCP (6), length: 40) hnvr-4db2ebb3.pool.einsundeins.de.64905 >
mail.beenic.net.www: ., cksum 0x70a7 (correct), 1:1(0) ack 1 win 65535
20:22:26.402403 IP (tos 0x0, ttl  52, id 8005, offset 0, flags [DF], proto:
TCP (6), length: 421) hnvr-4db2ebb3.pool.einsundeins.de.64905 >
mail.beenic.net.www: P 1:382(381) ack 1 win 65535
20:22:26.402414 IP (tos 0x0, ttl  64, id 58600, offset 0, flags [DF], proto:
TCP (6), length: 40) mail.beenic.net.www >
hnvr-4db2ebb3.pool.einsundeins.de.64905: ., cksum 0x560a (correct), 1:1(0)
ack 382 win 6432
---

--- Actual data packet (IP packet size is the smaller of the two MSS+40)
20:22:26.923728 IP (tos 0x0, ttl  64, id 58602, offset 0, flags [DF], proto:
TCP (6), length: 1492) mail.beenic.net.www >
hnvr-4db2ebb3.pool.einsundeins.de.64905: . 1:1453(1452) ack 382 win 6432
---

--- Another data packet (again, smaller of the two MSS+40 bytes)
20:22:26.923739 IP (tos 0x0, ttl  64, id 58604, offset 0, flags [DF], proto:
TCP (6), length: 1492) mail.beenic.net.www >
hnvr-4db2ebb3.pool.einsundeins.de.64905: . 1453:2905(1452) ack 382 win 6432
---

And so on and so forth... This output was grabbed while I was loading an HTML
page of around 5 kB from the server, which means that at least one
TCP packet is filled up completely.
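
To check a capture like the one above without reading it by eye, a small script can pull the advertised MSS values out of the SYN and SYN/ACK lines. This is a minimal sketch, assuming the tcpdump text format shown in this mail (an option list in angle brackets containing `mss N`); the helper name `advertised_mss` is made up for the example.

```python
import re

# Matches "mss N" inside tcpdump's <...> TCP option list.
MSS_RE = re.compile(r"<[^>]*\bmss (\d+)")


def advertised_mss(line):
    """Return the MSS option found in a tcpdump line, or None if absent."""
    match = MSS_RE.search(line)
    return int(match.group(1)) if match else None


# Option parts of the SYN and SYN/ACK lines from the capture above.
syn = ("S, cksum 0xd2b5 (correct), 1315765383:1315765383(0) win 65535 "
       "<mss 1452,nop,wscale 0,nop,nop,timestamp 2442717 0,sackOK,eol>")
syn_ack = ("S, cksum 0x421a (correct), 1939516734:1939516734(0) "
           "ack 1315765384 win 5840 <mss 1460>")

limit = min(advertised_mss(syn), advertised_mss(syn_ack))
print(limit)       # 1452
print(limit + 40)  # 1492, the IP length of the full data packets
```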

Ping also makes it easy to spot this:

root@beenic01:/home/heiko# ping -s 1464 hnvr-4db2ebb3.pool.einsundeins.de
PING hnvr-4db2ebb3.pool.einsundeins.de (77.178.235.179) 1464(1492) bytes of data.
1472 bytes from hnvr-4db2ebb3.pool.einsundeins.de (77.178.235.179): icmp_seq=1 ttl=53 time=193 ms
1472 bytes from hnvr-4db2ebb3.pool.einsundeins.de (77.178.235.179): icmp_seq=2 ttl=53 time=191 ms
1472 bytes from hnvr-4db2ebb3.pool.einsundeins.de (77.178.235.179): icmp_seq=3 ttl=53 time=188 ms
1472 bytes from hnvr-4db2ebb3.pool.einsundeins.de (77.178.235.179): icmp_seq=4 ttl=53 time=191 ms

--- hnvr-4db2ebb3.pool.einsundeins.de ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3002ms
rtt min/avg/max/mdev = 188.379/191.356/193.704/1.912 ms
root@beenic01:/home/heiko#

1464 ping bytes (making a total IP+ICMP packet size of 1492) fit through the
pipe, but:

root@beenic01:/home/heiko# ping -s 1465 hnvr-4db2ebb3.pool.einsundeins.de
PING hnvr-4db2ebb3.pool.einsundeins.de (77.178.235.179) 1465(1493) bytes of data.
From rtsl-hnvr-de05.nw.mediaways.net (213.20.127.85) icmp_seq=1 Frag needed and DF set (mtu = 1492)
1473 bytes from hnvr-4db2ebb3.pool.einsundeins.de (77.178.235.179): icmp_seq=2 ttl=53 time=180 ms
1473 bytes from hnvr-4db2ebb3.pool.einsundeins.de (77.178.235.179): icmp_seq=3 ttl=53 time=179 ms
1473 bytes from hnvr-4db2ebb3.pool.einsundeins.de (77.178.235.179): icmp_seq=4 ttl=53 time=202 ms
1473 bytes from hnvr-4db2ebb3.pool.einsundeins.de (77.178.235.179): icmp_seq=5 ttl=53 time=198 ms
1473 bytes from hnvr-4db2ebb3.pool.einsundeins.de (77.178.235.179): icmp_seq=6 ttl=53 time=174 ms

--- hnvr-4db2ebb3.pool.einsundeins.de ping statistics ---
17 packets transmitted, 5 received, +1 errors, 70% packet loss, time 16003ms
rtt min/avg/max/mdev = 174.848/186.982/202.520/11.114 ms
root@beenic01:/home/heiko#

1465 ping bytes don't, or only do if fragmented. See the first message in the
ping output, where an infrastructure router of my ISP, which takes care of
routing the packet to me, tells the pinging host (the server) that 1493
IP bytes won't fit through the pipe to me, at least not if the packet is not
to be fragmented (which the DF flag in the IP header signifies). ping changes
the IP header to allow fragmentation from ICMP ping packet 2 on, and the
router happily starts fragmenting the packets for me, to which my host then
starts replying.
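
The sizes in the two ping runs follow from the header overhead: `ping -s N` sends N payload bytes plus an 8-byte ICMP header plus a 20-byte IPv4 header. A small sketch of that arithmetic, with made-up function names and assuming an IPv4 header without options:

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def ip_packet_size(ping_payload):
    """Total IP packet size produced by `ping -s ping_payload`."""
    return ping_payload + ICMP_HEADER + IPV4_HEADER


def max_ping_payload(path_mtu):
    """Largest `ping -s` value that fits unfragmented through path_mtu."""
    return path_mtu - IPV4_HEADER - ICMP_HEADER


print(ip_packet_size(1464))    # 1492, fits the 1492-byte path MTU
print(ip_packet_size(1465))    # 1493, one byte too large
print(max_ping_payload(1492))  # 1464
```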

Finally, my ISP stops fragmenting the packets (probably because of some routing
switch) after packet 6, and the packets don't come through to me anymore,
because the new router also doesn't fragment (and doesn't send an ICMP error
either, like the router I got for the first 6 packets did).

The ping test is ideal for testing the hypothesis of an MSS problem. Such a
problem can easily arise if some router/firewall between your host and the
destination mangles the MSS during connection setup. This has bitten me more
than once, especially in the presence of VLAN technology, which shrinks the
MTU of the Ethernet interface to 1496 bytes, making an MSS of 1456.
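
If you don't want to step through ping sizes by hand, the same test can be run as a bisection. The sketch below is an assumption-laden illustration, not a finished tool: `fits` is a hypothetical callable that should return True when a ping with that payload size and the DF flag set gets an answer, and the search assumes that every size below a passing size also passes. Here `fits` is faked with simple arithmetic so the example runs without network access.

```python
def largest_passing_payload(fits, low=0, high=1472):
    """Largest payload in [low, high] for which fits(payload) is true.

    Returns None if even the smallest payload does not get through.
    """
    if not fits(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always shrinks
        if fits(mid):
            low = mid
        else:
            high = mid - 1
    return low


def fake_path(mtu):
    """Stand-in for a real ping probe: payload + 28 header bytes must fit."""
    return lambda payload: payload + 28 <= mtu


print(largest_passing_payload(fake_path(1492)))  # 1464
print(largest_passing_payload(fake_path(1496)))  # 1468
```

Adding 28 bytes to the result gives the path MTU, and subtracting 40 from that gives the MSS the connection should be using.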

Anyway, hope this helps for now.

-- 
Heiko Wundram
Product & Application Development
-------------------------------------
Office Germany - EXPO PARK HANNOVER

Beenic Networks GmbH
Mailänder Straße 2
30539 Hannover

Phone      +49 511 / 590 935 - 15
Fax        +49 511 / 590 935 - 29
Mail       wundram@beenic.net


Beenic Networks GmbH
-------------------------------------
Registered office: Hannover
Managing director: Jorge Delgado
Registration number: HRB 61869
Court of registration: Amtsgericht Hannover


