Date:      Thu, 24 Nov 2011 11:43:03 +0000
From:      Matthew Seaman <m.seaman@infracaninophile.co.uk>
To:        freebsd-questions@freebsd.org
Subject:   Re: Diagnosing packet loss
Message-ID:  <4ECE2DC7.2000800@infracaninophile.co.uk>
In-Reply-To: <97326E87-B3A2-460F-AE9D-259710B36EA2@gmail.com>
References:  <B0BE38BD-CE86-4D42-9215-933150BA07D9@gmail.com> <4ECC2CD0.8040902@sentex.net> <97326E87-B3A2-460F-AE9D-259710B36EA2@gmail.com>

On 24/11/2011 10:07, Kees Jan Koster wrote:
> This seems to be local to my machine. Here is another reason why I
> say that: I can reliably transmit data when I bind to the aliased IP
> address: If I use mtr to measure packet loss from saffron (the stricken
> machine) to cumin (another machine in a different data center) I see the
> following:
>
>  saffron (ip address a) -> cumin: packet loss
>  saffron (ip address b) -> cumin: no packet loss
>
>  cumin -> saffron (ip address a): packet loss
>  cumin -> saffron (ip address b): no packet loss
>
> This is consistent from running mtr for 5 minutes straight. This to
> me shows that the hardware is fine. Using the alias IP address I can
> run with no packet loss for as long as I like.
>
> Sooo.... Now what? I am completely at a loss. :-/

Hmm... I wouldn't dismiss hardware problems just yet. Earlier you showed
the ifconfig output for your problem machine:

> [kjkoster@saffron ~]$ ifconfig bge0
> bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> 	options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
> 	ether 00:e0:81:32:ed:b4
> 	inet 91.196.169.165 netmask 0xfffffff8 broadcast 91.196.169.167
> 	inet 91.196.169.166 netmask 0xffffffff broadcast 91.196.169.166
> 	media: Ethernet autoselect (100baseTX <full-duplex,flowcontrol,rxpause,txpause>)
> 	status: active

Note that the two addresses differ only in their low-order bits: .165 is
odd and .166 is even.  Can you try temporarily using two even-numbered
addresses and then two odd-numbered addresses and repeat your mtr tests?
If the packet loss problem correlates with whether the address is even
or odd, then I think that's pretty good evidence for a dud network
interface: a one-bit problem in a memory register somewhere,
occasionally flipping the least significant bit in the address to 0.
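
To make the bit arithmetic concrete, here is a rough Python sketch (the
helper names are made up for this illustration, not part of any tool).
Strictly, .165 and .166 differ in their two lowest bits (an XOR of 3);
the even/odd test isolates the lowest one. The sketch also shows which
address each one would turn into if its least significant bit were
forced to 0, which is the failure mode I am guessing at:

```python
import ipaddress

def differing_bits(a: str, b: str) -> int:
    """Return the XOR of two IPv4 addresses; set bits mark where they differ."""
    return int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))

def clear_lsb(a: str) -> str:
    """Return the address obtained by forcing the least significant bit to 0."""
    return str(ipaddress.IPv4Address(int(ipaddress.IPv4Address(a)) & 0xFFFFFFFE))

# The two addresses from the ifconfig output above.
primary, alias = "91.196.169.165", "91.196.169.166"

mask = differing_bits(primary, alias)
print(f"differing bits in last octet: {mask & 0xFF:08b}")

for addr in (primary, alias):
    # An odd address changes when its low bit is cleared; an even one does not.
    print(addr, "->", clear_lsb(addr))
```

Only the odd address changes under clear_lsb, so if this guess is right
the loss should follow odd addresses and leave even ones alone.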

Another test would be to swap the configuration order (i.e. make .166 the
primary address and .165 the alias) -- if it's always the first
configured address that has problems, that again indicates memory
trouble in the hardware.
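
The two tests only separate the causes when taken together, since in the
current setup the odd address is also the primary one. As a sketch of
how the results could be read, assuming you record for each mtr run the
source address, whether it was the first configured address, and whether
loss was seen (the Run type, the function names and the sample results
below are all hypothetical):

```python
import ipaddress
from typing import Callable, List, NamedTuple

class Run(NamedTuple):
    address: str   # address the mtr run was bound to
    primary: bool  # True if it was the first address configured on the interface
    lossy: bool    # True if mtr reported packet loss

def is_odd(address: str) -> bool:
    """True if the least significant bit of the IPv4 address is set."""
    return int(ipaddress.IPv4Address(address)) & 1 == 1

def tracks(runs: List[Run], predicate: Callable[[Run], bool]) -> bool:
    """True if loss occurs exactly where predicate holds, or exactly where it does not."""
    same = all(r.lossy == predicate(r) for r in runs)
    inverse = all(r.lossy != predicate(r) for r in runs)
    return same or inverse

def diagnose(runs: List[Run]) -> str:
    """Say whether loss follows address parity, configuration order, both or neither."""
    by_parity = tracks(runs, lambda r: is_odd(r.address))
    by_order = tracks(runs, lambda r: r.primary)
    if by_parity and by_order:
        return "both"      # tests so far cannot tell the two causes apart
    if by_parity:
        return "parity"
    if by_order:
        return "order"
    return "neither"

# Hypothetical results: original order first, then with the addresses swapped.
runs = [
    Run("91.196.169.165", primary=True, lossy=True),
    Run("91.196.169.166", primary=False, lossy=False),
    Run("91.196.169.166", primary=True, lossy=False),
    Run("91.196.169.165", primary=False, lossy=True),
]
print(diagnose(runs[:2]))  # original configuration only
print(diagnose(runs))      # after also swapping the configuration order
```

With only the first two runs the answer is "both", which is why the swap
(or the even/even and odd/odd pairs) is needed before drawing a
conclusion.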

Are these NICs built into your motherboard?  If so, they will almost
certainly share a PHY, which is where the problem would be, and why
swapping the cables between interfaces made no difference.
Unfortunately, in that case, to fix the problem you'll either have to
swap out the motherboard or add a separate NIC to your system.
Hopefully the system is still under warranty.

	Cheers,

	Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.                   7 Priory Courtyard
                                                  Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey     Ramsgate
JID: matthew@infracaninophile.co.uk               Kent, CT11 9PW

