Date:      Fri, 15 Jul 2011 17:47:34 +0100
From:      "Steven Hartland" <killing@multiplay.co.uk>
To:        <freebsd-net@freebsd.org>
Subject:   igb enable_aim or flow_control causing tcp stalls?
Message-ID:  <379885BA631F4C7787C24E00A174B429@multiplay.co.uk>

Been trying to identify a strange network stalling issue while using
scp or rsync between two machines, initially at remote locations.

The behaviour has proved quite difficult to track as it seems to require a
number of factors combined before the stalls occur. These seem to be:
1. This particular target machine
2. Some load, but not much, on the machine; when idle we don't see stalls.
3. Either remote (9ms+ latency) or high throughput (50MB/s) transfers

My current test case is copying a FreeBSD ISO from a local machine to
the potentially problematic machine's /dev/null, e.g.
scp FreeBSD-8.2-RELEASE-amd64-disc1.iso test1:/dev/null

These machines are connected via a cisco 6509 -> supermicro blade
chassis.

When the failure happens we see the following:-
scp FreeBSD-8.2-RELEASE-amd64-disc1.iso amsbld16:/dev/null
FreeBSD-8.2-RELEASE-amd64-disc1.iso   21%  147MB   2.1MB/s - stalled -

When all is well we see:-
scp FreeBSD-8.2-RELEASE-amd64-disc1.iso amsbld16:/dev/null
FreeBSD-8.2-RELEASE-amd64-disc1.iso   100%  691MB  53.1MB/s   00:13

This setup:-
1. Source machine 7.0-RELEASE-p2 using em0
em0@pci0:6:0:0: class=0x020000 card=0x109615d9 chip=0x10968086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'PRO/1000 EB Network Connection'
    class      = network
    subclass   = ethernet
2. Target (problem) machine 8.2-RELEASE using igb0
igb0@pci0:5:0:0:        class=0x020000 card=0x10e715d9 chip=0x10e78086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = network
    subclass   = ethernet

I've tried switching to igb1 with no change, which also changes
switches and hence ports on the Cisco, so I don't at this point
believe there is an issue there.

Now I've just noticed that igb has at least two sysctls which
seemed interesting, enable_aim & flow_control (the latter is missing
from the man page btw). On disabling both, the stalls seem to go away.
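
For anyone wanting to repeat the toggle, here is a minimal sketch. It
assumes the two knobs are exposed per device as dev.igb.N.enable_aim and
dev.igb.N.flow_control (names assumed here; check the output of
"sysctl dev.igb.0" on your own system). The igb_tune helper and the APPLY
variable are hypothetical, made up for this example. By default it only
prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: set both igb knobs on one unit to the same value.
# Assumption: the OIDs are dev.igb.<unit>.enable_aim and
# dev.igb.<unit>.flow_control; verify with "sysctl dev.igb.<unit>".
# Dry-run by default; set APPLY=1 to actually invoke sysctl (needs root).
igb_tune() {
    unit="$1"     # igb unit number, e.g. 0 for igb0
    value="$2"    # 0 to disable, non-zero to enable
    for knob in enable_aim flow_control; do
        cmd="sysctl dev.igb.${unit}.${knob}=${value}"
        if [ "${APPLY:-0}" = "1" ]; then
            $cmd
        else
            echo "$cmd"
        fi
    done
}

# Print the commands that would disable both knobs on igb0.
igb_tune 0 0
```

Re-running the scp test after each individual change, rather than after
changing both knobs at once, should help show which one matters.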

Unfortunately re-enabling them didn't re-introduce the stalls, but
this could be another quirk where they don't re-enable properly?

So the questions are:-
1. Could either of these settings cause TCP stalls?
2. If the NIC and switch differ in flow control, what is the likely
effect?
3. Any other thoughts?

    Regards
    Steve
