Date:      Fri, 20 Apr 2007 11:27:56 -0700
From:      "Jack Vogel" <jfvogel@gmail.com>
To:        "Sven Willenberger" <sven@dmv.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: CARP and em0 timeout watchdog
Message-ID:  <2a41acea0704201127x319be08cw869efe1dd02a046e@mail.gmail.com>
In-Reply-To: <1177091905.5457.17.camel@lanshark.dmv.com>
References:  <1176911436.7416.8.camel@lanshark.dmv.com> <1177084316.5457.5.camel@lanshark.dmv.com> <20070420160431.GA17356@icarus.home.lan> <2a41acea0704201017n42d4e987l77752ee8f7ca9f1f@mail.gmail.com> <1177091905.5457.17.camel@lanshark.dmv.com>

On 4/20/07, Sven Willenberger <sven@dmv.com> wrote:
> On Fri, 2007-04-20 at 10:17 -0700, Jack Vogel wrote:
> > On 4/20/07, Jeremy Chadwick <koitsu@freebsd.org> wrote:
> > > On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
> > > > Having done more diagnostics I have found out it is not CARP related at
> > > > all. It turns out that the same timeouts will happen when ftp'ing to the
> > > > physical address IPs as well. There is also an odd situation here
> > > > depending on which protocol I use. The two boxes are connected to a Dell
> > > > Powerconnect 2616 gig switch with CAT6. If I scp files from the
> > > > 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
> > > > hiccup (I used dd to create various sized testfiles from 32M to 1G in
> > > > size and just scp testfile* to the other box). On the other hand, if I
> > > > connect to 192.168.0.19 using ftp (either active or passive) where ftp
> > > > is being run through inetd, the interface resets (watchdog) within
> > > > seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
> > > > changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
> > > > such behavioral differences between scp and ftp?
> > >
> > > You'll get a much higher throughput rate with FTP than you will with
> > > SSH, simply because encryption overhead is quite high (even with the
> > > Blowfish cipher).  With a very fast processor and on a gigE network
> > > you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
> > > That's the only difference I can think of.
> > >
> > > The watchdog resets I can't explain; Jack Vogel should be able to assist
> > > with that.  But it sounds like the resets only happen under very high
> > > throughput conditions (which is why you'd see it with FTP but not SSH).
> >
> > What kind of hardware is this interface? Watchdogs mean TX cleanup
> > isn't happening in a reasonable time; without further data it's hard to
> > know what might be going on.
> >
> > Jack
>
> from pciconf:
>
> em0@pci13:0:0:  class=0x020000 card=0x108c15d9 chip=0x108c8086 rev=0x03
> hdr=0x00
>     vendor   = 'Intel Corporation'
>     device   = 'PRO/1000 PM'
>     class    = network
>     subclass = ethernet
> em1@pci14:0:0:  class=0x020000 card=0x109a15d9 chip=0x109a8086 rev=0x00
> hdr=0x00
>     vendor   = 'Intel Corporation'
>     class    = network
>     subclass = ethernet
>
> em0 is the interface in question.
>
> from dmesg:
>
> em0: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port
> 0x4000-0x401f mem 0xe0300000-0xe031ffff irq 16 at device 0.0 on pci13
>
> em1: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port
> 0x5000-0x501f mem 0xe0400000-0xe041ffff irq 17 at device 0.0 on pci14

Oh, this is an 82573. I've posted a firmware patcher a couple of
different times: there is a bit in the MANC register that is incorrectly
programmed in some vendors' systems. Can you search email for
that patcher? It needs to run from DOS. If you are unable to find
it, let me know and I'll resend you a copy.
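
For anyone checking their own hardware: the chip= field in the pciconf
output quoted above packs the PCI device ID in the upper 16 bits and the
vendor ID in the lower 16 bits (0x8086 is Intel). A minimal sketch of
splitting it, assuming Python is at hand; the helper name
split_pciconf_chip is made up for illustration and is not part of any
FreeBSD tool:

```python
def split_pciconf_chip(chip):
    """Split a pciconf 'chip=' value into (device_id, vendor_id).

    The value is a 32-bit hex number: device ID in the high 16 bits,
    vendor ID in the low 16 bits.
    """
    value = int(chip, 16)            # accepts a leading "0x"
    device_id = (value >> 16) & 0xFFFF
    vendor_id = value & 0xFFFF
    return device_id, vendor_id


# chip= values copied from the pciconf output in this thread
for name, chip in (("em0", "0x108c8086"), ("em1", "0x109a8086")):
    device_id, vendor_id = split_pciconf_chip(chip)
    print(f"{name}: vendor=0x{vendor_id:04x} device=0x{device_id:04x}")
```

The device ID is what the driver matches on to pick the controller
family, so it is the value to quote when asking whether a given part is
affected.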

Jack


