Date:      Fri, 20 Apr 2007 18:46:48 +0200
From:      "Clayton Milos" <clay@milos.co.za>
To:        "Sven Willenberger" <sven@dmv.com>
Cc:        Jeremy Chadwick <koitsu@FreeBSD.org>, freebsd-stable@FreeBSD.org
Subject:   Re: CARP and em0 timeout watchdog
Message-ID:  <03b401c7836b$7e125b20$9603a8c0@claylaptop>
References:  <1176911436.7416.8.camel@lanshark.dmv.com><1177084316.5457.5.camel@lanshark.dmv.com><20070420160431.GA17356@icarus.home.lan> <1177086339.5457.13.camel@lanshark.dmv.com>

----- Original Message ----- 
From: "Sven Willenberger" <sven@dmv.com>
To: "Jeremy Chadwick" <koitsu@FreeBSD.org>
Cc: <freebsd-stable@FreeBSD.org>
Sent: Friday, April 20, 2007 6:25 PM
Subject: Re: CARP and em0 timeout watchdog


> On Fri, 2007-04-20 at 09:04 -0700, Jeremy Chadwick wrote:
>> On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
>> > Having done more diagnostics I have found out it is not CARP related at
>> > all. It turns out that the same timeouts will happen when ftp'ing to the
>> > physical address IPs as well. There is also an odd situation here
>> > depending on which protocol I use. The two boxes are connected to a Dell
>> > Powerconnect 2616 gig switch with CAT6. If I scp files from the
>> > 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
>> > hiccup (I used dd to create various sized testfiles from 32M to 1G in
>> > size and just scp testfile* to the other box). On the other hand, if I
>> > connect to 192.168.0.19 using ftp (either active or passive) where ftp
>> > is being run through inetd, the interface resets (watchdog) within
>> > seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
>> > changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
>> > such behavioral differences between scp and ftp?
>>
>> You'll get a much higher throughput rate with FTP than you will with
>> SSH, simply because encryption overhead is quite high (even with the
>> Blowfish cipher).  With a very fast processor and on a gigE network
>> you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
>> That's the only difference I can think of.
>>
>> The watchdog resets I can't explain; Jack Vogel should be able to assist
>> with that.  But it sounds like the resets only happen under very high
>> throughput conditions (which is why you'd see it with FTP but not SSH).
>>
>
> I guess it is possible that the traffic from ftp (or smb) is overloading
> the interface; fwiw, if I increase the {recv,send}space to 131072 I can
> achieve 32MB+/s using scp (and ftp shows similar values). The real
> question is how to avoid these watchdog timeouts during heavy traffic;
> the whole point here was to replace windows-based fileshare servers with
> FreeBSD for the local network but at the moment it is proving
> ineffectual as any samba file transfers stall (much like ftp). I see no
> other error messages in the logfiles other than the watchdog timeouts
> plus interface down/up messages.
>
> Sven
>

Sorry for jumping into the thread here. I've had issues with em NICs as
well, especially under heavy load. What helped for me was turning on
polling: I recompiled the kernel with polling support and turned it on in
rc.conf, and my problems disappeared.
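
A minimal sketch of what that usually involves, assuming a FreeBSD release
where polling is switched per interface through ifconfig. The interface
name, address and netmask are placeholders, and the HZ line is an optional
tuning assumed here, not something confirmed in this thread.

```
# kernel configuration file: build in polling support, then rebuild the kernel
options         DEVICE_POLLING
options         HZ=1000          # optional, assumed; often paired with polling

# /etc/rc.conf: add the "polling" flag to the interface line (placeholder values)
ifconfig_em0="inet 192.168.0.19 netmask 255.255.255.0 polling"
```

On a running system, "ifconfig em0 polling" should enable it by hand, and
"ifconfig em0" should then list POLLING among the interface options.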

Are you running with polling on?

-Clay



