Date:      Fri, 20 Apr 2007 14:02:22 -0400
From:      Sven Willenberger <sven@dmv.com>
To:        Clayton Milos <clay@milos.co.za>
Cc:        Jeremy Chadwick <koitsu@freebsd.org>, freebsd-stable@freebsd.org
Subject:   Re: CARP and em0 timeout watchdog
Message-ID:  <1177092142.5457.20.camel@lanshark.dmv.com>
In-Reply-To: <03b401c7836b$7e125b20$9603a8c0@claylaptop>
References:  <1176911436.7416.8.camel@lanshark.dmv.com> <1177084316.5457.5.camel@lanshark.dmv.com> <20070420160431.GA17356@icarus.home.lan> <1177086339.5457.13.camel@lanshark.dmv.com> <03b401c7836b$7e125b20$9603a8c0@claylaptop>

On Fri, 2007-04-20 at 18:46 +0200, Clayton Milos wrote:
> ----- Original Message ----- 
> From: "Sven Willenberger" <sven@dmv.com>
> To: "Jeremy Chadwick" <koitsu@FreeBSD.org>
> Cc: <freebsd-stable@FreeBSD.org>
> Sent: Friday, April 20, 2007 6:25 PM
> Subject: Re: CARP and em0 timeout watchdog
> 
> 
> > On Fri, 2007-04-20 at 09:04 -0700, Jeremy Chadwick wrote:
> >> On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
> >> > Having done more diagnostics I have found out it is not CARP related at
> >> > all. It turns out that the same timeouts will happen when ftp'ing to
> >> > the physical address IPs as well. There is also an odd situation here
> >> > depending on which protocol I use. The two boxes are connected to a
> >> > Dell Powerconnect 2616 gig switch with CAT6. If I scp files from the
> >> > 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without
> >> > a hiccup (I used dd to create various sized testfiles from 32M to 1G in
> >> > size and just scp testfile* to the other box). On the other hand, if I
> >> > connect to 192.168.0.19 using ftp (either active or passive) where ftp
> >> > is being run through inetd, the interface resets (watchdog) within
> >> > seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
> >> > changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
> >> > such behavioral differences between scp and ftp?
> >>
> >> You'll get a much higher throughput rate with FTP than you will with
> >> SSH, simply because encryption overhead is quite high (even with the
> >> Blowfish cipher).  With a very fast processor and on a gigE network
> >> you'll probably see 8-9MByte/sec via SSH versus 60-70MByte/sec via FTP.
> >> That's the only difference I can think of.
> >>
> >> The watchdog resets I can't explain; Jack Vogel should be able to assist
> >> with that.  But it sounds like the resets only happen under very high
> >> throughput conditions (which is why you'd see it with FTP but not SSH).
> >>
> >
> > I guess it is possible that the traffic from ftp (or smb) is overloading
> > the interface; fwiw, if I increase the {recv,send}space to 131072 I can
> > achieve 32MB+/s using scp (and ftp shows similar values). The real
> > question is how to avoid these watchdog timeouts during heavy traffic;
> > the whole point here was to replace windows-based fileshare servers with
> > FreeBSD for the local network but at the moment it is proving
> > ineffectual as any samba file transfers stall (much like ftp). I see no
> > other error messages in the logfiles other than the watchdog timeouts
> > plus interface down/up messages.
> >
> > Sven
> >
> 
> Sorry for jumping on a thread here. I've had issues with em NICs as well,
> especially under heavy load. What helped for me was turning on polling. I
> recompiled the kernel with polling and turned it on in rc.conf, and my
> problems disappeared.
> 
> Are you running with polling on?
> 

At first I did not have polling compiled in, so no. Then I compiled in
polling (and used options HZ=2000) but it didn't change anything.
Whether I have polling enabled or disabled on the interface, the outcome
is the same.
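
For reference, a polling setup of the kind described above typically
looks like the fragment below on FreeBSD 6.x. This is only a sketch, not
the exact configuration used on these boxes: the interface name em0 comes
from the subject line, the address is the one mentioned earlier in the
thread, and the netmask is an assumption.

```
# Kernel configuration file: add these options, then rebuild and reboot.
options DEVICE_POLLING
options HZ=2000

# At runtime: enable or disable polling on the interface.
ifconfig em0 polling
ifconfig em0 -polling

# /etc/rc.conf: keep polling enabled across reboots.
# (The netmask here is an assumed placeholder.)
ifconfig_em0="inet 192.168.0.19 netmask 255.255.255.0 polling"
```

Running ifconfig em0 with no further arguments should list POLLING among
the interface options when polling is active, which is a quick way to
confirm which mode a given test was run in.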

Sven



