From owner-freebsd-stable@FreeBSD.ORG  Sat Aug 21 07:54:14 2010
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5C1E41065670
	for <freebsd-stable@freebsd.org>; Sat, 21 Aug 2010 07:54:14 +0000 (UTC)
	(envelope-from jdc@koitsu.dyndns.org)
Received: from qmta08.emeryville.ca.mail.comcast.net
	(qmta08.emeryville.ca.mail.comcast.net [76.96.30.80])
	by mx1.freebsd.org (Postfix) with ESMTP id 434558FC13
	for <freebsd-stable@freebsd.org>; Sat, 21 Aug 2010 07:54:13 +0000 (UTC)
Received: from omta03.emeryville.ca.mail.comcast.net ([76.96.30.27])
	by qmta08.emeryville.ca.mail.comcast.net with comcast
	id wvrb1e0010b6N64A8vuDY6; Sat, 21 Aug 2010 07:54:13 +0000
Received: from koitsu.dyndns.org ([98.248.41.155])
	by omta03.emeryville.ca.mail.comcast.net with comcast
	id wvuC1e0033LrwQ28PvuDUe; Sat, 21 Aug 2010 07:54:13 +0000
Received: by icarus.home.lan (Postfix, from userid 1000)
	id 713589B425; Sat, 21 Aug 2010 00:54:12 -0700 (PDT)
Date: Sat, 21 Aug 2010 00:54:12 -0700
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: Mark Morley <mark@islandnet.com>
Message-ID: <20100821075412.GA37192@icarus.home.lan>
References: <20100816063550.GA35083@icarus.home.lan>
	<9c4ecm1t.1282067644@helpdesk.islandnet.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <9c4ecm1t.1282067644@helpdesk.islandnet.com>
User-Agent: Mutt/1.5.20 (2009-06-14)
Cc: FreeBSD Stable <freebsd-stable@freebsd.org>, Jack Vogel <jfvogel@gmail.com>
Subject: Re: NFS stalling on 8.1-STABLE
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 21 Aug 2010 07:54:14 -0000

On Tue, Aug 17, 2010 at 10:54:04AM -0700, Mark Morley wrote:
> On Sun, 15 Aug 2010 23:35:50 -0700 Jeremy Chadwick <freebsd@jdc.parodius.com> wrote:
> >On Thu, Aug 12, 2010 at 10:35:49AM -0700, Mark Morley wrote:
> >> I have five front end web servers that all mount their content from
> >> the same server via NFS.  If I stress the link on any one of the
> >> machines (eg: copy a large directory with a lot of files to/from the
> >> mounted file system) the client will pause.  That is, all processes
> >> trying to access that mount will freeze.  The log files fill with hundreds
> >> or thousands of "nfs server not responding" / "is alive again" messages.
> >> After 60 seconds it returns to normal, unless the load is still there
> >> in which case it continues to pause.
> >>
> >> This has only started happening since I upgraded the client machines
> >> to 8.1-STABLE (previously four of them were 8.0 and one was 7.3).  The
> >> server is 7.1-RELEASE-p11.  No other changes have taken place in terms
> >> of hardware or software or mount options, etc.
> >>
> >> All nics involved are gigabit em cards, and they are on a private
> >> network (web access to the boxes is via an external interface).
> >
> >Are there any indications in dmesg that the NIC is responsible, e.g.
> >interface down/up, etc.?
> 
> No, nothing like that.
> 
> >Does switching to UDP-based NFS solve the problem for you?
> 
> Trying that now for the past 24 hours or so.  Four of the machines seem
> ok so far, but the fifth one has started dropping the mount entirely.
> Access to it gives an "Input / output error" message.  Forcing a
> dismount and remounting brings it back.
> 
> >What OS version (uname -a) and NIC are used on the NFS server?
> 
> FreeBSD xxx 7.1-RELEASE-p11 FreeBSD 7.1-RELEASE-p11 #0: Wed May 26 03:20:59 PDT 2010
> root@xxx:/usr/obj/usr/src/sys/CUSTOM  i386
> 
> NICs are em
> 
> >Can you please provide the following output from one of the client
> >machines running 8.1-STABLE with gigE em(4)?  You can X-out machine
> >names, MAC addresses, and IP addresses/netblocks if need be.
> >
> >* uname -a
> 
> FreeBSD xxx 8.1-STABLE FreeBSD 8.1-STABLE #0: Tue Jul 27 16:27:44 PDT 2010
> root@xxx:/usr/obj/usr/src/sys/CUSTOM  amd64
> 
> >* ifconfig emX  (where X is the interface number which would be
> >  used for NFS)
> 
> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> options=209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
> ether 00:0e:0c:85:d5:0d
> inet 192.168.1.30 netmask 0xffffff00 broadcast 192.168.1.255
> media: Ethernet 1000baseT <full-duplex>
> status: active
> 
> >* netstat -idn -I emX
> 
> Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll Drop
> em0    1500 <Link#1>      00:0e:0c:85:d5:0d 39913814     2     0 39949943     0     0    0
> em0    1500 192.168.1.0/2 192.168.1.30      39944016     -     - 39949664     -     -    -
> 
> >* pciconf -lvc  (provide only the data for emX please)
> 
> em0@pci0:1:6:0: class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
> vendor     = 'Intel Corporation'
> device     = 'Gigabit Ethernet Controller (Copper) rev 5 (82541PI)'
> class      = network
> subclass   = ethernet
> cap 01[dc] = powerspec 2  supports D0 D3  current D0
> cap 07[e4] = PCI-X supports 2048 burst read, 1 split transaction
> 
> >* vmstat -i
> 
> interrupt                          total       rate
> irq1: atkbd0                         239          0
> irq16: em0                      36746591        883
> irq18: em1                      12658607        304
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I'm ignoring em1 because em0 is the one which has the NFS traffic, and
em1 could in fact be a different model of Intel NIC (it's very common
for server vendors to include two different models of NIC on the same
board; sure, both em(4), but different models), so I'm staying focused
on em0.

The interrupt rate here looks quite high for a system that may not be
doing anything (I don't know for sure).  Can you provide output from
"netstat -I em0 -n -b 1" and let it run for about 60 seconds?  This
should be done both when NFS is UDP-only, as well as when NFS is
TCP-only.  I'm curious what kind of network throughput you're seeing (in
an attempt to correlate it with the interrupt rate).  If network I/O is
very low yet the interrupt rate is very high, the problem may be a
driver bug or something with PCI configuration/initialisation.
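
As a rough cross-check, the totals already quoted in this message can be
turned into since-boot averages.  The sketch below is only an
illustration: it assumes that vmstat's "rate" column is total divided by
uptime (so uptime can be estimated from the cpu0 timer line), and that
the netstat and vmstat snapshots were taken close together.  The
function name summarize() is made up for this example.  It says nothing
about what happens during a stall, which is why the per-second netstat
output is still needed.

```python
# Figures copied from the quoted "vmstat -i" and "netstat -idn -I em0" output.
EM0_INTERRUPTS = 36_746_591   # irq16: em0, total
TIMER_TICKS = 83_207_296      # cpu0: timer, total
TIMER_RATE = 2000             # cpu0: timer, rate (per second, as reported)
IPKTS = 39_913_814            # em0 <Link#1> Ipkts
OPKTS = 39_949_943            # em0 <Link#1> Opkts


def summarize():
    """Return since-boot averages derived from the counters above.

    Assumption: uptime is approximated as timer total / timer rate.
    """
    uptime_s = TIMER_TICKS / TIMER_RATE
    packets = IPKTS + OPKTS
    return {
        "uptime_hours": uptime_s / 3600,
        "interrupts_per_second": EM0_INTERRUPTS / uptime_s,
        "packets_per_second": packets / uptime_s,
        "packets_per_interrupt": packets / EM0_INTERRUPTS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

With these inputs the averages come to roughly 11.5 hours of uptime,
about 1900 packets per second, and a little over 2 packets per em0
interrupt.  Those are averages over the whole uptime, so they cannot
show whether the interrupt rate stays high while the link is idle.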

I'm also CC'ing Jack Vogel of Intel, who may have some insight into
what's going on here.

> irq21: ohci0                           2          0
> irq22: ehci0                      528002         12
> irq23: atapci1                   2334936         56
> cpu0: timer                     83207296       2000
> cpu1: timer                     83207289       2000
> Total                          218682962       5256
> 
> >* sysctl hw.pci
> 
> hw.pci.usb_early_takeover: 1
> hw.pci.honor_msi_blacklist: 1
> hw.pci.enable_msix: 1
> hw.pci.enable_msi: 1
> hw.pci.do_power_resume: 1
> hw.pci.do_power_nodriver: 0
> hw.pci.enable_io_modes: 1
> hw.pci.default_vgapci_unit: -1
> hw.pci.host_mem_start: 2147483648
> hw.pci.mcfg: 1
> 
> >* As root, run "sysctl dev.em.X.stats=1" then do "dmesg" and
> >  provide the output for NIC statistics (will start with "emX:")
> 
> em0: Excessive collisions = 0
> em0: Sequence errors = 0
> em0: Defer count = 52
> em0: Missed Packets = 0
> em0: Receive No Buffers = 0
> em0: Receive Length Errors = 0
> em0: Receive errors = 1
> em0: Crc errors = 1
> em0: Alignment errors = 0
> em0: Collision/Carrier extension errors = 0
> em0: RX overruns = 0
> em0: watchdog timeouts = 0
> em0: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
> em0: XON Rcvd = 54
> em0: XON Xmtd = 0
> em0: XOFF Rcvd = 54
> em0: XOFF Xmtd = 0
> em0: Good Packets Rcvd = 39915088
> em0: Good Packets Xmtd = 39951839

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |