From: Daniel Braniss
To: Garrett Wollman
Cc: freebsd-net@freebsd.org, freebsd-stable@freebsd.org, jackv@freebsd.org
Subject: Re: Network stack returning EFBIG?
Date: Thu, 20 Mar 2014 15:32:02 +0200

Turn off TSO. The problem sounds similar to the one I reported a while
back; turning off TSO fixed it.

danny

On Mar 20, 2014, at 3:26 PM, Garrett Wollman wrote:

> I recently put a new server running 9.2 (with local patches for NFS)
> into production, and it's immediately started to fail in an odd way.
> Since I pounded this server pretty heavily and never saw the error in
> testing, I'm more than a little bit taken aback.
> We have identical
> hardware in production with 9.1, and I have the same kernel running
> just peachy on a machine with Chelsio T4 NICs. The problem machine has
> ixgbe(4):
>
> ix0: port 0x9c00-0x9c1f mem 0xdef80000-0xdeffffff,0xdef7c000-0xdef7ffff irq 24 at device 0.0 on pci2
> ix0: Using MSIX interrupts with 7 vectors
> ix0: Ethernet address: 04:7d:7b:a5:87:32
> ix0: PCI Express Bus: Speed 5.0GT/s Width x4
> ix1: port 0x9880-0x989f mem 0xdee80000-0xdeefffff,0xdee7c000-0xdee7ffff irq 34 at device 0.1 on pci2
> ix1: Using MSIX interrupts with 7 vectors
> ix1: Ethernet address: 04:7d:7b:a5:87:33
> ix1: PCI Express Bus: Speed 5.0GT/s Width x4
>
> (pciconf tells me these are "82599EB 10-Gigabit SFI/SFP+ Network
> Connection". It's a bug that the driver doesn't tell me that.)
>
> These are glued together in a lagg(4) using LACP.
>
> Since we put this server into production, random network system calls
> have started failing with [EFBIG] or maybe sometimes [EIO]. I've
> observed this with a simple ping, but various daemons also log the
> errors:
>
> Mar 20 09:22:04 nfs-prod-4 sshd[42487]: fatal: Write failed: File too large [preauth]
> Mar 20 09:23:44 nfs-prod-4 nrpe[42492]: Error: Could not complete SSL handshake. 5
>
> The machine eventually becomes unreachable and has to be rebooted from
> the console.
>
> So, can anyone tell me how this is possible, and what changed between
> 9.1 and 9.2 to cause it?
>
> -GAWollman
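
For reference, the "turn off TSO" suggestion at the top of this message
could be applied roughly as follows. This is a sketch, not something
tested on this machine: ix0 and ix1 are the interface names from the
quoted boot messages, -tso is the ifconfig(8) flag that disables TCP
segmentation offload on one interface, and net.inet.tcp.tso is the
stack-wide sysctl. How the lagg(4) interface reflects the change is an
assumption to check with ifconfig afterwards.

```shell
# Disable TSO on both ixgbe(4) ports (names taken from the quoted dmesg).
ifconfig ix0 -tso
ifconfig ix1 -tso

# Alternative: disable TSO in the TCP stack for every interface.
sysctl net.inet.tcp.tso=0

# Check the result: TSO4/TSO6 should no longer be listed in "options=".
ifconfig ix0
ifconfig ix1
```

Both forms are lost on reboot unless they are also added to the usual
startup configuration.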