From: Scott Willson
To: freebsd-questions@freebsd.org
Date: Sat, 27 Oct 2007 20:34:14 -0700
Subject: Re: Panic With Large Network Copy
Message-Id: <04B8FF2B-ADD6-42B2-9265-F6BBAD90F343@butlerpress.com>
References: <20070529232621.GB1575@rot13.obsecurity.org>

On Jun 4, 2007, at 10:49 AM, Scott Willson wrote:

>
> On May 29, 2007, at 4:26 PM, Kris Kennaway wrote:
>
>> On Tue, May 29, 2007 at 03:36:49PM -0700, Scott Willson wrote:
>>> I am seeing hard (often no core dump) crashes on a new AMD64 box
>>> running 6.2 RELEASE. When I try to rsync 10+ GB of backup files to
>>> the new box, I can reliably crash it after about 20 minutes; often
>>> quicker if I do something else intensive at the same time, like
>>> compile MySQL. Here are the box specs:
>>> ASUS M2NPV-VM motherboard
>>> AMD A64 3800+ 2.4G CPU
>>> ...
>
>>> Most times, I don't even get a core dump. Here's one I did get:
>>> panic: double fault
>>> ...
>
>>> #9 0xffffffff804371f0 in m_freem (mb=0x0) at uma.h:303
>>> #10 0xffffffff80634125 in nve_ospackettx (ctx=0xffffff00798aac00,
>>> id=0xffffffffb19ea6d0, success=0) at /usr/src/sys/dev/nve/if_nve.c:1551
>>
>> This looks like a nve driver bug to me. You may wish to try the
>> nfe driver.
>>
>> Kris
>
> OK, my box is running nicely now. The nfe driver was indeed a good
> idea, thanks! Here are the details if anyone else has similar
> problems.
>
> 10baseT hub + nve = kernel panics under high load
> This is the default FreeBSD 6.2 RELEASE configuration.
>
> 10baseT hub + nfe + e100phy patch = errors under high load (tx v2
> error 0x6204, watchdog timeout)
> http://www.se.hiroshima-u.ac.jp/~shigeaki/software/freebsd-nfe.html
> This is a replacement driver + recommended patch for my hardware. No
> panics, but many errors.
>
> 10baseT hub + nfe with no patches = errors under high load (tx v2
> error 0x6204, watchdog timeout)
>
> 10/100/1000baseT switch + nfe + e100phy patch = errors under high
> load (tx v2 error 0x6204, watchdog timeout)
>
> 10/100/1000baseT switch + nfe = No errors!
> This is a new switch and the nfe driver with no patch. In dmesg, I
> see 'ukphy0' when I boot.
>
> So, as you may have surmised, my motherboard + an old 10baseT hub
> doesn't work right with any driver. I replaced my very old hub with
> a new switch, and I am now running the nfe driver with ukphy0. This
> combination works great.

Well, it turns out that, after all that, the root cause was something else again.
Fiddling with the driver and the switch helped, but I still saw random
drops and warnings. Once I installed GNOME, the system began to bomb
regularly with oversized frames. It turns out that the USB controller
and the NIC were sharing the same IRQ. I'm not a hardware engineer
(obviously), but it seems that software probing the USB ports was
causing problems for the Ethernet NIC. I don't need USB at all, so I
disabled it in the BIOS, and there have been no problems since, for
real this time.

Just wanted to post this for the equally clueless.
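
For anyone wanting to check for the same kind of IRQ sharing, here is a rough Python sketch that scans interrupt listings in the general style of `vmstat -i` and reports any IRQ with more than one device attached. The sample text, device names, and counts are invented for illustration (they are not from my box), and the exact output format differs between FreeBSD releases, so treat the regular expression as an assumption to adjust rather than a definitive parser.

```python
import re

# Hypothetical sample in the general style of `vmstat -i` output.
# Device names and counts are invented for illustration only.
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                         412          0
irq14: ata0                        58211          3
irq21: nfe0 ohci0                 904113         52
cpu0: timer                     34567890       1999
Total                           35530626       2054
"""

# Assumed line shape: "irq<N>: <dev> [<dev> ...] <total> <rate>".
IRQ_LINE = re.compile(r"^irq(\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$")


def shared_irqs(text):
    """Return {irq_number: [device, ...]} for IRQs listing 2+ devices."""
    shared = {}
    for line in text.splitlines():
        match = IRQ_LINE.match(line)
        if not match:
            continue  # header, timer, and total lines do not match
        irq = int(match.group(1))
        devices = match.group(2).split()
        if len(devices) > 1:
            shared[irq] = devices
    return shared


if __name__ == "__main__":
    for irq, devices in sorted(shared_irqs(SAMPLE).items()):
        print(f"irq{irq} is shared by: {', '.join(devices)}")
```

Run against the invented sample, this prints "irq21 is shared by: nfe0, ohci0". On a real machine you would feed it the saved output of `vmstat -i` instead of SAMPLE. A shared IRQ is not automatically a problem; it is only a hint about where to look.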