From: "Arno J. Klaassen"
To: Kris Kennaway
Date: 21 Apr 2008 16:52:55 +0200
References: <20080421094718.GY25623@hub.freebsd.org>
In-Reply-To: <20080421094718.GY25623@hub.freebsd.org>
Cc: stable@FreeBSD.ORG, Clayton Milos, net@FreeBSD.ORG
Subject: Re: nfs-server silent data corruption

Kris Kennaway writes:

> On Mon, Apr 21, 2008 at 01:02:33AM +0200, Arno J. Klaassen wrote:
>
> > I didn't stress-test this MB for a while, but last time I did was
> > with 7-PRELEASE/RC?/CANTremember-exactly-but-close-to-release
> > and all worked great
> >
> > I did add 2G ECC to the 2nd CPU since, though I doubt that interferes
> > with NFS.
>
> Uh, you're getting server-side data corruption, it could definitely be
> because of the memory you added.

Yep, though I'm still not convinced the memory is bad (it is the very
same Kingston ECC as the 2*1G that has been in use for about half a
year already): I added it directly to the 2nd CPU (diagram on page 9 of
http://www.tyan.com/manuals/m_s2895_101.pdf)

and the problem seems to be the interaction between nfe0 and powerd:

 - if I stop powerd, the problems go away
 - if I leave powerd running but turn off txcsum and tso4 on the
   interface, the problem is a lot harder to reproduce

(in case this gives a hint to anyone)

The device is:

nfe0@pci0:0:10:0: class=0x068000 card=0x289510f1 chip=0x005710de rev=0xa3 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce4 Ultra NVidia Network Bus Enumerator'
    class      = bridge
    cap 01[44] = powerspec 2  supports D0 D1 D2 D3  current D0

(this is with the default BIOS setting "LAN Bridge Enabled"; disabling
that setting makes pciconf say "class = network" but does not influence
my problem)

I will restart my tests now with all 4G populated on CPU1 only and
report whether that matters.

Best,

Arno
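
For anyone who wants to check for this kind of silent corruption on their own setup, here is a minimal sketch of a write-then-verify test. It is not the test used in this thread: the function names, file names, sizes and the idea of pointing it at a directory on the NFS mount are all assumptions made for illustration. It writes random data, reads it back and compares SHA-256 digests.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_and_verify(directory, size_bytes=16 * 1024 * 1024, rounds=4):
    """Write random files into `directory` (hypothetically an NFS mount)
    and re-read them.

    Returns a list of (round_index, expected_digest, actual_digest) for
    every round whose re-read digest differs from the digest of the data
    that was written. An empty list means no mismatch was observed.
    """
    mismatches = []
    for i in range(rounds):
        data = os.urandom(size_bytes)
        expected = hashlib.sha256(data).hexdigest()
        path = os.path.join(directory, "corruption-test-%d.bin" % i)
        with open(path, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())  # push the data out before re-reading
        actual = sha256_of(path)
        if actual != expected:
            mismatches.append((i, expected, actual))
        os.remove(path)
    return mismatches
```

Note that a re-read on the same client may be served from the client's cache, so a stricter check would compute the digest of the written file on the server itself and compare it with the client-side value. Running the test once with powerd active and once with it stopped would show whether the mismatch rate changes.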