Date:      Wed, 23 Apr 2008 13:53:47 +0900
From:      Pyun YongHyeon <pyunyh@gmail.com>
To:        pluknet <pluknet@gmail.com>
Cc:        stable@freebsd.org
Subject:   Re: nfs-server silent data corruption
Message-ID:  <20080423045347.GE54715@cdnetworks.co.kr>
In-Reply-To: <a31046fc0804221313k5bf5e2f3h106a342a5644e152@mail.gmail.com>
References:  <20080421094718.GY25623@hub.freebsd.org> <wp63ubp8e0.fsf@heho.snv.jussieu.fr> <200804211537.m3LFbaZA086977@lava.sentex.ca> <wpy77650s0.fsf@heho.snv.jussieu.fr> <200804221501.m3MF1guW092221@lava.sentex.ca> <wpzlrlu6w7.fsf@heho.snv.jussieu.fr> <200804221741.m3MHfYjO092795@lava.sentex.ca> <wpabjln518.fsf@heho.snv.jussieu.fr> <200804221807.m3MI73bN092981@lava.sentex.ca> <a31046fc0804221313k5bf5e2f3h106a342a5644e152@mail.gmail.com>

On Wed, Apr 23, 2008 at 12:13:44AM +0400, pluknet wrote:
 > On 22/04/2008, Mike Tancsa <mike@sentex.net> wrote:
 > > At 02:00 PM 4/22/2008, Arno J. Klaassen wrote:
 > >
 > > > >
 > > > > Are you using the latest RELENG_7, or at least the latest version of
 > > > > nfe that's in RELENG_7 ?
 > > >
 > > >
 > > > Think so :
 > > >
 > >
 > >  OK, and it is the latest RELENG_7 ? Or just the if_nfe.c file has been
 > > manually updated ? Also, you are using ULE or the 4BSD scheduler ?  I still
 > > have 4BSD on the box I am testing on.
 > 
 > Hi, I have the same problem with data corruption (with nfe on nfs server side),
 > particularly when transferring large files.
 > Maybe this is somehow associated with the topic.
 > 
 > My simple test case:
 > truncate -s 1000m bigfile
 > ^^ here I get a zero-filled file
 > cp bigfile /nfs/mounted
 > ^^ here I get a file that is not entirely zero-filled, after uploading to the nfs server
 > 
 > I looked at the corrupted file. It contains a few ranges filled with
 > non-zero bytes:
 > equal to zero?  real 4-byte value   offset
 > ======================================
 > not equal       1200355616     at pos=38797316
 > ... <-- this range contains per-4bytes garbage, omit
 > not equal       3879749905     at pos=38813696
 > 
 > not equal       161160732      at pos=45613060
 > ... <-- ditto
 > not equal       575257183      at pos=45629440
 > 
 > not equal       1943682165     at pos=59768836
 > ... <-- ditto
 > not equal       2843639625     at pos=59785216
 > 
 > not equal       2653910121     at pos=60293124
 > ... <-- ditto
 > not equal       3462830780     at pos=60309504
 > 
 > Some info:
 > 
 > nfs server on 8-CURRENT as of Apr 17
 > nfs client on 7.0-STABLE as of Apr 12
 > 
 > dmesg | grep nfe
 > nfe0: <NVIDIA nForce2 MCP2 Networking Adapter> port 0xe000-0xe007 mem
 > 0xe2001000-0xe2001fff irq 20 at device 4.0 on pci0
 > miibus0: <MII bus> on nfe0
 > nfe0: Ethernet address: 00:04:61:6c:76:b1
 > nfe0: [FILTER]
 > nfe0: tx v1 error 0x6001
 > nfe0: tx v1 error 0x6001
 > nfe0: tx v1 error 0x6001
 > nfe0: tx v1 error 0x6001
 > nfe0: tx v1 error 0x6001
 > nfe0: tx v1 error 0x6001
 > nfe0: tx v1 error 0x6001
 > nfe0: tx v1 error 0x6001
 > nfe0: tx v1 error 0x6001
 > nfe0: tx v1 error 0x6001
 > nfe0: tx v1 error 0x6001
 > nfe0: tx v1 error 0x6001
 > ^^^

I'm not sure whether it's related to the data corruption issue, but
0x6001 would mean a Tx underflow error. I recall these Tx errors being
seen on nfe(4) when the negotiated speed/duplex does not match that of
the link partner or the MACs.
Does the link partner also agree on the speed/duplex settings of nfe(4)?
Which PHY driver does nfe(4) use?
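
To make it easier to compare results between runs of the quoted test
case, here is a minimal sketch of a scanner that lists the non-zero
ranges in a file that should be all zeros. It is not the tool used for
the table above. The function name nonzero_runs, the 64 KiB chunk size
and the little-endian interpretation of each 4-byte value are
assumptions made for illustration.

```python
import struct

WORD = 4  # the quoted report compares 4-byte values


def nonzero_runs(path, chunk_size=1 << 16):
    """Return one (first_pos, last_pos, first_val, last_val) tuple per run
    of consecutive non-zero 4-byte words; positions are byte offsets.

    Assumptions: words are decoded as little-endian unsigned integers,
    chunk_size is a multiple of 4, and any trailing bytes that do not
    fill a whole word are ignored.
    """
    runs = []
    run = None   # [first_pos, last_pos, first_val, last_val] while a run is open
    base = 0     # file offset of the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            if chunk.count(0) == len(chunk):
                # All-zero chunk: skip the per-word loop, close any open run.
                if run is not None:
                    runs.append(tuple(run))
                    run = None
            else:
                usable = len(chunk) - len(chunk) % WORD
                words = struct.iter_unpack("<I", chunk[:usable])
                for i, (val,) in enumerate(words):
                    pos = base + i * WORD
                    if val != 0:
                        if run is None:
                            run = [pos, pos, val, val]
                        else:
                            run[1], run[3] = pos, val
                    elif run is not None:
                        runs.append(tuple(run))
                        run = None
            base += len(chunk)
    if run is not None:
        runs.append(tuple(run))
    return runs
```

Calling nonzero_runs() on the copy under /nfs/mounted and printing each
tuple gives the first and last position of every damaged range. Note
that a word that happens to be zero inside a garbage range splits it
into two runs.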

 > These appear while cp'ing the file to the server.
 > (btw they do not appear with polling disabled, probably it's another issue)
 > 
 > vmstat -i | grep nfe
 > irq20: nfe0 ohci0                      1          0
 > 
 > nfe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
 >         options=48<VLAN_MTU,POLLING>
 >         ether 00:04:61:6c:76:b1
 >         inet 192.168.200.137 netmask 0xffffff00 broadcast 192.168.200.255
 >         media: Ethernet autoselect (100baseTX <full-duplex>)
 >         status: active
 > I can reproduce it regardless of whether polling is enabled.
 > 
 > nfe0@pci0:0:4:0:        class=0x020000 card=0x10001695 chip=0x006610de
 > rev=0xa1 hdr=0x00
 > 

-- 
Regards,
Pyun YongHyeon


