Date:      Tue, 21 Oct 2008 15:12:21 +0900
From:      Pyun YongHyeon <pyunyh@gmail.com>
To:        Andriy Gapon <avg@icyb.net.ua>
Cc:        freebsd-stable@freebsd.org, yongari@freebsd.org
Subject:   Re: 6.3 nfe: dead after system reset
Message-ID:  <20081021061221.GI43039@cdnetworks.co.kr>
In-Reply-To: <48FC7E8D.2000506@icyb.net.ua>
References:  <47A3041D.5050402@icyb.net.ua> <20080201123603.GA14050@cdnetworks.co.kr> <47A321BB.1060708@icyb.net.ua> <47A32501.7080703@icyb.net.ua> <20080204035242.GA28554@cdnetworks.co.kr> <47C2BC50.5040702@icyb.net.ua> <47C2DBEF.301@icyb.net.ua> <20080226073633.GC47750@cdnetworks.co.kr> <486D440F.1090601@icyb.net.ua> <48FC7E8D.2000506@icyb.net.ua>

On Mon, Oct 20, 2008 at 03:50:21PM +0300, Andriy Gapon wrote:
 > 
 > Pyun,
 > 
 > something new about this issue.
 > Today I got another instance of it, but with a new twist.
 > In addition to all the usual symptoms I got a lot of messages like the
 > following on the console:
 > 
 > nfe0: discard frame w/o leading ethernet header (len 0 pkt len 0)
 > and even a couple like this:
 > 
 > nfe0: discard frame w/o leading ethernet header (len 4294967295 pkt len
 > 4294967295)
 > nfe0: discard frame w/o leading ethernet header (len 3 pkt len 3)
 > 
 > Maybe these messages could give a hint about what was going wrong in nfe.
 > 

It looks like there is a bug in Rx handling or a hardware
initialization issue. I recently implemented support for the hardware
MAC counters of MCP controllers, and these counters should provide
valuable information for analyzing what's going on in the controller.
See the commit log of SVN r183561.
There is also a WIP patch to work around CRC issues on MCP65. Though
it may not be directly related to your issue, the patch at the
following URL has a fix for the MAC reset register. So give it a try
and let me know how it goes.

http://people.freebsd.org/~yongari/nfe/nfe.rx.patch.20081021
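
A side note on the "len 4294967295" value in the console messages quoted
above: that number is 2^32 - 1, which is what -1 looks like when it is
read as an unsigned 32-bit integer. One plausible reading, not confirmed
against the driver source, is that a length went to -1 somewhere and was
printed with an unsigned format. The sketch below only illustrates that
arithmetic; the helper names are hypothetical and are not taken from nfe.

```python
# Illustration only: how a signed -1 appears when read as an unsigned
# 32-bit value. The names here are hypothetical and are not taken from
# the nfe driver source.

UINT32_MASK = 0xFFFFFFFF  # 2**32 - 1


def as_uint32(value: int) -> int:
    """Reinterpret a Python int as an unsigned 32-bit quantity."""
    return value & UINT32_MASK


def as_int32(value: int) -> int:
    """Reinterpret the low 32 bits of a Python int as a signed quantity."""
    value &= UINT32_MASK
    # If the sign bit (bit 31) is set, the signed value is negative.
    return value - (1 << 32) if value & 0x80000000 else value


logged_len = 4294967295  # value copied from the console message

print(as_int32(logged_len))               # signed view of the logged value
print(as_uint32(-1))                      # unsigned view of -1
print(as_uint32(-1) == logged_len)        # do the two views agree?
```

If the two views agree, the odd-looking length is consistent with an
uninitialized or underflowed length field rather than a real frame size.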

 > on 04/07/2008 00:26 Andriy Gapon said the following:
 > > 
 > > As they say - long time, no see :-)
 > > I am back with some more details, but still with no insights.
 > > 
 > > Let me recap the essence of the issue.
 > > The issue: after 'abrupt' reset/reboot of a system my nfe interface is
 > > dead.
 > > That is, if I do a graceful reboot (e.g. via shutdown -r) everything is
 > > ok, ditto if I do power-down (whether graceful or not) and then power-up.
 > > The problem happens only if I press reset button and then boot up.
 > > 
 > > Details.
 > > The issue can not be reproduced with nve driver.
 > > Moreover, when I reproduce the problem with nfe, then kldunload nfe
 > > driver, kldload nve driver - nve interface is alive. Then kldunload nve,
 > > kldload nfe - nfe interface is dead again.
 > > 
 > > Specification of dead.
 > > There are no errors. ifconfig shows the same output (active, media, up,
 > > etc) as in normal case. But I can not ping any host on local network
 > > (connected to the same switch), ping outputs "Host is down". tcpdump
 > > also doesn't show any incoming traffic.
 > > 
 > > More details.
 > > I was able to verify that packets do actually go through the interface.
 > > When I try to ping some machine I see (on the other host) arp requests
 > > for its ethernet address. All addresses in arp packets are correct
 > > (ethernet and ip). So the interface works for outgoing packets, but
 > > somehow loses incoming arp replies. Not sure if that happens in the NIC
 > > or in the driver itself (see the above nve/nfe live replacement
 > > experiment).
 > > 
 > > So, there are some facts, but still no clues.
 > > 
 > 
 > 
 > -- 
 > Andriy Gapon

-- 
Regards,
Pyun YongHyeon


