Date: Sat, 25 Mar 2006 02:54:38 -0800
From: "David G. Lawrence" <dg@dglawrence.com>
To: Ion-Mihai Tetcu <itetcu@people.tecnik93.com>
Cc: "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>, freebsd-stable@freebsd.org, JoaoBR <joao@matik.com.br>
Subject: Re: nve timeout (and down) regression?
Message-ID: <20060325105438.GB12815@tnn.dglawrence.com>
In-Reply-To: <20060324223317.2069564f@it.buh.tecnik93.com>
References: <20060323215738.C2181@maildrop.int.zabbadoz.net> <20060323223424.D00D945042@ptavv.es.net> <20060324223317.2069564f@it.buh.tecnik93.com>

> This happens w/o any "real" activity on that interface (which goes into
> an Allied Telesyn switch):
> .......
> Mar 24 19:39:54 worf kernel: nve0: device timeout (1)
> Mar 24 19:39:54 worf kernel: nve0: link state changed to DOWN
> Mar 24 19:39:55 worf kernel: nve0: link state changed to UP
> Mar 24 19:40:14 worf kernel: nve0: device timeout (1)

   The problem is the watchdog timeout itself. I've attached an email that
I sent a few months ago which describes the problem, along with a simple
patch which disables the watchdog timer.

-DG

David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
Pave the road of life with opportunities.

Date: Wed, 4 Jan 2006 16:21:03 -0800
Subject: Re: nve(4) patch - please test!

> Since I sent the mail below I had to discover that the new driver
> has a problem when no cable is plugged in, at least on my Asus board.
>
> It doesn't only run into timeouts, during some of these timeout the
> machine or at least the keyboard hangs for about a minute.
>
> Is there anything I can do to help debug this?

   I ran into this problem recently as well and spent some time diagnosing
it. It's not that the cable isn't plugged in - rather, it happens whenever
the traffic levels are low.
   The problem is that the nvidia-supplied portion of the driver defers
releasing the completed transmit buffers, and this occasionally results in
if_timer expiring, causing the driver watchdog routine to be called
("device timeout"). The watchdog routine resets the card, and the
nvidia-supplied code sits in a high-priority loop waiting for the card to
reset. This can take many seconds, and your system will be hung until it
completes.
   I have a work-around patch for the problem that I've attached to this
email. It simply disables the watchdog. A real fix would involve accounting
for the outstanding transmit buffers differently (or perhaps not at all -
e.g. always attempt to call the nvidia-supplied code and, if a queue-full
error occurs, wait for an interrupt before trying to queue more transmit
packets).

-DG

David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
Pave the road of life with opportunities.

Index: if_nve.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/nve/if_nve.c,v
retrieving revision 1.7.2.8
diff -c -r1.7.2.8 if_nve.c
*** if_nve.c	25 Dec 2005 21:57:03 -0000	1.7.2.8
--- if_nve.c	5 Jan 2006 00:12:45 -0000
***************
*** 943,949 ****
  		return;
  	}
  	/* Set watchdog timer. */
! 	ifp->if_timer = 8;
  
  	/* Copy packet to BPF tap */
  	BPF_MTAP(ifp, m0);
--- 943,949 ----
  		return;
  	}
  	/* Set watchdog timer. */
! 	ifp->if_timer = 0;
  
  	/* Copy packet to BPF tap */
  	BPF_MTAP(ifp, m0);
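
For readers who want to see why the one-line change silences the timeout, here is a minimal user-space sketch of the if_timer countdown described above. It is not driver code: struct sim_if and the sim_* functions are hypothetical names invented for illustration, and it assumes that the completion path disarms the timer once the transmit buffers have been released. The only behaviour taken from the message is that the transmit path arms if_timer (8 before the patch, 0 after) and that the watchdog fires when the once-per-second countdown reaches zero.

```c
#include <assert.h>

/* Hypothetical stand-in for the few interface fields that matter here. */
struct sim_if {
	int if_timer;      /* seconds until the watchdog fires; 0 = disarmed */
	int tx_pending;    /* transmit buffers handed to the hardware */
	int watchdog_hits; /* number of "device timeout" events */
};

/* Transmit path: queue one packet and arm the watchdog. */
static void
sim_start(struct sim_if *ifp, int arm_ticks)
{
	assert(arm_ticks >= 0);
	ifp->tx_pending++;
	ifp->if_timer = arm_ticks;
}

/* Completion path (assumed): release the buffers and disarm the timer. */
static void
sim_txdone(struct sim_if *ifp)
{
	ifp->tx_pending = 0;
	ifp->if_timer = 0;
}

/* Once-per-second tick: count down, fire the watchdog on reaching zero. */
static void
sim_tick(struct sim_if *ifp)
{
	if (ifp->if_timer > 0 && --ifp->if_timer == 0)
		ifp->watchdog_hits++;
}

/*
 * Send one packet, let `delay` seconds pass before the completion is
 * reported, and return the number of watchdog timeouts observed.
 */
static int
sim_run(int arm_ticks, int delay)
{
	struct sim_if ifp = { 0, 0, 0 };

	sim_start(&ifp, arm_ticks);
	for (int s = 0; s < delay; s++)
		sim_tick(&ifp);
	sim_txdone(&ifp);
	return (ifp.watchdog_hits);
}
```

In this model, sim_run(8, 10) reports one timeout because the completion arrives after the 8-second window, while sim_run(0, 10) reports none because a timer value of 0 never arms the countdown. That also shows the cost of the work-around: a transmitter that is really stuck would no longer be detected.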