From: "Kirk Davis"
To: "Jack Vogel"
Cc: freebsd-net@freebsd.org, voovoos-fnet@killfile.pl, Mike Tancsa
Date: Tue, 23 Feb 2010 15:26:43 -0700
Subject: RE: Intel em0: watchdog timeout

Hi,

Looks like I may have tracked down this problem.

I noticed that fastforwarding (net.inet.ip.fastforwarding=1) was turned on. I turned it off to see if that was causing the problem. Sure enough, 5 hours later and no watchdog timeouts. This is still running on FreeBSD 7.1 (I'm still planning to move to 7.2 soon).

I read up on the net.inet.ip.fastforwarding sysctl and it doesn't look like it should cause any problems with the Intel NIC driver. This may need to be looked at and tested by someone more knowledgeable with the networking code than I am.

Thanks to Jack and Mike for your help.

----
Kirk

Kirk Davis
Senior Network Analyst, ITS
Edmonton Public Schools
One Kingsway Ave.
Edmonton, Alberta, Canada
T5H 4G9

________________________________

From: Jack Vogel [mailto:jfvogel@gmail.com]
Sent: Monday, February 22, 2010 5:30 PM
To: Kirk Davis
Cc: Mike Tancsa; freebsd-net@freebsd.org
Subject: Re: Intel em0: watchdog timeout

Is your driver static, i.e. built in to the kernel, or do you load/unload it as a module? I ask because perhaps we could try a later driver, and being a module makes that easier.

Jack

On Mon, Feb 22, 2010 at 3:37 PM, Kirk Davis wrote:

OK. I have the following in /boot/loader.conf (and rebooted)

hw.em.rxd=1024
hw.em.txd=1024

Should this be hw.em2.rxd? Is it set per interface or across all interfaces?

nmbcluster=262144

# sysctl dev.em.2.stats=1
Feb 22 16:29:57 inet-gw kernel: em2: Defer count = 20
Feb 22 16:29:57 inet-gw kernel: em2: Missed Packets = 119947
Feb 22 16:29:57 inet-gw kernel: em2: Receive No Buffers = 276762
Feb 22 16:29:57 inet-gw kernel: em2: Receive Length Errors = 0
Feb 22 16:29:57 inet-gw kernel: em2: Receive errors = 0
Feb 22 16:29:57 inet-gw kernel: em2: Crc errors = 0
Feb 22 16:29:57 inet-gw kernel: em2: Alignment errors = 0
Feb 22 16:29:57 inet-gw kernel: em2: Collision/Carrier extension errors = 0
Feb 22 16:29:57 inet-gw kernel: em2: RX overruns = 21
Feb 22 16:29:57 inet-gw kernel: em2: watchdog timeouts = 47
Feb 22 16:29:57 inet-gw kernel: em2: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
Feb 22 16:29:57 inet-gw kernel: em2: XON Rcvd = 22
Feb 22 16:29:57 inet-gw kernel: em2: XON Xmtd = 8349
Feb 22 16:29:57 inet-gw kernel: em2: XOFF Rcvd = 31
Feb 22 16:29:57 inet-gw kernel: em2: XOFF Xmtd = 15779
Feb 22 16:29:57 inet-gw kernel: em2: Good Packets Rcvd = 966101852
Feb 22 16:29:57 inet-gw kernel: em2: Good Packets Xmtd = 755993237
Feb 22 16:29:57 inet-gw kernel: em2: TSO Contexts Xmtd = 0
Feb 22 16:29:57 inet-gw kernel: em2: TSO Contexts Failed = 0

still seeing the watchdog timer and link up/down messages.

Should I try going higher than 1024 on the hw.em.rxd? I'm not sure the next time I can schedule another reboot on this production server.

----
Kirk

Kirk Davis
Senior Network Analyst, ITS
Edmonton Public Schools
One Kingsway Ave.
Edmonton, Alberta, Canada
T5H 4G9
phone: 1-780-429-8308

________________________________

From: Jack Vogel [mailto:jfvogel@gmail.com]
Sent: Monday, February 22, 2010 3:45 PM
To: Kirk Davis
Cc: Mike Tancsa; freebsd-net@freebsd.org
Subject: Re: Intel em0: watchdog timeout

OK, so you are still failing to get mbufs on the RX side. Increase the nmbcluster value, and then what size is your RX ring (number of rx descriptors)?

If you haven't already done so, change that to 1024.

I am developing a change in the RX code right now that will help this situation, but am doing so in the 10G driver. Once it's solid there I will be backporting it into the 1G drivers; it will make discards almost unnecessary.

Jack

On Mon, Feb 22, 2010 at 1:43 PM, Kirk Davis wrote:

> -----Original Message-----
> From: Mike Tancsa [mailto:mike@sentex.net]
> Subject: Re: Intel em0: watchdog timeout
>
> At 03:46 PM 2/22/2010, Kirk Davis wrote:
> >Does this need to be done in loader.conf? It doesn't seem to take from
> >the command line.
> ># sysctl dev.em.2.stats=1
> >dev.em.2.stats: -1 -> -1
> >
> ># sysctl dev.em.2.stats
> >dev.em.2.stats: -1
>
> Hi,
> After you issue those commands, the driver will spit out a
> lot of useful stats to syslog.
> It will report something like the
> following in /var/log/messages
>
> Feb 22 16:06:31 offsite kernel: em0: Excessive collisions = 0
> Feb 22 16:06:31 offsite kernel: em0: Sequence errors = 0
> Feb 22 16:06:31 offsite kernel: em0: Defer count = 0
> Feb 22 16:06:31 offsite kernel: em0: Missed Packets = 0
> Feb 22 16:06:31 offsite kernel: em0: Receive No Buffers = 0
> Feb 22 16:06:31 offsite kernel: em0: Receive Length Errors = 0
> Feb 22 16:06:31 offsite kernel: em0: Receive errors = 0
> Feb 22 16:06:31 offsite kernel: em0: Crc errors = 0
> Feb 22 16:06:31 offsite kernel: em0: Alignment errors = 0
> Feb 22 16:06:31 offsite kernel: em0: Collision/Carrier extension errors = 0
> Feb 22 16:06:31 offsite kernel: em0: RX overruns = 0
> Feb 22 16:06:31 offsite kernel: em0: watchdog timeouts = 0
> Feb 22 16:06:31 offsite kernel: em0: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
> Feb 22 16:06:31 offsite kernel: em0: XON Rcvd = 0
> Feb 22 16:06:31 offsite kernel: em0: XON Xmtd = 0
> Feb 22 16:06:31 offsite kernel: em0: XOFF Rcvd = 0
> Feb 22 16:06:31 offsite kernel: em0: XOFF Xmtd = 0
> Feb 22 16:06:31 offsite kernel: em0: Good Packets Rcvd = 2559032551
> Feb 22 16:06:31 offsite kernel: em0: Good Packets Xmtd = 1568751141
> Feb 22 16:06:31 offsite kernel: em0: TSO Contexts Xmtd = 0
> Feb 22 16:06:31 offsite kernel: em0: TSO Contexts Failed = 0

Thanks Mike and Jack. I don't know why I didn't notice the output in /var/log/messages.

Here is the output for the two interfaces that are causing this issue.

Feb 22 13:33:52 inet-gw kernel: em0: Excessive collisions = 0
Feb 22 13:33:52 inet-gw kernel: em0: Sequence errors = 0
Feb 22 13:33:52 inet-gw kernel: em0: Defer count = 0
Feb 22 13:33:52 inet-gw kernel: em0: Missed Packets = 24296
Feb 22 13:33:52 inet-gw kernel: em0: Receive No Buffers = 0
Feb 22 13:33:52 inet-gw kernel: em0: Receive Length Errors = 0
Feb 22 13:33:52 inet-gw kernel: em0: Receive errors = 0
Feb 22 13:33:52 inet-gw kernel: em0: Crc errors = 0
Feb 22 13:33:52 inet-gw kernel: em0: Alignment errors = 0
Feb 22 13:33:52 inet-gw kernel: em0: Collision/Carrier extension errors = 0
Feb 22 13:33:52 inet-gw kernel: em0: RX overruns = 0
Feb 22 13:33:52 inet-gw kernel: em0: watchdog timeouts = 6
Feb 22 13:33:52 inet-gw kernel: em0: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
Feb 22 13:33:52 inet-gw kernel: em0: XON Rcvd = 0
Feb 22 13:33:52 inet-gw kernel: em0: XON Xmtd = 0
Feb 22 13:33:52 inet-gw kernel: em0: XOFF Rcvd = 0
Feb 22 13:33:52 inet-gw kernel: em0: XOFF Xmtd = 0
Feb 22 13:33:52 inet-gw kernel: em0: Good Packets Rcvd = 424303810
Feb 22 13:33:52 inet-gw kernel: em0: Good Packets Xmtd = 576529136
Feb 22 13:33:52 inet-gw kernel: em0: TSO Contexts Xmtd = 0
Feb 22 13:33:52 inet-gw kernel: em0: TSO Contexts Failed = 0

Feb 22 13:34:12 inet-gw kernel: em2: Excessive collisions = 0
Feb 22 13:34:12 inet-gw kernel: em2: Sequence errors = 0
Feb 22 13:34:12 inet-gw kernel: em2: Defer count = 20
Feb 22 13:34:12 inet-gw kernel: em2: Missed Packets = 68059
Feb 22 13:34:12 inet-gw kernel: em2: Receive No Buffers = 275612
Feb 22 13:34:12 inet-gw kernel: em2: Receive Length Errors = 0
Feb 22 13:34:12 inet-gw kernel: em2: Receive errors = 0
Feb 22 13:34:12 inet-gw kernel: em2: Crc errors = 0
Feb 22 13:34:12 inet-gw kernel: em2: Alignment errors = 0
Feb 22 13:34:12 inet-gw kernel: em2: Collision/Carrier extension errors = 0
Feb 22 13:34:12 inet-gw kernel: em2: RX overruns = 17
Feb 22 13:34:12 inet-gw kernel: em2: watchdog timeouts = 38
Feb 22 13:34:12 inet-gw kernel: em2: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
Feb 22 13:34:12 inet-gw kernel: em2: XON Rcvd = 21
Feb 22 13:34:12 inet-gw kernel: em2: XON Xmtd = 8344
Feb 22 13:34:12 inet-gw kernel: em2: XOFF Rcvd = 30
Feb 22 13:34:12 inet-gw kernel: em2: XOFF Xmtd = 9159
Feb 22 13:34:12 inet-gw kernel: em2: Good Packets Rcvd = 713607509
Feb 22 13:34:12 inet-gw kernel: em2: Good Packets Xmtd = 569694020
Feb 22 13:34:12 inet-gw kernel: em2: TSO Contexts Xmtd = 0
Feb 22 13:34:12 inet-gw kernel: em2: TSO Contexts Failed = 0

Feb 22 13:35:10 inet-gw kernel: em2: Excessive collisions = 0
Feb 22 13:35:10 inet-gw kernel: em2: Sequence errors = 0
Feb 22 13:35:10 inet-gw kernel: em2: Defer count = 20
Feb 22 13:35:10 inet-gw kernel: em2: Missed Packets = 68059
Feb 22 13:35:10 inet-gw kernel: em2: Receive No Buffers = 275612
Feb 22 13:35:10 inet-gw kernel: em2: Receive Length Errors = 0
Feb 22 13:35:10 inet-gw kernel: em2: Receive errors = 0
Feb 22 13:35:10 inet-gw kernel: em2: Crc errors = 0
Feb 22 13:35:10 inet-gw kernel: em2: Alignment errors = 0
Feb 22 13:35:10 inet-gw kernel: em2: Collision/Carrier extension errors = 0
Feb 22 13:35:10 inet-gw kernel: em2: RX overruns = 17
Feb 22 13:35:10 inet-gw kernel: em2: watchdog timeouts = 38
Feb 22 13:35:10 inet-gw kernel: em2: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
Feb 22 13:35:10 inet-gw kernel: em2: XON Rcvd = 21
Feb 22 13:35:10 inet-gw kernel: em2: XON Xmtd = 8344
Feb 22 13:35:10 inet-gw kernel: em2: XOFF Rcvd = 30
Feb 22 13:35:10 inet-gw kernel: em2: XOFF Xmtd = 9159
Feb 22 13:35:10 inet-gw kernel: em2: Good Packets Rcvd = 715555016
Feb 22 13:35:10 inet-gw kernel: em2: Good Packets Xmtd = 571157561
Feb 22 13:35:10 inet-gw kernel: em2: TSO Contexts Xmtd = 0
Feb 22 13:35:10 inet-gw kernel: em2: TSO Contexts Failed = 0

Feb 22 13:39:12 inet-gw kernel: em2: Excessive collisions = 0
Feb 22 13:39:12 inet-gw kernel: em2: Sequence errors = 0
Feb 22 13:39:12 inet-gw kernel: em2: Defer count = 20
Feb 22 13:39:12 inet-gw kernel: em2: Missed Packets = 68059
Feb 22 13:39:12 inet-gw kernel: em2: Receive No Buffers = 275612
Feb 22 13:39:12 inet-gw kernel: em2: Receive Length Errors = 0
Feb 22 13:39:12 inet-gw kernel: em2: Receive errors = 0
Feb 22 13:39:12 inet-gw kernel: em2: Crc errors = 0
Feb 22 13:39:12 inet-gw kernel: em2: Alignment errors = 0
Feb 22 13:39:12 inet-gw kernel: em2: Collision/Carrier extension errors = 0
Feb 22 13:39:12 inet-gw kernel: em2: RX overruns = 17
Feb 22 13:39:12 inet-gw kernel: em2: watchdog timeouts = 38
Feb 22 13:39:12 inet-gw kernel: em2: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
Feb 22 13:39:12 inet-gw kernel: em2: XON Rcvd = 21
Feb 22 13:39:12 inet-gw kernel: em2: XON Xmtd = 8344
Feb 22 13:39:12 inet-gw kernel: em2: XOFF Rcvd = 30
Feb 22 13:39:12 inet-gw kernel: em2: XOFF Xmtd = 9159
Feb 22 13:39:12 inet-gw kernel: em2: Good Packets Rcvd = 723521981
Feb 22 13:39:12 inet-gw kernel: em2: Good Packets Xmtd = 577211431
Feb 22 13:39:12 inet-gw kernel: em2: TSO Contexts Xmtd = 0
Feb 22 13:39:12 inet-gw kernel: em2: TSO Contexts Failed = 0

Can this be the problem? "Receive No Buffers = 275612"

----
Kirk

Kirk Davis
Senior Network Analyst, ITS
Edmonton Public Schools
One Kingsway Ave.
Edmonton, Alberta, Canada
T5H 4G9
phone: 1-780-429-8308
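
The counters quoted in this thread are easier to compare as deltas between two dumps. The following is a minimal Python sketch, not part of the original thread, that parses syslog lines of the form produced by "sysctl dev.em.2.stats=1" and reports which counters changed. The names parse_stats, delta and SAMPLE are illustrative assumptions; the SAMPLE values are copied from the em2 dumps at 13:34:12 and 16:29:57. It also assumes the counters are cumulative and were not reset between the two dumps.

```python
import re
from collections import defaultdict

# Matches one single-counter line of the em stats dump as quoted in the thread,
# e.g. "Feb 22 16:29:57 inet-gw kernel: em2: Missed Packets = 119947".
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+"
    r"(?P<ifname>em\d+):\s+(?P<name>[^=]+?)\s*=\s*(?P<value>\d+)\s*$"
)


def parse_stats(log_text):
    """Return {(ifname, timestamp): {counter_name: value}} for each dump."""
    snapshots = defaultdict(dict)
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            # Lines carrying several counters (the MSIX IRQ line) do not
            # match the pattern and are skipped in this sketch.
            snapshots[(m["ifname"], m["ts"])][m["name"]] = int(m["value"])
    return dict(snapshots)


def delta(before, after):
    """Return only the counters whose value changed between two dumps."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


# Excerpt (hypothetical selection) of the em2 dumps quoted in the thread.
SAMPLE = """\
Feb 22 13:34:12 inet-gw kernel: em2: Missed Packets = 68059
Feb 22 13:34:12 inet-gw kernel: em2: Receive No Buffers = 275612
Feb 22 13:34:12 inet-gw kernel: em2: Crc errors = 0
Feb 22 13:34:12 inet-gw kernel: em2: RX overruns = 17
Feb 22 13:34:12 inet-gw kernel: em2: watchdog timeouts = 38
Feb 22 13:34:12 inet-gw kernel: em2: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
Feb 22 16:29:57 inet-gw kernel: em2: Missed Packets = 119947
Feb 22 16:29:57 inet-gw kernel: em2: Receive No Buffers = 276762
Feb 22 16:29:57 inet-gw kernel: em2: Crc errors = 0
Feb 22 16:29:57 inet-gw kernel: em2: RX overruns = 21
Feb 22 16:29:57 inet-gw kernel: em2: watchdog timeouts = 47
Feb 22 16:29:57 inet-gw kernel: em2: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
"""

if __name__ == "__main__":
    snaps = parse_stats(SAMPLE)
    changed = delta(
        snaps[("em2", "Feb 22 13:34:12")],
        snaps[("em2", "Feb 22 16:29:57")],
    )
    for name, diff in sorted(changed.items()):
        print(f"em2 {name}: +{diff}")
```

For this excerpt the changed counters are Missed Packets (+51888), Receive No Buffers (+1150), RX overruns (+4) and watchdog timeouts (+9); Crc errors stays at 0 and is not reported.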