From owner-freebsd-stable@FreeBSD.ORG Fri Apr 10 04:42:21 2009
From: Pyun YongHyeon
Date: Fri, 10 Apr 2009 13:43:40 +0900
To: xer
Message-ID:
<20090410044340.GJ37714@michelle.cdnetworks.co.kr>
References: <20090407120032.633E410656D5@hub.freebsd.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="6sX45UoQRIJXqkqR"
Content-Disposition: inline
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-stable@freebsd.org
Subject: Re: watchdog timeout
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: pyunyh@gmail.com
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Fri, 10 Apr 2009 04:42:21 -0000

--6sX45UoQRIJXqkqR
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Apr 08, 2009 at 10:41:44AM +0200, xer wrote:
> Hello
> I have some problems with 3Com nics, after a upgrade from 5.5-STABLE to
> 6.4-STABLE.
>
> This machine has two 3com nics (one is LAN other is WAN) and i see too much
> "watchdog timeout" on both cards.
> This on/off up/down on cards, affect the interrupt to clients that are
> downloading from apache web server, especially on large files.
>
> --------------------------------------------
> xer:/root# dmesg
> xl1: watchdog timeout
> xl1: link state changed to DOWN
> xl1: link state changed to UP
> xl1: watchdog timeout
> xl1: link state changed to DOWN
> xl1: link state changed to UP
> xl1: watchdog timeout
> xl1: link state changed to DOWN
> xl1: link state changed to UP
> ---------------------------------------------
>
> xer:/root# cat /var/run/dmesg.boot | grep xl
> xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xec00-0xec7f mem
> 0xfceffc00-0xfceffc7f irq 23 at device 11.0 on pci2
> miibus0: on xl0
> xlphy0: <3c905C 10/100 internal PHY> on miibus0
> xlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> xl0: Ethernet address: 00:01:02:e0:04:1b
> xl1: <3Com 3c905C-TX Fast Etherlink XL> port 0xe880-0xe8ff mem
> 0xfceff800-0xfceff87f irq 20 at device 12.0 on pci2
> miibus1: on xl1
> xlphy1: <3c905C 10/100 internal PHY> on miibus1
> xlphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> xl1: Ethernet address: 00:01:02:df:fe:ed
> ---------------------------------------------
> Another doubt would be my kernel config, maybe there is something wrong
> that i cannot see, i'll post at the end of this post, 'cause is too long.
>
> As you can see, the cards are 3c905C-TX model.
> Someone told me to change drivers, but i cannot understand this advice.
> I got same errors with same cards but with another mainboard, same problem,
> watchdog appears after an upgrade from 5.4-STABLE to 6.4-STABLE.
>
> I don't think that to change nic's pci slots, will solve the problem, i
> think that maybe change the nics would resolve the matter, but i cannot
> access to both FreeBSD phisically, cause the boxes are too far from me
> (about 3500 km).
>
> I'm asking you some advices, and i can i fix this problem.
> p.s. with both 5.4 or 5.5 old kernel, the nics was fine.
>

I vaguely remember there were a couple of reports on xl(4) watchdog
timeouts.
I'm not sure whether this is caused by missed Tx interrupts, but would
you try the attached patch? Note that it was generated against HEAD, and
you should test it on a local box before applying it to your production
server.

--6sX45UoQRIJXqkqR
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="xl.watchdog.patch"

Index: sys/dev/xl/if_xl.c
===================================================================
--- sys/dev/xl/if_xl.c	(revision 190876)
+++ sys/dev/xl/if_xl.c	(working copy)
@@ -2097,13 +2097,13 @@
 		m_freem(cur_tx->xl_mbuf);
 		cur_tx->xl_mbuf = NULL;
 		ifp->if_opackets++;
+		ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
 
 		cur_tx->xl_next = sc->xl_cdata.xl_tx_free;
 		sc->xl_cdata.xl_tx_free = cur_tx;
 	}
 
 	if (sc->xl_cdata.xl_tx_head == NULL) {
-		ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
 		sc->xl_wdog_timer = 0;
 		sc->xl_cdata.xl_tx_tail = NULL;
 	} else {
@@ -2540,6 +2540,9 @@
 
 	XL_LOCK_ASSERT(sc);
 
+	if ((ifp->if_drv_flags & (IFF_DRV_RUNNING | IFF_DRV_OACTIVE)) !=
+	    IFF_DRV_RUNNING)
+		return;
 	/*
 	 * Check for an available queue slot. If there are none,
 	 * punt.
@@ -2668,7 +2671,8 @@
 
 	XL_LOCK_ASSERT(sc);
 
-	if (ifp->if_drv_flags & IFF_DRV_OACTIVE)
+	if ((ifp->if_drv_flags & (IFF_DRV_RUNNING | IFF_DRV_OACTIVE)) !=
+	    IFF_DRV_RUNNING)
 		return;
 
 	idx = sc->xl_cdata.xl_tx_prod;
@@ -3207,12 +3211,31 @@
 {
 	struct ifnet		*ifp = sc->xl_ifp;
 	u_int16_t		status = 0;
+	int			misintr;
 
 	XL_LOCK_ASSERT(sc);
 
 	if (sc->xl_wdog_timer == 0 || --sc->xl_wdog_timer != 0)
 		return (0);
 
+	xl_rxeof(sc);
+	xl_txeoc(sc);
+	misintr = 0;
+	if (sc->xl_type == XL_TYPE_905B) {
+		xl_txeof_90xB(sc);
+		if (sc->xl_cdata.xl_tx_cnt == 0)
+			misintr++;
+	} else {
+		xl_txeof(sc);
+		if (sc->xl_cdata.xl_tx_head == NULL)
+			misintr++;
+	}
+	if (misintr != 0) {
+		device_printf(sc->xl_dev,
+		    "watchdog timeout (missed Tx interrupts) -- recovering\n");
+		return (0);
+	}
+
 	ifp->if_oerrors++;
 	XL_SEL_WIN(4);
 	status = CSR_READ_2(sc, XL_W4_MEDIA_STATUS);
@@ -3222,9 +3245,6 @@
 		device_printf(sc->xl_dev,
 		    "no carrier - transceiver cable problem?\n");
 
-	xl_txeoc(sc);
-	xl_txeof(sc);
-	xl_rxeof(sc);
 	xl_reset(sc);
 	xl_init_locked(sc);
 
--6sX45UoQRIJXqkqR--
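
A note on the flag test that the patch adds to both transmit start
routines: it returns early unless the interface is running and is not
marked output-active. The following is only a minimal userland sketch of
that bitmask idiom, not driver code. The helper name xl_can_start is
hypothetical, and the flag values are defined locally for illustration
(in the kernel they come from <net/if.h>).

```c
#include <stdbool.h>

/*
 * Illustrative values, defined locally so the sketch builds outside
 * the kernel. This is an assumption of the sketch; the real
 * definitions live in <net/if.h>.
 */
#define IFF_DRV_RUNNING	0x40
#define IFF_DRV_OACTIVE	0x400

/*
 * Hypothetical helper (not part of the patch): returns true only when
 * RUNNING is set and OACTIVE is clear. The patch uses the negated form
 * of this comparison to return early from the start routines.
 */
bool
xl_can_start(unsigned int drv_flags)
{
	return ((drv_flags & (IFF_DRV_RUNNING | IFF_DRV_OACTIVE)) ==
	    IFF_DRV_RUNNING);
}
```

Masking both bits and comparing against IFF_DRV_RUNNING alone checks the
two conditions in one test, so an interface that is down, or one whose
Tx ring is still marked full, does not queue further frames.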