Date:      Mon, 11 Oct 2010 16:16:04 -0700
From:      Pyun YongHyeon <pyunyh@gmail.com>
To:        Steve Kargl <sgk@troutmask.apl.washington.edu>
Cc:        freebsd-current@freebsd.org
Subject:   Re: recent bge(4) changes causing problems
Message-ID:  <20101011231604.GI4607@michelle.cdnetworks.com>
In-Reply-To: <20101011225331.GA2829@troutmask.apl.washington.edu>
References:  <20101011225331.GA2829@troutmask.apl.washington.edu>

--7qSK/uQB79J36Y4o
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Mon, Oct 11, 2010 at 03:53:31PM -0700, Steve Kargl wrote:
> It seems recent changes to the bge driver are causing
> some problems with my hardware where the watchdog is
> now timing out.
> 
> /var/log/messages contains
> 
> 14:23:14 kernel: SMP: AP CPU #1 Launched!
> 14:23:14 kernel: Trying to mount root from ufs:/dev/ad6s1a
> 14:23:15 kernel: bge1: link state changed to UP
> 14:23:15 lpd[1190]: lpd startup: logging=0
> 14:23:15 ntpd[1224]: ntpd 4.2.4p5-a (1)
> 14:23:15 kernel: bge0: link state changed to UP
> 14:23:24 ntpd[1225]: time reset -0.677316 s
> 14:23:24 ntpd[1225]: kernel time sync status change 2001
> 14:31:01 kernel: bge0: watchdog timeout -- resetting
> 14:31:01 kernel: bge0: link state changed to DOWN
> 14:31:02 kernel: Limiting icmp unreach response from 613 to 200 packets/sec
> 14:31:04 ntpd[1225]: sendto(140.142.2.8) (fd=22): No route to host
> 14:31:04 kernel: bge0: link state changed to UP
> 14:31:30 kernel: Limiting icmp unreach response from 205 to 200 packets/sec
> 14:31:31 kernel: Limiting icmp unreach response from 203 to 200 packets/sec
> 15:40:11 su: kargl to root on /dev/pts/0
> 15:40:35 kernel: bge0: link state changed to DOWN
> 15:40:38 kernel: bge0: link state changed to UP
> 
> The last 2 bge messages are from me manually using 
> ifconfig to bring my net connect back to life.
> 
> troutmask:kargl[206] sysctl -a | grep bge.0
> dev.bge.0.%desc: Broadcom Gigabit Ethernet Controller, ASIC rev. 0x002100
> dev.bge.0.%driver: bge
> dev.bge.0.%location: slot=9 function=0 handle=\_SB_.PCI0.GOLA.GLAN
> dev.bge.0.%pnpinfo: vendor=0x14e4 device=0x1648 subvendor=0x14e4 subdevice=0x1644 class=0x020000
> dev.bge.0.%parent: pci2
> dev.bge.0.forced_collapse: 0
> dev.bge.0.forced_udpcsum: 0
> dev.bge.0.stats.FramesDroppedDueToFilters: 0
> dev.bge.0.stats.DmaWriteQueueFull: 0
> dev.bge.0.stats.DmaWriteHighPriQueueFull: 0
> dev.bge.0.stats.NoMoreRxBDs: 0
> dev.bge.0.stats.InputDiscards: 0
> dev.bge.0.stats.InputErrors: 0
> dev.bge.0.stats.RecvThresholdHit: 325
> dev.bge.0.stats.DmaReadQueueFull: 0
> dev.bge.0.stats.DmaReadHighPriQueueFull: 0
> dev.bge.0.stats.SendDataCompQueueFull: 0
> dev.bge.0.stats.RingSetSendProdIndex: 469
> dev.bge.0.stats.RingStatusUpdate: 330
> dev.bge.0.stats.Interrupts: 330
> dev.bge.0.stats.AvoidedInterrupts: 0
> dev.bge.0.stats.SendThresholdHit: 0
> dev.bge.0.stats.rx.ifHCInOctets: 569452
> dev.bge.0.stats.rx.Fragments: 0
> dev.bge.0.stats.rx.UnicastPkts: 497
> dev.bge.0.stats.rx.MulticastPkts: 1
> dev.bge.0.stats.rx.FCSErrors: 0
> dev.bge.0.stats.rx.AlignmentErrors: 0
> dev.bge.0.stats.rx.xonPauseFramesReceived: 0
> dev.bge.0.stats.rx.xoffPauseFramesReceived: 0
> dev.bge.0.stats.rx.ControlFramesReceived: 0
> dev.bge.0.stats.rx.xoffStateEntered: 0
> dev.bge.0.stats.rx.FramesTooLong: 0
> dev.bge.0.stats.rx.Jabbers: 0
> dev.bge.0.stats.rx.UndersizePkts: 0
> dev.bge.0.stats.rx.inRangeLengthError: 0
> dev.bge.0.stats.rx.outRangeLengthError: 0
> dev.bge.0.stats.tx.ifHCOutOctets: 39056
> dev.bge.0.stats.tx.Collisions: 0
> dev.bge.0.stats.tx.XonSent: 0
> dev.bge.0.stats.tx.XoffSent: 0
> dev.bge.0.stats.tx.flowControlDone: 0
> dev.bge.0.stats.tx.InternalMacTransmitErrors: 0
> dev.bge.0.stats.tx.SingleCollisionFrames: 0
> dev.bge.0.stats.tx.MultipleCollisionFrames: 0
> dev.bge.0.stats.tx.DeferredTransmissions: 0
> dev.bge.0.stats.tx.ExcessiveCollisions: 0
> dev.bge.0.stats.tx.LateCollisions: 0
> dev.bge.0.stats.tx.UnicastPkts: 468
> dev.bge.0.stats.tx.MulticastPkts: 0
> dev.bge.0.stats.tx.BroadcastPkts: 1
> dev.bge.0.stats.tx.CarrierSenseErrors: 0
> dev.bge.0.stats.tx.Discards: 0
> dev.bge.0.stats.tx.Errors: 0
> dev.bge.0.wake: 0
> 
> In the time that it's taken me to compose this message
> the timeout has fired again.
> 
> 15:47:01 kernel: Limiting icmp unreach response from 662 to 200 packets/sec
> 15:47:02 kernel: Limiting icmp unreach response from 446 to 200 packets/sec
> 15:47:03 kernel: Limiting icmp unreach response from 436 to 200 packets/sec
> 15:47:04 kernel: Limiting icmp unreach response from 464 to 200 packets/sec
> 15:47:05 kernel: Limiting icmp unreach response from 438 to 200 packets/sec
> 15:47:06 kernel: Limiting icmp unreach response from 445 to 200 packets/sec
> 15:47:07 kernel: bge0: watchdog timeout -- resetting
> 15:47:07 kernel: bge0: link state changed to DOWN
> 15:47:07 kernel: Limiting icmp unreach response from 439 to 200 packets/sec
> 15:47:08 kernel: Limiting icmp unreach response from 330 to 200 packets/sec
> 15:47:11 kernel: bge0: link state changed to UP
> 15:47:12 kernel: Limiting icmp unreach response from 214 to 200 packets/sec
> 15:47:13 kernel: Limiting icmp unreach response from 202 to 200 packets/sec
> 15:47:14 kernel: Limiting icmp unreach response from 238 to 200 packets/sec
> 15:49:42 kernel: bge0: link state changed to DOWN
> 15:49:44 kernel: bge0: link state changed to UP
> 
> I've not seen these icmp unreach response messages.
> 

The icmp unreach messages have nothing to do with bge(4). Check
whether a server that listens on a UDP port is still alive on your
box. What worries me is the bge(4) watchdog timeouts. It looks like
your controller is a BCM5704. I also have a bge(4) regression report
from marius on sparc64. He said r213945 seemed to cause the issue,
and I'm working on it. Could you also try the attached patch?
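
To make the UDP check concrete: the kernel answers a UDP datagram
sent to a port with no listener with an ICMP port unreachable, and it
rate-limits those replies, which is what the "to 200 packets/sec" log
lines report. Below is a rough userland sketch of probing whether a
port is still held by some process. udp_port_in_use() is a made-up
helper name, not an existing API. It assumes IPv4 and a wildcard
bind, and probing a port below 1024 this way needs root.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of bge(4) or the base system.
 *
 * Returns 1 if some socket already holds the IPv4 UDP port, 0 if the
 * port is free, and -1 on any other error (for example EACCES when
 * probing a reserved port without root privileges).
 */
static int
udp_port_in_use(uint16_t port)
{
	struct sockaddr_in sin;
	int error, s;

	/* Port 0 means "pick any port" to bind(2), so it cannot be probed. */
	assert(port != 0);

	s = socket(AF_INET, SOCK_DGRAM, 0);
	if (s < 0)
		return (-1);

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);
	sin.sin_port = htons(port);

	/* A successful bind means nobody else owns the port. */
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == 0) {
		close(s);
		return (0);
	}

	/* Save errno before close(2) has a chance to overwrite it. */
	error = errno;
	close(s);
	return (error == EADDRINUSE ? 1 : -1);
}
```

If this returns 0 for the port your server is supposed to own, the
server is gone and every datagram sent to that port will draw an
unreach reply. A listener that set SO_REUSEPORT could make the probe
report the port as free, so treat the answer as a hint only.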

--7qSK/uQB79J36Y4o
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="bge.rxprod.patch"

Index: sys/dev/bge/if_bge.c
===================================================================
--- sys/dev/bge/if_bge.c	(revision 213695)
+++ sys/dev/bge/if_bge.c	(working copy)
@@ -1619,9 +1619,6 @@
 	CSR_WRITE_4(sc, BGE_RX_STD_RCB_MAXLEN_FLAGS, rcb->bge_maxlen_flags);
 	CSR_WRITE_4(sc, BGE_RX_STD_RCB_NICADDR, rcb->bge_nicaddr);
 
-	/* Reset the standard receive producer ring producer index. */
-	bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, 0);
-
 	/*
 	 * Initialize the jumbo RX producer ring control
 	 * block.  We set the 'ring disabled' bit in the
@@ -1665,6 +1662,9 @@
 		bge_writembx(sc, BGE_MBX_RX_MINI_PROD_LO, 0);
 	}
 
+	/* Reset the standard receive producer ring producer index. */
+	bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, 0);
+
 	/*
 	 * The BD ring replenish thresholds control how often the
 	 * hardware fetches new BD's from the producer rings in host

--7qSK/uQB79J36Y4o--


