Date:      Thu, 08 Jan 2015 11:22:35 +0100
From:      Harald Schmalzbauer <h.schmalzbauer@omnilan.de>
To:        FreeBSD Stable <freebsd-stable@freebsd.org>
Subject:   Re: igb(4) watchdog timeout, lagg(4) fails
Message-ID:  <54AE5A6B.7040601@omnilan.de>
In-Reply-To: <54AE565D.50208@omnilan.de>
References:  <54ACC6A2.1050400@omnilan.de> <54AE565D.50208@omnilan.de>


 Regarding Harald Schmalzbauer's message of 08.01.2015 11:05 (localtime):
>  Regarding Harry Schmalzbauer's message of 07.01.2015 06:39 (localtime):
>>  Hello,
>>
>> I recently upgraded one server from 9.1 to 10.1. It has two 82576 ports
>> (one port on each of two Intel ET Dual-Port GbE adapters [Kawela]), driven
>> by igb(4). I never saw a watchdog timeout with FreeBSD 9.1, but now (with
>> 10-stable) I see:
>> igb0: Watchdog timeout -- resetting
>> igb0: Queue(0) tdh = 2974, hw tdt = 2973
>> igb0: TX(0) desc avail = 0,Next TX to Clean = 0
>>
>> My biggest problem is that lagg(4) doesn't detect the problem with
>> igb0. It's configured with 'lagghash l2', and most connections were
>> interrupted until I manually ran 'ifconfig igb0 down'. Then lagg did its
>> job and connectivity was restored via the remaining igb1.
>>
>> Is there a way to automatically bring down an interface that suffers from
>> watchdog timeouts? And is there any way to really reset it without
>> rebooting the machine?
> The igb watchdog timeout happened again :-( about 48 hours after the last
> one, with very moderate-to-low average traffic.
>
> This time I could fetch the dev.igb sysctls before igb0 was reset by the
> watchdog. They show a strange IRQ load:

While systat shows:
   3 igb1:que 0
1619 igb1:que 1
   3 igb1:que 2
   1 igb1:que 3

sysctl dev.igb reports:
dev.igb.1.queue0.interrupt_rate: 43478
dev.igb.1.queue1.interrupt_rate: 76923
dev.igb.1.queue2.interrupt_rate: 111111
dev.igb.1.queue3.interrupt_rate: 90909

How should I interpret sysctl's interrupt_rate value?

Thanks,

-Harry




