From owner-freebsd-i386@FreeBSD.ORG Thu Apr 15 17:00:12 2010
Return-Path:
Delivered-To: freebsd-i386@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
        by hub.freebsd.org (Postfix) with ESMTP id D516A1065673
        for ; Thu, 15 Apr 2010 17:00:12 +0000 (UTC)
        (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
        by mx1.freebsd.org (Postfix) with ESMTP id 98AF48FC1F
        for ; Thu, 15 Apr 2010 17:00:12 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
        by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o3FH0Cws063550
        for ; Thu, 15 Apr 2010 17:00:12 GMT
        (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
        by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o3FH0CJk063549;
        Thu, 15 Apr 2010 17:00:12 GMT
        (envelope-from gnats)
Resent-Date: Thu, 15 Apr 2010 17:00:12 GMT
Resent-Message-Id: <201004151700.o3FH0CJk063549@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-i386@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, AD
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
        by hub.freebsd.org (Postfix) with ESMTP id 0A0061065672
        for ; Thu, 15 Apr 2010 16:50:22 +0000 (UTC)
        (envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
        by mx1.freebsd.org (Postfix) with ESMTP id EC8F38FC17
        for ; Thu, 15 Apr 2010 16:50:21 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
        by www.freebsd.org (8.14.3/8.14.3) with ESMTP id o3FGoL0Y035636
        for ; Thu, 15 Apr 2010 16:50:21 GMT
        (envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
        by www.freebsd.org (8.14.3/8.14.3/Submit) id o3FGoLTA035635;
        Thu, 15 Apr 2010 16:50:21 GMT
        (envelope-from nobody)
Message-Id: <201004151650.o3FGoLTA035635@www.freebsd.org>
Date: Thu, 15 Apr 2010 16:50:21 GMT
From: AD
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Cc:
Subject: i386/145728: Stops working lagg between two servers.
X-BeenThere: freebsd-i386@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: I386-specific issues for FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 15 Apr 2010 17:00:13 -0000

>Number:         145728
>Category:       i386
>Synopsis:       Stops working lagg between two servers.
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-i386
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Apr 15 17:00:12 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     AD
>Release:        7.2-RELEASE-p6 and 7.2-STABLE
>Organization:
ad
>Environment:
FreeBSD 1 7.2-RELEASE-p6 FreeBSD 7.2-RELEASE-p6 #1: Wed Mar 17 22:31:00 KRAT 2010 root@1:/usr/obj/usr/src/sys/1 i386
FreeBSD 2 7.2-STABLE FreeBSD 7.2-STABLE #8: Thu Apr 1 02:06:36 KRAST 2010 root@2:/usr/obj/usr/src/sys/2 i386
>Description:
There are 2 servers, each with 4 network cards. On each server, 2 of the cards are aggregated into a lagg interface.
After a few days the lagg stops working:

Server 1:

lagg0: flags=8843 metric 0 mtu 1500
        options=19b
        ether 00:1b:21:3b:4d:4d
        inet 1.1.1.1 netmask 0xffffffc0 broadcast 1.1.1.255
        media: Ethernet autoselect
        status: active
        laggproto lacp
        laggport: em3 flags=1c
        laggport: em2 flags=4

ifconfig em2
em2: flags=9c43 metric 0 mtu 1500
        options=19b
        ether 00:1b:21:3b:4d:4d
        media: Ethernet autoselect (1000baseTX )
        status: active
        lagg: laggdev lagg0

#less /var/run/dmesg.boot | grep em2
em2: port 0x3000-0x301f mem 0xd3180000-0xd319ffff,0xd3100000-0xd317ffff,0xd31a0000-0xd31a3fff irq 16 at device 0.0 on pci2
em2: Using MSIX interrupts
em2: Using TXD_LOW instead of TXDW
em2: [FILTER]
em2: [FILTER]
em2: [FILTER]
em2: Ethernet address: 00:1b:21:3b:4d:4d

em2@pci0:2:0:0: class=0x020000 card=0xa01f8086 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor = 'Intel Corporation'
    class = network
    subclass = ethernet
em3@pci0:4:0:0: class=0x020000 card=0xa01f8086 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor = 'Intel Corporation'
    class = network
    subclass = ethernet

Server 2:

lagg1: flags=8943 metric 0 mtu 1500
        options=19b
        ether 00:1b:21:1b:19:5d
        media: Ethernet autoselect
        status: active
        laggproto lacp
        laggport: em4 flags=1c
        laggport: em1 flags=18

em1: flags=8943 metric 0 mtu 1500
        options=19b
        ether 00:1b:21:1b:19:5d
        media: Ethernet autoselect (1000baseTX )
        status: active
        lagg: laggdev lagg1

# less /var/run/dmesg.boot |grep em1
em1: port 0x4000-0x401f mem 0xd0320000-0xd033ffff,0xd0300000-0xd031ffff irq 16 at device 0.0 on pci3
em1: Using MSI interrupt
em1: Using TXD_LOW instead of TXDW
em1: [FILTER]
em1: Ethernet address: 00:1b:21:1b:19:5d

em1@pci0:3:0:0: class=0x020000 card=0x10838086 chip=0x10b98086 rev=0x06 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82572EI PRO/1000 PT Desktop Adapter (Copper)'
    class = network
    subclass = ethernet
em4@pci0:5:0:0: class=0x020000 card=0xa01f8086 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor = 'Intel Corporation'
    class = network
    subclass = ethernet

Error log:

Apr 16 00:27:31 2 kernel: em4: link state changed to UP
Apr 16 00:27:34 2 kernel: em4: watchdog timeout -- resetting
Apr 16 00:27:34 2 kernel: em4: Excessive collisions = 0
Apr 16 00:27:34 2 kernel: em4: Sequence errors = 0
Apr 16 00:27:34 2 kernel: em4: Defer count = 0
Apr 16 00:27:34 2 kernel: em4: Missed Packets = 1217754
Apr 16 00:27:34 2 kernel: em4: Receive No Buffers = 0
Apr 16 00:27:34 2 kernel: em4: Receive Length Errors = 0
Apr 16 00:27:34 2 kernel: em4: Receive errors = 0
Apr 16 00:27:34 2 kernel: em4: Crc errors = 0
Apr 16 00:27:34 2 kernel: em4: Alignment errors = 0
Apr 16 00:27:34 2 kernel: em4: Collision/Carrier extension errors = 0
Apr 16 00:27:34 2 kernel: em4: RX overruns = 0
Apr 16 00:27:34 2 kernel: em4: watchdog timeouts = 143
Apr 16 00:27:34 2 kernel: em4: RX MSIX IRQ = 1654280804 TX MSIX IRQ = 1491971579 LINK MSIX IRQ = 1214367
Apr 16 00:27:34 2 kernel: em4: XON Rcvd = 203508246
Apr 16 00:27:34 2 kernel: em4: XON Xmtd = 3183073363
Apr 16 00:27:34 2 kernel: em4: XOFF Rcvd = 202792650
Apr 16 00:27:34 2 kernel: em4: XOFF Xmtd = 3170508497
Apr 16 00:27:34 2 kernel: em4: Good Packets Rcvd = 108209172443
Apr 16 00:27:34 2 kernel: em4: Good Packets Xmtd = 113645818564
Apr 16 00:27:34 2 kernel: em4: TSO Contexts Xmtd = 0
Apr 16 00:27:34 2 kernel: em4: TSO Contexts Failed = 0
Apr 16 00:27:34 2 kernel: em4: Adapter hardware address = 0xc52a0218
Apr 16 00:27:34 2 kernel: em4: CTRL = 0x58100248 RCTL = 0x801a
Apr 16 00:27:34 2 kernel: em4: Packet buffer = Tx=20k Rx=20k
Apr 16 00:27:34 2 kernel: em4: Flow control watermarks high = 18432 low = 16932
Apr 16 00:27:34 2 kernel: em4: tx_int_delay = 0, tx_abs_int_delay = 64
Apr 16 00:27:34 2 kernel: em4: rx_int_delay = 0, rx_abs_int_delay = 66
Apr 16 00:27:34 2 kernel: em4: fifo workaround = 0, fifo_reset_count = 0
Apr 16 00:27:34 2 kernel: em4: hw tdh = 0, hw tdt = 1
Apr 16 00:27:34 2 kernel: em4: hw rdh = 0, hw rdt = 4095, next_rx_desc_to_check = 0
Apr 16 00:27:34 2 kernel: em4: Num Tx descriptors avail = 4095
Apr 16 00:27:34 2 kernel: em4: Tx Descriptors not avail1 = 12063
Apr 16 00:27:34 2 kernel: em4: Tx Descriptors not avail2 = 0
Apr 16 00:27:34 2 kernel: em4: Std mbuf failed = 0
Apr 16 00:27:34 2 kernel: em4: Std mbuf cluster failed = 6
Apr 16 00:27:34 2 kernel: em4: Driver dropped packets = 0
Apr 16 00:27:34 2 kernel: em4: Driver tx dma failure in encap = 0
Apr 16 00:27:34 2 kernel: em4: Packets pended due to reorder = 0
Apr 16 00:27:34 2 kernel: em4: RX interrupts has been masked = 77251713
Apr 16 00:27:34 2 kernel: em4: TX interrupts has been generated = 0
Apr 16 00:27:34 2 kernel: em4: link state changed to DOWN

tcpdump -i em4
00:47:06.511867 LACPv1, length: 110
00:47:36.997247 LACPv1, length: 110

After a reboot, everything works normally again for some time.
>How-To-Repeat:
Connect 2 servers directly to each other through lagg.
>Fix:
So far, only a reboot helps :(
>Release-Note:
>Audit-Trail:
>Unformatted:
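
The laggport flags values in the ifconfig output above (1c, 4 and 18) are hexadecimal bitmasks. The following is a minimal sketch for decoding them, not part of the original report. It assumes the bit names from the LAGG_PORT_BITS definition in FreeBSD's sys/net/if_lagg.h (MASTER, STACK, ACTIVE, COLLECTING, DISTRIBUTING). The helper name decode_laggport_flags is hypothetical, and the bit assignments should be checked against the header of the release actually in use.

```python
# Bit names are an assumption taken from LAGG_PORT_BITS in FreeBSD's
# sys/net/if_lagg.h; verify them against the header of your release.
LAGG_PORT_BITS = [
    (0x01, "MASTER"),
    (0x02, "STACK"),
    (0x04, "ACTIVE"),
    (0x08, "COLLECTING"),
    (0x10, "DISTRIBUTING"),
]


def decode_laggport_flags(hex_value):
    """Hypothetical helper: map the hex string after 'flags=' to bit names."""
    value = int(hex_value, 16)
    return [name for bit, name in LAGG_PORT_BITS if value & bit]


if __name__ == "__main__":
    # Port/flag pairs copied from the ifconfig output in this report.
    reported = [("em3", "1c"), ("em2", "4"), ("em4", "1c"), ("em1", "18")]
    for port, flags in reported:
        print(port, "flags=" + flags, ",".join(decode_laggport_flags(flags)))
```

Under these assumptions, em3 and em4 decode to ACTIVE, COLLECTING and DISTRIBUTING, em2 on server 1 decodes to ACTIVE only, and em1 on server 2 decodes to COLLECTING and DISTRIBUTING without ACTIVE. If that reading is right, the two laggs disagree about the state of the second link.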