From: Frederique Rijsdijk
To: freebsd-net@freebsd.org
Date: Fri, 1 Apr 2011 16:16:55 +0200
Subject: Network stack unstable after arp flapping
Message-ID: <20110401141655.GA5350@deta.isafeelin.org>

Hi,

We (a hosting provider) are in the process of implementing IPv6 in our network (yay). Yesterday one of the final steps in configuring and updating our core routers was taken, which did not go entirely as planned.
As a result, the default gateway's MAC address changed about 800 times in a span of about 4 minutes for all our machines. Here's a small piece of the logging:

Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0
Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0
Mar 31 18:36:13 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0
Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0
Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0
Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0
Mar 31 18:36:15 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0

The x.x.x.1 is always the same IP, the gateway of the machine.

The result is that loads of FreeBSD machines (6.x, 7.x and 8.x) developed serious network issues, mainly no or slow traffic to other (FreeBSD) machines across different VLANs in our own network. The first thing that comes to mind is the network itself, but all Linux machines (Ubuntu, Red Hat and CentOS) had no issues at all. Only BSD.

An arp -ad on both machines where problems occurred didn't solve anything. What worked better was an /etc/rc.d/netif restart and an /etc/rc.d/routing restart. Some machines even had to be rebooted to get networking back to normal.

This almost sounds like a bug in the network stack in BSD, but I cannot imagine that I'm right. The BSD networking stack is considered to be one of the best...

Any ideas, anyone?

-- 
Frederique
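
For anyone who wants to count flaps like these in their own logs, here is a minimal Python sketch. It assumes the kernel messages have exactly the "arp: ... moved from ... to ... on ..." form shown in the excerpt above. The function name count_arp_moves and the regular expression are illustrative and are not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Assumption: messages look like the excerpt in this post, i.e.
#   arp: <ip> moved from <old mac> to <new mac> on <interface>
ARP_MOVE = re.compile(
    r"arp: (?P<ip>\S+) moved from (?P<old>[0-9a-f:]{17}) "
    r"to (?P<new>[0-9a-f:]{17}) on (?P<ifc>\S+)"
)

def count_arp_moves(lines):
    """Return a Counter mapping (ip, interface) to the number of logged moves."""
    moves = Counter()
    for line in lines:
        match = ARP_MOVE.search(line)
        if match:
            moves[(match.group("ip"), match.group("ifc"))] += 1
    return moves

# Two lines taken from the log excerpt, plus one unrelated line.
sample = [
    "Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from "
    "00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0",
    "Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from "
    "00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0",
    "Mar 31 18:36:13 srv01 kernel: some unrelated message",
]

for (ip, ifc), n in count_arp_moves(sample).items():
    print(f"{ip} on {ifc}: {n} moves")
```

To run it against a real log, pass an open file object instead of the sample list, using whichever file your syslog configuration sends kernel messages to.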