From: Rafael Ganascim
To: Damien Fleuriot
Cc: "freebsd-net@freebsd.org"
Date: Mon, 18 Mar 2013 19:03:08 -0300
Subject: Re: Carp strange behavior

2013/3/18 Damien Fleuriot

> On 18 Mar 2013, at 22:22, Rafael Ganascim wrote:
>
> > Hi list,
> >
> > I have multiple FreeBSD firewalls with carp working well. I have no
> > problem and the vast majority of firewalls work perfectly.
> >
> > But now I'm having problems with a simple firewall cluster with carp:
> > the state randomly goes to MASTER and randomly returns to BACKUP.
> >
> > Looking at the L1/L2 tests, I have no rx/tx errors, buffer misses,
> > in/out drops, etc. The physical connection between the firewalls
> > looks good.
> >
> > Monitoring the interfaces/buffers/mbufs/virtual memory with netstat,
> > vmstat... no errors were found.
> >
> > Using tcpdump, I can see that at the exact moment of the state change,
> > the current master firewall stops sending multicasts to 224.0.0.18
> > for some seconds and the state change occurs.
> >
> > The system:
> > # uname -a
> > FreeBSD fw-cj-01 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Thu Feb 28 13:18:41
> > BRT 2013 root@fw-new-01:/usr/obj/usr/src/sys/DEDICr9v1CoreX64 amd64
> >
> > Now, how can I debug why carp stops sending multicast packets?
> >
>
> Lots of things to be said here.
>
> First, how do you know carp stops sending packets?
> Might not be the case.
>
> Second, triple check that the VHID is not already used somewhere else.
>
> Third, any firewalling in place?
> If so, disable it, check for better results.
>
> Fourth, netstat -m -p carp
>
> Fifth, raise advbase on both boxes and see if that helps.
>
> Sixth, what's the frequency of these role swaps?
>
> Seventh, what do you get in dmesg?
>

Hi Damien,

thanks for the help.

1) Really, I don't know. What I can see is the absence of multicast
packets to 224.0.0.18 for some seconds.

2) The VHID is not used anywhere else.

3) pf is enabled with basic rules. I'll test with pf disabled.

4) Here is the output:

# netstat -m -p carp
10744/5396/16140 mbufs in use (current/cache/total)
10232/5406/15638/262144 mbuf clusters in use (current/cache/total/max)
10232/3976 mbuf+clusters out of packet secondary zone in use (current/cache)
0/151/151/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
23150K/12765K/35915K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

# netstat -s -p carp
carp:
        590531 packets received (IPv4)
        0 packets received (IPv6)
                0 packets discarded for wrong TTL
                0 packets shorter than header
                0 discarded for bad checksums
                0 discarded packets with a bad version
                0 discarded because packet too short
                0 discarded for bad authentication
                0 discarded for bad vhid
                0 discarded because of a bad address list
        4915111 packets sent (IPv4)
        0 packets sent (IPv6)
                0 send failed due to mbuf memory error

5) I'll raise advbase and report the results.

6) Look here:

Mar 15 10:41:36 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)
Mar 15 10:45:27 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)
Mar 15 14:09:33 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)
Mar 15 15:36:36 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)
Mar 15 16:31:01 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)
Mar 15 19:31:23 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)
Mar 15 22:13:58 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)
Mar 15 22:46:14 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)
Mar 15 23:41:55 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)
Mar 16 12:31:43 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)
Mar 17 17:38:01 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)
Mar 18 12:21:48 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)
Mar 18 17:30:57 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)

7) dmesg:

carp300: MASTER -> BACKUP (more frequent advertisement received)
carp300: link state changed to DOWN
carp301: MASTER -> BACKUP (more frequent advertisement received)
carp301: link state changed to DOWN
carp302: MASTER -> BACKUP (more frequent advertisement received)
carp302: link state changed to DOWN
carp319: MASTER -> BACKUP (more frequent advertisement received)
carp319: link state changed to DOWN
carp302: link state changed to UP
carp319: link state changed to UP
carp300: link state changed to UP
carp301: link state changed to UP
carp302: MASTER -> BACKUP (more frequent advertisement received)
carp302: link state changed to DOWN
carp319: MASTER -> BACKUP (more frequent advertisement received)
carp319: link state changed to DOWN
carp302: BACKUP -> MASTER (preempting a slower master)
carp302: link state changed to UP
carp319: BACKUP -> MASTER (preempting a slower master)
carp319: link state changed to UP
carp300: MASTER -> BACKUP (more frequent advertisement received)
carp300: link state changed to DOWN
carp301: MASTER -> BACKUP (more frequent advertisement received)
carp301: link state changed to DOWN
carp302: MASTER -> BACKUP (more frequent advertisement received)
carp302: link state changed to DOWN
carp319: MASTER -> BACKUP (more frequent advertisement received)
carp319: link state changed to DOWN
carp302: link state changed to UP
carp319: link state changed to UP
carp301: link state changed to UP
carp300: link state changed to UP
carp302: MASTER -> BACKUP (more frequent advertisement received)
carp302: link state changed to DOWN
carp319: MASTER -> BACKUP (more frequent advertisement received)
carp319: link state changed to DOWN
carp301: MASTER -> BACKUP (more frequent advertisement received)
carp301: link state changed to DOWN
carp301: BACKUP -> MASTER (preempting a slower master)
carp301: link state changed to UP
carp302: BACKUP -> MASTER (preempting a slower master)
carp302: link state changed to UP
carp319: BACKUP -> MASTER (preempting a slower master)
carp319: link state changed to UP
Limiting icmp ping response from 428 to 200 packets/sec
Limiting icmp ping response from 466 to 200 packets/sec
carp300: MASTER -> BACKUP (more frequent advertisement received)
carp300: link state changed to DOWN
carp301: MASTER -> BACKUP (more frequent advertisement received)
carp301: link state changed to DOWN
carp302: MASTER -> BACKUP (more frequent advertisement received)
carp302: link state changed to DOWN
carp319: MASTER -> BACKUP (more frequent advertisement received)
carp319: link state changed to DOWN
carp300: link state changed to UP
carp301: BACKUP -> MASTER (preempting a slower master)
carp301: link state changed to UP
carp302: BACKUP -> MASTER (preempting a slower master)
carp302: link state changed to UP
carp319: BACKUP -> MASTER (preempting a slower master)
carp319: link state changed to UP
carp300: MASTER -> BACKUP (more frequent advertisement received)
carp300: link state changed to DOWN
carp301: MASTER -> BACKUP (more frequent advertisement received)
carp301: link state changed to DOWN
carp302: MASTER -> BACKUP (more frequent advertisement received)
carp302: link state changed to DOWN
carp319: MASTER -> BACKUP (more frequent advertisement received)
carp319: link state changed to DOWN
carp301: link state changed to UP
carp302: link state changed to UP
carp319: link state changed to UP
carp300: link state changed to UP
carp302: MASTER -> BACKUP (more frequent advertisement received)
carp302: link state changed to DOWN
carp301: MASTER -> BACKUP (more frequent advertisement received)
carp319: MASTER -> BACKUP (more frequent advertisement received)
carp301: link state changed to DOWN
carp319: link state changed to DOWN
carp302: BACKUP -> MASTER (preempting a slower master)
carp302: link state changed to UP
carp319: BACKUP -> MASTER (preempting a slower master)
carp319: link state changed to UP
carp301: link state changed to UP
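
To put a number on the question about how often the roles swap, the timestamped kernel lines under 6) can be turned into intervals. What follows is only a rough sketch, not something that was run on the firewalls: the names `parse_transitions` and `gaps_in_seconds` and the regular expression are made up for illustration, the year is an assumption because syslog timestamps do not carry one, and `%b` assumes an English/C locale. Lines without a timestamp, such as raw dmesg output, are simply skipped.

```python
import re
from datetime import datetime

# Matches syslog lines of the form seen above, e.g.
#   Mar 15 10:41:36 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent ...)
# The pattern is an assumption based only on the lines quoted in this mail.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"(?P<ifname>carp\d+): (?P<old>[A-Z]+) -> (?P<new>[A-Z]+)"
)


def parse_transitions(lines, year=2013):
    """Return (timestamp, interface, old_state, new_state) per transition line.

    `year` is supplied by the caller because syslog omits it.
    """
    transitions = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a timestamped carp state change
        stamp = datetime.strptime(
            f"{year} {match.group('ts')}", "%Y %b %d %H:%M:%S"
        )
        transitions.append(
            (stamp, match.group("ifname"), match.group("old"), match.group("new"))
        )
    return transitions


def gaps_in_seconds(transitions):
    """Seconds between consecutive transitions, in chronological order."""
    times = sorted(t[0] for t in transitions)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    sample = [
        "Mar 15 10:41:36 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)",
        "Mar 15 10:45:27 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)",
        "Mar 15 14:09:33 fw-cj-01 kernel: carp300: MASTER -> BACKUP (more frequent advertisement received)",
    ]
    print(gaps_in_seconds(parse_transitions(sample)))  # prints [231.0, 12246.0]
```

Fed with the complete messages file instead of the three sample lines, the same two calls would give the full list of gaps, which makes it easier to see whether the swaps cluster at particular times of day.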