Date:      Thu, 29 Sep 2011 15:05:58 GMT
From:      Damien Fleuriot <dam@my.gd>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/161123: CARP - when preemption is enabled carp interface assumes MASTERship immediately even with higher advbase/advskew
Message-ID:  <201109291505.p8TF5wir047014@red.freebsd.org>
Resent-Message-ID: <201109291510.p8TFA9m7094615@freefall.freebsd.org>


>Number:         161123
>Category:       kern
>Synopsis:       CARP - when preemption is enabled carp interface assumes MASTERship immediately even with higher advbase/advskew
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Sep 29 15:10:09 UTC 2011
>Closed-Date:
>Last-Modified:
>Originator:     Damien Fleuriot
>Release:        8.2-RELEASE
>Organization:
Hi-Media
>Environment:
FreeBSD pf2.multiprojet 8.2-RELEASE FreeBSD 8.2-RELEASE #1: Thu Sep 29 16:11:04 CEST 2011     root@pf2.multiprojet:/usr/obj/usr/src/sys/MULTI  amd64
>Description:
Under normal operating circumstances, a CARP interface goes through the following states:
- INIT: the interface is down.
- BACKUP: immediately upon being brought up, the interface assumes the BACKUP role and starts a timer to decide whether it should claim mastership.
- MASTER: if the delay (advbase * 3) expires without the interface seeing another master, it assumes mastership.
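
To make the delay concrete, here is a minimal sketch of how long a BACKUP waits before claiming mastership. It assumes, from my reading of the BACKUP case in carp_setrun(), that the timeout is 3 * advbase seconds plus advskew/256 of a second. The helper name carp_master_down_usec is hypothetical and does not exist in ip_carp.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of ip_carp.c.
 * Assumption: a BACKUP waits 3 * advbase seconds plus advskew/256 of a
 * second without hearing an advertisement before it claims MASTER.
 * Returns that delay in microseconds.
 */
static uint64_t
carp_master_down_usec(uint8_t advbase, uint8_t advskew)
{
	uint64_t sec = 3ULL * advbase;
	uint64_t usec = (uint64_t)advskew * 1000000ULL / 256ULL;

	return (sec * 1000000ULL + usec);
}
```

Under that assumption, a box with a higher advskew waits slightly longer than its peer, which is what should keep it in BACKUP while a healthy master is still advertising.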


BUG: When preemption is enabled (net.inet.carp.preempt=1), the CARP interface assumes MASTERship immediately, regardless of its advbase and advskew values.

For example, this causes an unwanted CARP switchover whenever a firewall in a CARP cluster is rebooted.

In our case, this led to lost client connections, lost database sessions, and crashes of developers' daemons because of lost Java/DB connections.



This is a known problem in the CARP implementation of OpenBSD 3.8 and earlier.
It was fixed in OpenBSD 3.9.

See my post on freebsd-stable:
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=368260+0+current/freebsd-stable

>How-To-Repeat:
Set up 2 boxes with a shared CARP IP.
Enable CARP preemption.

Bring down your CARP interface on the BACKUP box.
Bring it up again.
Notice that the interface assumes MASTERship for a short time.
Check dmesg, which confirms that the box actually preempted.
>Fix:
The fix lies in sys/netinet/ip_carp.c, in the function carp_setrun(struct carp_softc *sc, sa_family_t af).

All that is needed is to remove the portion of code that instructs the CARP interface to transition immediately from INIT to MASTER when preemption is enabled.
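
As a rough illustration only (a toy model, not code from the tree), the change in the INIT case can be sketched as follows. The enum values, the function carp_leave_init, and the "patched" flag are all hypothetical names introduced here to contrast the two behaviours.

```c
#include <stdbool.h>

/* Illustrative states; the real ones live in ip_carp.c. */
enum carp_state { CARP_INIT, CARP_BACKUP, CARP_MASTER };

/*
 * Toy model of what happens when an interface leaves INIT.
 * preempt          models net.inet.carp.preempt
 * suppress_preempt models carp_suppress_preempt
 * patched          selects the behaviour with the proposed fix applied
 */
static enum carp_state
carp_leave_init(bool preempt, bool suppress_preempt, bool patched)
{
	/* Unpatched: INIT -> MASTER (preempting), ignoring advbase/advskew. */
	if (!patched && preempt && !suppress_preempt)
		return (CARP_MASTER);

	/* Patched, or preemption not in effect: always INIT -> BACKUP. */
	return (CARP_BACKUP);
}
```

With the fix, the interface always enters BACKUP first and only becomes MASTER through the normal timer path.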

Patch attached.


Patch attached with submission follows:

--- sys/netinet/ip_carp.c	2011-09-29 15:00:07.000000000 +0200
+++ sys/netinet/ip_carp.c	2011-09-29 15:01:37.000000000 +0200
@@ -1390,22 +1390,10 @@
 
 	switch (sc->sc_state) {
 	case INIT:
-		if (carp_opts[CARPCTL_PREEMPT] && !carp_suppress_preempt) {
-			carp_send_ad_locked(sc);
-			carp_send_arp(sc);
-#ifdef INET6
-			carp_send_na(sc);
-#endif /* INET6 */
-			CARP_LOG("%s: INIT -> MASTER (preempting)\n",
-			    SC2IFP(sc)->if_xname);
-			carp_set_state(sc, MASTER);
-			carp_setroute(sc, RTM_ADD);
-		} else {
-			CARP_LOG("%s: INIT -> BACKUP\n", SC2IFP(sc)->if_xname);
-			carp_set_state(sc, BACKUP);
-			carp_setroute(sc, RTM_DELETE);
-			carp_setrun(sc, 0);
-		}
+		CARP_LOG("%s: INIT -> BACKUP\n", SC2IFP(sc)->if_xname);
+		carp_set_state(sc, BACKUP);
+		carp_setroute(sc, RTM_DELETE);
+		carp_setrun(sc, 0);
 		break;
 	case BACKUP:
 		callout_stop(&sc->sc_ad_tmo);


>Release-Note:
>Audit-Trail:
>Unformatted:


