From owner-freebsd-net@FreeBSD.ORG Sat Feb 4 21:34:47 2006
Date: Sat, 4 Feb 2006 21:34:42 +0000
From: Brian Candler <b.candler@pobox.com>
To: Matthew Lineen
Cc: freebsd-net@freebsd.org
Subject: Re: freebsd 6.0 network card / route fail over question
Message-ID: <20060204213442.GA91647@uk.tiscali.com>
In-Reply-To: <43E3B018.3080301@tablexi.com>
List-Id: Networking and TCP/IP with FreeBSD

On Fri, Feb 03, 2006 at 01:33:44PM -0600, Matthew Lineen wrote:
> I'm trying to work out the specifics of NIC/route fail over on FreeBSD
> 6.0 and hoped someone here could point me in the right direction.
> We have 2 ServerIron load balancers and each of our application servers
> is plugged into both LBs.
>
> So, for example, an app server would have the following...
>
> bge0  IP of x.y.z.61  netmask 255.255.255.128
> bge1  IP of x.y.z.63  netmask 255.255.255.128
>
> In /etc/rc.conf the default route is x.y.z.1
>
> In the routing table, the default route uses Netif bge0. So, when we
> turn off the first load balancer, bge0 goes down, but the default route
> never "moves" from bge0.
>
> I assume this is because ...
>
> #1 - FreeBSD doesn't like having two interfaces bound to the same
> x.y.z/25 network (we get plenty of the "arp: x.y.z.123 is on bge0 but
> got reply from ... on bge1" messages)

Correct.

> #2 - The default route is bound to bge0 because bge0 is the first
> interface that contains an IP in the same network as the default route's.
>
> So, my question is: what approaches do people take to solve this
> problem?  I've come across forwarding and carp, but I thought I'd ask
> the list to see if there is something simple I'm missing, other ways of
> handling this, etc...

I don't see a simple alternative. The approaches I can see are:

(1) The layer 2 approach. Make an Ethernet bundle consisting of the two
links, so that a single IP address is shared by both. I don't know if
FreeBSD supports this, and in any case it will almost certainly only work
if the two uplinks go into the same switch.

(2) The layer 3 approach. Assign bge0 and bge1 different IP addresses
(preferably on two different subnets). Learn your default route via OSPF
or RIP from the upstream router(s), using something like quagga. Given
that the upstream devices are ServerIrons, which are really just fancy
switches, this may not work, but maybe you can get a RIP default-route
announcement out of them.

(3) The layer 7 approach. On each server just have a *single* uplink into
one of the two ServerIrons, and rely on your application failover
mechanism.
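As an aside on (1): a minimal sketch of what the bundle could look like, assuming a FreeBSD release with lagg(4) available (it is not in the 6.0 base system, so this assumes a later release or a similar netgraph-based setup). The addresses are the ones from your mail; "failover" mode keeps bge0 active and only shifts traffic to bge1 when bge0 loses link:

```
# sketch only -- requires lagg(4), not present in FreeBSD 6.0 base
ifconfig lagg0 create
ifconfig lagg0 laggproto failover laggport bge0 laggport bge1 \
    x.y.z.61 netmask 255.255.255.128
route add default x.y.z.1
```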
You presumably have multiple application servers, so if a whole server
fails, everything keeps working properly, right? In that case, rely on
that same mechanism to cope with the case where your server's NIC, the
cable, or the upstream switch fails. Make sure half the servers are on
one switch and half on the other, so if a whole switch fails you still
have half your servers reachable. And keep a spare switch in the closet.

Method (3) is the one I've used successfully for a mailserver cluster.
There were two MX receivers, two webmail servers, and four POP3 servers;
half on one uplink and half on the other. IMO it's at least as likely
that a whole server will fail (bad PSU, failed hard drive etc.) as that
a NIC or switch port will fail.

Regards,

Brian.
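P.S. If you do try route (2), here is a rough sketch of a listen-only
quagga ripd.conf for learning a default route over both uplinks. This
assumes the ServerIrons can be persuaded to announce 0.0.0.0/0 via RIPv2,
which I can't promise; passive-interface stops the box from sending RIP
updates of its own while still accepting them:

```
! /usr/local/etc/quagga/ripd.conf -- sketch, listen-only RIP
router rip
 version 2
 network bge0
 network bge1
 passive-interface bge0
 passive-interface bge1
```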