Date:      Tue, 28 Mar 2006 22:56:46 -0500
From:      Richard A Steenbergen <ras@e-gerbil.net>
To:        Brad <brad@comstyle.com>
Cc:        freebsd-net@freebsd.org
Subject:   Re: 802.3ad?
Message-ID:  <20060329035645.GK45591@overlord.e-gerbil.net>
In-Reply-To: <20060329020343.GB20602@blar.home.comstyle.com>
References:  <20060328205624.GZ20678@gremlin.foo.is> <20060328215911.GA20602@blar.home.comstyle.com> <20060329002015.GI45591@overlord.e-gerbil.net> <20060329020343.GB20602@blar.home.comstyle.com>

On Tue, Mar 28, 2006 at 09:03:43PM -0500, Brad wrote:
> On Tue, Mar 28, 2006 at 07:20:15PM -0500, Richard A Steenbergen wrote:
> > On Tue, Mar 28, 2006 at 04:59:11PM -0500, Brad wrote:
> > > On Tue, Mar 28, 2006 at 08:56:24PM +0000, Baldur Gislason wrote:
> > > > Following an unrelated discussion about "interface grouping" in OpenBSD,
> > > > I'd like to know if there are any known or planned implementations of LACP (802.3ad)
> > > > interface teaming in FreeBSD?
> > > > FreeBSD currently has etherchannel support, but to my knowledge that will only
> > > > work for a link to a single switch and not provide the possibility of having layer
> > > > 2 resilience with paths to 2 switches in the same network.
> > > > Any thoughts on this?
...
> He was not asking for ECMP. He clearly asked for a failover mechanism
> for two switches within the SAME Layer *2* domain.

Hrm, OK, I misread the original post, but I'm still not exactly sure what 
the original poster wanted. Two redundant paths out via two switches? 
Sounds like an application for VRRP to me.
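(For what it's worth, the moral equivalent on the BSDs is carp(4). A 
minimal sketch of a shared virtual gateway, with a made-up address and 
password -- the vhid and pass have to match on both routers:

    ifconfig carp0 create
    # join virtual router 1 and advertise the shared address; the
    # master for the address is elected automatically
    ifconfig carp0 vhid 1 pass s3kr1t 192.0.2.1/24

Run the same thing on the second router and the address floats between 
them.)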

The operations you're really trying to support are:

a) Routing or bridging directly on an interface,
b) Creating a virtual L2 interface (such as link-agg) and then routing or 
   bridging on it, or
c) Bridging on the physical interfaces, making them members of the same 
   vlan, and then creating a common virtual L3 interface for the vlan.

Option c) is probably what the original poster is going for, so that they 
can talk to two different switches on two different physical interfaces 
via a common L3 interface.
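
On FreeBSD that's just if_bridge(4); a minimal sketch, with made-up 
interface names and addressing:

    # bridge the two physical ports into one L2 domain
    ifconfig bridge0 create
    ifconfig bridge0 addm em0 addm em1 up
    # the common L3 interface lives on the bridge itself
    ifconfig bridge0 inet 192.0.2.10/24

You'd also want to run spanning tree on the members (stp em0 stp em1), or 
the two switches will hand you exactly the forwarding loop this thread is 
about.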

A properly designed system should separate out these layers internally, 
in order to tie all of these features together under a common framework. 
Whether you configure your L3 interface on one physical interface, on a 
virtual L2 interface composed of multiple member links, or on an L2 vlan 
which may map to one or many physical ports (or have multiple L2 vlans 
map to a single physical interface doing 802.1q), it should all be the 
same. It should be designed this way from the lower layers up, not from 
the higher layers down with funky bridging hacks, funky vlan hacks, and 
funky link-agg hacks.
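
To make the layering concrete, here's roughly what that stack looks like 
with a lagg-style virtual L2 interface (a sketch using FreeBSD lagg(4) 
syntax, which postdates this thread; names and addresses are made up):

    # L2: aggregate two physical ports into one virtual link
    ifconfig lagg0 create
    ifconfig lagg0 laggproto lacp laggport em0 laggport em1 up
    # L2: map an 802.1q vlan onto the aggregate
    ifconfig vlan100 create vlan 100 vlandev lagg0
    # L3: the routed interface neither knows nor cares what's below it
    ifconfig vlan100 inet 192.0.2.20/24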

> What does link aggregation have to do with trunk ports? Link aggregation
> and trunk ports can be configured separately or in a combined fashion but
> in this scenario there is no need or use for trunking. What you have
> described is still not what trunk(4) does.

OK so, I'm not sure what you think trunking means (in all fairness it IS 
probably one of the most abused terms out there :P), but link aggregation 
and "trunking" as it is used in this context are the exact same thing. 
You're taking two separate physical L2 channels, binding them together 
under a common virtual L3 interface, applying certain rules like "don't 
create forwarding loops", and using some mechanism to decide which traffic 
to send to which links. Methods like "round robin" and "backup only" just 
happen to be really bad/unusual hashes, but in all other respects it is 
link aggregation.

> PAgP is the negotiation protocol for EtherChannel. So LACP is not bringing
> anything new to the table, except getting away from Cisco proprietary
> standards, which is a really good thing.

I think that's what I said before. :)

> In theory round robin is bad but in reality it is not as bad as is typically
> made out, plus it also depends on the protocols being used over this link.

Reordering TCP packets within a flow has a definite and profound impact on 
the performance of those flows, unless your link speed is so low already 
that it doesn't much matter. If you don't have any mechanism that attempts 
to hash individual TCP flows onto a specific channel/trunk/aggregation 
member, you will most definitely see reordering within a flow.
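
(Implementations that get this right let you choose how deep the hash 
looks; e.g. the later FreeBSD lagg(4) grew a knob for exactly this, shown 
here as a sketch:

    # hash on the L3 addresses plus L4 ports, so each TCP flow
    # sticks to a single member link
    ifconfig lagg0 lagghash l3,l4

Anything flow-aware like that avoids the reordering problem entirely.)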

If you're talking 4xT1s or something this probably isn't a big deal; if 
you're talking NxFE or NxGE speeds and high-speed flows, it probably is. 
I'm not saying they shouldn't be providing the option if you REALLY want 
per-packet load balancing, but I don't see any sane reason to make that 
the only load balancing mechanism.

> trunk(4) can also provide L2 failover with a primary uplink and one or more
> backup uplinks. trunk(4) can also work in other scenarios like for example
> failover between a wireless uplink and a Ethernet uplink, when the AP is
> within the same L2 domain (bridging AP, as opposed to a router) as the
> Ethernet uplink.

Like I said, that is actually just link aggregation, but with an unusual 
"member-aware" hash that isn't often implemented in commercial networking 
devices.
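
For reference, the setup being described is something like this on 
OpenBSD (a sketch, with made-up interface names and addressing):

    ifconfig trunk0 create
    # "failover" is the member-aware hash: all traffic uses em0 until
    # it loses link, then shifts to ath0
    ifconfig trunk0 trunkproto failover trunkport em0 trunkport ath0 up
    ifconfig trunk0 inet 192.0.2.30 netmask 255.255.255.0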

> You can think it sucks all you want, but it makes me sleep sounder at night
> knowing I can engineer my system with higher availability in mind with trunk(4)
> in use than is currently possible with a FreeBSD/NetBSD/DragonFly system.

Well, just to be clear, it's the implementation that sucks, not the 
concept. I still think you're confused about what the "trunking" is 
actually buying 
you though. The original motivation behind creating etherchannel style 
link aggregation was for L2 switches to be able to talk to each other with 
more capacity than a single link can provide. The inability to create 
operational parallel paths without forwarding loops or a specific 
etherchannel construct is an issue that only exists at L2. You still need 
this L2 construct in order to happily work with all the devices out there 
(though now we just dig deeper into the header to do L3/L4/Payload 
hashing on these L2 trunks), but a properly designed ECMP system would 
solve a lot of issues too.

-- 
Richard A Steenbergen <ras@e-gerbil.net>       http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)


