Date:      Tue, 15 Apr 2008 17:53:12 +0200
From:      Andre Oppermann <andre@freebsd.org>
To:        Qing Li <qingli@speakeasy.net>
Cc:        'Qing Li' <qingli@FreeBSD.org>, src-committers@FreeBSD.org, claudio@openbsd.org, cvs-all@FreeBSD.org, cvs-src@FreeBSD.org
Subject:   Re: cvs commit: src/sys/conf files options src/sys/net radix.c radix.h route.c route.h rtsock.c src/sys/netinet in_proto.c ip_output.c  src/sys/netinet6 in6_proto.c in6_src.c nd6_nbr.c
Message-ID:  <4804CF68.9060109@freebsd.org>
In-Reply-To: <000201c89eae$d4dcfe10$b1335140@SAINTS>
References:  <200804130545.m3D5jEtd081771@repoman.freebsd.org> <4803D7E2.80000@freebsd.org> <000201c89eae$d4dcfe10$b1335140@SAINTS>

Qing Li wrote:
> 	Hi Andre,
>>>   is disallowed. For example,
>>>   
>>>           route add -net 192.103.54.0/24 10.9.44.1
>>>           route add -net 192.103.54.0/24 10.9.44.2
>>>   
>>>   The second route insertion will trigger an error message of
>>>   "add net 192.103.54.0/24: gateway 10.2.5.2: route already 
>> in table"
>>
>> Would it make sense to retain this behavior by default (POLA) 
>> and have multi-path being enabled via sysctl like packet 
>> forwarding in general?
>> Just adding the same route twice with different next-hops can 
>> lead to very confusing situations for the users which are not 
>> used to multi-path.
>>
> 
> 	I think that is possible. Were you thinking more along the
> 	line of accidental route insertion ... Because users who
> 	are not familiar with ecmp probably won't ever bother
> 	with more than one route per destination. 

Without an error message on adding a second route, this happens easily.
With hash-based balancing, some connections then work and some do not.
Very confusing.
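
To illustrate the failure mode, here is a minimal user-space sketch. It
assumes flows are spread over the next-hops by a CRC32 of the address/port
4-tuple; that hash, the pick_next_hop helper and the sample addresses are
illustrative assumptions, not what the kernel actually does.

```python
import zlib


def pick_next_hop(flow, next_hops):
    """Pick one next-hop for a flow by hashing its addresses and ports."""
    key = "|".join(str(field) for field in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]


# Two routes for the same prefix. Assume the second one was added by
# accident and its gateway does not actually forward the traffic.
next_hops = ["10.9.44.1", "10.9.44.2"]
broken = {"10.9.44.2"}

# Sixteen connections that differ only in their source port.
flows = [("192.0.2.10", port, "192.103.54.7", 80)
         for port in range(40000, 40016)]

for flow in flows:
    hop = pick_next_hop(flow, next_hops)
    status = "fails" if hop in broken else "works"
    print(flow[1], "->", hop, status)
```

Each flow always maps to the same gateway, so a given connection either
works consistently or fails consistently, which is what makes the problem
so hard to spot.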

>>>   "route: writing to routing socket: No such process"
>>>   "delete net default: not in table"
>> Can this be made more descriptive?  These messages are about 
>> as confusing and non-descript as possible.  
>>
> 
> 	We should fix the above error message in general.
> 
>> Not being aware of the multipath functionality I would pull 
>> out my last hair trying to get rid of a route.
>>
> 
> 	I think updating the manpage would be a necessary
> 	next step.
> 
>> How does this behave with common routing daemons; 
>> Quagga/Zebra, OpenBGPD, OpenOSPFD?  
>>
> 
> 	Hmm... Good question, I haven't tried them but
> 	I will.  Is this something you could help me
> 	with ?

I've chatted with Claudio Jeker (claudio@openbsd.org).  He's the author
of OpenBGPD and OpenOSPFD plus some work on the OpenBSD multipath support.

He says the implicit multipath doesn't work out right and is very difficult
to manage from the routing daemons.  In OpenBSD they had to change it to
explicitly mark multipath routes with the RTF_MPATH flag in the table, during
creation and removal.

The problem is that many daemons and programs (dhclient, ppp, ...) do not
properly remove routes and simply re-add a new one with different parameters.
This obviously leads to chaos.

In OpenBSD one has to install a multipath route explicitly with the -mpath
modifier to route(8), and daemons have to set RTF_MPATH in the routing
message.  Multipath routes also retain this flag during their lifetime.  If
it is not set, the normal one-route-only behavior is kept.  This allows all
non-mpath-aware programs to continue to work.
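
A rough user-space model of that explicit-flag behavior might look like the
sketch below.  The RouteTable class, its method names and the exact checks
are hypothetical and only illustrate the idea; they are not the OpenBSD
kernel logic.

```python
class RouteExists(Exception):
    """Raised when an add would silently create a second route."""


class RouteTable:
    def __init__(self):
        # prefix -> list of (gateway, mpath_flag) entries
        self.routes = {}

    def add(self, prefix, gateway, mpath=False):
        entries = self.routes.setdefault(prefix, [])
        if entries:
            # A further route is accepted only when the new route and all
            # existing ones carry the explicit multipath mark.
            if not mpath or not all(flag for _, flag in entries):
                raise RouteExists(f"add net {prefix}: route already in table")
            if any(gw == gateway for gw, _ in entries):
                raise RouteExists(
                    f"add net {prefix}: gateway {gateway} already present")
        entries.append((gateway, mpath))

    def gateways(self, prefix):
        return [gw for gw, _ in self.routes.get(prefix, [])]


table = RouteTable()

# Non-mpath-aware caller: the second add is rejected as before.
table.add("192.103.54.0/24", "10.9.44.1")
try:
    table.add("192.103.54.0/24", "10.9.44.2")
except RouteExists as err:
    print(err)

# Explicit multipath: both routes are marked, so both are kept.
table.add("10.0.0.0/8", "192.168.1.1", mpath=True)
table.add("10.0.0.0/8", "192.168.1.2", mpath=True)
print(table.gateways("10.0.0.0/8"))
```

A program that never sets the flag sees exactly the old one-route-only
behavior, which is the point of making multipath opt-in.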

I think this is the model to follow.  Also for inter-BSD compatibility.

>> Do they have to be aware 
>> of the multipath functionality?  Will it confuse them?
>>
> 
> 	I don't believe these routing protocols necessarily
> 	have to know about the multipath functionality.
> 	The routing protocols should continue to function
> 	wrt route insertion/deletion.

It's easy to throw them into disarray as they do not expect routes to
persist when they delete (one of) them.

> 	You do bring up a good question about whether
> 	we should associate ownership with a route entry
> 	if multiple routing protocols are running
> 	in parallel. Is this a common practice from your
> 	experience ? And should we allow multiple routes
> 	with the same next-hop but different owners in
> 	the FIB ??

Yes.  Let me explain.  There are two approaches here: The Quagga/Zebra
approach where all routing protocol daemons communicate with a central
daemon that is the single point of contact to the kernel.  The other
approach is the OpenBGPD/OpenOSPFD approach where each daemon runs on
its own (because most of the time there is little to no overlap) and
does its own routing table manipulations.  The second approach is a
bit tricky at the moment as the routing socket is not really intended
for operating in this way and the daemons have to be aware of each
other in certain ways.

Ideally, and this is what Claudio says as well, we should end up with
the following functionality:

  - equal cost multipath where one prefix can have multiple next-hops.
  - ecmp should be explicit with the RTF_MPATH flag.
  - a hierarchy of multiple prefixes where the one with the highest
    priority carries the traffic (possibly with ecmp).
  - the hierarchy should have a number of precedence levels (interface
    route, static route, IGP route, EGP route, other).
  - within those precedence levels it should have further subdivision
    to prefer OSPF over RIP in the IGP category for example.
  - a change/delete applies to a specific precedence level if specified.
  - routing socket filters on reading so that routing daemons can
    select which precedence levels they want to track (IGP doesn't
    have to track EGP route changes for example).
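
The last point, filtering on read, could be modelled in user space roughly
as follows.  The level names, the message dictionaries and make_filter are
made up for illustration; this is not a routing socket API.

```python
# Hypothetical precedence levels, named after the hierarchy above.
LEVELS = ("interface", "static", "igp", "egp", "other")


def make_filter(wanted):
    """Return a predicate that accepts only messages at the wanted levels."""
    unknown = set(wanted) - set(LEVELS)
    if unknown:
        raise ValueError(f"unknown precedence levels: {sorted(unknown)}")
    wanted = frozenset(wanted)
    return lambda msg: msg["level"] in wanted


# Routing messages as a daemon might see them on an unfiltered socket.
messages = [
    {"type": "add", "prefix": "10.0.0.0/8", "level": "igp"},
    {"type": "add", "prefix": "198.51.100.0/24", "level": "egp"},
    {"type": "delete", "prefix": "10.1.0.0/16", "level": "static"},
]

# An IGP daemon that does not care about EGP route changes.
igp_daemon_filter = make_filter({"interface", "static", "igp"})
for msg in messages:
    if igp_daemon_filter(msg):
        print(msg["type"], msg["prefix"], msg["level"])
```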

With this functionality a number of independent but complementary routing
daemons can work together in a useful and, more importantly, standardized way.

For example: ospfd inserts a multipath route for 10.0.0.0/8 via 192.168.1.1
and 192.168.1.2 with precedence 4.  bgpd inserts a single route for
10.0.0.0/8 via 192.168.1.3 with precedence 8.  All traffic goes through
192.168.1.1 and .2.  If ospfd removes both routes, .3 becomes active right
away.  Normally bgpd would have to notice the removal and then insert the
new prefix.  If ospfd then wanted to insert its routes again, it would have
to remove or modify the route bgpd installed.
With precedence multipath these problems go away.
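
The scenario can be walked through with a small user-space sketch.  The
PrecedenceTable class is hypothetical, and it assumes that a numerically
lower precedence wins, which is how the 4-versus-8 example reads.

```python
class PrecedenceTable:
    def __init__(self):
        # prefix -> {precedence: [gateways]}
        self.entries = {}

    def add(self, prefix, precedence, gateway):
        levels = self.entries.setdefault(prefix, {})
        levels.setdefault(precedence, []).append(gateway)

    def delete(self, prefix, precedence, gateway):
        # A delete touches only the given precedence level.
        levels = self.entries[prefix]
        levels[precedence].remove(gateway)
        if not levels[precedence]:
            del levels[precedence]

    def active(self, prefix):
        """Gateways of the best (numerically lowest) precedence level."""
        levels = self.entries.get(prefix, {})
        if not levels:
            return []
        return levels[min(levels)]


table = PrecedenceTable()
table.add("10.0.0.0/8", 4, "192.168.1.1")   # ospfd, multipath
table.add("10.0.0.0/8", 4, "192.168.1.2")   # ospfd, multipath
table.add("10.0.0.0/8", 8, "192.168.1.3")   # bgpd, single route
print(table.active("10.0.0.0/8"))           # both ospfd gateways

# ospfd withdraws its routes; the bgpd route takes over by itself.
table.delete("10.0.0.0/8", 4, "192.168.1.1")
table.delete("10.0.0.0/8", 4, "192.168.1.2")
print(table.active("10.0.0.0/8"))           # the bgpd gateway

# ospfd comes back without touching the route bgpd installed.
table.add("10.0.0.0/8", 4, "192.168.1.1")
print(table.active("10.0.0.0/8"))
```

Neither daemon has to watch or modify the other's entries; the table picks
the active set on its own.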

>> What about the other big missing piece; new-arp? ;-)  
>>
> 
> 	That's on its way. Julian is helping me testing the
> 	patch and reviewing the code etc.  I am still
> 	debugging a locking/reference count issue and
> 	I hope to make good progress in the coming week.

May I have a look too before it goes into CVS?

-- 
Andre
