Date:      Thu, 10 Jan 2008 10:29:03 PST
From:      Qing Li <qingli@speakeasy.net>
To:        freebsd-arch@freebsd.org, Randall Stewart <rrs@cisco.com>, qingli@freebsd.org
Subject:   Re: Routing in the network :-)
Message-ID:  <30834.1199989743@speakeasy.net>

    Interesting that you are bringing this up ...  I actually sent a similar
    email to freebsd-net@ about 2 years ago and had one response back (it was
    a polite no).

    I back-ported and integrated the radix_mpath changes from KAME into
    FreeBSD 5.4, and the changes are working well right now in a production
    environment. Changes were also necessary in quite a few places throughout
    the netinet/ files, e.g., address initialization functions such as
    in_ifinit().

    I actually discussed what I have done with itojun back in August of 2007.

>
> On Thu Jan 10 6:55, Randall Stewart sent:
>
> Hi all:
>
> A number of years ago, Itojun and I had played off and on
> with some modifications to both the routing table and to a
> "new" interface that could be used by transports to gain
> routing information.
>
> I am contemplating digging back in my archives and building
> a p4 branch that would have the changes for folks to look at..
> But before I go to all that trouble I want to have a discussion
> about this here ;-)
>
> This will be a longish email so if you get bored easily or just
> don't care about routing/networks and all that fun, you have
> been warned :-)
>
> The basic concept:
>
> So say I am at home and have purchased two DSL's. One from
> AT&T (don't you love the new ma-bell) and the other from
> SpeakEasy (Note I had this until I moved out to the country
> now I am lucky to have one DSL.. but many can do this if they
> want)... So my home looked like:
>
>
>  IP-A       IP-S
>    |          |
>    |          |
>    |          |
> ,__|__________|___
> |                 |
> |                 |
> |  lakerest.net   |
> |                 |
> |_________________|
>
> Now life is good, I have some degree of
> fault tolerance right?
>
> So AT&T (IP-A) gives me the default route to IP-A1
> and Speak Easy gives me the default route to IP-S1.
> Life is not so good... how do I plumb these in the
> routing table?
>
> I can say
>
>     route add default IP-A1
> or
>     route add default IP-S1
>
> But I cannot have both. And worse if I had a connection
> up to FreeBSD.net and AT&T's network went down.. and I
> happened to have put the first command in.. my network
> connection would stop...
>
> What would be nice is if I had a way to add BOTH routes
> into the kernel.. and when Layer 4 realized there were some
> major problems going on it could "use" the alternate
> route (i.e. via IP-S1) and life would once again be
> good..
>
> Ok, yes, the observant person out there will say.. wait
> IP-S1 will NEVER allow your packets through since they
> probably do ingress filtering.. yes I am aware of this.. but
> this would *NOT* hold true for some device in the network
> talking to some other device in the network.. *OR* for
> speakeasy.. at least not circa 2004.. since speakeasy
> did *NOT* do ingress filtering and my way back former
> employer (AT&T) *DID* do ingress filtering..
>
> So the idea is rather simple:
>
> 1) Allow multiple routes on any level of the kernel
> patricia trees.
>

    This is done.
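
    As a rough illustration of item 1 only (this is not the KAME
    radix_mpath code), the sketch below uses a flat toy table in place of
    the patricia tree. All names (toy_route, toy_table, toy_add,
    toy_count) are hypothetical. The point it shows is the rule change:
    two entries for the same destination are accepted as long as their
    gateways differ.

```c
#include <stdint.h>

#define MAX_ROUTES 8

/* Hypothetical toy route entry; not the kernel's struct rtentry. */
struct toy_route {
	uint32_t dst;      /* destination network */
	uint32_t mask;     /* netmask; dst 0 / mask 0 stands for "default" */
	uint32_t gateway;  /* next hop */
	int      in_use;   /* nonzero if this slot holds a route */
};

struct toy_table {
	struct toy_route r[MAX_ROUTES];
};

/*
 * Add a route. The same dst/mask with a different gateway is allowed;
 * only an exact duplicate (dst, mask, gateway) is rejected.
 * Returns 0 on success, -1 on duplicate or full table.
 */
static int
toy_add(struct toy_table *t, uint32_t dst, uint32_t mask, uint32_t gw)
{
	int i, free_slot = -1;

	for (i = 0; i < MAX_ROUTES; i++) {
		struct toy_route *e = &t->r[i];

		if (!e->in_use) {
			if (free_slot < 0)
				free_slot = i;
			continue;
		}
		if (e->dst == dst && e->mask == mask && e->gateway == gw)
			return -1;	/* exact duplicate */
	}
	if (free_slot < 0)
		return -1;		/* table full */

	t->r[free_slot].dst = dst;
	t->r[free_slot].mask = mask;
	t->r[free_slot].gateway = gw;
	t->r[free_slot].in_use = 1;
	return 0;
}

/* Count how many next hops are installed for one destination. */
static int
toy_count(const struct toy_table *t, uint32_t dst, uint32_t mask)
{
	int i, n = 0;

	for (i = 0; i < MAX_ROUTES; i++)
		if (t->r[i].in_use && t->r[i].dst == dst &&
		    t->r[i].mask == mask)
			n++;
	return n;
}
```

    With this rule, adding a default route via IP-A1 and another via
    IP-S1 leaves two next hops installed for the default destination.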

>
> 2) Add an additional interface to the routing code
> so that a transport protocol could query the
> routing table for additional support... i.e.
> excuse me, the route that I had no longer seems
> to be working, do you have an alternate gateway?
>

    There was an inp_route field in the in_pcb{} structure but
    that field was later removed by Andre in 5.5. I never quite
    understood why but I did find that field to be rather useful.

	union {
		/* placeholder for routing entry */
		struct	route inc4_route;
#if 1 /* def NEW_STRUCT_ROUTE */
		struct  route inc6_route;
#else
		struct  route_in6 inc6_route;
#endif
	} inc_dependroute;

    I used this field for caching and it gets flushed when
    there is a routing table change. It works out well.
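
    One common way to get that kind of per-connection cache, sketched
    here under assumptions and not taken from the patch, is a generation
    counter: every table change bumps the counter, and a cached route is
    trusted only while its recorded generation still matches. This
    invalidates lazily instead of walking every PCB to flush it. The
    names toy_pcb, toy_rt_gen, toy_slow_lookup, toy_route_change and
    toy_pcb_route are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical generation counter, bumped on every table change. */
static unsigned int toy_rt_gen = 1;

/* Stand-in for the routing table: a single current next hop. */
static uint32_t toy_current_gw = 0x0a000001;

/* Number of full lookups performed, to make cache hits observable. */
static int toy_lookups;

/* Hypothetical per-connection block holding the cached route. */
struct toy_pcb {
	uint32_t     cached_gw;  /* cached next hop, 0 if nothing cached */
	unsigned int gen;        /* table generation when it was cached */
};

/* Stand-in for a full (expensive) routing table lookup. */
static uint32_t
toy_slow_lookup(void)
{
	toy_lookups++;
	return toy_current_gw;
}

/* Any routing table change makes every cached entry stale. */
static void
toy_route_change(uint32_t new_gw)
{
	toy_current_gw = new_gw;
	toy_rt_gen++;
}

/* Return the next hop for this connection, refreshing a stale cache. */
static uint32_t
toy_pcb_route(struct toy_pcb *p)
{
	if (p->cached_gw == 0 || p->gen != toy_rt_gen) {
		p->cached_gw = toy_slow_lookup();
		p->gen = toy_rt_gen;
	}
	return p->cached_gw;
}
```

    Repeated calls to toy_pcb_route() on an unchanged table do one full
    lookup; the next call after toy_route_change() does another.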

>
> Now I admit for TCP these API's would have limited use..
>

    That depends ...  :-)

>
> but for SCTP these are golden.. since both sides know
> about all addresses and thus you get a form of true
> network diversity out of this little software change.
>
>
> Now yes, this does not help you if both your DSL's
> go out to the same pole outside your house, and a
> truck hits the pole... but it *DOES* help you if
> your network provider dies somewhere back in the CO
> running across your carpet to AT&T's DSL and it thinks
> chewing on it would be fun :-)
>
> So what was required way back in 4.x when Itojun and
> I did this work.. (note that Itojun called his changes
> RADIX_MPATH which did NOT include my alternate
> routing lookup code).
>
> a) For radix.c there were just a few simple changes that
> removed the restriction that prevents duplicate routes
> at any level of the tree.
>
> b) For route.c a new method is added.. this is a bit
> of code not huge but some.
>

   The rtrequest1() function needed a bit of work but not so huge.

>
> c) One thing I added but took back out, was some changes to
> the "route delete" api... can't remember exactly where.. but
> basically the delete does not look at the destination ... i.e.
> with the changes Itojun and I had cooked up if you said:
> route add default IP-1
> route add default IP-2
> route add default IP-3
>
> and then when.. oops.. I don't want IP-2, you could NOT
> say route delete default IP-2.. well you could but it did
> no good.. it removed the first one (IP-1). I had a fix for
> this but Itojun thought it was too radical since it changed
> an interface to one of the routing routines... so we just settled
> for the fact that if you did that you got to have the pleasure
> of using:
> route delete default
> 3 times.. and then starting again...
>

    I have been enhancing the code for some time now ...

    I can do both route delete and even route modification (I added
    route preferences in addition to ECMP).
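
    To make the two behaviours concrete, here is a minimal sketch under
    stated assumptions; it is not the implementation described above.
    It keeps the next hops for one destination in a small array, deletes
    by matching the gateway (so "route delete default IP-2" removes IP-2
    and nothing else), and picks the remaining next hop with the best
    preference while skipping one that a transport reports as failed.
    The names toy_nhop, toy_delete and toy_select are hypothetical, and
    "lower value wins" is an assumed convention.

```c
#include <stdint.h>

#define NHOPS 4

/* Hypothetical next-hop record for one destination (e.g. default). */
struct toy_nhop {
	uint32_t gw;    /* gateway address; 0 marks an empty slot */
	int      pref;  /* lower value = more preferred (assumption) */
};

/*
 * Delete only the entry whose gateway matches.
 * Returns 0 on success, -1 if no such gateway is installed.
 */
static int
toy_delete(struct toy_nhop *n, uint32_t gw)
{
	int i;

	if (gw == 0)
		return -1;
	for (i = 0; i < NHOPS; i++) {
		if (n[i].gw == gw) {
			n[i].gw = 0;
			return 0;
		}
	}
	return -1;
}

/*
 * Return the most preferred remaining gateway, skipping failed_gw.
 * Pass failed_gw = 0 to skip nothing. Returns 0 if none is left.
 */
static uint32_t
toy_select(const struct toy_nhop *n, uint32_t failed_gw)
{
	uint32_t best = 0;
	int best_pref = 0, i;

	for (i = 0; i < NHOPS; i++) {
		if (n[i].gw == 0 || n[i].gw == failed_gw)
			continue;
		if (best == 0 || n[i].pref < best_pref) {
			best = n[i].gw;
			best_pref = n[i].pref;
		}
	}
	return best;
}
```

    Calling toy_select() with the gateway that just stopped working is
    one way to answer the "do you have an alternate gateway?" question
    from item 2.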

    I have 7 fundamental test cases to perform on the implementation to
    ensure both correctness and compatibility.

>
> So is it worth my time resurrecting these patches for 8.0? Objections
> (being in a routing company I know there will be a lot of them..
> gee the routing system is supposed to do that.. etc etc).
>
> Comments would be welcome before I dust off the patches..
>

    I would like to get these changes made into 8.0.

    If there is enough interest out there, I'd be happy to share my
    implementation, and we can probably collaborate on this effort if that
    works for you.

    -- Qing


R
-- 
Randall Stewart
NSSTG - Cisco Systems Inc.
803-345-0369 803-317-4952 (cell)