From: Qing Li
To: freebsd-arch@freebsd.org, Randall Stewart, qingli@freebsd.org
Date: Thu, 10 Jan 2008 10:29:03 PST
Subject: Re: Routing in the network :-)
Message-Id: <30834.1199989743@speakeasy.net>
Reply-To: qingli@speakeasy.net

Interesting that you are bringing this up ... I actually sent a similar
email to freebsd-net@ about 2 years ago and had one response back (it
was a polite no).

I backported and integrated the radix_mpath changes from KAME into
FreeBSD 5.4, and the changes are working well right now in a production
environment. Changes were also necessary in quite a few places throughout
the netinet/ files, e.g., address initialization functions such as
in_ifinit().

I actually discussed what I have done with itojun back in August of 2007.

>
> On Thu Jan 10  6:55 , Randall Stewart sent:
>
> Hi all:
>
> A number of years ago, Itojun and I had played off and on
> with some modifications to both the routing table and to a
> "new" interfaces that could be used by transports to gain
> routing information.
>
> I am contemplating digging back in my archives and building
> a p4 branch that would have the changes for folks to look at..
> But before I go to all that trouble I want to have a discussion
> about this here ;-)
>
> This will be a longish email so if you get bored easily or just
> don't care about routing/networks and all that fun, you have
> been warned :-)
>
> The basic concept:
>
> So say I am at home and have purchased two DSL's. One from
> AT&T (don't you love the new ma-bell) and the other from
> SpeakEasy (Note I had this until I moved out to the country
> now I am lucky to have one DSL.. but many can do this if they
> want)... So my home looked like:
>
>
>    IP-A       IP-S
>     |          |
>     |          |
>     |          |
>  ,__|__________|___
>  |                 |
>  |                 |
>  |  lakerest.net   |
>  |                 |
>  |_________________|
>
> Now life is good, I have some degree of
> fault tolerance right?
>
> So AT&T (IP-A) gives me the default route to IP-A1
> and Speak Easy gives me the default route to IP-S1.
> Life is not so good... how do I plumb these in the
> routing table?
>
> I can say
>
>    route add default IP-A1
> or
>    route add default IP-S1
>
> But I cannot have both. And worse if I had a connection
> up to FreeBSD.net and AT&T's network went down.. and I
> happened to have put the first command in.. my network
> connection would stop...
>
> What would be nice if I had a way to add BOTH routes
> into the kernel.. and when Layer 4 realized there was some
> major problems going on it could "use" the alternate
> route (i.e. via IP-S1) and life would once again be
> good..
>
> Ok, yes, the observant person out there will say..
> wait
> IP-S1 will NEVER allow your packets through since they
> probably do ingress filtering.. yes I am aware of this.. but
> this would *NOT* hold true for some device in the network
> talking to some other device in the network.. *OR* for
> speakeasy.. at least not circa 2004.. since speakeasy
> did *NOT* do ingress filtering and my way back former
> employer (AT&T) *DID* do ingress filtering..
>
> So the idea is rather simple:
>
> 1) Allow multiple routes on any level of the kernel
>    patricia trees.
>

This is done.

>
> 2) Add an additional interface to the routing code
>    so that a transport protocol could query the
>    routing table for additional support... i.e.
>    excuse me, the route that I had no longer seems
>    to be working, do you have an alternate gateway?
>

There was an inp_route field in the in_pcb{} structure, but that field
was later removed by Andre in 5.5. I never quite understood why, but I
did find that field to be rather useful.

    union {
            /* placeholder for routing entry */
            struct route inc4_route;
    #if 1 /* def NEW_STRUCT_ROUTE */
            struct route inc6_route;
    #else
            struct route_in6 inc6_route;
    #endif
    } inc_dependroute;

I used this field for caching, and it gets flushed when there is a
routing table change. It works out well.

>
> Now I admit for TCP these API's would have limited use..
>

That depends ... :-)

>
> but for SCTP these are golden.. since both sides know
> about all addresses and thus you get a form of true
> network diversity out of this little software change.
>
>
> Now yes, this does not help you if both your DSL's
> go out to the same pole outside your house, and a
> truck hits the pole... but it *DOES* help you if
> your network provider dies somewhere back in the CO
> running across your carpet to AT&T's DSL and it thinks
> chewing on it would be fun :-)
>
> So what was required way back in 4.x when Itojun and
> I did this work..
> (note that Itojun called his changes
> RADIX_MPATH which did NOT include my alternate
> routing lookup code).
>
> a) For radix.c there were just a few simple changes that
>    removed the restriction that prevents duplicate routes
>    at any level of the tree.
>
> b) For route.c a new method is added.. this is a bit
>    of code not huge but some.
>

The rtrequest1() function needed a bit of work, but nothing huge.

>
> c) One thing I added but took back out, was some changes to
>    the "route delete" api... can't remember exactly where.. but
>    basically the delete does not look at the destination ... i.e.
>    with the changes Itojun and I had cooked up if you said:
>       route add default IP-1
>       route add default IP-2
>       route add default IP-3
>
>    and then when.. opps.. I don't want IP-2, you could NOT
>    say route delete default IP-2.. well you could but it did
>    no good.. it removed the first one (IP-1). I had a fix for
>    this but Itojun thought it was too radical since it changed
>    an interface to one of the routing routines... so we just settled
>    for the fact that if you did that you got to have the pleasure
>    of using:
>       route delete default
>    3 times.. and then starting again...
>

I have been enhancing the code for some time now ... I can do both route
delete and even route modification (I added route preferences in
addition to ECMP).

I have 7 fundamental test cases to perform on the implementation to
ensure both correctness and compatibility.

>
> So is it worth my time resurrecting these patches for 8.0? Objections
> (being in a routing company I know there will be a lot of them..
> gee the routing system is supposed to do that.. etc etc).
>
> Comments would be welcome before I dust off the patches..
>

I would like to get these changes made into 8.0.

If there is enough interest out there, I'd be happy to share my
implementation, and we can probably collaborate on this effort if that
works for you.

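For readers following the thread, the behaviour discussed above (several
routes to the same destination, a transport asking for an alternate
gateway, and a delete that honours the gateway argument) can be sketched
in user space. This is only an illustration under stated assumptions: it
is neither the KAME RADIX_MPATH code nor the implementation described in
this message, it uses a flat array instead of the kernel radix tree, and
every name in it (mroute_add, mroute_lookup, mroute_delete, the pref
field) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MROUTE_MAX  8	/* table size, arbitrary for this sketch */
#define MROUTE_NAME 32	/* max length of a destination/gateway label */

/* One entry per (destination, gateway) pair; hypothetical layout. */
struct mroute {
	char dst[MROUTE_NAME];	/* destination key, e.g. "default" */
	char gw[MROUTE_NAME];	/* gateway label, e.g. "IP-A1" */
	int  pref;		/* lower value is preferred */
	int  in_use;
};

struct mroute_table {
	struct mroute r[MROUTE_MAX];
};

void
mroute_init(struct mroute_table *t)
{
	assert(t != NULL);
	memset(t, 0, sizeof(*t));
}

/* Return the entry holding exactly (dst, gw), or NULL if there is none. */
struct mroute *
mroute_find(struct mroute_table *t, const char *dst, const char *gw)
{
	for (size_t i = 0; i < MROUTE_MAX; i++) {
		struct mroute *m = &t->r[i];
		if (m->in_use && strcmp(m->dst, dst) == 0 &&
		    strcmp(m->gw, gw) == 0)
			return m;
	}
	return NULL;
}

/* Add a route. The same dst may appear with several different gateways. */
int
mroute_add(struct mroute_table *t, const char *dst, const char *gw, int pref)
{
	if (strlen(dst) >= MROUTE_NAME || strlen(gw) >= MROUTE_NAME)
		return -1;		/* label too long */
	if (mroute_find(t, dst, gw) != NULL)
		return -1;		/* exact duplicate */
	for (size_t i = 0; i < MROUTE_MAX; i++) {
		struct mroute *m = &t->r[i];
		if (!m->in_use) {
			strcpy(m->dst, dst);	/* lengths checked above */
			strcpy(m->gw, gw);
			m->pref = pref;
			m->in_use = 1;
			return 0;
		}
	}
	return -1;			/* table full */
}

/* Delete only the entry matching both dst and gw, not the first dst match. */
int
mroute_delete(struct mroute_table *t, const char *dst, const char *gw)
{
	struct mroute *m = mroute_find(t, dst, gw);

	if (m == NULL)
		return -1;
	memset(m, 0, sizeof(*m));
	return 0;
}

/*
 * Best-preference route for dst. If avoid_gw is non-NULL, entries using
 * that gateway are skipped; this stands in for the transport asking
 * "do you have an alternate gateway?".
 */
const struct mroute *
mroute_lookup(const struct mroute_table *t, const char *dst,
    const char *avoid_gw)
{
	const struct mroute *best = NULL;

	for (size_t i = 0; i < MROUTE_MAX; i++) {
		const struct mroute *m = &t->r[i];
		if (!m->in_use || strcmp(m->dst, dst) != 0)
			continue;
		if (avoid_gw != NULL && strcmp(m->gw, avoid_gw) == 0)
			continue;
		if (best == NULL || m->pref < best->pref)
			best = m;
	}
	return best;
}
```

With "default" routes via IP-1, IP-2 and IP-3 added at preferences 10, 20
and 30, mroute_lookup(&t, "default", "IP-1") skips the failed gateway and
returns the IP-2 entry, and mroute_delete(&t, "default", "IP-2") removes
exactly that entry rather than the first match.
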
-- Qing

R

--
Randall Stewart
NSSTG - Cisco Systems Inc.
803-345-0369 803-317-4952 (cell)
_______________________________________________
freebsd-arch@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"