Date:      Fri, 5 May 2000 23:40:17 -0700 (PDT)
From:      "Rodney W. Grimes" <freebsd@gndrsh.dnsmgr.net>
To:        jgreco@ns.sol.net (Joe Greco)
Cc:        stable@freebsd.org, freebsd-net@freebsd.org
Subject:   Re: Any known problems with routing in 3.4R?
Message-ID:  <200005060640.XAA16249@gndrsh.dnsmgr.net>
In-Reply-To: <200005110724.CAA60699@aurora.sol.net> from Joe Greco at "May 11, 2000 02:24:36 am"

> > > I've set up a FreeBSD 3.4R box to do BGP.  It takes full routes off an ATM
> >                ^^^^^^^^^^^^^^
> > 
> > Upgrade to 3.4-Stable, the reference count of your network interface
> > is wrapping past the 16 bit limit of a short and probably the cause
> > of your panic.
> 
> Bleah.  Fixed in 4.0R?  I don't like running non-releases on production
> equipment, too hard to rebuild in a crisis situation.

Yes, fixed in 4.0R, though some other rough edges exist there and I
could not deploy that on our AS until about 5 weeks ago.  The 3.4 output
I showed you was from about the last router we have running on 3.x; most
of the others are on 4.0-stable.  Here is the brother to the earlier
output.  (These aren't quite OC-3: they are peered upstream at 100BaseTX,
and peered to each other and internal routers over dual 100BaseTX.)

br2.CN85pm.abtltd.com:rgrimes {59}% uname -a
FreeBSD br2.CN85pm.abtltd.com 4.0-STABLE FreeBSD 4.0-STABLE #1: Sat Mar 18 07:49:48 PST 2000     root@br2.CN85pm.abtltd.com:/usr/src/sys/compile/BR2  i386
br2.CN85pm.abtltd.com:rgrimes {60}% netstat -rn | wc
   77759  312930 4205246
br2.CN85pm.abtltd.com:rgrimes {61}% vmstat -m | grep route
     routetbl160654 21962K  22898K 32768K  7151686    0     0  16,32,64,128,256

...
> 
> I'm sorry but I wasn't going to run with something like 48M of kvm when
> the default it was giving me was 40 and I was running out (since the limit
> is half of kvm).  I don't really need the case where some misconfigured
> peer decides to hand me ten thousand internal routes or something stupid
> like that, and the GRF handles up to 150,000 (in theory) - so a FreeBSD
> router should probably be able to cope with that too.

GRFs are after all just a BSD box running a hacked gated :-).  You'll need
to shove kvm up pretty high, but the best way to protect yourself from
misconfigured peers is to route filter carefully.  I have been wanting
to migrate to zebra due to its superior configurability in this area,
but the stability of its OSPF has just never been there, though it is
looking good right now.  (I run zebra on a few test boxes in the lab.)
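
To make "route filter carefully" concrete, here is a minimal sketch in
Python.  It is not gated or zebra configuration.  The function names, the
/24 length cap, the bogon list and the per-peer prefix limit are all
assumptions chosen for illustration, not values taken from this thread.

```python
import ipaddress

# Assumption: a small illustrative bogon list (private and loopback space).
BOGONS = [ipaddress.ip_network(n) for n in
          ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8")]


def accept_prefix(prefix, max_len=24):
    """Return True if a single announced IPv4 prefix passes the filter."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen > max_len:
        return False  # too specific, likely a leaked internal route
    return not any(net.subnet_of(bogon) for bogon in BOGONS)


def filter_peer(prefixes, max_prefixes):
    """Filter one peer's announcements and enforce a hypothetical prefix cap."""
    accepted = [p for p in prefixes if accept_prefix(p)]
    if len(accepted) > max_prefixes:
        raise ValueError("peer exceeded prefix limit")
    return accepted


print(filter_peer(["198.51.100.0/24", "10.1.2.0/24", "198.51.100.128/25"], 100))
```

Only the first prefix survives here: the second falls inside private
space and the third is longer than the assumed /24 cap.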

> So as you can see, I set 80M for kvm.  This also leaves sufficient space
> for other stuff.

Yea... well, be careful, you're wasting memory on kernel page table pages,
which are wired-in pages that can't be used for anything else.
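
A rough back-of-envelope for the sizing question, using only figures
quoted in this thread: the br2 vmstat line (routetbl MemUse 21962K against
a 32768K limit), the 77759-line netstat output, the 150,000-route target,
and the statement that the limit is half of kvm.  The sketch assumes the
netstat line count approximates the route count (it includes a few header
lines) and that memory scales linearly with routes.  Both are
simplifications, and the function name is made up for illustration.

```python
def estimate_routetbl_kib(mem_use_kib, routes_now, routes_target):
    """Scale the observed routetbl usage linearly to a target route count."""
    bytes_per_route = mem_use_kib * 1024 / routes_now
    return bytes_per_route * routes_target / 1024


# Figures from the br2 output in this thread.
need_kib = estimate_routetbl_kib(21962, 77759, 150000)

# Assumption from the thread: the malloc limit is half of kvm.
kvm_kib = 2 * need_kib

print("routetbl for 150,000 routes: about %.0f MB" % (need_kib / 1024))
print("kvm needed at limit = kvm/2: about %.0f MB" % (kvm_kib / 1024))
```

Under those assumptions the 32768K limit shown above would be too small
for 150,000 routes, and the kvm figure comes out in the same range as the
80M setting.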

> > br1.CN85pm.abtltd.com# uname -a
> > FreeBSD br1.CN85pm.abtltd.com 3.4-STABLE FreeBSD 3.4-STABLE #0: Mon Jan  3 02:40:43 PST 2000     root@br1.CN85pm.abtltd.com:/usr/src/sys/compile/BR1  i386
> > br1.CN85pm.abtltd.com# netstat -rn | wc
> >    77695  468055 5444199
> > 
> > > I'm getting periodic (every few days) crashes.  I recently started recording
> > > the console output, and the last crash was due to a panic
> > 
> > Yepp.. that sure sounds like the 3.4R if reference counter problem, usually
> > right after a big flash update from one of your BGP peers.
> 
> Well, bearing in mind that on OC3c you get a full table in a matter of
> seconds (it's fun because I can watch gated eat the CPU), what I was seeing
> was more like it'd freak five to ten minutes after a route table reload.  I 
> would kill and restart gated, with the chances apparently being 50/50ish as
> to whether or not I'd lose it as described.  

Killing gated is one of the things that can trip the 16-bit interface
("if") ref count bug.  As gated tries to remove the routes from the kernel,
the ref counter makes the fatal 1 to 0 transition when in fact it should be
going from 65537 to 65536.
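
The wrap can be simulated in a few lines of Python.  This is a sketch of
the failure mode only, not the FreeBSD kernel code: the class and function
names are hypothetical, and it assumes the counter behaves like an
unsigned 16-bit value, which is what the 65537 versus 1 figures suggest.

```python
MASK16 = 0xFFFF  # assumption: the counter behaves like an unsigned 16-bit short


class IfRef:
    """Hypothetical stand-in for an object guarded by a 16-bit ref count."""

    def __init__(self):
        self.refcnt = 0      # what the 16-bit field holds
        self.true_refs = 0   # how many routes really point at the object
        self.freed = False

    def add_route(self):
        self.true_refs += 1
        self.refcnt = (self.refcnt + 1) & MASK16  # silently wraps at 65536

    def remove_route(self):
        self.true_refs -= 1
        self.refcnt = (self.refcnt - 1) & MASK16
        if self.refcnt == 0:
            self.freed = True  # a kernel would release the object here


def simulate(routes_added, routes_removed):
    """Add routes, then remove them until done or the object is freed."""
    ref = IfRef()
    for _ in range(routes_added):
        ref.add_route()
    for _ in range(routes_removed):
        ref.remove_route()
        if ref.freed:
            break
    return ref


ref = simulate(65537, 1)
print(ref.refcnt, ref.true_refs, ref.freed)  # prints: 0 65536 True
```

After 65537 additions the 16-bit field reads 1, so a single removal drives
it to 0 and the object is treated as free while 65536 routes still point
at it.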
> 
> Now, the thing is, the system would generally be quiescent from a routing
> update POV.  While I do maintain a large number of internal OSPF routes, 
> they do not tend to flap, and inspection of the other side did not indicate 
> any change coming from BGP.  I do not propagate external routes into OSPF
> or anything fancy (I'd kill most of my boxes if I did) and just run a
> default-with-some-preferred-routes type thing.

OSPF churn can also trigger this, though we found that to be rare.  Most
often it was either a gated restart, an upstream doing a big flash update,
or loss of an iBGP peer causing an internal flap.  We also do not export
BGP into OSPF, but instead make generous use of gated generate statements
to create internal OSPF default routes (yes, routes, as in 1 default per
upstream peer).  The OSPF routes get propagated AS wide and all border
routers are fully iBGP meshed, so they all know the shortest way out.  We
let the internal stuff use OSPF to get to the closest border router,
which then deals with iBGP to decide where to exit the AS.  Works pretty
well, had to tweak a few things to get it balanced, but working like a
champ for almost a year now.

> 
> I also got one other crash that simply said "panic: rtfree" and it went,
> so I'm really not sure exactly what is going on.

That's IT!!  Yep, ref count went through 0 in the downward direction :-(.

> Overall, I'm pleased with the performance of FreeBSD in this application.
> The machine it is now on is a K6/233, and seems to perform respectably.  I
> do not know if I'd want to push full OC3 data rates over it, but there's
> always the K6-III/400 :-)

Don't bother with the K6-III, go to the K6-2-550.  The 100MHz bus is what
you really need, and the K6-III is not reliable due to heat problems and
being extremely picky about the regulation of the core voltage.

> Which reminds me, Rod, I'm finally phasing out those PCI/I-SP3G boards you
> sold me.  Great investment, but what with me being able to pick up T2P4's
> and AMD K6/233's for a song, it's time to move on. 

Ahhh... I still have a stack of T2P4's I am running on here.  There are
some app notes (or should I call them mods) out there that allow you to
run a K6-2-400 in them.  On the rev 3.10 boards it is as simple as setting
2 jumpers on the regulator section (ASUS built one hell of a regulator
on that 3.10 board, it can actually do 1.8 to 3.4 volts, and can deliver
more current than you would need for a K6-550).  To drive the K6-2 above 400
requires that you solder a wire on the back of the CPU socket for the
high order multiplier bit.

You don't get the 100MHz bus, but some folks have run them at 83MHz.

I could use a good supply of 233MHz K6 chips... :-)  Got lots of older
boards around here that I don't have chips for, well, other than the
barrel of P75's :-)  Maybe we can work out a deal ??

> But I can't complain
> about boards which ran very well for five years.  Thanks.

You're welcome!


-- 
Rod Grimes - KD7CAX @ CN85sl - (RWG25)               rgrimes@gndrsh.dnsmgr.net

