Date:      Wed, 8 Dec 1999 11:25:43 -0800 (PST)
From:      "Rodney W. Grimes" <freebsd@gndrsh.dnsmgr.net>
To:        jgreco@ns.sol.net (Joe Greco)
Cc:        jdp@polstra.com (John Polstra), current@FreeBSD.ORG, stable@FreeBSD.ORG
Subject:   Re: Route table leaks
Message-ID:  <199912081925.LAA98301@gndrsh.dnsmgr.net>
In-Reply-To: <199912081423.IAA59277@aurora.sol.net> from Joe Greco at "Dec 8, 1999 08:23:35 am"

> > Have any of you been seeing route table leaks in -current?  I noticed
> > this week that cvsup-master.freebsd.org is suffering from them.  I
> > actually had to reboot it because it couldn't allocate any more.  From
> > the "vmstat -m" output:
> > 
> > Memory statistics by type                          Type  Kern
> >         Type  InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
> > [...]
> >      routetbl150907 21221K  21221K 21221K   462184    0     0  16,32,64,128,256
> > [...]
> > I can think of some experiments to try in order to start to diagnose
> > it.  But first, have any of you seen this problem?
> 
> Hell, I've been seeing this for well over a year.  The last time I mentioned
> it, everybody seemed to think I was nuts.  :-)

:-)

> FreeBSD 3.0-19981015-BETA #1: Tue Jan 12 03:30:56 CST 1999
> 
>      routetbl289178 40961K  40961K 40960K   435741    0     0 16,32,64,128,256
> 

Mine has leaked very much, and this is on a BGP4 gated box:
     routetbl143395 19599K  21961K 32768K  2344966    0     0  16,32,64,128,256

Note the request count versus the total table size. Oh, and:
 {104}% netstat -ran | wc
   69398  418030 4862684
 {105}% uptime
11:20AM  up 7 days,  8:15, 1 user, load averages: 0.21, 0.06, 0.02
 {106}% uname -a
FreeBSD br1 3.3-STABLE FreeBSD 3.3-STABLE #0: Tue Nov 23 20:15:59 PST 1999

I haven't leaked away as much as you have, so it seems that actually
having the full routing table reduces it :-)
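
For anyone wanting to make the same comparison on their own box, here is a
rough Python sketch. It pulls the InUse count out of a "vmstat -m" routetbl
line (including the case where the type name and the count run together, as
in the output above) and divides it by the line count from
"netstat -ran | wc". The function names are made up for illustration. The
ratio is only a crude indicator: one route can account for more than one
allocation, and the netstat line count includes header lines, so the thing
to watch is whether the ratio keeps growing, not its absolute value.

```python
import re

# Matches a routetbl line from "vmstat -m". The type name and the InUse
# count may be fused ("routetbl143395") when the count fills its column.
ROUTETBL_RE = re.compile(
    r"^\s*routetbl\s*(\d+)\s+(\d+)K\s+(\d+)K\s+(\d+)K\s+(\d+)"
)


def parse_routetbl(line):
    """Return InUse, MemUse, HighUse, Limit (in K) and Requests from one line."""
    m = ROUTETBL_RE.match(line)
    if m is None:
        raise ValueError("not a routetbl line: %r" % line)
    names = ("inuse", "memuse_k", "highuse_k", "limit_k", "requests")
    return dict(zip(names, (int(g) for g in m.groups())))


def allocs_per_route(inuse, netstat_lines):
    """Crude ratio of live routetbl allocations to netstat -ran output lines."""
    return inuse / netstat_lines


if __name__ == "__main__":
    stats = parse_routetbl(
        "     routetbl143395 19599K  21961K 32768K  2344966"
        "    0     0  16,32,64,128,256"
    )
    print(stats)
    print(round(allocs_per_route(stats["inuse"], 69398), 2))
```

Run periodically (from cron, say) and logged, the ratio gives a simple
trend line to compare across machines.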

> When it gets like that, it starts losing the ability to add further ARP
> table entries and essentially starts going randomly deaf to local hosts
> (and to a lesser extent remote hosts).

That's what I have seen on 3 occasions now: you get a "can't allocate llinfo"
error from arpresolve/arplookup:
/var/log/messages.1.gz:Dec  1 18:01:18 br1 /kernel: arpresolve: can't allocate llinfo for 205.238.40.30rt

Note the bad printf output, that ``rt'' really is in my syslogs :-(
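
To check other logs for the same failure, something along these lines
should work. It is a Python sketch with a hypothetical function name that
picks the timestamp, host and address out of matching syslog lines and
ignores any trailing junk after the address, such as the stray "rt". The
message text in the pattern is taken from the single log line above and is
an assumption for any other release.

```python
import re

# Pattern built from the one log line quoted above (assumption: other
# occurrences have the same shape). Anything after the IPv4 address,
# such as the stray "rt", is ignored.
LLINFO_RE = re.compile(
    r"(?P<stamp>[A-Z][a-z]{2}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"/kernel:\s+arpresolve: can't allocate llinfo for "
    r"(?P<addr>\d+\.\d+\.\d+\.\d+)"
)


def find_llinfo_failures(lines):
    """Yield (timestamp, host, address) for each matching syslog line."""
    for line in lines:
        m = LLINFO_RE.search(line)
        if m:
            yield m.group("stamp"), m.group("host"), m.group("addr")


if __name__ == "__main__":
    sample = [
        "Dec  1 18:01:18 br1 /kernel: arpresolve: "
        "can't allocate llinfo for 205.238.40.30rt",
    ]
    for hit in find_llinfo_failures(sample):
        print(hit)
```

For rotated logs such as messages.1.gz, the standard library's
gzip.open(path, "rt") yields text lines that can be passed in directly.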

> 
> I've also seen it on a 3.3-RELEASE box, but it's not currently happening
> to any of them right now.
> 
> Machines in question are SMP boxes, and get hit fairly heavily in various
> Usenet news server roles.  Seems to happen quite a bit more often on boxes 
> that talk to a wide variety of host types, and I can't recall having seen
> it on boxes that only talk to other FreeBSD boxes.  But that could also be
> because the network environment is much more controlled internally.
> 

Running a few full-blown IBGP and EBGP sessions carrying 2 or more views of
the full 68K-route Internet routing table, it takes about 7 to 10 days
of route churn on a large-KVM-space kernel to cause the llinfo
problem...   or at least I think this is what I have been seeing since I
upgraded our 3.2 systems to 3.3-stable about 3 weeks ago. Before this
we were getting at least 30-day uptimes (about all I'd let it get before
some other change had us rebooting, not due to a problem on the boxes).


-- 
Rod Grimes - KD7CAX @ CN85sl - (RWG25)               rgrimes@gndrsh.dnsmgr.net





