Date:      Fri, 28 Dec 2007 23:09:23 +0100
From:      Per olof Ljungmark <peo@intersonic.se>
To:        Stefan Lambrev <stefan.lambrev@moneybookers.com>
Cc:        Maxime Henrion <mux@FreeBSD.org>, freebsd-net@FreeBSD.org, freebsd-current <freebsd-current@freebsd.org>
Subject:   Re: [Fwd: Re: rtfree: 0xc5caad98 has 2 refs]
Message-ID:  <47757413.6010807@intersonic.se>
In-Reply-To: <4775229A.3040707@moneybookers.com>
References:  <4774E2FB.2090107@intersonic.se> <4774E68E.7030200@moneybookers.com> <47750046.80705@intersonic.se> <4775229A.3040707@moneybookers.com>

Stefan Lambrev wrote:
> Hi,
> 
> Can you replace all calls to rtfree() with RTFREE_LOCKED() in those files:
> 
> netinet/if_ether.c
> netinet6/nd6_nbr.c
> netinet6/in6_ifattach.c
> netinet6/in6_gif.c
> 
> Of course do not forget net/route.c with the patch from the PR.
> Recompile the kernel and check if this will cure your hangs?
> 
> I'm not sure about the lock order reversal; maybe it was introduced 
> with kdb_backtrace().
> You can remove it from route.c, replace rtfree() and build kernel with 
> debug, to see if the LOR is gone.
> 
> It seems that the panic is caused by rtalloc1() called in route.c line 
> 333 :
> rt = rtalloc1(dst, 0, 0UL);     /* NB: rt is locked */
> 
> most probably because rt is not locked :)
> I'm out of ideas how to check if it is really locked, but you can 
> experiment with RT_LOCK() and RT_UNLOCK().
> May be mtx_trylock() can help too.
> 
> Please share your findings with -net & -current if you did not before.
> 
> =cut=

Unfortunately I ran out of time before I could complete the test. 
However, I can report one more interesting finding from today: the ICMP 
packets that trigger the bug probably come either from a Cisco router 
or from the setup itself.

Late today our network topology was changed.

Previous setup:


affected hosts   ISP's router (default gw)
                  .1
LAN ------------ router-------- wlan 1 (via ISP)
                     |         192.168.3.0
our firewall   .254 |
                    fw ----------wlan 2
                     |         172.16.2.0 (isakmpd)
                     |
                  Internet


Current setup:

affected hosts   our fw (OpenBSD)
                  .1            192.168.3.0
LAN ------------ router------ wlan 1 (isakmpd)
                     |
                     |         172.16.2.0 (isakmpd)
                     | --------wlan 2
                     |
                     |
                  Internet

and this "fixed" the problem!

We have no access to the Cisco, so I don't know its configuration. But: 
No lockups, no "rtfree" messages.

If the bug is still unresolved in mid-January, I can continue testing 
then. Thanks to all for your suggestions and help!

--per


