Date:      Sun, 4 Sep 2005 20:14:34 +0100 (BST)
From:      Gavin Atkinson <gavin.atkinson@ury.york.ac.uk>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        Yar Tikhiy <yar@comp.chem.msu.su>, freebsd-current@FreeBSD.org
Subject:   Re: 6.0BETA3 panic in ip_output (vlan/RIP related?)
Message-ID:  <20050904190334.N26703@ury.york.ac.uk>
In-Reply-To: <20050903201628.D88940@fledge.watson.org>
References:  <1125487485.34476.6.camel@buffy.york.ac.uk> <20050831134927.GA20529@comp.chem.msu.su> <1125594635.63101.3.camel@buffy.york.ac.uk> <20050903201628.D88940@fledge.watson.org>

On Sat, 3 Sep 2005, Robert Watson wrote:
> On Thu, 1 Sep 2005, Gavin Atkinson wrote:
>>> Thanks for reporting this!  The problem seems known and has to do with a 
>>> deficiency in our multicast code WRT interface removal/re-insertion. It is 
>>> on the to-do list of our networking gurus and hopefully will be dealt with 
>>> RSN, after IP multicast code locking and cleanup are complete.
>> 
>> It's good to hear that the problem is understood, however it seems that 
>> this panic is trivial to recreate for anyone running routed, and therefore 
>> 6.0-RELEASE may well be unusable for me.  I've just got this second panic 
>> from the same machine which looks like it may also be related to the 
>> multicast code.
>
> I believe I've chatted with Gleb about this some, but want to confirm that I 
> understand the problem here: this occurs when an interface is removed while 
> IP multicast membership is still present for multicast groups on the 
> interface.  When the multicast socket is closed, then the kernel panics 
> because it has a now invalid cached pointer to the interface structure (now 
> freed), which causes an assertion failure because the mutex code detects that 
> it is operating on an invalid mutex.

I don't explicitly use multicast anywhere in my setup.  I run routed and 
get my default gateway via RIP, which seems to involve the multicast code. 
I've followed the code through and as far as I can tell, your belief is 
correct (at least for the second panic).  The first panic occurred when 
creating an interface, but I suspect that was because routed noticed 
that the interfaces had changed, attempted to send packets out, and fell 
over when it stumbled over the invalid mutex in the destroyed vlan.
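
To make the failure mode concrete, here is a minimal user-space sketch in C 
of a membership that caches an interface pointer, plus one possible fix: 
clearing the cached pointer when the interface is detached.  Every name in 
it (struct iface, struct membership, iface_detach and so on) is made up for 
illustration and is not the kernel's actual data structure or API; the 
lock_valid flag merely stands in for "the mutex is still initialised".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MEMBERSHIPS 8

/* Hypothetical stand-in for an interface; NOT the kernel's struct ifnet. */
struct iface {
    const char *name;
    bool        lock_valid;   /* models "the interface mutex is initialised" */
};

/* Hypothetical multicast membership caching a pointer to its interface. */
struct membership {
    unsigned      group;      /* multicast group address, host byte order */
    struct iface *ifp;        /* NULL once the interface has gone away */
};

struct membership_table {
    struct membership m[MAX_MEMBERSHIPS];
    size_t            count;
};

/* Record a membership for `group` on `ifp`; returns 0, or -1 if full. */
int membership_join(struct membership_table *t, unsigned group,
                    struct iface *ifp)
{
    if (t->count == MAX_MEMBERSHIPS)
        return -1;
    t->m[t->count].group = group;
    t->m[t->count].ifp = ifp;
    t->count++;
    return 0;
}

/*
 * Called when an interface is removed, before its storage is released.
 * Clears every cached pointer to it so nothing dangles afterwards.
 * Returns the number of memberships that were invalidated.
 */
size_t iface_detach(struct membership_table *t, struct iface *ifp)
{
    size_t n = 0;

    for (size_t i = 0; i < t->count; i++) {
        if (t->m[i].ifp == ifp) {
            t->m[i].ifp = NULL;
            n++;
        }
    }
    ifp->lock_valid = false;  /* models destruction of the mutex */
    return n;
}

/*
 * Called when the socket is closed.  Returns 0 if the interface was
 * still present, or 1 if the membership had already been orphaned and
 * there was nothing left to touch.
 */
int membership_leave(struct membership *m)
{
    if (m->ifp == NULL)
        return 1;
    /* Without the NULL check above, this is where the panic would hit. */
    assert(m->ifp->lock_valid);
    m->ifp = NULL;
    return 0;
}
```

In this model, joining a group on a vlan, detaching the vlan and then 
leaving the group ends harmlessly in the "already orphaned" branch.  If 
iface_detach skipped the loop, membership_leave would trip the assertion 
instead, which is roughly the shape of the second panic.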

I have core dumps from both panics if they will help at all, although I 
suspect it'll be just as easy for you to recreate it at your end - run 
routed without arguments and do something with interfaces - deleting and 
recreating vlan interfaces seems the easiest way, although I guess 
inserting or removing a cardbus card would also trigger it.

> So it sounds like we need to figure out how the multicast code should behave 
> on interface removal -- I wonder what other operating systems do here?  Do 
> they simply invalidate current membership related with the interface, or do 
> they leave the multicast sockets in a state such that if the interface comes 
> back, the memberships are re-bound?

I can't help here, I'm afraid.

Gavin


