Date: Tue, 21 Apr 2009 23:25:40 +0100 (BST) From: Robert Watson <rwatson@FreeBSD.org> To: Mikolaj Golub <to.my.trociny@gmail.com> Cc: freebsd-stable@FreeBSD.org Subject: Re: RELENG_7 crash Message-ID: <alpine.BSF.2.00.0904212324280.67705@fledge.watson.org> In-Reply-To: <86k55demux.fsf@kopusha.onet> References: <200904210524.n3L5O9YS086865@lava.sentex.ca> <200904211111.57295.jhb@freebsd.org> <200904211519.n3LFJFsk090691@lava.sentex.ca> <20090421153112.GA47589@edoofus.dev.vega.ru> <200904211610.n3LGAYll090970@lava.sentex.ca> <alpine.BSF.2.00.0904211851040.67705@fledge.watson.org> <86k55demux.fsf@kopusha.onet>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 22 Apr 2009, Mikolaj Golub wrote: > RW> There are several bugs here, one difficult to fix (lack of > RW> refcounting), but also stuff like ifp being derived from an interface > RW> number twice, but checked against NULL only the first time (line 85 > RW> checked for NULL, re-queried but no check line 88). Fixing the top > RW> bit of the function to only query the ifp once and check it for NULL > RW> then would be a good idea. More fundamentally, we do need to refcount > RW> ifnets when used from the management path, which is not all that hard > RW> a change, but preferably to try the easy way first given where we are > RW> in the release cycle. > > I was thinking a bit on this problem (the same issue was reported in > kern/132734) and eliminating double call of ifnet_byindex() was the first > thing I did. But I decided then that the proper fix would be to wrap all > critical code in sysctl_ifdata in IFNET_RLOCK/IFNET_RUNLOCK (the patch is > attached). It looks like I am wrong and my idea about how the kernel works > is oversimplified? :-) Unfortunately, I didn't manage to reproduce the panic > in my environment so I was not able to do some experiments and tests. The problem is that you can't hold IFNET_RLOCK() over the sysctl copyouts. I'm preparing a patch for 8-CURRENT that will add a refcount to struct ifnet to handle this case, but that isn't an MFC candidate for 7.2. If fixing the return value checks for ifnet_byindex() avoids the insta-panic Mike's running into, it's a better 7.2 change at this point. We can target the ifnet refcount changes for 7.3. Robert N M Watson Computer Laboratory University of Cambridge
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?alpine.BSF.2.00.0904212324280.67705>