From owner-freebsd-net@FreeBSD.ORG Fri Jun 11 12:14:32 2010
From: Ivan Voras
To: freebsd-net@freebsd.org
Date: Fri, 11 Jun 2010 14:14:23 +0200
Subject: Re: FreeBSD.org IPv6 issue - AAAA records disabled
In-Reply-To: <4BE961EA.2060806@cs.duke.edu>
References: <4BD885C6.10600@FreeBSD.org> <20100429204544.GC1286@arthur.nitro.dk> <1272998683.2406.38.camel@localhost.localdomain> <20100504190328.GC31196@valentine.liquidneon.com> <4BE80F07.8090309@cs.duke.edu> <4BE82011.6050009@cs.duke.edu> <20100511092000.GA12735@walton.maths.tcd.ie> <4BE961EA.2060806@cs.duke.edu>
List-Id: Networking and TCP/IP with FreeBSD

On 05/11/10 15:55, Andrew Gallatin wrote:
> David Malone wrote:
>> On Mon, May 10, 2010 at 11:02:41AM -0400, Andrew Gallatin wrote:
>>> I think something may be holding onto an mbuf after free,
>>> then re-freeing it. But only after somebody else allocated
>>> it. I was hoping that the mbuf double free referenced
>>> above was the smoking gun, but it turns out that there isn't
>>> even a bge interface in my pr (just bce and mxge).
>>
>> Weren't there some bugs fixed recently that allowed the arp/ndp code
>> to free packets that weren't previously being freed? They'd be good
>> candidates for something that holds onto an mbuf for a while and
>> then frees it.
>
> Unfortunately, I think at least the PR I'm looking into pre-dates
> those fixes -- these problems started in r202120 (early Jan).
> I need to ask what he upgraded from.
>
> When did IPv6 become unstable for others?

For what it's worth, it looks like using IPv6 really is causing my
crashes, but in a really widespread way, possibly through deep memory
corruption. I've disabled it on the machine I have here that would
crash or hang once or twice a week, and so far there have been no
problems, with an uptime of nearly 10 days.

When IPv6 is enabled here, it is used for "everything", from
interactive ssh sessions and http to NFS.

I have 9 core files here with text crashdumps, if anyone's interested.
Here's a sample:

# grep -i panic core*
core.txt.0:panic: sbsndptr: sockbuf 0xffffff007cca8c20 and mbuf 0xffffff00490a6400 clashing
core.txt.0:panic: sbsndptr: sockbuf 0xffffff007cca8c20 and mbuf 0xffffff00490a6400 clashing
core.txt.0:#2 0xffffffff80585e4c in panic (
core.txt.0:panic: sbsndptr: sockbuf 0xffffff007cca8c20 and mbuf 0xffffff00490a6400 clashing
core.txt.1:panic: page fault
core.txt.1:panic: page fault
core.txt.1:#2 0xffffffff8058afdc in panic (fmt=0xffffffff809364ac "%s")
core.txt.1:panic: page fault
core.txt.2:panic: general protection fault
core.txt.2:panic: general protection fault
core.txt.2:#2 0xffffffff8058afdc in panic (fmt=0xffffffff809364ac "%s")
core.txt.2:panic: page fault
core.txt.2:savecore: reboot after panic: page fault
core.txt.2:Feb 17 20:43:07 geri savecore: reboot after panic: page fault
core.txt.2:panic: general protection fault
core.txt.3:panic: sbdrop
core.txt.3:panic: sbdrop
core.txt.3:#2 0xffffffff8058afdc in panic (fmt=0xffffffff80963dff "sbdrop")
core.txt.3:panic: sbdrop
core.txt.4:panic: general protection fault
core.txt.4:panic: general protection fault
core.txt.4:#2 0xffffffff8058d26c in panic (fmt=0xffffffff80938c44 "%s")
core.txt.4:panic: general protection fault
core.txt.5:panic: general protection fault
core.txt.5:panic: general protection fault
core.txt.5:#2 0xffffffff8058d5ac in panic (fmt=0xffffffff80944404 "%s")
core.txt.5:panic: general protection fault
core.txt.6:panic: sbflush_internal: cc 0 || mb 0xffffff003a7a9600 || mbcnt 4608
core.txt.6:panic: sbflush_internal: cc 0 || mb 0xffffff003a7a9600 || mbcnt 4608
core.txt.6:Dumping 1573 MB:panic: bufwrite: buffer is not busy???
core.txt.6:#2 0xffffffff8058d5ac in panic (
core.txt.6:panic: sbflush_internal: cc 0 || mb 0xffffff003a7a9600 || mbcnt 4608
core.txt.6:Dumping 1573 MB:panic: bufwrite: buffer is not busy???
core.txt.7:panic: sbsndptr: sockbuf 0xffffff0001f7eec8 and mbuf 0xffffff0001909c00 clashing
core.txt.7:#2 0xffffffff8058d5ac in panic (
core.txt.7:panic: sbsndptr: sockbuf 0xffffff0001f7eec8 and mbuf 0xffffff0001909c00 clashing
core.txt.7:Dumping 1715 MB:panic: bufwrite: buffer is not busy???
core.txt.8:panic: sbdrop
core.txt.8:panic: sbdrop
core.txt.8:#2 0xffffffff805a2e9c in panic (fmt=0xffffffff8098f15f "sbdrop")
core.txt.8:panic: sbdrop

All of the dumps have several messages like "em0: discard frame w/o
packet header" immediately before crashing.