From owner-freebsd-current@FreeBSD.ORG Thu May 2 17:53:58 2013
To: John Baldwin
From: Ian FREISLICH
Subject: Re: panic: in_pcblookup_local (?)
In-Reply-To: <201305021209.41221.jhb@freebsd.org>
References: <201305021209.41221.jhb@freebsd.org> <20130502104219.GA1586@glenbarber.us> <52B3AEE5-D24A-4ED3-BB11-E7E27BFB447F@freebsd.org>
Date: Thu, 02 May 2013 19:53:47 +0200
Cc: Glen Barber, freebsd-current@freebsd.org, "Robert N. M. Watson", Peter Wemm
List-Id: Discussions about the use of FreeBSD-current

John Baldwin wrote:
> On Thursday, May 02, 2013 7:25:08 am Robert N. M.
Watson wrote:
> >
> > On 2 May 2013, at 11:42, Glen Barber wrote:
> >
> > > Hmm. Perhaps it would be worthwhile for me to rebuild the current
> > > kernel with DDB support. It looks like the machine has panicked a few
> > > times over the last two weeks or so, but based on the timestamps of the
> > > crash dumps and nagios complaints, happened during the middle of the
> > > night when I would not have really noticed, or otherwise would have just
> > > blamed my ISP.
> > >
> > > Two of the panics are ath(4) related. One looks similar to the one
> > > referenced in this thread, similarly triggered by a CFEngine process.
> > >
> > > In that case, the backtrace looks like:
> > >
> > > #4 0xffffffff808cdbb3 at calltrap+0x8
> > > #5 0xffffffff807371d8 at in_pcb_lport+0x128
> > > #6 0xffffffff8073745a at in_pcbbind_setup+0x16a
> > > #7 0xffffffff80737d8e at in_pcbconnect_setup+0x71e
> > > #8 0xffffffff80737df9 at in_pcbconnect_mbuf+0x59
> > > #9 0xffffffff807bf29f at udp_connect+0x11f
> > > #10 0xffffffff80680615 at kern_connectat+0x275
> > >
> > > Regarding DDB though, it would be rather difficult to access the machine
> > > if it drops to a DDB debugger session, since the machine acts as my
> > > firewall.
> >
> > Thanks -- will take a look at the attached.
> >
> > FWIW, though, I'm worried by the number of panics you are seeing, especially
> > given that they involve multiple subsystems, and in particular, John's
> > observation about a potentially corrupted pointer. This makes me wonder
> > whether (a) you are experiencing hardware faults -- it would be worth running
> > some memory/cpu/etc tests and (b) if we might be seeing a software memory
> > corruption bug of some sort.
>
> Other users have reported this (Ian Lepore), and Peter Wemm can now reproduce
> these at will as well, so I think this is a software bug. What might be
> easiest if we can't figure this out from the crashdump is just to bisect the
> offending revision.

I've started a binary search.
I'll let you know what that turns up.

Ian

--
Ian Freislich