Date:      Thu, 2 May 2013 12:09:41 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        "Robert N. M. Watson" <rwatson@freebsd.org>
Cc:        Ian FREISLICH <ianf@clue.co.za>, Glen Barber <gjb@freebsd.org>, freebsd-current@freebsd.org, Peter Wemm <peter@freebsd.org>
Subject:   Re: panic: in_pcblookup_local (?)
Message-ID:  <201305021209.41221.jhb@freebsd.org>
In-Reply-To: <52B3AEE5-D24A-4ED3-BB11-E7E27BFB447F@freebsd.org>
References:  <E1UW0K5-000P7H-36@clue.co.za> <20130502104219.GA1586@glenbarber.us> <52B3AEE5-D24A-4ED3-BB11-E7E27BFB447F@freebsd.org>

On Thursday, May 02, 2013 7:25:08 am Robert N. M. Watson wrote:
> 
> On 2 May 2013, at 11:42, Glen Barber wrote:
> 
> > Hmm.  Perhaps it would be worthwhile for me to rebuild the current
> > kernel with DDB support.  It looks like the machine has panicked a few
> > times over the last two weeks or so, but based on the timestamps of the
> > crash dumps and nagios complaints, happened during the middle of the
> > night when I would not have really noticed, or otherwise would have just
> > blamed my ISP.
> > 
> > Two of the panics are ath(4) related.  One looks similar to the one
> > referenced in this thread, similarly triggered by a CFEngine process.
> > 
> > In that case, the backtrace looks like:
> > 
> > #4 0xffffffff808cdbb3 at calltrap+0x8
> > #5 0xffffffff807371d8 at in_pcb_lport+0x128
> > #6 0xffffffff8073745a at in_pcbbind_setup+0x16a
> > #7 0xffffffff80737d8e at in_pcbconnect_setup+0x71e
> > #8 0xffffffff80737df9 at in_pcbconnect_mbuf+0x59
> > #9 0xffffffff807bf29f at udp_connect+0x11f
> > #10 0xffffffff80680615 at kern_connectat+0x275
> > 
> > Regarding DDB though, it would be rather difficult to access the machine
> > if it drops to a DDB debugger session, since the machine acts as my
> > firewall.
> 
> Thanks -- will take a look at the attached.
> 
> FWIW, though, I'm worried by the number of panics you are seeing, especially
> given that they involve multiple subsystems, and in particular, John's
> observation about a potentially corrupted pointer. This makes me wonder
> whether (a) you are experiencing hardware faults -- it would be worth running
> some memory/cpu/etc tests and (b) if we might be seeing a software memory
> corruption bug of some sort.

Other users have reported this (Ian Lepore), and Peter Wemm can now reproduce
these panics at will as well, so I think this is a software bug.  If we can't
figure this out from the crashdump, the easiest approach may be to bisect to
find the offending revision.

-- 
John Baldwin


