Date:      Wed, 1 May 2013 11:56:03 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        Glen Barber <gjb@freebsd.org>
Cc:        Ian FREISLICH <ianf@clue.co.za>, freebsd-current@freebsd.org, Robert Watson <rwatson@freebsd.org>
Subject:   Re: panic: in_pcblookup_local (?)
Message-ID:  <201305011156.03974.jhb@freebsd.org>
In-Reply-To: <20130430211908.GB1621@glenbarber.us>
References:  <E1UW0K5-000P7H-36@clue.co.za> <201304301653.13845.jhb@freebsd.org> <20130430211908.GB1621@glenbarber.us>

On Tuesday, April 30, 2013 5:19:08 pm Glen Barber wrote:
> On Tue, Apr 30, 2013 at 04:53:13PM -0400, John Baldwin wrote:
> > Try 'p phd' to start.  INP_PCBPORTHASH is a macro, so you will
> > have to do it by hand:
> > 
> > 'p pcbinfo->ipi_porthashbase[lport & pcbinfo->ipi_porthashmask]'
> > 
> > (That should be what 'porthash' is.)
> > 
> 
> Thanks for the pointers.  (Hah!)
> 
> Hopefully this is the info you are looking for:
> 
> Script started on Tue Apr 30 17:16:07 2013
> root@orion:/usr/obj/usr/src/sys/ORION # kgdb ./kernel.debug /var/crash/vmcore.4
> [...]
> #0  doadump (textdump=<value optimized out>) at pcpu.h:231
> 231		__asm("movq %%gs:%1,%0" : "=r" (td)
> (kgdb) frame 6
> #6  0xffffffff80736cec in in_pcblookup_local (pcbinfo=0xffffffff80dc9180, laddr=
>       {s_addr = 50374848}, lport=339, lookupflags=1, cred=0xfffffe016cdad100)
>     at /usr/src/sys/netinet/in_pcb.c:1438
> 1438			LIST_FOREACH(phd, porthash, phd_hash) {
> (kgdb) p phd
> $1 = (struct inpcbport *) 0x9e17b100fffffe00

That is odd; it looks word-swapped, as if it should be
0xfffffe009e17b100 (which would be a more normal kernel pointer on amd64).
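
The word-swap observation can be checked mechanically. The following is a minimal sketch, not code from the kernel: the helper names swap_words() and looks_like_kernel_ptr() are made up for illustration, and the "kernel pointer" test is only a heuristic that assumes amd64 kernel addresses live in the canonical upper half (bits 63..47 all set).

```c
#include <stdint.h>

/* Exchange the upper and lower 32-bit halves of a 64-bit value. */
static uint64_t
swap_words(uint64_t v)
{
	return (v << 32) | (v >> 32);
}

/*
 * Heuristic (assumption): treat a value as a plausible amd64 kernel
 * pointer when bits 63..47 are all ones, i.e. it is a canonical
 * upper-half address.
 */
static int
looks_like_kernel_ptr(uint64_t v)
{
	return (v >> 47) == 0x1ffff;
}
```

Applied to the value printed for phd, swap_words(0x9e17b100fffffe00) yields 0xfffffe009e17b100; the original value fails the heuristic and the swapped one passes it.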

> (kgdb) p pcbinfo->ipi_porthashbase[lport & pcbinfo->ipi_porthashmask]
> $2 = {lh_first = 0x0}

So the list is now empty. :(
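
For reference, the bucket index used in that expression is plain arithmetic on lport=339 and ipi_porthashmask=127 from the dump. The sketch below uses made-up helper names. The first helper mirrors the expression exactly as typed in this thread. The second is purely an assumption: if lport is held in network byte order and the real INP_PCBPORTHASH macro converts it before masking, a different bucket would be selected, so the macro definition in in_pcb.h should be checked before trusting either result.

```c
#include <stdint.h>

/* Bucket index exactly as typed in the thread: lport & mask. */
static unsigned
bucket_as_typed(uint16_t lport, unsigned mask)
{
	return lport & mask;
}

/*
 * Hypothetical variant (assumption, not confirmed by the thread):
 * swap the two bytes of lport first, as a byte-order conversion
 * would on a little-endian machine, then mask.
 */
static unsigned
bucket_byteswapped(uint16_t lport, unsigned mask)
{
	uint16_t swapped = (uint16_t)((lport << 8) | (lport >> 8));

	return swapped & mask;
}
```

With the values from the dump, the first helper selects bucket 83 and the byte-swapped variant selects bucket 1, so it may be worth printing both buckets in kgdb.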

This feels like the list was updated out from under the pcbinfo.  Looking at
your earlier e-mail:

(kgdb) p *pcbinfo
$1 = {ipi_lock = {lock_object = {lo_name = 0xffffffff809d4d82 "udp", lo_flags = 69926912,
      lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, ipi_listhead = 0xffffffff80dc9108,
  ipi_count = 28, ipi_gencnt = 535501, ipi_lastport = 21249, ipi_lastlow = 0,
  ipi_lasthi = 0, ipi_zone = 0xfffffe0017b60380, ipi_pcbgroups = 0x0, ipi_npcbgroups = 0,
  ipi_hashfields = 0, ipi_hash_lock = {lock_object = {
      lo_name = 0xffffffff80a03d80 "pcbinfohash", lo_flags = 69402624, lo_data = 0,
      lo_witness = 0x0}, rw_lock = 18446741877615517696}, ipi_hashbase = 0xfffffe00120f6000,
  ipi_hashmask = 127, ipi_porthashbase = 0xfffffe00120f5c04, ipi_porthashmask = 127,
  ipi_wildbase = 0x0, ipi_wildmask = 0, ipi_vnet = 0x0, ipi_pspare = {0x0, 0x0}}

It looks like the ipi_hash_lock is locked (and udp_connect() locks it), so I
think the offending code is somewhere else.  Also, I can't find anything that
removes an inp without holding the correct pcbinfo lock.  The only thing I can
think of is if the pcbinfo pointer for an inp could change, so we could maybe
lock the wrong one while removing it?
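
The two rw_lock words in the dump (1 for ipi_lock, 18446741877615517696 for ipi_hash_lock) can be decoded by hand. The following is a rough sketch under stated assumptions: the SK_-prefixed constants and function names are invented here, and the bit layout (low bit set means unlocked or read-held, reader count above the low four flag bits, otherwise the word holds the owning thread pointer plus flags) is an assumption modeled on sys/rwlock.h that should be verified against the tree in question. It ignores special values such as a destroyed lock.

```c
#include <stdint.h>

/* Assumed layout, modeled on sys/rwlock.h; verify before relying on it. */
#define SK_RW_LOCK_READ     0x01u  /* set: unlocked or held by readers */
#define SK_RW_FLAGMASK      0x0fu  /* low four bits carry flags */
#define SK_RW_READERS_SHIFT 4      /* reader count sits above the flags */

enum sk_rw_state { SK_UNLOCKED, SK_READ_LOCKED, SK_WRITE_LOCKED };

/* Classify a raw rw_lock word as printed by kgdb. */
static enum sk_rw_state
sk_rw_state(uint64_t v)
{
	if (v & SK_RW_LOCK_READ)
		return (v >> SK_RW_READERS_SHIFT) == 0 ?
		    SK_UNLOCKED : SK_READ_LOCKED;
	return SK_WRITE_LOCKED;
}

/* For a write-locked word, strip the flag bits to get the owner word. */
static uint64_t
sk_rw_owner(uint64_t v)
{
	return v & ~(uint64_t)SK_RW_FLAGMASK;
}
```

Under those assumptions, ipi_lock (rw_lock = 1) decodes as unlocked, and ipi_hash_lock decodes as write-locked with an owner word of 0xfffffe00ae986000, which is consistent with the hash lock being held at the time of the panic.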

Hmmmmmm, you know.  In in_pcbremlists() and in_pcbdrop(), we read inp_phd
without holding the hash lock.  I think that probably doesn't actually break
anything, but this feels like a locking issue of some sort.

-- 
John Baldwin
