Date:      Thu, 25 May 2000 23:49:09 -0400
From:      "David A. Panariti" <davep@who.net>
To:        freebsd-net@freebsd.org
Subject:   kernel panic in in_delayed_cksum()
Message-ID:  <200005260349.XAA01157@h0000f806dfda.ne.mediaone.net>

I sent this to freebsd-stable but have gotten no solutions, so I'll
post it here, too.

I believe I have found a bug in netinet.  After a cvsup and make
world/kernel + mergemaster, I started getting the following panics:

delayed m_pullup, m->len: 40  off: 23040  p: 6

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc01b10a8
stack pointer           = 0x10:0xcd069ae4
frame pointer           = 0x10:0xcd069b10
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 5740 (itnd)
interrupt mask          =
trap number             = 12
panic: page fault
syncing disks...
done
Uptime: 10h22m23s

I cannot remember exactly when I cvsup'd.  It was either May 17 or 
May 20. (Any easy way to tell?)
UPDATE: Thanks to Trond Endrest, I know it was cvsup'd on May 20,
16:05 EDT.

Unfortunately, I can only reproduce the panics using an old version of
the AltaVista tunnel.  The tunnel worked perfectly with up to
4-RELEASE. 
I only have the binary of the tunnel code and it was compiled for
FreeBSD2.2.  The fact that it ran perfectly up to 4R is a testament to
backward compatibility!

Anyway, after some investigation, it looks like the m_pullup() is
failing inside in_delayed_cksum().  The mbuf is then NULL and we
panic when we set the csum.
It looks like m_pullup() is failing since offset is very big.
Some prints I added yield this:

(IP_VHL_HL(ip->ip_vhl) << 2): 0, csum_data: 23040
off too big, skipping csum

(I added code to return without setting the csum when I see a bogus
offset.  With that I no longer panic here, and the ftp which was
failing now works better, but I can now panic elsewhere.)
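
For readers who want to see the shape of that guard, here is a minimal
sketch.  It is not the patch I actually applied and not the kernel
code; cksum_offset_ok() is a hypothetical helper, and it assumes the
checksum is a 16-bit value written at (hlen + csum_data) bytes into the
packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): returns 1 when a 16-bit
 * checksum written at offset (hlen + csum_data) still fits inside a
 * packet of pkt_len bytes, 0 when the offset is bogus.
 */
int
cksum_offset_ok(size_t pkt_len, size_t hlen, size_t csum_data)
{
	size_t off;

	assert(hlen <= pkt_len);	/* header must lie inside the packet */

	off = hlen + csum_data;		/* where the checksum would be stored */
	if (off < hlen)			/* unsigned wrap-around */
		return (0);
	if (pkt_len < sizeof(uint16_t))
		return (0);
	return (off <= pkt_len - sizeof(uint16_t));
}
```

With the numbers from the first panic (40 bytes of data, csum_data of
23040) the check fails, so a caller could skip the checksum instead of
handing the offset to m_pullup().  A sane TCP case (20-byte header,
checksum at offset 16 in the TCP header) passes.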

Further investigation shows csum_data being mangled here in
ip_output():

	ip = mtod(m, struct ip *);
	
	/*
	 * Fill in IP header.
	 */
	if ((flags & (IP_FORWARDING|IP_RAWOUTPUT)) == 0) {
		ip->ip_vhl = IP_MAKE_VHL(IPVERSION, hlen >> 2);
		ip->ip_off &= IP_DF;
>>>>>>>>>>>	ip->ip_id = htons(ip_id++);
		ipstat.ips_localout++;
	} else {
		hlen = IP_VHL_HL(ip->ip_vhl) << 2;
		dp_ck_csum_data(m, "a-7.1"); /* davep */
	}

More prints show:
off too big @ a-7.3, off: 0x14, csum_data: 0x5a00
ip: 0xc0a91920, m: 0xc0a91900, &csum_data: 0xc0a91924

Where:
ip is ip header inside mbuf
m is mbuf pointer
&csum_data = &m->m_pkthdr.csum_data

csum_data is inside the IP header!  And, coincidentally (NOT), ip_id
sits 4 bytes into struct ip, thus overlaying csum_data.
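
The arithmetic can be checked with a small sketch.  struct ip_prefix
is a hypothetical stand-in for the first fields of struct ip, and the
byte-level model assumes a little-endian machine (i386), ip_id stored
via htons(), and ip_off still in host order at that point; those are
my assumptions, not something the prints prove.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the leading fields of struct ip. */
struct ip_prefix {
	uint8_t  ip_vhl;	/* version and header length */
	uint8_t  ip_tos;	/* type of service */
	uint16_t ip_len;	/* total length */
	uint16_t ip_id;		/* identification */
	uint16_t ip_off;	/* fragment offset field */
};

/* Address of ip_id, given the address of the IP header. */
uint32_t
ip_id_addr(uint32_t ip_addr)
{
	return (ip_addr + (uint32_t)offsetof(struct ip_prefix, ip_id));
}

/*
 * Value a little-endian CPU reads back as a 32-bit int from the four
 * bytes holding ip_id (network order) followed by ip_off (host order).
 */
uint32_t
overlaid_csum_data(uint16_t id_host, uint16_t off_host)
{
	uint8_t b[4];

	b[0] = (uint8_t)(id_host >> 8);		/* htons(): high byte first */
	b[1] = (uint8_t)(id_host & 0xff);
	b[2] = (uint8_t)(off_host & 0xff);	/* little-endian host order */
	b[3] = (uint8_t)(off_host >> 8);

	assert(sizeof(b) == sizeof(uint32_t));
	return ((uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	    ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24));
}
```

ip_id_addr(0xc0a91920) gives 0xc0a91924, the printed &csum_data.
Under the assumptions above, an ip_id of 90 with ip_off of 0 reads
back as 0x5a00, which is the 23040 seen in the panic message.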

So it looks like the m_data is pointing at M_databuf, which should
imply (as the comment states) /* !M_PKTHDR, !M_EXT */
And yet the code is using fields from
struct	pkthdr MH_pkthdr;	/* M_PKTHDR set */
This is where I leave it for those more familiar with the code to
pursue.  Hopefully someone who knows the code can use this info to
find and fix the bug quickly.  It's taken me ~6 hrs just to find out
this much.

thanks,

davep

--
David Panariti                      /    I can't complain,
davep@who.net                     <//>     but sometimes I still do.
(see also http://www.four11.com)   /     -- Joe Walsh


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-net" in the body of the message



