From owner-freebsd-net  Sun Feb 25 16:44:28 2001
Delivered-To: freebsd-net@freebsd.org
Received: from VL-MS-MR002.sc1.videotron.ca (relais.videotron.ca [24.201.245.36])
	by hub.freebsd.org (Postfix) with ESMTP
	id 256A537B401; Sun, 25 Feb 2001 16:44:18 -0800 (PST)
	(envelope-from bmilekic@technokratis.com)
Received: from jehovah ([24.202.203.190]) by
          VL-MS-MR002.sc1.videotron.ca (Netscape Messaging Server 4.15)
          with SMTP id G9CA1K01.LG4; Sun, 25 Feb 2001 19:44:08 -0500 
Message-ID: <00d001c09f8d$8ee4d360$becbca18@jehovah>
From: "Bosko Milekic" <bmilekic@technokratis.com>
To: "Adrian Penisoara" <ady@warpnet.ro>,
	<freebsd-stable@FreeBSD.ORG>, <freebsd-net@FreeBSD.ORG>
References: <Pine.BSF.4.10.10102252012040.446-300000@ady.warpnet.ro>
Subject: Re: Kernel crush due to frag attack
Date: Sun, 25 Feb 2001 19:46:29 -0500
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.2919.6700
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Sender: owner-freebsd-net@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


Adrian Penisoara wrote:

> Hi,
>
>   As we are facing a heavy fragments attack (40-60byte packets in a
> ~ 1000 pkts/sec flow) I see some sporadic panics. Kernel/world is
> 4.2-STABLE as of 18 Jan 2001 -- it's a production machine and I
> hadn't yet the chance for another update; if it's been fixed in the
> mean time I would be glad to hear it...
>
>  I have attached a gdb trace and a snip of a tcpdump log. When I
> rebuilt the kernel with debug options it seemed to crush less often.
> I remember that at the time of this panic I had an ipfw rule to deny
> IP fragments.

    This is one of those "odd" faults I've seen in -STABLE sometimes.
Thanks to the good debugging information you've provided, the
following can be noted:

#16 0xc014de98 in m_copym (m=0xc07e7c00, off0=0, len=40, wait=1)
          at ../../kern/uipc_mbuf.c:621
621     n->m_pkthdr.len -= off0;
(kgdb) list
616   if (n == 0)
617     goto nospace;
618   if (copyhdr) {
619     M_COPY_PKTHDR(n, m);
620     if (len == M_COPYALL)
621       n->m_pkthdr.len -= off0;    <-- fault happens here (XXX)
622    else
623       n->m_pkthdr.len = len;
624       copyhdr = 0;
625    }
(kgdb) print n
$1 = (struct mbuf *) 0x661c20
(kgdb) print *n
cannot read proc at 0
(kgdb) print m
$2 = (struct mbuf *) 0xc07e7c00

Where the fault happens (XXX), the possible problem is that the mbuf
pointer n is bad, and as printed from the debugger, it does appear to
be bad. However, there are two things to note:

    1. the fault virtual address displayed in the trap message:

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x89c0c800
[...]

    is different from the value of n printed in your analysis
(0x89c0c800 looks bogus as well, although it does fall on a correct
boundary).

    2. Nothing bad happens in M_COPY_PKTHDR() two lines earlier, which
dereferences an equivalent pointer.
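
To make point 1 concrete, here is a rough userland sketch of the sanity
checks I am applying by eye to the three addresses above. The constants
are my assumptions for a 4.x i386 kernel (kernel addresses starting at
0xc0000000, mbufs aligned on a 256-byte MSIZE boundary), and the helper
names are made up for illustration; none of this is kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed values for FreeBSD 4.x on i386; verify against your own kernel. */
#define KERNBASE_ASSUMED 0xc0000000u  /* lowest kernel virtual address */
#define MSIZE_ASSUMED    256u         /* mbuf size, and so mbuf alignment */

/* Hypothetical helper: nonzero if addr lies in kernel address space. */
static int in_kernel_range(uint32_t addr)
{
    return addr >= KERNBASE_ASSUMED;
}

/* Hypothetical helper: nonzero if addr sits on an mbuf-sized boundary. */
static int mbuf_aligned(uint32_t addr)
{
    assert(MSIZE_ASSUMED != 0);
    return (addr % MSIZE_ASSUMED) == 0;
}

/* Print one line per address summarising both checks. */
static void report(const char *label, uint32_t addr)
{
    printf("%-10s 0x%08x  kernel range: %-3s  mbuf aligned: %s\n",
        label, (unsigned)addr,
        in_kernel_range(addr) ? "yes" : "no",
        mbuf_aligned(addr) ? "yes" : "no");
}
```

Under those assumptions, report("m", 0xc07e7c00) passes both checks,
report("n", 0x661c20) fails both, and report("fault va", 0x89c0c800) is
aligned but lies outside the kernel range, which is why it looks bogus
despite sitting on a correct boundary.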

Something seriously evil is happening here and, unfortunately, I have
no idea what.
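
For anyone following along without the source handy, the reasoning in
point 2 can be sketched like this. It is a heavily simplified stand-in
for the header-copy step listed above, not the real uipc_mbuf.c code:
the struct layouts, the _sketch names, and the M_COPYALL value are
assumptions for illustration only.

```c
#include <stddef.h>

#define M_COPYALL_SKETCH 1000000000  /* assumed "copy everything" marker */

/* Simplified stand-ins for the kernel structures (illustration only). */
struct pkthdr_sketch {
    int len;                         /* total packet length */
};

struct mbuf_sketch {
    int m_flags;
    struct pkthdr_sketch m_pkthdr;
};

/* Stand-in for M_COPY_PKTHDR(): this already writes through `to`. */
static void copy_pkthdr_sketch(struct mbuf_sketch *to,
    const struct mbuf_sketch *from)
{
    to->m_pkthdr = from->m_pkthdr;   /* first dereference of the new mbuf */
    to->m_flags = from->m_flags;
}

/* Stand-in for lines 619-623 of the listing. */
static void copy_header_step(struct mbuf_sketch *n,
    const struct mbuf_sketch *m, int off0, int len)
{
    copy_pkthdr_sketch(n, m);        /* line 619: touches n first */
    if (len == M_COPYALL_SKETCH)
        n->m_pkthdr.len -= off0;     /* line 621: where the trap is reported */
    else
        n->m_pkthdr.len = len;       /* line 623 */
}
```

The point is only the ordering: the header copy writes through n before
the line where the trap is reported, so a pointer that was already bad
on entry should have faulted there first.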

Does this only happen on this one machine, or is it reproducible on
several different machines? I used to stress test -STABLE for mbuf
starvation and never stumbled upon one of these `spontaneous pointer
deaths' myself. I have seen other weird problems reported by other
people, but only in RELENG_3.

If you cannot reproduce it on any other machines, I would start
looking at possibly bad hardware... unless someone else sees something
I'm not.

>  If you need further data just ask, I'd be glad to help,
>  Ady (@warpnet.ro)

Regards,
Bosko.

