From owner-freebsd-stable@FreeBSD.ORG  Sat Oct 23 08:21:42 2010
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B62B4106564A
	for <stable@freebsd.org>; Sat, 23 Oct 2010 08:21:42 +0000 (UTC)
	(envelope-from mike@sentex.net)
Received: from smarthost2.sentex.ca (smarthost2-6.sentex.ca
	[IPv6:2607:f3e0:80:80::2])
	by mx1.freebsd.org (Postfix) with ESMTP id EF1878FC0A
	for <stable@freebsd.org>; Sat, 23 Oct 2010 08:21:41 +0000 (UTC)
Received: from lava.sentex.ca (pyroxene.sentex.ca [199.212.134.18])
	by smarthost2.sentex.ca (8.14.4/8.14.4) with ESMTP id o9N8LXJg052168
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
	for <stable@freebsd.org>; Sat, 23 Oct 2010 04:21:33 -0400 (EDT)
	(envelope-from mike@sentex.net)
Received: from mdt-xp.sentex.net (simeon.sentex.ca [192.168.43.27])
	by lava.sentex.ca (8.14.4/8.14.4) with ESMTP id o9N8LVuR001382;
	Sat, 23 Oct 2010 04:21:31 -0400 (EDT) (envelope-from mike@sentex.net)
Message-Id: <201010230821.o9N8LVuR001382@lava.sentex.ca>
X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9
Date: Sat, 23 Oct 2010 04:21:31 -0400
To: Jack Vogel <jfvogel@gmail.com>
From: Mike Tancsa <mike@sentex.net>
In-Reply-To: <AANLkTimWTTHWC04my3CSoNGYsLarS9F10eoO=8Fz37cF@mail.gmail.c
 om>
References: <m2zku7cqt5.wl%randy@psg.com> <m2y69rcqjc.wl%randy@psg.com>
	<201010221416.o9MEGSa0094817@lava.sentex.ca>
	<m2tykeb9ac.wl%randy@psg.com>
	<201010221425.o9MEPcWC094867@lava.sentex.ca>
	<m2k4lab6nh.wl%randy@psg.com>
	<201010221848.o9MIm7WF096197@lava.sentex.ca>
	<m2y69q9e38.wl%randy@psg.com> <4CC1F3B8.3010302@bogus.com>
	<4CC225D3.1030502@ops-netman.net>
	<7.1.0.9.0.20101022210145.06fe25e8@sentex.net>
	<201010230159.o9N1xGGF098363@lava.sentex.ca>
	<AANLkTimWTTHWC04my3CSoNGYsLarS9F10eoO=8Fz37cF@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
X-Scanned-By: MIMEDefang 2.67 on 205.211.164.50
Cc: Chris Morrow <morrowc@ops-netman.net>, Joel Jaeggli <joelja@bogus.com>,
	stable <stable@freebsd.org>, warren@kumari.net, Randy Bush <randy@psg.com>
Subject: Re: repeating crashes with 8.1
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 23 Oct 2010 08:21:42 -0000

At 12:41 AM 10/23/2010, Jack Vogel wrote:
>Odd, can you make any connection between this and the em complaints??

I don't think so.  This is on an igb NIC and a different 
panic/behaviour. I have the box sitting at the debugger prompt in the 
FreeBSD netperf cluster, so hopefully someone can take a look and see 
what the issue is.

         ---Mike


>Jack
>
>
>On Fri, Oct 22, 2010 at 6:59 PM, Mike Tancsa 
><mike@sentex.net> wrote:
>At 09:11 PM 10/22/2010, Mike Tancsa wrote:
>At 08:01 PM 10/22/2010, Chris Morrow wrote:
>Note, Warren and I attempted to test this this evening on a 10.04 Ubuntu
>box, no crashy-crashy...
>
>
>
>I was able to trigger the issue on box (c).  I was ping6ing box (a) 
>when I did a hard down of (d)'s connected interface. The box then 
>dropped to the debugger:
>
>
>Fatal trap 9: general protection fault while in kernel mode
>cpuid = 0; apic id = 00
>instruction pointer     = 0x20:0xffffffff80740a50
>stack pointer           = 0x28:0xffffff800005a890
>frame pointer           = 0x28:0xffffff800005a930
>
>code segment            = base 0x0, limit 0xfffff, type 0x1b
>                        = DPL 0, pres 1, long 1, def32 0, gran 1
>processor eflags        = interrupt enabled, resume, IOPL = 0
>current process         = 12 (swi4: clock)
>[thread pid 12 tid 100007 ]
>Stopped at      in6_cksum+0x410:        movzwl  (%rsi),%r10d
>db> bt
>Tracing pid 12 tid 100007 td 0xffffff00025083e0
>in6_cksum() at in6_cksum+0x410
>icmp6_reflect() at icmp6_reflect+0x312
>icmp6_error() at icmp6_error+0x1ec
>nd6_llinfo_timer() at nd6_llinfo_timer+0x208
>softclock() at softclock+0x2a6
>intr_event_execute_handlers() at intr_event_execute_handlers+0x66
>ithread_loop() at ithread_loop+0xb2
>fork_exit() at fork_exit+0x12a
>fork_trampoline() at fork_trampoline+0xe
>--- trap 0, rip = 0, rsp = 0xffffff800005ad30, rbp = 0 ---
>db>
>
>
>
>
>I was able to do it, but not on the box I expected.
>
>4 boxes
>
>(a) Attacking host 2001:db8:1:1/64
>(b) victim, not on a connected interface with (a). Outside interface 
>- em0 - 2001:db8::2:1/64, inside interface - em1 - 2001:db8::3:1/64
>(c) a host behind (b) 2001:db8::3:c/64
>(d) a host behind (b), 2001:db8::3:d/64
>
>
>Hosts (c) and (d) use (b) as their default gateway.  (c), however, has a 
>next hop for (a) via (d).  So rather than go out its normal default 
>gateway, it takes an extra hop via (d).
>
>Start a ping6 from (a) to (c).  Then down (d)'s interface so that 
>the ping6 fails.  Let the ping keep running for an hour or 
>two.  Eventually (b) gets error messages like
>
>Oct 22 18:38:32 zoo kernel: em1: discard frame w/o packet header
>
>and crashes.
>
>Unfortunately, I thought it would be (c) that crapped out, not (b), 
>and I didn't have crash dumps enabled on the host.  I am just in the 
>process of setting up a better environment.
>
>        ---Mike
>
>-chris
>
>On 10/22/10 16:27, Joel Jaeggli wrote:
> > Ok I'll try testing that on some box I can reach with both hands.
> >
> > fyi nagasaki is:
> >
> > [root@nagasaki ~]# uname -a
> > FreeBSD nagasaki.bogus.com 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #13:
> > Sun May 30 22:19:23 UTC 2010
> > root@nagasaki.bogus.com:/usr/obj/usr/src/sys/GENERIC  i386
> > [root@nagasaki ~]#
> >
> >
> > On 10/22/10 1:17 PM, Randy Bush wrote:
> >>>>>>> Do you know how this panic is triggered ? Are you able to
> >>>>>>> create it on demand ?
> >>>>>>
> >>>>>> no i do not.  bring server up and it'll happen in half an hour.
> >>>>>> and the server was happy for two months.  so i am thinking hardware.
> >>>>>
> >>>>> Perhaps. The reason I ask is that I had a box go down last night with
> >>>>> the same set of errors.  The box has a number of ipv6 routes, but its
> >>>>> next hop was down and the problems started soon after. So I wonder if
> >>>>> it has something to do with that.  Do you have ipv6 on this box and
> >>>>> are all the next hop addresses correct / reachable ?
> >>>>>
> >>>>> Oct 22 02:06:02 i4 kernel: em1: discard frame w/o packet header
> >>>>> Oct 22 02:06:10 i4 kernel: em2: discard frame w/o packet header
> >>>>> Oct 22 02:06:21 i4 kernel: em1: discard frame w/o packet header
> >>>>
> >>>> it was co-incident with a border router being taken down for new router
> >>>> install.  that router was the v6 exit the server was using.  i have now
> >>>> pointed default6 to a different exit.  the server seems happy.
> >>>
> >>>
> >>> Are your servers still up ?  I guess the question now is how to
> >>> trigger this problem on demand.  Perhaps lots of inbound ipv6 traffic
> >>> with a bad next hop out ?  How recent are your sources ?  The kernel
> >>> said Oct 21st. Were the sources from then too ?
> >>
> >> yes, kernel and world from 21 oct
> >>
> >> chris had an idea on retrigger, install a static for a small dest that
> >> points to a hole.  send a packet to the small dest.
> >>
> >> randy
> >>
>
>
>--------------------------------------------------------------------
>Mike Tancsa,                                      tel +1 519 651 3400
>Sentex Communications,                            mike@sentex.net
>Providing Internet since 1994                    www.sentex.net
>Cambridge, Ontario Canada                         www.sentex.net/mike
>
>
>
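
For anyone trying to picture the four-box setup quoted above, here is a rough sketch of how (c) would pick its next hop toward (a). It only models the routing decision, not the crash. The table layout and the next_hop() helper are made up for illustration, and since the quoted address for (a) is not a complete IPv6 address, 2001:db8::1:1 is assumed.

```python
import ipaddress

# Hypothetical model of host (c)'s IPv6 routing table from the quoted setup.
# Assumption: (a) is taken to be 2001:db8::1:1; this is not FreeBSD code.
A = ipaddress.IPv6Address("2001:db8::1:1")         # (a) attacking host (assumed)
B_INSIDE = ipaddress.IPv6Address("2001:db8::3:1")  # (b) em1, default gateway
D = ipaddress.IPv6Address("2001:db8::3:d")         # (d) the extra hop

ROUTES_C = [
    (ipaddress.IPv6Network("::/0"), B_INSIDE),   # normal default route via (b)
    (ipaddress.IPv6Network(f"{A}/128"), D),      # host route for (a) via (d)
]


def next_hop(routes, dst):
    """Return the gateway of the longest-prefix route that contains dst."""
    matches = [(net, gw) for net, gw in routes if dst in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(next_hop(ROUTES_C, A))                                       # via (d)
print(next_hop(ROUTES_C, ipaddress.IPv6Address("2001:db8::2:1")))  # via (b)
```

The more specific route wins, so (c)'s replies to (a) keep heading for (d) even after (d)'s interface is downed, which is the condition the ping6 was left running under.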

--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            mike@sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike