From: John Baldwin
To: freebsd-stable@freebsd.org
Cc: c0re, net@freebsd.org
Date: Tue, 20 Apr 2010 07:48:09 -0400
Subject: Re: FreeBSD 7.3, reboot after panic: double fault
Message-Id: <201004200748.09566.jhb@freebsd.org>

On Tuesday 20 April 2010 2:53:16 am c0re wrote:
> Hello All!
> I've upgraded freebsd from 7.0 to 7.3 and all was good until I tryed to
> configure gre interface and use ipfw fwd.
> I'm actually does not know what was the point of failure in my
> configuration.
>
> [ some details snipped ]
>
> It worked about one week and then I made some configuration changes:
> added gre interface and 2 aliases:
>
> # cat /etc/rc.conf |grep
> ifconfig_xl0="inet 192.168.0.10 netmask 255.255.255.0"
> ifconfig_xl0_alias0="192.168.0.11 netmask 255.255.255.255"
> ifconfig_xl0_alias1="192.168.0.12 netmask 255.255.255.255"
> cloned_interfaces="gre0"
> ifconfig_gre0="inet 192.168.250.6 192.168.250.5 tunnel 192.168.0.12
> 192.168.200.15 netmask 255.255.255.252 link1 up"
>
> and
>
> # cat /etc/rc.local
> #!/bin/sh
> ipfw add fwd 192.168.250.5 icmp from 192.168.0.11 to any out via xl0
> ipfw add fwd 192.168.250.5 tcp from 192.168.0.11 443 to any out via xl0
> ipfw add allow ip from any to any
>
> # ifconfig gre0
> gre0: flags=b050 metric 0 mtu 1476
>         tunnel inet 192.168.0.12 --> 192.168.200.15
>         inet 192.168.250.6 --> 192.168.250.5 netmask 0xfffffffc
>
> I shutted down gre interface to prevent requests via gre to buggy IP.
>
> The main idea of such configurations was: fwd all connections to https to
> 192.168.0.1 via gre interface.
> And also I made apache configurations to make it listen on 192.168.0.11 too.
>
> And make some tests: ping 192.168.0.11 - was fine, goes via gre. Telnet to
> 192.168.0.11 443 was fine too. Then I tryed to make browser https
> connection to 192.168.0.11. Apache showed me certificate warning and I
> accepted, then in browser nothing happened, it was trying to open page. But
> server got kernel panic at that moment.
>
> At first time I thought that it was some power failure, I tryed 2 more times
> and got same behaviour.
>
> So https works without kernel panic via 192.168.0.10 address but kernel
> panics when I try do https via 192.168.0.11 address that source-forwarded
> via gre.

Looks like the TCP output path got stuck in an infinite recursion loop until it
exhausted the kernel stack:

> # cd /usr/obj/usr/src/sys/MYKERNEL
> # kgdb kernel.debug /var/crash/vmcore.2
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
>
> Fatal double fault:
> eip = 0xc08e3ba3
> esp = 0xccf6dfc4
> ebp = 0xccf6e274
> cpuid = 0; apic id = 00
> panic: double fault
> cpuid = 0
> Uptime: 7m14s
> Physical memory: 235 MB
> Dumping 35 MB: 20 4
>
> Reading symbols from /boot/kernel/acpi.ko...Reading symbols from
> /boot/kernel/acpi.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/acpi.ko
> Reading symbols from /boot/kernel/if_gre.ko...Reading symbols from
> /boot/kernel/if_gre.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/if_gre.ko
> Reading symbols from /boot/kernel/linux.ko...Reading symbols from
> /boot/kernel/linux.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/linux.ko
> #0 doadump () at pcpu.h:196
> 196 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
> (kgdb) bt
> #0 doadump () at pcpu.h:196
> #1 0xc07f2857 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
> #2 0xc07f2b29 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:574
> #3 0xc0a7ea2b in dblfault_handler () at /usr/src/sys/i386/i386/trap.c:983
> #4 0xc08e3ba3 in ipfw_chk (args=0xccf6e28c) at
> /usr/src/sys/netinet/ip_fw2.c:2465
> #5 0xc08e6ce1 in ipfw_check_out (arg=0x0, m0=0xccf6e390, ifp=0xc25c5c00,
> dir=2, inp=0xc28ba708) at /usr/src/sys/netinet/ip_fw_pfil.c:248
> #6 0xc08a1968 in pfil_run_hooks (ph=0xc0c55240, mp=0xccf6e420,
> ifp=0xc25c5c00, dir=2, inp=0xc28ba708) at /usr/src/sys/net/pfil.c:78
> #7 0xc08eb6f2 in ip_output (m=0xc2710b00, opt=0x0, ro=0xccf6e3f4, flags=0,
> imo=0x0, inp=0xc28ba708) at /usr/src/sys/netinet/ip_output.c:443
> #8 0xc08f4016 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1134
> #9 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #10 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #11 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #12 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #13 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #14 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #15 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #16 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #17 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #18 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #19 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #20 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #21 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #22 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #23 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #24 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #25 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #26 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #27 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #28 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #29 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #30 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #31 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #32 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #33 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #34 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #35 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #36 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #37 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #38 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #39 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #40 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #41 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #42 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #43 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #44 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #45 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #46 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #47 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #48 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #49 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> ---Type to continue, or q to quit---
> #50 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #51 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #52 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #53 0xc08f6d98 in tcp_mtudisc (inp=0xc28ba708, errno=0) at tcp_offload.h:269
> #54 0xc08f4105 in tcp_output (tp=0xc25b2570) at
> /usr/src/sys/netinet/tcp_output.c:1195
> #55 0xc08fdcf8 in tcp_usr_send (so=0xc2ac1820, flags=0, m=0xc270ed00,
> nam=0x0, control=0x0, td=0xc28e2d80) at tcp_offload.h:269
> #56 0xc0850405 in sosend_generic (so=0xc2ac1820, addr=0x0, uio=0xc28766c0,
> top=0xc270ed00, control=0x0, flags=0, td=0xc28e2d80) at
> /usr/src/sys/kern/uipc_socket.c:1243
> #57 0xc084bf7f in sosend (so=0xc2ac1820, addr=0x0, uio=0xc28766c0, top=0x0,
> control=0x0, flags=0, td=0xc28e2d80) at /usr/src/sys/kern/uipc_socket.c:1285
> #58 0xc0833c5b in soo_write (fp=0xc28e84c0, uio=0xc28766c0,
> active_cred=0xc28e5900, flags=0, td=0xc28e2d80) at
> /usr/src/sys/kern/sys_socket.c:103
> #59 0xc082d2e7 in dofilewrite (td=0xc28e2d80, fd=24, fp=0xc28e84c0,
> auio=0xc28766c0, offset=-1, flags=0) at file.h:257
> #60 0xc082d5c8 in kern_writev (td=0xc28e2d80, fd=24, auio=0xc28766c0) at
> /usr/src/sys/kern/sys_generic.c:402
> #61 0xc082d816 in writev (td=0xc28e2d80, uap=0xccf6fcfc) at
> /usr/src/sys/kern/sys_generic.c:388
> #62 0xc0a7f2d5 in syscall (frame=0xccf6fd38) at
> /usr/src/sys/i386/i386/trap.c:1101
> #63 0xc0a636a0 in Xint0x80_syscall () at
> /usr/src/sys/i386/i386/exception.s:262
> #64 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
> (kgdb) quit

tcp_output() calls tcp_mtudisc() if ip_output() returns EMSGSIZE:

		case EMSGSIZE:
			/*
			 * For some reason the interface we used initially
			 * to send segments changed to another or lowered
			 * its MTU.
			 *
			 * tcp_mtudisc() will find out the new MTU and as
			 * its last action, initiate retransmission, so it
			 * is important to not do so here.
			 *
			 * If TSO was active we either got an interface
			 * without TSO capabilits or TSO was turned off.
			 * Disable it for this connection as too and
			 * immediatly retry with MSS sized segments generated
			 * by this function.
			 */
			if (tso)
				tp->t_flags &= ~TF_TSO;
			tcp_mtudisc(tp->t_inpcb, 0);
			return (0);

But tcp_mtudisc() in turn calls tcp_output() again, through tcp_output_send():

	tcpstat.tcps_mturesent++;
	tp->t_rtttime = 0;
	tp->snd_nxt = tp->snd_una;
	tcp_free_sackholes(tp);
	tp->snd_recover = tp->snd_max;
	if (tp->t_flags & TF_SACK_PERMIT)
		EXIT_FASTRECOVERY(tp);
	tcp_output_send(tp);
	return (inp);

That is the tcp_output() / tcp_mtudisc() pair repeating in frames #8 through
#54 of the backtrace.

I'm not sure why it's not able to figure out the MTU; perhaps folks on net@ can
help.  However, it would seem that for the tcp_output() case, tcp_mtudisc()
should probably not call tcp_output_send().  Instead, tcp_output() should just
loop back up to the top after calling tcp_mtudisc() and retry.

-- 
John Baldwin
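
As a rough illustration of the loop-and-retry shape suggested above, here is a
minimal user-space sketch.  It is not kernel code and not a patch: struct conn,
fake_ip_output(), fake_mtudisc() and fake_tcp_output() are all hypothetical
stand-ins, and the "halve the MSS" step is only a placeholder for whatever the
real MTU lookup does.  The check that gives up when the MSS stops changing is
an assumption added for the sketch; without some such guard a plain loop would
spin forever in exactly the case where the recursive version overflows the
stack.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a TCP connection; not the kernel's struct tcpcb. */
struct conn {
	int mss;	/* segment size currently in use */
	int path_mtu;	/* largest size the simulated path accepts */
	int floor_mss;	/* smallest MSS the sketch will fall back to */
	int attempts;	/* number of simulated send attempts */
};

/* Simulated ip_output(): rejects segments that do not fit the path. */
int
fake_ip_output(struct conn *c)
{
	c->attempts++;
	return (c->mss > c->path_mtu ? EMSGSIZE : 0);
}

/*
 * Simulated MTU rediscovery.  Unlike the code quoted above, it only adjusts
 * the MSS and never sends anything itself.  Returns 1 if the MSS changed,
 * 0 if no further reduction is possible.
 */
int
fake_mtudisc(struct conn *c)
{
	int new_mss = c->mss / 2;	/* placeholder for a real MTU lookup */

	if (new_mss < c->floor_mss)
		new_mss = c->floor_mss;
	if (new_mss == c->mss)
		return (0);
	c->mss = new_mss;
	return (1);
}

/*
 * Simulated tcp_output(): on EMSGSIZE, rediscover the MTU and loop back to
 * the top instead of being re-entered from fake_mtudisc().  Stack depth stays
 * constant no matter how many retries are needed.
 */
int
fake_tcp_output(struct conn *c)
{
	int error;

	assert(c != NULL);
	for (;;) {
		error = fake_ip_output(c);
		if (error != EMSGSIZE)
			return (error);
		if (!fake_mtudisc(c))
			return (EMSGSIZE);	/* no progress: give up */
	}
}
```

With a connection starting at an MSS of 1460 and a simulated path limit of 400,
fake_tcp_output() retries at 730 and then succeeds at 365, all in one stack
frame.  When the path limit is below floor_mss it returns EMSGSIZE to the
caller rather than retrying indefinitely.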