From: Nate Nielsen
Reply-To: nielsen@memberwebs.com
To: freebsd-net@freebsd.org
Date: Mon, 3 Apr 2006 18:44:03 +0000 (GMT)
Subject: Panic (race condition?) in ipsec_process_done
Message-Id: <20060403184402.9DA3EDCAC70@mail.npubs.com>
List-Id: Networking and TCP/IP with FreeBSD

I've been experiencing a panic in ipsec_process_done. Below is a
backtrace and a patch which suppresses the issue. I don't profess to
understand the IPSec code completely...

The panic occurs when performing IKE negotiations (racoon) with multiple
systems at the same time. The panicking boxes are routers with slow
CPUs, so negotiations take several seconds. Immediately after boot,
while IKE is still going on, the system panics. Needless to say, after
the reboot that follows the panic, IKE happens again, and this results
in the box rebooting over and over. I'm guessing this is due to IPSec
keys that are only halfway set up.

For me this issue only happens on production systems, so debugging is
very difficult, but I've managed to get a kernel dump and backtrace. The
patch (below) is probably incomplete, but prevents the problem from
happening for me.

USING

 - FreeBSD 6.0
 - FAST_IPSEC
 - Hardware encryption (hifn driver, aes algorithm)
 - ipsec-tools 0.6.2
 - Soekris net4826

BACKTRACE

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x70
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc05ee61e
stack pointer           = 0x28:0xc6e43ca4
frame pointer           = 0x28:0xc6e43cb4
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 6 (crypto returns)
trap number             = 12
panic: page fault
Uptime: 1m6s
Dumping 109 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 109MB (27904 pages) 94 78 62 46 30 14

(kgdb) backtrace
#0  doadump () at pcpu.h:165
#1  0xc050fcb2 in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xc050ff48 in panic (fmt=0xc06c6078 "%s")
    at /usr/src/sys/kern/kern_shutdown.c:555
#3  0xc06a0c00 in trap_fatal (frame=0xc6e43c64, eva=112)
    at /usr/src/sys/i386/i386/trap.c:831
#4  0xc06a096b in trap_pfault (frame=0xc6e43c64, usermode=0, eva=112)
    at /usr/src/sys/i386/i386/trap.c:742
#5  0xc06a05a9 in trap (frame=
      {tf_fs = -1006895096, tf_es = 167968808, tf_ds = 168099880,
       tf_edi = -1059907712, tf_esi = -1060580736, tf_ebp = -958120780,
       tf_isp = -958120816, tf_ebx = -1061533440, tf_edx = -1061533440,
       tf_ecx = -1059907712, tf_eax = 0, tf_trapno = 12,
       tf_err = -1065091072, tf_eip = -1067522530, tf_cs = -1060634592,
       tf_eflags = 66178, tf_esp = 0, tf_ss = -1061533440})
    at /usr/src/sys/i386/i386/trap.c:432
#6  0xc06903ba in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc05ee61e in ipsec_process_done (m=0xc0b6e100, isr=0xc0ba4900)
    at /usr/src/sys/netipsec/ipsec_output.c:96
#8  0xc05fbe29 in esp_output_cb (crp=0xc0d31780) at
    /usr/src/sys/netipsec/xform_esp.c:919
#9  0xc061c5d8 in crypto_ret_proc ()
    at /usr/src/sys/opencrypto/crypto.c:1227
#10 0xc04f9c48 in fork_exit (callout=0xc061c4c4, arg=0x0,
    frame=0xc6e43d38) at /usr/src/sys/kern/kern_fork.c:789
#11 0xc069041c in fork_trampoline ()
    at /usr/src/sys/i386/i386/exception.s:208

PATCH

--- sys/netipsec/ipsec_output.c.orig	Mon Apr  3 17:58:32 2006
+++ sys/netipsec/ipsec_output.c	Mon Apr  3 17:57:52 2006
@@ -93,6 +93,13 @@
 	IPSEC_ASSERT(m != NULL, ("null mbuf"));
 	IPSEC_ASSERT(isr != NULL, ("null ISR"));
+
+	/* XXX This happens. Figure out why. */
+	if (!isr->sav) {
+		m_freem (m);
+		return ENOBUFS;
+	}
+
 	sav = isr->sav;
 	IPSEC_ASSERT(sav != NULL, ("null SA"));
 	IPSEC_ASSERT(sav->sah != NULL, ("null SAH"));

Cheers,
Nate
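
For anyone who wants to see the shape of the workaround outside the
kernel, here is a minimal userland sketch of the same guard. Everything
in it is a hypothetical stand-in: the fake_* structs, fake_m_freem() and
fake_process_done() are not the real netipsec types or functions, and
the sketch says nothing about why isr->sav ends up NULL. It only shows
the drop-and-return-ENOBUFS path that the patch adds before the SA is
dereferenced.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userland stand-ins; NOT the real netipsec structures. */
struct fake_sah { int id; };
struct fake_sav { struct fake_sah *sah; };
struct fake_isr { struct fake_sav *sav; };
struct fake_mbuf { char data[64]; };

/* Counts freed buffers so the drop path can be observed. */
static int fake_freed;

/* Stand-in for m_freem(): releases the packet buffer. */
static void fake_m_freem(struct fake_mbuf *m)
{
    free(m);
    fake_freed++;
}

/*
 * Same shape as the patched check: if the request no longer has an SA
 * attached when the callback runs, drop the packet and return ENOBUFS
 * instead of dereferencing a NULL pointer. On success the caller still
 * owns the buffer.
 */
static int fake_process_done(struct fake_mbuf *m, struct fake_isr *isr)
{
    struct fake_sav *sav;

    assert(m != NULL);
    assert(isr != NULL);

    if (isr->sav == NULL) {
        fake_m_freem(m);
        return ENOBUFS;
    }

    sav = isr->sav;
    (void)sav->sah; /* safe: sav is known to be non-NULL here */
    return 0;
}
```

Calling fake_process_done() with isr.sav set to NULL frees the buffer
and returns ENOBUFS; with a valid SA attached it returns 0 and leaves
the buffer alone. As in the patch itself, this only hides the symptom.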