Message-ID: <4EB86866.9060102@sippysoft.com>
Date: Mon, 07 Nov 2011 15:23:18 -0800
From: Maxim Sobolev
Organization: Sippy Software, Inc.
To: freebsd-net@freebsd.org
Cc: "Bjoern A. Zeeb", Robert Watson, "current@freebsd.org", Jack Vogel
References: <4EB804D2.2090101@FreeBSD.org> <4EB86276.6080801@sippysoft.com>
In-Reply-To: <4EB86276.6080801@sippysoft.com>
Subject: Re: Panic in the udp_input() under heavy load

On 11/7/2011 2:57 PM, Maxim Sobolev wrote:
> On 11/7/2011 10:24 AM, Bjoern A. Zeeb wrote:
>> Unlikely; the inp is properly locked there and the udp info attach
>> better still be valid there; your problem is most likely elsewhere;
>> try to see if you have other threads and see what they do at the same
>> time, etc. You would need to race with udp_detach(); you also want
>> to make sure that the inp still looks sane from either ddb or a dump
>> and we are not talking about random memory corruption here.
>
> Well, as you can see from the trace, it points pretty strongly to that
> piece of code. And as I said, this panic is completely reproducible;
> we've seen it at least 5 times to date in exactly this location.
> Unfortunately the trace is rather long, so we could not capture it in
> full until we switched to the 80x50 mode.
>
> If it were memory corruption it would be just a random fault, while here
> we have it failing at this point reliably.
>
> Unfortunately the panic happens in the driver thread context (I
> believe), so the KDB/dump is not working. After panicking the machine
> just hangs there. The keyboard is not working and I need to do a hard
> reset.
>
> Is there any other explanation that you can think of? Is it possible for
> some other portion of the code (i.e. network driver, DMA engine etc.) to
> trash this structure by writing something out of bounds? Or something
> along those lines?

OK, I've put in the following catch to prove the case:

	up = intoudpcb(inp);
	if (up == NULL) {
		printf("BZZT! Something is terribly wrong, up == NULL!\n");
		INP_RUNLOCK(inp);
		goto badunlocked;
	}
	if (up->u_tun_func == NULL) {

I am going to give it a spin on the two busiest boxes and see if I can
log anything.

Regards,
-- 
Maksym Sobolyev
Sippy Software, Inc.
Internet Telephony (VoIP) Experts
Tel: +1-646-651-1110
Fax: +1-866-857-6942
Web: http://www.sippysoft.com
MSN: sales@sippysoft.com
Skype: SippySoft