Date: Mon, 7 Nov 2011 23:25:34 +0000 (UTC)
From: "Bjoern A. Zeeb"
To: Maxim Sobolev
Cc: freebsd-net@freebsd.org, Robert Watson, Jack Vogel
Subject: Re: Panic in the udp_input() under heavy load
In-Reply-To: <4EB86866.9060102@sippysoft.com>
References: <4EB804D2.2090101@FreeBSD.org> <4EB86276.6080801@sippysoft.com> <4EB86866.9060102@sippysoft.com>

On Mon, 7 Nov 2011, Maxim Sobolev wrote:

> On 11/7/2011 2:57 PM, Maxim Sobolev wrote:
>> On 11/7/2011 10:24 AM, Bjoern A. Zeeb wrote:
>>> Unlikely; the inp is properly locked there and the udp info attach
>>> better still be valid there; your problem is most likely elsewhere;
>>> try to see if you have other threads and see what they do at the same
>>> time, etc. You would need to race with udp_detach(); you also want
>>> to make sure that the inp still looks sane from either ddb or a dump
>>> and we are not talking about random memory corruption here.
>>
>> Well, as you can see from the trace it points pretty strongly to that
>> piece of code. And as I said this panic is completely reproducible,
>> we've seen it at least 5 times to date in exactly this location.
>> Unfortunately the trace is rather long so we could not capture it in
>> full before, until we've switched to the 80x50 mode.
>>
>> If it was a memory corruption it would be just random fault, while here
>> we have it failing in this point reliably.
>>
>> Unfortunately the panic happens in the driver thread context (I
>> believe), so the KDB/dump is not working. After panicking the machine
>> just hangs there. Keyboard is not working and I need to do a hard reset.
>>
>> Is there any other explanation that you can think of? Is it possible for
>> some other portion of the code (i.e. network driver, DMA engine etc) to
>> trash this structure by writing something off bound? Or something along
>> the lines?
>
> OK, I've put the following catch to prove the case:
>
>         up = intoudpcb(inp);
>         if (up == NULL) {
>                 printf("BZZT! Something is terribly wrong, up == NULL!\n");
>                 INP_RUNLOCK(inp);
>                 goto badunlocked;
>         }
>         if (up->u_tun_func == NULL) {
>
> I am going to give it a spin on two busiest boxes and see if I can log
> anything.

Now if you are clever you'd also log the inp there, as the above will
only prove the case that something is wrong but still not help us in
any way to figure out what.

/bz

-- 
Bjoern A. Zeeb                                 You have to have visions!
         Stop bit received. Insert coin for new address family.