Date: Sun, 26 Nov 2006 12:39:12 +0000 (GMT)
From: Robert Watson
To: Bruce Evans
Cc: Max Laier, freebsd-net@freebsd.org, Nicolae Namolovan
Subject: Re: ping -f panic [Re: Marvell Yukon 88E8056 FreeBsd Drivers]
Message-ID: <20061126123700.G2108@fledge.watson.org>
In-Reply-To: <20061126165353.Y47830@delplex.bde.org>
References: <20061125015223.GA51565@cdnetworks.co.kr>
 <200611260331.28847.max@love2party.net>
 <20061126165353.Y47830@delplex.bde.org>

On Sun, 26 Nov 2006, Bruce Evans wrote:

> On Sun, 26 Nov 2006, Max Laier wrote:
>
>> On Saturday 25 November 2006 23:20, Nicolae Namolovan wrote:
>>> But I need to use it on a production server and the CURRENT one is too
>>> unstable, without too much thinking I just run ping -f 127.0.0.1 and after
>>> some minutes I got kernel panic, heh.
>>
>> could you please be more specific about this?  My rather recent current box
>> is running for over 45min doing "ping -f 127.0.0.1" with no panic or other
>> ill behavior so far.  After about 10min I disabled the icmp limiting which
>> obviously didn't trigger it either.  Could you provide a back trace or at
>> least a panic message?  Thanks.
>
> I haven't seen any problems with ping, but ttcp -u causes the panic in
> sbdrop_internal() about half the time when the client ttcp is killed by ^C.
> There is apparently a race in close when packets are arriving.  The stack
> trace on the panicking CPU is (always?):
>
>    ... sigexit exit1 ... closef ... soclose ...
>    sbflush_internal sbdrop_internal panic
>
> and on the other CPU, with net.isr.direct=1 it was:
>
>    bge_rxeof ... netisr_dispatch ip_input ...
>    sbappendaddr_locked mb_ctor_mbuf --- trap (NMI IPI for cpustop).
>
> and with net.isr.direct=0, the other CPU was just running "idle: cpuN" and
> the bge thread was in ithread_loop.

Historically, sbflush panics have been a sign of a driver<->stack race, in
which the driver touches the mbuf [chain] after injecting it into the stack,
corrupting the socket buffer state.  Examples include freeing it, appending
another mbuf, or changing the length.  It often triggers in sbflush because
we only notice the inconsistency later, when we close the socket and flush
the buffer.  I wouldn't rule out a network stack bug, but I would definitely
take a look at the driver in detail first, making sure all error cases are
handled properly, etc.

Robert N M Watson
Computer Laboratory
University of Cambridge
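
To make the failure mode above concrete, here is a toy user-space model, offered only as a sketch. None of these names exist in the kernel: toy_mbuf, toy_sockbuf, toy_sbappend, toy_sbflush and toy_demo are hypothetical stand-ins, and the real socket buffer code tracks more state and checks it differently. The assumption being illustrated is only that the socket buffer caches a byte count at append time, so a buffer edited after hand-off no longer matches that count when the chain is drained at close.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel's mbuf and sockbuf; NOT the real structures. */
struct toy_mbuf {
    struct toy_mbuf *next;
    int len;                    /* bytes of data held by this buffer */
};

struct toy_sockbuf {
    struct toy_mbuf *head;
    struct toy_mbuf *tail;
    int cc;                     /* cached byte count, updated on append/drop */
};

/* Hand a buffer to the "stack": link it in and account for its length. */
static void toy_sbappend(struct toy_sockbuf *sb, struct toy_mbuf *m)
{
    assert(m != NULL);
    m->next = NULL;
    if (sb->tail != NULL)
        sb->tail->next = m;
    else
        sb->head = m;
    sb->tail = m;
    sb->cc += m->len;
}

/*
 * Drain the buffer as a close would.  Returns 0 if the cached count and
 * the chain agree, -1 at the point where a real kernel would panic.
 */
static int toy_sbflush(struct toy_sockbuf *sb)
{
    while (sb->head != NULL) {
        struct toy_mbuf *m = sb->head;

        sb->head = m->next;
        sb->cc -= m->len;       /* uses the length as it is NOW */
    }
    sb->tail = NULL;
    return (sb->cc == 0) ? 0 : -1;
}

/* Run one append/flush cycle, optionally with a misbehaving "driver". */
static int toy_demo(int driver_touches_after_handoff)
{
    struct toy_mbuf m = { NULL, 100 };
    struct toy_sockbuf sb = { NULL, NULL, 0 };

    toy_sbappend(&sb, &m);      /* ownership passes to the stack here */
    if (driver_touches_after_handoff)
        m.len = 60;             /* the bug: editing a buffer we gave away */
    return toy_sbflush(&sb);
}
```

In this model toy_demo(0) returns 0 and toy_demo(1) returns -1. Note that the inconsistency is created at the moment of the late write but only detected in the flush, which matches the observation that the panic shows up at close time rather than in the driver.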