From: Luigi Rizzo
To: Barney Cordoba
Cc: freebsd-net@freebsd.org, Adrian Chadd
Date: Fri, 18 Jan 2013 15:59:44 +0100
Subject: Re: two problems in dev/e1000/if_lem.c::lem_handle_rxtx()

On Fri, Jan 18, 2013 at 06:50:03AM -0800, Barney Cordoba wrote:
>
> > the problem i was actually seeing are slightly different,
> > namely:
> > - once the driver lags behind, it does not have a chance to
> >   recover even if there are CPU cycles available, because both
> >   interrupt rate and packets per interrupt are capped.
> > - much worse, once the input stream stops, you have a huge
> >   backlog that is not drained. And if, say, you try to ping the
> >   machine, the incoming packet is behind another 3900 packets,
> >   so the first interrupt drains 100 (but not the ping request,
> >   so no response), you keep going for a while, eventually the
> >   external world sees the machine as not responding and stops
> >   even trying to talk to it.
>
> This is a silly example. As I said before, the 100 work limit is
> arbitrary and too low for a busy network. If you have a backlog of
> 3900 packets with a workload of 100, then your system is so incompetently
> tuned that it's not even worthy of discussion.
>
> If you're using workload and task queues because you don't know how to
> tune moderation the process_limit, that's one discussion. But if you can't
> process all of the packets in your RX queue in the interrupt window than
> you either need to tune your machine better or get a faster machine.
>
> When you tune the work limit you're making a decision about the trade off
> between livelock and dropping packets. It's not an arbitrary decision.

maybe i am too incompetent to participate to this discussion.
what do i know about this stuff, after all!

have fun :)
luigi