Date: Mon, 24 Dec 2007 11:54:09 +0000 (GMT)
From: Robert Watson
To: dima <_pppp@mail.ru>
Cc: arch@FreeBSD.org, net@FreeBSD.org
Subject: Re: Re: TCP Projects for 8.0 - first cut wiki page
Message-ID: <20071224114504.E40176@fledge.watson.org>
References: <20071220135342.O67327@fledge.watson.org>

On Thu, 20 Dec 2007, dima wrote:

>> Per earlier e-mail, I've created a page to track the various on-going
>> projects:
>>
>> http://wiki.freebsd.org/TCPProjects8
>>
>> Rui has already kindly added the TCP ECN work to the page.
>
> As I know, we have a single swi:net thread in the kernel yet. Are there any
> plans to make several such threads? If yes, this activity isn't mentioned in
> wiki.
>
> There are 2 ideas: 1. per-core thread 2. per-interface thread I like the
> second more.

This is a kind of tricky point, and one we will definitely be looking at.
In FreeBSD 6, we did link layer processing in the ithread, and deferred network layer and socket layer processing to the netisr and user thread. In FreeBSD 7, we process up through the network layer and socket delivery in the ithread, and only the socket read/copyout are deferred to the user thread. This means that in FreeBSD 7, we get true parallelism between different input sources. We still have the netisr, which is used for certain types of deferred processing, such as loopback network traffic (in order to avoid entering the receive path from the transmit path), IPSEC tunnel processing, etc., but for general Ethernet traffic, it is not used.

This appears to work really well for a small number of interfaces because we eliminate a large number of context switches, and push the "drop point" from software into hardware, meaning that we don't burn cycles doing link layer processing for packets that will never make it to the network layer (netisr queue overflow). The two real downsides are that this promotes network layer processing to interrupt priority rather than soft interrupt priority (and this may propagate to other threads), and that the opportunity for parallelism between the link layer and the network processing layer is reduced. The reason we went ahead and made the default change (it's configurable at runtime) is that it seemed that in most cases, we saw a significant performance improvement.

However, the current ithread/direct dispatch model has scaling issues as we approach larger numbers of interfaces, as the ithread approach does generally, because when the number of active threads exceeds the number of cores and the system is really busy, context switches are re-introduced, as well as an increased chance of ithreads bouncing around, etc. What to do at that point is an interesting question--would we be better off reducing the number of active threads so that we have a small ithread worker pool serving many devices, for example?
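
As a toy illustration of the "drop point" argument above, here is a minimal Python sketch. It is not FreeBSD code: the function names, the tick-based model, and every parameter (ticks, arrivals, drain, queue_limit, capacity) are hypothetical and chosen only for illustration. It contrasts a queued model, where link layer work is done before a bounded queue may drop the packet, with a direct dispatch model, where packets the ithread cannot handle are assumed to be dropped by the hardware before any software work is spent on them.

```python
from collections import deque


def queued_dispatch(ticks, arrivals, drain, queue_limit):
    """Toy FreeBSD 6-style model: link layer first, then a bounded queue."""
    queue = deque()
    link_work = delivered = wasted = 0
    for _ in range(ticks):
        for _ in range(arrivals):
            link_work += 1          # link layer cost is paid for every packet
            if len(queue) < queue_limit:
                queue.append(1)
            else:
                wasted += 1         # queue overflow: the link layer work is lost
        # The deferred (netisr-like) stage drains a fixed number per tick.
        for _ in range(min(drain, len(queue))):
            queue.popleft()
            delivered += 1
    return {"link_work": link_work, "delivered": delivered, "wasted": wasted}


def direct_dispatch(ticks, arrivals, capacity):
    """Toy FreeBSD 7-style model: each packet is processed to completion."""
    link_work = delivered = hw_drops = 0
    for _ in range(ticks):
        handled = min(arrivals, capacity)
        link_work += handled        # only packets actually taken cost cycles
        delivered += handled
        hw_drops += arrivals - handled  # assumed to be dropped in hardware
    return {"link_work": link_work, "delivered": delivered,
            "wasted": 0, "hw_drops": hw_drops}


if __name__ == "__main__":
    print(queued_dispatch(ticks=10, arrivals=5, drain=2, queue_limit=4))
    print(direct_dispatch(ticks=10, arrivals=5, capacity=2))
```

With the same overload in both models, the two deliver the same number of packets, but the queued model spends link layer work on packets that are later dropped, while the direct model spends none. The sketch ignores context switches, priorities, and multiple cores, which are the other factors discussed above.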

So, in answer to your original question: we already do a per-interface thread for all in-bound processing in FreeBSD 7, but we'll need to continue to work on the underlying model and its behavior under high load.

Robert N M Watson
Computer Laboratory
University of Cambridge