From owner-freebsd-current Sun Mar 16 15:50:07 2003
Message-ID: <3E750D52.FFA28DA2@mindspring.com>
Date: Sun, 16 Mar 2003 15:48:34 -0800
From: Terry Lambert <tlambert2@mindspring.com>
To: Petri Helenius
Cc: freebsd-current@FreeBSD.ORG
Subject: Re: mbuf cache

Petri Helenius wrote:
> Terry Lambert wrote:
> > Ah.  You are receiver livelocked.  Try enabling polling; it will
> > help up to the first stall barrier (NETISR not getting a chance to
> > run protocol processing to completion because of interrupt
> > overhead); there are two other stall barriers after that, and
> > another in user space is possible, depending on whether the
> > application layer is request/response.
>
> Are you sure that polling would help, since the em driver is using
> interrupt regulation by default?

You mean hardware interrupt coalescing, not regulation.  Regulation
is where you prevent the card from generating interrupts during a
livelock situation, to permit the host to process the data it already
has in the pipeline.

It will help some.  Instead of livelocking because the interrupt load
never lets NETISR run, it will livelock where NETISR attempts to push
data to user space, which is never read by the user space process,
because the user space process never gets to run, since interrupts,
and now NETISR processing, are taking all the CPU time.

You can get to this same point in -CURRENT, if you are using up to
date sources, by enabling direct dispatch, which disables NETISR.
This will help somewhat more than polling, since it removes the
normal timer latency between receipt of a packet and processing of
the packet through the network stack.  That should reduce overall
pool retention time for individual mbufs that don't end up on a
socket so_rcv queue.  Because interrupts on the card are not
acknowledged until the code runs to completion, this also tends to
regulate interrupt load.
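If you want to try the polling route first, the knobs look roughly
like this; I am writing them from memory, so check polling(4) and
your driver before trusting the exact names:

    # kernel config: polling has to be compiled in, and it wants a
    # fast clock to get reasonable granularity
    options DEVICE_POLLING
    options HZ=1000

    # at runtime
    sysctl kern.polling.enable=1
    # percentage of each tick reserved for user space processing
    sysctl kern.polling.user_frac=50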
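For direct dispatch on a recent -CURRENT, I believe the switch is the
netisr sysctl below, but verify it against your own sources, since
this code has been changing:

    # run protocol processing to completion from the interrupt
    # context, instead of queueing packets for the NETISR software
    # interrupt
    sysctl net.isr.enable=1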
Direct dispatch also has the desirable side effect that stack
processing will occur on the same CPU as the interrupt processing
occurred on.  This avoids inter-CPU memory bus arbitration cycles,
and ensures that you won't engage in a lot of unnecessary L1 cache
busting.  Hence I prefer this method to polling.

> It might solve the livelock, but it probably does not increase the
> performance of the mbuf allocator?

No, it does not increase the performance of the mbuf allocator.  The
main problem with the mbuf allocator as it stands today is that there
is a tradeoff between how fast you can make it and whether or not
it's SMP safe.

There is a researcher at the University of Kentucky, to whom I have
explained a number of obscure details of the VM system, who has
implemented a freelist allocator and gotten a 5 times performance
increase on his TCP stack.  I'm not sure if he'd be willing to share
his research with you or anyone else, but if you read back over my
own postings regarding mbuf allocators, you should be able to repeat
the development work he has done; there is a sketch of the idea at
the end of this message.  Note that his allocator is not SMP safe,
and is probably antithetical to the idea altogether.

Personally, I'm coming to the conclusion that SMP systems should be
treated as NUMA machines, with separately allocated resources and,
potentially, even separate OS images.  Until the memory and I/O bus
speeds catch up with the CPU speeds again, the cost of resource
contention stalls is so incredibly high, because of the speed
multipliers, as to make it not really worth running SMP systems.  You
will get much better load capacity scaling out of two cheaper boxes,
if you implement correctly, IMO.

-- Terry
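P.S.: Here is a minimal sketch of what I mean by a freelist
allocator.  This is my own illustration, not his code: carve a slab
into fixed-size buffers once, thread the free ones onto a singly
linked list, and make alloc and free a couple of pointer operations
each.  It is deliberately not SMP safe (no locks), which is exactly
where the speed comes from; you would have to use it per-CPU with
preemption disabled, or on a UP machine.

    #include <stdlib.h>

    #define FL_NBUFS   4096
    #define FL_BUFSIZE 2048            /* e.g. one mbuf cluster */

    struct fl_buf {
            struct fl_buf *fl_next;    /* overlays the buffer while free */
    };

    static struct fl_buf *fl_head;

    /* Allocate the slab once; thread every buffer onto the list. */
    static int
    fl_init(void)
    {
            char *slab;
            int i;

            slab = malloc((size_t)FL_NBUFS * FL_BUFSIZE);
            if (slab == NULL)
                    return (-1);
            for (i = 0; i < FL_NBUFS; i++) {
                    struct fl_buf *b;

                    b = (struct fl_buf *)(slab + i * FL_BUFSIZE);
                    b->fl_next = fl_head;
                    fl_head = b;
            }
            return (0);
    }

    /* Pop: two memory references, no lock, no size computation. */
    static void *
    fl_alloc(void)
    {
            struct fl_buf *b;

            b = fl_head;
            if (b != NULL)
                    fl_head = b->fl_next;
            return (b);
    }

    /* Push: the free buffer itself holds the link, no bookkeeping. */
    static void
    fl_free(void *p)
    {
            struct fl_buf *b;

            b = p;
            b->fl_next = fl_head;
            fl_head = b;
    }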