From: Anand Subramanian
Date: Tue, 17 Aug 2004 23:35:48 -0400 (EDT)
To: freebsd-hackers@freebsd.org
Subject: Network Packet drops in FreeBSD 5.2.1

Hi All,

I am using an Intel Celeron box (single CPU, 1.7 GHz, 495 MB real and 472 MB avail memory, FreeBSD 5.2.1 release #15) to run a daemon process which shares a circular queue/buffer with the kernel. The daemon drains objects off the front of the queue, while the queue objects are populated by the protocol processing function, say XXX_input(), in the kernel, which is called by ip_input(). If the front and rear indices of the shared buffer are equal, the protocol stack drops the packet, as expected.
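
To make the queue discipline concrete, here is a minimal user-space sketch of such a circular queue with drop-on-full behavior. It is not the actual kernel code: the names (shared_queue, pkt_obj, sq_enqueue, sq_dequeue, QLEN) are hypothetical, and it assumes the common convention of leaving one slot unused, so that front == rear means "empty" and the producer drops a packet when advancing rear would make it equal to front. The real XXX_input() may encode the full condition differently. No memory barriers are used, which is only plausible for a single producer and single consumer on a single CPU, as on the box above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define QLEN 8   /* hypothetical queue length; one slot always stays unused */

struct pkt_obj {                 /* hypothetical queue object */
    uint32_t len;
    uint8_t  data[64];           /* room for the 60-64 byte packets */
};

struct shared_queue {
    volatile uint32_t front;     /* next slot the daemon (consumer) reads */
    volatile uint32_t rear;      /* next slot the kernel (producer) writes */
    uint32_t drops;              /* packets dropped because the queue was full */
    struct pkt_obj slots[QLEN];
};

static void sq_init(struct shared_queue *q)
{
    q->front = 0;
    q->rear = 0;
    q->drops = 0;
}

/* Producer side (the role of XXX_input()): returns 1 if queued, 0 if dropped. */
static int sq_enqueue(struct shared_queue *q, const void *buf, uint32_t len)
{
    uint32_t rear = q->rear;
    uint32_t next = (rear + 1) % QLEN;

    assert(len <= sizeof(q->slots[0].data));
    if (next == q->front) {      /* advancing rear would hit front: full */
        q->drops++;
        return 0;
    }
    memcpy(q->slots[rear].data, buf, len);
    q->slots[rear].len = len;
    q->rear = next;              /* publish the slot only after it is filled */
    return 1;
}

/* Consumer side (the role of the daemon): returns 1 if an object was copied out. */
static int sq_dequeue(struct shared_queue *q, struct pkt_obj *out)
{
    uint32_t front = q->front;

    if (front == q->rear)        /* front == rear: nothing queued */
        return 0;
    *out = q->slots[front];
    q->front = (front + 1) % QLEN;
    return 1;
}
```

With QLEN set to 8, seven consecutive enqueues succeed and the eighth is counted in drops until the consumer dequeues at least one object.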

This model works fine for packets (60-64 bytes in size) at input rates up to 11500 packets/sec, after which packets are lost. The machine running the daemon uses a SiS 900 NIC (10/100 Mbps). The packet loss was detected using the "netstat -I sis0" command, run both on starting the daemon and upon shutting it down. The Ipkts field in the netstat output should indicate the number of packets received by the interface in question. With an appropriately chosen value for the shared buffer length, no packets are dropped because of a full shared queue. Hence packets seem to be dropped at the adapter level.

The surprising part is that although packets are being dropped/lost, top shows a ~70% idle system with a peak interrupt time of ~25%.

The daemon uses a "hacked" version of the select() call with a timeout value. The XXX_input() protocol processing function signals the thread/KSE waiting on the hacked select() call as soon as it sees that there are packets in the shared buffer (shared between the daemon and the kernel).

My questions are:

1. Is top really accurate in reporting all stats at such workloads or input packet rates? Can the %Idle time reported by top be trusted?

2. At increasing network input loads (12000 pkts/sec), much of the system time may be spent in the hardware interrupt handler and the IP processing functions. With the user daemon calling select(), any time spent in the select() call would be charged to the daemon's timeslice. Would the daemon still be fairly scheduled to run? It should be (depending, of course, on the RPLs), but I wanted to confirm this.

3. When ip_drain() is called and it calls the DEQUEUE macro, it acquires Giant. Does this mean other netisrs and handlers are disabled, so that the queue gets emptied in a sort of batch-mode behavior?

4. I am trying different clock speeds by changing kern.hz in loader.conf. It doesn't seem to help, but I am still looking into this.
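
For reference, kern.hz is a loader tunable set in /boot/loader.conf. A minimal fragment might look like the following; the value shown is purely an illustrative assumption, not one of the values actually tried:

```
# /boot/loader.conf
# Illustrative value only; pick the tick rate you want to test.
kern.hz="2000"
```

The setting takes effect at the next boot, and the resulting rate can be checked with "sysctl kern.clockrate".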

When network packets are being dropped at the interface level, is it really necessary for the system to be ~0% idle?

Any other input is greatly appreciated.

Best,
Anand