From: Stefan Lambrev
Date: Mon, 04 Feb 2008 19:01:13 +0200
To: freebsd-performance@freebsd.org
Subject: Re: network performance
Message-ID: <47A744D9.5080808@moneybookers.com>
In-Reply-To: <47A72EAB.6070602@moneybookers.com>
References: <4794E6CC.1050107@moneybookers.com> <47A0B023.5020401@moneybookers.com> <47A3074A.3040409@moneybookers.com> <47A72EAB.6070602@moneybookers.com>

Greetings,

Stefan Lambrev wrote:
> Greetings,
>
> In my desire to increase network throughput, and to be able to handle
> more than ~250-270kpps
> I started
> experimenting with lagg and link aggregation control
> protocol (lacp).
> To my surprise this doesn't increase the number of packets my server
> can handle.
>
> Here is what netstat reports:
>
> netstat -w1 -I lagg0
>             input        (lagg0)           output
>    packets  errs      bytes    packets  errs      bytes colls
>     267180     0   16030806     254056     0   14735542     0
>     266875     0   16012506     253829     0   14722260     0
>
> netstat -w1 -I em0
>             input          (em0)           output
>    packets  errs      bytes    packets  errs      bytes colls
>     124789 72976    7487340     115329     0    6690468     0
>     126860 67350    7611600     114769     0    6658002     0
>
> netstat -w1 -I em2
>             input          (em2)           output
>    packets  errs      bytes    packets  errs      bytes colls
>     123695 65533    7421700     113575     0    6584856     0
>     130277 62646    7816626     113648     0    6592280     0
>     123545 64171    7412706     113714     0    6596174     0
>
> Using lagg doesn't improve the situation at all, and the errors are
> also not reported.
> Using lagg also increased context switches:
>
>  procs      memory      page                     disk  faults            cpu
>  r b w     avm     fre  flt  re  pi  po  fr  sr ad4    in    sy    cs us sy id
>  1 0 0   81048 1914640   52   0   0   0  50   0   0  3036 37902 13512  1 20 79
>  0 0 0   81048 1914640   13   0   0   0   0   0   0  9582    83 22166  0 56 44
>  0 0 0   81048 1914640   13   0   0   0   0   0   0  9594    80 22028  0 55 45
>  0 0 0   81048 1914640   13   0   0   0   0   0   0  9593    82 22095  0 56 44
>
> Top showed +55% system under CPU states, which is quite high?
>
> I'll use hwpmc and lock_profiling to see where the kernel spends its
> time.
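
As a rough cross-check of the netstat figures quoted above, the per-second samples can be averaged per interface. This is only a sketch: the rows are copied by hand from the output above, the names (samples, summarize) are made up for illustration, and it assumes that packets counted under errs are not also counted under packets. The lagg0, em0 and em2 samples were taken at different moments, so the comparison is indicative only.

```python
# Input-side samples copied from the quoted netstat -w1 output:
# one (packets, errs) pair per one-second interval.
samples = {
    "lagg0": [(267180, 0), (266875, 0)],
    "em0": [(124789, 72976), (126860, 67350)],
    "em2": [(123695, 65533), (130277, 62646), (123545, 64171)],
}


def summarize(rows):
    """Return (mean packets/s, mean errs/s, errs as a share of packets + errs).

    Assumption: errs are packets that never show up in the packets column,
    so packets + errs approximates what arrived at the port.
    """
    n = len(rows)
    pkts = sum(p for p, _ in rows) / n
    errs = sum(e for _, e in rows) / n
    return pkts, errs, errs / (pkts + errs)


summary = {ifname: summarize(rows) for ifname, rows in samples.items()}

# Sum of what the two physical ports accepted, to compare with lagg0.
members_pps = summary["em0"][0] + summary["em2"][0]

for ifname, (pkts, errs, share) in summary.items():
    print(f"{ifname}: {pkts:.0f} pkt/s in, {errs:.0f} err/s, {share:.1%} errors")
print(f"em0 + em2: {members_pps:.0f} pkt/s in")
```

With these samples the two em ports together accept roughly 250 kpps, which is in the same range as the lagg0 figure, while only the em ports show input errors.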

Greetings,

Here is what hwpmc shows (without using lagg):

   %   cumulative    self               self     total
  time    seconds   seconds    calls  ms/call  ms/call  name
  14.7   325801.00  325801.00      0  100.00%           MD5Transform [1]
   8.4   512008.00  186207.00      0  100.00%           _mtx_unlock_flags [2]
   6.1   646787.00  134779.00      0  100.00%           _mtx_lock_flags [3]
   5.6   769909.00  123122.00      0  100.00%           uma_zalloc_arg [4]
   5.0   879853.00  109944.00      0  100.00%           rn_match [5]
   3.5   957294.00   77441.00      0  100.00%           memcpy [6]
   3.1  1025989.00   68695.00      0  100.00%           bzero [7]
   2.8  1087273.00   61284.00      0  100.00%           em_encap [8]
   2.6  1145231.00   57958.00      0  100.00%           ip_output [9]
   2.5  1200105.00   54874.00      0  100.00%           bus_dmamap_load_mbuf_sg [10]
   2.3  1251626.00   51521.00      0  100.00%           syncache_add [11]
   2.1  1297826.50   46200.50      0  100.00%           syncache_lookup [12]
   2.1  1343661.50   45835.00      0  100.00%           tcp_input [13]
   1.8  1383912.00   40250.50      0  100.00%           ip_input [14]
   1.5  1417997.00   34085.00      0  100.00%           syncache_respond [15]
   1.5  1451114.50   33117.50      0  100.00%           uma_zfree_internal [16]
   1.5  1484046.00   32931.50      0  100.00%           critical_exit [17]
   1.5  1516899.00   32853.00      0  100.00%           MD5Update [18]

em0: flags=8843 metric 0 mtu 1500
        options=19b
        ether 00:15:17:58:11:a5
        inet 10.3.3.1 netmask 0xffffff00 broadcast 10.3.3.255
        media: Ethernet autoselect (1000baseTX )
        status: active

Is it normal for so much time to be spent in MD5Transform with tx/rx
enabled?

LOCK_PROFILING results are here: http://89.186.204.158/lock_profiling2.txt

-- 
Best Wishes,
Stefan Lambrev
ICQ# 24134177