From: Dave Seddon
To: freebsd-net@freebsd.org
Date: Wed, 03 Aug 2005 13:49:32 +1000
Subject: running out of mbufs?
Message-ID: <1123040973.95445.TMDA@seddon.ca>

Greetings,

I'm trying to do some performance testing of a content filtering
system, so I'm trying to get very high HTTP throughput.  I've got 4 HP
DL380s with 3.4 GHz Xeon processors (hyper-threading), 1 GB of RAM, 2
onboard bge interfaces, and 2 dual-port em cards.  Using FreeBSD
5.4-STABLE (as of 2005/08/02) and device polling, I've configured a
large number (246) of VLAN interfaces on two machines, with apache on
one box and siege on the other.  Using
'siege -f /home/my_big_list_of_urls -c 50 --internet', one host makes
a large number of requests to the other machine.

I've been trying to tune for maximum performance and have been using
lots of example /etc/sysctl.conf settings and so on from the web.
Adjusting these settings and running the siege, I've found the apache
server completely loses network connectivity when device polling is
enabled.  I've adjusted HZ a lot and found the system survives the
longest when it is set at 15000 (yes, it seems very large, doesn't
it).  The problem now seems to be that I'm running out of mbufs:

--------------------------------------
4294264419 mbufs in use
4294866740/2147483647 mbuf clusters in use (current/max)
0/3/6656 sfbufs in use (current/peak/max)
3817472 KBytes allocated to network
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines
--------------------------------------
host228# sysctl kern.polling
kern.polling.burst: 671
kern.polling.each_burst: 100
kern.polling.burst_max: 1000
kern.polling.idle_poll: 0
kern.polling.poll_in_trap: 0
kern.polling.user_frac: 70
kern.polling.reg_frac: 40
kern.polling.short_ticks: 3523
kern.polling.lost_polls: 49996588
kern.polling.pending_polls: 1
kern.polling.residual_burst: 0
kern.polling.handlers: 2
kern.polling.enable: 1
kern.polling.phase: 0
kern.polling.suspect: 1768262
kern.polling.stalled: 9
kern.polling.idlepoll_sleeping: 1
-------------------------------------

For some reason, the 'current' count can be WAY higher than the 'max',
which seems very odd.  I've tried putting the 'max' right up to 5
billion, however it only goes to 2.1 billion.  How should I proceed
further?
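(The boot-time route I'd try next is /boot/loader.conf; as far as I
know both kern.hz and kern.ipc.nmbclusters are loader tunables on 5.x.
The values below are only an illustrative sketch, not something I've
tested:)

--------------------------------------
# /boot/loader.conf -- sketch only, illustrative values
kern.hz="1000"                   # set the clock rate at boot
kern.ipc.nmbclusters="65536"     # reserve mbuf clusters at boot
--------------------------------------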
How come the box loses all connectivity, rather than just some TCP
streams failing?

Why doesn't the network recover when I stop the siege?

Why does kern.polling.burst_max only go to 1000 when I try setting it
to 1500?

Settings:
----------------------------------------------------------
host228# sysctl kern.polling
kern.polling.burst: 684
kern.polling.each_burst: 100
kern.polling.burst_max: 1000
kern.polling.idle_poll: 0
kern.polling.poll_in_trap: 0
kern.polling.user_frac: 70
kern.polling.reg_frac: 40
kern.polling.short_ticks: 97
kern.polling.lost_polls: 8390
kern.polling.pending_polls: 0
kern.polling.residual_burst: 0
kern.polling.handlers: 2
kern.polling.enable: 1
kern.polling.phase: 0
kern.polling.suspect: 3642
kern.polling.stalled: 0
kern.polling.idlepoll_sleeping: 1
------------------------------------------------------------
host228# cat /etc/sysctl.conf
#kern.polling.enable=1
kern.polling.enable=1
#kern.polling.user_frac: 50
#kern.polling.reg_frac: 20
kern.polling.user_frac=70
kern.polling.reg_frac=40
#kern.polling.burst: 5
#kern.polling.each_burst: 5
#kern.polling.burst_max: 150   #default for 100MB/s
kern.polling.burst=1000
kern.polling.each_burst=100
kern.polling.burst_max=2000
# example I found on the web
#kern.polling.burst: 1000
#kern.polling.each_burst: 80
#kern.polling.burst_max: 1000
#net.inet.tcp.sendspace: 32768
#net.inet.tcp.recvspace: 65536
net.inet.tcp.sendspace=1024000
net.inet.tcp.recvspace=1024000
#sysctl net.inet.tcp.rfc1323=1
# Activate window scaling and timestamp options according to RFC 1323.
net.inet.tcp.rfc1323=1
net.inet.tcp.delayed_ack=0
#kern.ipc.maxsockbuf: 262144
kern.ipc.maxsockbuf=20480000
# The kern.ipc.somaxconn sysctl variable limits the size of the listen
# queue for accepting new TCP connections.  The default value of 128
# is typically too low for robust handling of new connections in a
# heavily loaded web server environment.
#kern.ipc.somaxconn: 128
kern.ipc.somaxconn=1024
# TCP Bandwidth Delay Product Limiting is similar to TCP/Vegas in
# NetBSD.  It can be enabled by setting the net.inet.tcp.inflight.enable
# sysctl variable to 1.  The system will attempt to calculate the
# bandwidth delay product for each connection and limit the amount of
# data queued to the network to just the amount required to maintain
# optimum throughput.
# This feature is useful if you are serving data over modems, Gigabit
# Ethernet, or even high speed WAN links (or any other link with a high
# bandwidth delay product), especially if you are also using window
# scaling or have configured a large send window.  If you enable this
# option, you should also be sure to set net.inet.tcp.inflight.debug to
# 0 (disable debugging), and for production use setting
# net.inet.tcp.inflight.min to at least 6144 may be beneficial.
# these are the defaults
#net.inet.tcp.inflight.enable: 1
#net.inet.tcp.inflight.debug: 0
#net.inet.tcp.inflight.min: 6144
#net.inet.tcp.inflight.max: 1073725440
#net.inet.tcp.inflight.stab: 20
# Disable entropy harvesting for ethernet devices and interrupts.
# There are optimizations present in 6.x that have not yet been
# backported that improve the overhead of entropy harvesting, but you
# can get the same benefits by disabling it.  In your environment,
# it's likely not needed.  I hope to backport these changes in a
# couple of weeks to 5-STABLE.
kern.random.sys.harvest.ethernet=0
kern.random.sys.harvest.interrupt=0
--------------------------------------------------
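While siege is running I keep an eye on the mbuf counters with a small
sh loop like the one below (just my monitoring sketch; I'm assuming
the mbuf zones show up under names containing "mbuf" in vmstat -z on
5.4):

--------------------------------------
#!/bin/sh
# crude monitoring loop -- sketch only, run from another terminal
while true; do
        date
        netstat -m                    # mbuf / cluster / sfbuf usage
        vmstat -z | grep -i mbuf      # UMA zone stats (assumed zone names)
        sysctl kern.polling.lost_polls kern.polling.suspect
        sleep 5
done
--------------------------------------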
host228# sysctl -a | grep ipc | grep nm
kern.ipc.nmbclusters: 25600
host228# sysctl kern.ipc.nmbclusters=5000000000
kern.ipc.nmbclusters: 25600 -> 2147483647
host228# sysctl -a | grep ipc | grep nm
kern.ipc.nmbclusters: 2147483647
-------------------------------------------------
host228# sysctl -a | grep hz
kern.clockrate: { hz = 15000, tick = 66, profhz = 1024, stathz = 128 }
debug.psmhz: 20
--------------------------------------------------
THE PHYSICAL INTERFACES ONLY (I'm only using one interface per
two-port card, and only running performance tests on the em cards):

bge0: flags=8843 mtu 1500
        options=1a
        inet 192.168.1.228 netmask 0xffffff00 broadcast 192.168.1.255
        ether 00:12:79:cf:d0:bf
        media: Ethernet autoselect (1000baseTX)
        status: active
bge1: flags=8802 mtu 1500
        options=1a
        ether 00:12:79:cf:d0:be
        media: Ethernet autoselect (none)
        status: no carrier
em0: flags=18843 mtu 1500
        options=4b
        ether 00:11:0a:56:ab:3a
        media: Ethernet autoselect (1000baseTX)
        status: active
em1: flags=8843 mtu 1500
        options=4b
        ether 00:11:0a:56:ab:3b
        media: Ethernet autoselect
        status: no carrier
em2: flags=18843 mtu 1500
        options=4b
        ether 00:11:0a:56:b2:4c
        media: Ethernet autoselect (1000baseTX)
        status: active
em3: flags=8843 mtu 1500
        options=4b
        ether 00:11:0a:56:b2:4d
        media: Ethernet autoselect
        status: no carrier
lo0: flags=8049 mtu 16384
        inet 127.0.0.1 netmask 0xff000000
---------------------------------------
Regards,
Dave Seddon
das-keyword-net.6770cb@seddon.ca