From: Sergey Pronin
To: freebsd-net@freebsd.org, freebsd-bugs@freebsd.org
Date: Mon, 16 Mar 2009 18:40:58 +0300
Subject: Synopsis: process swi1: net, taskq em0 and dummynet gives 100% CPU usage

Synopsis: process swi1: net, taskq em0 and dummynet gives 100% CPU usage

Related to
http://lists.freebsd.org/pipermail/freebsd-net/2009-February/021120.html

Regardless of conditions (no heavy load, little traffic passing through,
few ng nodes), the server stops working properly. I see one of three
failure modes:

A:
1) swi1: net shows 100% CPU usage.
2) The server does not respond to ICMP echo requests.
3) ssh, of course, does not work.
4) mpd shows the "ngsock" state in top.
5) Rebooting the server helps.

B:
1) taskq: em0 shows 100% CPU usage.
2) There are watchdog timeouts in /var/log/messages.
3) The server does not respond to ICMP echo requests.
4) ssh, of course, does not work.
5) mpd shows the "ngsock" state in top.
6) Rebooting the server helps.
7) swi1: net is at 0%.

C:
1) The dummynet process shows 100% CPU usage.
2) The server does not respond to ICMP echo requests.
3) ssh, of course, does not work.
4) mpd shows the "ngsock" state in top.
5) Rebooting the server helps.

I have a few servers:
Boards: INTEL S3200SH with Q8200 or E8600
NICs: 82566DM-2 or 82571EB (em driver)
OSes: FreeBSD 7.0-RELEASE-p10, FreeBSD 7.0-RELEASE-p9, FreeBSD 6.4-RELEASE-p3
Soft: mpd 4.4.1, ipfw with dummynet shaping, pf (NAT only)

PPPoE setup:
- Only the em0 card is in use, with about 550 VLANs.
- About 2000 ng nodes are created.
- About 500-700 simultaneous PPPoE sessions at rush hour.

kernel:
    device  bpf     # Berkeley packet filter
    device  pf
    options IPFIREWALL
    options IPFIREWALL_VERBOSE
    options IPFIREWALL_FORWARD
    options IPFIREWALL_VERBOSE_LIMIT=1000
    options IPFIREWALL_DEFAULT_TO_ACCEPT
    options IPDIVERT
    options DUMMYNET
    options DEVICE_POLLING
    options HZ=2000
    options NETGRAPH
    options NETGRAPH_ETHER
    options NETGRAPH_IFACE
    options NETGRAPH_SOCKET
    options NETGRAPH_PPP
    options NETGRAPH_TCPMSS
    options NETGRAPH_TEE
    options NETGRAPH_VJC
    options NETGRAPH_PPPOE

On some servers I have netgraph loaded as modules and the polling option
commented out.

sysctl.conf:
    net.inet.ip.intr_queue_maxlen=1000
    net.inet.tcp.blackhole=2
    net.inet.udp.blackhole=1
    net.inet.ip.dummynet.hash_size=1024
    net.inet.ip.dummynet.io_fast=1
    net.inet.ip.fw.one_pass=1
    net.inet.ip.fastforwarding=1
    net.isr.direct=0
    #net.inet.ip.portrange.randomized=0
    net.inet.tcp.syncookies=1
    kern.ipc.maxsockbuf=1048576
    net.graph.maxdgram=524288
    net.graph.recvspace=524288
    net.inet.ip.portrange.first=1024
    net.inet.ip.portrange.last=65535
    dev.em.0.rx_int_delay=160
    dev.em.0.rx_abs_int_delay=160
    dev.em.0.tx_int_delay=160
    dev.em.0.tx_abs_int_delay=160
    dev.em.0.rx_processing_limit=200

loader.conf:
    autoboot_delay="2"
    kern.ipc.maxpipekva=10000000
    net.graph.maxalloc=2048
    hw.em.rxd="512"
    hw.em.txd="1024"

There are about 30 ipfw rules, plus 2 rules for shaping:
    00300 pipe tablearg ip from any to table(4) out via ng*
    00301 pipe tablearg ip from table(5) to any in via ng*

I have tested different network cards with different chipsets, with and
without lagg0. I had the same problems with FreeBSD 7.1-RELEASE-p1/p2.
I also tried running the servers without the em tuning in loader.conf
and sysctl.conf. Server uptime varies from one week to two months.

I have two other servers with the same hardware that do not use
dummynet, netgraph, or mpd. They run only quagga + bgp on the same
chipsets with FreeBSD 7.0-RELEASE-p10, and they have no problems at all.

IMHO the problem is somewhere in netgraph. Something is causing an
infinite loop.

Any ideas?
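
For readers less familiar with `tablearg`, the two shaping rules above can be modelled roughly as follows: rule 00300 looks up the destination address of outbound packets in table 4, rule 00301 looks up the source address of inbound packets in table 5, and the value stored with the matching table entry is used as the pipe number. This is only a minimal Python sketch of that selection logic, not ipfw's implementation. The function names, the table contents, and the pipe numbers are hypothetical, and the `via ng*` interface match is ignored.

```python
import ipaddress

def table_lookup(table, addr):
    """Return the value of the most specific matching prefix, or None.

    `table` maps prefix strings to integer values, standing in for an
    ipfw lookup table (hypothetical contents, see below).
    """
    ip = ipaddress.ip_address(addr)
    best = None  # (prefix length, value) of the best match so far
    for prefix, value in table.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, value)
    return None if best is None else best[1]

def select_pipe(src, dst, direction, table4, table5):
    """Model of the two shaping rules; the interface match is not modelled."""
    if direction == "out":
        # 00300 pipe tablearg ip from any to table(4) out via ng*
        return table_lookup(table4, dst)
    if direction == "in":
        # 00301 pipe tablearg ip from table(5) to any in via ng*
        return table_lookup(table5, src)
    return None

# Hypothetical table contents: documentation prefixes mapped to pipe numbers.
table4 = {"192.0.2.0/24": 10, "192.0.2.5/32": 20}
table5 = {"192.0.2.0/24": 11}

if __name__ == "__main__":
    print(select_pipe("198.51.100.1", "192.0.2.5", "out", table4, table5))
    print(select_pipe("192.0.2.7", "198.51.100.1", "in", table4, table5))
    print(select_pipe("203.0.113.9", "198.51.100.1", "in", table4, table5))
```

In this model a packet that matches neither table gets `None`, which corresponds to the rule not matching and the packet continuing to the next rule.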