Date: Mon, 28 Jul 2014 10:30:48 -0700
Subject: Re: fastforward/routing: a 3 million packet-per-second system?
From: Adrian Chadd
To: John Jasen
Cc: FreeBSD Net

On 28 July 2014 07:51, John Jasen wrote:
> in_input crept up into the top 5, versus fastforward.
>
> Would PMC counters help?

Not at the moment. This is a lock contention thing, not a pmc thing. I bet if you ran pmc the mutex/rwlock things would be up high. :)

>
> cat debug.lock.pref.stats.out-20140728-1 | sort -nk 4 | tail -5
> 5 4 413 115 160 2 0 0 63 /usr/src/sys/kern/kern_condvar.c:145 (sleep mutex:Giant)
> 1 1 148858 4095 650072 0 0 0 11184 /usr/src/sys/kern/subr_turnstile.c:552 (spin mutex:turnstile chain)
> 8 14 13747639 561636 72520256 0 0 0 689603 /usr/src/sys/net/route.c:439 (sleep mutex:rtentry)
> 3 20 3907071 2322975 72520256 0 0 0 2529589 /usr/src/sys/netinet/ip_input.c:1315 (sleep mutex:rtentry)
> 3 17 3665247 3715117 72520256 0 0 0 8425384 /usr/src/sys/netinet/in_rmx.c:114 (sleep mutex:rtentry)

Try disabling net.inet.ip.redirect (sysctl net.inet.ip.redirect=0). That'll eliminate that in_rmx.c check.

Oh look! The ip_output() path doesn't know about flowtable either. I'm kind of surprised that the 2-tuple flowtable (i.e., keyed only on IPv4 or IPv6 addresses, not TCP/UDP ports) isn't used in the IP forwarding case.

All ip_rtaddr() is doing is a route lookup and taking a reference to the ifa. It's using that for things like the network/netmask on that interface.

Anyway, yeah, it looks like you've hit lock contention on that particular setup. You'll likely get a little more throughput by disabling redirects for now. The real solution is to make the whole rtentry locking less stupid and bottleneck-y, as well as extending the flowtable support to include the ip_forward() path.

-a
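
For anyone wanting to see how the quoted lock-profiling rows point at rtentry, here is a rough Python sketch that parses those five rows and sums the wait time per lock name. The column order (max, wait_max, total, wait_total, count, avg, wait_avg, cnt_hold, cnt_lock, name) is an assumption taken from LOCK_PROFILING(9), not something stated in the message, and the helper names parse() and wait_by_lock() are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed column order (from LOCK_PROFILING(9), not from the message itself).
FIELDS = ("max", "wait_max", "total", "wait_total", "count",
          "avg", "wait_avg", "cnt_hold", "cnt_lock")

# The five rows quoted in the message, with the wrapped lines rejoined.
SAMPLE = """\
5 4 413 115 160 2 0 0 63 /usr/src/sys/kern/kern_condvar.c:145 (sleep mutex:Giant)
1 1 148858 4095 650072 0 0 0 11184 /usr/src/sys/kern/subr_turnstile.c:552 (spin mutex:turnstile chain)
8 14 13747639 561636 72520256 0 0 0 689603 /usr/src/sys/net/route.c:439 (sleep mutex:rtentry)
3 20 3907071 2322975 72520256 0 0 0 2529589 /usr/src/sys/netinet/ip_input.c:1315 (sleep mutex:rtentry)
3 17 3665247 3715117 72520256 0 0 0 8425384 /usr/src/sys/netinet/in_rmx.c:114 (sleep mutex:rtentry)
"""

# Nine integers, then "file:line (lock type:lock name)".
LINE_RE = re.compile(r"^((?:\d+\s+){9})(\S+):(\d+) \(([^:]+):(.+)\)$")


def parse(text):
    """Turn lock-profiling rows into a list of dicts; skip lines that do not match."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        row = dict(zip(FIELDS, (int(n) for n in m.group(1).split())))
        row["file"] = m.group(2)
        row["line"] = int(m.group(3))
        row["lock_type"] = m.group(4)
        row["lock_name"] = m.group(5)
        rows.append(row)
    return rows


def wait_by_lock(rows):
    """Sum wait_total per lock name, so acquisition sites sharing a lock are combined."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["lock_name"]] += row["wait_total"]
    return dict(totals)


if __name__ == "__main__":
    rows = parse(SAMPLE)
    totals = wait_by_lock(rows)
    for name, waited in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:20s} {waited}")
```

Grouping by lock name rather than by file:line is a deliberate choice here: the three rtentry rows come from different acquisition sites (route.c, ip_input.c, in_rmx.c) but contend on the same class of lock, which is the point made above.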