From owner-freebsd-net@FreeBSD.ORG Sun Aug 18 13:52:04 2013
Message-ID: <1376833738.94737.YahooMailNeo@web121605.mail.ne1.yahoo.com>
Date: Sun, 18 Aug 2013 06:48:58 -0700 (PDT)
From: Barney Cordoba
Subject: Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux)
To: Adrian Chadd
Cc: Lawrence Stewart, Luigi Rizzo, FreeBSD Net
Reply-To: Barney Cordoba
List-Id: Networking and TCP/IP with FreeBSD

________________________________
From: Adrian Chadd
To: Barney Cordoba
Cc: Luigi Rizzo; Lawrence Stewart; FreeBSD Net
Sent: Saturday, August 17, 2013 11:59 AM
Subject: Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux)

... we get perfectly good throughput without 400k ints a second on the ixgbe driver.

As in, I can easily saturate 2 x 10GE on ixgbe hardware with a handful of flows. That's not terribly difficult.

However, there are a few interesting problems that need addressing:

* There's lock contention between the transmit side from userland and the TCP timers, and the receive side with ACK processing. Under very high traffic load a lot of lock contention stalls things. We (the royal "we", I'm mostly just doing tooling at the moment) are working on that.
* There's lock contention on the ARP, routing table and PCB lookups. The latter will go away when we've finally implemented RSS for transmit and receive and then moved things over to using PCB groups on CPUs which have NIC driver threads bound to them.
* There's increasing cache thrashing from a larger workload, causing the expensive lookups to be even more expensive.
* All the list walks suck. We need to be batching things so we use CPU caches much more efficiently.

The idea of using TSO on the transmit side and generic LRO on the receive side is to make the per-packet overhead less. I think we can be much more efficient in general in packet processing, but that's a big task. :-) So, using at least TSO is a big benefit if purely to avoid decomposing things into smaller mbufs and contending on those locks in a very big way.

I'm working on PMC to make it easier to use to find these bottlenecks and make the code and data more efficient. Then, likely, I'll end up hacking on generic TSO/LRO, TX/RX RSS queue management and make the PCB group thing default on for SMP machines. I may even take a knife to some of the packet processing overhead.

-------------------------------

The ints/sec reference was based on Luigi's implication that turning off moderation was some sort of performance choice.

Again, you're talking "throughput" and not efficiency. I could fill a tx queue with 10gb of traffic with yesteryear's cpus. It's not an achievement. Being able to bridge real traffic at 10gb/s with 2 cores is.

BC

From owner-freebsd-net@FreeBSD.ORG Sun Aug 18 14:25:35 2013
Message-Id: <71EA3DFB-B410-432D-98E0-B6341556BE6D@netgate.com>
In-Reply-To: <1376833738.94737.YahooMailNeo@web121605.mail.ne1.yahoo.com>
From: Jim Thompson
Subject: Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux)
Date: Sun, 18 Aug 2013 09:25:27 -0500
To: Barney Cordoba
Cc: Lawrence Stewart, Adrian Chadd, Luigi Rizzo, FreeBSD Net

On Aug 18, 2013, at 8:48 AM, Barney Cordoba wrote:

> I could fill a tx queue with 10gb of traffic with yesteryear's cpus. It's not an achievement. Being able to bridge
> real traffic at 10gb/s with 2 cores is

Or forward at layer 3.
Or filter packets.
Or IPSEC.
Or...
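
The throughput-versus-efficiency point in this thread can be made concrete with some back-of-the-envelope arithmetic. The sketch below does not come from any of the posters. It assumes hypothetical 3 GHz cores, a 1448-byte TCP payload per full-size frame, and a 64 KiB TSO chunk, and it counts the usual 20 bytes of preamble and inter-frame gap per Ethernet frame. The function names are made up for illustration.

```python
# Back-of-the-envelope per-packet budget for a saturated Ethernet link.
# All constants here are illustrative assumptions, not measurements.

WIRE_OVERHEAD_BYTES = 20  # preamble + SFD (8 bytes) + inter-frame gap (12 bytes)


def frames_per_second(link_bps, frame_bytes):
    """Frames per second at line rate for a given frame size (FCS included)."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


def cycles_per_frame(link_bps, frame_bytes, cores, core_hz):
    """CPU cycles available per frame if `cores` cores share the load evenly."""
    return cores * core_hz / frames_per_second(link_bps, frame_bytes)


def send_path_reduction(tso_chunk_bytes, mss_bytes):
    """How many MSS-sized segments a single TSO chunk stands in for."""
    return tso_chunk_bytes / mss_bytes


if __name__ == "__main__":
    link = 10e9   # 10 Gb/s link
    clock = 3e9   # hypothetical 3 GHz cores (assumption)
    for size in (64, 1518):
        print(size,
              round(frames_per_second(link, size)),
              round(cycles_per_frame(link, size, 2, clock), 1))
    # Assumed 64 KiB TSO chunk versus a 1448-byte payload per segment.
    print(round(send_path_reduction(65536, 1448), 1))
```

Under these assumptions, minimum-size frames leave only a few hundred cycles per packet across two cores, while handing the NIC large TSO chunks cuts the number of trips through the send path by a factor of roughly 45. That gap is one way to read the difference between filling a transmit queue and bridging, forwarding or filtering real traffic.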