From: Scott Long
Date: Sat, 14 Dec 2013 14:58:37 -0700
Subject: Re: buf_ring in HEAD is racy
To: Ryan Stone
Cc: FreeBSD Net
Message-Id: <2002669A-DDE0-470A-A558-F812EA5D59F0@yahoo.com>
List-Id: Networking and TCP/IP with FreeBSD

We see regular buf_ring drops at Netflix as well, but had always assumed
that it was because we were overfilling the ring.  I'll take a closer
look now.

Scott

On Dec 13, 2013, at 10:04 PM, Ryan Stone wrote:

> I am seeing spurious output packet drops that appear to be due to
> insufficient memory barriers in buf_ring.  I believe that this is the
> scenario that I am seeing:
>
> 1) The buf_ring is empty, br_prod_head = br_cons_head = 0
> 2) Thread 1 attempts to enqueue an mbuf on the buf_ring.  It fetches
> br_prod_head (0) into a local variable called prod_head
> 3) Thread 2 enqueues an mbuf on the buf_ring.  The sequence of events
> is essentially:
>
> Thread 2 claims an index in the ring and atomically sets br_prod_head (say to 1)
> Thread 2 sets br_ring[1] = mbuf;
> Thread 2 does a full memory barrier
> Thread 2 updates br_prod_tail to 1
>
> 4) Thread 2 dequeues the packet from the buf_ring using the
> single-consumer interface.
> The sequence of events is essentially:
>
> Thread 2 checks whether queue is empty (br_cons_head == br_prod_tail),
> this is false
> Thread 2 sets br_cons_head to 1
> Thread 2 grabs the mbuf from br_ring[1]
> Thread 2 sets br_cons_tail to 1
>
> 5) Thread 1, which is still attempting to enqueue an mbuf on the ring,
> fetches br_cons_tail (1) into a local variable called cons_tail.  It
> sees cons_tail == 1 but prod_head == 0 and concludes that the ring is
> full and drops the packet (incrementing br_drops non-atomically, I might
> add)
>
>
> I can reproduce several drops per minute by configuring the ixgbe
> driver to use only 1 queue and then sending traffic from 8 concurrent
> iperf processes.  (You will need this hacky patch to even see the
> drops with netstat, though:
> http://people.freebsd.org/~rstone/patches/ixgbe_br_drops.diff)
>
> I am investigating fixing buf_ring by using acquire/release semantics
> rather than load/store barriers.  However, I note that this will
> apparently be the second attempt to fix buf_ring, and I'm seriously
> questioning whether this is worth the effort compared to the
> simplicity of using a mutex.  I'm not even convinced that a correct
> lockless implementation will even be a performance win, given the
> number of memory barriers that will apparently be necessary.
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
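
The five-step interleaving in Ryan's report can be replayed deterministically
in a single thread, which makes the stale-index problem easier to see.  The
sketch below is only a simplified model, not the kernel code: struct
ring_model, looks_full() and the other helper names are hypothetical, the
"full" test is assumed to be ((prod_head + 1) & mask) == cons_tail, and the
model stores each buffer at the old head index.  It uses no atomics or
barriers at all; it only shows why mixing an old prod_head with a new
cons_tail makes an empty ring look full.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the ring indices (not struct buf_ring). */
struct ring_model {
	uint32_t prod_head, prod_tail;
	uint32_t cons_head, cons_tail;
	uint32_t mask;			/* ring size - 1; size is a power of two */
	void *slots[8];
};

/* Assumed "ring is full" test: one slot is always left unused. */
static int
looks_full(uint32_t prod_head, uint32_t cons_tail, uint32_t mask)
{
	return (((prod_head + 1) & mask) == cons_tail);
}

/* Step 3: Thread 2 performs a complete enqueue. */
static void
model_enqueue(struct ring_model *r, void *buf)
{
	uint32_t head = r->prod_head;

	assert(buf != NULL);
	r->prod_head = (head + 1) & r->mask;	/* claim the slot */
	r->slots[head] = buf;			/* publish the buffer */
	r->prod_tail = r->prod_head;		/* make it visible to the consumer */
}

/* Step 4: Thread 2 performs a complete single-consumer dequeue. */
static void *
model_dequeue_sc(struct ring_model *r)
{
	uint32_t head = r->cons_head;
	void *buf;

	if (head == r->prod_tail)
		return (NULL);			/* empty */
	r->cons_head = (head + 1) & r->mask;
	buf = r->slots[head];
	r->cons_tail = r->cons_head;
	return (buf);
}

/* Replays steps 1-5; returns ENOBUFS if Thread 1 wrongly decides "full". */
static int
replay_stale_head(void)
{
	struct ring_model r = { .mask = 7 };	/* step 1: all indices are 0 */
	static int pkt;				/* placeholder for an mbuf */
	uint32_t stale_prod_head, cons_tail;

	stale_prod_head = r.prod_head;		/* step 2: Thread 1 reads 0 */
	model_enqueue(&r, &pkt);		/* step 3 */
	(void)model_dequeue_sc(&r);		/* step 4: ring is empty again */
	cons_tail = r.cons_tail;		/* step 5: Thread 1 reads 1 */
	return (looks_full(stale_prod_head, cons_tail, r.mask) ? ENOBUFS : 0);
}

/* Same replay, but the check uses a prod_head read after cons_tail. */
static int
replay_fresh_head(void)
{
	struct ring_model r = { .mask = 7 };
	static int pkt;

	model_enqueue(&r, &pkt);
	(void)model_dequeue_sc(&r);
	return (looks_full(r.prod_head, r.cons_tail, r.mask) ? ENOBUFS : 0);
}
```

In this model replay_stale_head() reports ENOBUFS even though the ring is
empty, while replay_fresh_head() does not.  That contrast is only meant to
illustrate the inconsistent snapshot; it is not the acquire/release fix
being investigated in the thread.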