From owner-freebsd-transport@freebsd.org Sat Nov 14 23:50:59 2015
From: Pedro Giffuni
To: Kevin Bowling
Cc: transport@freebsd.org
Subject: Re: Updates to the tcp_rfc_compliance wiki
Date: Sat, 14 Nov 2015 18:50:47 -0500

> On 13 Nov 2015, at 16:56, Kevin Bowling wrote:
>
> I dug through some other projects including illumos recently as well.
> If you check commit logs, Sun had quite a bit going on prior to Oracle,
> and then all "transport" activity stopped in 2010.

Yes, Illumos hasn't changed much there, so looking in there basically
only gives hints on the older Solaris implementation. It is difficult
to say exactly what happened in the Solaris implementation since the
source is now closed, but you could also say it has been losing
relevance. It seems unlikely that Oracle has been keeping up to date
with the standards, but Illumos still remains an interesting reference.

By updating the list of RFCs I can see the transport group was created
at a very good moment. Many of the traditional RFCs have received
important revisions, and extensions that were suggested previously have
been obsoleted and even deemed insecure.

Perhaps RFC 7414, the TCP Roadmap, is a good starting point to plan
future work.

Regards,

Pedro.

> Regards,
>
> Kevin Bowling - Systems Software Team - P: 480-227-1233
>
> On Fri, Nov 13, 2015 at 1:20 PM, Pedro Giffuni wrote:
> Hello;
>
> After digging into some illumos headers and checking the RFC list for
> updates, I did some bold updates to the Wiki.
>
> Most notable changes:
> I added RFC3168 to the list: Illumos implemented the now obsolete RFC2481
>
> I tagged many of the now obsolete RFCs and added the new ones. If we were
> supporting the obsolete versions I tagged the new ones "partially supported"
> while the updates are confirmed/done.
>
> I was unsure if I should add NetBIOS (RFC1001 and RFC1002), as illumos
> has them but I suspect they are not a high priority here.
>
> Enjoy ... it seems like a lot of fun will be happening here :).
>
> Pedro.

From owner-freebsd-transport@freebsd.org Mon Nov 16 08:13:44 2015
From: hiren panchasara
To: Pedro Giffuni
Cc: transport@FreeBSD.org
Subject: Re: Updates to the tcp_rfc_compliance wiki
Date: Mon, 16 Nov 2015 00:13:38 -0800
Message-ID: <20151116081338.GQ29829@strugglingcoder.info>

On 11/13/15 at 03:20P, Pedro Giffuni wrote:
> Hello;
>
> After digging into some illumos headers and checking the RFC list for
> updates, I did some bold updates to the Wiki.
>
> Most notable changes:
> I added RFC3168 to the list: Illumos implemented the now obsolete RFC2481
>
> I tagged many of the now obsolete RFCs and added the new ones. If we were
> supporting the obsolete versions I tagged the new ones "partially supported"
> while the updates are confirmed/done.

Thanks for the updates. That list needs a lot of work. Also don't
hesitate to update the rfc notes section if you have something useful.
>
> I was unsure if I should add NetBIOS (RFC1001 and RFC1002), as illumos
> has them but I suspect they are not a high priority here.
>
> Enjoy ... it seems like a lot of fun will be happening here :).

That is the plan. :-)

Cheers,
Hiren

From owner-freebsd-transport@freebsd.org Mon Nov 16 18:06:45 2015
From: hiren panchasara
To: Jonathan Looney
Cc: Randall Stewart, FreeBSD Transports
Subject: Re: Maintaining dupack counter per hole (was: The trouble with sack..)
Date: Mon, 16 Nov 2015 10:06:44 -0800
Message-ID: <20151116180644.GU29829@strugglingcoder.info>
References: <20151030062423.GB5261@strugglingcoder.info>

Getting back to this old thread...

On 10/30/15 at 01:09P, Jonathan Looney wrote:
> On 10/30/15, 2:24 AM, "hiren panchasara" <hiren@strugglingcoder.info> wrote:
>
> >(Something Randall and I discussed today)
> >
> >On 10/07/15 at 12:17P, Randall Stewart via freebsd-transport wrote:
> >>
> >> 3) When we have more than one hole the goal of SACK was to retransmit
> >> every time that a hole had 3 dup-acks so that one could recover
> >> multiple blocks that were lost. We just plain don't track dup-acks
> >> per hole.
> >> We do continue to count, but we will wait to retransmit anything
> >> until after we have drained 1/2 the data in flight from the network
> >> at a minimum. And only then do we start incrementing cwnd (remember
> >> we crashed it to 1 MTU) so that we can retransmit. There may be some
> >> other twists in the code that we are missing but this is what we
> >> believe (this code could probably win the C obfuscation contest if
> >> someone were willing to enter it :-D)
> >
> >Wondering if we can add this dupack counter in struct sackhole {} and
> >every time we process acks with sack in tcp_sack_doack(), we increment
> >this counter if the same hole appears again. And retransmit it on (or
> >after?) 3rd dupack.
>
> The SACK hole-tracking code is already quite complex. If we're going to
> make a fundamental change, perhaps it is time to consider a rewrite,
> rather than a smaller patch? Maybe this is the best code we can write. Or,
> maybe it is time for a re-coding to make it more easily accessible.

As Randall said (in response to this), that rewrite is not necessary.
And I agree with that. I don't see tcp_sack.c being in such bad shape
that it demands a rewrite. But ofc, if you have something really
cleaner/faster in mind, I'm all ears. :-)

> In any case, how do you propose tracking holes that are carved up by later
> packets?

I think this scenario is rather easy, as a single hole is being carved
up below. Let me know if I am missing anything obvious.

> E.g.
>
> Hole is 1:1500.

hole1: strike=1

> Then, you receive a packet with 500:750, leaving two holes.

subhole 1:500 strike=2, subhole 750:1500 strike=2

> Then, you receive a packet with 1000:1250, leaving three holes.

subhole 1:500 strike=3, subhole 750:1000 strike=3, subhole 1250:1500 strike=3

> Do you charge all three holes with the duplicate ACKs? Do you copy the
> counter to the holes?
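
The carving example above can be modelled in a few lines of C. This is only an illustrative sketch under stated assumptions: `struct hole`, `struct scoreboard` and `sack_block()` are hypothetical names, not the `struct sackhole` or `tcp_sack_doack()` code in tcp_sack.c. It assumes half-open byte ranges, that surviving subholes inherit the parent's strike count, and that every SACK block charges each surviving hole with one strike.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOLES 16

/* Hypothetical model of a SACK hole; not FreeBSD's struct sackhole. */
struct hole {
	unsigned start;		/* first missing byte */
	unsigned end;		/* one past the last missing byte */
	int strikes;		/* dupack-style strikes charged so far */
};

struct scoreboard {
	struct hole holes[MAX_HOLES];
	size_t n;
};

/*
 * Apply one SACK block [s, e): carve it out of every hole it overlaps,
 * letting the surviving pieces inherit the parent's strike count, then
 * charge every surviving hole with one more strike.
 */
static void
sack_block(struct scoreboard *sb, unsigned s, unsigned e)
{
	struct hole out[MAX_HOLES];
	size_t i, m = 0;

	for (i = 0; i < sb->n; i++) {
		struct hole h = sb->holes[i];

		if (e <= h.start || s >= h.end) {	/* no overlap */
			assert(m < MAX_HOLES);
			out[m++] = h;
			continue;
		}
		if (s > h.start) {		/* piece left of the block */
			assert(m < MAX_HOLES);
			out[m++] = (struct hole){ h.start, s, h.strikes };
		}
		if (e < h.end) {		/* piece right of the block */
			assert(m < MAX_HOLES);
			out[m++] = (struct hole){ e, h.end, h.strikes };
		}
	}
	for (i = 0; i < m; i++) {
		out[i].strikes++;		/* one strike per SACK block */
		sb->holes[i] = out[i];
	}
	sb->n = m;
}
```

Starting from a single hole 1:1500 with strike=1 and applying the blocks 500:750 and then 1000:1250 leaves three holes (1:500, 750:1000, 1250:1500), each with strike=3, which matches the counts annotated in the example.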
>
> Or, is the fact that the ACK is slightly different enough to reset the
> counter?

Basically, whenever a hole gets broken up, the subholes carry forward
the dupack strike counter.

> If you reset the counter anytime the hole is broken up, it will take a
> while to get to three in a really out-of-order network scenario. On the
> other hand, if you don't reset the counter, you may retransmit too fast.

By 'too fast', I think you mean spurious retransmissions. If so, can
you explain a bit more?

> Just my initial reaction...

Thank you for the discussion!

Cheers,
Hiren

From owner-freebsd-transport@freebsd.org Mon Nov 16 19:35:33 2015
From: Jonathan Looney
To: hiren panchasara
Cc: Randall Stewart, FreeBSD Transports
Subject: Re: Maintaining dupack counter per hole (was: The trouble with sack..)
Date: Mon, 16 Nov 2015 19:19:16 +0000
References: <20151030062423.GB5261@strugglingcoder.info> <20151116180644.GU29829@strugglingcoder.info>
In-Reply-To: <20151116180644.GU29829@strugglingcoder.info>

On 11/16/15, 1:06 PM, "hiren panchasara" wrote:

>Getting back to this old thread...
>
>On 10/30/15 at 01:09P, Jonathan Looney wrote:
>>The SACK hole-tracking code is already quite complex. If we're going to
>> make a fundamental change, perhaps it is time to consider a rewrite,
>> rather than a smaller patch? Maybe this is the best code we can write.
>> Or, maybe it is time for a re-coding to make it more easily accessible.
>
>As Randall said (in response to this), that rewrite is not necessary. And
>I agree to that. I don't see tcp_sack.c being in that bad shape that it
>demands a rewrite. But ofc, if you have something really cleaner/faster
>in mind, I'm all ears. :-)

I don't have anything better in mind. :-)

In fact, I think the code performs pretty well. And, that may be
because it prioritizes performance over readability.
But, since I think we've all stared at this code for a long time trying
to figure out how it works (note the repetitive comments about obscure
code), I think it is worth considering whether it is time to try to
rewrite it to be more readable.

Having said that, I don't see any evidence that now is the time. I just
wanted to prompt the discussion. :-)

>>
>> Do you charge all three holes with the duplicate ACKs? Do you copy the
>> counter to the holes?
>>
>> Or, is the fact that the ACK is slightly different enough to reset the
>> counter?
>
>Basically, whenever a hole gets broken up, subholes carry-forward the
>dupack strike counter.

This somewhat makes sense. And, it looks like you are charging all
existing holes for each ACK that doesn't cover them (regardless of the
fact that it isn't really a "duplicate ACK" in this case). That also
generally makes sense.

But, I think we could change this to help cases where packets are
re-ordered.

>>
>> If you reset the counter anytime the hole is broken up, it will take a
>> while to get to three in a really out-of-order network scenario. On the
>> other hand, if you don't reset the counter, you may retransmit too fast.
>
>By 'too fast', I think you mean spurious retransmissions. If so, can you
>explain a bit more?

Sure.

Imagine that my network is load-balancing over a 4xGE bundle, but the
bundle links have different loads and slightly re-order the packets.

The packets arrive at the remote side out of order, like this:

4500:5999

1500:2999

3000:4499

0:1499

The remote side ACKs them in that order and we receive the ACKs in that
order.

After the third ACK, we will view the 0:1499 block as having three
strikes and will retransmit it (assuming that is our threshold), even
though the ACK will arrive momentarily.

Of course, I think that problem already exists with our code today.
But, my question is how the per-hole counter improves this.
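
A small C model makes the spurious-retransmit risk in this example concrete. It is a sketch under assumptions, not FreeBSD code: `reorder_triggers_retransmit()` is a hypothetical helper, segments are numbered by index (0 for 0:1499 up to 3 for 4500:5999), every arrival that does not fill the lowest gap counts as one strike against that gap, and nothing is actually lost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SEGS 64

/*
 * Replay segment arrivals at the receiver and report whether the sender
 * would see `thresh` strikes against the lowest missing segment before
 * that segment finally arrives, i.e. whether pure reordering alone would
 * trigger a retransmit.
 */
static bool
reorder_triggers_retransmit(const unsigned *arrival, size_t n, int thresh)
{
	bool got[MAX_SEGS] = { false };
	size_t next = 0;	/* lowest segment index not yet received */
	int strikes = 0;	/* strikes against segment `next` */
	size_t i;

	assert(n <= MAX_SEGS);
	for (i = 0; i < n; i++) {
		assert(arrival[i] < n);
		got[arrival[i]] = true;
		if (arrival[i] == next) {
			/* Gap filled: the cumulative ACK point advances. */
			while (next < n && got[next])
				next++;
			strikes = 0;
		} else if (++strikes >= thresh) {
			return true;	/* retransmit fires, nothing was lost */
		}
	}
	return false;
}
```

With the arrival order 3, 1, 2, 0 and a threshold of 3 the function returns true; with a threshold of 4 it returns false, because segment 0 arrives before a fourth strike is charged.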
I think the bottom line is that our code today assumes that we need to
do a SACK retransmit (for all outstanding SACK holes) as soon as the
first outstanding byte is not ACKd for three ACKs in a row.

If we instead reset the counter for a hole anytime the hole was
partially ACKd (including being split), we would retransmit less
aggressively. In this example, that would probably be good.

A middle ground is to charge all holes every time we receive an ACK
that does not partially ACK the hole. If a hole is partially ACKd,
neither charge the hole with a strike nor reset the counter.

In any case, I think a rule that any particular byte must be ACKd in
one of the next three packets after a higher byte is ACKd is
potentially too rigid given the myriad ways packets can be re-ordered.

Jonathan

From owner-freebsd-transport@freebsd.org Mon Nov 16 20:41:47 2015
From: hiren panchasara
To: Jonathan Looney
Cc: FreeBSD Transports, Randall Stewart
Subject: Re: Maintaining dupack counter per hole (was: The trouble with sack..)
Date: Mon, 16 Nov 2015 12:41:45 -0800
Message-ID: <20151116204145.GX29829@strugglingcoder.info>
References: <20151030062423.GB5261@strugglingcoder.info> <20151116180644.GU29829@strugglingcoder.info>

On 11/16/15 at 07:19P, Jonathan Looney wrote:
> On 11/16/15, 1:06 PM, "hiren panchasara" wrote:
>
> >Getting back to this old thread...
> >
> >On 10/30/15 at 01:09P, Jonathan Looney wrote:
> >>The SACK hole-tracking code is already quite complex. If we're going to
> >> make a fundamental change, perhaps it is time to consider a rewrite,
> >> rather than a smaller patch? Maybe this is the best code we can write.
> >> Or, maybe it is time for a re-coding to make it more easily accessible.
> >
> >As Randall said (in response to this), that rewrite is not necessary. And
> >I agree to that.
> >I don't see tcp_sack.c being in that bad shape that it
> >demands a rewrite. But ofc, if you have something really cleaner/faster
> >in mind, I'm all ears. :-)
>
> I don't have anything better in mind. :-)
>
> In fact, I think the code performs pretty well. And, that may be because
> it prioritizes performance over readability. But, since I think we've all
> stared at this code for a long time trying to figure out how it works
> (note the repetitive comments about obscure code), I think it is worth
> considering whether it is time to try to rewrite it to be more readable.
>
> Having said that, I don't see any evidence that now is the time. I just
> wanted to prompt the discussion. :-)

Fair enough. And we should look out for such opportunities whenever
possible :-)

> >>
> >> Do you charge all three holes with the duplicate ACKs? Do you copy the
> >> counter to the holes?
> >>
> >> Or, is the fact that the ACK is slightly different enough to reset the
> >> counter?
> >
> >Basically, whenever a hole gets broken up, subholes carry-forward the
> >dupack strike counter.
>
> This somewhat makes sense. And, it looks like you are charging all
> existing holes for each ACK that doesn't cover them (regardless of the
> fact that it isn't really a "duplicate ACK" in this case). That also
> generally makes sense.

Why do you say those are not duplicate acks? I know this now goes into
the definition of a duplicate ack and what we mean by that, but to me
this is *the* benefit we get from SACK: we can identify these guys as
duplicate acks.

> But, I think we could change this to help cases where packets are
> re-ordered.

I disagree with the proposal, with reasons in-line below.

> >> If you reset the counter anytime the hole is broken up, it will take a
> >> while to get to three in a really out-of-order network scenario. On the
> >> other hand, if you don't reset the counter, you may retransmit too fast.
> >
> >By 'too fast', I think you mean spurious retransmissions. If so, can you
> >explain a bit more?
>
> Sure.
>
> Imagine that my network is load-balancing over a 4xGE bundle, but the
> bundle links have different loads and slightly re-order the packets.
>
> The packets arrive at the remote side out of order, like this:
>
> 4500:5999
>
> 1500:2999
>
> 3000:4499
>
> 0:1499
>
> The remote side ACKs them in that order and we receive the ACKs in that
> order.
>
> After the third ACK, we will view the 0:1499 block as having three strikes
> and will retransmit it (assuming that is our threshold), even though the
> ACK will arrive momentarily.
>
> Of course, I think that problem already exists with our code today. But,
> my question is how the per-hole counter improves this.

I don't have an articulated answer for this right now. One clear issue
with the way we do retransmit is that we wait for 1/2 the inflight to
drain. (Look at the 'Compute the amount of data in flight....' comment
and nearby code.)

> I think the bottom
> line is that our code today assumes that we need to do a SACK retransmit
> (for all outstanding SACK holes) as soon as the first outstanding byte is
> not ACKd for three ACKs in a row.

But getting up to this dupack thresh "three" is also broken right now.
A slightly different but related problem. (As explained by Randall in
2) of 'The trouble with sack..' email on this list.)

> If we instead reset the counter for a hole anytime the hole was partially
> ACKd (including being split), we would retransmit less aggressively. In
> this example, that would probably be good.
>
> A middle-ground is to charge all holes every time we receive an ACK that
> does not partially ACK the hole. If a hole is partially ACKd, neither
> charge the hole with a strike nor reset the counter.

This is the idea I don't agree to.
A SACK which partially acks a hole does tell us something important,
and IMO it'd be bad if we don't consider that information and strike
the counter.

> In any case, I think a rule that any particular byte must be ACKd in one
> of the next three packets after a higher byte is ACKd is potentially too
> rigid given the myriad ways packets can be re-ordered.

So, coming back to your example, it's a valid problem. But this is not
a valid solution for it. When you know that you are introducing some
degree of reordering as a sender (as your example shows), the dupack
threshold should be bumped up to that degree.

Now, linux does this with some smart heuristics. It tracks a
per-connection degree of reordering, and the dupack threshold is always
set to that at any point in time. I am wondering if something like that
would help. Or, at least, a sockopt that anyone can set for their
dupack threshold (much simpler than coming up with heuristics logic for
the first pass). This would probably help your case very much.

Apologies if I am all over the place in this response.
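
One way to picture the heuristic idea is the C sketch below. It is hypothetical throughout: the names, the cap of 64 and the update rule are assumptions made for illustration, it does not describe how Linux or FreeBSD implement anything, and it ignores sequence-number wraparound. It remembers the largest reordering distance seen on a connection, in segments, and uses that as the dupack threshold, never going below the classic three.

```c
#include <assert.h>

#define DUPTHRESH_MIN 3		/* classic three-dupack threshold */
#define DUPTHRESH_MAX 64	/* arbitrary cap chosen for this sketch */

/* Hypothetical per-connection reordering state. */
struct reorder_state {
	unsigned degree;	/* largest reordering seen, in segments */
};

/*
 * Record that a never-retransmitted segment starting at `seq` was
 * delivered after data up to `highest_sacked` (exclusive) had already
 * been SACKed. Sequence wraparound is deliberately ignored here.
 */
static void
reorder_observe(struct reorder_state *rs, unsigned seq,
    unsigned highest_sacked, unsigned mss)
{
	unsigned dist;

	assert(mss > 0);
	if (highest_sacked <= seq)
		return;		/* delivered in order: nothing to learn */
	dist = (highest_sacked - seq + mss - 1) / mss;	/* round up */
	if (dist > rs->degree)
		rs->degree = dist;
}

/* Dupack threshold to use: the observed degree, clamped to [MIN, MAX]. */
static unsigned
dupack_threshold(const struct reorder_state *rs)
{
	unsigned t = rs->degree;

	if (t < DUPTHRESH_MIN)
		t = DUPTHRESH_MIN;
	if (t > DUPTHRESH_MAX)
		t = DUPTHRESH_MAX;
	return t;
}
```

For the 4xGE example, observing segment 0 delivered after data up to byte 6000 had been SACKed, with a 1500-byte MSS, gives a distance of 4 segments, so the threshold becomes 4 and three strikes alone would no longer trigger a retransmit.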

Cheers,
Hiren

From owner-freebsd-transport@freebsd.org Mon Nov 16 21:22:34 2015
From: Jonathan Looney
To: hiren panchasara
Cc: FreeBSD Transports, Randall Stewart
Subject: Re: Maintaining dupack counter per hole (was: The trouble with sack..)
Date: Mon, 16 Nov 2015 21:22:24 +0000
References: <20151030062423.GB5261@strugglingcoder.info> <20151116180644.GU29829@strugglingcoder.info> <20151116204145.GX29829@strugglingcoder.info>
In-Reply-To: <20151116204145.GX29829@strugglingcoder.info>

On 11/16/15, 3:41 PM, "hiren panchasara" wrote:

>On 11/16/15 at 07:19P, Jonathan Looney wrote:
>>This somewhat makes sense.
>> And, it looks like you are charging all
>> existing holes for each ACK that doesn't cover them (regardless of the
>> fact that it isn't really a "duplicate ACK" in this case). That also
>> generally makes sense.
>
>Why do you say those are not duplicate acks? I know this now goes into
>the definition of a duplicate ack and what we mean by that, but to me
>this is *the* benefit we get from SACK: that we can identify these guys
>as duplicate acks.

Right. What is a duplicate ACK?

Actually, SACK lets us differentiate between these ACKs. Without SACK, we would see three ACKs for the same segment. We wouldn't know that the receiver had actually received three later segments. However, with SACK, we have more information about what the remote side actually received, and we can (and should) use that to make intelligent decisions about what to retransmit and when.

I say these aren't duplicate ACKs because they are selectively ACKing different data. I know the code considers them "duplicate" because the ACK field in the TCP header is unchanged. But the SACK option gives us further information.

>>
>>If we instead reset the counter for a hole anytime the hole was partially
>> ACKd (including being split), we would retransmit less aggressively. In
>> this example, that would probably be good.
>>
>> A middle-ground is to charge all holes every time we receive an ACK that
>> does not partially ACK the hole. If a hole is partially ACKd, neither
>> charge the ACK with a strike nor reset the counter.
>
>This is the idea I don't agree to. A SACK that partially acks a hole does tell
>us something important, and IMO it'd be bad if we don't consider that
>information and strike the counter.
[...]
>So, coming back to your example, it's a valid problem. But this is
>not a valid solution for it.

Why not? I don't have much invested in the idea. But I'd like to understand your reasoning.
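The three strike-counting variants discussed above can be put side by side in a small sketch. This is only an illustration under stated assumptions: the `hole` struct, the policy names, and `hole_account()` are hypothetical and not taken from the FreeBSD SACK code. Sequence numbers are compared as plain integers (real code would need wraparound-safe comparisons), and the hole is not shrunk or split here; the caller is assumed to have already removed holes that the SACK block fills completely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the three policies discussed in the thread. */
enum strike_policy {
	STRIKE_ALWAYS,           /* every such ACK charges the hole */
	STRIKE_RESET_ON_PARTIAL, /* a partial ACK of the hole resets its counter */
	STRIKE_SKIP_ON_PARTIAL   /* middle ground: partial ACK neither charges nor resets */
};

/* One unacknowledged gap [start, end) in the sender's scoreboard. */
struct hole {
	uint32_t start;
	uint32_t end;
	uint32_t strikes;
};

/* True if the newly SACKed range [s, e) overlaps the hole. */
static bool
sack_touches_hole(const struct hole *h, uint32_t s, uint32_t e)
{
	return (s < h->end && e > h->start);
}

/* Account one incoming SACK block [s, e) against a still-open hole. */
static void
hole_account(struct hole *h, uint32_t s, uint32_t e, enum strike_policy p)
{
	bool partial = sack_touches_hole(h, s, e);

	switch (p) {
	case STRIKE_ALWAYS:
		h->strikes++;
		break;
	case STRIKE_RESET_ON_PARTIAL:
		if (partial)
			h->strikes = 0;
		else
			h->strikes++;
		break;
	case STRIKE_SKIP_ON_PARTIAL:
		if (!partial)
			h->strikes++;
		break;
	}
}
```

With a hole covering 100 to 200 and two strikes already recorded, a SACK block for 150 to 160 leaves the count at 3, 0, or 2 under the three policies respectively, which is the difference in retransmit aggressiveness being debated.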
>When you know that you are introducing some
>degree of reordering as a sender (as your example shows), the dupack
>threshold should be bumped up to that degree.

That's fine. Anything that lets us use the SACK data to make intelligent decisions. My main argument is that a fixed "retransmit after X ACKs" rule is insufficient, whether applied on a per-hole or per-tcpcb basis.

> Now, Linux does this with some smart heuristics. It tracks the degree of
> reordering per connection, and the dupack threshold is always set to
> that degree at any point in time. I am wondering if something like that would
> help. Or, at least, a sockopt that anyone can set for their dupack
> threshold (much simpler than coming up with heuristics logic for the
> first pass). This would probably help your case very much.

A sockopt would be simpler, but I don't think it helps my example unless I implement user-space logic to track reordering (which seems like it would be hard to do well).

This is why I like the idea of changing the way we count strikes. I don't think it is as good as the Linux heuristics sound, or even as good as the Google RACK algorithm may be. But I think we could improve on the strike-counting code to make our SACK performance better than what we have now (even after we fix the bugs ;-) ).
Jonathan From owner-freebsd-transport@freebsd.org Mon Nov 16 23:44:26 2015 Return-Path: Delivered-To: freebsd-transport@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4C002A30A4A for ; Mon, 16 Nov 2015 23:44:26 +0000 (UTC) (envelope-from hiren@strugglingcoder.info) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 319491F74 for ; Mon, 16 Nov 2015 23:44:26 +0000 (UTC) (envelope-from hiren@strugglingcoder.info) Received: by mailman.ysv.freebsd.org (Postfix) id 2CFA7A30A49; Mon, 16 Nov 2015 23:44:26 +0000 (UTC) Delivered-To: transport@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2B8CFA30A48 for ; Mon, 16 Nov 2015 23:44:26 +0000 (UTC) (envelope-from hiren@strugglingcoder.info) Received: from mail.strugglingcoder.info (strugglingcoder.info [65.19.130.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.strugglingcoder.info", Issuer "mail.strugglingcoder.info" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 131AF1F72 for ; Mon, 16 Nov 2015 23:44:25 +0000 (UTC) (envelope-from hiren@strugglingcoder.info) Received: from localhost (unknown [10.1.1.3]) (Authenticated sender: hiren@strugglingcoder.info) by mail.strugglingcoder.info (Postfix) with ESMTPA id 3EAD1C3A90; Mon, 16 Nov 2015 15:44:24 -0800 (PST) Date: Mon, 16 Nov 2015 15:44:24 -0800 From: hiren panchasara To: Jonathan Looney Cc: FreeBSD Transports , Randall Stewart Subject: Re: Maintaining dupack counter per hole (was: The trouble with sack..) 
Message-ID: <20151116234424.GA29829@strugglingcoder.info> References: <20151030062423.GB5261@strugglingcoder.info> <20151116180644.GU29829@strugglingcoder.info> <20151116204145.GX29829@strugglingcoder.info> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="6nSyB+bcl/pT7+kx" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-transport@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussions of transport level network protocols in FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 16 Nov 2015 23:44:26 -0000 --6nSyB+bcl/pT7+kx Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On 11/16/15 at 09:22P, Jonathan Looney wrote:
> On 11/16/15, 3:41 PM, "hiren panchasara" wrote:
>
> >On 11/16/15 at 07:19P, Jonathan Looney wrote:
> >>This somewhat makes sense. And, it looks like you are charging all
> >> existing holes for each ACK that doesn't cover them (regardless of the
> >> fact that it isn't really a "duplicate ACK" in this case). That also
> >> generally makes sense.
> >
> >Why do you say those are not duplicate acks? I know this now goes into
> >the definition of a duplicate ack and what we mean by that, but to me
> >this is *the* benefit we get from SACK: that we can identify these guys
> >as duplicate acks.
>
> Right. What is a duplicate ACK?
>
> I say these aren't duplicate ACKs because they are selectively ACKing
> different data.

I think RFC 5681 makes it pretty clear that those are duplicate acks:

    Alternatively, a TCP that utilizes selective acknowledgments (SACKs)
    [RFC2018, RFC2883] can leverage the SACK information to determine
    when an incoming ACK is a "duplicate" (e.g., if the ACK contains
    previously unknown SACK information).
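The two readings of "duplicate ACK" being argued over can be written down as predicates. This is a sketch, not FreeBSD's `tcp_input` logic: the `ack_info` struct and both function names are hypothetical. The first predicate follows the five conditions in RFC 5681's definition of a duplicate acknowledgment; the second is one possible reading of the sentence quoted above, where new SACK information alone is enough.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of one incoming ACK plus the sender's prior state. */
struct ack_info {
	bool     has_outstanding_data; /* sender has unacknowledged data in flight */
	uint32_t payload_len;          /* bytes of data carried by this segment */
	bool     syn_or_fin;           /* SYN or FIN flag set */
	uint32_t ack;                  /* cumulative ACK field of this segment */
	uint32_t last_ack;             /* greatest cumulative ACK seen so far */
	uint32_t win;                  /* advertised window of this segment */
	uint32_t last_win;             /* advertised window of the previous ACK */
	bool     new_sack_info;        /* SACK blocks report not-yet-known data */
};

/* Classic test: all five RFC 5681 conditions must hold. */
static bool
is_dupack_classic(const struct ack_info *a)
{
	return (a->has_outstanding_data &&
	    a->payload_len == 0 &&
	    !a->syn_or_fin &&
	    a->ack == a->last_ack &&
	    a->win == a->last_win);
}

/* SACK-based test (an assumption of this sketch): new SACK info suffices. */
static bool
is_dupack_sack(const struct ack_info *a)
{
	return (a->has_outstanding_data && a->new_sack_info);
}
```

The two tests diverge for an ACK that changes the advertised window while also reporting new SACK blocks: the classic test rejects it, the SACK-based one counts it.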
> >>
> >>If we instead reset the counter for a hole anytime the hole was partially
> >> ACKd (including being split), we would retransmit less aggressively. In
> >> this example, that would probably be good.
> >>
> >> A middle-ground is to charge all holes every time we receive an ACK that
> >> does not partially ACK the hole. If a hole is partially ACKd, neither
> >> charge the ACK with a strike nor reset the counter.
> >
> >This is the idea I don't agree to. A SACK that partially acks a hole does tell
> >us something important, and IMO it'd be bad if we don't consider that
> >information and strike the counter.
> [...]
> >So, coming back to your example, it's a valid problem. But this is
> >not a valid solution for it.
>
> Why not? I don't have much invested in the idea. But I'd like to
> understand your reasoning.
>

All I meant was that the solution we are talking about here, a per-hole dupack strike counter, is not going to solve the reordering problem you presented. Not directly, at least. Until we have an intelligent (or at least a little better) way of detecting reordering and adjusting our dupack threshold, that problem is not going to get solved.

Correcting how we increment the dupack counter, so that we "detect" loss fast enough, is also only going to help so much, because the critical piece is: what should we set our dupack threshold to? It has to correlate with the degree of reordering present on that connection. In the absence of that, you'd always run the risk of being too aggressive or of not reacting fast enough to the loss.

> > Now, Linux does this with some smart heuristics. It tracks the degree of
> > reordering per connection, and the dupack threshold is always set to
> > that degree at any point in time. I am wondering if something like that would
> > help. Or, at least, a sockopt that anyone can set for their dupack
> > threshold (much simpler than coming up with heuristics logic for the
> > first pass). This would probably help your case very much.
>
> A sockopt would be simpler, but I don't think it helps my example unless I
> implement user-space logic to track reordering (which seems like it would
> be hard to do well).

Yes, I agree, and I can see the gotchas in that approach. But I'd love to have a global sysctl that lets me bump that value for a box which is sitting behind such load balancers and which I know has a higher probability of sending out-of-order packets.

Another scenario is where, on some networks, I see that the degree of reordering is always > 3 no matter what. Again, I'd love to have a way to try 4 or 5 and see if it behaves better. Just thinking out loud here.

>
> This is why I like the idea of changing the way we count strikes. I don't
> think it is as good as the Linux heuristics sound, or even as good as the Google
> RACK algorithm may be. But I think we could improve on the
> strike-counting code to make our SACK performance better than what we have
> now (even after we fix the bugs ;-) ).

We violently agree here that we need to fix the strike counter :-)

Cheers,
Hiren

--6nSyB+bcl/pT7+kx Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQF8BAABCgBmBQJWSmpUXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRBNEUyMEZBMUQ4Nzg4RjNGMTdFNjZGMDI4 QjkyNTBFMTU2M0VERkU1AAoJEIuSUOFWPt/lafMH/0aM6I4IKHHYxZJtcpmiAvFC 83+BQiNjGfDVAsKGsTqel0DSZrvElA+w85Oq67TNeaTp9iVNeaAV8ZyvKsPDsa7b 9vfGcJdzsf6kdvOYFaZ76peRdYnP1bbuj+RNKkMgjXgjjzMJujo4uPXf82l0Mu/u eFA6InaP6qehnigkaM/qaPN2PBB0qjlyjqCsbUl3ZId2UeBRCgRgQpeXZ/uHKIsr tGOrskVP+n29r6LGXyLhgRpwXc3ntWO0XqB7uJr++hxk0Qr8NK9mdeJ8jZ8rR/xj UrVkYVCWbHUtSvdj0szN/cHPyJq0G5X6FUzABlHoIOmIN7A8dCSroHXZnRrb4r8= =ComK -----END PGP SIGNATURE----- --6nSyB+bcl/pT7+kx-- From owner-freebsd-transport@freebsd.org Thu Nov 19 23:14:42 2015 Return-Path: Delivered-To: freebsd-transport@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id
B9C38A327C6 for ; Thu, 19 Nov 2015 23:14:42 +0000 (UTC) (envelope-from hiren@strugglingcoder.info) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id A1BFD1554 for ; Thu, 19 Nov 2015 23:14:42 +0000 (UTC) (envelope-from hiren@strugglingcoder.info) Received: by mailman.ysv.freebsd.org (Postfix) id 9DFE4A327C3; Thu, 19 Nov 2015 23:14:42 +0000 (UTC) Delivered-To: transport@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9C610A327C2; Thu, 19 Nov 2015 23:14:42 +0000 (UTC) (envelope-from hiren@strugglingcoder.info) Received: from mail.strugglingcoder.info (strugglingcoder.info [65.19.130.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.strugglingcoder.info", Issuer "mail.strugglingcoder.info" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 8B1611553; Thu, 19 Nov 2015 23:14:41 +0000 (UTC) (envelope-from hiren@strugglingcoder.info) Received: from localhost (unknown [10.1.1.3]) (Authenticated sender: hiren@strugglingcoder.info) by mail.strugglingcoder.info (Postfix) with ESMTPA id 7DB56C4946; Thu, 19 Nov 2015 15:14:40 -0800 (PST) Date: Thu, 19 Nov 2015 15:14:40 -0800 From: hiren panchasara To: Randall Stewart Cc: FreeBSD Transports , freebsd-net@FreeBSD.org Subject: dupack counter processing Message-ID: <20151119231440.GE98283@strugglingcoder.info> References: <20151018003740.GE87252@strugglingcoder.info> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="8TaQrIeukR7mmbKf" Content-Disposition: inline In-Reply-To: <20151018003740.GE87252@strugglingcoder.info> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-transport@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussions of transport level network protocols in FreeBSD 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 Nov 2015 23:14:42 -0000 --8TaQrIeukR7mmbKf Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

+ freebsd-net as this deserves more eyeballs, imo.

On 10/17/15 at 05:37P, hiren panchasara wrote:
> On 10/07/15 at 12:17P, Randall Stewart via freebsd-transport wrote:
> >
> > 2) When we recognize a dup-ack we *will not* recognize it if, for example, the rwnd changes, even
> > if new SACK information is reported in the sack blocks. This is due to the fact that in non-SACK you don't
> > (on purpose) recognize ACKs where the window changed (since you can't really tell if it's a
> > plain window update or a dup-ack). This means we occasionally miss out
> > on stroking the dup-ack counter and getting out of recovery....

After a lot of discussion with Randall and colleagues at Limelight, here is the patch for review: https://reviews.freebsd.org/D4225

This changes the default behavior of how we detect loss. I'd appreciate review/comments.

Cheers,
Hiren

--8TaQrIeukR7mmbKf Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQF8BAABCgBmBQJWTlfcXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRBNEUyMEZBMUQ4Nzg4RjNGMTdFNjZGMDI4 QjkyNTBFMTU2M0VERkU1AAoJEIuSUOFWPt/l7tgIAKfxmxuhg/qh1TFQsb2fVk8m UYTHkQkeFmr7hGkyDVt4196D9yPzp8Xvdk7L5d5N3jKyErEBzBPlh9rHWVHsijA2 7v1H4KmNf9w8gYVTVSh727DcdWODJtJJEPDCn1bygaRV0UmaVxMjrbOUcub1fU+e EegM/n0eQlbDN2FD/GjLJu8HFw6M25Grwf2r56XO4/l7NvUqZGFxKYaa+mvcIZWe 7XJPsRtIpgKZJ2ZQ51BDzgpUSCl76bEJ9MjipJ6CbsF+Av5HBC7Bm5gu7TZKlCn6 EPWjPr+4AS8BDVTa6Ju92yD5q4SOQmRBjWB0dnr8a8w1vLhSGD1AEz4IJUmoCD8= =WsbP -----END PGP SIGNATURE----- --8TaQrIeukR7mmbKf--