From owner-freebsd-net@freebsd.org  Sun Nov 22 05:29:34 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 12E36A35559
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Sun, 22 Nov 2015 05:29:34 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id F20AF1505
 for <freebsd-net@FreeBSD.org>; Sun, 22 Nov 2015 05:29:33 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tAM5TXIF034719
 for <freebsd-net@FreeBSD.org>; Sun, 22 Nov 2015 05:29:33 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-net@FreeBSD.org
Subject: [Bug 203630] [Hyper-V] [nat] [tcp] 10.2 NAT bug in TCP stack or
 hyperv netsvc driver
Date: Sun, 22 Nov 2015 05:29:33 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.2-RELEASE
X-Bugzilla-Keywords: patch
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: weh@microsoft.com
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: emulation@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: maintainer-feedback? mfc-stable9? mfc-stable10?
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-203630-2472-gHXbuX1AzU@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-203630-2472@https.bugs.freebsd.org/bugzilla/>
References: <bug-203630-2472@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 22 Nov 2015 05:29:34 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203630

--- Comment #17 from Wei Hu <weh@microsoft.com> ---
The fix went into head as r291156. I will merge it to the stable/10 branch in
a week.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-net@freebsd.org  Sun Nov 22 12:06:15 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 29CA6A35F9A
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Sun, 22 Nov 2015 12:06:15 +0000 (UTC)
 (envelope-from daniel.bilik@neosystem.cz)
Received: from mail.neosystem.cz (mail.neosystem.cz
 [IPv6:2001:41d0:2:5ab8::10:15])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id E7E4E1348;
 Sun, 22 Nov 2015 12:06:14 +0000 (UTC)
 (envelope-from daniel.bilik@neosystem.cz)
Received: from mail.neosystem.cz (unknown [127.0.10.15])
 by mail.neosystem.cz (Postfix) with ESMTP id A7BAD74C;
 Sun, 22 Nov 2015 13:06:11 +0100 (CET)
X-Virus-Scanned: amavisd-new at mail.neosystem.cz
Received: from dragon.sn.neosystem.cz (unknown
 [IPv6:2001:41d0:2:5ab8::100:101])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mail.neosystem.cz (Postfix) with ESMTPSA id 07793746;
 Sun, 22 Nov 2015 13:06:10 +0100 (CET)
Date: Sun, 22 Nov 2015 13:02:40 +0100
From: Daniel Bilik <ddb@neosystem.org>
To: Kristof Provost <kp@FreeBSD.org>
Cc: freebsd-net@freebsd.org
Subject: Re: Outgoing packets being sent via wrong interface
Message-Id: <20151122130240.165a50286cbaa9288ffc063b@neosystem.cz>
In-Reply-To: <20151121212043.GC2307@vega.codepro.be>
References: <20151120155511.5fb0f3b07228a0c829fa223f@neosystem.org>
 <C1D7F956-81C9-4ED4-99B8-E0C73A3ECB37@FreeBSD.org>
 <20151120163431.3449a473db9de23576d3a4b4@neosystem.org>
 <20151121212043.GC2307@vega.codepro.be>
X-Mailer: Sylpheed 3.4.3 (GTK+ 2.24.28; x86_64-portbld-dragonfly4.3)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 22 Nov 2015 12:06:15 -0000

On Sat, 21 Nov 2015 22:20:43 +0100
Kristof Provost <kp@FreeBSD.org> wrote:

>> Sure, pf.conf attached.
> Thanks. As a first guess, I think the origin of the problem might be
> related to the double nat rule you've got.

Well, even though pf may play some role in the problem, I tend to suspect
the routing table as the main trigger. There are several facts to support
this...

1. after reboot, the router runs fine, even with this "double nat" rule

2. this "double nat" rule was also present on the router when it was
running 9-stable, working flawlessly for years

3. when the problems start, there has already been at least one "hit" to the
routing table (by the previously mentioned cron task that updates the default
route to keep connectivity), i.e. the problems start only after the routing
table has been touched, though not after every such change

4. it seems that touching the routing table can also "cure" the problem: last
week I noticed the router was unable to make tcp connections to one host
over vpn - same problem, it was pushing packets via re0 instead of tap0.
Yesterday I found the problem was gone, without any reboot or other
intervention, and surprise... there was a short connectivity problem at the
beginning of this week, so the default route was changed twice
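
One way to tell a stale route apart from a pf problem is to ask the routing
table which interface it would pick for the affected host and compare that
with what tcpdump shows on re0 and tap0. The following is only a rough
sketch: the helper names and the sample text are made up for illustration
(192.0.2.10 is a documentation address, not a host from this setup), and it
assumes the usual "interface:" line printed by "route -n get".

```python
import re
import subprocess


def parse_route_interface(route_output):
    """Return the interface named in `route -n get` output, or None."""
    match = re.search(r"^\s*interface:\s*(\S+)", route_output, re.MULTILINE)
    return match.group(1) if match else None


def lookup_interface(destination):
    """Ask the routing table which interface it would use for destination."""
    result = subprocess.run(
        ["route", "-n", "get", destination],
        capture_output=True, text=True, check=True,
    )
    return parse_route_interface(result.stdout)


# Illustrative sample only; not output captured from the router in this thread.
SAMPLE = """\
   route to: 192.0.2.10
destination: 192.0.2.0
       mask: 255.255.255.0
        fib: 0
  interface: tap0
      flags: <UP,DONE>
"""

if __name__ == "__main__":
    # Parse the canned sample; on the router, call lookup_interface(host).
    print(parse_route_interface(SAMPLE))
```

If the lookup reports tap0 while the packets still leave via re0, the routing
table itself is probably not what the sending path is consulting.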

> I don't have the time to dig into this right away. Could you create a PR
> and cc me to it?

Created, bug id 204735.

Thank you.

--
						Dan

From owner-freebsd-net@freebsd.org  Sun Nov 22 12:51:43 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 643FDA32A05
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Sun, 22 Nov 2015 12:51:43 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 4D276153F
 for <freebsd-net@FreeBSD.org>; Sun, 22 Nov 2015 12:51:43 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tAMCphRH094113
 for <freebsd-net@FreeBSD.org>; Sun, 22 Nov 2015 12:51:43 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-net@FreeBSD.org
Subject: [Bug 204735] [net] Outgoing packets being sent via wrong interface
Date: Sun, 22 Nov 2015 12:51:43 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.0-STABLE
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: linimon@FreeBSD.org
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: freebsd-net@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: assigned_to
Message-ID: <bug-204735-2472-0oWeOb2Xpj@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-204735-2472@https.bugs.freebsd.org/bugzilla/>
References: <bug-204735-2472@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 22 Nov 2015 12:51:43 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204735

Mark Linimon <linimon@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-bugs@FreeBSD.org    |freebsd-net@FreeBSD.org

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org  Mon Nov 23 06:13:50 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 52639A351A3
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Mon, 23 Nov 2015 06:13:50 +0000 (UTC)
 (envelope-from honzhan@microsoft.com)
Received: from na01-bl2-obe.outbound.protection.outlook.com
 (mail-bl2on0125.outbound.protection.outlook.com [65.55.169.125])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits))
 (Client CN "mail.protection.outlook.com",
 Issuer "MSIT Machine Auth CA 2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id A59AD1DEA
 for <freebsd-net@freebsd.org>; Mon, 23 Nov 2015 06:13:49 +0000 (UTC)
 (envelope-from honzhan@microsoft.com)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;
 s=selector1; h=From:To:Date:Subject:Message-ID:Content-Type:MIME-Version;
 bh=7P1hNSwPhsLsXzgbQ1qgjntOILD2Txsh3MUh+gKfauc=;
 b=hgd59OwSVET7Jh2WX+y2Tfsze1KuDo7vmZ8dLYORVPsr4KocK9wpUFrmX6gIAJ92EQKUGc5QNhZDN/eUwVW3NITLdg7mIwa8QcZO4dTPN69W035/2btFvcS7USWmJt1dNTwaD4jy/hl1WbfyZ2jaPdnRuo2TOka35lAFs2RchSk=
Received: from BY2PR03CA059.namprd03.prod.outlook.com (10.141.249.32) by
 BL2PR03MB227.namprd03.prod.outlook.com (10.255.231.17) with Microsoft SMTP
 Server (TLS) id 15.1.331.20; Mon, 23 Nov 2015 05:58:16 +0000
Received: from BL2FFO11OLC016.protection.gbl (2a01:111:f400:7c09::169) by
 BY2PR03CA059.outlook.office365.com (2a01:111:e400:2c5d::32) with Microsoft
 SMTP Server (TLS) id 15.1.331.20 via Frontend Transport; Mon, 23 Nov 2015
 05:58:16 +0000
Authentication-Results: spf=pass (sender IP is 206.191.228.180)
 smtp.mailfrom=microsoft.com; freebsd.org; dkim=none (message not signed)
 header.d=none;freebsd.org; dmarc=pass action=none header.from=microsoft.com;
Received-SPF: Pass (protection.outlook.com: domain of microsoft.com designates
 206.191.228.180 as permitted sender)
 receiver=protection.outlook.com; 
 client-ip=206.191.228.180; helo=064-smtp-out.microsoft.com;
Received: from 064-smtp-out.microsoft.com (206.191.228.180) by
 BL2FFO11OLC016.mail.protection.outlook.com (10.173.160.82) with Microsoft
 SMTP Server (TLS) id 15.1.331.11 via Frontend Transport; Mon, 23 Nov 2015
 05:58:14 +0000
Received: from SG2PR3002MB0106.064d.mgd.msft.net (141.251.56.18) by
 SG2PR3002MB0108.064d.mgd.msft.net (141.251.56.20) with Microsoft SMTP Server
 (TLS) id 15.1.337.9; Mon, 23 Nov 2015 05:58:05 +0000
Received: from SG2PR3002MB0106.064d.mgd.msft.net ([141.251.56.18]) by
 SG2PR3002MB0106.064d.mgd.msft.net ([141.251.56.18]) with mapi id
 15.01.0337.009; Mon, 23 Nov 2015 05:58:05 +0000
From: Hongjiang Zhang <honzhan@microsoft.com>
To: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject: Is it allowed to copy hyper-v drivers from FreeBSD 10 and packed it
 into FreeBSD 9.2
Thread-Topic: Is it allowed to copy hyper-v drivers from FreeBSD 10 and packed
 it into FreeBSD 9.2
Thread-Index: AdEls7mtuSdVMBL4RcS0JahNHNGLGA==
Date: Mon, 23 Nov 2015 05:58:04 +0000
Message-ID: <a3ff0bfe18c64a4cb8fdda778fb58a24@SG2PR3002MB0106.064d.mgd.msft.net>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [141.251.57.5]
MIME-Version: 1.0
X-EOPAttributedMessage: 0
X-Microsoft-Exchange-Diagnostics: 1; BL2FFO11OLC016;
 1:ZtvnE5akFcfVYk3gI3b1zj+KFEzPmdiGTBUWGGH0tgYXi95knBZ74wkgHxxY8WVKxAPme9zyV6xHx+NcGjcun+iVwYQisItAWWNK5wpTjuyyGVfCiLBwTw3L4xfJ8AvdmxK/He7JNoeVXn5GIPUiW39H/6AcBn3QbVwsk1Ytc+V0gUcM62M84O8CDvURxg8vPLqZjsIzYp16RYn+bjYUFDFXx1+0sE1hVvLVIjqB10urzi0jZIjCBkfhv0vDzXV2uvupfdbYgA533MrA4CMCyitco3spkmy5tSgB8aRFupwELCGMveKH8wgj4sEIwmuQmNmYnuVJKCTBYoRyqfrevUFKhZvbuNGwt7S6qSGet90=
X-Forefront-Antispam-Report: CIP:206.191.228.180; CTRY:US; IPV:NLI; EFV:NLI;
 SFV:NSPM;
 SFS:(10019020)(6009001)(2980300002)(438002)(199003)(189002)(19300405004)(50986999)(69596002)(19580395003)(92566002)(24736003)(5008740100001)(86362001)(81156007)(97736004)(790700001)(86612001)(6806005)(2501003)(586003)(300700001)(6116002)(66066001)(3846002)(102836003)(260700001)(11100500001)(5004730100002)(86146001)(5003600100002)(84326002)(5007970100001)(87936001)(33646002)(10290500002)(10400500002)(106466001)(5005710100001)(10090500001)(450100001)(2900100001)(189998001)(110136002)(5001960100002)(107886002)(229853001)(19625215002)(54356999)(15975445007)(2351001)(108616004)(512954002)(16236675004)(16796002);
 DIR:OUT; SFP:1102; SCL:1; SRVR:BL2PR03MB227; H:064-smtp-out.microsoft.com;
 FPR:; SPF:Pass; PTR:ErrorRetry; A:1; MX:1; LANG:en; 
X-Microsoft-Exchange-Diagnostics: 1; BL2PR03MB227;
 2:jJIt8UTR50m/7B7A8+1JkeRZr49hIiEZb1EV1dPDA6aR48AEOl3dvJCVbhf0V92PbNXt9N9rcOG2ygmSb6Y8+ibYoEyfJ5nY1oH9+Jp9MagQhUFFYJdW7n01tmLetCXcoVTjmwSYfJXiUlX/T8L4+g==;
 3:8Ep7sOg1t1uK7OK2+cMp7JqO58XkAxMINKevuLOSjbRdHIAZcCZ5yb5mkbsY0HOY4Xy4Mpe4k9ESG2q0iabDZ1+SEzUcD8MC6osgRG8EwGCEWdfPbXYcA8eHUs8ab6XK7ALTmPto8l4zXAdIpo4k0qBcsRgblriyZ4CoValOKdBsfq0LI22JW4XpsgRObnIzqzBCAfrBOvHcX5ljuzkUec0cbF2ZNpkoImUVqfllUXC6WAAUYOnz/AM20RC6ZCvH3f+khWURuRaejySwPsUUMA==;
 25:NI4MG0HBTuae37f06MnRH9HdWxaNbfIp4VCk6wrydxfncqhiqeAOQKlAJLZyL4uzWeJqIG9vtOrfswiS0sRS9LuVUtCHa2L0nsTFuqqlQ5uUh5pA97cxULpr9jr0gBaqW+SbyTyKR5X1NExB0M19SaT2Sb3wjgCCKMZmHjsSjfYIvXSSEHsalcFHGO+8vHzctGjIuRHob9ekvDQCJwnXWA9O/sCTqiit9AHCwjFKly2fx6dZGQNzDaIZIGDE82rSkOdwDaxVkv9SvaKxaE72Zg==
X-Microsoft-Antispam: UriScan:; BCL:0; PCL:0; RULEID:(8251501001);
 SRVR:BL2PR03MB227; 
X-O365EOP-Header: O365_EOP: AllowList from IP - set SCL to -1
X-Microsoft-Exchange-Diagnostics: 1; BL2PR03MB227;
 20:eTM42XggP5lGJtDV5eah+On8+2omA11Ren2fd1GpDAe7cyVUr4C8wMhCrVjrgQf1OEB/QtgYfdBvNGjbiw2YGIkL16M0dfHwYAQptoPD/P2ZA51Tgfk4mmxYCLvd57FDeqwDNc888tWqpYHFQgUFHKZdG5iRDfs0eVVagGpL95gHeN1eWyEWbhbOwn9K7LPizG9cep55Wx8uCESkCEOjkljUCT844nNK26g2cYn9UXDp1XSsAco2FJJ/KSJaJ68FPbtAF9fnmgsyGWBpxCLYoGQuqyTYIBSRew+WwkGuGm76RFNLV9/Gn5Y1F321eQ0yyjFVPEWjf/5In67jlPxcYryEiSSZ04x4XchGdytIuP8f4RWe8iXkGWe1uA43rMIzaE5+VojdrUhBpGZU87xKpEo+aUX/jKuIqp5jfDKgr4eQWVAsdKBa0lCvB0vc7pfVhhXCbfhDAv7iHyHUtexmDfRjxo8BmmEIrw0P5L1Xv67YIqeaGtlWE98zUo3vcLhF9I4bZmqlSoBC05BnPcruwC/PcfGIrsqka4HqvR9fITIRwhCi7BmkI76ubrodsIydHqBTLaBk5IfFxMJ2ZK2lDNFnZ4nvDmeNHXS3bZp1UYQ=
X-Microsoft-Antispam-PRVS: <BL2PR03MB2274EFE0C9175F3F3ACAA6BB5070@BL2PR03MB227.namprd03.prod.outlook.com>
X-Exchange-Antispam-Report-Test: UriScan:(108003899814671);
X-Exchange-Antispam-Report-CFA-Test: BCL:0; PCL:0;
 RULEID:(61425024)(601004)(2401047)(5005006)(520078)(8121501046)(10201501046)(3002001)(61426024)(61427024);
 SRVR:BL2PR03MB227; BCL:0; PCL:0; RULEID:; SRVR:BL2PR03MB227; 
X-Microsoft-Exchange-Diagnostics: 1; BL2PR03MB227;
 4:aJSTqwPWOMGUW14sMhe2mtY78TmeG6cO9Q6/nnezMSW7WGnhqbcBe3CQVf3zLG9pVEWA0OuZCsVSmkRdMrfdZE9TlQOmZVjfSfCdx2b6pbPB6yQ8nRhptuH6wo6I0SOIVbSr88YNldTVO8X5jKmf3RiLOhT4FZgUz8OhMC9Mu3yR1A2LWYF4uhLDlmKG/urQf5GWUlQ61tHap7qj8tbSGIQfd1fK5HK5ZWLSYCBnucuXypxX6dLQeAJrR+gMVoTopAipjUToEdYIog3wv0ARX9hcrtB1C8NWGWb/yailat9VJuiRFgD6p/SIlyIyJe2VYQy76su9H0TipZTJo1M/ngP8TpVTMy3ARpYjy8cR9t9+LD8CuQjWzIVRBw4iU+wEkYT8YJ97WD5Dh9LzgIYsOUX3O5lVJUspamEgbMut0Y9Wq/ulWY0SZIppm8KRaZCOkgA3fEkknK/LRuTY/gT6qCQ89WijHqZ/2u55E3wtkRD6As0/RAZZijf0PQlMpA5s
X-Forefront-PRVS: 07697999E6
X-Microsoft-Exchange-Diagnostics: =?us-ascii?Q?1; BL2PR03MB227;
 23:3vEpR5sJdMCFMduq7cTNYdKOTJmxQG8y2vw4Pq36/l?=
 =?us-ascii?Q?p+c81F3D3XrJQyf5W66n6e0HUrT7tBBmODFY8duAHebTc5v44FQShTpyRIJX?=
 =?us-ascii?Q?ggxswXgOZ97921y+zyjIbkP46/yRLticoi7nj9N/cIKuICp6Sbm1x6EzODtl?=
 =?us-ascii?Q?n/iEkc6SsqLKBMAvo4aSDAYCaBPyp0kDL2vCuz23tjg8L3SA5EwjJA+Zlwjf?=
 =?us-ascii?Q?JJVy6odwJZMEeJJ/1dR8BNwyUhjhJ2GMYXWlExRMUXKh/zcEN+9Bk2jv+CtH?=
 =?us-ascii?Q?QhbNbzS0kbdIYJ0lXPk8z8aPK350izoV6MyoniLLQkb9SGE/CGX9Wp5oMh+q?=
 =?us-ascii?Q?9l/dLgncWKJbbvgdhnTzUV5io26Vter9rlXeeB5gaA3fWFc6/k1q8amXU9mY?=
 =?us-ascii?Q?OPy0/Hi1X4/v9fsK3VDEyBJNDUig0CSMgXNOnej5RpA1TsmiJL2EhyAGjtIR?=
 =?us-ascii?Q?BVqjSoqWISMhxQZpzo1aSZLJg+NUE3nb68ocKnjzwdFQP4i6xYIMZ3lGe6GB?=
 =?us-ascii?Q?k2yMhk9my5c3p0pld7rTb+YJ4Y2wUozVfuEbtclFexCIOSFvgYFhL8TDBzVq?=
 =?us-ascii?Q?oWwFMbDKUL521B6a+Un6SmpQ7Q6Rl/k5+vDVPBenF9LX09jAmlmBwkDRKgD+?=
 =?us-ascii?Q?k+VgpuG+OCx7RJwTTMAAclTMVSDX15OHPRwyj6SqbXmOJiyxDifcdwhY0YLq?=
 =?us-ascii?Q?ssvg3drh9xxAtI5n8A5QpxII3jrcATv32cF5klWyaAUNT72TvMAvn40EwY+U?=
 =?us-ascii?Q?kBvZLaUcRfjmM6edLQyl3v0lROZmpQ4pnil1rDNCyZUEq57qvEfQ8awCsRem?=
 =?us-ascii?Q?cLK1ncfIfS3q00a5yODJ5JVWHqnzg573PLTxYu/8FTHQjpQzryA+3nTs/PeK?=
 =?us-ascii?Q?ErnPvX7uf/qYGBZ8EKs4u2+DO3XIEj4l6pV/x4Ikd66t4Z7XhzB8anTjfVrj?=
 =?us-ascii?Q?Bw4JXNAOv6qGlKuqMi3UCOu1CC6RzpU5IDYssciSuYL3ZfygF3KxUaMp4bJd?=
 =?us-ascii?Q?c+vdcQoRGbt565KQpymjxTaZG45GUWXfL8S0XSKEiU+FDfgDuIFNm1ycd4t5?=
 =?us-ascii?Q?I6R8CiWdSrTs1gi9DdhT7bwZxyfIwRSxKECdZVJW/znGbNwazE1YYjLt79ve?=
 =?us-ascii?Q?u8k5G+BQl1VEY6e0rrKWDW0IGLU8boVW1pV7w6W2OI9cJtfN+xWCkwuVrwT2?=
 =?us-ascii?Q?N5rYJgvc9kKf4O0SCffZtIvMWC4woiS+O6KOhPK9fh7R2lVtvjHELrI39uxL?=
 =?us-ascii?Q?dA6/5cIwgTXrXIcpO8qtcaNfOsbkLMtsi+2lsaannLCoEEMlWMuwN6yhVLH/?=
 =?us-ascii?Q?OgssyCzzkuvmop6OLX7h4ui/81x8drsk1s5gS6dAY3TUqw18rP7dv9sJ0wg7?=
 =?us-ascii?Q?g4IU2qqxvnrt+yKiYhFXaEzlPBv6af4spqgO+qBM2oSUNm4U/Im6v7Y3tQnX?=
 =?us-ascii?Q?ktKtdIQQ=3D=3D?=
X-Microsoft-Exchange-Diagnostics: 1; BL2PR03MB227;
 5:9B9kY3RPExiGM1NAFYcg3m7q8Jyd9n7mN3IuGMuB4tl9l5BfoedDp2vSJ5uINWl0yFa1xUdlPvIQqWwjs233eX5bdWD6XPQ00CRgcOzOWv03b51vY8Ut/Sw1tYRqDpgb9upfui92kJa/vnElBS58wg==;
 24:zcp731eVsJ16jsa6+vKF3XTf47UtN2Z62emD1JRzgt9ceOYQB5m9x20ZBQnGdUgSJleTQFW3Nn+hJOhMMRqkOq1J/WTyK4ejWejHzIWTYZc=
SpamDiagnosticOutput: 1:23
SpamDiagnosticMetadata: NSPM
X-OriginatorOrg: microsoft.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 23 Nov 2015 05:58:14.0623 (UTC)
X-MS-Exchange-CrossTenant-Id: 72f988bf-86f1-41af-91ab-2d7cd011db47
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=72f988bf-86f1-41af-91ab-2d7cd011db47; Ip=[206.191.228.180];
 Helo=[064-smtp-out.microsoft.com]
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: BL2PR03MB227
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 23 Nov 2015 06:13:50 -0000

Hi,

Some people who use FreeBSD 9.2 with the Hyper-V network driver back-ported
from FreeBSD 10 have encountered a network issue. They installed two VMs
(FreeBSD 9.2 with the customized kernel) on Azure. The network went offline
very soon after they started copying a large file (~320 MB) from one VM to
the other with "scp". If TSO is disabled through
"sysctl -w net.inet.tcp.tso=0", the issue is alleviated but not eliminated.
I have not figured out why.
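
The sysctl above turns TSO off in the TCP stack, while the interface keeps
its own TSO capability flags, which ifconfig can clear separately (for
example "ifconfig hn0 -tso"). A small sketch for checking whether an
interface still advertises TSO is shown below. The interface name hn0, the
helper names and the sample text are assumptions for illustration, not
output taken from the affected VMs.

```python
import re


def enabled_capabilities(ifconfig_output):
    """Return the capability names listed on the `options=` line."""
    match = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_output)
    if not match:
        return set()
    return {name for name in match.group(1).split(",") if name}


def tso_still_enabled(ifconfig_output):
    """True if the interface still advertises TSO4 or TSO6."""
    return bool(enabled_capabilities(ifconfig_output) & {"TSO4", "TSO6"})


# Illustrative sample only; hn0 is an assumed interface name.
SAMPLE = """\
hn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
\toptions=31b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,TSO6>
"""

if __name__ == "__main__":
    # Feed it the text printed by `ifconfig hn0` on the VM instead of SAMPLE.
    print(tso_still_enabled(SAMPLE))
```

Comparing the result before and after changing the settings would show
whether the driver is still being handed TSO-capable configuration.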

I have checked the release notes of FreeBSD 9.2, 9.3 and 10, but did not find
anything that would block the back-port. I assume that 9.2 can run the
Hyper-V drivers back-ported from 10. Is this assumption correct?

From owner-freebsd-net@freebsd.org  Mon Nov 23 12:52:00 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 547F4A34232
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Mon, 23 Nov 2015 12:52:00 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 425F21583
 for <freebsd-net@FreeBSD.org>; Mon, 23 Nov 2015 12:52:00 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tANCq0YF053735
 for <freebsd-net@FreeBSD.org>; Mon, 23 Nov 2015 12:52:00 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-net@FreeBSD.org
Subject: [Bug 204437] 10.2 STABLE Crashing with IPSec Support
Date: Mon, 23 Nov 2015 12:52:00 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.2-STABLE
X-Bugzilla-Keywords: crash, needs-qa
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: peixotocassiano@gmail.com
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: freebsd-net@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: mfc-stable9? mfc-stable10?
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-204437-2472-4UszxDZUMh@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-204437-2472@https.bugs.freebsd.org/bugzilla/>
References: <bug-204437-2472@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 23 Nov 2015 12:52:00 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204437

--- Comment #16 from Cassiano Peixoto <peixotocassiano@gmail.com> ---
(In reply to emeric.poupon from comment #15)
I'll try it now and let you know the results.

Thanks.

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org  Mon Nov 23 15:14:18 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6A6F7A33A3D
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Mon, 23 Nov 2015 15:14:18 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 3BA0915E4
 for <freebsd-net@FreeBSD.org>; Mon, 23 Nov 2015 15:14:18 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tANFEIsc098356
 for <freebsd-net@FreeBSD.org>; Mon, 23 Nov 2015 15:14:18 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-net@FreeBSD.org
Subject: [Bug 201694] 10.2-BETA2 crashing when killing VIMAGE/VNET jails
Date: Mon, 23 Nov 2015 15:14:17 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.2-RELEASE
X-Bugzilla-Keywords: crash, needs-qa
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: koobs@FreeBSD.org
X-Bugzilla-Status: Open
X-Bugzilla-Priority: Normal
X-Bugzilla-Assigned-To: freebsd-net@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: mfc-stable10?
X-Bugzilla-Changed-Fields: version bug_severity assigned_to flagtypes.name cc
 bug_status keywords priority
Message-ID: <bug-201694-2472-586fYKqyuJ@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-201694-2472@https.bugs.freebsd.org/bugzilla/>
References: <bug-201694-2472@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 23 Nov 2015 15:14:18 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201694

Kubilay Kocak <koobs@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|10.2-BETA1                  |10.2-RELEASE
           Severity|Affects Only Me             |Affects Some People
           Assignee|freebsd-jail@FreeBSD.org    |freebsd-net@FreeBSD.org
              Flags|                            |mfc-stable10?
                 CC|                            |koobs@FreeBSD.org
             Status|New                         |Open
           Keywords|                            |crash, needs-qa
           Priority|---                         |Normal

--- Comment #4 from Kubilay Kocak <koobs@FreeBSD.org> ---
Bartek / Paul,

To get this issue the attention it needs, I'd appreciate it if you could both
provide:

* Updated backtraces for this panic on the latest 10.2-RELEASE / CURRENT (for
extra debugging)
* Steps to reproduce. The summary mentions a crash on 'killing' jails. What
steps exactly?
* Isolate/reduce the reproduction case and system configuration as much as
possible (kernel, ifconfig, whatever)
* Hardware (and virtualization if applicable) details. dmesg.boot should be
fine for now

Note: Please use attachments for any large outputs to keep the conversation
clear and easy to follow.

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org  Mon Nov 23 15:30:34 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id DDA72A33D54
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Mon, 23 Nov 2015 15:30:34 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: from mail-ig0-x232.google.com (mail-ig0-x232.google.com
 [IPv6:2607:f8b0:4001:c05::232])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id A12681BCD
 for <freebsd-net@freebsd.org>; Mon, 23 Nov 2015 15:30:34 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: by igcto18 with SMTP id to18so58509182igc.0
 for <freebsd-net@freebsd.org>; Mon, 23 Nov 2015 07:30:34 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type:content-transfer-encoding;
 bh=1QtS485PSWFGE9oTiDQx3xYmFXehBrv5V+OsVQV1TKg=;
 b=cS0/BTr3wsg0N8o9JXOQjP4PxZiIeVp00wJ03Z7YtDuz638Rgjf7fVsAiALc8M/RuQ
 eujudjZvZCB5t6WspAayWFk3L3MiF+RIaAeroDLTKYZEVfRbOBHSzqr6NtzRnau3p9pp
 dtzbHb0pHTiZk2RGxjLqdYPgX3M1C+SSS/5jBpBsM8D/Cr3Emm48xznRfhBg+7jRwF39
 YAIY4Ud24CMj8ZRxQ+4ok1/24j3zQtnVhVpxdsVGj18sVwoSZCQ0rcA54nS73wk8GpaD
 A8bYPIO097+RQLgBXCz4nECLcRm+s8ZkcJIh8R3jbBWjNlH2ZAYFbsRr572pOcpI17W9
 q8NQ==
MIME-Version: 1.0
X-Received: by 10.50.65.74 with SMTP id v10mr13293413igs.61.1448292633707;
 Mon, 23 Nov 2015 07:30:33 -0800 (PST)
Received: by 10.36.217.196 with HTTP; Mon, 23 Nov 2015 07:30:33 -0800 (PST)
In-Reply-To: <a3ff0bfe18c64a4cb8fdda778fb58a24@SG2PR3002MB0106.064d.mgd.msft.net>
References: <a3ff0bfe18c64a4cb8fdda778fb58a24@SG2PR3002MB0106.064d.mgd.msft.net>
Date: Mon, 23 Nov 2015 07:30:33 -0800
Message-ID: <CAJ-Vmone_fa8Z1BGn68btfF5iY3kDkn1jDEpisyBHFJeDZgjZA@mail.gmail.com>
Subject: Re: Is it allowed to copy hyper-v drivers from FreeBSD 10 and packed
 it into FreeBSD 9.2
From: Adrian Chadd <adrian.chadd@gmail.com>
To: Hongjiang Zhang <honzhan@microsoft.com>
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 23 Nov 2015 15:30:35 -0000

On 22 November 2015 at 21:58, Hongjiang Zhang <honzhan@microsoft.com> wrote:
> Hi,
>
> Some people, who used FreeBSD 9.2 and back-port network driver for Hyper-v from FreeBSD 10, encountered a network issue. They installed 2 VM (FreeBSD 9.2 with the customized FreeBSD kernel) on Azure. Network went offline very soon when the big file (~320M byte) is copied from one VM to another through "scp". If TSO is disabled through "sysctl -w net.inet.tcp.tso=0", this issue will be alleviated but cannot be eliminated. I did not figure out why.
>
> I have checked the release notes of FreeBSD 9.2/9.3/10, but did not find anything which blocked the back-port. It is supposed 9.2 allows the back-ported Hyper-v drivers from 10. Is this assumption correct?

Hi!

It may be something to do with maximum mbufs per packet or some other
limit like that. Is there a lot of interest in backporting the latest
hyperv driver to 9.2?


-a

> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@freebsd.org  Mon Nov 23 16:11:50 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0AD15A36548
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Mon, 23 Nov 2015 16:11:50 +0000 (UTC)
 (envelope-from julian@freebsd.org)
Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id CBE32149C
 for <freebsd-net@freebsd.org>; Mon, 23 Nov 2015 16:11:49 +0000 (UTC)
 (envelope-from julian@freebsd.org)
Received: from Julian-MBP3.local
 (50-196-156-133-static.hfc.comcastbusiness.net [50.196.156.133])
 (authenticated bits=0)
 by vps1.elischer.org (8.15.2/8.15.2) with ESMTPSA id tANGBhfB062679
 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO);
 Mon, 23 Nov 2015 08:11:46 -0800 (PST)
 (envelope-from julian@freebsd.org)
Subject: Re: Is it allowed to copy hyper-v drivers from FreeBSD 10 and packed
 it into FreeBSD 9.2
To: Adrian Chadd <adrian.chadd@gmail.com>,
 Hongjiang Zhang <honzhan@microsoft.com>
References: <a3ff0bfe18c64a4cb8fdda778fb58a24@SG2PR3002MB0106.064d.mgd.msft.net>
 <CAJ-Vmone_fa8Z1BGn68btfF5iY3kDkn1jDEpisyBHFJeDZgjZA@mail.gmail.com>
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
From: Julian Elischer <julian@freebsd.org>
Message-ID: <56533AB9.80707@freebsd.org>
Date: Tue, 24 Nov 2015 00:11:37 +0800
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:38.0)
 Gecko/20100101 Thunderbird/38.3.0
MIME-Version: 1.0
In-Reply-To: <CAJ-Vmone_fa8Z1BGn68btfF5iY3kDkn1jDEpisyBHFJeDZgjZA@mail.gmail.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 23 Nov 2015 16:11:50 -0000

On 23/11/2015 11:30 PM, Adrian Chadd wrote:
> On 22 November 2015 at 21:58, Hongjiang Zhang <honzhan@microsoft.com> wrote:
>> Hi,
>>
>> Some people, who used FreeBSD 9.2 and back-port network driver for Hyper-v from FreeBSD 10, encountered a network issue. They installed 2 VM (FreeBSD 9.2 with the customized FreeBSD kernel) on Azure. Network went offline very soon when the big file (~320M byte) is copied from one VM to another through "scp". If TSO is disabled through "sysctl -w net.inet.tcp.tso=0", this issue will be alleviated but cannot be eliminated. I did not figure out why.
>>
>> I have checked the release notes of FreeBSD 9.2/9.3/10, but did not find anything which blocked the back-port. It is supposed 9.2 allows the back-ported Hyper-v drivers from 10. Is this assumption correct?
> Hi!
>
> It may be something to do with maximum mbufs per packet or some other
> limit like that. Is there a lot of interest in backporting the latest
> hyperv driver to 9.2?

I believe we ($JOB) have them backported to 8.0. (I didn't do the work.)

>
>
> -a
>
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>


From owner-freebsd-net@freebsd.org  Tue Nov 24 06:09:49 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 496E1A362D2
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Tue, 24 Nov 2015 06:09:49 +0000 (UTC)
 (envelope-from honzhan@microsoft.com)
Received: from na01-bn1-obe.outbound.protection.outlook.com
 (mail-bn1bn0107.outbound.protection.outlook.com [157.56.110.107])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits))
 (Client CN "mail.protection.outlook.com",
 Issuer "MSIT Machine Auth CA 2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id A0AA512EB
 for <freebsd-net@freebsd.org>; Tue, 24 Nov 2015 06:09:48 +0000 (UTC)
 (envelope-from honzhan@microsoft.com)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;
 s=selector1; h=From:To:Date:Subject:Message-ID:Content-Type:MIME-Version;
 bh=4oIZybNa/dzCYXmGnUR1T18A7Uocu37MK9RyxRslx4k=;
 b=V1A6nIbkV83+vE8Nb1542IqtY2svL7BEDgdyK3YYdbWbTTVlataPAMVRlQyaxgyvNV6lL/9ABnRetcRDbIEQJPFB1+m7Z5fT1cqLrN+tVYHij8kvYrRfxfI3JnoY7jrzQttQ39G1uDN/EiKs0Ter7wKPB3eYwt7xgptSIHPS+qs=
Received: from BLUPR0301CA0018.namprd03.prod.outlook.com (10.162.113.156) by
 BY2PR03MB506.namprd03.prod.outlook.com (10.141.143.18) with Microsoft SMTP
 Server (TLS) id 15.1.331.20; Tue, 24 Nov 2015 06:09:44 +0000
Received: from BL2FFO11FD053.protection.gbl (2a01:111:f400:7c09::112) by
 BLUPR0301CA0018.outlook.office365.com (2a01:111:e400:5259::28) with Microsoft
 SMTP Server (TLS) id 15.1.331.20 via Frontend Transport; Tue, 24 Nov 2015
 06:09:43 +0000
Authentication-Results: spf=pass (sender IP is 206.191.228.180)
 smtp.mailfrom=microsoft.com; gmail.com; dkim=none (message not signed)
 header.d=none;gmail.com; dmarc=pass action=none header.from=microsoft.com;
Received-SPF: Pass (protection.outlook.com: domain of microsoft.com designates
 206.191.228.180 as permitted sender)
 receiver=protection.outlook.com; 
 client-ip=206.191.228.180; helo=064-smtp-out.microsoft.com;
Received: from 064-smtp-out.microsoft.com (206.191.228.180) by
 BL2FFO11FD053.mail.protection.outlook.com (10.173.161.181) with Microsoft
 SMTP Server (TLS) id 15.1.331.11 via Frontend Transport; Tue, 24 Nov 2015
 06:09:41 +0000
Received: from SG2PR3002MB0106.064d.mgd.msft.net (141.251.56.18) by
 SG2PR3002MB0106.064d.mgd.msft.net (141.251.56.18) with Microsoft SMTP Server
 (TLS) id 15.1.337.9; Tue, 24 Nov 2015 06:09:32 +0000
Received: from SG2PR3002MB0106.064d.mgd.msft.net ([141.251.56.18]) by
 SG2PR3002MB0106.064d.mgd.msft.net ([141.251.56.18]) with mapi id
 15.01.0337.009; Tue, 24 Nov 2015 06:09:32 +0000
From: Hongjiang Zhang <honzhan@microsoft.com>
To: Adrian Chadd <adrian.chadd@gmail.com>
CC: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject: RE: Is it allowed to copy hyper-v drivers from FreeBSD 10 and packed
 it into FreeBSD 9.2
Thread-Topic: Is it allowed to copy hyper-v drivers from FreeBSD 10 and packed
 it into FreeBSD 9.2
Thread-Index: AdEls7mtuSdVMBL4RcS0JahNHNGLGAAUCrGAABhWRmA=
Date: Tue, 24 Nov 2015 06:09:31 +0000
Message-ID: <ac15881eeed5419d9f91a6029da3281c@SG2PR3002MB0106.064d.mgd.msft.net>
References: <a3ff0bfe18c64a4cb8fdda778fb58a24@SG2PR3002MB0106.064d.mgd.msft.net>
 <CAJ-Vmone_fa8Z1BGn68btfF5iY3kDkn1jDEpisyBHFJeDZgjZA@mail.gmail.com>
In-Reply-To: <CAJ-Vmone_fa8Z1BGn68btfF5iY3kDkn1jDEpisyBHFJeDZgjZA@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [141.251.57.5]
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
MIME-Version: 1.0
X-EOPAttributedMessage: 0
X-OriginatorOrg: microsoft.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 24 Nov 2015 06:09:41.3204 (UTC)
X-MS-Exchange-CrossTenant-Id: 72f988bf-86f1-41af-91ab-2d7cd011db47
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=72f988bf-86f1-41af-91ab-2d7cd011db47; Ip=[206.191.228.180];
 Helo=[064-smtp-out.microsoft.com]
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY2PR03MB506
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 24 Nov 2015 06:09:49 -0000

Thanks. With TSO enabled, the issue occurs easily. That is only a clue, though.

I did not do the backport. The people who did it took that route because their system is based on 9.2, and upgrading the FreeBSD system would take a lot of effort.

-----Original Message-----
From: Adrian Chadd [mailto:adrian.chadd@gmail.com]
Sent: 23 November 2015 23:31
To: Hongjiang Zhang <honzhan@microsoft.com>
Cc: freebsd-net@freebsd.org
Subject: Re: Is it allowed to copy hyper-v drivers from FreeBSD 10 and packed it into FreeBSD 9.2

On 22 November 2015 at 21:58, Hongjiang Zhang <honzhan@microsoft.com> wrote:
> Hi,
>
> Some people, who used FreeBSD 9.2 and back-port network driver for Hyper-v from FreeBSD 10, encountered a network issue. They installed 2 VM (FreeBSD 9.2 with the customized FreeBSD kernel) on Azure. Network went offline very soon when the big file (~320M byte) is copied from one VM to another through "scp". If TSO is disabled through "sysctl -w net.inet.tcp.tso=0", this issue will be alleviated but cannot be eliminated. I did not figure out why.
>
> I have checked the release notes of FreeBSD 9.2/9.3/10, but did not find anything which blocked the back-port. It is supposed 9.2 allows the back-ported Hyper-v drivers from 10. Is this assumption correct?

Hi!

It may be something to do with maximum mbufs per packet or some other limit like that. Is there a lot of interest in backporting the latest hyperv driver to 9.2?


-a

> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@freebsd.org  Tue Nov 24 11:39:46 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5B95CA36ED7
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Tue, 24 Nov 2015 11:39:46 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 3EA931FD1
 for <freebsd-net@FreeBSD.org>; Tue, 24 Nov 2015 11:39:46 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tAOBdkZ9060863
 for <freebsd-net@FreeBSD.org>; Tue, 24 Nov 2015 11:39:46 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-net@FreeBSD.org
Subject: [Bug 204437] 10.2 STABLE Crashing with IPSec Support
Date: Tue, 24 Nov 2015 11:39:45 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.2-STABLE
X-Bugzilla-Keywords: crash, needs-qa
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: peixotocassiano@gmail.com
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: freebsd-net@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: mfc-stable9? mfc-stable10?
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-204437-2472-EL8HHgo1BY@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-204437-2472@https.bugs.freebsd.org/bugzilla/>
References: <bug-204437-2472@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 24 Nov 2015 11:39:46 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204437

--- Comment #17 from Cassiano Peixoto <peixotocassiano@gmail.com> ---
(In reply to emeric.poupon from comment #15)
Hi Emeric,

Your patch fixed the bug. Thank you very much for your help. My system has now
been running for 15 hours with no reboot :)

Will you commit this patch ASAP?

Thank you again.

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org  Tue Nov 24 13:21:12 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4594CA36205
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Tue, 24 Nov 2015 13:21:12 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 306D112BB
 for <freebsd-net@FreeBSD.org>; Tue, 24 Nov 2015 13:21:12 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tAODLCbJ076671
 for <freebsd-net@FreeBSD.org>; Tue, 24 Nov 2015 13:21:12 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-net@FreeBSD.org
Subject: [Bug 204437] 10.2 STABLE Crashing with IPSec Support
Date: Tue, 24 Nov 2015 13:21:10 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.2-STABLE
X-Bugzilla-Keywords: crash, needs-qa
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: emeric.poupon@stormshield.eu
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: freebsd-net@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: mfc-stable9? mfc-stable10?
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-204437-2472-8hH0bLvDCe@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-204437-2472@https.bugs.freebsd.org/bugzilla/>
References: <bug-204437-2472@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 24 Nov 2015 13:21:12 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204437

--- Comment #18 from emeric.poupon@stormshield.eu ---
Hi,

I am glad you can confirm it fixes the problem.
It is planned to be committed soon.

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org  Tue Nov 24 13:46:45 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0A0CDA367BD
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Tue, 24 Nov 2015 13:46:45 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id E44981099
 for <freebsd-net@FreeBSD.org>; Tue, 24 Nov 2015 13:46:44 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tAODkiUR033047
 for <freebsd-net@FreeBSD.org>; Tue, 24 Nov 2015 13:46:44 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-net@FreeBSD.org
Subject: [Bug 204437] 10.2 STABLE Crashing with IPSec Support
Date: Tue, 24 Nov 2015 13:46:44 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.2-STABLE
X-Bugzilla-Keywords: crash, needs-qa
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: peixotocassiano@gmail.com
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: freebsd-net@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: mfc-stable9? mfc-stable10?
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-204437-2472-61vaBvEiui@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-204437-2472@https.bugs.freebsd.org/bugzilla/>
References: <bug-204437-2472@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 24 Nov 2015 13:46:45 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204437

--- Comment #19 from Cassiano Peixoto <peixotocassiano@gmail.com> ---
(In reply to emeric.poupon from comment #18)
Please update this PR when it's committed, so I can stay posted :)

Thanks again.

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org  Tue Nov 24 22:22:41 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id F0DA8A372F7
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Tue, 24 Nov 2015 22:22:40 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id DA6671EA1
 for <freebsd-net@FreeBSD.org>; Tue, 24 Nov 2015 22:22:40 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tAOMMeMQ032960
 for <freebsd-net@FreeBSD.org>; Tue, 24 Nov 2015 22:22:40 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-net@FreeBSD.org
Subject: [Bug 202983] ixv driver in 11.0-CURRENT(10.1 & 10.2 RELEASE) doesn't
 pass traffic using XEN hypervisor(AWS EC2)
Date: Tue, 24 Nov 2015 22:22:40 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 11.0-CURRENT
X-Bugzilla-Keywords: IntelNetworking
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: jlpetz@gmail.com
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: freebsd-net@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-202983-2472-qDVgU8OljR@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-202983-2472@https.bugs.freebsd.org/bugzilla/>
References: <bug-202983-2472@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 24 Nov 2015 22:22:41 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=202983

--- Comment #4 from Jarrod Petz <jlpetz@gmail.com> ---
I have had feedback from other engineers who confirmed that this patch fixes
the issue.
https://reviews.freebsd.org/D4186

However, there were some small issues with it, as detailed below.

-------------------------------------------------------------------------------------
I applied the changes from https://reviews.freebsd.org/D4186 to 11.0-CURRENT
(which among other things adds the missing VF-PF API renegotiation on the reset
path) and saw packets arriving in the instance, but tagged with vlan 2048.

# tcpdump -i ixv0 -e -vvv
tcpdump: listening on ixv0, link-type EN10MB (Ethernet), capture size 262144
bytes
10:39:07.551985 12:8d:18:b1:e5:6b (oui Unknown) > 12:39:94:73:0b:1d (oui
Unknown), ethertype 802.1Q (0x8100), length 60: vlan 2048, p 0, ethertype ARP,
Ethernet (len 6), IPv4 (len 4), Request who-has ip-10-0-3-114.ec2.internal tell
ip-10-0-3-1.ec2.internal, length 42
10:39:08.552133 12:8d:18:b1:e5:6b (oui Unknown) > 12:39:94:73:0b:1d (oui
Unknown), ethertype 802.1Q (0x8100), length 60: vlan 2048, p 0, ethertype ARP,
Ethernet (len 6), IPv4 (len 4), Request who-has ip-10-0-3-114.ec2.internal tell
ip-10-0-3-1.ec2.internal, length 42

After creating a vlan0 interface with ID 2048 on top of ixv0, I saw traffic
passing and DHCP worked.

# ifconfig vlan0 create
# ifconfig vlan0 vlan 2048 vlandev ixv0
# tcpdump -i vlan0 -vvv -s65534 -n
tcpdump: listening on vlan0, link-type EN10MB (Ethernet), capture size 65534
bytes
10:42:00.342629 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP
(17), length 328)
    0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from
12:39:94:73:0b:1d, length 300, xid 0x5d968cbb, Flags [none] (0x0000)
          Client-Ethernet-Address 12:39:94:73:0b:1d
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            Client-ID Option 61, length 7: ether 12:39:94:73:0b:1d
            Hostname Option 12, length 13: "ip-10-0-0-203"
            Parameter-Request Option 55, length 9:
              Subnet-Mask, BR, Time-Zone, Classless-Static-Route
              Default-Gateway, Domain-Name, Domain-Name-Server, Hostname
              Option 119
            END Option 255, length 0
            PAD Option 0, length 0, occurs 21
10:42:00.342916 IP (tos 0x10, ttl 16, id 0, offset 0, flags [none], proto UDP
(17), length 337)
    10.0.3.1.67 > 10.0.3.114.68: [udp sum ok] BOOTP/DHCP, Reply, length 309,
xid 0x5d968cbb, Flags [none] (0x0000)
          Your-IP 10.0.3.114
          Client-Ethernet-Address 12:39:94:73:0b:1d
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Offer
            Server-ID Option 54, length 4: 10.0.3.1
            Lease-Time Option 51, length 4: 3600
            Subnet-Mask Option 1, length 4: 255.255.255.0
            BR Option 28, length 4: 10.0.3.255
            Default-Gateway Option 3, length 4: 10.0.3.1
            Domain-Name Option 15, length 12: "ec2.internal"
            Domain-Name-Server Option 6, length 4: 10.0.0.2
            Hostname Option 12, length 13: "ip-10-0-3-114"
            END Option 255, length 0
10:42:02.365085 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP
(17), length 328)
    0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from
12:39:94:73:0b:1d, length 300, xid 0x5d968cbb, Flags [none] (0x0000)
          Client-Ethernet-Address 12:39:94:73:0b:1d
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Request
            Server-ID Option 54, length 4: 10.0.3.1
            Requested-IP Option 50, length 4: 10.0.3.114
            Client-ID Option 61, length 7: ether 12:39:94:73:0b:1d
            Hostname Option 12, length 13: "ip-10-0-0-203"
            Parameter-Request Option 55, length 9:
              Subnet-Mask, BR, Time-Zone, Classless-Static-Route
              Default-Gateway, Domain-Name, Domain-Name-Server, Hostname
              Option 119
            END Option 255, length 0
            PAD Option 0, length 0, occurs 9
10:42:02.365274 IP (tos 0x10, ttl 16, id 0, offset 0, flags [none], proto UDP
(17), length 337)
    10.0.3.1.67 > 10.0.3.114.68: [udp sum ok] BOOTP/DHCP, Reply, length 309,
xid 0x5d968cbb, Flags [none] (0x0000)
          Your-IP 10.0.3.114
          Client-Ethernet-Address 12:39:94:73:0b:1d
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: ACK
            Server-ID Option 54, length 4: 10.0.3.1
            Lease-Time Option 51, length 4: 3600
            Subnet-Mask Option 1, length 4: 255.255.255.0
            BR Option 28, length 4: 10.0.3.255
            Default-Gateway Option 3, length 4: 10.0.3.1
            Domain-Name Option 15, length 12: "ec2.internal"
            Domain-Name-Server Option 6, length 4: 10.0.0.2
            Hostname Option 12, length 13: "ip-10-0-3-114"
            END Option 255, length 0
10:42:02.370732 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.3.114
tell 10.0.3.114, length 28
10:42:16.345260 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.3.114
tell 10.0.3.1, length 42
10:42:16.345280 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.0.3.114 is-at
12:39:94:73:0b:1d, length 28
^C

So I added the following patch to the VF driver in the instance to force the VF
into stripping VLAN tags on RX and now the instance is able to acquire a DHCP
lease and pass traffic on the interface.

diff --git a/dev/ixgbe/if_ixv.c b/dev/ixgbe/if_ixv.c
index bd06492..a90b4f2 100644
--- a/dev/ixgbe/if_ixv.c
+++ b/dev/ixgbe/if_ixv.c
@@ -1700,6 +1700,7 @@ ixv_initialize_receive_units(struct adapter *adapter)
                /* Do the queue enabling last */
                rxdctl = IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(i));
                rxdctl |= IXGBE_RXDCTL_ENABLE;
+               rxdctl |= IXGBE_RXDCTL_VME;
                IXGBE_WRITE_REG(hw, IXGBE_VFRXDCTL(i), rxdctl);
                for (int k = 0; k < 10; k++) {
                        if (IXGBE_READ_REG(hw, IXGBE_VFRXDCTL(i)) &

All this with an unmodified host driver. The patch probably breaks VLANs inside
the instance in some way.
-------------------------------------------------------------------------------------
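
For reference, the 802.1Q tag seen in the capture above can be decoded by
hand. This is only a minimal Python sketch, not part of the patch or the
driver: the function name parse_vlan_header is hypothetical, and the sample
frame is reconstructed from the MAC addresses and the "vlan 2048, p 0" fields
in the tcpdump output.

```python
import struct

ETHERTYPE_VLAN = 0x8100  # 802.1Q tag protocol identifier, as shown by tcpdump
ETHERTYPE_ARP = 0x0806


def parse_vlan_header(frame: bytes):
    """Return (vlan_id, priority, inner_ethertype), or None if untagged."""
    # Bytes 0-11 hold the destination and source MACs; the EtherType follows.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_VLAN:
        return None
    # The tag is a 16-bit TCI followed by the encapsulated EtherType.
    tci, inner = struct.unpack_from("!HH", frame, 14)
    vlan_id = tci & 0x0FFF   # low 12 bits
    priority = tci >> 13     # top 3 bits (PCP)
    return vlan_id, priority, inner


# Header rebuilt from the capture: dst, src, 802.1Q tag, ARP.
dst = bytes.fromhex("123994730b1d")
src = bytes.fromhex("128d18b1e56b")
tagged = dst + src + struct.pack("!HHH", ETHERTYPE_VLAN, 2048, ETHERTYPE_ARP)
untagged = dst + src + struct.pack("!H", ETHERTYPE_ARP)

print(parse_vlan_header(tagged))
print(parse_vlan_header(untagged))
```

Setting IXGBE_RXDCTL_VME in the patch asks the hardware to strip this 4-byte
tag on receive, which is presumably why the traffic then shows up on ixv0
without needing the vlan0 interface.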

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org  Wed Nov 25 08:25:00 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8E5D1A37A1C
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Wed, 25 Nov 2015 08:25:00 +0000 (UTC)
 (envelope-from daniel.bilik@neosystem.cz)
Received: from mail.neosystem.cz (mail.neosystem.cz [94.23.169.88])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 55D8C190D;
 Wed, 25 Nov 2015 08:24:59 +0000 (UTC)
 (envelope-from daniel.bilik@neosystem.cz)
Received: from mail.neosystem.cz (unknown [127.0.10.15])
 by mail.neosystem.cz (Postfix) with ESMTP id 57990B8C7;
 Wed, 25 Nov 2015 09:24:51 +0100 (CET)
X-Virus-Scanned: amavisd-new at mail.neosystem.cz
Received: from dragon.sn.neosystem.cz (unknown
 [IPv6:2001:41d0:2:5ab8::100:101])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mail.neosystem.cz (Postfix) with ESMTPSA id 76B2DB8C1;
 Wed, 25 Nov 2015 09:24:50 +0100 (CET)
Date: Wed, 25 Nov 2015 09:21:45 +0100
From: Daniel Bilik <ddb@neosystem.org>
To: Kristof Provost <kp@FreeBSD.org>
Cc: freebsd-net@freebsd.org
Subject: Re: Outgoing packets being sent via wrong interface
Message-Id: <20151125092145.e93151af70085c2b3393f149@neosystem.cz>
In-Reply-To: <20151122130240.165a50286cbaa9288ffc063b@neosystem.cz>
References: <20151120155511.5fb0f3b07228a0c829fa223f@neosystem.org>
 <C1D7F956-81C9-4ED4-99B8-E0C73A3ECB37@FreeBSD.org>
 <20151120163431.3449a473db9de23576d3a4b4@neosystem.org>
 <20151121212043.GC2307@vega.codepro.be>
 <20151122130240.165a50286cbaa9288ffc063b@neosystem.cz>
Organization: neosystem.cz
X-Mailer: Sylpheed 3.4.3 (GTK+ 2.24.28; x86_64-portbld-dragonfly4.3)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 25 Nov 2015 08:25:00 -0000

On Sun, 22 Nov 2015 13:02:40 +0100
Daniel Bilik <ddb@neosystem.org> wrote:

> Well, even though pf may play some role in the problem, I tend to suspect
> the routing table as the main trigger. There are several facts to support
> this...

It happened again yesterday, and I can now definitely confirm that it's
related to the default route.

In this case, the affected address was 192.168.2.33. This host was unable to
connect to 192.168.2.15 (a jail on the router), and the router itself was
unable even to ping the affected host...

PING 192.168.2.33 (192.168.2.33): 56 data bytes
ping: sendto: Operation not permitted
ping: sendto: Operation not permitted

... because again it was pushing outgoing packets the wrong way, via the
public interface, where they are dropped by pf...

00:00:07.091814 rule 53..16777216/0(match): block out on re0: 82.x.y.50 > 192.168.2.33: ICMP echo request, id 12037, seq 0, length 64
00:00:01.011536 rule 53..16777216/0(match): block out on re0: 82.x.y.50 > 192.168.2.33: ICMP echo request, id 12037, seq 1, length 64

I tried simply deleting the default route and adding it back to the routing
table. Ping was running in one tmux session, and in another session I ran
this...

# route delete default ; sleep 1 ; route add default 82.x.y.29

... and voila, ping started to communicate with affected host...

ping: sendto: Operation not permitted
ping: sendto: Operation not permitted
64 bytes from 192.168.2.33: icmp_seq=12 ttl=128 time=0.535 ms
64 bytes from 192.168.2.33: icmp_seq=13 ttl=128 time=0.264 ms

Touching nothing else (pf etc.), not rebooting, just "refreshing" the
default route entry, and the problem disappeared.

--
						Dan

From owner-freebsd-net@freebsd.org  Wed Nov 25 12:20:35 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 73697A37454
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Wed, 25 Nov 2015 12:20:35 +0000 (UTC)
 (envelope-from gpalmer@freebsd.org)
Received: from mail.in-addr.com (mail.in-addr.com
 [IPv6:2a01:4f8:191:61e8::2525:2525])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 365A714D2;
 Wed, 25 Nov 2015 12:20:35 +0000 (UTC)
 (envelope-from gpalmer@freebsd.org)
Received: from gjp by mail.in-addr.com with local (Exim 4.86 (FreeBSD))
 (envelope-from <gpalmer@freebsd.org>)
 id 1a1Z3p-0004eR-3g; Wed, 25 Nov 2015 12:20:33 +0000
Date: Wed, 25 Nov 2015 12:20:33 +0000
From: Gary Palmer <gpalmer@freebsd.org>
To: Daniel Bilik <ddb@neosystem.org>
Cc: Kristof Provost <kp@FreeBSD.org>, freebsd-net@freebsd.org
Subject: Re: Outgoing packets being sent via wrong interface
Message-ID: <20151125122033.GB41119@in-addr.com>
References: <20151120155511.5fb0f3b07228a0c829fa223f@neosystem.org>
 <C1D7F956-81C9-4ED4-99B8-E0C73A3ECB37@FreeBSD.org>
 <20151120163431.3449a473db9de23576d3a4b4@neosystem.org>
 <20151121212043.GC2307@vega.codepro.be>
 <20151122130240.165a50286cbaa9288ffc063b@neosystem.cz>
 <20151125092145.e93151af70085c2b3393f149@neosystem.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151125092145.e93151af70085c2b3393f149@neosystem.cz>
X-SA-Exim-Connect-IP: <locally generated>
X-SA-Exim-Mail-From: gpalmer@freebsd.org
X-SA-Exim-Scanned: No (on mail.in-addr.com); SAEximRunCond expanded to false
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 25 Nov 2015 12:20:35 -0000

On Wed, Nov 25, 2015 at 09:21:45AM +0100, Daniel Bilik wrote:
> On Sun, 22 Nov 2015 13:02:40 +0100
> Daniel Bilik <ddb@neosystem.org> wrote:
> 
> > Well, even though pf may play some role in the problem, I tend to suspect
> > the routing table as the main trigger. There are several facts to support
> > this...
> 
> It happened again, yesterday, and I can now definitely confirm that it's
> related to default route.
> 
> In this case, affected address was 192.168.2.33. This host was unable to
> connect to 192.168.2.15 (jail on the router), and router itself was unable
> to even ping the affected host...
> 
> PING 192.168.2.33 (192.168.2.33): 56 data bytes
> ping: sendto: Operation not permitted
> ping: sendto: Operation not permitted
> 
> ... because again it was pushing outgoing packets wrong way, via public
> interface, where it's dropped by pf...
> 
> 00:00:07.091814 rule 53..16777216/0(match): block out on re0: 82.x.y.50 > 192.168.2.33: ICMP echo request, id 12037, seq 0, length 64
> 00:00:01.011536 rule 53..16777216/0(match): block out on re0: 82.x.y.50 > 192.168.2.33: ICMP echo request, id 12037, seq 1, length 64
> 
> I've tried to just delete default route and enter it back to routing table.
> In one tmux session ping was running, in another session I've performed
> this...
> 
> # route delete default ; sleep 1 ; route add default 82.x.y.29
> 
> ... and voila, ping started to communicate with affected host...
> 
> ping: sendto: Operation not permitted
> ping: sendto: Operation not permitted
> 64 bytes from 192.168.2.33: icmp_seq=12 ttl=128 time=0.535 ms
> 64 bytes from 192.168.2.33: icmp_seq=13 ttl=128 time=0.264 ms
> 
> Touching nothing else (pf etc.), not rebooting, just "refreshing" the
> default route entry, and the problem disappeared.

When the problem happens, what does the output of

route -n get <unreachable IP>

show?  It would also be worth checking the arp table.

Gary

From owner-freebsd-net@freebsd.org  Wed Nov 25 12:53:02 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id A4E6CA37B85
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Wed, 25 Nov 2015 12:53:02 +0000 (UTC)
 (envelope-from kp@vega.codepro.be)
Received: from venus.codepro.be (venus.codepro.be [IPv6:2a01:4f8:162:1127::2])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
 bits))
 (Client CN "*.codepro.be", Issuer "Gandi Standard SSL CA 2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 6A4D513E2
 for <freebsd-net@freebsd.org>; Wed, 25 Nov 2015 12:53:01 +0000 (UTC)
 (envelope-from kp@vega.codepro.be)
Received: from vega.codepro.be (unknown [172.16.1.3])
 by venus.codepro.be (Postfix) with ESMTP id 8A292D9F5;
 Wed, 25 Nov 2015 13:52:58 +0100 (CET)
Received: by vega.codepro.be (Postfix, from userid 1001)
 id 8637A1AAA6; Wed, 25 Nov 2015 13:52:58 +0100 (CET)
Date: Wed, 25 Nov 2015 13:52:58 +0100
From: Kristof Provost <kp@FreeBSD.org>
To: Daniel Bilik <ddb@neosystem.org>
Cc: freebsd-net@freebsd.org
Subject: Re: Outgoing packets being sent via wrong interface
Message-ID: <20151125125258.GB2469@vega.codepro.be>
References: <20151120155511.5fb0f3b07228a0c829fa223f@neosystem.org>
 <C1D7F956-81C9-4ED4-99B8-E0C73A3ECB37@FreeBSD.org>
 <20151120163431.3449a473db9de23576d3a4b4@neosystem.org>
 <20151121212043.GC2307@vega.codepro.be>
 <20151122130240.165a50286cbaa9288ffc063b@neosystem.cz>
 <20151125092145.e93151af70085c2b3393f149@neosystem.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20151125092145.e93151af70085c2b3393f149@neosystem.cz>
X-Checked-By-NSA: Probably
User-Agent: Mutt/1.5.24 (2015-08-30)
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 25 Nov 2015 12:53:02 -0000

On 2015-11-25 09:21:45 (+0100), Daniel Bilik <ddb@neosystem.org> wrote:
> Touching nothing else (pf etc.), not rebooting, just "refreshing" the
> default route entry, and the problem disappeared.
> 
I was still inclined to suspect pf based on your previous findings,
because pf subscribes to IP address (and group) information, so changing
those could have triggered something in pf.
It doesn't subscribe to routing information though, so right now it does
look unlikely to be a pf issue.

Regards,
Kristof

From owner-freebsd-net@freebsd.org  Wed Nov 25 13:20:01 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8B4E1A36109
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Wed, 25 Nov 2015 13:20:01 +0000 (UTC)
 (envelope-from daniel.bilik@neosystem.cz)
Received: from mail.neosystem.cz (mail.neosystem.cz
 [IPv6:2001:41d0:2:5ab8::10:15])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 5149512F3;
 Wed, 25 Nov 2015 13:20:01 +0000 (UTC)
 (envelope-from daniel.bilik@neosystem.cz)
Received: from mail.neosystem.cz (unknown [127.0.10.15])
 by mail.neosystem.cz (Postfix) with ESMTP id 77BCEBBCE;
 Wed, 25 Nov 2015 14:19:58 +0100 (CET)
X-Virus-Scanned: amavisd-new at mail.neosystem.cz
Received: from dragon.sn.neosystem.cz (unknown
 [IPv6:2001:41d0:2:5ab8::100:101])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mail.neosystem.cz (Postfix) with ESMTPSA id 60FA7BBC8;
 Wed, 25 Nov 2015 14:19:57 +0100 (CET)
Date: Wed, 25 Nov 2015 14:16:26 +0100
From: Daniel Bilik <ddb@neosystem.org>
To: Gary Palmer <gpalmer@freebsd.org>
Cc: freebsd-net@freebsd.org
Subject: Re: Outgoing packets being sent via wrong interface
Message-Id: <20151125141626.6f9579478e1b9d0eb1d4a84f@neosystem.cz>
In-Reply-To: <20151125122033.GB41119@in-addr.com>
References: <20151120155511.5fb0f3b07228a0c829fa223f@neosystem.org>
 <C1D7F956-81C9-4ED4-99B8-E0C73A3ECB37@FreeBSD.org>
 <20151120163431.3449a473db9de23576d3a4b4@neosystem.org>
 <20151121212043.GC2307@vega.codepro.be>
 <20151122130240.165a50286cbaa9288ffc063b@neosystem.cz>
 <20151125092145.e93151af70085c2b3393f149@neosystem.cz>
 <20151125122033.GB41119@in-addr.com>
Organization: neosystem.cz
X-Mailer: Sylpheed 3.4.3 (GTK+ 2.24.28; x86_64-portbld-dragonfly4.3)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 25 Nov 2015 13:20:01 -0000

On Wed, 25 Nov 2015 12:20:33 +0000
Gary Palmer <gpalmer@freebsd.org> wrote:

> When the problem happens, what does the output of
> route -n get <unreachable IP>
> show?

I'll check this next time it happens. Thanks for the tip. Right now it
seems correct:

   route to: 192.168.2.33
destination: 192.168.2.0
       mask: 255.255.255.0
        fib: 0
  interface: re1
      flags: <UP,DONE,PINNED>
 recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
       0         0         0         0      1500         1         0 

>  It would also be worth checking the arp table.

Yes, checking the ARP table was one of the first things I did when analyzing
the problem. All ARP entries seem correct, and they do not change before,
during, or after the problem. I've also tried manually removing the ARP
entry for the affected address (i.e. forcing it to be refreshed), but it
did not help.
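
For comparison, the expected behaviour is longest-prefix matching: the /24
on re1 should always win over the default route for 192.168.2.33. Below is a
minimal Python sketch of that rule. It is not how the kernel implements
lookups; the helper name lookup is hypothetical, and the default route
pointing out re0 is an assumption inferred from the pf log.

```python
import ipaddress


def lookup(routes, dst):
    """Return the (network, interface) entry with the longest matching prefix."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, ifname) for net, ifname in routes if addr in net]
    if not matches:
        return None
    # The most specific route is the one with the longest prefix length.
    return max(matches, key=lambda entry: entry[0].prefixlen)


routes = [
    (ipaddress.ip_network("0.0.0.0/0"), "re0"),       # default route (assumed: public side)
    (ipaddress.ip_network("192.168.2.0/24"), "re1"),  # connected LAN, as in the output above
]

print(lookup(routes, "192.168.2.33"))  # should pick the /24 on re1
print(lookup(routes, "8.8.8.8"))       # only the default route matches
```

During the failure the packets behave as if only the first entry existed.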

--
						Dan

From owner-freebsd-net@freebsd.org  Wed Nov 25 14:30:28 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 67494A371DB
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Wed, 25 Nov 2015 14:30:28 +0000 (UTC)
 (envelope-from wuhao.thu@gmail.com)
Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org
 [IPv6:2001:1900:2254:206a::50:5])
 by mx1.freebsd.org (Postfix) with ESMTP id 49CE019C0
 for <freebsd-net@freebsd.org>; Wed, 25 Nov 2015 14:30:28 +0000 (UTC)
 (envelope-from wuhao.thu@gmail.com)
Received: by mailman.ysv.freebsd.org (Postfix)
 id 49D93A371DA; Wed, 25 Nov 2015 14:30:28 +0000 (UTC)
Delivered-To: net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 49796A371D9
 for <net@mailman.ysv.freebsd.org>; Wed, 25 Nov 2015 14:30:28 +0000 (UTC)
 (envelope-from wuhao.thu@gmail.com)
Received: from mail-lf0-x232.google.com (mail-lf0-x232.google.com
 [IPv6:2a00:1450:4010:c07::232])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id C7F1519BE
 for <net@freebsd.org>; Wed, 25 Nov 2015 14:30:27 +0000 (UTC)
 (envelope-from wuhao.thu@gmail.com)
Received: by lfdl133 with SMTP id l133so62234427lfd.2
 for <net@freebsd.org>; Wed, 25 Nov 2015 06:30:25 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:date:message-id:subject:from:to:content-type;
 bh=/dTR4BJ1E9iHitRR0fscwfFZGAlZnMx9JToQ7T8bcyA=;
 b=Az67MlS3qsJpAMjF/aRMEnW/Fjy4DG77VGJPpv5JnJtTohTev3mSfWdeS1AYmLQPJh
 HxvppQ/s9sHrW/QdiIxFmp4TmpdD6QZaEv0jeVGNxiakhjQuVzNneeNkgqNoGxl6SpLZ
 vTaHvbLwnvoFGtgF4VUH0V7kZIJQCl9tyReEoDRsisfelWImR6KfAHYB+22H4wXjWg52
 gCsyXNWYLhjD0htRfZcvuvp99oy0yqkEXevyIsUhwef/mPbQKqViy9K01ZHUnEhri+M+
 s5xdzaYlI0ijKzmZYMpzWzzrdmOnx5VOnXNRzy+qfOe4e2Tkz3CGpWCQCaogQmr3PtRQ
 lDqQ==
MIME-Version: 1.0
X-Received: by 10.25.20.95 with SMTP id k92mr15797410lfi.13.1448461825789;
 Wed, 25 Nov 2015 06:30:25 -0800 (PST)
Received: by 10.25.148.213 with HTTP; Wed, 25 Nov 2015 06:30:25 -0800 (PST)
Date: Wed, 25 Nov 2015 22:30:25 +0800
Message-ID: <CAHmvmjGQzAUGf=UGQHXx0C8TPLAOa8G6ySD60BjbxhjGRgZM7Q@mail.gmail.com>
Subject: Can I send the Ethernet frames with particular payload via Netmap?
From: Hao Wu <wuhao.thu@gmail.com>
To: net@freebsd.org
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 25 Nov 2015 14:30:28 -0000

Hi all,

   I have just started using Netmap. Can I build Ethernet frames myself and
send them out via Netmap? I used to build the frames with Libnet, but it is
too slow, so I am turning to Netmap now. However, I have no idea how to write
the code using Netmap, or which functions I should call.

   Any reply is highly appreciated!

+++++++++++++++++
Best,
Hao
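
Netmap transmits whatever bytes are placed in a TX slot, so the frame has to
be assembled by hand. The following is only a minimal Python sketch of that
assembly step, with no netmap calls in it: build_frame and mac_to_bytes are
hypothetical helpers, the source MAC is made up, and 0x88B5 is the IEEE
"local experimental" EtherType used here as a placeholder. In C the same
bytes would be copied into a netmap TX ring slot; the pkt-gen tool shipped
with netmap is probably the best reference for that part.

```python
import struct

ETH_MIN_PAYLOAD = 46  # pad so the frame is at least 60 bytes (FCS excluded)


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_frame(dst: str, src: str, ethertype: int, payload: bytes) -> bytes:
    """Assemble dst MAC + src MAC + EtherType + zero-padded payload."""
    header = mac_to_bytes(dst) + mac_to_bytes(src) + struct.pack("!H", ethertype)
    return header + payload.ljust(ETH_MIN_PAYLOAD, b"\x00")


# Broadcast frame with a custom payload (addresses are illustrative only).
frame = build_frame("ff:ff:ff:ff:ff:ff", "02:00:00:00:00:01", 0x88B5, b"hello")
print(len(frame), frame.hex())
```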

From owner-freebsd-net@freebsd.org  Wed Nov 25 17:29:20 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id DC887A367F9
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Wed, 25 Nov 2015 17:29:19 +0000 (UTC)
 (envelope-from kob6558@gmail.com)
Received: from mail-ob0-x235.google.com (mail-ob0-x235.google.com
 [IPv6:2607:f8b0:4003:c01::235])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 9717C1FAC;
 Wed, 25 Nov 2015 17:29:19 +0000 (UTC)
 (envelope-from kob6558@gmail.com)
Received: by obbnk6 with SMTP id nk6so43961303obb.2;
 Wed, 25 Nov 2015 09:29:18 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date:message-id:subject
 :from:to:cc:content-type;
 bh=diwozxJUKcF34tx/S1NaAG2EDpRlDL2j0Lbw0PbYOXM=;
 b=uSBVyVfth2XuB7G93SNJ5srqSScn49ot69DwAMS5TqxIuvmobET3rSYVHi+rF0Uw6z
 2j9j5dtdES/K+bC8wvQFTYYTpfKVZ5KTtf1ujxzlE16vFPZC59OXOrm0Ucl3XUORaQDX
 TsybxhNurY+044QZdTUJcA/KEPqR0C+ec4NBP00X4ztc2OhHmy2xQ04Obo3mHIGF6/Lc
 mCT3RaFTM+GslBBrx5Lm/AFxxKZNQM4dapHd198Kya+afb6nVHBrOQER7WEK6pLHgcAv
 eNUb1/ZDijhmJmdBonaXNtDw6GBID7XRFFmtOl5WUBWpgaaXnVf1yNmGszk9AWhcF6Q3
 3MPg==
MIME-Version: 1.0
X-Received: by 10.182.148.164 with SMTP id tt4mr12231753obb.25.1448472558457; 
 Wed, 25 Nov 2015 09:29:18 -0800 (PST)
Sender: kob6558@gmail.com
Received: by 10.202.98.131 with HTTP; Wed, 25 Nov 2015 09:29:18 -0800 (PST)
In-Reply-To: <20151125141626.6f9579478e1b9d0eb1d4a84f@neosystem.cz>
References: <20151120155511.5fb0f3b07228a0c829fa223f@neosystem.org>
 <C1D7F956-81C9-4ED4-99B8-E0C73A3ECB37@FreeBSD.org>
 <20151120163431.3449a473db9de23576d3a4b4@neosystem.org>
 <20151121212043.GC2307@vega.codepro.be>
 <20151122130240.165a50286cbaa9288ffc063b@neosystem.cz>
 <20151125092145.e93151af70085c2b3393f149@neosystem.cz>
 <20151125122033.GB41119@in-addr.com>
 <20151125141626.6f9579478e1b9d0eb1d4a84f@neosystem.cz>
Date: Wed, 25 Nov 2015 09:29:18 -0800
X-Google-Sender-Auth: tpUZH05jWqMyjb2TyO20Jy_VGL0
Message-ID: <CAN6yY1vECcj5_4n0fRv6j-2CUJ8mHDoDaxBkYsKZaornQm2CKg@mail.gmail.com>
Subject: Re: Outgoing packets being sent via wrong interface
From: Kevin Oberman <rkoberman@gmail.com>
To: Daniel Bilik <ddb@neosystem.org>
Cc: Gary Palmer <gpalmer@freebsd.org>, 
 "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 25 Nov 2015 17:29:20 -0000

On Wed, Nov 25, 2015 at 5:16 AM, Daniel Bilik <ddb@neosystem.org> wrote:

> On Wed, 25 Nov 2015 12:20:33 +0000
> Gary Palmer <gpalmer@freebsd.org> wrote:
>
> > When the problem happens, what does the output of
> > route -n get <unreachable IP>
> > show?
>
> I'll check this next time it happens. Thanks for the tip. Right now it
> seems correct:
>
>    route to: 192.168.2.33
> destination: 192.168.2.0
>        mask: 255.255.255.0
>         fib: 0
>   interface: re1
>       flags: <UP,DONE,PINNED>
>  recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
>        0         0         0         0      1500         1         0
>
> >  It would also be worth checking the arp table.
>
> Yes, checking arp table was one of the first things I did when analyzing
> the problem. All arp entries seem correct, and do not change
> before-during-after the problem. I've also tried to manually remove arp
> entry for affected address (ie. forcing it to be refreshed), but it
> does not help.
>
> --
>                                                 Dan


Have you looked for ICMP redirect traffic? Does your firewall allow them?
If so, could you try adding a rule to block them? I can't provide a sample
rule as I don't use pf, but you want to block ICMP type 5 messages.
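
If it helps when looking through a packet capture, the check for "ICMP type 5" can be sketched in C as below. This is only an illustration: the function name is_icmp_redirect and the two constants are made up for this example, and it assumes the buffer starts at the IPv4 header (no Ethernet header in front).

```c
#include <stddef.h>
#include <stdint.h>

#define PROTO_ICMP         1  /* IPv4 protocol number for ICMP */
#define ICMP_TYPE_REDIRECT 5  /* ICMP type 5: Redirect */

/*
 * Return 1 if pkt (starting at the IPv4 header) is an ICMP redirect,
 * 0 otherwise.  Hypothetical helper, not part of any library.
 */
int is_icmp_redirect(const uint8_t *pkt, size_t len)
{
    size_t ihl;
    uint16_t frag;

    if (pkt == NULL || len < 20)
        return 0;
    if ((pkt[0] >> 4) != 4)              /* IP version must be 4 */
        return 0;
    ihl = (size_t)(pkt[0] & 0x0f) * 4;   /* header length in bytes */
    if (ihl < 20 || len < ihl + 1)
        return 0;
    if (pkt[9] != PROTO_ICMP)            /* protocol field */
        return 0;
    frag = (uint16_t)((pkt[6] << 8) | pkt[7]);
    if (frag & 0x1fff)                   /* later fragments carry no ICMP header */
        return 0;
    return pkt[ihl] == ICMP_TYPE_REDIRECT;  /* first ICMP byte is the type */
}
```

The same test (protocol 1, type 5) is what a firewall rule blocking redirects would have to match.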

For a good overview of redirects, see either Wikipedia
<https://en.wikipedia.org/wiki/Internet_Control_Message_Protocol#Redirect>
or Cisco
<http://www.cisco.com/c/en/us/support/docs/ip/routing-information-protocol-rip/13714-43.html>
articles (or Google for many others).
--
Kevin Oberman, Part time kid herder and retired Network Engineer
E-mail: rkoberman@gmail.com
PGP Fingerprint: D03FB98AFA78E3B78C1694B318AB39EF1B055683

From owner-freebsd-net@freebsd.org  Wed Nov 25 22:45:14 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id A5E62A362AF
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Wed, 25 Nov 2015 22:45:14 +0000 (UTC)
 (envelope-from rysto32@gmail.com)
Received: from mail-io0-x22e.google.com (mail-io0-x22e.google.com
 [IPv6:2607:f8b0:4001:c06::22e])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 6AFCF1870
 for <freebsd-net@freebsd.org>; Wed, 25 Nov 2015 22:45:14 +0000 (UTC)
 (envelope-from rysto32@gmail.com)
Received: by iouu10 with SMTP id u10so68977339iou.0
 for <freebsd-net@freebsd.org>; Wed, 25 Nov 2015 14:45:13 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=FxzwdmOliw6wYmn/EaPvNNfg7IXZMPrYFx6lkPRpdzY=;
 b=jnkJ5gVr7gn0NoNjFZz+hD8XwlSiu+P2AjklOrb5x3XCT24UtaK4yInfYrTBgPvzgM
 S0V2er4KEbEm0QgqqHI/BW9JeDmjuyWJHNaeFISW2II35S6uz51Us+fEhhs4eYUEFzFH
 jJRYX3u+m9S3pbRVKk+EOAmZ3DQyoKT1WUN5ndHLGW35pTXpR4nLvDNKi6/792EQd+OB
 58m3p/Jp0uvC1T5X/AJwNMzGq0hcecZfXJVFRAl3DnBs4MPRTT2jvkauG5yjmBK4lYQ0
 WAvqf2QcTUM6csis4wUAxSaBtRLQ6Qz9KOoaU5YCsqp9iPSMWE1Iec0Fop2sJZ2qOGH6
 niSQ==
MIME-Version: 1.0
X-Received: by 10.107.16.18 with SMTP id y18mr39247920ioi.113.1448491513697;
 Wed, 25 Nov 2015 14:45:13 -0800 (PST)
Received: by 10.107.170.102 with HTTP; Wed, 25 Nov 2015 14:45:13 -0800 (PST)
In-Reply-To: <CAN6yY1vECcj5_4n0fRv6j-2CUJ8mHDoDaxBkYsKZaornQm2CKg@mail.gmail.com>
References: <20151120155511.5fb0f3b07228a0c829fa223f@neosystem.org>
 <C1D7F956-81C9-4ED4-99B8-E0C73A3ECB37@FreeBSD.org>
 <20151120163431.3449a473db9de23576d3a4b4@neosystem.org>
 <20151121212043.GC2307@vega.codepro.be>
 <20151122130240.165a50286cbaa9288ffc063b@neosystem.cz>
 <20151125092145.e93151af70085c2b3393f149@neosystem.cz>
 <20151125122033.GB41119@in-addr.com>
 <20151125141626.6f9579478e1b9d0eb1d4a84f@neosystem.cz>
 <CAN6yY1vECcj5_4n0fRv6j-2CUJ8mHDoDaxBkYsKZaornQm2CKg@mail.gmail.com>
Date: Wed, 25 Nov 2015 17:45:13 -0500
Message-ID: <CAFMmRNyhc06utwTtTq69NQMqqMYz+RGGKYAU3Y5iPz3YSsggTg@mail.gmail.com>
Subject: Re: Outgoing packets being sent via wrong interface
From: Ryan Stone <rysto32@gmail.com>
To: Kevin Oberman <rkoberman@gmail.com>
Cc: Daniel Bilik <ddb@neosystem.org>,
 "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 25 Nov 2015 22:45:14 -0000

An easier way to block ICMP redirects would be to set the sysctl:

sysctl net.inet.icmp.drop_redirect=1

From owner-freebsd-net@freebsd.org  Thu Nov 26 10:35:40 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id B8102A39CFA
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Thu, 26 Nov 2015 10:35:40 +0000 (UTC) (envelope-from ulric@siag.nu)
Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org
 [IPv6:2001:1900:2254:206a::50:5])
 by mx1.freebsd.org (Postfix) with ESMTP id 976F4179A
 for <freebsd-net@freebsd.org>; Thu, 26 Nov 2015 10:35:40 +0000 (UTC)
 (envelope-from ulric@siag.nu)
Received: by mailman.ysv.freebsd.org (Postfix)
 id 96542A39CF8; Thu, 26 Nov 2015 10:35:40 +0000 (UTC)
Delivered-To: net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 95E05A39CF4
 for <net@mailman.ysv.freebsd.org>; Thu, 26 Nov 2015 10:35:40 +0000 (UTC)
 (envelope-from ulric@siag.nu)
Received: from smtp.outgoing.loopia.se (smtp.outgoing.loopia.se [194.9.95.113])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 52FF71798
 for <net@freebsd.org>; Thu, 26 Nov 2015 10:35:39 +0000 (UTC)
 (envelope-from ulric@siag.nu)
Received: from s314.loopia.se (localhost [127.0.0.1])
 by s314.loopia.se (Postfix) with ESMTP id C7FAF162B579
 for <net@freebsd.org>; Thu, 26 Nov 2015 11:25:54 +0100 (CET)
X-Loopia-Auth: webmail
X-Loopia-User: ulric@siag.nu
Received: from s498.loopia.se (unknown [172.21.200.96])
 by s314.loopia.se (Postfix) with ESMTP id AA90220057FD;
 Thu, 26 Nov 2015 11:25:54 +0100 (CET)
Received: from s405.loopia.se (unknown [172.21.200.105])
 by s498.loopia.se (Postfix) with ESMTP id A30D245F912;
 Thu, 26 Nov 2015 11:25:54 +0100 (CET)
X-Virus-Scanned: amavisd-new at amavis.loopia.se
X-Spam-Flag: NO
X-Spam-Score: -0.331
X-Spam-Level: 
X-Spam-Status: No, score=-0.331 tagged_above=-999 required=6.2
 tests=[ALL_TRUSTED=-1, AWL=0.669] autolearn=disabled
Received: from s498.loopia.se ([172.21.200.105])
 by s405.loopia.se (s405.loopia.se [172.21.200.135]) (amavisd-new, port 10024)
 with LMTP id G0V0CYfB_os3; Thu, 26 Nov 2015 11:25:53 +0100 (CET)
Received: from localhost (webmail.loopia.se [194.9.95.85])
 (Authenticated sender: ulric@siag.nu)
 by s498.loopia.se (Postfix) with ESMTPA id 9E82A45EDB0;
 Thu, 26 Nov 2015 11:25:53 +0100 (CET)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII;
 format=flowed
Content-Transfer-Encoding: 7bit
Date: Thu, 26 Nov 2015 11:25:53 +0100
From: ulric@siag.nu
To: Hao Wu <wuhao.thu@gmail.com>
Cc: net@freebsd.org, owner-freebsd-net@freebsd.org
Subject: Re: Can I send the Ethernet frames with particular payload via Netmap?
In-Reply-To: <CAHmvmjGQzAUGf=UGQHXx0C8TPLAOa8G6ySD60BjbxhjGRgZM7Q@mail.gmail.com>
References: <CAHmvmjGQzAUGf=UGQHXx0C8TPLAOa8G6ySD60BjbxhjGRgZM7Q@mail.gmail.com>
Message-Id: <9b7b2eb25fd42295d3f2f8a46d2a3c1e@siag.nu>
X-Sender: ulric@siag.nu
User-Agent: Loopia Webmail/1.1.3
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 26 Nov 2015 10:35:40 -0000


2015-11-25 15:30 skrev Hao Wu:
> Hi all,
> 
>    I just start using Netmap. I want to know can I build the Ethernet
> frames and send out via Netmap? I used to build the Ethernet frames via
> Libnet, but it is too slow. So I turn to Netmap now. But I have no idea on
> how to write the code using Netmap or what functions should I call?
> 
>    Any replay is highly appreciated!


It's crazy simple. To open a netmap descriptor:

	d = nm_open(ifname, NULL, 0, 0);

To receive a frame:

	uint8_t *b = nm_nextpkt(d, &h);

To send a frame:

	int n = nm_inject(d, b, len);

Working example from Pen:

https://github.com/UlricE/pen/blob/master/dsr.c
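
Since the question was about building the frame itself, here is a minimal sketch of filling a buffer that could then be handed to nm_inject(). The helper build_eth_frame and the ETH_* constants are invented for this example and are not part of netmap; the interface name in the usage comment is also just an assumption. The netmap part is left as a comment because it needs net/netmap_user.h and a netmap-enabled kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN  6
#define ETH_HDR_LEN   14
#define ETH_MIN_FRAME 60    /* minimum frame length, without the 4-byte FCS */
#define ETH_MAX_FRAME 1514  /* header + 1500-byte payload, without the FCS */

/*
 * Build an Ethernet II frame in buf: destination MAC, source MAC,
 * EtherType (big-endian), payload, then zero padding up to the minimum
 * frame length.  Returns the frame length, or 0 if it does not fit.
 * Hypothetical helper for illustration only.
 */
size_t build_eth_frame(uint8_t *buf, size_t bufsize,
                       const uint8_t dst[ETH_ADDR_LEN],
                       const uint8_t src[ETH_ADDR_LEN],
                       uint16_t ethertype,
                       const void *payload, size_t payload_len)
{
    size_t len, padded;

    assert(buf != NULL && dst != NULL && src != NULL);

    len = ETH_HDR_LEN + payload_len;
    if (len > ETH_MAX_FRAME)
        return 0;
    padded = len < ETH_MIN_FRAME ? ETH_MIN_FRAME : len;
    if (padded > bufsize)
        return 0;

    memcpy(buf, dst, ETH_ADDR_LEN);
    memcpy(buf + ETH_ADDR_LEN, src, ETH_ADDR_LEN);
    buf[12] = (uint8_t)(ethertype >> 8);
    buf[13] = (uint8_t)(ethertype & 0xff);
    if (payload_len > 0)
        memcpy(buf + ETH_HDR_LEN, payload, payload_len);
    memset(buf + len, 0, padded - len);   /* pad short frames with zeros */
    return padded;
}

/*
 * Usage with netmap (not compiled here; "netmap:em0" is an assumed
 * interface name):
 *
 *   #define NETMAP_WITH_LIBS
 *   #include <net/netmap_user.h>
 *
 *   struct nm_desc *d = nm_open("netmap:em0", NULL, 0, NULL);
 *   size_t n = build_eth_frame(frame, sizeof(frame), dst, src,
 *                              0x88b5, data, data_len);
 *   if (d != NULL && n > 0)
 *       nm_inject(d, frame, n);
 *   if (d != NULL)
 *       nm_close(d);
 */
```

0x88b5 is one of the EtherTypes reserved for local experimental use, which makes it a reasonable choice for a custom payload.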

Ulric

From owner-freebsd-net@freebsd.org  Thu Nov 26 12:14:11 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0D611A39A4A
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Thu, 26 Nov 2015 12:14:11 +0000 (UTC)
 (envelope-from wuhao.thu@gmail.com)
Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org
 [IPv6:2001:1900:2254:206a::50:5])
 by mx1.freebsd.org (Postfix) with ESMTP id E2D411A60
 for <freebsd-net@freebsd.org>; Thu, 26 Nov 2015 12:14:10 +0000 (UTC)
 (envelope-from wuhao.thu@gmail.com)
Received: by mailman.ysv.freebsd.org (Postfix)
 id DFBD0A39A48; Thu, 26 Nov 2015 12:14:10 +0000 (UTC)
Delivered-To: net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id C558EA39A47;
 Thu, 26 Nov 2015 12:14:10 +0000 (UTC)
 (envelope-from wuhao.thu@gmail.com)
Received: from mail-lf0-x233.google.com (mail-lf0-x233.google.com
 [IPv6:2a00:1450:4010:c07::233])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 4EB551A5E;
 Thu, 26 Nov 2015 12:14:10 +0000 (UTC)
 (envelope-from wuhao.thu@gmail.com)
Received: by lfs39 with SMTP id 39so92891988lfs.3;
 Thu, 26 Nov 2015 04:14:08 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=IbG9KZdhjTjxNiP+DBXerQIsDZzErEqKKQEN9i5GhVM=;
 b=0ZO1wOrOT8N4z84DW+iRDS1+WI0DSfinaSmiMAOMB5bOGjzMhpQ3sHZytlxPXpXjDe
 +r2m2xJIi2pIqMGs9+AWHqVBOfoGTlm2ssLGBqPMWirrw5DBYk5cNbXvz8G+2Y0FuW0H
 ulJIZ+6O8T7bRPDKfVxuTsXC/QluoHiGrVLWTCG6TqfovTdDqSlZ5VL8htft56jlX/g1
 L6EhQGbLhpwcBktqjCpyZgmz3fKMZOOUgzcKA3kqvgyK7oQH8BIwj9xO0owgzI/jndwW
 wqbDsr5Y4+xN26mmvG0xadpAFnjX6q90msrTuZwPd7JoQFP+83ZGpSAHtmKmKOzecH1v
 BCwA==
MIME-Version: 1.0
X-Received: by 10.112.199.4 with SMTP id jg4mr14915780lbc.59.1448540048159;
 Thu, 26 Nov 2015 04:14:08 -0800 (PST)
Received: by 10.25.148.213 with HTTP; Thu, 26 Nov 2015 04:14:08 -0800 (PST)
In-Reply-To: <9b7b2eb25fd42295d3f2f8a46d2a3c1e@siag.nu>
References: <CAHmvmjGQzAUGf=UGQHXx0C8TPLAOa8G6ySD60BjbxhjGRgZM7Q@mail.gmail.com>
 <9b7b2eb25fd42295d3f2f8a46d2a3c1e@siag.nu>
Date: Thu, 26 Nov 2015 20:14:08 +0800
Message-ID: <CAHmvmjHyOKiTt6Q60tqt7gbDnDuBPa27wGHCPtOSV_gr_OtuHA@mail.gmail.com>
Subject: Re: Can I send the Ethernet frames with particular payload via Netmap?
From: Hao Wu <wuhao.thu@gmail.com>
To: ulric@siag.nu
Cc: net@freebsd.org, owner-freebsd-net@freebsd.org
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.20
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 26 Nov 2015 12:14:11 -0000

Hi Ulric,

   Got it! Many thanks :)

+++++++++++++++++
Best,
Hao

On Thu, Nov 26, 2015 at 6:25 PM, <ulric@siag.nu> wrote:

>
>
> 2015-11-25 15:30 skrev Hao Wu:
>
>> Hi all,
>>
>>    I just start using Netmap. I want to know can I build the Ethernet
>> frames and send out via Netmap? I used to build the Ethernet frames via
>> Libnet, but it is too slow. So I turn to Netmap now. But I have no idea on
>> how to write the code using Netmap or what functions should I call?
>>
>>    Any replay is highly appreciated!
>>
>
>
> It's crazy simple. To open a netmap descriptor:
>
>         d = nm_open(ifname, NULL, 0, 0);
>
> To receive a frame:
>
>         uint8_t *b = nm_nextpkt(d, &h);
>
> To send a frame:
>
>         int n = nm_inject(d, b, len);
>
> Working example from Pen:
>
> https://github.com/UlricE/pen/blob/master/dsr.c
>
> Ulric
>

From owner-freebsd-net@freebsd.org  Thu Nov 26 16:41:52 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 06B4AA3A7A4
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Thu, 26 Nov 2015 16:41:52 +0000 (UTC)
 (envelope-from steven@multiplay.co.uk)
Received: from mail-wm0-x22c.google.com (mail-wm0-x22c.google.com
 [IPv6:2a00:1450:400c:c09::22c])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 8FE641042
 for <freebsd-net@freebsd.org>; Thu, 26 Nov 2015 16:41:51 +0000 (UTC)
 (envelope-from steven@multiplay.co.uk)
Received: by wmww144 with SMTP id w144so27951819wmw.1
 for <freebsd-net@freebsd.org>; Thu, 26 Nov 2015 08:41:49 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=multiplay-co-uk.20150623.gappssmtp.com; s=20150623;
 h=subject:to:references:cc:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-type:content-transfer-encoding;
 bh=zGaCqlfEIyjkycC5lVbrA2dzIttoGT3UhSnIc9uUsmg=;
 b=Ym0//sDfx7OCjmEgNh+olVlxqUzaGW6TZBrgrEPr/wqsjg+X1GD/li/okeKp5oO5W6
 da/d6TwYLjQ33813k6T5/AcZ2ZvN/YO9TN2SzZFYXQ7wQ6/Xfrb7Ww0rBbBgZwHFt85Y
 LNED3jwNpvyc3yMSFGhQE/FTWM1+KEJ4iCKR6xd+rWmNAG81tr7dxj7JzgQD+V7tRJil
 rnJGOtx/dZ/sUsEsKrama/VwNQhcUOo3O13mQKWUtUpUa1nCdQ59PzpvEkiqmKQOATJw
 46LyInd9GuqC0hV4n/TpwK6ZjkUNVguIdQmrJ1oc3hpMneF2+TWm8SQhwAtVzA4Q+Nqr
 rR4g==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:subject:to:references:cc:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-type
 :content-transfer-encoding;
 bh=zGaCqlfEIyjkycC5lVbrA2dzIttoGT3UhSnIc9uUsmg=;
 b=lEnstK5NusFe8sAv1IiiQmnh04FdXjq47kzN++68TVwxkT105gb0wooeVuWEgw/QKs
 042aZHQDQihsqqLvzwTUdsEGwXWMbn/g6Xpfas8vEvliaaDTL8Tz2HZqN+ytUndlvSwx
 qTSO1WOxX2j4kK9S+KY353JbtmWFi2vAmGlPpsJoCF5TSCEQCt0w3vtathsrCURdGfMc
 nwzrYQ4XpFrHfoLctjByELR9BfOvDLN59sFaOjJLUEs9aB2LIjyZWesa5PURpTqB7uW+
 THFY8vyuLH2AkgH+VVNJ9ThZcQbYrClgFaF12MAnkYVs2IeFAZqvR7tIJ/KQ1GUCl8Ao
 eDbw==
X-Gm-Message-State: ALoCoQmP5A3hTGRL1C6A2KSMuwJcGm6wywPj+4MaiGbBdcQWdDmnfPi+6F040UM6L5pvedpmfEkA
X-Received: by 10.28.4.7 with SMTP id 7mr4600539wme.85.1448556109659;
 Thu, 26 Nov 2015 08:41:49 -0800 (PST)
Received: from [10.10.1.58] (liv3d.labs.multiplay.co.uk. [82.69.141.171])
 by smtp.gmail.com with ESMTPSA id jz1sm28811223wjc.27.2015.11.26.08.41.48
 (version=TLSv1/SSLv3 cipher=OTHER);
 Thu, 26 Nov 2015 08:41:48 -0800 (PST)
Subject: Re: Intel XL710 broken link down detection?
To: "Pieper, Jeffrey E" <jeffrey.e.pieper@intel.com>,
 Ryan Stone <rysto32@gmail.com>
References: <564357E0.1050002@freebsd.org> <56436A5F.4020102@multiplay.co.uk>
 <CAFMmRNwuc4N+6TSCFuknbfYZnMzXuuunFbyeGfyaOzYWxgWfaA@mail.gmail.com>
 <56446159.3080405@multiplay.co.uk>
 <2A35EA60C3C77D438915767F458D65688080C4A8@ORSMSX111.amr.corp.intel.com>
Cc: Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
From: Steven Hartland <steven@multiplay.co.uk>
Message-ID: <5657364C.10806@multiplay.co.uk>
Date: Thu, 26 Nov 2015 16:41:48 +0000
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101
 Thunderbird/38.3.0
MIME-Version: 1.0
In-Reply-To: <2A35EA60C3C77D438915767F458D65688080C4A8@ORSMSX111.amr.corp.intel.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 26 Nov 2015 16:41:52 -0000

It's been a couple of weeks now, so I wanted to check whether you have any
news on this, Pieper?

Also, for extra visibility: I'm looking to bring ixl in stable/10, which is
quite a way behind, up to date by MFCing the commits present in HEAD.

One commit, which requires big changes between HEAD and stable/10 because it
combines RSS support (not going to be MFC'ed) with some bug fixes, is up for
review here:
https://reviews.freebsd.org/D4265

I'm looking to use this for a 10.x based DC rollout over the next few 
weeks, so if anyone can look at that it would be most appreciated.

     Regards
     Steve

On 12/11/2015 15:18, Pieper, Jeffrey E wrote:
> We already have a fix in place that will be committed for review shortly.
>
> Thanks,
> Jeff
>
> -----Original Message-----
> From: owner-freebsd-net@freebsd.org [mailto:owner-freebsd-net@freebsd.org] On Behalf Of Steven Hartland
> Sent: Thursday, November 12, 2015 1:52 AM
> To: Ryan Stone <rysto32@gmail.com>
> Cc: Jack F Vogel <jfv@freebsd.org>; freebsd-net@freebsd.org
> Subject: Re: Intel XL710 broken link down detection?
>
> Yes this works but a better way IMO would be to invert the bits we want:
> https://people.freebsd.org/~smh/ixl_int_init.patch
>
> If there are no objections then I'll commit this later today.
>
> Also just fixed the debug sysctls from causing panics when compiled with
> INVARIANTS see:
> https://svnweb.freebsd.org/base?view=revision&revision=290708
>
>       Regards
>       Steve
>
> On 11/11/2015 16:31, Ryan Stone wrote:
>> On Wed, Nov 11, 2015 at 11:18 AM, Steven Hartland
>> <steven@multiplay.co.uk <mailto:steven@multiplay.co.uk>> wrote:
>>
>>      Comparing this to the Linux driver which does detect the link down
>>      I've discovered it actually polls the link status by default in
>>      its watchdog.
>>
>>      Disabling this with "ethtool --set-priv-flags eth1 LinkPolling
>>      off" and the Linux driver also fails to detect link down.
>>
>>      So this seems like a firmware or even hardware bug where it should
>>      be reporting down events and the Linux driver has been updated to
>>      workaround the problem?
>>
>>
>> No, apparently the Linux devs just didn't read the datasheet closely
>> enough (and presumably the FreeBSD driver copied the mistake).  There
>> is a mask of interrupt causes that works backwards from how one would
>> expect; you mask out events that you *don't* want rather than events
>> that you do want.  Both the Linux and FreeBSD drivers pass a mask of
>> events that they want interrupts for (the only reason why it appears
>> to work on link up is that the the AN Completed event fires when link
>> is up, as far as I can tell).  Try the following patch:
>>
>> https://people.freebsd.org/~rstone/patches/ixl_link_int.diff
>> <https://people.freebsd.org/%7Erstone/patches/ixl_link_int.diff>
>>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
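
For readers following the quoted discussion, the inverted mask semantics Ryan describes can be sketched in C as follows. The event bits and function names are hypothetical and chosen only for this illustration; the real XL710 definitions are in the driver headers and the datasheet. The assumption is simply that a set bit in the mask suppresses the corresponding interrupt cause.

```c
#include <stdint.h>

/* Hypothetical event bits, NOT the real XL710 register layout. */
#define EV_LINK_UP      (1u << 0)
#define EV_LINK_DOWN    (1u << 1)
#define EV_AN_COMPLETED (1u << 2)
#define EV_MEDIA_NA     (1u << 3)
#define EV_ALL          (EV_LINK_UP | EV_LINK_DOWN | EV_AN_COMPLETED | EV_MEDIA_NA)

/*
 * The hardware expects a mask of events to SUPPRESS, so the mask for a
 * set of wanted events is the complement of that set.
 */
uint16_t mask_for_wanted(uint16_t wanted)
{
    return (uint16_t)(~wanted & EV_ALL);
}

/* An event raises an interrupt only if its bit is clear in the mask. */
int event_fires(uint16_t mask, uint16_t event)
{
    return (mask & event) == 0;
}
```

Passing the wanted set directly as the mask (the mistake described above) suppresses link down, while an event left out of the wanted set, such as AN Completed, still fires. That matches the observation that link up appeared to work.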


From owner-freebsd-net@freebsd.org  Thu Nov 26 22:56:36 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 45E61A3A522
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Thu, 26 Nov 2015 22:56:36 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 316FF18E1
 for <freebsd-net@FreeBSD.org>; Thu, 26 Nov 2015 22:56:36 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tAQMuaeE075106
 for <freebsd-net@FreeBSD.org>; Thu, 26 Nov 2015 22:56:36 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-net@FreeBSD.org
Subject: [Bug 204831] mld_v2 listener report does not report all active
 groups  to the router
Date: Thu, 26 Nov 2015 22:56:35 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 9.3-STABLE
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: linimon@FreeBSD.org
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: freebsd-net@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: assigned_to
Message-ID: <bug-204831-2472-snU6HI3XTn@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-204831-2472@https.bugs.freebsd.org/bugzilla/>
References: <bug-204831-2472@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 26 Nov 2015 22:56:36 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204831

Mark Linimon <linimon@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-bugs@FreeBSD.org    |freebsd-net@FreeBSD.org

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org  Fri Nov 27 01:57:46 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4D72DA3A73F
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Fri, 27 Nov 2015 01:57:46 +0000 (UTC)
 (envelope-from mmacy@nextbsd.org)
Received: from sender163-mail.zoho.com (sender163-mail.zoho.com
 [74.201.84.163])
 (using TLSv1 with cipher ECDHE-RSA-AES128-SHA (128/128 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 3E34D1A7F
 for <freebsd-net@freebsd.org>; Fri, 27 Nov 2015 01:57:45 +0000 (UTC)
 (envelope-from mmacy@nextbsd.org)
Received: from mail.zoho.com by mx.zohomail.com
 with SMTP id 1448589456055734.7743441994829;
 Thu, 26 Nov 2015 17:57:36 -0800 (PST)
Date: Thu, 26 Nov 2015 17:57:35 -0800
From: Matthew Macy <mmacy@nextbsd.org>
To: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Message-ID: <15146a8f285.b094791a15089.3823664487014698900@nextbsd.org>
Subject: TCP notes and incast recommendations
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Priority: Medium
User-Agent: Zoho Mail
X-Mailer: Zoho Mail
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 27 Nov 2015 01:57:46 -0000

In an effort to be somewhat current on the state of TCP, I've collected a
small bibliography. I've tried to summarize RFCs and papers that I believe
to be important and to provide some general background for others who do
not have a deeper familiarity with TCP or congestion control, in particular
as it impacts DCTCP.


The Recommendations section references Phabricator changes.

Table Of Contents:

I)    - A Roadmap for Transmission Control Protocol (TCP)
        Specification Documents (RFC 7414)

II)   - Metrics for the Evaluation of Congestion Control Mechanisms
        (RFC 5166)

III)  - TCP Congestion Control (RFC 5681)

IV)   - Computing TCP's Retransmission Timer (RFC 6298)

V)    - Increasing TCP's Initial Window (RFC 6928)

VI)   - TCP Extensions for High Performance [RTO updates
        and changes to RFC 1323] (RFC 7323)

VII)  - Updating TCP to Support Rate-Limited Traffic
        [Congestion Window Validation] (RFC 7661)

VIII) - Active Queue Management (AQM)

IX)   - Explicit Congestion Notification (ECN)

X)    - AccurateECN (AccECN)

XI)   - Incast Causes and Solutions

XII)  - Data Center Transmission Control Protocol (DCTCP)

XIII) - Incast TCP (ICTCP)

XIV)  - Quantum Congestion Notification (QCN)

XV)   - Recommendations


A Roadmap for Transmission Control Protocol (TCP)
Specification Documents [important]:
https://tools.ietf.org/html/rfc7414

   A correct and efficient implementation of the Transmission Control
   Protocol (TCP) is a critical part of the software of most Internet
   hosts.  As TCP has evolved over the years, many distinct documents
   have become part of the accepted standard for TCP.  At the same time,
   a large number of experimental modifications to TCP have also been
   published in the RFC series, along with informational notes, case
   studies, and other advice.

   As an introduction to newcomers and an attempt to organize the
   plethora of information for old hands, this document contains a
   roadmap to the TCP-related RFCs.  It provides a brief summary of the
   RFC documents that define TCP.  This should provide guidance to
   implementers on the relevance and significance of the standards-track
   extensions, informational notes, and best current practices that
   relate to TCP.

   This roadmap includes a brief description of the contents of each
   TCP-related RFC [N.B. I only include an excerpt of the summary for those
   that I consider interesting or important].  In some cases, we simply
   supply the abstract or a key summary sentence from the text as a terse
   description.  In addition, a letter code after an RFC number indicates
   its category in the RFC series (see BCP 9 [RFC2026] for explanation of
   these categories):

   S - Standards Track (Proposed Standard, Draft Standard, or Internet
       Standard)

   E - Experimental

   I - Informational

   H - Historic

   B - Best Current Practice

   U - Unknown (not formally defined)


[2.]  Core Functionality

   A small number of documents compose the core specification of TCP.
   These define the required core functionalities of TCP's header
   parsing, state machine, congestion control, and retransmission
   timeout computation.  These base specifications must be correctly
   followed for interoperability.


   RFC 793 S: "Transmission Control Protocol", STD 7 (September 1981)
              (Errata)

      This is the fundamental TCP specification document [RFC793].
      Written by Jon Postel as part of the Internet protocol suite's
      core, it describes the TCP packet format, the TCP state machine
      and event processing, and TCP's semantics for data transmission,
      reliability, flow control, multiplexing, and acknowledgment.

   RFC 1122 S: "Requirements for Internet Hosts - Communication Layers"
               (October 1989)

      This document [RFC1122] updates and clarifies RFC 793 (see above
      in Section 2), fixing some specification bugs and oversights.  It
      also explains some features such as keep-alives and Karn's and
      Jacobson's RTO estimation algorithms [KP87][Jac88][JK92].  ICMP
      interactions are mentioned, and some tips are given for efficient
      implementation.  RFC 1122 is an Applicability Statement, listing
      the various features that MUST, SHOULD, MAY, SHOULD NOT, and MUST
      NOT be present in standards-conforming TCP implementations.
      Unlike a purely informational roadmap, this Applicability
      Statement is a standards document and gives formal rules for
      implementation.


   RFC 2460 S: "Internet Protocol, Version 6 (IPv6) Specification"
               (December 1998) (Errata)

      This document [RFC2460] is of relevance to TCP because it defines
      how the pseudo-header for TCP's checksum computation is derived
      when 128-bit IPv6 addresses are used instead of 32-bit IPv4
      addresses.  Additionally, RFC 2675 (see Section 3.1 of this
      document) describes TCP changes required to support IPv6
      jumbograms.

   RFC 2873 S: "TCP Processing of the IPv4 Precedence Field" (June 2000)
               (Errata)

      This document [RFC2873] removes from the TCP specification all
      processing of the precedence bits of the TOS byte of the IP
      header.  This resolves a conflict over the use of these bits
      between RFC 793 (see above in Section 2) and Differentiated
      Services [RFC2474].

   RFC 5681 S: "TCP Congestion Control" (August 2009)

      Although RFC 793 (see above in Section 2) did not contain any
      congestion control mechanisms, today congestion control is a
      required component of TCP implementations.  This document
      [RFC5681] defines congestion avoidance and control mechanism for
      TCP, based on Van Jacobson's 1988 SIGCOMM paper [Jac88].

      A number of behaviors that together constitute what the community
      refers to as "Reno TCP" are described in RFC 5681.  The name "Reno"
      comes from the Net/2 release of the 4.3 BSD operating system.
      This is generally regarded as the least common denominator among
      TCP flavors currently found running on Internet hosts.  Reno TCP
      includes the congestion control features of slow start, congestion
      avoidance, fast retransmit, and fast recovery.

      RFC 5681 details the currently accepted congestion control
      mechanism, while RFC 1122 (see above in Section 2) mandates that
      such a congestion control mechanism must be implemented.  RFC 5681
      differs slightly from the other documents listed in this section,
      as it does not affect the ability of two TCP endpoints to
      communicate.

      RFCs 2001 and 2581 are the conceptual precursors of RFC 5681.  The
      most important changes relative to RFC 2581 are:

      (a)  The initial window requirements were changed to allow larger
           Initial Windows as standardized in [RFC3390] (see Section 3.2
           of this document).
      (b)  During slow start and congestion avoidance, the usage of
           Appropriate Byte Counting [RFC3465] (see Section 3.2 of this
           document) is explicitly recommended.
      (c)  The use of Limited Transmit [RFC3042] (see Section 3.3 of
           this document) is now recommended.

   RFC 6093 S: "On the Implementation of the TCP Urgent Mechanism"
               (January 2011)

      This document [RFC6093] analyzes how current TCP stacks process
      TCP urgent indications, ... and recommends against the use of the
      urgent mechanism.

   RFC 6298 S: "Computing TCP's Retransmission Timer" (June 2011)

      Abstract of RFC 6298 [RFC6298]: "This document defines the
      standard algorithm that Transmission Control Protocol (TCP)
      senders are required to use to compute and manage their
      retransmission timer.  It expands on the discussion in
      Section 4.2.3.1 of RFC 1122 and upgrades the requirement of
      supporting the algorithm from a SHOULD to a MUST."  RFC 6298
      updates RFC 2988 by _changing_ the initial RTO from _3s_ to _1s_
      [emphasis mine].
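
      A minimal sketch of the RFC 6298 computation follows, assuming
      RTT samples in seconds and the constants from that RFC
      (alpha = 1/8, beta = 1/4, K = 4, 1s initial and minimum RTO).
      The class name, the 60s upper bound, and the default clock
      granularity are choices made for this sketch only.

```python
class RtoEstimator:
    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variance
    K = 4
    MIN_RTO = 1.0   # seconds
    MAX_RTO = 60.0  # assumption: an upper bound is optional, 60s chosen here

    def __init__(self, clock_granularity: float = 0.001):
        self.g = clock_granularity
        self.srtt = None
        self.rttvar = None
        self.rto = 1.0  # initial RTO before any RTT sample (was 3s in RFC 2988)

    def on_rtt_sample(self, r: float) -> float:
        if self.srtt is None:
            # First measurement.
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated first, using the old SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        rto = self.srtt + max(self.g, self.K * self.rttvar)
        self.rto = min(max(rto, self.MIN_RTO), self.MAX_RTO)
        return self.rto

    def on_timeout(self) -> float:
        # Exponential backoff when the retransmission timer expires.
        self.rto = min(self.rto * 2, self.MAX_RTO)
        return self.rto
```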

   RFC 6691 I: "TCP Options and Maximum Segment Size (MSS)" (July 2012)

      This document [RFC6691] clarifies what value to use with the TCP
      Maximum Segment Size (MSS) option when IP and TCP options are in
      use.


[3.]  Strongly Encouraged Enhancements

   This section describes recommended TCP modifications that improve
   performance and security.  Section 3.1 represents fundamental changes
   to the protocol.  Sections 3.2 and 3.3 list improvements over the
   congestion control and loss recovery mechanisms as specified in RFC
   5681 (see Section 2).  Section 3.4 describes algorithms that allow a
   TCP sender to detect whether it has entered loss recovery spuriously.
   Section 3.5 comprises Path MTU Discovery mechanisms.  Schemes for
   TCP/IP header compression are listed in Section 3.6.  Finally,
   Section 3.7 deals with the problem of preventing acceptance of forged
   segments and flooding attacks.

[3.1.]  Fundamental Changes

   RFCs 2675 and 7323 represent fundamental changes to TCP by redefining
   how parts of the basic TCP header and options are interpreted.  RFC
   7323 defines the Window Scale option, which reinterprets the
   advertised receive window.  RFC 2675 specifies that MSS option and
   urgent pointer fields with a value of 65,535 are to be treated
   specially.

   RFC 2675 S: "IPv6 Jumbograms" (August 1999) (Errata)

   RFC 7323 S: "TCP Extensions for High Performance" (September 2014)

      This document [RFC7323] defines TCP extensions for window scaling,
      timestamps, and protection against wrapped sequence numbers, for
      efficient and safe operation over paths with large bandwidth-delay
      products.  These extensions are commonly found in currently used
      systems.  The predecessor of this document, RFC 1323, was
      published in 1992, and is deployed in most TCP implementations.
      This document includes fixes and clarifications based on the
      deployment experience gained.  One specific issue addressed in
      this specification is a recommendation on how to modify the
      algorithm for estimating the mean RTT when timestamps are used.
      RFCs 1072, 1185, and 1323 are the conceptual precursors of
      RFC 7323.
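
      To make the window scaling idea concrete, the sketch below shows
      how a 16-bit window field and a shift count combine into the
      effective window, with the shift count capped at 14 as in
      RFC 7323.  Both function names are hypothetical, and the helper
      that picks a shift count for a given buffer size is just one
      plausible policy.

```python
MAX_SHIFT = 14  # largest shift count permitted by RFC 7323


def effective_window(window_field: int, shift: int) -> int:
    """Window in bytes implied by a 16-bit window field and a shift count."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field must fit in 16 bits")
    shift = min(shift, MAX_SHIFT)  # larger received values are treated as 14
    return window_field << shift


def shift_for_buffer(buffer_bytes: int) -> int:
    """Smallest shift count that lets the window field cover the buffer."""
    shift = 0
    while shift < MAX_SHIFT and (0xFFFF << shift) < buffer_bytes:
        shift += 1
    return shift
```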

[3.2.] Congestion Control Extensions

   Two of the most important aspects of TCP are its congestion control
   and loss recovery features.  TCP treats lost packets as indicating
   congestion-related loss and cannot distinguish between congestion-
   related loss and loss due to transmission errors.  Even when ECN is
   in use, there is a rather intimate coupling between congestion
   control and loss recovery mechanisms.  There are several extensions
   to both features, and more often than not, a particular extension
   applies to both.  This subsection groups enhancements to TCP's
   congestion control, while the next subsection focuses on TCP's
   loss recovery.

   RFC 3168 S: "The Addition of Explicit Congestion Notification (ECN)
               to IP" (September 2001)

      This document [RFC3168] defines a means for end hosts to detect
      congestion before congested routers are forced to discard packets.
      Although congestion notification takes place at the IP level, ECN
      requires support at the transport level (e.g., in TCP) to echo the
      bits and adapt the sending rate.  This document updates RFC 793
      (see Section 2 of this document) to define two previously unused
      flag bits in the TCP header for ECN support.

   RFC 3390 S: "Increasing TCP's Initial Window" (October 2002)

      This document [RFC3390] specifies an increase in the permitted
      initial window for TCP from one segment to three or four segments
      during the slow start phase, depending on the segment size.
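
      The dependence on segment size can be seen from the upper bound
      given in RFC 3390, min(4*MSS, max(2*MSS, 4380 bytes)).  The
      function name below is hypothetical; this is only a sketch of
      that bound.

```python
def initial_window_rfc3390(mss: int) -> int:
    """Upper bound on the initial window, in bytes."""
    return min(4 * mss, max(2 * mss, 4380))
```

      For example, an MSS of 1460 bytes gives 4380 bytes (three
      segments), while an MSS of 536 bytes gives 2144 bytes (four
      segments).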

   RFC 3465 E: "TCP Congestion Control with Appropriate Byte Counting
               (ABC)" (February 2003)

      This document [RFC3465] suggests that congestion control use the
      number of bytes acknowledged instead of the number of
      acknowledgments received.  This change improves the performance of
      TCP in situations where there is no one-to-one relationship
      between data segments and acknowledgments (e.g., delayed ACKs or
      ACK loss). ABC is recommended by RFC 5681 (see Section 2).
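
      A minimal sketch of byte counting, assuming the limit L = 2
      during slow start as suggested in RFC 3465.  The class and field
      names are hypothetical, and loss handling is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class AbcState:
    cwnd: int             # congestion window in bytes
    ssthresh: int         # slow start threshold in bytes
    smss: int = 1460      # sender maximum segment size
    limit: int = 2        # "L": cap on the per-ACK increase, in segments
    bytes_acked: int = 0  # bytes acknowledged since the last increase

    def on_ack(self, newly_acked: int) -> int:
        if self.cwnd < self.ssthresh:
            # Slow start: grow by the bytes acknowledged, capped at L*SMSS.
            self.cwnd += min(newly_acked, self.limit * self.smss)
        else:
            # Congestion avoidance: one SMSS per cwnd's worth of ACKed bytes.
            self.bytes_acked += newly_acked
            if self.bytes_acked >= self.cwnd:
                self.bytes_acked -= self.cwnd
                self.cwnd += self.smss
        return self.cwnd
```

      A delayed ACK covering two segments therefore opens the window
      by two segments during slow start, instead of the single segment
      that ACK counting would allow.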

   RFC 6633 S: "Deprecation of ICMP Source Quench Messages" (May 2012)

      This document [RFC6633] formally deprecates the use of ICMP Source
      Quench messages by transport protocols and recommends against the
      implementation of [RFC1016].

[3.3.]  Loss Recovery Extensions

   For the typical implementation of the TCP fast recovery algorithm
   described in RFC 5681 (see Section 2 of this document), a TCP sender
   only retransmits a segment after a retransmit timeout has occurred,
   or after three duplicate ACKs have arrived triggering the fast
   retransmit.  A single RTO might result in the retransmission of
   several segments, while the fast retransmit algorithm in RFC 5681
   leads only to a single retransmission.  Hence, multiple losses from a
   single window of data can lead to a performance degradation.
   Documents listed in this section aim to improve the overall
   performance of TCP's standard loss recovery algorithms.  In
   particular, some of them allow TCP senders to recover more
   effectively when multiple segments are lost from a single flight of
   data.

   RFC 2018 S: "TCP Selective Acknowledgment Options" (October 1996)
               (Errata)

      When more than one packet is lost during one RTT, TCP may
      experience poor performance since a TCP sender can only learn
      about a single lost packet per RTT from cumulative
      acknowledgments.  This document [RFC2018] defines the basic
      selective acknowledgment (SACK) mechanism for TCP, which can help
      to overcome these limitations.  The receiving TCP returns SACK
      blocks to inform the sender which data has been received.  The
      sender can then retransmit only the missing data segments.
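
      As an illustration of how a sender might turn SACK blocks into
      retransmission candidates, the sketch below lists the gaps
      between the cumulative ACK point and the highest SACKed byte.
      The function name is hypothetical, and sequence number
      wraparound is ignored for simplicity.

```python
def missing_ranges(snd_una, sack_blocks):
    """Return (start, end) ranges not yet reported as received.

    snd_una is the cumulative ACK point; sack_blocks is a list of
    (left_edge, right_edge) pairs with exclusive right edges.
    """
    holes = []
    cursor = snd_una
    for left, right in sorted(sack_blocks):
        if right <= cursor:
            continue  # block is already covered
        if left > cursor:
            holes.append((cursor, left))  # gap before this block
        cursor = max(cursor, right)
    return holes
```

      Data beyond the highest SACK block is not reported as missing
      here, since it may simply still be in flight.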


   RFC 3042 S: "Enhancing TCP's Loss Recovery Using Limited Transmit"
               (January 2001)

      Abstract of RFC 3042 [RFC3042]: "This document proposes a new
      Transmission Control Protocol (TCP) mechanism that can be used to
      more effectively recover lost segments when a connection's
      congestion window is small, or when a large number of segments are
      lost in a single transmission window."  This algorithm described
      in RFC 3042 is called "Limited Transmit".  Limited Transmit is
      recommended by RFC 5681 (see Section 2 of this document).

   RFC 6582 S: "The NewReno Modification to TCP's Fast Recovery
               Algorithm" (April 2012)

      This document [RFC6582] specifies a modification to the standard
      Reno fast recovery algorithm, whereby a TCP sender can use partial
      acknowledgments to make inferences determining the next segment to
      send in situations where SACK would be helpful but isn't
      available.  Although it is only a slight modification, the NewReno
      behavior can make a significant difference in performance when
      multiple segments are lost from a single window of data.

   RFC 6675 S: "A Conservative Loss Recovery Algorithm Based on
               Selective Acknowledgment (SACK) for TCP" (August 2012)

      This document [RFC6675] describes a conservative loss recovery
      algorithm for TCP that is based on the use of the selective
      acknowledgment (SACK) TCP option [RFC2018] (see above in
      Section 3.3).  The algorithm conforms to the spirit of the
      congestion control specification in RFC 5681 (see Section 2 of
      this document), but allows TCP senders to recover more effectively
      when multiple segments are lost from a single flight of data.

      RFC 6675 is a revision of RFC 3517 to address several situations
      that were not handled explicitly before.  In particular,

      (a)  it improves the loss detection in the event that the sender
           has outstanding segments that are smaller than Sender Maximum
           Segment Size (SMSS).
      (b)  it modifies the definition of a "duplicate acknowledgment" to
           utilize the SACK information in detecting loss.
      (c)  it maintains the ACK clock under certain circumstances
           involving loss at the end of the window.


3.4.  Detection and Prevention of Spurious Retransmissions

   Spurious retransmission timeouts are harmful to TCP performance and
   multiple algorithms have been defined for detecting when spurious
   retransmissions have occurred, but they respond differently with
   regard to their manners of recovering performance.  The IETF defined
   multiple algorithms because there are trade-offs in whether or not
   certain TCP options need to be implemented and concerns about IPR
   status.  The Standards Track RFCs in this section are closely related
   to the Experimental RFCs in Section 4.5 also addressing this topic.


   RFC 2883 S: "An Extension to the Selective Acknowledgement (SACK)
               Option for TCP" (July 2000)

      This document [RFC2883] extends RFC 2018 (see Section 3.3 of this
      document).  It enables use of the SACK option to acknowledge
      duplicate packets.  With this extension, called DSACK, the sender
      is able to infer the order of packets received at the receiver
      and, therefore, to infer when it has unnecessarily retransmitted a
      packet.  A TCP sender could then use this information to detect
      spurious retransmissions (see [RFC3708]).

   RFC 4015 S: "The Eifel Response Algorithm for TCP" (February 2005)

      Abstract of RFC 4015 [RFC4015]: "Based on an appropriate detection
      algorithm, the Eifel response algorithm provides a way for a TCP
      sender to respond to a detected spurious timeout."


   RFC 5682 S: "Forward RTO-Recovery (F-RTO): An Algorithm for Detecting
               Spurious Retransmission Timeouts with TCP" (September
               2009)

      The F-RTO detection algorithm [RFC5682], originally described in
      RFC 4138, provides an option for inferring spurious retransmission
      timeouts.  Unlike some similar detection methods (e.g., RFCs 3522
      and 3708, both listed in Section 4.5 of this document), F-RTO does
      not rely on the use of any TCP options.  The basic idea is to send
      previously unsent data after the first retransmission after an RTO.
      If the ACKs advance the window, the RTO may be declared spurious.

[3.5.]  Path MTU Discovery

   RFC 1191 S: "Path MTU Discovery" (November 1990)

   RFC 1981 S: "Path MTU Discovery for IP version 6" (August 1996)

   RFC 4821 S: "Packetization Layer Path MTU Discovery" (March 2007)

      Abstract of RFC 4821 [RFC4821]: "This document describes a robust
      method for Path MTU Discovery (PMTUD) that relies on TCP or some
      other Packetization Layer to probe an Internet path with
      progressively larger packets."


[3.6.]  Header Compression

   Especially in streaming applications, the overhead of TCP/IP headers
   could correspond to more than 50% of the total amount of data sent.
   Such large overheads may be tolerable in wired LANs where capacity is
   often not an issue, but are excessive for WANs and wireless systems
   where bandwidth is scarce.  Header compression schemes for TCP/IP
   like RObust Header Compression (ROHC) can significantly compress this
   overhead.  ROHC performs well over links with significant error
   rates and long round-trip times.
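
   The 50% figure follows from simple arithmetic.  As a sketch,
   assuming 40 bytes of headers (a 20-byte IPv4 header plus a 20-byte
   TCP header, no options) and a hypothetical helper name:

```python
def header_overhead(payload_bytes: int, header_bytes: int = 40) -> float:
    """Fraction of each packet taken up by TCP/IP headers."""
    return header_bytes / (header_bytes + payload_bytes)
```

   Under these assumptions, any payload smaller than 40 bytes puts the
   headers above half of the bytes sent, whereas a 1460-byte payload
   brings the overhead down to under 3%.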

   RFC 1144 S: "Compressing TCP/IP Headers for Low-Speed Serial Links"
               (February 1990)

   RFC 6846 S: "RObust Header Compression (ROHC): A Profile for TCP/IP
               (ROHC-TCP)" (January 2013)


3.7.  Defending Spoofing and Flooding Attacks

   By default, TCP lacks any cryptographic structures to differentiate
   legitimate segments from those spoofed from malicious hosts.
   Spoofing valid segments requires correctly guessing a number of
   fields.  The documents in this subsection describe ways to make that
   guessing harder or to prevent it from being able to affect a
   connection negatively.


   RFC 4953 I: "Defending TCP Against Spoofing Attacks" (July 2007)

   RFC 4987 I: "TCP SYN Flooding Attacks and Common Mitigations" (August
               2007)

   RFC 5925 S: "The TCP Authentication Option" (June 2010)

   RFC 5926 S: "Cryptographic Algorithms for the TCP Authentication
               Option (TCP-AO)" (June 2010)

   RFC 5927 I: "ICMP Attacks against TCP" (July 2010)

   RFC 5961 S: "Improving TCP's Robustness to Blind In-Window Attacks"
               (August 2010)

   RFC 6528 S: "Defending against Sequence Number Attacks" (February
               2012)

[4.]  Experimental Extensions

   The RFCs in this section are either Experimental and may become
   Proposed Standards in the future or are Proposed Standards (or
   Informational), but can be considered experimental due to lack of
   wide deployment.  At least part of the reason that they are still
   experimental is to gain more wide-scale experience with them before a
   standards track decision is made.

[4.1.]  Architectural Guidelines

   As multiple flows may share the same paths, sections of paths, or
   other resources, the TCP implementation may benefit from sharing
   information across TCP connections or other flows.  Some experimental
   proposals have been documented and some implementations have included
   the concepts.


   RFC 2140 I: "TCP Control Block Interdependence" (April 1997)


   RFC 3124 S: "The Congestion Manager" (June 2001)

      This document [RFC3124] is a related proposal to RFC 2140 (see
      above in Section 4.1).  The idea behind the Congestion Manager,
      moving congestion control outside of individual TCP connections,
      represents a modification to the core of TCP, which supports
      sharing information among TCP connections.  Although a Proposed
      Standard, some pieces of the Congestion Manager support
      architecture have not been specified yet, and it has not achieved
      use or implementation beyond experimental stacks, so it is not
      listed among the standard TCP enhancements in this roadmap.

[4.2.]  Fundamental Changes

   Like the Standards Track documents listed in Section 3.1, there also
   exist new Experimental RFCs that specify fundamental changes to TCP.
   At the time of writing, the only example so far is TCP Fast Open,
   which deviates from the standard TCP semantics of [RFC793].

   RFC 7413 E: "TCP Fast Open" (December 2014)

      This document [RFC7413] describes TCP Fast Open, which allows
      data to be carried in the SYN and SYN-ACK packets and consumed by
      the receiver during the initial connection handshake.

[4.3.]  Congestion Control Extensions

   TCP congestion control has been an extremely active research area for
   many years (see RFC 5783 discussed in Section 7.6 of this document),
   as it determines the performance of many applications that use TCP.
   A number of Experimental RFCs address issues with flow start up,
   overshoot, and steady-state behavior in the basic algorithms of RFC
   5681 (see Section 2 of this document).  In these subsections,
   enhancements to TCP's congestion control are listed.


   RFC 2861 E: "TCP Congestion Window Validation" (June 2000)


   RFC 3540 E: "Robust Explicit Congestion Notification (ECN) Signaling
               with Nonces" (June 2003)

   RFC 3649 E: "HighSpeed TCP for Large Congestion Windows" (December
               2003)

   RFC 3742 E: "Limited Slow-Start for TCP with Large Congestion
               Windows" (March 2004)


   RFC 4782 E: "Quick-Start for TCP and IP" (January 2007) (Errata)


   RFC 5562 E: "Adding Explicit Congestion Notification (ECN) Capability
               to TCP's SYN/ACK Packets" (June 2009)

   RFC 5690 I: "Adding Acknowledgement Congestion Control to TCP"
               (February 2010)


   RFC 6928 E: "Increasing TCP's Initial Window" (April 2013)

      This document [RFC6928] proposes to increase the TCP initial
      window from between 2 and 4 segments, as specified in RFC 3390
      (see Section 3.2 of this document), to 10 segments with a fallback
      to the existing recommendation when performance issues are
      detected.
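
      The proposed bound in RFC 6928 is min(10*MSS, max(2*MSS, 14600
      bytes)).  A sketch, with a hypothetical function name and without
      the fallback behavior mentioned above:

```python
def initial_window_rfc6928(mss: int) -> int:
    """Upper bound on the experimental initial window, in bytes."""
    return min(10 * mss, max(2 * mss, 14600))
```

      With an MSS of 1460 bytes this yields 14600 bytes, i.e., 10
      segments.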

[4.4.]  Loss Recovery Extensions

   RFC 5827 E: "Early Retransmit for TCP and Stream Control Transmission
               Protocol (SCTP)" (April 2010)

      This document [RFC5827] proposes the "Early Retransmit" mechanism
      for TCP (and SCTP) that can be used to recover lost segments when
      a connection's congestion window is small.  In certain special
      circumstances, Early Retransmit reduces the number of duplicate
      acknowledgments required to trigger fast retransmit to recover
      segment losses without waiting for a lengthy retransmission
      timeout.
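
      The segment-based variant in RFC 5827 can be sketched as follows:
      when fewer than four segments are outstanding and no new data can
      be sent, the duplicate ACK threshold drops to the number of
      outstanding segments minus one.  The function name is
      hypothetical, and the guard for fewer than two outstanding
      segments is a choice made for this sketch.

```python
STANDARD_DUPTHRESH = 3


def dupack_threshold(outstanding_segments: int, can_send_new_data: bool) -> int:
    """Duplicate ACKs needed before fast retransmit is triggered."""
    if 2 <= outstanding_segments < 4 and not can_send_new_data:
        # Too few segments in flight to ever produce three duplicate ACKs.
        return outstanding_segments - 1
    # Sketch-level guard: with 0 or 1 segments outstanding no duplicate
    # ACKs can be generated, so the standard threshold is kept.
    return STANDARD_DUPTHRESH
```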


   RFC 6069 E: "Making TCP More Robust to Long Connectivity Disruptions
               (TCP-LCD)" (December 2010)

   RFC 6937 E: "Proportional Rate Reduction for TCP" (May 2013)

      This document [RFC6937] describes an experimental Proportional
      Rate Reduction (PRR) algorithm as an alternative to the widely
      deployed Fast Recovery algorithm, to improve the accuracy of the
      amount of data sent by TCP during loss recovery.

[4.5.]  Detection and Prevention of Spurious Retransmissions

   In addition to the Standards Track extensions to deal with spurious
   retransmissions in Section 3.4, Experimental proposals have also been
   documented.

   RFC 3522 E: "The Eifel Detection Algorithm for TCP" (April 2003)


   RFC 3708 E: "Using TCP Duplicate Selective Acknowledgement (DSACKs)
               and Stream Control Transmission Protocol (SCTP) Duplicate
               Transmission Sequence Numbers (TSNs) to Detect Spurious
               Retransmissions" (February 2004)

   RFC 4653 E: "Improving the Robustness of TCP to Non-Congestion
               Events" (August 2006)

[4.6.]  TCP Timeouts


   RFC 5482 S: "TCP User Timeout Option" (March 2009)


[4.7.]  Multipath TCP

   MultiPath TCP (MPTCP) is an ongoing effort within the IETF that
   allows a TCP connection to simultaneously use multiple IP addresses /
   interfaces to spread its data across several subflows, while
   presenting a regular TCP interface to applications.  Benefits of this
   include better resource utilization, better throughput and smoother
   reaction to failures.  The documents listed in this section specify
   the Multipath TCP scheme, while the documents in Sections 7.2, 7.4,
   and 7.5 provide some additional background information.

   RFC 6356 E: "Coupled Congestion Control for Multipath Transport
               Protocols" (October 2011)


   RFC 6824 E: "TCP Extensions for Multipath Operation with Multiple
               Addresses" (January 2013) (Errata)

[5.]  TCP Parameters at IANA


   RFC 2780 B: "IANA Allocation Guidelines For Values In the Internet
               Protocol and Related Headers" (March 2000)

   RFC 4727 S: "Experimental Values in IPv4, IPv6, ICMPv4, ICMPv6, UDP,
               and TCP Headers" (November 2006)

   RFC 6335 B: "Internet Assigned Numbers Authority (IANA) Procedures
               for the Management of the Service Name and Transport
               Protocol Port Number Registry" (August 2011)

   RFC 6994 S: "Shared Use of Experimental TCP Options" (August 2013)


[7.]  Support Documents

   This section contains several classes of documents that do not
   necessarily define current protocol behaviors but that are
   nevertheless of interest to TCP implementers.  Section 7.1 describes
   several foundational RFCs that give modern readers a better
   understanding of the principles underlying TCP's behaviors and
   development over the years.  Section 7.2 contains architectural
   guidelines and principles for TCP architects and designers.  The
   documents listed in Section 7.3 provide advice on using TCP in
   various types of network situations that pose challenges above those
   of typical wired links.  Guidance for developing, analyzing, and
   evaluating TCP is given in Section 7.4.  Some implementation notes
   and implementation advice can be found in Section 7.5.  RFCs that
   describe tools for testing and debugging TCP implementations or that
   contain high-level tutorials on the protocol are listed in Section 7.6.
   The TCP Management Information Bases are described in Section 7.7,
   and Section 7.8 lists a number of case studies that have explored TCP
   performance.


7.4.  Guidance for Developing, Analyzing, and Evaluating TCP

   Documents in this section give general guidance for developing,
   analyzing, and evaluating TCP.  Some of the documents discuss, for
   example, the properties of congestion control protocols that are
   "safe" for Internet deployment as well as how to measure the
   properties of congestion control mechanisms and transport protocols.


   RFC 5033 B: "Specifying New Congestion Control Algorithms" (August
               2007)

      This document [RFC5033] considers the evaluation of suggested
      congestion control algorithms that differ from the principles
      outlined in RFC 2914 (see Section 7.2 of this document).  It is
      useful for authors of such algorithms as well as for IETF members
      reviewing the associated documents.

   RFC 5166 I: "Metrics for the Evaluation of Congestion Control
               Mechanisms" (March 2008)

      This document [RFC5166] discusses metrics that need to be
      considered when evaluating new or modified congestion control
      mechanisms for the Internet.  Among other topics, the document
      discusses throughput, delay, loss rates, response times, fairness,
      and robustness for challenging environments.

   RFC 6077 I: "Open Research Issues in Internet Congestion Control"
               (February 2011)

      This document [RFC6077] summarizes the main open problems in the
      domain of Internet congestion control.  As a good starting point
      for newcomers, the document describes several new challenges that
      are becoming important as the network grows, as well as some
      issues that have been known for many years.

   RFC 6181 I: "Threat Analysis for TCP Extensions for Multipath
               Operation with Multiple Addresses" (March 2011)

      This document [RFC6181] describes a threat analysis for Multipath
      TCP (MPTCP) (see Section 4.7 of this document).  The document
      discusses several types of attacks and provides recommendations
      for MPTCP designers on how to create an MPTCP specification that is
      as secure as the current (single-path) TCP.

   RFC 6349 I: "Framework for TCP Throughput Testing" (August 2011)

      From the Abstract of RFC 6349 [RFC6349]: "This framework describes
      a practical methodology for measuring end-to-end TCP Throughput in
      a managed IP network.  The goal is to provide a better indication
      in regard to user experience.  In this framework, TCP and IP
      parameters are specified to optimize TCP Throughput."

7.5.  Implementation Advice

   RFC 794 U: "PRE-EMPTION" (September 1981)

      This document [RFC794] clarifies that operating systems need to
      manage their limited resources, which may include TCP connection
      state, and that these decisions can be made with application
      input, but they do not need to be part of the TCP protocol
      specification itself.

   RFC 879 U: "The TCP Maximum Segment Size and Related Topics"
               (November 1983)

   RFC 1071 U: "Computing the Internet Checksum" (September 1988)
               (Errata)

   RFC 1624 I: "Computation of the Internet Checksum via Incremental
               Update" (May 1994)

   RFC 1936 I: "Implementing the Internet Checksum in Hardware" (April
               1996)

   RFC 2525 I: "Known TCP Implementation Problems" (March 1999)

   RFC 2923 I: "TCP Problems with Path MTU Discovery" (September 2000)

   RFC 3493 I: "Basic Socket Interface Extensions for IPv6" (February
               2003)

   RFC 6056 B: "Recommendations for Transport-Protocol Port
               Randomization" (December 2010)

   RFC 6191 B: "Reducing the TIME-WAIT State Using TCP Timestamps"
               (April 2011)

   RFC 6429 I: "TCP Sender Clarification for Persist Condition"
               (December 2011)

   RFC 6897 I: "Multipath TCP (MPTCP) Application Interface
               Considerations" (March 2013)

7.6.  Tools and Tutorials

   RFC 1180 I: "TCP/IP Tutorial" (January 1991) (Errata)

      This document [RFC1180] is an extremely brief overview of the TCP/
      IP protocol suite as a whole.  It gives some explanation as to how
      and where TCP fits in.

   RFC 1470 I: "FYI on a Network Management Tool Catalog: Tools for
               Monitoring and Debugging TCP/IP Internets and
               Interconnected Devices" (June 1993)

      A few of the tools that this document [RFC1470] describes are
      still maintained and in use today, for example, ttcp and tcpdump.
      However, many of the tools described do not relate specifically to
      TCP and are no longer used or easily available.

   RFC 2398 I: "Some Testing Tools for TCP Implementors" (August 1998)

      This document [RFC2398] describes a number of TCP packet
      generation and analysis tools.  Although some of these tools are
      no longer readily available or widely used, for the most part they
      are still relevant and usable.

   RFC 5783 I: "Congestion Control in the RFC Series" (February 2010)

      This document [RFC5783] provides an overview of RFCs related to
      congestion control that had been published at the time.  The focus
      of the document is on end-host-based congestion control.

8.  Undocumented TCP Features

   There are a few important implementation tactics for the TCP that
   have not yet been described in any RFC.  Although this roadmap is
   primarily concerned with mapping the TCP RFCs, this section is
   included because an implementer needs to be aware of these important
   issues.


   Header Prediction

      Header prediction is a trick to speed up the processing of
      segments.  Van Jacobson and Mike Karels developed the technique in
      the late 1980s.  The basic idea is that some processing time can
      be saved when most of a segment's fields can be predicted from
      previous segments.  A good description of this was sent to the
      TCP-IP mailing list by Van Jacobson on March 9, 1988 (see
      [Jacobson] for the full message):

         Quite a bit of the speedup comes from an algorithm that we
         ('we' refers to collaborator Mike Karels and myself) are
         calling "header prediction".  The idea is that if you're in the
         middle of a bulk data transfer and have just seen a packet, you
         know what the next packet is going to look like: It will look
         just like the current packet with either the sequence number or
         ack number updated (depending on whether you're the sender or
         receiver).  Combining this with the "Use hints" epigram from
         Butler Lampson's classic "Epigrams for System Designers", you
         start to think of the tcp state (rcv.nxt, snd.una, etc.) as
         "hints" about what the next packet should look like.

         If you arrange those "hints" so they match the layout of a tcp
         packet header, it takes a single 14-byte compare to see if your
         prediction is correct (3 longword compares to pick up the send
         & ack sequence numbers, header length, flags and window, plus a
         short compare on the length).  If the prediction is correct,
         there's a single test on the length to see if you're the sender
         or receiver followed by the appropriate processing.  E.g., if
         the length is non-zero (you're the receiver), checksum and
         append the data to the socket buffer then wake any process
         that's sleeping on the buffer.  Update rcv.nxt by the length of
         this packet (this updates your "prediction" of the next
         packet).  Check if you can handle another packet the same size
         as the current one.  If not, set one of the unused flag bits in
         your header prediction to guarantee that the prediction will
         fail on the next packet and force you to go through full
         protocol processing.  Otherwise, you're done with this packet.
         So, the *total* tcp protocol processing, exclusive of
         checksumming, is on the order of 6 compares and an add.
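
      The idea in the quoted message can be sketched roughly as
      follows.  This is not the BSD implementation: the names are
      hypothetical, the header fields are compared one by one rather
      than as a single 14-byte block, checksumming is omitted, and
      sequence number wraparound is ignored.

```python
from dataclasses import dataclass, field

ACK = 0x10  # TCP ACK flag bit


@dataclass
class Conn:
    rcv_nxt: int      # next sequence number expected from the peer
    snd_una: int      # oldest unacknowledged sequence number
    snd_max: int      # highest sequence number sent so far
    peer_window: int  # window the peer advertised most recently
    buffer: bytearray = field(default_factory=bytearray)


def try_fast_path(conn, seq, ack, flags, window, payload):
    """Return True if the segment matched the prediction and was handled."""
    # Prediction: same flags and window as before, in-order sequence number.
    if flags != ACK or window != conn.peer_window or seq != conn.rcv_nxt:
        return False  # fall back to full protocol processing
    if payload:
        # Receiver side: in-order data with an unchanged ACK number.
        if ack != conn.snd_una:
            return False
        conn.buffer += payload
        conn.rcv_nxt += len(payload)  # update the prediction
        return True
    # Sender side: a pure ACK that advances snd_una.
    if conn.snd_una < ack <= conn.snd_max:
        conn.snd_una = ack
        return True
    return False
```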

   Forward Acknowledgement (FACK)

      FACK [MM96] includes an alternate algorithm for triggering fast
      retransmit [RFC5681], based on the extent of the SACK scoreboard.
      Its goal is to trigger fast retransmit as soon as the receiver's
      reassembly queue is larger than the duplicate ACK threshold, as
      indicated by the difference between the forward most SACK block
      edge and SND.UNA.  This algorithm quickly and reliably triggers
      fast retransmit in the presence of burst losses -- often on the
      first SACK following such a loss.  Such a threshold-based
      algorithm also triggers fast retransmit immediately in the
      presence of any reordering with extent greater than the duplicate
      ACK threshold.  FACK is implemented in Linux and turned on by
      default.
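
      A minimal sketch of the trigger condition described above,
      assuming the forward-most SACK block edge is tracked as
      "snd_fack" and the threshold is three segments.  The function and
      parameter names are hypothetical, and sequence number wraparound
      is ignored.

```python
def fack_should_retransmit(snd_una, sack_blocks, mss, dupacks=0, dupthresh=3):
    """Decide whether to enter fast retransmit.

    sack_blocks is a list of (left_edge, right_edge) pairs.
    """
    # Forward-most SACKed edge; defaults to snd_una when nothing is SACKed.
    snd_fack = max([right for _, right in sack_blocks], default=snd_una)
    snd_fack = max(snd_fack, snd_una)
    # Trigger on the extent of the scoreboard or on classic duplicate ACKs.
    return (snd_fack - snd_una) > dupthresh * mss or dupacks >= dupthresh
```

      With a burst loss, a single SACK whose right edge lies more than
      three segments beyond SND.UNA is enough to trigger the
      retransmission, without waiting for three duplicate ACKs.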

   Congestion Control for High Rate Flows

      In the last decade significant research effort has been put into
      experimental TCP congestion control modifications for obtaining
      high throughput with reduced startup and recovery times.  Only a
      few RFCs have been published on some of these modifications,
      including HighSpeed TCP [RFC3649], Limited Slow-Start [RFC3742],
      and Quick-Start [RFC4782] (see Section 4.3 of this document for
      more information on each), but high-rate congestion control
      mechanisms are still considered an open issue in congestion
      control research.  Some other schemes have been published as
      Internet-Drafts, e.g.  CUBIC [CUBIC] (the standard TCP congestion
      control algorithm in Linux), Compound TCP [CTCP], and H-TCP [HTCP]
      or have been discussed a little by the IETF, but much of the work
      in this area has not been adopted within the IETF yet, so the
      majority of this work is outside the RFC series and may be
      discussed in other products of the IRTF Internet Congestion
      Control Research Group (ICCRG).


Metrics for the Evaluation of Congestion Control Mechanisms
https://tools.ietf.org/html/rfc5166
   Discusses the metrics to be considered in an evaluation
   of new or modified congestion control mechanisms for the Internet.
   These include metrics for the evaluation of new transport protocols,
   of proposed modifications to TCP, of application-level congestion
   control, and of Active Queue Management (AQM) mechanisms in the
   router.  This document is the first in a series of documents aimed at
   improving the models that we use in the evaluation of transport
   protocols.

Types Of Metrics:

  - Throughput, Delay, and Loss Rates

    - Throughput: can be measured as

      - router-based metric of aggregate link utilization

      - flow-based metric of per-connection transfer times

      - user-based metric of utility functions or user wait times


    - Goodput: sometimes distinguished from throughput where throughput
      is the link utilization or flow rate in bytes per second; goodput
      is the subset of throughput (also measured in Bytes/s) consisting
      of useful traffic [i.e. excluding duplicate packets]

    - Delay: Like throughput, delay can be measured as a router-based metric of
      queueing delay over time, or as a flow-based metric in terms of
      per-packet transfer times.  Per-packet delay can also include delay
      at the sender waiting for the transport protocol to send the packet.
      For reliable transfer, the per-packet transfer time seen by the
      application includes the possible delay of retransmitting a lost
      packet.

    - Packet Loss Rates: can be measured as a network-based or as a
      flow-based metric. One network-related reason to avoid high
      steady-state packet loss rates is to avoid congestion collapse in
      environments containing paths with multiple congested links.

  - Response Times and Minimizing Oscillations

    - Response to Changes: One of the key concerns in the design of congestion
      control mechanisms has been the response times to sudden congestion in the
      network.  On the one hand, congestion control mechanisms should
      respond reasonably promptly to sudden congestion from routing or
      bandwidth changes or from a burst of competing traffic.  At the same
      time, congestion control mechanisms should not respond too severely
      to transient changes, e.g., to a sudden increase in delay that will
      dissipate in less than the connection's round-trip time.

    - Minimizing Oscillations:  One goal is that of stability, in terms of
      minimizing oscillations of queueing delay or of throughput.  In practice,
      stability is frequently associated with rate fluctuations or variance.
      Rate variations can result in fluctuations in router queue size and
      therefore of queue overflows.  These queue overflows can cause loss
      synchronizations across coexisting flows and periodic under-utilization
      of link capacity, both of which are considered to be general signs of
      network instability.  Thus, measuring the rate variations of flows is
      often used to measure the stability of transport protocols.  To measure
      rate variations, [JWL04], [RX05], and [FHPW00] use the coefficient of
      variation (CoV) of per-flow transmission rates, and [WCL05] suggests the
      use of standard deviations of per-flow rates.  Since rate variations are
      a function of time scales, it makes sense to measure these rate variations
      over various time scales.

  - Fairness and Convergence

    - Fairness between Flows: let x_i be the throughput for the i-th connection.

      - Jain's fairness index: The fairness index in [JCH84] is:

        (( sum_i x_i )^2) / (n * sum_i ( (x_i)^2 )),

        where there are n users.  This fairness index ranges from 0 to 1, and
        it is maximum when all users receive the same allocation.  This index
        is k/n when k users equally share the resource, and the other n-k
        users receive zero allocation.

      - The product measure:

        product_i x_i

        the product of the throughput of the individual connections, is also
        used as a measure of fairness.  (In some contexts x_i is taken as the
        power of the i-th connection, and the product measure is referred to
        as network power.)  The product measure is particularly sensitive to
        segregation; the product measure is zero if any connection receives
        zero throughput. [N.B. If one normalizes to actual bandwidth by taking
        the Nth root of the product, where N = number of connections, this is
        the geometric mean. The geometric mean will be less than the arithmetic
        mean unless all flows have equivalent throughput.]

     - Epsilon-fairness: A rate allocation is defined as epsilon-fair if

         (min_i x_i) / (max_i x_i) >= 1 - epsilon

       Epsilon-fairness measures the worst-case ratio between any two
       throughput rates [ZKL04]. Epsilon-fairness is related to max-min
       fairness.

    - Fairness between Flows with Different Resource Requirements

      - Max-min fairness: In order to satisfy the max-min fairness criteria,
        the smallest throughput rate must be as large as possible.  Given
        this condition, the next-smallest throughput rate must be as large as
        possible, and so on.  Thus, the max-min fairness gives absolute
        priority to the smallest flows.  (Max-min fairness can be explained
        by the progressive filling algorithm, where all flow rates start at
        zero, and the rates all grow at the same pace.  Each flow rate stops
        growing only when one or more links on the path reach link capacity.)

     - Proportional fairness: A feasible allocation, x, is
       defined as proportionally fair if, for any other feasible allocation
       x*, the aggregate of proportional changes is zero or negative:

          sum_i ( (x*_i - x_i)/x_i ) <= 0.

       "This criterion favours smaller flows, but less emphatically than
       max-min fairness" [K01].  (Using the language of utility functions,
       proportional fairness can be achieved by using logarithmic utility
       functions, and maximizing the sum of the per-flow utility functions;
       see [KMT98] for a fuller explanation.)

     - Minimum potential delay fairness: Minimum potential delay fairness
       has been shown to model TCP [KS03], and is a compromise between
       max-min fairness and proportional fairness.  An allocation, x, is
       defined as having minimum potential delay fairness if:

             sum_i (1/x_i)

       is smaller than for any other feasible allocation.  That is, it would
       minimize the average download time if each flow was an equal-sized
       file.

    - Comments on Fairness

      - Trade-offs between fairness and throughput: The fairness measures in
        the section above generally measure both fairness and throughput,
        giving different weights to each.  Potential trade-offs between
        fairness and throughput are also discussed by Tang, et al. in
        [TWL06], for a framework where max-min fairness is defined as the
        most fair.  In particular, [TWL06] shows that in some topologies,
        throughput is proportional to fairness, while in other topologies,
        throughput is inversely proportional to fairness.

     - Fairness and the number of congested links: Some of these fairness
       metrics are discussed in more detail in [F91].  We note that there is
       not a clear consensus for the fairness goals, in particular for
       fairness between flows that traverse different numbers of congested
       links [F91].  Utility maximization provides one framework for
       describing this trade-off in fairness.

     - Fairness and round-trip times: One goal cited in a number of new
       transport protocols has been that of fairness between flows with
       different round-trip times [KHR02] [XHR04].  We note that there is
       not a consensus in the networking community about the desirability of
       this goal, or about the implications and interactions between this
       goal and other metrics [FJ92] (Section 3.3).  One common argument
       against the goal of fairness between flows with different round-trip
       times has been that flows with long round-trip times consume more
       resources; this aspect is covered by the previous paragraph.
       Researchers have also noted the difference between the RTT-unfairness
       of standard TCP, and the greater RTT-unfairness of some proposed
       modifications to TCP [LLS05].

     - Fairness and packet size: One fairness issue is that of the relative
       fairness for flows with different packet sizes.  Many file transfer
       applications will use the maximum packet size possible; in contrast,
       low-bandwidth VoIP flows are likely to send small packets, sending a
       new packet every 10 to 40 ms., to limit delay.  Should a small-packet
       VoIP connection receive the same sending rate in *bytes* per second
       as a large-packet TCP connection in the same environment, or should
       it receive the same sending rate in *packets* per second?  This
       fairness issue has been discussed in more detail in [RFC3714], with
       [RFC4828] also describing the ways that packet size can affect the
       packet drop rate experienced by a flow.

     - Convergence times: Convergence times concern the time for convergence
       to fairness between an existing flow and a newly starting one, and
       are a special concern for environments with high-bandwidth long-delay
       flows.  Convergence times also concern the time for convergence to
       fairness after a sudden change such as a change in the network path,
       the competing cross-traffic, or the characteristics of a wireless
       link.  As with fairness, convergence times can matter both between
       flows of the same protocol, and between flows using different
       protocols [SLFK03].  One metric used for convergence times is the
       delta-fair convergence time, defined as the time taken for two flows
       with the same round-trip time to go from shares of 100/101-th and
       1/101-th of the link bandwidth, to having close to fair sharing with
       shares of (1+delta)/2 and (1-delta)/2 of the link bandwidth [BBFS01].
       A similar metric for convergence times measures the convergence time
       as the number of round-trip times for two flows to reach epsilon-
       fairness, when starting from a maximally-unfair state [ZKL04].
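
[N.B. The per-flow fairness measures listed above (Jain's index, the product
measure normalized to a geometric mean, epsilon-fairness, and the sum
minimized by minimum potential delay fairness) can be sketched as below. The
function names are made up for illustration, and all rates are assumed to be
non-negative with at least one positive value; the last two functions assume
strictly positive rates.]

```python
import math
from typing import Sequence


def jain_index(x: Sequence[float]) -> float:
    """(sum_i x_i)^2 / (n * sum_i x_i^2); equals 1.0 when all shares are equal."""
    n = len(x)
    return sum(x) ** 2 / (n * sum(v * v for v in x))


def geometric_mean(x: Sequence[float]) -> float:
    """N-th root of the product measure; zero if any flow gets zero throughput."""
    return math.prod(x) ** (1.0 / len(x))


def is_epsilon_fair(x: Sequence[float], epsilon: float) -> bool:
    """True if (min_i x_i) / (max_i x_i) >= 1 - epsilon."""
    return min(x) / max(x) >= 1.0 - epsilon


def potential_delay(x: Sequence[float]) -> float:
    """sum_i (1/x_i), the quantity minimized by minimum potential delay fairness."""
    return sum(1.0 / v for v in x)
```

For example, jain_index([5, 5, 0, 0]) evaluates to 0.5, matching the k/n rule
with k = 2 and n = 4, while geometric_mean of the same allocation is zero
because two flows are starved.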


TCP Congestion Control (RFC 5681):
http://www.rfc-editor.org/rfc/rfc5681.txt
Specifies four TCP congestion control algorithms: slow start, congestion
avoidance, fast retransmit, and fast recovery. They were devised
in [Jac88] and [Jac90]. Their use with TCP is standardized in
[RFC1122].

In addition, the document specifies what TCP connections should do after
a relatively long idle period, as well as clarifying some of the issues
pertaining to TCP ACK generation.

Obsoletes [RFC2581], which in turn obsoleted [RFC2001].

The slow start and congestion avoidance algorithms MUST be used by the
TCP sender to control the amount of outstanding data being injected into
the network. These add three state variables:

    - Congestion Window (cwnd): a sender-side limit on the amount of data
      the sender can transmit before receiving an ACK.
    - Receiver's Advertised Window (rwnd): a receiver-side limit on the amount
      of outstanding data.
    - Slow Start Threshold (ssthresh): used to determine whether the slow start
      or congestion avoidance algorithm is used to control data transmission.

Slow Start: Used to determine available link capacity at the beginning of a
transfer, after repairing loss detected by the retransmission timer, or
[potentially] after a long idle period. It is additionally used to start the
"ACK clock".

    - SMSS: Sender Maximum Segment Size
    - IW: Initial Window, the initial value of cwnd, MUST be set using the
      following guidelines as an upper bound

      If SMSS > 2190 bytes:
          IW = 2 * SMSS bytes and MUST NOT be more than 2 segments
      If (SMSS > 1095 bytes) and (SMSS <= 2190 bytes):
          IW = 3 * SMSS bytes and MUST NOT be more than 3 segments
      If SMSS <= 1095 bytes:
          IW = 4 * SMSS bytes and MUST NOT be more than 4 segments

    - Ssthresh:

      - SHOULD be set arbitrarily high (e.g., to the size of the largest
        possible advertised window), but ssthresh MUST be reduced in response
        to congestion.

      - The slow start algorithm is used when cwnd < ssthresh, while the
        congestion avoidance algorithm is used when cwnd > ssthresh.  When
        cwnd and ssthresh are equal, the sender may use either slow start or
        congestion avoidance.

      - When a TCP sender detects segment loss using the retransmission timer
        and the given segment has not yet been resent by way of the
        retransmission timer, the value of ssthresh MUST be set to no more
        than the value given in equation (4):

        ssthresh = max (FlightSize / 2, 2*SMSS)            (4)

        where FlightSize is the amount of outstanding data in the network.

    - Growing cwnd: During slow start, a TCP increments cwnd by at most SMSS
      bytes for each ACK received that cumulatively acknowledges new data.
      Slow start ends when cwnd reaches or exceeds ssthresh.

      - While traditionally TCP implementations have increased cwnd by
        precisely SMSS bytes upon receipt of an ACK covering new data, we
        RECOMMEND that TCP implementations increase cwnd per:

        cwnd += min (N, SMSS)                              (2)

        where N is the number of previously unacknowledged bytes acknowledged
        in the incoming ACK.
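
[N.B. The IW ranges, equation (4), and equation (2) above can be sketched as
follows. This is an illustration in bytes using integer arithmetic; the
function names are hypothetical and the integer division in equation (4) is
an assumption of this sketch.]

```python
def initial_window(smss: int) -> int:
    """Upper bound on IW in bytes, following the three SMSS ranges."""
    if smss > 2190:
        return 2 * smss
    if smss > 1095:
        return 3 * smss
    return 4 * smss


def ssthresh_after_loss(flight_size: int, smss: int) -> int:
    """Equation (4): ssthresh = max(FlightSize / 2, 2*SMSS)."""
    return max(flight_size // 2, 2 * smss)


def slow_start_on_ack(cwnd: int, bytes_acked: int, smss: int) -> int:
    """Equation (2): cwnd += min(N, SMSS) for an ACK covering N new bytes."""
    return cwnd + min(bytes_acked, smss)
```

For a typical Ethernet SMSS of 1460 bytes, initial_window(1460) gives 4380
bytes (3 segments), and an ACK covering two segments still grows cwnd by only
one SMSS.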

Congestion Avoidance: during congestion avoidance, cwnd is incremented by
roughly 1 full-sized segment per RTT. Congestion avoidance continues until
congestion is detected. The basic guidelines for incrementing cwnd are:

     - MAY increment cwnd by SMSS bytes

     - SHOULD increment cwnd per equation (2) once per RTT

     - MUST NOT increment cwnd by more than SMSS bytes

[RFC3465] allows for cwnd increases of more than SMSS bytes for incoming
acknowledgments during slow start on an experimental basis; however, such
behavior is not allowed as part of the standard.


Another common formula that a TCP MAY use to update cwnd during
congestion avoidance is given in equation (3):

   cwnd += SMSS*SMSS/cwnd                     (3)

This adjustment is executed on every incoming ACK that acknowledges
new data.  Equation (3) provides an acceptable approximation to the
underlying principle of increasing cwnd by 1 full-sized segment per
RTT.
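
[N.B. A small sketch of equation (3) with integer arithmetic, to show how the
per-ACK increments add up to roughly one SMSS per RTT. The function name is
hypothetical. The max(1, ...) guard follows the RFC 5681 implementation note
that a result of 0 should be rounded up to 1 byte.]

```python
def congestion_avoidance_on_ack(cwnd: int, smss: int) -> int:
    """Equation (3): cwnd += SMSS*SMSS/cwnd, applied on each ACK of new data."""
    increment = max(1, (smss * smss) // cwnd)  # never let the increment hit 0
    return cwnd + increment


def one_rtt_of_acks(cwnd: int, smss: int) -> int:
    """Apply equation (3) once per full-sized segment in the starting window."""
    for _ in range(cwnd // smss):
        cwnd = congestion_avoidance_on_ack(cwnd, smss)
    return cwnd
```

Starting from cwnd = 10000 and SMSS = 1000 (and assuming one ACK per
segment), one_rtt_of_acks grows the window by slightly less than one SMSS,
which is the approximation the text refers to.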

Upon a timeout (as specified in [RFC2988]) cwnd MUST be
set to no more than the loss window, LW, which equals 1 full-sized
segment (regardless of the value of IW).  Therefore, after
retransmitting the dropped segment the TCP sender uses the slow start
algorithm to increase the window from 1 full-sized segment to the new
value of ssthresh, at which point congestion avoidance again takes over.
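
[N.B. The timeout behaviour above, sketched under the assumption that this is
the first timeout for the segment, that ssthresh is set to exactly the bound
in equation (4), and that every subsequent ACK covers one full-sized segment.
Function names are hypothetical.]

```python
from typing import Tuple


def on_retransmission_timeout(flight_size: int, smss: int) -> Tuple[int, int]:
    """Return (cwnd, ssthresh) after the first timeout for a segment."""
    ssthresh = max(flight_size // 2, 2 * smss)  # equation (4)
    cwnd = smss                                 # loss window: 1 full-sized segment
    return cwnd, ssthresh


def acks_to_leave_slow_start(cwnd: int, ssthresh: int, smss: int) -> int:
    """Count full-sized ACKs until cwnd reaches or exceeds ssthresh."""
    acks = 0
    while cwnd < ssthresh:
        cwnd += smss    # equation (2) with N >= SMSS
        acks += 1
    return acks
```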


Fast Retransmit/Fast Recovery: A TCP receiver SHOULD send an immediate
duplicate ACK when an out-of-order segment arrives.  The purpose of this ACK
is to inform the sender that a segment was received out-of-order and which
sequence number is expected. In addition, a TCP receiver SHOULD send an
immediate ACK when the incoming segment fills in all or part of a gap in the
sequence space.  This will generate more timely information for a sender
recovering from a loss through a retransmission timeout, a fast retransmit,
or an advanced loss recovery algorithm.


The TCP sender SHOULD use the "fast retransmit" algorithm to detect and repair
loss, based on incoming duplicate ACKs.  The fast retransmit algorithm uses the
arrival of 3 duplicate ACKs as an indication that a segment has been lost.
TCP then performs a retransmission of what appears to be the missing segment,
without waiting for the retransmission timer to expire.


The fast retransmit and fast recovery algorithms are implemented
   together as follows.

   1.  On the first and second duplicate ACKs received at a sender, a
       TCP SHOULD send a segment of previously unsent data per [RFC3042]
       provided that the receiver's advertised window allows, the total
       FlightSize would remain less than or equal to cwnd plus 2*SMSS,
       and that new data is available for transmission.  Further, the
       TCP sender MUST NOT change cwnd to reflect these two segments
       [RFC3042].  Note that a sender using SACK [RFC2018] MUST NOT send
       new data unless the incoming duplicate acknowledgment contains
       new SACK information.

   2.  When the third duplicate ACK is received, a TCP MUST set ssthresh
       to no more than the value given in equation (4).  When [RFC3042]
       is in use, additional data sent in limited transmit MUST NOT be
       included in this calculation.

   3.  The lost segment starting at SND.UNA MUST be retransmitted and
       cwnd set to ssthresh plus 3*SMSS.  This artificially "inflates"
       the congestion window by the number of segments (three) that have
       left the network and which the receiver has buffered.

   4.  For each additional duplicate ACK received (after the third),
       cwnd MUST be incremented by SMSS.  This artificially inflates the
       congestion window in order to reflect the additional segment that
       has left the network.

       Note: [SCWA99] discusses a receiver-based attack whereby many
       bogus duplicate ACKs are sent to the data sender in order to
       artificially inflate cwnd and cause a higher than appropriate
       sending rate to be used.  A TCP MAY therefore limit the number of
       times cwnd is artificially inflated during loss recovery to the
       number of outstanding segments (or, an approximation thereof).

       Note: When an advanced loss recovery mechanism (such as outlined
       in section 4.3) is not in use, this increase in FlightSize can
       cause equation (4) to slightly inflate cwnd and ssthresh, as some
       of the segments between SND.UNA and SND.NXT are assumed to have
       left the network but are still reflected in FlightSize.

   5.  When previously unsent data is available and the new value of
       cwnd and the receiver's advertised window allow, a TCP SHOULD
       send 1*SMSS bytes of previously unsent data.

   6.  When the next ACK arrives that acknowledges previously
       unacknowledged data, a TCP MUST set cwnd to ssthresh (the value
       set in step 2).  This is termed "deflating" the window.

       This ACK should be the acknowledgment elicited by the
       retransmission from step 3, one RTT after the retransmission
       (though it may arrive sooner in the presence of significant out-
       of-order delivery of data segments at the receiver).
       Additionally, this ACK should acknowledge all the intermediate
       segments sent between the lost segment and the receipt of the
       third duplicate ACK, if none of these were lost.

   Note: This algorithm is known to generally not recover efficiently
   from multiple losses in a single flight of packets.
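
[N.B. Steps 1 to 4 and 6 above, sketched as a tiny state machine. This is an
illustration only: the Sender class and its field names are hypothetical, it
is byte-based, it sets ssthresh to exactly the bound in equation (4), and it
leaves out SACK, limited transmit (the new data of steps 1 and 5), and any
update of FlightSize, which the caller would have to maintain.]

```python
from dataclasses import dataclass


@dataclass
class Sender:
    """Hypothetical sender-side congestion state."""
    smss: int
    cwnd: int
    ssthresh: int
    flight_size: int
    dupacks: int = 0
    in_recovery: bool = False

    def on_dup_ack(self) -> bool:
        """Handle one duplicate ACK; True means retransmit the segment at SND.UNA."""
        self.dupacks += 1
        if self.dupacks < 3:
            return False                       # step 1: cwnd is not changed
        if self.dupacks == 3:
            # step 2: equation (4)
            self.ssthresh = max(self.flight_size // 2, 2 * self.smss)
            # step 3: retransmit and inflate by the three segments that left
            self.cwnd = self.ssthresh + 3 * self.smss
            self.in_recovery = True
            return True
        self.cwnd += self.smss                 # step 4: one more segment has left
        return False

    def on_new_ack(self) -> None:
        """Step 6: an ACK for previously unacknowledged data deflates the window."""
        if self.in_recovery:
            self.cwnd = self.ssthresh
            self.in_recovery = False
        self.dupacks = 0
```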


RTO:
https://tools.ietf.org/html/rfc6298
Does not modify the behaviour in RFC 5681.

The RTO is a function of two state variables, SRTT and RTTVAR. The
following constants are used for calculations:
        G <- clock granularity in seconds
        K <- 4

[(2.1)] Until a round-trip time (RTT) measurement has been made for a segment
sent between the sender and the receiver, the sender SHOULD set RTO <- 1 second
[i.e. not the outdated 3s currently in FreeBSD]; the "backing off" on repeated
retransmission still applies.

[(2.2)] When the first RTT measurement R is made, the host MUST set
        SRTT <- R
        RTTVAR <- R/2
        RTO <- SRTT + max (G, K*RTTVAR)

[(2.3)] When a subsequent RTT measurement R' is made, a host MUST set
        RTTVAR <- (1 - beta)*RTTVAR + beta * |SRTT - R'|
        SRTT <- (1 - alpha)*SRTT + alpha*R'
        RTO <- SRTT + max (G, K*RTTVAR)

The value of SRTT used in updating RTTVAR is the one prior to the update
in the second assignment - i.e. the updates are done RTTVAR then SRTT.
The above calculation SHOULD be done with alpha=1/8 and beta=1/4 (as
suggested in [JK88]). [N.B. Should these values be smaller in the data
center so that the SRTT maintains a longer memory and isn't compromised
by a transient microburst?].

[(2.4)] Whenever RTO is computed, if it is less than 1 second, then the
         RTO SHOULD be rounded up to 1 second. [See the incast section
         for why this is unequivocally wrong in the data center]

         Traditionally, TCP implementations use coarse grain clocks to
         measure the RTT and trigger the RTO, which imposes a large
         minimum value on the RTO.  Research suggests that a large
         minimum RTO is needed to keep TCP conservative and avoid
         spurious retransmissions [AP99].  Therefore, this specification
         requires a large minimum RTO as a conservative approach, while
         at the same time acknowledging that at some future point,
         research may show that a smaller minimum RTO is acceptable or
         superior. [Vasudevan09 (incast section) clearly shows this to
         be the case.]


   Note that a TCP implementation MAY clear SRTT and RTTVAR after
   backing off the timer multiple times as it is likely that the current
   SRTT and RTTVAR are bogus in this situation.  Once SRTT and RTTVAR
   are cleared, they should be initialized with the next RTT sample
   taken per (2.2) rather than using (2.3).
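
[N.B. Rules (2.1) to (2.4) can be sketched as a small estimator. This is an
illustration, not any stack's actual code: the class name and the min_rto
parameter are hypothetical (min_rto is exposed so the 1 second floor of (2.4)
can be lowered for experiments), times are in seconds as floats, and timer
back-off is not modelled.]

```python
from typing import Optional

ALPHA = 1 / 8   # SRTT gain
BETA = 1 / 4    # RTTVAR gain
K = 4


class RtoEstimator:
    def __init__(self, granularity: float = 0.001, min_rto: float = 1.0) -> None:
        self.g = granularity
        self.min_rto = min_rto
        self.srtt: Optional[float] = None
        self.rttvar: Optional[float] = None
        self.rto = 1.0                           # (2.1): no measurement yet

    def sample(self, r: float) -> float:
        """Feed one RTT measurement and return the updated RTO."""
        if self.srtt is None:                    # (2.2): first measurement
            self.srtt = r
            self.rttvar = r / 2
        else:                                    # (2.3): RTTVAR first, then SRTT
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - r)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * r
        rto = self.srtt + max(self.g, K * self.rttvar)
        self.rto = max(self.min_rto, rto)        # (2.4): apply the floor
        return self.rto

    def clear(self) -> None:
        """Forget SRTT/RTTVAR so the next sample is handled per (2.2)."""
        self.srtt = None
        self.rttvar = None
```

With the default floor, a 100 ms first sample still yields an RTO of 1
second; with min_rto=0.0 the same sample yields about 0.3 seconds.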

[(7)]  Changes from RFC 2988
   This document reduces the initial RTO from the previous 3 seconds
   [PA00] to 1 second, unless the SYN or the ACK of the SYN is lost, in
   which case the default RTO is reverted to 3 seconds before data
   transmission begins.

Increasing TCP's initial window:
http://www.rfc-editor.org/rfc/rfc3390.txt
http://www.rfc-editor.org/rfc/rfc6928.txt

Proposes an experiment to increase the permitted TCP
initial window (IW) from between 2 and 4 segments, as specified in
RFC 3390, to 10 segments with a fallback to the existing
recommendation when performance issues are detected.  It discusses
the motivation behind the increase, the advantages and disadvantages
of the higher initial window, and presents results from several
large-scale experiments showing that the higher initial window
improves the overall performance of many web services without
resulting in a congestion collapse.

TCP Modification:
    - The upper bound for the initial window will be:

        min (10*MSS, max (2*MSS, 14600))

    - This change applies to the initial window of the connection in the
      first round-trip time (RTT) of data transmission during or following
      the TCP three-way handshake.

    - All the test results described in this document were based
       on the regular Ethernet MTU of 1500 bytes.  Future study of the
       effect of a different MTU may be needed to fully validate (1) above.

    - [In contrast to RFC 3390 and RFC 5681] The proposed change to reduce the
      default retransmission timeout (RTO) to 1 second [RFC6298] increases the
      chance for spurious SYN or SYN/ACK retransmission, thus unnecessarily
      penalizing connections with RTT > 1 second if their initial window is
      reduced to 1 segment. For this reason, it is RECOMMENDED that
      implementations refrain from resetting the initial window to 1 segment,
      unless there have been more than one SYN or SYN/ACK retransmissions or
      true loss detection has been made.

    - TCP implementations use slow start in as many as three different
      ways: (1) to start a new connection (the initial window); (2) to
      restart transmission after a long idle period (the restart window);
      and (3) to restart transmission after a retransmit timeout (the loss
      window).  The change specified in this document affects the value of
      the initial window.  Optionally, a TCP MAY set the restart window to
      the minimum of the value used for the initial window and the current
      value of cwnd (in other words, using a larger value for the restart
      window should never increase the size of cwnd).  These changes do NOT
      change the loss window, which must remain 1 segment of MSS bytes (to
      permit the lowest possible window size in the case of severe congestion).

    - To limit any negative effect that a larger initial
      window may have on links with limited bandwidth or buffer space,
      implementations SHOULD fall back to RFC 3390 for the restart window
      (RW) if any packet loss is detected during either the initial window
      or a restart window, and more than 4 KB of data is sent.
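
[N.B. The initial window bound and the optional restart window rule from the
list above, as a short sketch in bytes. The function names are hypothetical.]

```python
def iw10_upper_bound(mss: int) -> int:
    """min(10*MSS, max(2*MSS, 14600)): upper bound on the initial window."""
    return min(10 * mss, max(2 * mss, 14600))


def restart_window(initial_window: int, cwnd: int) -> int:
    """Optional rule: restarting after idle must never increase cwnd."""
    return min(initial_window, cwnd)
```

With MSS = 1460 the bound is 14600 bytes (10 segments); with a 9000 byte MSS
the max() term takes over and the bound is 18000 bytes (2 segments).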


4.  Background

    - According to the latest report from Akamai [AKAM10],
      the global broadband (> 2 Mbps) adoption has surpassed 50%,
      propelling the average connection speed to reach 1.7 Mbps, while the
      narrowband (< 256 Kbps) usage has dropped to 5%.  In contrast, TCP's
      initial window has remained 4 KB for a decade [RFC2414],
      corresponding to a bandwidth utilization of less than 200 Kbps per
      connection, assuming an RTT of 200 ms.

   - A large proportion of flows on the Internet are short web
     transactions over TCP and complete before exiting TCP slow start.

   - Applications have responded to TCP's "slow" start.
     Web sites use multiple subdomains [Bel10] to circumvent HTTP 1.1
     regulation on two connections per physical host [RFC2616].  As of
     today, major web browsers open multiple connections to the same site
     (up to six connections per domain [Ste08] and the number is growing).
     This trend is to remedy HTTP serialized download to achieve
     parallelism and higher performance.  But it also implies that today
     most access links are severely under-utilized, hence having multiple
     TCP connections improves performance most of the time.

   - Persistent connections and pipelining are designed to
     address some of the above issues with HTTP [RFC2616].  Their presence
     does not diminish the need for a larger initial window, e.g., data
     from the Chrome browser shows that 35% of HTTP requests are made on
     new TCP connections.  Our test data also shows significant latency
     reduction with the large initial window even in conjunction with
     these two HTTP features [Duk10].
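
[N.B. The "less than 200 Kbps" figure in the first Background item follows
from sending one initial window per round trip. A quick check, assuming
4 KB = 4096 bytes; the function name is hypothetical.]

```python
def first_rtt_rate_bps(window_bytes: int, rtt_s: float) -> float:
    """Rate achieved when exactly one window is delivered per round trip."""
    return window_bytes * 8 / rtt_s


rate_4kb = first_rtt_rate_bps(4096, 0.2)     # roughly 164 Kbps at 200 ms RTT
rate_iw10 = first_rtt_rate_bps(14600, 0.2)   # roughly 584 Kbps with a 14600 byte IW
```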

5. Advantages of Larger Initial Windows

   - Reducing Latency

     An increase of the initial window from 3 segments to 10 segments
     reduces the total transfer time for data sets greater than 4 KB by up
     to 4 round trips.

     The table below compares the number of round trips between IW=3 and
     IW=10 for different transfer sizes, assuming infinite bandwidth, no
     packet loss, and the standard delayed ACKs with large delayed-ACK
     timer.

            ---------------------------------------
           | total segments |   IW=3   |   IW=10   |
            ---------------------------------------
           |         3      |     1    |      1    |
           |         6      |     2    |      1    |
           |        10      |     3    |      1    |
           |        12      |     3    |      2    |
           |        21      |     4    |      2    |
           |        25      |     5    |      2    |
           |        33      |     5    |      3    |
           |        46      |     6    |      3    |
           |        51      |     6    |      4    |
           |        78      |     7    |      4    |
           |        79      |     8    |      4    |
           |       120      |     8    |      5    |
           |       127      |     9    |      5    |
            ---------------------------------------

   For example, with the larger initial window, a transfer of 32
   segments of data will require only 2 rather than 5 round trips to
   complete.

   - Recovering Faster from Loss on Under-Utilized or Wireless Links

     A greater-than-3-segment initial window increases the chance to
     recover packet loss through Fast Retransmit rather than the lengthy
     initial RTO [RFC5681].  This is because the fast retransmit algorithm
     requires three duplicate ACKs as an indication that a segment has
     been lost rather than reordered.  While newer loss recovery
     techniques such as Limited Transmit [RFC3042] and Early Retransmit
     [RFC5827] have been proposed to help speeding up loss recovery from a
     smaller window, both algorithms can still benefit from the larger
     initial window because of a better chance to receive more ACKs.


8.  Mitigation of Negative Impact

   Much of the negative impact from an increase in the initial window is
   likely to be felt by users behind slow links with limited buffers.
   The negative impact can be mitigated by hosts directly connected to a
   low-speed link advertising an initial receive window smaller than 10
   segments.  This can be achieved either through manual configuration
   by the users or through the host stack auto-detecting the low-
   bandwidth links.

   Additional suggestions to improve the end-to-end performance of slow
   links can be found in RFC 3150 [RFC3150].


RTO & High Performance:
https://tools.ietf.org/html/rfc7323
Obsoletes the venerable RFC 1323.

[Also in RFC 1323]
        An additional mechanism could be added to the TCP, a per-host
        cache of the last timestamp received from any connection.  This
        value could then be used in the PAWS mechanism to reject old
        duplicate segments from earlier incarnations of the connection,
        if the timestamp clock can be guaranteed to have ticked at least
        once since the old connection was open.  This would require that
        the TIME-WAIT delay plus the RTT together must be at least one
        tick of the sender's timestamp clock.  Such an extension is not
        part of the proposal of this RFC.


Appendix G.  RTO Calculation Modification

   Taking multiple RTT samples per window would shorten the history
   calculated by the RTO mechanism in [RFC6298], and the below algorithm
   aims to maintain a similar history as originally intended by
   [RFC6298].

   It is roughly known how many samples a congestion window worth of
   data will yield, not accounting for ACK compression, and ACK losses.
   Such events will result in more history of the path being reflected
   in the final value for RTO, and are uncritical.  This modification
   will ensure that a similar amount of time is taken into account for
   the RTO estimation, regardless of how many samples are taken per
   window:

      ExpectedSamples = ceiling(FlightSize / (SMSS * 2))

      alpha' = alpha / ExpectedSamples

      beta' = beta / ExpectedSamples

   Note that the factor 2 in ExpectedSamples is due to "Delayed ACKs".
   Instead of using alpha and beta in the algorithm of [RFC6298], use
   alpha' and beta' instead:

      RTTVAR <- (1 - beta') * RTTVAR + beta' * |SRTT - R'|

      SRTT <- (1 - alpha') * SRTT + alpha' * R'

      (for each sample R')
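
A minimal sketch of the scaled update above, in Python. The default gains
alpha = 1/8 and beta = 1/4 are the values from RFC 6298; the function names and
the guard that forces at least one expected sample are assumptions made for
illustration only, not part of the appendix.

```python
import math


def expected_samples(flight_size, smss):
    """ceiling(FlightSize / (SMSS * 2)); the factor 2 reflects delayed ACKs."""
    # Assumption: never return less than 1, so an empty flight cannot divide by zero.
    return max(1, math.ceil(flight_size / (smss * 2)))


def update_rtt(srtt, rttvar, sample, flight_size, smss, alpha=1 / 8, beta=1 / 4):
    """Apply one RTT sample R' using the scaled gains alpha' and beta'."""
    n = expected_samples(flight_size, smss)
    alpha_p = alpha / n
    beta_p = beta / n
    # RTTVAR is updated first, using the SRTT value from before this sample.
    rttvar = (1 - beta_p) * rttvar + beta_p * abs(srtt - sample)
    srtt = (1 - alpha_p) * srtt + alpha_p * sample
    return srtt, rttvar
```

With ten segments in flight, five samples are expected per window, so each
sample moves SRTT and RTTVAR by one fifth of the usual gain.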

Appendix H.  Changes from RFC 1323

   Several important updates and clarifications to the specification in
   RFC 1323 are made in this document.  The [important] technical changes are
   summarized below:

   (d)  The description of which TSecr values can be used to update the
        measured RTT has been clarified.  Specifically, with timestamps,
        the Karn algorithm [Karn87] is disabled.  The Karn algorithm
        disables all RTT measurements during retransmission, since it is
        ambiguous whether the <ACK> is for the original segment, or the
        retransmitted segment.  With timestamps, that ambiguity is
        removed since the TSecr in the <ACK> will contain the TSval from
        whichever data segment made it to the destination.

   (e)  RTTM update processing explicitly excludes segments not updating
        SND.UNA.  The original text could be interpreted to allow taking
        RTT samples when SACK acknowledges some new, non-continuous
        data.

   (f)  In RFC 1323, Section 3.4, step (2) of the algorithm to control
        which timestamp is echoed was incorrect in two regards:

        (1)  It failed to update TS.Recent for a retransmitted segment
             that resulted from a lost <ACK>.

        (2)  It failed if SEG.LEN = 0.

        In the new algorithm, the case of SEG.TSval >= TS.Recent is
        included for consistency with the PAWS test.

   (g)  It is now recommended that the Timestamps option is included in
        <RST> segments if the incoming segment contained a Timestamps
        option.

   (h)  <RST> segments are explicitly excluded from PAWS processing.

   (j)  Snd.TSoffset and Snd.TSclock variables have been added.
        Snd.TSclock is the sum of my.TSclock and Snd.TSoffset.  This
        allows the starting points for timestamp values to be randomized
        on a per-connection basis.  Setting Snd.TSoffset to zero yields
        the same results as [RFC1323].  Text was added to guide
        implementers to the proper selection of these offsets, as
        entirely random offsets for each new connection will conflict
        with PAWS.


Congestion Window Validation (CWV):
http://www.ietf.org/proceedings/69/slides/tcpm-7.pdf
https://tools.ietf.org/html/rfc7661

Provides a mechanism to address issues that arise when
TCP is used for traffic that exhibits periods where the sending rate
is limited by the application rather than the congestion window. This
RFC provides an experimental update to TCP that allows a TCP sender to
restart quickly following a rate-limited interval.  This method is
expected to benefit applications that send rate-limited traffic using
TCP while also providing an appropriate response if congestion is
experienced.

Motivation:
   Standard TCP states that a TCP sender SHOULD set cwnd to no more than
   the Restart Window (RW) before beginning transmission if the TCP
   sender has not sent data in an interval exceeding the retransmission
   timeout, i.e., when an application becomes idle [RFC5681].  [RFC2861]
   notes that this TCP behaviour was not always observed in current
   implementations.  Experiments confirm this to still be the case (see
   [Bis08]).

   Congestion Window Validation (CWV) [RFC2861] introduced the term
   "application-limited period" for the time when the sender sends less
   than is allowed by the congestion or receiver windows.


   Standard TCP does not impose additional restrictions on the growth of
   the congestion window when a TCP sender is unable to send at the
   maximum rate allowed by the cwnd.  In this case, the rate-limited
   sender may grow a cwnd far beyond that corresponding to the current
   transmit rate, resulting in a value that does not reflect current
   information about the state of the network path the flow is using.
   Use of such an invalid cwnd may result in reduced application
   performance and/or could significantly contribute to network
   congestion.


Active Queue Management (AQM):

Active Queue Management (AQM) is an effort to avoid the latency increases (and the resulting
longer feedback loop) and bursty losses caused by naive tail drop in intermediate buffering.
The concept was introduced, along with a discussion of the queue management algorithm "RED"
(Random Early Detection/Drop), by RFC 2309. The most current RFC is 7567.

The usual mix of long high-throughput and short low-latency flows places conflicting demands on
the queue occupancy of a switch:

   o  The queue must be short enough that it does not impose excessive
      latency on short flows.
   o  The queue must be long enough to buffer sufficient data for the
      long flows to saturate the path capacity.
   o  The queue must be short enough to absorb incast bursts without
      excessive packet loss.

RED:
   The RED algorithm itself consists of two main parts: estimation of
   the average queue size and the decision of whether or not to drop an
   incoming packet.

   (a) Estimation of Average Queue Size

        RED estimates the average queue size, either in the forwarding
        path using a simple exponentially weighted moving average (such
        as presented in Appendix A of [Jacobson88]), or in the
        background (i.e., not in the forwarding path) using a similar
        mechanism.

   (b) Packet Drop Decision

        In the second portion of the algorithm, RED decides whether or
        not to drop an incoming packet.  It is RED's particular
        algorithm for dropping that results in performance improvement
        for responsive flows.  Two RED parameters, minth (minimum
        threshold) and maxth (maximum threshold), figure prominently in
        this decision process.  Minth specifies the average queue size
        *below which* no packets will be dropped, while maxth specifies
        the average queue size *above which* all packets will be
        dropped.  As the average queue size varies from minth to maxth,
        packets will be dropped with a probability that varies linearly
        from 0 to maxp.
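
The two parts of RED described above can be sketched in Python as follows. This
follows only the description quoted here: an exponentially weighted moving
average of the queue size, and a drop probability that is 0 below minth, 1 at
or above maxth, and linear from 0 to maxp in between. The function names, the
EWMA weight of 0.002, and the injectable random source are illustrative
assumptions; further refinements found in real RED implementations are omitted.

```python
import random


def update_average(avg, queue_len, weight=0.002):
    """One EWMA step for the average queue size (weight is an assumed value)."""
    return (1 - weight) * avg + weight * queue_len


def drop_probability(avg, minth, maxth, maxp):
    """Drop probability as a function of the average queue size."""
    if avg < minth:
        return 0.0  # below minth no packets are dropped
    if avg >= maxth:
        return 1.0  # at or above maxth all packets are dropped
    # Between the thresholds the probability rises linearly from 0 to maxp.
    return maxp * (avg - minth) / (maxth - minth)


def should_drop(avg, minth, maxth, maxp, rng=random.random):
    """Decide whether to drop (or, with ECN, mark) an incoming packet."""
    return rng() < drop_probability(avg, minth, maxth, maxp)
```

Setting minth equal to maxth removes the linear region entirely, leaving a
single step threshold on the average queue size.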


Recommendations on Queue Management and Congestion Avoidance
in the Internet
https://tools.ietf.org/html/rfc2309

IETF Recommendations Regarding Active Queue Management
https://tools.ietf.org/html/rfc7567

https://en.wikipedia.org/wiki/Active_queue_management


Explicit Congestion Notification (ECN):
At its core, ECN in TCP allows compliant routers to notify compliant senders of "virtual
drops", a congestion indicator that prompts the sender to halve its congestion window. The
sender need not wait for the retransmit timeout or duplicate ACKs to learn of a congestion
event, and the receiver avoids the latency induced by drop/retransmit. ECN relies on some
form of AQM in the intermediate routers/switches to decide when to set the CE (Congestion
Experienced) codepoint in the IP header; it is then the receiver's responsibility to set the ECE
(ECN-Echo) flag in the TCP header of the subsequent ACK. The receiver will continue to send
packets marked with the ECE bit until it receives a packet with the CWR (Congestion Window
Reduced) bit set. Note that although this last design decision makes ECN robust in the presence
of ACK loss (the original version of ECN specifies that ACKs / SYNs / SYN-ACKs not be marked as
ECN capable, so they are not eligible for marking), it limits the ECN signal to once per RTT.
As we'll see later, this leads to interoperability issues with DCTCP.

ECN is negotiated at connection time. In FreeBSD it is configured by a sysctl defaulting to off
for all connections. Enabling the sysctl enables it for all connections. The last time a survey
was done, 2.7% of the internet would not respond to a SYN negotiating ECN. This isn't fatal as
subsequent SYNs will switch to not requesting ECN. This just adds the default RTO to connection
establishment (3s in FreeBSD, 1s per RFC6298 - discussed later).

Linux has some very common sense configurability improvements. Its ECN knob takes on _3_ values:
0) no request / no accept 1) request / accept 2) no request / accept. The default is (2),
supporting it for peers adventurous enough to request it. The route command can specify ECN by
subnet. In effect, this allows servers / clients to use it only within a data center or between
compliant data centers.

ECN sees very little usage due to continued compatibility concerns. Although the difficulty of
correctly tuning maxth and minth in RED and many other AQM mechanisms is not specific to ECN,
RED et al. are necessary to use ECN and thus add to the difficulties of its use.


Talks:
More Accurate ECN Feedback in TCP (AccECN)
- https://www.ietf.org/proceedings/90/slides/slides-90-tcpm-10.pdf

ECN is slow and does not report the extent of congestion, just its existence. It lacks
interoperability with DCTCP. A mechanism is needed for negotiating finer-grained,
adaptive congestion notification.


RFCS:

A Proposal to add Explicit Congestion Notification (ECN) to IP
- https://tools.ietf.org/html/rfc2481

Initial proposal.


The Addition of Explicit Congestion Notification (ECN) to IP
- https://tools.ietf.org/html/rfc3168

Elaboration and further specification of how to tie it in to TCP.

Adding Explicit Congestion Notification (ECN) Capability to TCP's SYN/ACK Packets
- https://tools.ietf.org/html/rfc5562

Sometimes referred to as ECN+. This extends ECN to SYN/ACK packets. Note that SYN
packets are still not covered, being considered a potential security hole.


Accurate ECN (AccECN)
Problem Statement and Requirements for Increased Accuracy
in Explicit Congestion Notification (ECN) Feedback

- https://tools.ietf.org/html/rfc7560

   "A primary motivation
   for this document is to intervene before each proprietary
   implementation invents its own non-interoperable handshake, which
   could lead to _de facto_ consumption of the few flags or codepoints
   that remain available for standardizing capability negotiation."


Incast:
The term was coined in [PANFS] for the case of increasing the number of
simultaneously initiated, effectively barrier-synchronized, fan-in flows
into a single port to the point where the instantaneous switch / NIC buffering
capacity is exceeded, causing a decline in aggregate bandwidth as the need
for retransmits increases. This is further exacerbated by tail-drop behavior in
the switch, whereby multiple losses within individual streams exceed the
recovery abilities of duplicate ACKs or SACK, leading to RTOs before the flow
resumes.


The Panasas ActiveScale Storage Cluster - Delivering Scalable
High Bandwidth Storage [PANFS]
- http://acm.supercomputing.org/sc2004/schedule/pdfs/pap207.pdf

Focuses on the Object-based Storage Device (OSD) component backing the PanFS
distributed file system. PanFS runs on the client; backend storage consists of
networked block devices (OSDs). The intelligence lies in how stripes are laid
out across OSDs. PanFS relies on a Metadata Server (MDS) to control the interaction
of clients with the objects on OSDs and maintain cache coherency.

Scalable bandwidth is achieved through aggregation by striping data across many
OSDs. Although in principle it would be desirable to stripe files as widely as
possible, in practice, in their 1Gbps testbed (this is 2004), bandwidth scaled
linearly from 3 to 7 OSDs, but beyond 14 OSDs aggregate bandwidth actually
decreased. With a 10ms disk access latency, if just one OSD experienced enough
packet loss to result in one 200ms RTO the system would suffer a 10x decrease in
performance.

Changes to address the incast problem:
  - Reduce the minRTO from 200ms to 50ms.
  - Tuning the _individual_ socket buffer size. While a client must have a large
    aggregate receive buffer size, each individual stream's receive buffer should
    be relatively small. Thus they reduced the clients' (per OSD) receive socket
    buffer to under 64K.
  - To reduce the size of a single synchronized incast response PanFS implements
    a two level striping pattern. The first level is optimized for RAID's parity
    update performance and read overhead. The second level of striping is designed
    to resist incast induced bandwidth penalties by stacking successive parity
    stripes in the same subset of objects. They call N sequential
    parity stripes that are stacked in the same set of objects a 'visit', because
    a client repeatedly fetches data from just a few OSDs (whose number is
    controlled by parity stripe width) for a while, then moves on to the next set
    of OSDs. This striping pattern minimizes simultaneous fan-in and thus the
    potential for incast. Typically PanFS stripes about 1GB of data per visit,
    using a round-robin layout algorithm of visits across all OSDs.


Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems
- https://www.usenix.org/legacy/event/fast08/tech/full_papers/phanishayee/phanishayee_html/

Attempts to do a more general analysis of incast than [PANFS]. Analysis is based on
the model of a cluster-based storage system with data blocks striped over a number of
servers. They refer to the portion of a block stored on a single server as a Server
Request Unit (SRU). A subsequent block request will only be made after the client
has received all the data for the current block. They refer to such reads as
'synchronized reads'. The paper makes three contributions to the literature:

  - Explores the root causes of incast, characterizing it under a variety of
    conditions (buffer space, varying number of servers, etc.). Buffer space can
    delay the onset of Incast, but any particular switch configuration will have
    some maximum number of servers that can send simultaneously before
    throughput collapse occurs.

    - Reproduce incast collapse on 3 different models of switches. In some cases
      disabling QoS can help delay incast by freeing up packet buffers for
      general switching.

    - Demonstrate applicability of simulation by showing that the throughput
      collapse curve produced by ns-2 with a simulated 32KB buffer closely
      matches that shown by the HP Procurve 2848 with QoS disabled.

    - Analysis of TCP traces obtained from simulation reveals that TCP
      retransmission timeouts are the primary cause of incast.

    - Displays the effect of varying the switch buffer size. Doubling the size
      of the switch's output port buffer doubles the number of servers that can
      be supported before the system experiences incast.

    - TCP performs well in settings without synchronized reads, which can
      be modelled by an infinite SRU size. Running netperf across many servers
      does not induce incast. With larger SRU sizes servers can use the spare
      link capacity made available by any stalled flow waiting for a timeout
      event.


  - Examines the effectiveness of existing TCP variants (e.g. Reno, NewReno,
    SACK, and limited transmit). Although the move from Reno to NewReno
    improves performance, none of the additional improvements help. When TCP
    loses all packets in its window or loses retransmissions, no clever loss
    recovery algorithms can help.

  - Examines a set of techniques that are moderately effective in masking Incast,
    such as drastically reducing TCP's retransmission timeout timer. None of
    these techniques are without drawbacks.

    - reducing RTOmin from 200ms to 200us improves throughput by an order of
      magnitude for 8-32 servers. However, at the time of the paper Linux and
      BSD TCP implementations were unable to provide a timer of sufficient
      granularity to measure RTT at finer than the system clock granularity.


Understanding TCP Incast Throughput Collapse in Datacenter Networks
- http://conferences.sigcomm.org/sigcomm/2009/workshops/wren/papers/p73.pdf

Proposes an analytical model of limited generality based on the results
observed in two test beds.
  - Observed little benefit from disabling delayed acks

  - Observed a much shallower decline in throughput after 4 servers with 1ms
    minRTO vs 200ms minRTO. No benefit was shown for 200us over 1ms. [The
    next paper concludes that this was because the calculated RTO never went
    below 5ms, so a 200us minRTO was equivalent to disabling minRTO in this
    setting].

  - For large RTO timer values, reducing the RTO timer value is a first-order
    mitigation. For smaller RTO timer values, intelligently controlling the
    inter-packet wait time [pacing] becomes crucial.

  - Observes two regions of throughput increase. Following the initial
    throughput decline there is an increasing region. They reason that: As
    the number of senders increases, 'T' increases, and there is less
    overlap in the RTO periods for different senders. This means
    the impact of RTO events is less severe - a mitigating effect.
    (Prob(enter RTO at t) = { 1/T : d < t < d + T, 0: otherwise} - d is the
    delay for congestion info to propagate back to the sender and T is the
    width of the uniform distribution in time.)

  - The smaller the RTO timer values, the faster the rate of recovery between
    the throughput minimum and the second order throughput maximum. For smaller
    RTO timer values, the same increase in 'T' will have a larger mitigating
    effect. Hence, as the number of senders increases, the same increase in 'T'
    will result in a faster increase in the goodput for smaller RTO timer
    values.

  - After the second order goodput maximum, the slope of throughput decrease is the
    same for different RTO timer values. When 'T' becomes comparable to or larger than
    the RTO timer value, the amount of interference between retransmits after RTO
    and transmissions before RTO no longer depends on the value of the RTO timer.


Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication
- https://www.cs.cmu.edu/afs/cs.cmu.edu/Web/People/ekrevat/docs/SIGCOMMIncast.pdf

Effectively makes the case for using high resolution timers to enable microsecond
granularity TCP timeouts. The authors report that this technique is
effective in avoiding TCP incast collapse in both simulation and real-world
experiments.
  - Prototype uses Linux's high resolution kernel timers.
  - Demonstrate that this change prevents incast collapse in practice for up
    to 47 senders.
  - Demonstrate that simply reducing RTOmin in today's [2009] TCP
    implementations without also improving the timing granularity does not
    prevent TCP incast.

    - Even without incast patterns, the RTO can determine observed performance.
      Simple example: They started ten bulk-data transfer TCP flows from ten
      clients to one server.  They then had another client issue small
      request packets for 1KB of data from the server, waiting for
      the response before sending the next request.  Approximately
      1% of these requests experienced a TCP timeout, delaying
      the response by at least 200ms. Finer-grained retransmission handling
      can improve the performance of latency sensitive applications.

   Evaluating Throughput with Fine-Grained RTO:
     - to be maximally effective timers must operate on a granularity close to
       the RTT of the network.

     - Jacobson RTO Estimation:
       - The standard RTO estimator [V. Jacobson, 88] tracks a smoothed
         estimate of the round-trip time, and sets the timeout to this RTT
         estimate plus 4 times the mean deviation (a simpler calculation
         than the standard deviation, and given a normal distribution of
         prediction errors mdev = sqrt(pi/2)*sdev).

         - RTO = SRTT + (4xRTTMDEV)

         - Two factors set lower bounds on the value that the RTO can achieve:

           - the explicit configuration parameter RTOmin

           - the implicit effects of the granularity with which the RTT is
             measured and with which the kernel sets and checks timers.
             Most implementations track RTTs and timers at a granularity
             of 1ms or larger. Thus the minimum achievable RTO is 5ms.

     - In Simulation (simulate one client with multiple servers connected
       through a single switch with an unloaded RTT of 100us, each node has
       a 1Gbps link, the switch buffers have 32KB of space per output port,
       and a random timer scheduling delay of up to 20us to account for
       real-world variance):

       - With an RTOmin of 200ms throughput drops by an order of magnitude
         with 8 concurrent senders.

       - Reducing RTOmin to 1ms is effective for 8-16 concurrent senders,
         fully utilizing the client's link. However, throughput declines
         as the number of servers is increased. 128 concurrent senders
         use only 50% of the available link bandwidth even with a 1ms
         RTOmin.

     - In Real Clusters (sixteen node cluster w/ HP Procurve 2848 &
       48 node cluster w/ Force10 S50 switch - all nodes 1Gbps and a
       client to server RTT of ~100us):

       - Modified the Linux 2.6.28 kernel to use 'microsecond-accurate'
         timers with microsecond granularity RTT estimation.

       - For all configurations, throughput drops with increasing RTOmin
         above 1ms. For 8 and 16 concurrent senders, the default RTOmin
         of 200ms results in nearly 2 orders of magnitude drop in
         throughput.

       - Results show identical performance for RTOmin values of 200us and
         1ms. Although the baseline RTTs can be between 50-100us, increased
         congestion causes RTTs to rise to 400us on average with spikes as
         high as 850us. Thus the higher RTTs combined with increased RTT
         variance cause the RTO estimator to set timeouts of 1-3ms, and an
         RTOmin below 1ms will not lead to shorter retransmission times.
         In effect, specifying an RTOmin <= 1ms is equivalent to eliminating
         RTOmin.

   Next-Generation Datacenters:

     - 10Gbps networks have smaller RTTs than 1Gbps - port-to-port latency
       can be as low as 10us. In a sampling of an active storage node at
       LANL 20% of RTTs are below 100us even when accounting for kernel
       scheduling.
     - smaller RTO values are required to avoid idle link time.

     - Scaling to Thousands [simulating large numbers of servers on a 10Gbps network]
       (reduce baseline RTTs from 100us to 20us, eliminate 20us timer scheduling
       variance, increase link capacity to 10Gbps, set per-port buffer size to 32KB,
       increase blocksize to 80MB to ensure each flow can saturate a 10Gbps link,
       vary the number of servers from 32 to 2048):

       - Having an artificial bound of either 1ms or 200us results in low throughput
         in a network whose RTTs are 20us - underscoring the requirement that
         retransmission timeouts should be on the same timescale as network latency
         to avoid incast collapse.

       - Eliminating a lower bound on RTO performs well for up to 512 concurrent
         senders. For 1024 servers and beyond, even the aggressively low RTO
         configuration sees up to a 50% reduction in throughput resulting from
         significant periods of link idle time caused by repeated, simultaneous,
         successive timeouts.

         - For incast communication the standard exponential backoff increase of
           RTO can overshoot some portion of the time the link is actually idle.
           Because only one flow must overshoot to delay the entire transfer,
           the probability of overshooting increases with increased number of
           flows.
         - Decreased throughput for a large number of flows can be attributed to
           many flows timing out simultaneously, backing off deterministically,
           and retransmitting at the same time. While some flows are successful
           on this retransmission, a majority of flows lose their retransmitted
           packet and back off by another factor of two, sometimes far beyond
           when the link becomes idle.

     - Desynchronizing Retransmissions

       - Adding some randomness to the RTO will desynchronize retransmissions.

       - Adding an adaptive randomized RTO component to the scheduled timeout:

         timeout = (RTO + (rand(0.5) x RTO)) x 2^backoff

         performs well regardless of the number of concurrent senders.
         Nonetheless, real-world variances may be large enough to avoid the
         need for explicit randomization in practice.

       - The authors do not evaluate the impact on wide area flows.
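
A small Python sketch of the randomized timeout formula given above. It assumes
that rand(0.5) means a value drawn uniformly from [0, 0.5); the function name
and the injectable random source are hypothetical and used only for
illustration.

```python
import random


def randomized_timeout(rto, backoff, rng=random.random):
    """timeout = (RTO + (rand(0.5) x RTO)) x 2^backoff."""
    # Assumption: rand(0.5) is uniform in [0, 0.5); rng() is uniform in [0, 1).
    jitter = 0.5 * rng()
    return (rto + jitter * rto) * (2 ** backoff)
```

Because each flow draws its own jitter, flows that time out together are
spread over up to half an RTO on the first retransmission, and the spread
doubles with each backoff step.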

     - Implementing fine-grained retransmissions

       - Three changes to the Linux TCP stack were required:

         - microsecond resolution time accounting to track RTTs with greater
           precision - store microseconds in the TCP timestamp option
           [timestamp resolution can go as high as 57ns without violating the
           requirements of PAWS]

         - redefinition of TCP constants - timer constants formerly defined in
           terms of jiffies [ticks] are converted to absolute values (e.g. 1ms
           instead of 1 jiffy)

         - replacement of low-resolution timers with hrtimers - replace standard
           timer objects in the socket structure with the hrtimer structure,
           ensuring that all calls to set, reset, or clear timers use the
           hrtimer functions.

       - Results:

         - Using the default 200ms RTOmin, throughput plummets beyond 8
           concurrent senders on both testbeds.

         - On the 16 server testbed, with a 5ms jiffy-based RTOmin, throughput begins
           to drop at 8 servers to ~70% of link capacity and slowly decreases
           thereafter. On the 47 server testbed [Force10 switch] the 5ms
           RTOmin kernel obtained 70-80% throughput with a substantial
           decline after 40 servers.

         - TCP hrtimer implementation / microsecond RTO kernel is able to
           saturate the link for up to 16/47 servers [total number in
           both testbeds].

       - Implications of Fine-Grained TCP Retransmissions:

         - A receiver's delayed ACK timer should always fire before the sender's
           retransmission timer fires, to prevent the sender from timing out
           waiting for an ACK that is merely delayed. Current systems protect
           against this by setting the delayed ACK timer to a value (40ms)
           that is safely under the RTOmin (200ms).

         - A host with microsecond granularity retransmissions would periodically
           experience an unnecessary timeout when communicating with unmodified
           hosts in environments where the RTO is below 40ms (e.g., in the data
           center and for short flows in the WAN), because the sender incorrectly
           assumes that a loss has occurred. In practice the two consequences
           are mitigated by newer TCP features and the limited circumstances in
           which they occur (and bulk data transfer is essentially unimpacted by
           the issue).

           - The major potential effect of a spurious timeout is a loss of
             performance: a flow that experiences a timeout will reduce
             its slow-start threshold (ssthresh) by half, its window to one
             and attempt to rediscover link capacity. It is important to
             understand that spurious timeouts do not endanger network
             stability through increased congestion [On estimating end-to-end
             network path properties. SIGCOMM 99]. Spurious timeouts
             occur not when the network path drops packets, but rather when
             the path observes a sudden, higher delay.

         - Several algorithms have been proposed to undo the effects of spurious
           timeouts and, in the case of F-RTO [Forward
           RTO-Recovery RFC 4138], adopted in the Linux TCP implementation.

       - When seeding torrents over a WAN there was no observable difference
         in performance between the 200us and 200ms RTOmin [no penalty].

       - Interaction with Delayed ACK in the Datacenter: For servers using a
         reduced RTO in a datacenter environment, the server's retransmission
         timer may expire long before an unmodified client's 40ms delayed ACK timer
         expires. As a result, the server will time out and resend the unacked
         packet, cutting ssthresh in half and rediscovering link capacity using
         slow-start. Because the client acknowledges the retransmitted segment
         immediately, the server does not observe a coarse-grained 40ms delay,
         only an unnecessary timeout.

       - Although for full performance delayed acks should be disabled, unmodified
         clients still achieve good performance and avoid incast when only the
         servers implement fine-grained retransmissions.

Data Center Transmission Control Protocol (DCTCP):
The CC protocol developed by Microsoft & Stanford uses simplified switch RED/ECN CE marking to
provide fine-grained congestion notification to senders. RED is enabled in the switch but with
minth=maxth=K, where K is an empirically determined constant that is a function of bandwidth
and desired switch utilization vs rate of convergence. Common values for K are 5 for 1Gbps
and 60 for 10Gbps. The value for 40Gbps is presumably on the order of 240. The sender's
congestion window is scaled back once per RTT as a function of (#ECE/(#segments in window))/2.
In the degenerate case of all segments being marked, the window is scaled back a la a loss in
Reno. In the steady state latencies are much lower than in Reno due to considerably reduced
switch occupancy.

There is currently no mechanism for negotiating CC protocols, and DCTCP's reliance on continuous
ECE notifications is incompatible with ECN's continuous repeating of the same ECE until a CWR
is received. In effect ECN support has to be successfully negotiated when establishing the
connection, but the receiver has to instead provide one ECE per new CE seen.

RFC:
Datacenter TCP (DCTCP): TCP Congestion Control for Datacenters
https://tools.ietf.org/pdf/draft-ietf-tcpm-dctcp-00.pdf


The window scaling factor is referred to as 'alpha'. Alpha=0 corresponds
to no congestion, alpha=1 corresponds to a loss event in Reno or an ECE mark in standard
ECN - resulting in a halving of the congestion window. 'g' is the feedback gain, 'M' is the
fraction of bytes sent that were marked. Alpha and the congestion window 'cwnd' are calculated
as follows:

alpha = alpha * (1 - g) + g * M

cwnd = cwnd * (1 - alpha/2)
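
The two updates above can be sketched in Python as follows. The function names
and the handling of an empty observation window are assumptions for
illustration; the default gain of 1/16 is the fixed value these notes attribute
to the Microsoft implementation.

```python
def update_alpha(alpha, marked_bytes, sent_bytes, g=1 / 16):
    """alpha = alpha * (1 - g) + g * M, with M the fraction of bytes marked."""
    # Assumption: with nothing sent in the window, treat M as 0.
    m = marked_bytes / sent_bytes if sent_bytes else 0.0
    return alpha * (1 - g) + g * m


def reduce_cwnd(cwnd, alpha):
    """cwnd = cwnd * (1 - alpha/2); alpha = 1 gives the Reno-style halving."""
    return cwnd * (1 - alpha / 2)
```

With alpha near 0 the window is barely reduced, while sustained marking of
every byte drives alpha toward 1 and the reduction toward a full halving.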

To cope with delayed acks DCTCP specifies the following state machine - CE refers to DCTCP.CE,
a new Boolean TCP state variable, "DCTCP Congestion Encountered" - which is initialized to
false and stored in the Transmission Control Block (TCB).

                                  Send immediate
                                  ACK with ECE=0
                        .----.    .-------------.     .---.
           Send 1 ACK  /     v    v             |    |     \
            for every |     .------.           .------.     | Send 1 ACK
            m packets |     | CE=0 |           | CE=1 |     | for every
           with ECE=0 |     '------'           '------'     | m packets
                       \     |    |             ^    ^     /  with ECE=1
                        '---'      '------------'    '----'
                                   Send immediate
                                   ACK with ECE=1

The clear implication of this is that if the ACK is delayed by more than m
packets, for example because the peers assume different values of m or
because ACKs are dropped, the signal can underestimate the level of
encountered congestion. None of the literature suggests that this has been a
problem in practice.
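
The receiver side of the state machine can be sketched in a few lines of
Python. This is only one reading of the diagram: it assumes that the
immediate ACK sent on a CE change covers the segments received before the
change and therefore carries the old CE value, so that a single ACK never
covers both CE and non-CE segments. The class and method names are
hypothetical, and m=2 is an arbitrary default.

```python
class DctcpReceiver:
    """Sketch of the delayed-ACK state machine (illustrative names only)."""

    def __init__(self, m=2):
        self.m = m          # send one delayed ACK for every m packets
        self.ce = False     # DCTCP.CE, initialized to false
        self.pending = 0    # segments received but not yet acknowledged

    def on_segment(self, ce_marked):
        """Process one data segment; return the ECE flag of each ACK sent."""
        acks = []
        if ce_marked != self.ce:
            # CE state change: immediately acknowledge what is pending,
            # using the ECE value of the state those segments arrived in
            # (an assumption of this sketch, see the text above).
            if self.pending:
                acks.append(self.ce)
                self.pending = 0
            self.ce = ce_marked
        self.pending += 1
        if self.pending >= self.m:
            # Regular delayed ACK: ECE mirrors the current DCTCP.CE.
            acks.append(self.ce)
            self.pending = 0
        return acks


if __name__ == "__main__":
    rx = DctcpReceiver(m=2)
    for marked in (False, False, False, True, True, False):
        print(marked, rx.on_segment(marked))
```

If the sender and receiver disagree about m, or an ACK is lost, a later
cumulative ACK can span both kinds of segment, which is the underestimation
case described above.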

[Section 3.4 of RFC]
Handling of SYN, SYN-ACK, RST Packets
   [RFC3168] requires that a compliant TCP MUST NOT set ECT on SYN or
   SYN-ACK packets.  [RFC5562] proposes setting ECT on SYN-ACK packets,
   but maintains the restriction of no ECT on SYN packets.  Both these
   RFCs prohibit ECT in SYN packets due to security concerns regarding
   malicious SYN packets with ECT set.  These RFCs, however, are
   intended for general Internet use, and do not directly apply to a
   controlled datacenter environment.  The switching fabric can drop TCP
   packets that do not have the ECT set in the IP header.  If SYN and
   SYN-ACK packets for DCTCP connections do not have ECT set, they will
   be dropped with high probability.  For DCTCP connections, the sender
   SHOULD set ECT for SYN, SYN-ACK and RST packets.

[Section 4]
Implementation Issues
- the implementation must choose a suitable estimation gain (feedback gain)
  - [DCTCP10] provides a theoretical basis for its selection, in practice
    more practical to select empirically by network/workload
  - The Microsoft implementation uses a fixed estimation gain of 1/16

- the implementation must decide when to use DCTCP. DCTCP may not be
  suitable or supported for all peers.

- It is RECOMMENDED that the implementation deal with loss episodes in
   the same way as conventional TCP.

- To prevent incast throughput collapse, the minimum RTO (MinRTO) should be
  lowered significantly. The default value of MinRTO in Windows is 300ms,
  Linux 200ms, and FreeBSD 233ms. A lower MinRTO requires a correspondingly
  lower delayed ACK timeout on the receiver. Thus, it is RECOMMENDED that an
  implementation allow configuration of lower timeouts for DCTCP connections.

- It is also RECOMMENDED that an implementation allow configuration of
  restarting the congestion window (cwnd) of idle DCTCP connections as
  described in [RFC5681].

-  [RFC3168] forbids the ECN-marking of pure ACK packets, because of the
   inability of TCP to mitigate ACK-path congestion and protocol-wise
   preferential treatment by routers.  However, dropping pure ACKs -
   rather than ECN marking them - has disadvantages for typical
   datacenter traffic patterns. Dropping of ACKs causes subsequent re-
   transmissions.  It is RECOMMENDED that an implementation provide a
   configuration knob that forces ECT to be set on pure ACKs.

[Section 5]
Deployment Issues
-  DCTCP and conventional TCP congestion control do not coexist well in
   the same network.  In DCTCP, the marking threshold is set to a very
   low value to reduce queueing delay, and a relatively small amount of
   congestion will exceed the marking threshold.  During such periods of
   congestion, conventional TCP will suffer packet loss and quickly and
   drastically reduce cwnd.  DCTCP, on the other hand, will use the
   fraction of marked packets to reduce cwnd more gradually.  Thus, the
   rate reduction in DCTCP will be much slower than that of conventional
   TCP, and DCTCP traffic will gain a larger share of the capacity
   compared to conventional TCP traffic traversing the same path. It is
   RECOMMENDED that DCTCP traffic be segregated from conventional TCP
   traffic.  [MORGANSTANLEY] describes a deployment that uses the IP DSCP
   bits to segregate the network such that AQM is applied to DCTCP traffic,
   whereas TCP traffic is managed via drop-tail queueing.

-  Since DCTCP relies on congestion marking by the switches, DCTCP can
   only be deployed in datacenters where the entire network
   infrastructure supports ECN.  The switches may also support
   configuration of the congestion threshold used for marking.  The
   proposed parameterization can be configured with switches that
   implement RED.  [DCTCP10] provides a theoretical basis for selecting
   the congestion threshold, but as with the estimation gain, it may be
   more practical to rely on experimentation or simply to use the
   default configuration of the device.  DCTCP will degrade to loss-
   based congestion control when transiting a congested drop-tail link.

-  DCTCP requires changes on both the sender and the receiver, so both
   endpoints must support DCTCP.  Furthermore, DCTCP provides no
   mechanism for negotiating its use, so both endpoints must be
   configured through some out-of-band mechanism to use DCTCP.  A
   variant of DCTCP that can be deployed unilaterally and only requires
   standard ECN behavior has been described in [ODCTCP][BSDCAN], but
   requires additional experimental evaluation.

[Section 6]
Known Issues

-  DCTCP relies on the sender's ability to reconstruct the stream of CE
   codepoints received by the remote endpoint.  To accomplish this,
   DCTCP avoids using a single ACK packet to acknowledge segments
   received both with and without the CE codepoint set.  However, if one
   or more ACK packets are dropped, it is possible that a subsequent ACK
   will cumulatively acknowledge a mix of CE and non-CE segments.  This
   will, of course, result in a less accurate congestion estimate.

   o  Even with an inaccurate congestion estimate, DCTCP may still
      perform better than [RFC3168].
   o  If the estimation gain is small relative to the packet loss rate,
      the estimate may not be too inaccurate.
   o  If packet loss mostly occurs under heavy congestion, most drops
      will occur during an unbroken string of CE packets, and the
      estimate will be unaffected

- The effect of packet drops on DCTCP under real world conditions has not
  been analyzed.

-  Much like standard TCP, DCTCP is biased against flows with longer
   RTTs.  A method for improving the fairness of DCTCP has been proposed
   in [ADCTCP], but requires additional experimental evaluation.


Papers:
Data Center TCP [DCTCP10]
- http://research.microsoft.com/en-us/um/people/padhye/publications/dctcp-sigcomm2010.pdf

The original DCTCP SIGCOMM paper by Stanford and Microsoft Research. It is
very accessible even for those of us not well versed in CC protocols.

 - reduce minRTO to 10ms.
 - suggest that K > (RTT * C)/7, where C is the sending rate in packets per
   second.
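
To get a feel for the K > (RTT * C)/7 guideline, the following Python sketch
evaluates the lower bound for a given link. The function name, the 1500-byte
packet size and the example RTT of 100 microseconds are assumptions made for
the illustration; they do not come from the paper.

```python
def min_marking_threshold(rtt_s, link_bps, packet_bytes=1500):
    """Lower bound on the marking threshold K, in packets.

    Implements K > (RTT * C) / 7 with C expressed in packets per second.
    The packet size is an assumption used to convert bits/s to packets/s.
    """
    c_pps = link_bps / (8.0 * packet_bytes)
    return rtt_s * c_pps / 7.0


if __name__ == "__main__":
    for gbps in (1, 10, 40):
        k = min_marking_threshold(rtt_s=100e-6, link_bps=gbps * 1e9)
        print("%d Gbps, 100us RTT: K > %.2f packets" % (gbps, k))
```

The bound scales linearly with both RTT and link speed, so the result is
only as good as the RTT estimate fed into it.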


Attaining the Promise and Avoiding the Pitfalls of TCP
in the Datacenter [MORGANSTANLEY]
- https://www.usenix.org/system/files/conference/nsdi15/nsdi15-paper-judd.pdf

Real world experience deploying DCTCP on Linux at Morgan Stanley.

  - reduce minRTO to 5ms.
  - reduce delayed ACK to 1ms.
  - Only ToR switches support ECN marking, higher level switches purely
    tail-drop. Tests show that DCTCP successfully resorts to loss-based
    congestion control when transiting a congested drop-tail link.
  - Find that setting ECT on SYN and SYN-ACK is critical for the practical
    deployment of DCTCP. Under load, DCTCP would fail to establish network
    connections in the absence of ECT in SYN and SYN-ACK packets. (DCTCP+)
  - Without correct receive buffer tuning DCTCP will converge _faster_ than
    TCP, rather than the theoretical 1.4 x TCP.

Per-packet latency in ms
           TCP      DCTCP+
Mean       4.01     0.0422
Median     4.06     0.0395
Maximum    4.20     0.0850
Minimum    3.32     0.0280
sigma      0.167    0.0106


Extensions to FreeBSD Datacenter TCP for Incremental
Deployment Support [BSDCAN]
- https://www.bsdcan.org/2015/schedule/attachments/315_dctcp-bsdcan2015-paper.pdf

Proposes a variant of DCTCP that can be deployed only on one endpoint of a
connection, provided the peer is ECN-capable.
ODCTCP changes:
  - In order to facilitate one-sided deployment, a DCTCP sender should set
    the CWR mark after receiving an ECE-marked ACK once per RTT. It is safe
    in two-sided deployments, because a regular DCTCP receiver will simply
    ignore the CWR mark.
  - A one-sided DCTCP receiver should always delay an ACK for
    incoming packets marked with CWR, which is the only indication
    of recovery exit.
DCTCP improvements:
  - ECE processing: Under standard ECN an ACK with an ECE mark will
    trigger congestion recovery. When this happens a sender stops
    increasing cwnd for one RTT. For DCTCP there is no reason for
    this response. ECEs are used, not for detecting congestion
    events, but to quantify the extent of congestion and react
    proportionally. Thus, there is no need to stop cwnd from
    increasing.
  - Set initial value of alpha to 0 (i.e. don't halve cwnd on first
    ECE seen).
  - Idle Periods: The same tradeoffs regarding "slow-start restart"
    apply to alpha. The FreeBSD implementation re-initializes alpha
    after an idle period longer than the RTO.
  - Timeouts and Packet Loss: The DCTCP specification defines the
    update interval for alpha as one RTT. To track this DCTCP compares
    received ACKs against the sequence numbers of outgoing packets.
    This is not robust in the face of packet loss. The FreeBSD
    implementation addresses this by updating alpha when it detects
    duplicate ACKs or timeouts.


Data Center TCP (DCTCP)
- http://www.ietf.org/proceedings/80/slides/iccrg-3.pdf

Case studies, workloads, latency and flow completion time of TCP vs DCTCP.
Interesting set of slides worth skimming.
   - Small (10-100KB & 100KB - 1MB) background flows complete in ~45% less
     time than TCP.
   - 99th %ile & 99.9th %ile query flows are 2/3rds and 4/7ths respectively
   - large (1-10MB & > 10MB) flows unchanged
   - query completion time with 10 to 1 background incast unchanged with
     DCTCP, ~5x slower with TCP


Analysis of DCTCP: Stability, Convergence, and Fairness [ADCTCP]
- http://sedcl.stanford.edu/files/dctcp-analysis.pdf
Follow up mathematical analysis of DCTCP using a fluid model. Contains
interesting graphs showing how the gain factor affects the convergence rate
between two flows.
   - Analyzes the convergence of DCTCP sources to their fair share,
     obtaining an explicit characterization of the convergence rate.
   - Proposes a simple change to DCTCP suggested by the fluid model which
     significantly improves DCTCP's RTT-fairness. It suggests updating the
     congestion window continuously rather than once per RTT.
   - Finds that with a marking threshold, K, of about 17% of the
     bandwidth-delay product, DCTCP achieves 100% throughput, and that even
     for values of K as small as 1% of the bandwidth-delay product, its
     throughput is at least 94%.
   - Shows that DCTCP's convergence rate is no more than a factor 1.4 slower
     than TCP


Using Data Center TCP (DCTCP) in the Internet [ADCTCP]
- http://www.ikr.uni-stuttgart.de/Content/Publications/Archive/Wa_GLOBECOM_14_40260.pdf
Investigates what would be needed to deploy DCTCP incrementally outside the
data center.
   - Proposes finer resolution for alpha value
   - Allow the congestion window to grow in the CWR state (similar to
     [BSDCAN])
   - Continuous update of alpha: Define a smaller gain factor (1/2^8 instead
     of 1/2^4) to permit an EWMA updated every packet. However, g should
     actually be a function of number of packets in flight.
   - Progressive congestion window reduction: Similar to [ADCTCP], reduce
     the congestion window on the reception of each ECE.

   - develops a formula for AQM RED parameters that always results in equal
     sharing between DCTCP and non-DCTCP.


Incast Transmission Control Protocol (ICTCP):
In ICTCP the receiver plays a direct role in estimating the per-flow
available bandwidth and actively re-sizes each connection's receive window
accordingly.

- http://research.microsoft.com/pubs/141115/ictcp.pdf


Quantized Congestion Notification (QCN):
Congestion control in ethernet. Introduced as part of the IEEE 802.1
Standards Body discussions for Data Center Bridging [DCB] motivated by the
needs of FCoE. The initial congestion control protocol was standardized as
802.1Qau. Unlike the single bit of congestion information per packet in TCP,
QCN uses 6 bits.

The algorithm is composed of two main parts: Switch or Control Point (CP)
Dynamics and Rate Limiter or Reaction Point (RP) Dynamics.

  - The CP Algorithm runs at the network nodes. Its objective is to maintain
    the node's buffer occupancy at the operating point 'Beq'. It computes a
    congestion measure Fb and randomly samples an incoming packet with a
    probability proportional to the severity of the congestion. The node
    sends a 6-bit quantized value of Fb back to the source of the sampled
    packet.

    - B: Value of the current queue length
    - Bold: Value of the buffer occupancy when the last feedback message was
      generated.
    - w: a non-negative constant, equal to 2 for the baseline implementation
    - Boff = B - Beq
    - Bd = B - Bold
    - Fb = Boff + w*Bd
      - essentially equivalent to the PI AQM. The first term is the offset
        from the target operating point and the second term is proportional
        to the rate at which the queue size is changing.

     When Fb < 0, there is no congestion, and no feedback messages are sent.
     When Fb >= 0, then either the buffers or the link is oversubscribed, and
     control action needs to be taken.

   - The RP algorithm runs on end systems (NICs) and controls the rate at
     which ethernet packets are transmitted. Unlike TCP, the RP algorithm
     does not get positive ACKs from the network and thus needs alternative
     mechanisms for increasing its sending rate.

     - Current Rate (Rc): The transmission rate of the source
     - Target Rate (Rt): The transmission rate of the source just before the
       arrival of the last feedback message
     - Gain (Gd): a constant chosen so that Gd*|Fbmax| = 1/2 - that is to say
       the rate can decrease by at most 50%. Only 6 bits are available for
       feedback so Fbmax = 64, and thus Gd = 1/128.
     - Byte counter: A counter at the RP for counting transmitted bytes; used
       to time rate increases
     - Timer: A clock at the RP used for timing rate increases.

     Rate Decreases:
     A rate decrease is only done when a feedback message is received:
       - Rt <- Rc
       - Rc <- Rc*(1 - Gd*|Fb|)

     Rate Increases:
     Rate Increase is done in two phases: Fast Recovery and Active Increase.

       Fast Recovery (FR): The source enters the FR state immediately after a
       rate decrease event - at which point the Byte Counter is reset. FR
       consists of 5 cycles, in each of which 150KB of data (assuming
       full-sized regular frames) are transmitted (100 packets of 1500 bytes
       each), as counted by the Byte Counter. At the end of each cycle, Rt
       remains unchanged, and Rc is updated as follows:

         Rc <- (Rc + Rt)/2

       The rationale being that, when congested, Rate Decrease messages are
       sent by the CP once every 100 packets. Thus the absence of a Rate
       Decrease message during this interval indicates that the CP is no
       longer congested.

       Active Increase (AI): After 5 cycles of FR, the source enters the AI
       state, in which it probes for extra bandwidth. AI consists of multiple
       cycles of 50 packets each. Rt and Rc are updated as follows:
         - Rt <- Rt + Rai
         - Rc <- (Rc + Rt)/2
         - Rai: a constant set to 5Mbps by default.
       When Rc is extremely small after a rate decrease, the time required to
       send out 150 KB can be excessive. To speed up the rate increase, the
       source also uses a timer as follows:
         1) reset timer when rate decrease message arrives
         2) source enters FR and counts out 5 cycles of T ms duration
            (T = 10ms in baseline implementation), and in the AI state,
            each cycle is T/2 ms long
         3) in the AI state, Rc is updated when _either_ the Byte Counter
            or the Timer completes a cycle.
         4) The source is in the AI state iff either the Byte Counter
            or the Timer is in the AI state.
         5) if _both_ the Byte Counter and the Timer are in AI the source is
            said to be in Hyper-Active Increase (HAI). In this case, at the
            completion of the ith Byte Counter and Timer cycle, Rt and Rc
            are updated:
             - Rt <- Rt + i*Rhai
             - Rc <- (Rc + Rt) / 2
             - Rhai: 50Mbps in the baseline

[Taken from "Internet Congestion Control" by Subir Varma, ch. 8]
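
The CP and RP rules summarized above can be put together in a small Python
sketch. It is not taken from the book or from the 802.1Qau pseudocode: the
class and function names are invented for this example, only the Byte
Counter path is modelled (no Timer, no Hyper-Active Increase), and the rate
decrease is assumed to save the old Rc into Rt before scaling Rc. The
constants w=2, Gd=1/128, Fbmax=64 and Rai=5Mbps are the baseline values
quoted in the notes.

```python
W = 2.0            # weight of the queue-growth term (baseline value)
FB_MAX = 64        # largest feedback magnitude with 6 bits, per the notes
GD = 1.0 / 128     # chosen so that GD * FB_MAX = 1/2


def congestion_measure(b, b_old, b_eq, w=W):
    """CP side: Fb = Boff + w*Bd, with the sign convention of the notes."""
    b_off = b - b_eq     # offset from the target operating point
    b_d = b - b_old      # queue growth since the last feedback message
    return b_off + w * b_d


class ReactionPoint:
    """RP side, Byte Counter only (hypothetical, simplified model)."""

    FR_CYCLES = 5

    def __init__(self, rate_bps, r_ai=5e6):
        self.rc = rate_bps   # current rate
        self.rt = rate_bps   # target rate
        self.r_ai = r_ai     # Active Increase step
        self.cycles = 0      # cycles completed since the last decrease

    def on_feedback(self, fb):
        """Rate decrease: Rt <- Rc, then Rc <- Rc * (1 - Gd * |Fb|)."""
        magnitude = min(abs(fb), FB_MAX)
        self.rt = self.rc
        self.rc = self.rc * (1.0 - GD * magnitude)
        self.cycles = 0      # the Byte Counter is reset, FR starts

    def on_cycle_end(self):
        """A Byte Counter cycle finished without a feedback message."""
        if self.cycles >= self.FR_CYCLES:
            self.rt += self.r_ai              # Active Increase
        self.rc = (self.rc + self.rt) / 2.0   # same in FR and AI
        self.cycles += 1


if __name__ == "__main__":
    rp = ReactionPoint(rate_bps=1e9)
    rp.on_feedback(congestion_measure(b=30, b_old=20, b_eq=26))
    for n in range(8):
        rp.on_cycle_end()
        print("cycle %d: Rc = %.0f bps" % (n + 1, rp.rc))
```

During Fast Recovery Rc closes half of the remaining gap to Rt on every
cycle, so after five cycles it is back within about 3% of the rate it had
before the decrease.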


Performance of Quantized Congestion Notification in TCP Incast Scenarios of
Data Centers
- http://eprints.networks.imdea.org/131/1/Performance_of_Quantized_Congestion_Notification_-_2010_EN.pdf

Using the QCN pseudocode version released by Rong Pan [IEEE EDCS-608482],
the authors simulated the performance of QCN at 1Gbps under a number of
incast scenarios, reaching the conclusion that the default QCN behaviors
will not scale to a large number of flows with full link utilization. The
paper goes on to propose a small number of changes to the QCN algorithm that
_will_ support a large number of flows at full link utilization. However,
there is no indication in the literature that these ideas have been taken
any further in practice. A survey paper written in 2014 [A Survey on TCP
Incast in Data Center Networks] indicates that these problems still exist.
It is unclear what the current state of the art is in shipping hardware.


http://www.ieee802.org/3/ar/public/0505/bergamasco_1_0505.pdf
http://www.ieee802.org/1/files/public/docs2007/au-bergamasco-ecm-v0.1.pdf
http://www.cs.wustl.edu/~jain/papers/ftp/bcn.pdf
http://www.cse.wustl.edu/~jain/papers/ftp/icc08.pdf


Recommendations:=20

RFC 6298:=20

    - change starting RTO from 3s to 1s (in /dctcp)                     D4294

    - DO NOT round RTO up to 1s counter to the suggestions here (long done)

    - simplify setting of minRTO sysctl to eliminate "slop" component   D4294
       (in /dctcp)
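
For reference, the RFC 6298 retransmission timer computation that these
items adjust can be sketched as below. This is a plain Python model, not
FreeBSD code: the class name, the 1ms clock granularity and the
configurable min_rto floor are assumptions of the example. In line with the
items above, it starts at 1s and does not round a computed RTO up to 1s.

```python
class RtoEstimator:
    """RFC 6298 style SRTT/RTTVAR estimator (illustrative sketch)."""

    ALPHA = 1.0 / 8   # SRTT gain from RFC 6298
    BETA = 1.0 / 4    # RTTVAR gain from RFC 6298
    K = 4             # RTTVAR multiplier from RFC 6298

    def __init__(self, initial_rto=1.0, min_rto=0.0, granularity=0.001):
        self.srtt = None
        self.rttvar = None
        self.rto = initial_rto          # used until the first RTT sample
        self.min_rto = min_rto          # assumed tunable floor, in seconds
        self.granularity = granularity  # assumed clock granularity G

    def sample(self, r):
        """Feed one RTT measurement (seconds) and return the new RTO."""
        if self.srtt is None:
            # First measurement: SRTT = R, RTTVAR = R/2.
            self.srtt = r
            self.rttvar = r / 2.0
        else:
            # RTTVAR is updated first, using the old SRTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - r))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        rto = self.srtt + max(self.granularity, self.K * self.rttvar)
        self.rto = max(self.min_rto, rto)
        return self.rto


if __name__ == "__main__":
    est = RtoEstimator(min_rto=0.003)
    for rtt in (0.0001, 0.0001, 0.0002, 0.0001):
        print("rtt=%.4fs rto=%.4fs" % (rtt, est.sample(rtt)))
```

On a low-latency path the result is dominated by the granularity term and
the floor, which is why the minRTO and timer-granularity items matter more
than the estimator itself.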

RFC 6928:

    - increase initial / idle window to 10 segments when connecting to
      data center peers (done by hiren)


RFC 7323:

    - stop truncating SRTT prematurely on low-latency connections;      D4293
      see appendix G, to reduce potentially detrimental
      fluctuations in calculated RTO


Incast:

 - do SW TSO only

 - add rudimentary pacing by interleaving streams

 - fine grained timers                                                  D4292

 - scale RTO down to same granularity as RTT (patch in progress)

ECN:
 - change default to allow ECN on incoming connections

 - set ECT on _ALL_ packets sent by a host using a DCTCP connection

 - add facility to enable ECN by subnet


DCTCP:

 - add facility to enable DCTCP by subnet

 - set ECT on _ALL_ packets sent by a host using a DCTCP connection

 - update TCP to use microsecond granularity timers for timestamps
   (patch in progress)

 - when using DCTCP with the current coarse-grained timers, reduce      D4294
   minRTO to 3ms; if fine-grained timers are available, disable
   minRTO when using DCTCP

 - reduce delack to 1/5th of min(minRTO, RTO)                           D4294
   (reduced to 1/2 in /dctcp)


ICTCP:

  - if there is time investigate its use and the ability to use
    the socket buffer sizing to communicate the amount of anticipated
    data for purposes of TCBs sharing the port's connection optimally


From owner-freebsd-net@freebsd.org  Fri Nov 27 07:52:59 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7C692A3A371
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Fri, 27 Nov 2015 07:52:59 +0000 (UTC)
 (envelope-from hiren@strugglingcoder.info)
Received: from mail.strugglingcoder.info (strugglingcoder.info [65.19.130.35])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
 bits)) (Client CN "mail.strugglingcoder.info",
 Issuer "mail.strugglingcoder.info" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 6C7E21862
 for <freebsd-net@freebsd.org>; Fri, 27 Nov 2015 07:52:59 +0000 (UTC)
 (envelope-from hiren@strugglingcoder.info)
Received: from localhost (unknown [10.1.1.3])
 (Authenticated sender: hiren@strugglingcoder.info)
 by mail.strugglingcoder.info (Postfix) with ESMTPA id 8EA6BC4BE6;
 Thu, 26 Nov 2015 23:52:58 -0800 (PST)
Date: Thu, 26 Nov 2015 23:52:58 -0800
From: hiren panchasara <hiren@strugglingcoder.info>
To: Matthew Macy <mmacy@nextbsd.org>
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject: Re: TCP notes and incast recommendations
Message-ID: <20151127075258.GD68002@strugglingcoder.info>
References: <15146a8f285.b094791a15089.3823664487014698900@nextbsd.org>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha512;
 protocol="application/pgp-signature"; boundary="C1iGAkRnbeBonpVg"
Content-Disposition: inline
In-Reply-To: <15146a8f285.b094791a15089.3823664487014698900@nextbsd.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 27 Nov 2015 07:52:59 -0000


--C1iGAkRnbeBonpVg
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On 11/26/15 at 05:57P, Matthew Macy wrote:
> In an effort to be somewhat current on the state TCP I've collected a small bibliography.

This is beyond awesome!
Thank you for this work.

Cheers,
Hiren

--C1iGAkRnbeBonpVg
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQF8BAABCgBmBQJWWAvXXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRBNEUyMEZBMUQ4Nzg4RjNGMTdFNjZGMDI4
QjkyNTBFMTU2M0VERkU1AAoJEIuSUOFWPt/lcHgH/RFscV6eCMNap2wqsFAl0Bcw
7mmQqA8L2WRi1qMoz8Lrxw/RnOGKfn5cXXO5i/ntbV7HEIqvkQXkzsixfHN4nRFV
/lnrLEJC/DHwpgno7diU4zPNcxOoENpX/pMwakcXzhQpaWkf8f7NgcECPQRDgDhF
8kCTAzQfH8WNKGBiEXDCM7xdrtByyBQItB9JAw+2oJ1zMxkg+Y5F6tIFnOfDR4F1
WLL6mp/mtvUhp8S8UhWM3ytFUsjSH1X2iRbOBD7Bda5F+jzl5WhqrrtlLIsUqgQg
SSKXjOrS+s0q7a5pc0Y/Dzsx7BM6lKLOFmxzu9+xOgaets3AVKHMBd0+Jczv+M4=
=/1Uh
-----END PGP SIGNATURE-----

--C1iGAkRnbeBonpVg--

From owner-freebsd-net@freebsd.org  Fri Nov 27 09:18:09 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id A548EA36DFA
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Fri, 27 Nov 2015 09:18:09 +0000 (UTC)
 (envelope-from ddb@neosystem.org)
Received: from mail.neosystem.cz (mail.neosystem.cz
 [IPv6:2001:41d0:2:5ab8::10:15])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 6D7151983;
 Fri, 27 Nov 2015 09:18:09 +0000 (UTC)
 (envelope-from ddb@neosystem.org)
Received: from mail.neosystem.cz (unknown [127.0.10.15])
 by mail.neosystem.cz (Postfix) with ESMTP id 6AB32BD7D;
 Fri, 27 Nov 2015 10:18:06 +0100 (CET)
X-Virus-Scanned: amavisd-new at mail.neosystem.cz
Received: from iron.sn.neosystem.cz (unknown [IPv6:2001:41d0:2:5ab8::100:107])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
 bits)) (No client certificate requested)
 by mail.neosystem.cz (Postfix) with ESMTPSA id 9CE92BD77;
 Fri, 27 Nov 2015 10:18:05 +0100 (CET)
Date: Fri, 27 Nov 2015 10:13:49 +0100
From: Daniel Bilik <ddb@neosystem.org>
To: Gary Palmer <gpalmer@freebsd.org>
Cc: freebsd-net@freebsd.org
Subject: Re: Outgoing packets being sent via wrong interface
Message-Id: <20151127101349.752c94090e78ca68cf0f81fc@neosystem.org>
In-Reply-To: <20151125122033.GB41119@in-addr.com>
References: <20151120155511.5fb0f3b07228a0c829fa223f@neosystem.org>
 <C1D7F956-81C9-4ED4-99B8-E0C73A3ECB37@FreeBSD.org>
 <20151120163431.3449a473db9de23576d3a4b4@neosystem.org>
 <20151121212043.GC2307@vega.codepro.be>
 <20151122130240.165a50286cbaa9288ffc063b@neosystem.cz>
 <20151125092145.e93151af70085c2b3393f149@neosystem.cz>
 <20151125122033.GB41119@in-addr.com>
X-Mailer: Sylpheed 3.4.3 (GTK+ 2.24.28; amd64-portbld-freebsd10.2)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 27 Nov 2015 09:18:09 -0000

On Wed, 25 Nov 2015 12:20:33 +0000
Gary Palmer <gpalmer@freebsd.org> wrote:

> route -n get <unreachable IP>

As suggested by Kevin and Ryan, I set the router to drop redirects...

net.inet.icmp.drop_redirect: 1

... but it happened again today, and again the affected host was 192.168.2.33.
Routing and arp entries were correct. Output of "route -n get"...

   route to: 192.168.2.33
destination: 192.168.2.0
       mask: 255.255.255.0
        fib: 0
  interface: re1
      flags: <UP,DONE,PINNED>
 recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
       0         0         0         0      1500         1         0 

... has not changed during the problem.

Interesting was ping result...

PING 192.168.2.33 (192.168.2.33): 56 data bytes
ping: sendto: Operation not permitted
ping: sendto: Operation not permitted
...
64 bytes from 192.168.2.33: icmp_seq=11 ttl=128 time=0.593 ms
ping: sendto: Operation not permitted
...
64 bytes from 192.168.2.33: icmp_seq=20 ttl=128 time=0.275 ms
64 bytes from 192.168.2.33: icmp_seq=21 ttl=128 time=0.251 ms
ping: sendto: Operation not permitted
...
64 bytes from 192.168.2.33: icmp_seq=40 ttl=128 time=0.245 ms
ping: sendto: Operation not permitted
64 bytes from 192.168.2.33: icmp_seq=42 ttl=128 time=7.111 ms
ping: sendto: Operation not permitted
...
--- 192.168.2.33 ping statistics ---
46 packets transmitted, 5 packets received, 89.1% packet loss

It seems _some_ packets go out the right interface (re1), but most
try to go out the wrong one (re0) and are dropped by pf...

00:00:01.066886 rule 53..16777216/0(match): block out on re0: 82.x.y.50 > 192.168.2.33: ICMP echo request, id 58628, seq 39, length 64
00:00:02.017874 rule 53..16777216/0(match): block out on re0: 82.x.y.50 > 192.168.2.33: ICMP echo request, id 58628, seq 41, length 64
00:00:02.069634 rule 53..16777216/0(match): block out on re0: 82.x.y.50 > 192.168.2.33: ICMP echo request, id 58628, seq 43, length 64

And again, refreshing default route (delete default / add default)
resolved it...

PING 192.168.2.33 (192.168.2.33): 56 data bytes
64 bytes from 192.168.2.33: icmp_seq=0 ttl=128 time=0.496 ms
64 bytes from 192.168.2.33: icmp_seq=1 ttl=128 time=0.226 ms
64 bytes from 192.168.2.33: icmp_seq=2 ttl=128 time=0.242 ms
64 bytes from 192.168.2.33: icmp_seq=3 ttl=128 time=0.226 ms

--
						Dan

From owner-freebsd-net@freebsd.org  Fri Nov 27 20:28:25 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 200D3A3A1FB
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Fri, 27 Nov 2015 20:28:25 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 0C729156D
 for <freebsd-net@FreeBSD.org>; Fri, 27 Nov 2015 20:28:25 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id tARKSOoW081302
 for <freebsd-net@FreeBSD.org>; Fri, 27 Nov 2015 20:28:24 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-net@FreeBSD.org
Subject: [Bug 204853] Panic after close openconnect VPN Cisco
Date: Fri, 27 Nov 2015 20:28:24 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.2-RELEASE
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: linimon@FreeBSD.org
X-Bugzilla-Status: New
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: freebsd-net@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: assigned_to
Message-ID: <bug-204853-2472-KSR4Kv2DMP@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-204853-2472@https.bugs.freebsd.org/bugzilla/>
References: <bug-204853-2472@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 27 Nov 2015 20:28:25 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204853

Mark Linimon <linimon@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-bugs@FreeBSD.org    |freebsd-net@FreeBSD.org

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org  Sat Nov 28 10:06:59 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 57051A3AFEE
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Sat, 28 Nov 2015 10:06:59 +0000 (UTC)
 (envelope-from julian@freebsd.org)
Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 26D591DFA;
 Sat, 28 Nov 2015 10:06:58 +0000 (UTC)
 (envelope-from julian@freebsd.org)
Received: from Julian-MBP3.local (ppp121-45-225-88.lns20.per1.internode.on.net
 [121.45.225.88]) (authenticated bits=0)
 by vps1.elischer.org (8.15.2/8.15.2) with ESMTPSA id tASA6pCe084546
 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO);
 Sat, 28 Nov 2015 02:06:55 -0800 (PST)
 (envelope-from julian@freebsd.org)
Subject: Re: Outgoing packets being sent via wrong interface
To: Daniel Bilik <ddb@neosystem.org>, Gary Palmer <gpalmer@freebsd.org>
References: <20151120155511.5fb0f3b07228a0c829fa223f@neosystem.org>
 <C1D7F956-81C9-4ED4-99B8-E0C73A3ECB37@FreeBSD.org>
 <20151120163431.3449a473db9de23576d3a4b4@neosystem.org>
 <20151121212043.GC2307@vega.codepro.be>
 <20151122130240.165a50286cbaa9288ffc063b@neosystem.cz>
 <20151125092145.e93151af70085c2b3393f149@neosystem.cz>
 <20151125122033.GB41119@in-addr.com>
 <20151127101349.752c94090e78ca68cf0f81fc@neosystem.org>
Cc: freebsd-net@freebsd.org
From: Julian Elischer <julian@freebsd.org>
Message-ID: <56597CB5.7030307@freebsd.org>
Date: Sat, 28 Nov 2015 18:06:45 +0800
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:38.0)
 Gecko/20100101 Thunderbird/38.4.0
MIME-Version: 1.0
In-Reply-To: <20151127101349.752c94090e78ca68cf0f81fc@neosystem.org>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 28 Nov 2015 10:06:59 -0000

On 27/11/2015 5:13 PM, Daniel Bilik wrote:
> On Wed, 25 Nov 2015 12:20:33 +0000
> Gary Palmer <gpalmer@freebsd.org> wrote:
>
>> route -n get <unreachable IP>
> As suggested by Kevin and Ryan, I set the router to drop redirects...
>
> net.inet.icmp.drop_redirect: 1
>
> ... but it happened again today, and again the affected host was 192.168.2.33.
> Routing and arp entries were correct. Output of "route -n get"...
>
>     route to: 192.168.2.33
> destination: 192.168.2.0
>         mask: 255.255.255.0
>          fib: 0
>    interface: re1
>        flags: <UP,DONE,PINNED>
>   recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
>         0         0         0         0      1500         1         0
>
> ... has not changed during the problem.
>
> Interesting was ping result...
>
> PING 192.168.2.33 (192.168.2.33): 56 data bytes
> ping: sendto: Operation not permitted
> ping: sendto: Operation not permitted
> ...
> 64 bytes from 192.168.2.33: icmp_seq=11 ttl=128 time=0.593 ms
> ping: sendto: Operation not permitted
> ...
> 64 bytes from 192.168.2.33: icmp_seq=20 ttl=128 time=0.275 ms
> 64 bytes from 192.168.2.33: icmp_seq=21 ttl=128 time=0.251 ms
> ping: sendto: Operation not permitted
> ...
> 64 bytes from 192.168.2.33: icmp_seq=40 ttl=128 time=0.245 ms
> ping: sendto: Operation not permitted
> 64 bytes from 192.168.2.33: icmp_seq=42 ttl=128 time=7.111 ms
> ping: sendto: Operation not permitted
> ...
> --- 192.168.2.33 ping statistics ---
> 46 packets transmitted, 5 packets received, 89.1% packet loss
>
> It seems _some_ packets go out the right interface (re1), but most
> try to go out the wrong one (re0) and are dropped by pf...
>
> 00:00:01.066886 rule 53..16777216/0(match): block out on re0: 82.x.y.50 > 192.168.2.33: ICMP echo request, id 58628, seq 39, length 64
> 00:00:02.017874 rule 53..16777216/0(match): block out on re0: 82.x.y.50 > 192.168.2.33: ICMP echo request, id 58628, seq 41, length 64
> 00:00:02.069634 rule 53..16777216/0(match): block out on re0: 82.x.y.50 > 192.168.2.33: ICMP echo request, id 58628, seq 43, length 64
>
> And again, refreshing default route (delete default / add default)
> resolved it...
>
> PING 192.168.2.33 (192.168.2.33): 56 data bytes
> 64 bytes from 192.168.2.33: icmp_seq=0 ttl=128 time=0.496 ms
> 64 bytes from 192.168.2.33: icmp_seq=1 ttl=128 time=0.226 ms
> 64 bytes from 192.168.2.33: icmp_seq=2 ttl=128 time=0.242 ms
> 64 bytes from 192.168.2.33: icmp_seq=3 ttl=128 time=0.226 ms

Next time it happens, try flushing the ARP table.
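
A minimal sketch of that on the router, assuming FreeBSD's arp(8), where
"arp -an" lists the table and "arp -d -a" deletes every entry. The run
wrapper and the DRY_RUN variable are only illustrative and are not part
of any tool: by default the commands are printed rather than executed,
since the real thing needs root.

```shell
#!/bin/sh
# Illustrative helper, not from the original thread.
# Prints each command by default; set DRY_RUN=0 (as root) to execute.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

flush_arp() {
    run arp -an      # record the table first, for later comparison
    run arp -d -a    # delete every ARP entry
    run arp -an      # entries come back as hosts are resolved again
}

flush_arp
```

Comparing the first and last listing while the problem is present may
show whether a stale entry for the affected host was involved.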

>
> --
> 						Dan
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>


From owner-freebsd-net@freebsd.org  Sat Nov 28 11:16:33 2015
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id B1252A3ACA4;
 Sat, 28 Nov 2015 11:16:33 +0000 (UTC)
 (envelope-from julian@freebsd.org)
Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 7D0E01393;
 Sat, 28 Nov 2015 11:16:30 +0000 (UTC)
 (envelope-from julian@freebsd.org)
Received: from Julian-MBP3.local (ppp121-45-225-88.lns20.per1.internode.on.net
 [121.45.225.88]) (authenticated bits=0)
 by vps1.elischer.org (8.15.2/8.15.2) with ESMTPSA id tASBGK0f085176
 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO);
 Sat, 28 Nov 2015 03:16:23 -0800 (PST)
 (envelope-from julian@freebsd.org)
Subject: Re: Kernel NAT issues
To: Nathan Aherne <nathan@reddog.com.au>
References: <94B91F98-DE01-4A10-8AB5-4193FE11AF3F@reddog.com.au>
 <20151013142301.B67283@sola.nimnet.asn.au>
 <C1C25100-FBD4-42F4-94F7-965B270D927F@reddog.com.au>
 <20151014232026.S15983@sola.nimnet.asn.au>
 <9908EC22-344F-4D0B-8930-7D2C70B084A1@reddog.com.au>
 <32DEEFB3-E41F-40CD-8E1A-520FB261C572@reddog.com.au>
 <564C8879.8070307@freebsd.org> <20151119032200.T27669@sola.nimnet.asn.au>
 <9D81BDD4-200C-40AB-AB24-B1112881E43A@reddog.com.au>
 <3BF360A8-35E6-4043-8AFF-87D983F29C66@reddog.com.au>
 <5652B9EB.10805@freebsd.org>
 <CA479F59-7408-4146-8F5A-85213DB64720@reddog.com.au>
Cc: freebsd-ipfw@freebsd.org, Ian Smith <smithi@nimnet.asn.au>,
 "freebsd-net@freebsd.org" <freebsd-net@FreeBSD.ORG>
From: Julian Elischer <julian@freebsd.org>
Message-ID: <56598CFF.3060102@freebsd.org>
Date: Sat, 28 Nov 2015 19:16:15 +0800
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:38.0)
 Gecko/20100101 Thunderbird/38.4.0
MIME-Version: 1.0
In-Reply-To: <CA479F59-7408-4146-8F5A-85213DB64720@reddog.com.au>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 28 Nov 2015 11:16:33 -0000

On 27/11/2015 12:55 PM, Nathan Aherne wrote:
> Hi Julian,
>
> Thank you for replying. I was completely off grid for a while and only got back on it today.
>
> I thought that Vimage was probably the way to achieve what I want. The main reasons I was staying away from Vimage were the reported bugs with it and the extra overhead. I would like to be able to shut down jails quite regularly, so I was worried the kernel panic bug or memory leak bug might be a problem here. Is there any version of Vimage/FreeBSD which is stable?
Generally vimage is stable. It has had problems with pf over the years 
because pf is imported from OpenBSD and has some pretty 
vimage-unfriendly assumptions in its design, but I hear that even some 
of those have been ironed out.
I know of vimage being used to run production virtual systems in some 
of the largest banks in the world, processing amounts of transactions 
that would make your head spin, so have a small play with it.
Vimage overhead is negative in some situations, i.e. things work faster.
This is especially true when non-vimage workloads contend heavily for a 
single lock, but vimage splits it over many locks, one for each VM.

Run up a VirtualBox or Amazon or whatever FreeBSD instance and play 
around with it.
Once you realize how insanely powerful it is, you will wonder how you 
ever did jails without it.

You can use bridges, epairs or netgraph to do your networking... your 
choice.
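
As a rough sketch of the bridge + epair variant (one jail hanging off a
bridge, not the full NAT router layout), something like the following
could be a starting point. It assumes a kernel built with "options
VIMAGE" and that the cloned interfaces come back as bridge0 and
epair0a/epair0b; the jail name app1, the address 10.0.0.25/16 and the
gateway 10.0.0.1 are placeholders, and the run/DRY_RUN wrapper is purely
illustrative so that nothing is executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Illustrative sketch only; jail name, interface names and addresses are
# assumptions, not taken from a working setup.
# Prints each command by default; set DRY_RUN=0 (as root) to execute.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# make_vnet_jail NAME ADDR/PREFIX GATEWAY
# Assumes the cloned interfaces are named bridge0 and epair0a/epair0b.
make_vnet_jail() {
    name=$1 addr=$2 gw=$3
    run ifconfig bridge create               # host-side virtual switch
    run ifconfig epair create                # virtual cable: epair0a <-> epair0b
    run ifconfig bridge0 addm epair0a up     # host end joins the bridge
    run ifconfig epair0a up
    run jail -c name="$name" vnet persist    # jail with its own network stack
    run ifconfig epair0b vnet "$name"        # move the other end into the jail
    run jexec "$name" ifconfig epair0b inet "$addr" up
    run jexec "$name" route add default "$gw"
}

make_vnet_jail app1 10.0.0.25/16 10.0.0.1
```

Each further jail would get its own epair, with the "a" end added to the
same bridge.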


>
> Regards,
>
> Nathan
>
>> On 23 Nov 2015, at 5:02 pm, Julian Elischer <julian@freebsd.org> wrote:
>>
>> On 21/11/2015 10:06 AM, Nathan Aherne wrote:
>>> I had a bit of a think about how to describe what I am trying to achieve.
>>>
>>> I am treating each jail like its own little “virtual machine”. The jail provides certain services, using things like nginx or nodejs, php-fpm, mysql or postgresql. The jails can control connections to themselves by configuring the firewall ports that are opened on their IP (10.0.0.0/16 or a public IP). I know the jails have no firewall of their own; the firewall is configured from the host.
>>>
>>> I want each jail or “virtual machine” to be able to communicate with one another and the wider internet. When a jail does a DNS query for another App jail, it may get a public IP on its own Host (or it may get another host) and it has no issues being able to communicate with another jail on the same host.
>>>
>>> At the moment all of the above is working perfectly except for jail to jail communication on the same host (when the communication is not directly between 10.0.0.0/16 IP addresses).
>> This is pretty much exactly when vimage/vnet jails could be used to great effect.
>> Is there a reason you are not doing that?  Each jail has its own routing tables, addresses and (virtual) interfaces.
>>
>> here's how I'd do it with vimage
>>
>>                                        +--------------+
>>                        +---------------+              | servers
>>                        |               +--------------+
>>                        |
>>                        |               +--------------+
>>                        |      +--------+              |
>>                        |      |        +--------------+
>>                        |      |
>>      +--------+     +--+------+----+
>>      | iface  |     | bridge       |
>>      |        +-----+              |
>>      +--------+     +----+---------+
>>                          |
>>                          |
>>                          |
>>                          |
>>                          |
>>                          |
>> +------------------------+---------------------+
>> |                                              |
>> |                                              |
>> |       NAT jail router                        |
>> |                                              |
>> |                                              |
>> +-------+--------+--------+-------+------------+
>>         |        |        |       |
>>      +--+--+  +--+--+  +--+--+ +--+--+
>>      |     |  |     |  |     | |     |
>>      |     |  |     |  |     | |     |
>>      |     |  |     |  |     | |     |    jails
>>      |     |  |     |  |     | |     |
>>      +-----+  +-----+  +-----+ +-----+
>>
>>
>>
>> However, the hairpin idea might still be useful even in that scenario if they don't know about each other's 'local' addresses, but do NAT'd machines need to talk to each other by external addresses?
>>
>> i Nathan
>>>> On 21 Nov 2015, at 9:12 am, Nathan Aherne <nathan@reddog.com.au> wrote:
>>>>
>>>> I am not exactly sure how to draw the setup so it doesn’t confuse the situation. The setup is extremely simple (I am not running vimage), jails running on the 10.0.0.0/16 (cloned lo1 interface) network or with public IPs. The jails with private IPs are the HTTP app jails. The Host runs a HTTP Proxy (nginx) and forwards traffic to each HTTP App jail based on the URL it receives. The jails with public IPs are things like database jails which cannot be proxied by the Host.
>>>>
>>>> I can happily communicate with any jail from my laptop (externally) but when I want one jail to communicate with another jail (for example an App Jail communicating with the database jail) the traffic shows as backwards (destination:port -> source:port) in the IPFW logs (tshark shows the traffic correctly source:port -> destination:port). The jail to jail traffic tries to go over the lo1 interface (backwards) and is blocked. Below are some IPFW logs of an App jail (10.0.0.25) communicating with the database jail (aaa.bbb.ccc.ddd)
>>>>
>>>> IPFW logs. The lines labelled UNKNOWN are from the check-state rule (everything is labelled UNKNOWN even if it is KNOWN traffic)
>>>>
>>>> Nov 21 08:49:07 host5 kernel: ipfw: 101 UNKNOWN TCP eee.fff.gg.hhh:5432 10.0.0.25:42957 out via lo1
>>>> Nov 21 08:49:07 host5 kernel: ipfw: 65501 Deny TCP eee.fff.gg.hhh:5432 10.0.0.25:42957 out via lo1
>>>> Nov 21 08:49:10 host5 kernel: ipfw: 101 UNKNOWN TCP eee.fff.gg.hhh:5432 10.0.0.25:42957 out via lo1
>>>> Nov 21 08:49:10 host5 kernel: ipfw: 65501 Deny TCP eee.fff.gg.hhh:5432 10.0.0.25:42957 out via lo1
>>>> Nov 21 08:49:13 host5 kernel: ipfw: 101 UNKNOWN TCP eee.fff.gg.hhh:5432 10.0.0.25:42957 out via lo1
>>>> Nov 21 08:49:13 host5 kernel: ipfw: 65501 Deny TCP eee.fff.gg.hhh:5432 10.0.0.25:42957 out via lo1
>>>> Nov 21 08:49:16 host5 kernel: ipfw: 101 UNKNOWN TCP eee.fff.gg.hhh:5432 10.0.0.25:42957 out via lo1
>>>> Nov 21 08:49:16 host5 kernel: ipfw: 65501 Deny TCP eee.fff.gg.hhh:5432 10.0.0.25:42957 out via lo1
>>>>
>>>> tshark output (loopback and wan interface capture for port 5432)
>>>>
>>>> Capturing on 'Loopback' and 'bce0'
>>>>    1   0.000000    10.0.0.25 -> eee.fff.gg.hhh TCP 64 42957→5432 [SYN] Seq=0 Win=65535 Len=0 MSS=16344 WS=64 SACK_PERM=1 TSval=142885525 TSecr=0
>>>>    2   3.013905    10.0.0.25 -> eee.fff.gg.hhh TCP 64 [TCP Retransmission] 42957→5432 [SYN] Seq=0 Win=65535 Len=0 MSS=16344 WS=64 SACK_PERM=1 TSval=142888539 TSecr=0
>>>>    3   6.241658    10.0.0.25 -> eee.fff.gg.hhh TCP 64 [TCP Retransmission] 42957→5432 [SYN] Seq=0 Win=65535 Len=0 MSS=16344 WS=64 SACK_PERM=1 TSval=142891767 TSecr=0
>>>>    4   9.451516    10.0.0.25 -> eee.fff.gg.hhh TCP 64 [TCP Retransmission] 42957→5432 [SYN] Seq=0 Win=65535 Len=0 MSS=16344 WS=64 SACK_PERM=1 TSval=142894976 TSecr=0
>>>>    5  12.654656    10.0.0.25 -> eee.fff.gg.hhh TCP 64 [TCP Retransmission] 42957→5432 [SYN] Seq=0 Win=65535 Len=0 MSS=16344 WS=64 SACK_PERM=1 TSval=142898180 TSecr=0
>>>>    6  15.863900    10.0.0.25 -> eee.fff.gg.hhh TCP 64 [TCP Retransmission] 42957→5432 [SYN] Seq=0 Win=65535 Len=0 MSS=16344 WS=64 SACK_PERM=1 TSval=142901389 TSecr=0
>>>>    7  22.076655    10.0.0.25 -> eee.fff.gg.hhh TCP 64 [TCP Retransmission] 42957→5432 [SYN] Seq=0 Win=65535 Len=0 MSS=16344 WS=64 SACK_PERM=1 TSval=142907602 TSecr=0
>>>>
>>>>
>>>>> If so, what sort of routing is setup on both host and jails?
>>>> Routing is what would be added by default (whatever the host system adds when adding an IP), there is no custom routing. I have wondered if I need to modify the routing table to get this to work.
>>>>
>>>> Below is the output of netstat -rn
>>>>
>>>> www.xxx.yy.zzz is the gateway address
>>>> eee.fff.gg.hhh is the database jail public IP
>>>> aaa.bbb.cc.ddd is the public IP for NAT
>>>> lll.mmm.nn.ooo is the Hosts public IP
>>>>
>>>>
>>>> Routing tables
>>>>
>>>> Internet:
>>>> Destination        Gateway            Flags      Netif Expire
>>>> default            www.xxx.yy.zzz     UGS        bce0
>>>> 10.0.0.1           link#6             UH          lo1
>>>> 10.0.0.2           link#6             UH          lo1
>>>> 10.0.0.3           link#6             UH          lo1
>>>> 10.0.0.4           link#6             UH          lo1
>>>> 10.0.0.5           link#6             UH          lo1
>>>> 10.0.0.6           link#6             UH          lo1
>>>> 10.0.0.7           link#6             UH          lo1
>>>> 10.0.0.8           link#6             UH          lo1
>>>> 10.0.0.9           link#6             UH          lo1
>>>> 10.0.0.10          link#6             UH          lo1
>>>> 10.0.0.11          link#6             UH          lo1
>>>> 10.0.0.12          link#6             UH          lo1
>>>> 10.0.0.13          link#6             UH          lo1
>>>> 10.0.0.14          link#6             UH          lo1
>>>> 10.0.0.15          link#6             UH          lo1
>>>> 10.0.0.16          link#6             UH          lo1
>>>> 10.0.0.17          link#6             UH          lo1
>>>> 10.0.0.18          link#6             UH          lo1
>>>> 10.0.0.19          link#6             UH          lo1
>>>> 10.0.0.20          link#6             UH          lo1
>>>> 10.0.0.21          link#6             UH          lo1
>>>> 10.0.0.22          link#6             UH          lo1
>>>> 10.0.0.23          link#6             UH          lo1
>>>> 10.0.0.24          link#6             UH          lo1
>>>> 10.0.0.25          link#6             UH          lo1
>>>> 10.0.0.26          link#6             UH          lo1
>>>> www.xxx.yy.zzz/25  link#1             U          bce0
>>>> eee.fff.gg.hhh     link#1             UHS         lo0
>>>> eee.fff.gg.hhh/32  link#1             U          bce0
>>>> aaa.bbb.cc.ddd     link#1             UHS         lo0
>>>> aaa.bbb.cc.ddd/32  link#1             U          bce0
>>>> lll.mmm.nn.ooo     link#1             UHS         lo0
>>>> 127.0.0.1          link#5             UH          lo0
>>>>
>>>> Internet6:
>>>> Destination                       Gateway                       Flags      Netif Expire
>>>> ::/96                             ::1                           UGRS        lo0
>>>> ::1                               link#5                        UH          lo0
>>>> ::ffff:0.0.0.0/96                 ::1                           UGRS        lo0
>>>> fe80::/10                         ::1                           UGRS        lo0
>>>> fe80::%lo0/64                     link#5                        U           lo0
>>>> fe80::1%lo0                       link#5                        UHS         lo0
>>>> ff01::%lo0/32                     ::1                           U           lo0
>>>> ff02::/16                         ::1                           UGRS        lo0
>>>> ff02::%lo0/32                     ::1                           U           lo0
>>>>
>>>>> Anything like ?
>>>>> http://kb.juniper.net/InfoCenter/index?page=content&id=KB24639&actp=search
>>>> Yes just like that.
>>>>
>>>> Regards,
>>>>
>>>> Nathan
>>>>
>>>>> On 19 Nov 2015, at 2:46 am, Ian Smith <smithi@nimnet.asn.au> wrote:
>>>>>
>>>>> On Wed, 18 Nov 2015 22:17:29 +0800, Julian Elischer wrote:
>>>>>> On 11/18/15 8:40 AM, Nathan Aherne wrote:
>>>>>>> For some reason hairpin (loopback nat or nat reflection) does not seem to
>>>>>>> be working, which is why I chose IPFW in the first place.
>>>>>> it would be good to see a diagram of what this actually means.
>>>>> Anything like ?
>>>>> http://kb.juniper.net/InfoCenter/index?page=content&id=KB24639&actp=search
>>>>>
>>>>> Was this so one jail can only access service/s provided by other jail/s,
>>>>> both/all with internal NAT'd addresses, by using only the public address
>>>>> and port of the 'router', which IIRC this is a single system with jails?
>>>>>
>>>>> If so, what sort of routing is setup on both host and jails?
>>>>>
>>>>> (blindfolded, no idea where I've pinned the donkey's tail :)
>>>>>
>>>>> cheers, Ian
>>> _______________________________________________
>>> freebsd-ipfw@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-ipfw
>>> To unsubscribe, send any mail to "freebsd-ipfw-unsubscribe@freebsd.org"
>>>
>>>
>
>