From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 01:34:52 2014 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 89AF0254 for ; Sun, 10 Aug 2014 01:34:52 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7208C21F2 for ; Sun, 10 Aug 2014 01:34:52 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.8/8.14.8) with ESMTP id s7A1YqHc029970 for ; Sun, 10 Aug 2014 01:34:52 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-net@FreeBSD.org Subject: [Bug 192426] [bpf] [panic]: Kernel panic when using BPF Date: Sun, 10 Aug 2014 01:34:52 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.0-RELEASE X-Bugzilla-Keywords: patch X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: linimon@FreeBSD.org X-Bugzilla-Status: Needs Triage X-Bugzilla-Priority: --- X-Bugzilla-Assigned-To: freebsd-net@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: assigned_to short_desc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 01:34:52 -0000 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192426

Mark Linimon changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-bugs@FreeBSD.org    |freebsd-net@FreeBSD.org
            Summary|[panic]: Kernel panic when  |[bpf] [panic]: Kernel panic
                   |using BPF                   |when using BPF

--- Comment #7 from Mark Linimon ---
Over to maintainers.

--
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 02:12:12 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id F20AF537 for ; Sun, 10 Aug 2014 02:12:11 +0000 (UTC) Received: from mail-qc0-x22a.google.com (mail-qc0-x22a.google.com [IPv6:2607:f8b0:400d:c01::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id AF6DF25A2 for ; Sun, 10 Aug 2014 02:12:11 +0000 (UTC) Received: by mail-qc0-f170.google.com with SMTP id x3so584053qcv.29 for ; Sat, 09 Aug 2014 19:12:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=O9O+LVVPmI1+5Dc3omOIKVqWXuyM++K24VAJqHxfRk4=; b=Q0/uPA288lECBHFAkHZ9DZHCPB7lEanmjuXisPkVz7+9CCr4V1nS8Dwfk40jIz+f3A ELdmcGhAp5FqIbjYxaZLzSeMaEUqVEY/FXNEF1b3sn/4w/Hy9O6PJAT63/RFQ0cPJNM2 /eh9czbdOWO9kdctqsdTXWcx0zFdRIZN31NzwL1jvGyh9thiCBrKpaqJe4MjSPDsVoPt jp4k9xsJbSS/1yTIaeRaU8VNSS3dQb5+/AnigIQb6qTK63tivdUl1iu9FlCNAgk5ZUHD 1GIg+fLzazvUnKkX8Q/7KMJAnu+Yn2rZwxOI7OlR6tlIz363dco5baEB50rU2pSiCO++ rCYQ== MIME-Version: 1.0 X-Received: by 10.229.68.131 with SMTP id v3mr50597999qci.10.1407636730519; Sat, 09 Aug 2014 19:12:10 -0700 (PDT) Received:
by 10.224.137.71 with HTTP; Sat, 9 Aug 2014 19:12:10 -0700 (PDT) In-Reply-To: <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> Date: Sun, 10 Aug 2014 10:12:10 +0800 Message-ID: Subject: Re: A problem on TCP in High RTT Environment. From: Niu Zhixiong To: Michael Tuexen Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18 Cc: freebsd-net@freebsd.org, John-Mark Gurney , Bill Yuan X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 02:12:12 -0000

Hi

During the TCP4 transmission.

Proto Recv-Q  Send-Q Local Address          Foreign Address        (state)
tcp4       0 2097346 10.0.10.2.13504        10.0.10.3.9000         ESTABLISHED

Regards,
Niu Zhixiong
－－－－－－－－－－－－－－－
kaiaixi@gmail.com

On Sun, Aug 10, 2014 at 4:58 AM, Michael Tuexen <Michael.Tuexen@lurchi.franken.de> wrote:

> On 09 Aug 2014, at 22:45, John-Mark Gurney wrote:
>
> > Michael Tuexen wrote this message on Sat, Aug 09, 2014 at 21:51 +0200:
> >>
> >> On 09 Aug 2014, at 20:42, John-Mark Gurney wrote:
> >>
> >>> Niu Zhixiong wrote this message on Fri, Aug 08, 2014 at 20:34 +0800:
> >>>> Dear all,
> >>>>
> >>>> Last month, I send problems related to FTP/TCP in a high RTT environment.
> >>>> After that, I setup a simulation environment(Dummynet) to test TCP and SCTP
> >>>> in high delay environment. After finishing the test, I can see TCP is
> >>>> always slower than SCTP. But, I think it is not possible. (Plz see the
> >>>> figure in the attachment). When the delay is 200ms(means RTT=400ms).
> >>>> Besides, the TCP is extremely slow.
> >>>>
> >>>> ALL BW=20Mbps, DELAY= 0 ~ 200MS, Packet LOSS = 0 (by dummynet)
> >>>>
> >>>> This is my parameters:
> >>>> FreeBSD vfreetest0 10.0-RELEASE FreeBSD 10.0-RELEASE #0: Thu Aug 7
> >>>> 11:04:15 HKT 2014
> >>>>
> >>>> sysctl net.inet.tcp
> >>>
> >>> [...]
> >>>
> >>>> net.inet.tcp.recvbuf_auto: 0
> >>>
> >>> [...]
> >>>
> >>>> net.inet.tcp.sendbuf_auto: 0
> >>>
> >>> Try enabling this... This should allow the buffer to grow large enough
> >>> to deal w/ the higher latency...
> >>>
> >>> Also, make sure your program isn't setting the recv buffer size as that
> >>> will disable the auto growing...
> >> I think the program sets the buffer to 2MB, which it also does for SCTP.
> >> So having both statically at the same size makes sense for the comparison.
> >> I remember that there was a bug in the combination of LRO and delayed ACK,
> >> which was fixed, but I don't remember it was fixed before 10.0...
> >
> > Sounds like disabling LRO and TSO would be a useful test to see if that
> > improves things... But hiren said that the fix made it, so...
> >
> >>> If you use netstat -a, you should be able to see the send-q on the
> >>> sender grow as necessary...
> >
> > Also, getting the send-q output while it's running would let us know
> > if the buffer is getting to 2MB or not...
> That is correct. Niu: Can you provide this?
>
> Best regards
> Michael
> >
> > --
> > John-Mark Gurney Voice: +1 415 225 5579
> >
> > "All that I will do, has been done, All that I have, has not."
> > > > From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 02:23:52 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 09DA36ED for ; Sun, 10 Aug 2014 02:23:52 +0000 (UTC) Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "funkthat.com", Issuer "funkthat.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id D85E8264F for ; Sun, 10 Aug 2014 02:23:51 +0000 (UTC) Received: from h2.funkthat.com (localhost [127.0.0.1]) by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id s7A2NoXj004832 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 9 Aug 2014 19:23:50 -0700 (PDT) (envelope-from jmg@h2.funkthat.com) Received: (from jmg@localhost) by h2.funkthat.com (8.14.3/8.14.3/Submit) id s7A2NovU004831; Sat, 9 Aug 2014 19:23:50 -0700 (PDT) (envelope-from jmg) Date: Sat, 9 Aug 2014 19:23:50 -0700 From: John-Mark Gurney To: Niu Zhixiong Subject: Re: A problem on TCP in High RTT Environment. 
Message-ID: <20140810022350.GI83475@funkthat.com> Mail-Followup-To: Niu Zhixiong , Michael Tuexen , freebsd-net@freebsd.org, Bill Yuan References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Operating-System: FreeBSD 7.2-RELEASE i386 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (h2.funkthat.com [127.0.0.1]); Sat, 09 Aug 2014 19:23:50 -0700 (PDT) Cc: Michael Tuexen , Bill Yuan , freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 02:23:52 -0000 Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:12 +0800: > During the TCP4 transmission. > Proto Recv-Q Send-Q Local Address Foreign Address (state) > tcp4 0 2097346 10.0.10.2.13504 10.0.10.3.9000 > ESTABLISHED Ok, so you are getting a full 2MB in there, and w/ that, you should easily be saturating your pipe... The next thing would be to get a tcpdump, and take a look at the window size.. Wireshark has lots of neat tools to make this analysis easy... Another tool that is good is tcptrace.. It can output a variety of different graphs that will help you track down, and see what part of the system is the problem... You probably only need a few tens of seconds of the tcpdump... 
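A rough back-of-the-envelope check of the "2MB should easily saturate the pipe" reasoning above, as a sketch only. It uses the figures quoted in this thread (20 Mbps, 200 ms one-way delay, 2 MB buffer); the function names are made up for illustration, and protocol overhead is ignored.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to keep the path full."""
    return bandwidth_bps * rtt_s / 8


def window_limited_bps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on throughput when at most window_bytes are in flight per RTT."""
    return window_bytes * 8 / rtt_s


if __name__ == "__main__":
    link_bps = 20e6           # 20 Mbps, as configured via dummynet in this thread
    rtt = 0.4                 # 200 ms one-way delay, i.e. RTT = 400 ms
    buf = 2 * 1024 * 1024     # 2 MB socket buffer set by the test program

    print(f"BDP: {bdp_bytes(link_bps, rtt):.0f} bytes")
    print(f"2 MB window allows up to {window_limited_bps(buf, rtt) / 1e6:.1f} Mbps")
```

Under these assumptions the bandwidth-delay product is about 1 MB, so a 2 MB buffer by itself would not be expected to cap the transfer below the 20 Mbps link rate; that is why the next step is to look at what is actually in flight in the capture.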
--
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."

From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 02:42:47 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7C7329F5 for ; Sun, 10 Aug 2014 02:42:47 +0000 (UTC) Received: from mail-qa0-x232.google.com (mail-qa0-x232.google.com [IPv6:2607:f8b0:400d:c00::232]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 37BD727C1 for ; Sun, 10 Aug 2014 02:42:47 +0000 (UTC) Received: by mail-qa0-f50.google.com with SMTP id s7so6722366qap.23 for ; Sat, 09 Aug 2014 19:42:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=nSfRUkwt2/ofDoJUfmVKIyIsDadW8gYMfZt1/nxiG1w=; b=VFdOY+SeO9DsNNt0ggh5J6WLBCkGCzpRr8LTtkJpUUg9sidwp6ItUGTI5DPqN+Jf+l 7Petoj2Mv2C57kjRQCbDqgqI3E8NkPr/0IeHS6ch35tdZ7eQzsyUSD53HCYZbuC/+euv kgJYTtB1hME3H71po/92w+kQb+PcKzPB0cjwj09DlssuWEFlrtPitB1UeHbgB6NuKwDL gKeIVzQyU2hLeMWYGfkYqt6W+n2sER6FZEzqby79c6qeItRNcuX5Y8qTLTs5ngtxQoOx KagFBHI6DjovpjPEPSpNrzg/syj1agNalITJrWlvrx8SsW5W4bR+MG8cDrmPzRC0pgYd HKkA== MIME-Version: 1.0 X-Received: by 10.140.30.180 with SMTP id d49mr34369652qgd.63.1407638565872; Sat, 09 Aug 2014 19:42:45 -0700 (PDT) Received: by 10.224.137.71 with HTTP; Sat, 9 Aug
2014 19:42:45 -0700 (PDT) In-Reply-To: <20140810022350.GI83475@funkthat.com> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> Date: Sun, 10 Aug 2014 10:42:45 +0800 Message-ID: Subject: Re: A problem on TCP in High RTT Environment. From: Niu Zhixiong To: Niu Zhixiong , Michael Tuexen , freebsd-net@freebsd.org, Bill Yuan , John-Mark Gurney Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 02:42:47 -0000

I am sure that wnd is about 2MB all the time.
This is my latest capture, plz see Google Drive.
In the latest test, TCP(0s-120s) is about 9Mbps and SCTP(0s-120s) is about 18Mbps.
(The bandwidth(20Mbps) and delay(200ms) is set by dummynet)
The SCTP and TCP are tested in same environment.

sctp.pcapng.gz
tcp.pcapng.gz

Regards,
Niu Zhixiong
－－－－－－－－－－－－－－－
kaiaixi@gmail.com

From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 02:50:51 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 39AB1AB5 for ; Sun, 10 Aug 2014 02:50:51 +0000 (UTC) Received: from mail-qc0-x22a.google.com (mail-qc0-x22a.google.com [IPv6:2607:f8b0:400d:c01::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E74FA285D for ; Sun, 10 Aug 2014 02:50:50 +0000 (UTC) Received: by mail-qc0-f170.google.com with SMTP id x3so592622qcv.1 for ; Sat, 09 Aug 2014 19:50:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=1q1LMVzE/a/6A+a5deEvDTP0N5nBOMNMrEMhFN6WNzE=; b=URL1NTwe6H33pifzk4uwoLIStBo6N4rCEAbKHc99gRc/Qg76FvF3++wtv6Wr1hn/Z+
JseMaokJteOAwIJ3q23WLdkbUV2+dcLghcnO7AWQGfPn5vj5XTQz3w2YFZeLNt7qoBJK wOe0IGMec8FdYzPW4P4EHjx/Azd3ARJhBaLF3snAoLQUiA8lEUuA70rbYXzzjp+QSvR7 KQ+omaLDUm1iTd447KBEg6YqL4tGwaYOI/04ER6i0FB66KMNFj+PkeRuJOAp3XOcZ5GW SDB+00ZUx+adzmDIN2bxBP3g7G7OceTTInzeTjWr/oqrCQ0WEAJmty2jP1uQLl59tQPT oqEw== MIME-Version: 1.0 X-Received: by 10.224.95.6 with SMTP id b6mr52190063qan.17.1407639050068; Sat, 09 Aug 2014 19:50:50 -0700 (PDT) Received: by 10.224.137.71 with HTTP; Sat, 9 Aug 2014 19:50:49 -0700 (PDT) In-Reply-To: References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> Date: Sun, 10 Aug 2014 10:50:49 +0800 Message-ID: Subject: Re: A problem on TCP in High RTT Environment. From: Niu Zhixiong To: Niu Zhixiong , Michael Tuexen , freebsd-net@freebsd.org, Bill Yuan , John-Mark Gurney Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 02:50:51 -0000

I am sorry that I upload a WRONG SCTP capture. But, the throughput is same.
SCTP is double than TCP, about 18Mbps.

sctp_2.pcapng.gz

Regards,
Niu Zhixiong
－－－－－－－－－－－－－－－
kaiaixi@gmail.com

From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 03:32:15 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7780379F for ; Sun, 10 Aug 2014 03:32:15 +0000 (UTC) Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "funkthat.com", Issuer "funkthat.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 317492D5A for ; Sun, 10 Aug 2014 03:32:14 +0000 (UTC) Received: from h2.funkthat.com (localhost [127.0.0.1]) by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id s7A3WDP6005867 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 9 Aug 2014 20:32:13 -0700 (PDT) (envelope-from jmg@h2.funkthat.com) Received: (from jmg@localhost) by h2.funkthat.com (8.14.3/8.14.3/Submit) id s7A3WCZp005866; Sat, 9 Aug 2014 20:32:12 -0700 (PDT) (envelope-from jmg) Date: Sat, 9 Aug 2014 20:32:12 -0700 From: John-Mark Gurney To: Niu Zhixiong Subject: Re: A problem on TCP in High RTT Environment.
Message-ID: <20140810033212.GL83475@funkthat.com> Mail-Followup-To: Niu Zhixiong , Michael Tuexen , freebsd-net@freebsd.org, Bill Yuan References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Operating-System: FreeBSD 7.2-RELEASE i386 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (h2.funkthat.com [127.0.0.1]); Sat, 09 Aug 2014 20:32:13 -0700 (PDT) Cc: Michael Tuexen , Bill Yuan , freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 03:32:15 -0000

Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:50 +0800:
> I am sorry that I upload a WRONG SCTP capture. But, the throughput is same.
> SCTP is double than TCP, about 18Mbps.
> sctp_2.pcapng.gz

Ok, the owin graph is very interesting... We do have a full 2MB window
on the receiver side, but for some reason, we only ever have just under
6k outstanding on the connection...

So, it looks like we send for a short period of time, and then stop
sending... Do you have LRO enabled?
I think it might be related to:
https://svnweb.freebsd.org/changeset/base/r256920

As I'm seeing >100ms gaps where the sender doesn't send any data, and
as soon as more than one ack comes in, the next segment goes out... If
we only receive a single ack, then we wait for a timeout before sending
the next segment..

Can you try to disable LRO on the receiving host?
ifconfig <interface> -lro

And see if that helps... If it does, apply the patch, or compile a more
recent kernel from stable/10 that is after r257367, as that is when the
change was merged...

--
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."

From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 03:46:47 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 068C1A80 for ; Sun, 10 Aug 2014 03:46:47 +0000 (UTC) Received: from smtp1.multiplay.co.uk (smtp1.multiplay.co.uk [85.236.96.35]) by mx1.freebsd.org (Postfix) with ESMTP id B86242E57 for ; Sun, 10 Aug 2014 03:46:46 +0000 (UTC) Received: by smtp1.multiplay.co.uk (Postfix, from userid 65534) id BB35720E7088A; Sun, 10 Aug 2014 03:46:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.multiplay.co.uk X-Spam-Level: ** X-Spam-Status: No, score=2.2 required=8.0 tests=AWL,BAYES_00,DOS_OE_TO_MX, FSL_HELO_NON_FQDN_1,HELO_NO_DOMAIN,RDNS_DYNAMIC,STOX_REPLY_TYPE autolearn=no version=3.3.1 Received: from r2d2 (82-69-141-170.dsl.in-addr.zen.co.uk [82.69.141.170]) by smtp1.multiplay.co.uk (Postfix) with ESMTPS id 99F6020E70885; Sun, 10 Aug 2014 03:46:34 +0000 (UTC) Message-ID: <59B841E11C214B28A31D842B7CB1136F@multiplay.co.uk> From: "Steven Hartland" To:
"John-Mark Gurney" , "Niu Zhixiong" References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> Subject: Re: A problem on TCP in High RTT Environment. Date: Sun, 10 Aug 2014 04:46:29 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 Cc: Michael Tuexen , Bill Yuan , freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 03:46:47 -0000 ----- Original Message ----- From: "John-Mark Gurney" > Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:50 +0800: >> I am sorry that I upload a WRONG SCTP capture. But, the throughput is >> same. >> SCTP is double than TCP, about 18Mbps. >> ??? >> sctp_2.pcapng.gz >> >> ??? > > Ok, the owin graph is very interesting... We do have a full 2MB > window > on the receiver side, but for some reason, we only ever have just > under > 6k outstanding on the connection... > > So, it looks like we send for a short period of time, and then stop > sending... Do you have LRO enabled? I think it might be related to: > https://svnweb.freebsd.org/changeset/base/r256920 > > As I'm seeing >100ms gaps where the sender doesn't send any data, and > as soon as more than one ack comes in, the next segment goes out... > If > we only receive a single ack, then we wait for a timeout before > sending > the next segment.. > > Can you try to disable LRO on the receiving host? 
> > ifconfig -lro > > And see if that helps... If it does... Applying the patch, or > compiling > a more recent kernel from stable/10 that is after r257367 as that is > was > the date that the change was merged... r257367 was in 10.0-RELEASE From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 03:48:59 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id EDE42B2C for ; Sun, 10 Aug 2014 03:48:59 +0000 (UTC) Received: from mail-qa0-x22c.google.com (mail-qa0-x22c.google.com [IPv6:2607:f8b0:400d:c00::22c]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A641A2E6C for ; Sun, 10 Aug 2014 03:48:59 +0000 (UTC) Received: by mail-qa0-f44.google.com with SMTP id f12so6959278qad.31 for ; Sat, 09 Aug 2014 20:48:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=v5GCfOnDgAXy+7nVOWr/bzah3mB631DfXtS5xt+kVbA=; b=YSDgFic5brRItXnpJLMSLFFn2r+O03FC7yTLy0rPADUOSFUveu47B6vh0a4fQU6VGg dyTUHVKm4fAvjprhBIWACS/kCNZiaCXKQeGWNodpu2HFP6cTrbelEu7ys+tJUKxUSyYj ARcb88ka26tC97Fb3VbKbfizV6eBmnqe5KSQEioI9/TZwoLhyyxCO+FOKEdBMusjd/UU 1YfHV4DNh5k1PG2gPruRZExxCYwImTMWAoG8BDzw2YLC+Tpa0xXUi6rI4vr5eauTCH9W zdCrceTfTBMPYwVdXsdCvK07e38jP2RK12Y0jErIJjSMXpkly5YBHHcQ5rsjQXi1CEO/ 9/4A== MIME-Version: 1.0 X-Received: by 10.224.95.74 with SMTP id c10mr50038126qan.35.1407642538522; Sat, 09 Aug 2014 20:48:58 -0700 (PDT) Received: by 10.224.137.71 with HTTP; Sat, 9 Aug 2014 20:48:58 -0700 (PDT) In-Reply-To: <20140810033212.GL83475@funkthat.com> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> 
 <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com>
Date: Sun, 10 Aug 2014 11:48:58 +0800
Subject: Re: A problem on TCP in High RTT Environment.
From: Niu Zhixiong
To: Niu Zhixiong, Michael Tuexen, freebsd-net@freebsd.org, Bill Yuan, John-Mark Gurney
Content-Type: text/plain; charset=UTF-8
X-List-Received-Date: Sun, 10 Aug 2014 03:49:00 -0000

I am using an Intel I350-T4 NIC. LRO is disabled by default. By the way,
when I run exactly the same test on a KVM-based virtual machine (virtio NIC),
the results are the same.

ifconfig igb0
igb0: flags=8843 metric 0 mtu 1500
        options=403bb
        ether a0:36:9f:38:27:d0
        inet 10.0.10.3 netmask 0xffffff00 broadcast 10.0.10.255
        inet6 fe80::a236:9fff:fe38:27d0%igb0 prefixlen 64 scopeid 0x1
        nd6 options=29
        media: Ethernet autoselect (1000baseT )
        status: active

Regards,
Niu Zhixiong
－－－－－－－－－－－－－－－
kaiaixi@gmail.com


On Sun, Aug 10, 2014 at 11:32 AM, John-Mark Gurney wrote:

> Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:50 +0800:
> > I am sorry that I upload a WRONG SCTP capture. But, the throughput is same.
> > SCTP is double than TCP, about 18Mbps.
> > sctp_2.pcapng.gz
> > <https://docs.google.com/file/d/0By8sTL79ob4tMlh4WDlTSndHX0k/edit?usp=drive_web>
>
> Ok, the owin graph is very interesting... We do have a full 2MB window
> on the receiver side, but for some reason, we only ever have just under
> 6k outstanding on the connection...
> From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 03:56:08 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 812C9BE7 for ; Sun, 10 Aug 2014 03:56:08 +0000 (UTC) Received: from mail-qc0-x235.google.com (mail-qc0-x235.google.com [IPv6:2607:f8b0:400d:c01::235]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 383582FE2 for ; Sun, 10 Aug 2014 03:56:08 +0000 (UTC) Received: by mail-qc0-f181.google.com with SMTP id x13so597644qcv.26 for ; Sat, 09 Aug 2014 20:56:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=oyo/oJu7Tz9rdnyarvTcaDGUkowG0LTSoTAsT8v5rWA=; b=wXPXHnQ6t6QB89F9d4z+JV/Y+AJjNZ0PMFmlj7c304G9ZQNKdl5OWMZlbm9by4kuuY Gb1LKT+X7/h/IvsWNAIHBkDxow0LcvYyACPHNARlGNb4SzzHTZYC0RcQ1sCsgaXNB0L/ UQaHVkRxhEfgpbZj0rwuMJ/AN3f2kUz4VNv7sJEI/2rXW3+o0l/piDyn5xUnQi4U3OA4 pO2DMN4dSVtS9IU40nHw5VTyl5Aam9SM1/u9gqG+VF8IkWNKqy/SQF5CkNJb+Xn46zwi IVqHDSTlcVkQKGGbPn5e4m1b7bBjiZpO3jMxO5lKi46pzyNl9P+rwrmpWHT4VT8zSQ4e +T5Q== MIME-Version: 1.0 X-Received: by 10.140.41.38 with SMTP id y35mr35650669qgy.69.1407642967256; Sat, 09 Aug 2014 20:56:07 -0700 (PDT) Received: by 10.224.137.71 with HTTP; Sat, 9 Aug 2014 20:56:07 -0700 (PDT) In-Reply-To: References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> Date: Sun, 10 Aug 2014 11:56:07 +0800 Message-ID: Subject: Re: A problem on TCP in High RTT Environment. 
From: Niu Zhixiong
To: Niu Zhixiong, Michael Tuexen, freebsd-net@freebsd.org, Bill Yuan, John-Mark Gurney
Content-Type: text/plain; charset=UTF-8
X-List-Received-Date: Sun, 10 Aug 2014 03:56:08 -0000

Actually, in the Handbook at
http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/configtuning-kernel-limits.html
(12.11.2.2. TCP Bandwidth Delay Product), I saw these options:

net.inet.tcp.inflight.enable
net.inet.tcp.inflight.debug
net.inet.tcp.inflight.min

But in FreeBSD 9.3R and 10R, I cannot find anything related to inflight in
the output of sysctl net.inet.tcp.

Regards,
Niu Zhixiong
－－－－－－－－－－－－－－－
kaiaixi@gmail.com


On Sun, Aug 10, 2014 at 11:48 AM, Niu Zhixiong wrote:

> I am using Intel I350-T4 NIC. The LRO is closed by default. And by the
> way, when I am using KVM-based virtual machine(virtio NIC) do the exactly
> same test. The results are same.
>
> ifconfig igb0
> igb0: flags=8843 metric 0 mtu 1500
>         options=403bb
>         ether a0:36:9f:38:27:d0
>         inet 10.0.10.3 netmask 0xffffff00 broadcast 10.0.10.255
>         inet6 fe80::a236:9fff:fe38:27d0%igb0 prefixlen 64 scopeid 0x1
>         nd6 options=29
>         media: Ethernet autoselect (1000baseT )
>         status: active
>
> Regards,
> Niu Zhixiong
> －－－－－－－－－－－－－－－
> kaiaixi@gmail.com
>
>
> On Sun, Aug 10, 2014 at 11:32 AM, John-Mark Gurney wrote:
>
>> Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:50 +0800:
>> > I am sorry that I upload a WRONG SCTP capture. But, the throughput is same.
>> >> -- >> John-Mark Gurney Voice: +1 415 225 5579 >> >> "All that I will do, has been done, All that I have, has not." >> > > From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 04:53:59 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id F33E117C for ; Sun, 10 Aug 2014 04:53:58 +0000 (UTC) Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "funkthat.com", Issuer "funkthat.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id A673824C5 for ; Sun, 10 Aug 2014 04:53:58 +0000 (UTC) Received: from h2.funkthat.com (localhost [127.0.0.1]) by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id s7A4ruNs006856 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 9 Aug 2014 21:53:56 -0700 (PDT) (envelope-from jmg@h2.funkthat.com) Received: (from jmg@localhost) by h2.funkthat.com (8.14.3/8.14.3/Submit) id s7A4rtQm006855; Sat, 9 Aug 2014 21:53:55 -0700 (PDT) (envelope-from jmg) Date: Sat, 9 Aug 2014 21:53:55 -0700 From: John-Mark Gurney To: Niu Zhixiong Subject: Re: A problem on TCP in High RTT Environment. 
Message-ID: <20140810045355.GM83475@funkthat.com>
Mail-Followup-To: Niu Zhixiong, Michael Tuexen, freebsd-net@freebsd.org, Bill Yuan
References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.2.3i
X-Operating-System: FreeBSD 7.2-RELEASE i386
X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396
X-Files: The truth is out there
X-URL: http://resnet.uoregon.edu/~gurney_j/
X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html
X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE
X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger?
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (h2.funkthat.com [127.0.0.1]); Sat, 09 Aug 2014 21:53:56 -0700 (PDT)
Cc: Michael Tuexen, Bill Yuan, freebsd-net@freebsd.org
X-List-Received-Date: Sun, 10 Aug 2014 04:53:59 -0000

Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 11:48 +0800:
> I am using Intel I350-T4 NIC. The LRO is closed by default. And by the way,
> when I am using KVM-based virtual machine(virtio NIC) do the exactly same
> test. The results are same.

Have you tried disabling TSO? I asked that in an earlier email, but
never heard from you whether that changed anything...

A lot of the trace looks like:
19:29:57.223574 IP 10.0.10.2.61010 > 10.0.10.3.9000: . 251521:257313(5792) ack 1 win 32783
19:29:57.223798 IP 10.0.10.3.9000 > 10.0.10.2.61010: . ack 257313 win 32745
19:29:57.225570 IP 10.0.10.2.61010 > 10.0.10.3.9000: . 257313:263105(5792) ack 1 win 32783

Notice how the ack comes back immediately, but for some reason, we
decide to wait almost 2ms before sending out the next frame...

For some reason, we just aren't filling our window... tcptrace's
graphs show the window at 2MB, but we only ever have 4 segments
outstanding at once...

> ifconfig igb0
> igb0: flags=8843 metric 0 mtu 1500
>         options=403bb
>         ether a0:36:9f:38:27:d0
>         inet 10.0.10.3 netmask 0xffffff00 broadcast 10.0.10.255
>         inet6 fe80::a236:9fff:fe38:27d0%igb0 prefixlen 64 scopeid 0x1
>         nd6 options=29
>         media: Ethernet autoselect (1000baseT )
>         status: active
>
> Regards,
> Niu Zhixiong
> －－－－－－－－－－－－－－－
> kaiaixi@gmail.com
>
>
> On Sun, Aug 10, 2014 at 11:32 AM, John-Mark Gurney wrote:
>
> > Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:50 +0800:
> > > I am sorry that I upload a WRONG SCTP capture. But, the throughput is same.
> > > SCTP is double than TCP, about 18Mbps.
> > > sctp_2.pcapng.gz
> > > <https://docs.google.com/file/d/0By8sTL79ob4tMlh4WDlTSndHX0k/edit?usp=drive_web>
> >
> > Ok, the owin graph is very interesting... We do have a full 2MB window
> > on the receiver side, but for some reason, we only ever have just under
> > 6k outstanding on the connection...
> >
> > So, it looks like we send for a short period of time, and then stop
> > sending... Do you have LRO enabled? I think it might be related to:
> > https://svnweb.freebsd.org/changeset/base/r256920
> >
> > As I'm seeing >100ms gaps where the sender doesn't send any data, and
> > as soon as more than one ack comes in, the next segment goes out... If
> > we only receive a single ack, then we wait for a timeout before sending
> > the next segment..
> >
> > Can you try to disable LRO on the receiving host?
> >
> > ifconfig -lro
> >
> > And see if that helps... If it does...
Applying the patch, or compiling > > a more recent kernel from stable/10 that is after r257367 as that is was > > the date that the change was merged... > > > > > On Sun, Aug 10, 2014 at 10:42 AM, Niu Zhixiong > > wrote: > > > > > > > I am sure that wnd is about 2MB all the time. > > > > This is my latest capture, plz see Google Drive. > > > > In the latest test, TCP(0s-120s) is about 9Mbps and SCTP(0s-120s) is > > about > > > > 18Mbps. > > > > (The bandwidth(20Mbps) and delay(200ms) is set by dummynet) > > > > The SCTP and TCP are tested in same environment. > > > > > > > > ??? > > > > sctp.pcapng.gz > > > > < > > https://docs.google.com/file/d/0By8sTL79ob4tYl9sM2V5a19iNVU/edit?usp=drive_web > > > > > > > ?????? > > > > tcp.pcapng.gz > > > > < > > https://docs.google.com/file/d/0By8sTL79ob4tV0NMR1FYLUQ3MWs/edit?usp=drive_web > > > > > > > ??? > > > > > > > > > > > > > > > > Regards, > > > > Niu Zhixiong > > > > ????????????????????????????????????????????? > > > > kaiaixi@gmail.com > > > > > > > > > > > > On Sun, Aug 10, 2014 at 10:23 AM, John-Mark Gurney > > > > wrote: > > > > > > > >> Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:12 +0800: > > > >> > During the TCP4 transmission. > > > >> > Proto Recv-Q Send-Q Local Address Foreign Address > > > >> (state) > > > >> > tcp4 0 2097346 10.0.10.2.13504 10.0.10.3.9000 > > > >> > ESTABLISHED > > > >> > > > >> Ok, so you are getting a full 2MB in there, and w/ that, you should > > > >> easily be saturating your pipe... > > > >> > > > >> The next thing would be to get a tcpdump, and take a look at the > > > >> window size.. Wireshark has lots of neat tools to make this analysis > > > >> easy... Another tool that is good is tcptrace.. It can output a > > > >> variety of different graphs that will help you track down, and see > > > >> what part of the system is the problem... > > > >> > > > >> You probably only need a few tens of seconds of the tcpdump... 
> > > >> > > > >> > On Sun, Aug 10, 2014 at 4:58 AM, Michael Tuexen < > > > >> > Michael.Tuexen@lurchi.franken.de> wrote: > > > >> > > > > >> > > > > > >> > > On 09 Aug 2014, at 22:45, John-Mark Gurney > > wrote: > > > >> > > > > > >> > > > Michael Tuexen wrote this message on Sat, Aug 09, 2014 at 21:51 > > > >> +0200: > > > >> > > >> > > > >> > > >> On 09 Aug 2014, at 20:42, John-Mark Gurney > > > >> wrote: > > > >> > > >> > > > >> > > >>> Niu Zhixiong wrote this message on Fri, Aug 08, 2014 at 20:34 > > > >> +0800: > > > >> > > >>>> Dear all, > > > >> > > >>>> > > > >> > > >>>> Last month, I send problems related to FTP/TCP in a high RTT > > > >> > > environment. > > > >> > > >>>> After that, I setup a simulation environment(Dummynet) to > > test > > > >> TCP > > > >> > > and SCTP > > > >> > > >>>> in high delay environment. After finishing the test, I can > > see > > > >> TCP is > > > >> > > >>>> always slower than SCTP. But, I think it is not possible. > > (Plz > > > >> see the > > > >> > > >>>> figure in the attachment). When the delay is 200ms(means > > > >> RTT=400ms). > > > >> > > >>>> Besides, the TCP is extremely slow. > > > >> > > >>>> > > > >> > > >>>> ALL BW=20Mbps, DELAY= 0 ~ 200MS, Packet LOSS = 0 (by > > dummynet) > > > >> > > >>>> > > > >> > > >>>> This is my parameters: > > > >> > > >>>> FreeBSD vfreetest0 10.0-RELEASE FreeBSD 10.0-RELEASE #0: Thu > > Aug > > > >> 7 > > > >> > > >>>> 11:04:15 HKT 2014 > > > >> > > >>>> > > > >> > > >>>> sysctl net.inet.tcp > > > >> > > >>> > > > >> > > >>> [...] > > > >> > > >>> > > > >> > > >>>> net.inet.tcp.recvbuf_auto: 0 > > > >> > > >>> > > > >> > > >>> [...] > > > >> > > >>> > > > >> > > >>>> net.inet.tcp.sendbuf_auto: 0 > > > >> > > >>> > > > >> > > >>> Try enabling this... This should allow the buffer to grow > > large > > > >> enough > > > >> > > >>> to deal w/ the higher latency... 
> > > >> > > >>> > > > >> > > >>> Also, make sure your program isn't setting the recv buffer > > size > > > >> as that > > > >> > > >>> will disable the auto growing... > > > >> > > >> I think the program sets the buffer to 2MB, which it also does > > for > > > >> SCTP. > > > >> > > >> So having both statically at the same size makes sense for the > > > >> > > comparison. > > > >> > > >> I remember that there was a bug in the combination of LRO and > > > >> delayed > > > >> > > ACK, > > > >> > > >> which was fixed, but I don't remember it was fixed before > > 10.0... > > > >> > > > > > > >> > > > Sounds like disabling LRO and TSO would be a useful test to see > > if > > > >> that > > > >> > > > improves things... But hiren said that the fix made it, so... > > > >> > > > > > > >> > > >>> If you use netstat -a, you should be able to see the send-q > > on the > > > >> > > >>> sender grow as necessary... > > > >> > > > > > > >> > > > Also, getting the send-q output while it's running would let us > > know > > > >> > > > if the buffer is getting to 2MB or not... > > > >> > > That is correct. Niu: Can you provide this? > > > > -- > > John-Mark Gurney Voice: +1 415 225 5579 > > > > "All that I will do, has been done, All that I have, has not." > > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not." 
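[Editorial note] The reasoning quoted above, that a full 2MB send queue "should easily be saturating your pipe", rests on the bandwidth-delay product. A minimal sketch of that arithmetic follows. The function names are made up for this note and are not part of any FreeBSD tool; the figures (20 Mbps, RTT = 400 ms, Send-Q of 2097346 bytes) are the ones quoted in the thread.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bytes that must be in flight to keep the pipe full."""
    return bandwidth_bps * rtt_s / 8.0


def window_limited_bps(window_bytes, rtt_s):
    """Upper bound on throughput when at most window_bytes may be unacknowledged."""
    return window_bytes * 8.0 / rtt_s


if __name__ == "__main__":
    bandwidth = 20_000_000  # 20 Mbps, as configured in dummynet
    rtt = 0.4               # 200 ms of delay each way, as stated in the thread
    send_q = 2097346        # Send-Q value from the quoted netstat output

    print("BDP: %.0f bytes" % bdp_bytes(bandwidth, rtt))
    print("window-limited rate: %.1f Mbps" % (window_limited_bps(send_q, rtt) / 1e6))
    print("buffer covers BDP:", send_q >= bdp_bytes(bandwidth, rtt))
```

If the buffer covers the BDP, as it does with these figures, the socket buffer size alone should not be what holds TCP back, which is why the thread moves on to looking at the capture.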
From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 05:06:47 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 994772C4 for ; Sun, 10 Aug 2014 05:06:47 +0000 (UTC) Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "funkthat.com", Issuer "funkthat.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 6BFFD257D for ; Sun, 10 Aug 2014 05:06:47 +0000 (UTC) Received: from h2.funkthat.com (localhost [127.0.0.1]) by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id s7A56kZc007077 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 9 Aug 2014 22:06:46 -0700 (PDT) (envelope-from jmg@h2.funkthat.com) Received: (from jmg@localhost) by h2.funkthat.com (8.14.3/8.14.3/Submit) id s7A56kBj007076; Sat, 9 Aug 2014 22:06:46 -0700 (PDT) (envelope-from jmg) Date: Sat, 9 Aug 2014 22:06:46 -0700 From: John-Mark Gurney To: Niu Zhixiong Subject: Re: A problem on TCP in High RTT Environment. 
Message-ID: <20140810050646.GN83475@funkthat.com> Mail-Followup-To: Niu Zhixiong , Michael Tuexen , freebsd-net@freebsd.org, Bill Yuan References: <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Operating-System: FreeBSD 7.2-RELEASE i386 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (h2.funkthat.com [127.0.0.1]); Sat, 09 Aug 2014 22:06:46 -0700 (PDT) Cc: Michael Tuexen , Bill Yuan , freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 05:06:47 -0000 Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 11:56 +0800: > Actually. In the > http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/configtuning-kernel-limits.html > 12.11.2.2. TCP Bandwidth Delay Product > I saw an option called > net.inet.tcp.inflight.enable > net.inet.tcp.inflight.debug > net.inet.tcp.inflight.min > > But, in FreeBSD 9.3R and 10R. I cannot find anything related to inflight in > sysctl net.inet.tcp. Looks like it was removed for the pluggable congestion control: https://svnweb.freebsd.org/changeset/base/r211315 man mod_cc for more info... > On Sun, Aug 10, 2014 at 11:48 AM, Niu Zhixiong wrote: > > > I am using Intel I350-T4 NIC. 
The LRO is closed by default. And by the > > way, when I am using KVM-based virtual machine(virtio NIC) do the exactly > > same test. The results are same. > > > > ifconfig igb0 > > igb0: flags=8843 metric 0 mtu 1500 > > > > options=403bb > > ether a0:36:9f:38:27:d0 > > inet 10.0.10.3 netmask 0xffffff00 broadcast 10.0.10.255 > > inet6 fe80::a236:9fff:fe38:27d0%igb0 prefixlen 64 scopeid 0x1 > > nd6 options=29 > > media: Ethernet autoselect (1000baseT ) > > status: active > > > > Regards, > > Niu Zhixiong > > ????????????????????????????????????????????? > > kaiaixi@gmail.com > > > > > > On Sun, Aug 10, 2014 at 11:32 AM, John-Mark Gurney > > wrote: > > > >> Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:50 +0800: > >> > I am sorry that I upload a WRONG SCTP capture. But, the throughput is > >> same. > >> > SCTP is double than TCP, about 18Mbps. > >> > ??? > >> > sctp_2.pcapng.gz > >> > < > >> https://docs.google.com/file/d/0By8sTL79ob4tMlh4WDlTSndHX0k/edit?usp=drive_web > >> > > >> > ??? > >> > >> Ok, the owin graph is very interesting... We do have a full 2MB window > >> on the receiver side, but for some reason, we only ever have just under > >> 6k outstanding on the connection... > >> > >> So, it looks like we send for a short period of time, and then stop > >> sending... Do you have LRO enabled? I think it might be related to: > >> https://svnweb.freebsd.org/changeset/base/r256920 > >> > >> As I'm seeing >100ms gaps where the sender doesn't send any data, and > >> as soon as more than one ack comes in, the next segment goes out... If > >> we only receive a single ack, then we wait for a timeout before sending > >> the next segment.. > >> > >> Can you try to disable LRO on the receiving host? > >> > >> ifconfig -lro > >> > >> And see if that helps... If it does... Applying the patch, or compiling > >> a more recent kernel from stable/10 that is after r257367 as that is was > >> the date that the change was merged... 
> >> > >> > On Sun, Aug 10, 2014 at 10:42 AM, Niu Zhixiong > >> wrote: > >> > > >> > > I am sure that wnd is about 2MB all the time. > >> > > This is my latest capture, plz see Google Drive. > >> > > In the latest test, TCP(0s-120s) is about 9Mbps and SCTP(0s-120s) is > >> about > >> > > 18Mbps. > >> > > (The bandwidth(20Mbps) and delay(200ms) is set by dummynet) > >> > > The SCTP and TCP are tested in same environment. > >> > > > >> > > ??? > >> > > sctp.pcapng.gz > >> > > < > >> https://docs.google.com/file/d/0By8sTL79ob4tYl9sM2V5a19iNVU/edit?usp=drive_web > >> > > >> > > ?????? > >> > > tcp.pcapng.gz > >> > > < > >> https://docs.google.com/file/d/0By8sTL79ob4tV0NMR1FYLUQ3MWs/edit?usp=drive_web > >> > > >> > > ??? > >> > > > >> > > > >> > > > >> > > Regards, > >> > > Niu Zhixiong > >> > > ????????????????????????????????????????????? > >> > > kaiaixi@gmail.com > >> > > > >> > > > >> > > On Sun, Aug 10, 2014 at 10:23 AM, John-Mark Gurney > >> > > wrote: > >> > > > >> > >> Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:12 +0800: > >> > >> > During the TCP4 transmission. > >> > >> > Proto Recv-Q Send-Q Local Address Foreign Address > >> > >> (state) > >> > >> > tcp4 0 2097346 10.0.10.2.13504 10.0.10.3.9000 > >> > >> > ESTABLISHED > >> > >> > >> > >> Ok, so you are getting a full 2MB in there, and w/ that, you should > >> > >> easily be saturating your pipe... > >> > >> > >> > >> The next thing would be to get a tcpdump, and take a look at the > >> > >> window size.. Wireshark has lots of neat tools to make this analysis > >> > >> easy... Another tool that is good is tcptrace.. It can output a > >> > >> variety of different graphs that will help you track down, and see > >> > >> what part of the system is the problem... > >> > >> > >> > >> You probably only need a few tens of seconds of the tcpdump... 
> >> > >> > >> > >> > On Sun, Aug 10, 2014 at 4:58 AM, Michael Tuexen < > >> > >> > Michael.Tuexen@lurchi.franken.de> wrote: > >> > >> > > >> > >> > > > >> > >> > > On 09 Aug 2014, at 22:45, John-Mark Gurney > >> wrote: > >> > >> > > > >> > >> > > > Michael Tuexen wrote this message on Sat, Aug 09, 2014 at 21:51 > >> > >> +0200: > >> > >> > > >> > >> > >> > > >> On 09 Aug 2014, at 20:42, John-Mark Gurney > >> > >> wrote: > >> > >> > > >> > >> > >> > > >>> Niu Zhixiong wrote this message on Fri, Aug 08, 2014 at 20:34 > >> > >> +0800: > >> > >> > > >>>> Dear all, > >> > >> > > >>>> > >> > >> > > >>>> Last month, I send problems related to FTP/TCP in a high RTT > >> > >> > > environment. > >> > >> > > >>>> After that, I setup a simulation environment(Dummynet) to > >> test > >> > >> TCP > >> > >> > > and SCTP > >> > >> > > >>>> in high delay environment. After finishing the test, I can > >> see > >> > >> TCP is > >> > >> > > >>>> always slower than SCTP. But, I think it is not possible. > >> (Plz > >> > >> see the > >> > >> > > >>>> figure in the attachment). When the delay is 200ms(means > >> > >> RTT=400ms). > >> > >> > > >>>> Besides, the TCP is extremely slow. > >> > >> > > >>>> > >> > >> > > >>>> ALL BW=20Mbps, DELAY= 0 ~ 200MS, Packet LOSS = 0 (by > >> dummynet) > >> > >> > > >>>> > >> > >> > > >>>> This is my parameters: > >> > >> > > >>>> FreeBSD vfreetest0 10.0-RELEASE FreeBSD 10.0-RELEASE #0: > >> Thu Aug > >> > >> 7 > >> > >> > > >>>> 11:04:15 HKT 2014 > >> > >> > > >>>> > >> > >> > > >>>> sysctl net.inet.tcp > >> > >> > > >>> > >> > >> > > >>> [...] > >> > >> > > >>> > >> > >> > > >>>> net.inet.tcp.recvbuf_auto: 0 > >> > >> > > >>> > >> > >> > > >>> [...] > >> > >> > > >>> > >> > >> > > >>>> net.inet.tcp.sendbuf_auto: 0 > >> > >> > > >>> > >> > >> > > >>> Try enabling this... This should allow the buffer to grow > >> large > >> > >> enough > >> > >> > > >>> to deal w/ the higher latency... 
> >> > >> > > >>> > >> > >> > > >>> Also, make sure your program isn't setting the recv buffer > >> size > >> > >> as that > >> > >> > > >>> will disable the auto growing... > >> > >> > > >> I think the program sets the buffer to 2MB, which it also > >> does for > >> > >> SCTP. > >> > >> > > >> So having both statically at the same size makes sense for the > >> > >> > > comparison. > >> > >> > > >> I remember that there was a bug in the combination of LRO and > >> > >> delayed > >> > >> > > ACK, > >> > >> > > >> which was fixed, but I don't remember it was fixed before > >> 10.0... > >> > >> > > > > >> > >> > > > Sounds like disabling LRO and TSO would be a useful test to > >> see if > >> > >> that > >> > >> > > > improves things... But hiren said that the fix made it, so... > >> > >> > > > > >> > >> > > >>> If you use netstat -a, you should be able to see the send-q > >> on the > >> > >> > > >>> sender grow as necessary... > >> > >> > > > > >> > >> > > > Also, getting the send-q output while it's running would let > >> us know > >> > >> > > > if the buffer is getting to 2MB or not... > >> > >> > > That is correct. Niu: Can you provide this? > >> > >> -- > >> John-Mark Gurney Voice: +1 415 225 5579 > >> > >> "All that I will do, has been done, All that I have, has not." > >> > > > > -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not." 
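[Editorial note] The "owin" (outstanding data) observation in the message above can be reproduced from any capture by tracking the highest sequence number sent against the highest acknowledgement seen. The sketch below is only an illustration of that bookkeeping: the function name and the event format are hypothetical, sequence numbers are assumed to be relative (as tcpdump prints them), and wraparound and retransmissions are ignored.

```python
def max_outstanding(events, initial_seq=1):
    """Return the peak number of sent-but-unacknowledged bytes.

    events is a time-ordered list of ("send", seq_end) or ("ack", ack_no)
    tuples for one direction of a TCP connection. initial_seq is the first
    relative sequence number of the data stream (1 in tcpdump output).
    """
    highest_sent = initial_seq
    highest_acked = initial_seq
    peak = 0
    for kind, value in events:
        if kind == "send":
            highest_sent = max(highest_sent, value)
        elif kind == "ack":
            highest_acked = max(highest_acked, value)
        else:
            raise ValueError("unknown event kind: %r" % (kind,))
        # Outstanding data is everything sent that has not been acknowledged yet.
        peak = max(peak, highest_sent - highest_acked)
    return peak


if __name__ == "__main__":
    # Hypothetical events, not taken from the captures linked in the thread.
    sample = [("send", 5793), ("ack", 5793), ("send", 11585), ("ack", 11585)]
    print("peak outstanding bytes:", max_outstanding(sample))
```

A peak that stays far below the advertised window, as reported above, points at the sender rather than at the receiver's buffer.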
From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 09:55:41 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 225D14CB for ; Sun, 10 Aug 2014 09:55:41 +0000 (UTC) Received: from gpo1.cc.swin.edu.au (gpo1.cc.swin.edu.au [136.186.1.30]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 91473201A for ; Sun, 10 Aug 2014 09:55:40 +0000 (UTC) Received: from [136.186.229.37] (garmitage.caia.swin.edu.au [136.186.229.37]) by gpo1.cc.swin.edu.au (8.14.3/8.14.3) with ESMTP id s7A9tbnW008910 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 10 Aug 2014 19:55:37 +1000 Message-ID: <53E74199.5040507@swin.edu.au> Date: Sun, 10 Aug 2014 19:55:37 +1000 From: grenville armitage User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:16.0) Gecko/20121107 Thunderbird/16.0.2 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: A problem on TCP in High RTT Environment. References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> In-Reply-To: <20140810022350.GI83475@funkthat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 09:55:41 -0000 On 08/10/2014 12:23, John-Mark Gurney wrote: [..] > The next thing would be to get a tcpdump, and take a look at the > window size.. 
Wireshark has lots of neat tools to make this analysis > easy... Another tool that is good is tcptrace.. It can output a > variety of different graphs that will help you track down, and see > what part of the system is the problem... Also, SIFTR (man siftr) can provide detailed insight into the TCP flow's state over time. cheers, gja From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 10:05:29 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3ACC5698 for ; Sun, 10 Aug 2014 10:05:29 +0000 (UTC) Received: from outbound.afilias.info (outbound.afilias.info [66.199.183.4]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 037CA20DE for ; Sun, 10 Aug 2014 10:05:28 +0000 (UTC) Received: from ms5.on1.afilias-ops.info ([10.109.8.9] helo=smtp.afilias.info) by outbound.afilias.info with esmtp (Exim 4.72) (envelope-from ) id 1XGPqW-0001DG-60 for freebsd-net@freebsd.org; Sun, 10 Aug 2014 09:55:24 +0000 Received: from mail-we0-f177.google.com ([74.125.82.177]) by smtp.afilias.info with esmtps (TLSv1:RC4-SHA:128) (Exim 4.72) (envelope-from ) id 1XGPqW-0002Yg-5Y for freebsd-net@freebsd.org; Sun, 10 Aug 2014 09:55:24 +0000 Received: by mail-we0-f177.google.com with SMTP id w62so7367197wes.22 for ; Sun, 10 Aug 2014 02:55:18 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:date:message-id:subject:from:to :content-type; bh=UU+NIbWhZmF1VzlnzPvD7Ko1befayOVgFFCntxTJjRU=; b=jYfBlLZFGfbk/TXFXkqpRQNx4ibKWHdo7YhQzpgv/2w0Kw6wO7G6UM1Jp3ySJBwNpW j0A2ylv/a+bCY1rGucdQ/dY12fsh3R3u0j6W44lmuxIjdSJoDVbIZTGxYmLPpbVPkNZT Tgl/Mq5zXXgxIZb6AcDmVta65vVtrIKSw+7tZyOcAcKO5Rpg8tS11nOqriokdOXbTF0T 
SfM4F0jRc1GKDmt7rRq+nzwF1XZxAZ8PYUX9w7vXvFGNfXHoiL5lHm9PjU1fWkKP3UzA mTeR+Cn2Ai16fQSnx1D15m2zF4wjcVbKRFVwUusWj2gIB0t1dFxm/qoZA0TrvYN59pLs NU6Q== X-Gm-Message-State: ALoCoQkW8bvnR+atoDzDOREY1QY2DXe4+NRH/WwQv5k4jasB4QHS6x4EtComOe6MiRdT/JlOvpqF3r/o93ht+H3BzzuFCvWdIIHNSc2XiKoVCnE0pUGmjYpZ8OBTwXNBM9Oo08fS6dO8 X-Received: by 10.180.12.76 with SMTP id w12mr16056979wib.4.1407664518628; Sun, 10 Aug 2014 02:55:18 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.180.12.76 with SMTP id w12mr16056970wib.4.1407664518549; Sun, 10 Aug 2014 02:55:18 -0700 (PDT) Received: by 10.217.117.201 with HTTP; Sun, 10 Aug 2014 02:55:18 -0700 (PDT) Date: Sun, 10 Aug 2014 10:55:18 +0100 Message-ID: Subject: Geolocalization daemon From: David Carlier To: freebsd-net@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 10:05:29 -0000 Hello. I recently wrote a new daemon for OpenBSD that provides geolocation information for an IP address. I'd like to know whether there is any interest in it, in which case I could port it to other BSDs. Kind regards.
David CARLIER Afilias Technologies From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 10:34:43 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BC99EA47 for ; Sun, 10 Aug 2014 10:34:43 +0000 (UTC) Received: from outbound.afilias.info (outbound.afilias.info [66.199.183.4]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 80092242C for ; Sun, 10 Aug 2014 10:34:43 +0000 (UTC) Received: from ms5.on1.afilias-ops.info ([10.109.8.9] helo=smtp.afilias.info) by outbound.afilias.info with esmtp (Exim 4.72) (envelope-from ) id 1XGQSY-0002LV-3v for freebsd-net@freebsd.org; Sun, 10 Aug 2014 10:34:42 +0000 Received: from mail-we0-f178.google.com ([74.125.82.178]) by smtp.afilias.info with esmtps (TLSv1:RC4-SHA:128) (Exim 4.72) (envelope-from ) id 1XGQSY-000376-3T for freebsd-net@freebsd.org; Sun, 10 Aug 2014 10:34:42 +0000 Received: by mail-we0-f178.google.com with SMTP id w61so7492447wes.37 for ; Sun, 10 Aug 2014 03:34:36 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:content-type; bh=8fHe9/qbd7wPN6zWgwPPWV4upeKMnTBFxbKdEG7pAC4=; b=M7sOox1whiZTvKp/zyrFAq356IIxKwyu78jLeKnNXZFipqc990PnF2wzUKv5Ah6Uwl HBUkTmRbnXleA6SnLdti6LzvLjLuUyQm+63gcJQKwaIsPxplu02aAIyBuXjDdVuklOtf 4tDTI7XY94wB61PKOJ1rXBcwZjnRLaiFvJqZUO3oyfVQfmPh3ux0ztjUlHvKvIYOdQ/T TvDt4nFCvhXDgbPzS2yZcaK/votygLdvMi+O5Eop7RVdYhtAxrLnoywjUjOPtLqd98Sp U348yg+reLWYJHI/tWyeN05mihXuIIGbYi9se2q44+tluIjSHWZnAYxQ6xaSEvUpAvwN BsBA== X-Gm-Message-State: 
ALoCoQmlS3jLYOjPk4YaiI+eIvPGuvE0RiOTiKelbh4YxCZMgQ35MbQ7F47/PBaHN//ym2i18snj1Bnu0MuVE6KT/OyNLVzFWQJMLuAUK3wafjfX4Qa9O5aSr1XbG++v0fKa2kgCoh8T X-Received: by 10.180.20.105 with SMTP id m9mr11747718wie.6.1407666876011; Sun, 10 Aug 2014 03:34:36 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.180.20.105 with SMTP id m9mr11747700wie.6.1407666875897; Sun, 10 Aug 2014 03:34:35 -0700 (PDT) Received: by 10.217.117.201 with HTTP; Sun, 10 Aug 2014 03:34:35 -0700 (PDT) In-Reply-To: References: Date: Sun, 10 Aug 2014 11:34:35 +0100 Message-ID: Subject: Re: Geolocalization daemon From: David Carlier To: Sami Halabi , freebsd-net@freebsd.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 10:34:43 -0000 Hi, and thanks for your interest. It will need some porting, as it uses imsg from OpenBSD's libutil and, more generally, follows the OpenBSD "trend". Anyone interested can have a look here: https://github.com/devnexen/geoloc On 10 August 2014 11:30, Sami Halabi wrote: > Sounds very interesting.. would love to see it working in fbsd.. > > Sami > On 10 Aug 2014 13:05, "David Carlier" wrote: > >> Hello. >> >> I recently wrote a new daemon for OpenBSD that provides geolocation >> information for an IP address. I'd like to know whether there is any interest >> in it, in which case I could port it to other BSDs. >> >> Kind regards.
>> David CARLIER >> Afilias Technologies >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> > From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 12:27:02 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B5323B38 for ; Sun, 10 Aug 2014 12:27:02 +0000 (UTC) Received: from mail-qa0-x22d.google.com (mail-qa0-x22d.google.com [IPv6:2607:f8b0:400d:c00::22d]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6B52C2D7A for ; Sun, 10 Aug 2014 12:27:02 +0000 (UTC) Received: by mail-qa0-f45.google.com with SMTP id cm18so7034342qab.18 for ; Sun, 10 Aug 2014 05:27:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=9ovCxsdSWgOd0GEAo64jRNS17PuPqEiw2cHHW83bjfs=; b=sbJHc9nx0+rTglTIWXGJeY/CtZG2W5eCVyCBO6ctYlkXQFSGcH4ODAIB6A4ErnXimE dmwfl2+WDyV94eZxr0VY8GGFXXpy0UGLPU3RRflp30wgIjgzaUCGbQwlvHvTDyo5ukLw MtPYyzhzpTFh1DokFXEj3UteyQva2U400mFqLyHwwpGHRVBJ/1lcFSVn8ptKQFVuUCSd sZJmdBllvDDvzpqEhdFyyPeSCLyTXY9lLLth124PUTOjSe+zNF27fy/PRQuPM57N6eQp IQK/3d4mib6Aj5dtqssh4XEEInuSWN62pHdZlVv1ZrY7nvWt3S9RwpBCFtaVGX3gim7S 6oXw== MIME-Version: 1.0 X-Received: by 10.140.41.38 with SMTP id y35mr38456683qgy.69.1407673621360; Sun, 10 Aug 2014 05:27:01 -0700 (PDT) Received: by 10.224.137.71 with HTTP; Sun, 10 Aug 2014 05:27:01 -0700 (PDT) In-Reply-To: References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> 
<20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <20140810045355.GM83475@funkthat.com> Date: Sun, 10 Aug 2014 20:27:01 +0800 Message-ID: Subject: Re: A problem on TCP in High RTT Environment. From: Niu Zhixiong To: Niu Zhixiong , Michael Tuexen , "freebsd-net@freebsd.org" , Bill Yuan , John-Mark Gurney Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 12:27:02 -0000 Hi, I am not sure whether my last email was filtered by the mailing list. After disabling TSO, the speed became even worse. Here is the packet capture; please see Google Drive. tcp_with_tso_off.pcapng.gz Regards, Niu Zhixiong --------------- kaiaixi@gmail.com On Sun, Aug 10, 2014 at 1:24 PM, Niu Zhixiong wrote: > Hi, > After disabling TSO, the speed became even worse. > Here is the packet capture; please see Google Drive. > > tcp_with_tso_off.pcapng.gz > > > > > On Sunday, August 10, 2014, John-Mark Gurney wrote: > > Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 11:48 +0800: >> > I am using Intel I350-T4 NIC. The LRO is closed by default. And by the >> way, >> > when I am using KVM-based virtual machine(virtio NIC) do the exactly >> same >> > test. The results are same. >> >> Have you tried disabling tso? I asked that in an earlier email, but >> never heard from you if that changed anything...
>> >> A lot of the trace looks like: >> 19:29:57.223574 IP 10.0.10.2.61010 > 10.0.10.3.9000: . >> 251521:257313(5792) ack 1 win 32783 >> 19:29:57.223798 IP 10.0.10.3.9000 > 10.0.10.2.61010: . ack 257313 win >> 32745 >> 19:29:57.225570 IP 10.0.10.2.61010 > 10.0.10.3.9000: . >> 257313:263105(5792) ack 1 win 32783 >> >> Notice how the ack comes back immediately, but for some reason, we decide >> to >> wait almost 2ms before sending out the next frame... >> >> For some reason, we just aren't filling our window out... tcptrace's >> graphs show the window at 2MB, but we only ever have 4 segments >> outstanding at once... >> >> > ifconfig igb0 >> > igb0: flags=8843 metric 0 mtu >> 1500 >> > >> options=403bb >> > ether a0:36:9f:38:27:d0 >> > inet 10.0.10.3 netmask 0xffffff00 broadcast 10.0.10.255 >> > inet6 fe80::a236:9fff:fe38:27d0%igb0 prefixlen 64 scopeid 0x1 >> > nd6 options=29 >> > media: Ethernet autoselect (1000baseT ) >> > status: active >> > >> > Regards, >> > Niu Zhixiong >> > --------------- >> > kaiaixi@gmail.com >> > >> > >> > On Sun, Aug 10, 2014 at 11:32 AM, John-Mark Gurney >> wrote: >> > >> > > Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:50 +0800: >> > > > I am sorry that I upload a WRONG SCTP capture. But, the throughput >> is >> > > same. >> > > > SCTP is double than TCP, about 18Mbps. >> > > > >> > > > sctp_2.pcapng.gz >> > > > < >> > > >> https://docs.google.com/file/d/0By8sTL79ob4tMlh4WDlTSndHX0k/edit?usp=drive_web >> > > > >> > > > >> > > >> > > Ok, the owin graph is very interesting... We do have a full 2MB >> window >> > > on the receiver side, but for some reason, we only ever have just >> under >> > > 6k outstanding on the connection... >> > > >> > > So, it looks like we send for a short period of time, and then stop >> > > sending... Do you have LRO enabled?
I think it might be related to: >> > > https://svnweb.freebsd.org/changeset/base/r256920 >> > > >> > > As I'm seeing >100ms gaps where the sender doesn't send any data, and >> > > as soon as more than one ack comes in, the next segment goes out... >> If >> > > we only receive a single ack, then we wait for a timeout before >> sending >> > > the next segment.. >> > > >> > > Can you try to disable LRO on the receiving host? >> > > >> > > ifconfig -lro >> > > >> > > And see if that helps... If it does... Applying the patch, or >> compiling >> > > a more recent kernel from stable/10 that is after r257367 as that is >> was >> > > the date that the change was merged... >> > > >> > > > On Sun, Aug 10, 2014 at 10:42 AM, Niu Zhixiong >> > > wrote: >> > > > >> > > > > I am sure that wnd is about 2MB all the time. >> > > > > This is my latest capture, plz see Google Drive. >> > > > > In the latest test, TCP(0s-120s) is about 9Mbps and SCTP(0s-120s) >> is >> > > about >> > > > > 18Mbps. >> > > > > (The bandwidth(20Mbps) and delay(200ms) is set by dummynet) >> > > > > The SCTP and TCP are tested in same environment. >> > > > > >> > > > > >> > > > > sctp.pcapng.gz >> > > > > < >> > > >> https://docs.google.com/file/d/0By8sTL79ob4tYl9sM2V5a19iNVU/edit?usp=drive_web >> > > > >> > > > > >> > > > > tcp.pcapng.gz >> > > > > < >> > > >> https://docs.google.com/file/d/0By8sTL79ob4tV0NMR1FYLUQ3MWs/edit?usp=drive_web >> > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > Regards, >> > > > > Niu Zhixiong >> > > > > --------------- >> > > > > kaiaixi@gmail.com >> > > > > >> > > > > >> > > > > On Sun, Aug 10, 2014 at 10:23 AM, John-Mark Gurney < >> jmg@funkthat.com> >> > > > > wrote: >> > > > > >> > > > >> Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:12 >> +0800: >> > > > >> > During the TCP4 transmission.
>> > > > >> > Proto Recv-Q Send-Q Local Address Foreign Address >> > > > >> (state) >> > > > >> > tcp4 0 2097346 10.0.10.2.13504 10.0.10.3.9000 >> > > > >> > ESTABLISHED >> > > > >> >> > > > >> Ok, so you are getting a full 2MB in there, and w/ that, you >> should >> > > > >> easily be saturating your pipe... >> > > > >> >> > > > >> The next thing would be to get a tcpdump, and take a look at the >> > > > >> window size.. Wireshark has lots of neat tools to make this >> analysis >> > > > >> easy... Another tool that is good is tcptrace.. It can output a >> > > > >> variety of different graphs that will help you track down, and >> see >> > > > >> what part of the system is the problem... >> > > > >> >> > > > >> You probably only need a few tens of seconds of the tcpdump... >> > > > >> >> > > > >> > On Sun, Aug 10, 2014 at 4:58 AM, Michael Tuexen < >> > > > >> > Michael.Tuexen@lurchi.franken.de> wrote: >> > > > >> > >> > > > >> > > >> > > > >> > > On 09 Aug 2014, at 22:45, John-Mark Gurney > > >> > > wrote: >> > > > >> > > >> > > > >> > > > Michael Tuexen wrote this message on Sat, Aug 09, 2014 at >> 21:51 >> > > > >> +0200: >> > > > >> > > >> >> > > > >> > > >> On 09 Aug 2014, at 20:42, John-Mark Gurney < >> jmg@funkthat.com> >> > > > >> wrote: >> > > > >> > > >> >> > > > >> > > >>> Niu Zhixiong wrote this message on Fri, Aug 08, 2014 at >> 20:34 >> > > > >> +0800: >> > > > >> > > >>>> Dear all, >> > > > >> > > >>>> >> > > > >> > > >>>> Last month, I send problems related to FTP/TCP in a >> high RTT >> > > > >> > > environment. >> > > > >> > > >>>> After that, I setup a simulation environment(Dummynet) >> to >> > > test >> > > > >> TCP >> > > > >> > > and SCTP >> > > > >> > > >>>> in high delay environment. After finishing the test, I >> can >> > > see >> > > > >> TCP is >> > > > >> > > >>>> always slower than SCTP. But, I think it is not >> possible. >> > > (Plz >> > > > >> see the >> > > > >> > > >>>> figure in the attachment).
When the delay is 200ms(means >> > > > >> RTT=400ms). >> > > > >> > > >>>> Besides, the TCP is extremely slow. >> > > > >> > > >>>> >> > > > >> > > >>>> ALL BW=20Mbps, DELAY= 0 ~ 200MS, Packet LOSS = 0 (by >> > > dummynet) >> > > > >> > > >>>> >> > > > >> > > >>>> This is my parameters: >> > > > >> > > >>>> FreeBSD vfreetest0 10.0-RELEASE FreeBSD 10.0-RELEASE >> #0: Thu >> > > Aug >> > > > >> 7 >> > > > >> > > >>>> 11:04:15 HKT 2014 >> > > > >> > > >>>> >> > > > >> > > >>>> sysctl net.inet.tcp >> > > > >> > > >>> >> > > > >> > > >>> [...] >> > > > >> > > >>> >> > > > >> > > >>>> net.inet.tcp.recvbuf_auto: 0 >> > > > >> > > >>> >> > > > >> > > >>> [...] >> > > > >> > > >>> >> > > > >> > > >>>> net.inet.tcp.sendbuf_auto: 0 >> > > > >> > > >>> >> > > > >> > > >>> Try enabling this... This should allow the buffer to >> grow >> > > large >> > > > >> enough >> > > > >> > > >>> to deal w/ the higher latency... >> > > > >> > > >>> >> > > > >> > > >>> Also, make sure your program isn't setting the recv >> buffer >> > > size >> > > > >> as that >> > > > >> > > >>> will disable the auto growing... >> > > > >> > > >> I think the program sets the buffer to 2MB, which it also >> does >> > > for >> > > > >> SCTP. >> > > > >> > > >> So having both statically at the same size makes sense >> for the >> > > > >> > > comparison. >> > > > >> > > >> I remember that there was a bug in the combination of LRO >> and >> > > > >> delayed >> > > > >> > > ACK, >> > > > >> > > >> which was fixed, but I don't remember it was fixed before >> > > 10.0... >> > > > >> > > > >> > > > >> > > > Sounds like disabling LRO and TSO would be a useful test >> to see >> > > if >> > > > >> that >> > > > >> > > > improves things... But hiren said that the fix made it, >> so... >> > > > >> > > > >> > > > >> > > >>> If you use netstat -a, you should be able to see the >> send-q >> > > on the >> > > > >> > > >>> sender grow as necessary...
>> > > > >> > > > >> > > > >> > > > Also, getting the send-q output while it's running would >> let us >> > > know >> > > > >> > > > if the buffer is getting to 2MB or not... >> > > > >> > > That is correct. Niu: Can you provide this? >> > > >> > > -- >> > > John-Mark Gurney Voice: +1 415 225 >> 5579 >> > > >> > > "All that I will do, has been done, All that I have, has not." >> > > >> > _______________________________________________ >> > freebsd-net@freebsd.org mailing list >> > http://lists.freebsd.org/mailman/listinfo/freebsd-net >> > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> -- >> John-Mark Gurney Voice: +1 415 225 5579 >> >> "All that I will do, has been done, All that I have, has not." >> > From owner-freebsd-net@FreeBSD.ORG Sun Aug 10 20:09:57 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7E0F3733 for ; Sun, 10 Aug 2014 20:09:57 +0000 (UTC) Received: from mail-wg0-x230.google.com (mail-wg0-x230.google.com [IPv6:2a00:1450:400c:c00::230]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1C0C1294A for ; Sun, 10 Aug 2014 20:09:56 +0000 (UTC) Received: by mail-wg0-f48.google.com with SMTP id x13so7738817wgg.7 for ; Sun, 10 Aug 2014 13:09:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=7EU5W3QJtEqDeDChDpD8KI1o0l/+jXKPn+n4Hbkwjvw=; b=AiLD3w9QRoS1GLHYm+0aaMtMdUOoQhb+vRH1kmH81aEkhvzQjrEmfBKr3Jft1Y7JxH eA+x/YJnTyMwHP0jiBnBOsGhA8yOcRLwI/tvDcW6OmTVJv5TJFVR9giRvUkwQRsn/CrP yBFzcbsHKaXh3lqFw+NvVf8QdNSQaVtlS0O8Dc83Gb/4t5BffDVV0d7bNmYkqf94O8Zd 
LfUiNNYpbwzrgmveCIepxbp+WgCYpMxFYJq79rfoq0RfZp6HI3lxmhGTYoxzjRyHrdco UIU5usgHictsVQA/LySd02hKqXtEupGnLsNn95fvelzPIPEbS7gtSJQaXEfNTgVX6CzZ Frng== MIME-Version: 1.0 X-Received: by 10.180.73.6 with SMTP id h6mr19773445wiv.65.1407701395087; Sun, 10 Aug 2014 13:09:55 -0700 (PDT) Received: by 10.216.22.129 with HTTP; Sun, 10 Aug 2014 13:09:55 -0700 (PDT) Date: Sun, 10 Aug 2014 23:09:55 +0300 Message-ID: Subject: Question about tcp keep-alive timer From: David Bar To: freebsd-net@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 Aug 2014 20:09:57 -0000 Hi, (Forgive me if this topic has been discussed before; I didn't find it in the archives.) In tcp_input(), when a packet is received on an established socket, the code re-arms the keep-alive timer for every packet. Here: https://svnweb.freebsd.org/base/release/10.0.0/sys/netinet/tcp_input.c?revision=260789&view=markup#l1518 Isn't it wasteful to do this for each packet? Setting the timer once, when the connection becomes established, should suffice given a small change in tcp_timer_keep(): if tcp_timer_keep() first checked whether tp->t_rcvtime is recent (newer than the TT_KEEPIDLE time) and, if so, just re-armed the timer to go off later, we would keep the same functionality. I can't think of any downsides to this idea. Is there a good reason why this hasn't been done before? 
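The idea in this message can be sketched as a small model. This is only an illustration in Python, not FreeBSD kernel code: `Conn`, `on_segment_eager`, `on_segment_lazy`, and `timer_fired_lazy` are hypothetical names, `KEEPIDLE` is an illustrative constant in seconds, and only `t_rcvtime` corresponds to a field named in the message. It contrasts re-arming on every segment with recording the arrival time and deciding only when the timer fires.

```python
# Hypothetical model of the lazy keep-alive re-arm idea; not FreeBSD code.
from dataclasses import dataclass

KEEPIDLE = 7200  # idle seconds before a probe is due (illustrative value)


@dataclass
class Conn:
    t_rcvtime: int       # time the last segment arrived
    timer_deadline: int  # absolute time at which the keep-alive timer fires
    rearms: int = 0      # how many times the timer has been re-armed


def on_segment_eager(c: Conn, now: int) -> None:
    """Behaviour described in the message: re-arm the timer on every segment."""
    c.t_rcvtime = now
    c.timer_deadline = now + KEEPIDLE
    c.rearms += 1


def on_segment_lazy(c: Conn, now: int) -> None:
    """Proposed behaviour: only record the arrival time, leave the timer alone."""
    c.t_rcvtime = now


def timer_fired_lazy(c: Conn, now: int) -> bool:
    """Return True if a probe is due; otherwise push the timer out and return False."""
    if now - c.t_rcvtime < KEEPIDLE:
        # Traffic was seen recently: fire again one idle period after it.
        c.timer_deadline = c.t_rcvtime + KEEPIDLE
        c.rearms += 1
        return False
    return True


if __name__ == "__main__":
    lazy = Conn(t_rcvtime=0, timer_deadline=KEEPIDLE)
    for t in (100, 5000):
        on_segment_lazy(lazy, t)
    print(timer_fired_lazy(lazy, lazy.timer_deadline), lazy.timer_deadline, lazy.rearms)
```

In this model both variants end up with the same deadline after the same traffic; the lazy variant touches the timer once per idle period instead of once per segment, at the cost of one extra timer callback on connections that were recently active.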
Thanks, David Bar dbar at gmail dot com From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 07:08:50 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5D440E6B for ; Mon, 11 Aug 2014 07:08:50 +0000 (UTC) Received: from mx1.netapp.com (mx1.netapp.com [216.240.18.38]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "mx1.netapp.com", Issuer "VeriSign Class 3 Secure Server CA - G3" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 13CCA27DA for ; Mon, 11 Aug 2014 07:08:49 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.01,839,1400050800"; d="asc'?scan'208";a="338346448" Received: from vmwexceht05-prd.hq.netapp.com ([10.106.77.35]) by mx1-out.netapp.com with ESMTP; 11 Aug 2014 00:08:43 -0700 Received: from HIOEXCMBX03-PRD.hq.netapp.com (10.122.105.36) by vmwexceht05-prd.hq.netapp.com (10.106.77.35) with Microsoft SMTP Server (TLS) id 14.3.123.3; Mon, 11 Aug 2014 00:08:17 -0700 Received: from HIOEXCMBX07-PRD.hq.netapp.com (10.122.105.40) by hioexcmbx03-prd.hq.netapp.com (10.122.105.36) with Microsoft SMTP Server (TLS) id 15.0.913.22; Mon, 11 Aug 2014 00:08:15 -0700 Received: from HIOEXCMBX07-PRD.hq.netapp.com ([::1]) by hioexcmbx07-prd.hq.netapp.com ([fe80::f0de:b572:dd26:36b5%21]) with mapi id 15.00.0913.011; Mon, 11 Aug 2014 00:08:15 -0700 From: "Eggert, Lars" To: Niu Zhixiong Subject: Re: A problem on TCP in High RTT Environment. Thread-Topic: A problem on TCP in High RTT Environment. 
Thread-Index: AQHPswaXaqmGo9mF3Ua6+epBNevn4ZvJEqcAgAATYoCAAA7VAIAAA8CAgABXqQCAAANCAP//lMnkgAB+UQCAAASwAIAByiAA Date: Mon, 11 Aug 2014 07:08:15 +0000 Message-ID: <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: x-mailer: Apple Mail (2.1878.6) x-originating-ip: [10.122.56.79] Content-Type: multipart/signed; boundary="Apple-Mail=_BBD29890-D3AB-4597-8701-FC64BF565A91"; protocol="application/pgp-signature"; micalg=pgp-sha1 MIME-Version: 1.0 Cc: Michael Tuexen , John-Mark Gurney , Bill Yuan , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 07:08:50 -0000 --Apple-Mail=_BBD29890-D3AB-4597-8701-FC64BF565A91 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii Hi, On 2014-8-10, at 5:48, Niu Zhixiong wrote: > I am using Intel I350-T4 NIC. igb driver? I've been having weird issues with this driver under 10-RELEASE, too. On one machine, I had to limit hw.igb.num_queues=2 in order to get any sort of useful connectivity. On another machine, I had to severely bump kern.ipc.nmbclusters & friends. I'm not sure this is the issue here, since SCTP seems to be working OK, but I'm not trusting igb NICs at the moment. 
Lars --Apple-Mail=_BBD29890-D3AB-4597-8701-FC64BF565A91 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="signature.asc" Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- iQCVAwUBU+hr+NZcnpRveo1xAQJ26gP+MJIgZKg2Y9ztDnCkU4yU1CROwWoo16jt 9eMePs9YZyZ3ryh6s8Et7h8wxHdnvnEeH9mptrqKTC+gJwgjLyfU9CGEEqsQ24af 17J4dIf8Kbr4Xui6x2rukiQ/w96N9BL+p5lfWnP4zmDWOAjA9gUcHUElP1lII/bp klNdHOy0Tqo= =ocLP -----END PGP SIGNATURE----- --Apple-Mail=_BBD29890-D3AB-4597-8701-FC64BF565A91-- From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 07:17:58 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 22256F98 for ; Mon, 11 Aug 2014 07:17:58 +0000 (UTC) Received: from mail-n.franken.de (drew.ipv6.franken.de [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "mail-n.franken.de", Issuer "Thawte DV SSL CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id D11A328B6 for ; Mon, 11 Aug 2014 07:17:57 +0000 (UTC) Received: from [192.168.1.200] (p548195BE.dip0.t-ipconnect.de [84.129.149.190]) (Authenticated sender: macmic) by mail-n.franken.de (Postfix) with ESMTP id 6B7321C0E96E9; Mon, 11 Aug 2014 09:17:52 +0200 (CEST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: A problem on TCP in High RTT Environment. 
From: Michael Tuexen In-Reply-To: <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> Date: Mon, 11 Aug 2014 09:17:51 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <0CF85443-26AC-4931-9D00-3396C18C7690@lurchi.franken.de> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> To: "Eggert, Lars" X-Mailer: Apple Mail (2.1878.6) Cc: "freebsd-net@freebsd.org" , John-Mark Gurney , Niu Zhixiong , Bill Yuan X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 07:17:58 -0000 On 11 Aug 2014, at 09:08, Eggert, Lars wrote: > Hi, > > On 2014-8-10, at 5:48, Niu Zhixiong wrote: >> I am using Intel I350-T4 NIC. > > igb driver? > > I've been having weird issues with this driver under 10-RELEASE, too. On one machine, I had to limit hw.igb.num_queues=2 in order to get any sort of useful connectivity. On another machine, I had to severely bump kern.ipc.nmbclusters & friends. I'm not sure this is the issue here, since SCTP seems to be working OK, but I'm not trusting igb NICs at the moment. Was there any suspicious output provided by netstat -m when the problems occur? 
Best regards Michael > > Lars From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 07:31:20 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8699A2AA for ; Mon, 11 Aug 2014 07:31:20 +0000 (UTC) Received: from mail-qg0-x22a.google.com (mail-qg0-x22a.google.com [IPv6:2607:f8b0:400d:c04::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3DCE429D8 for ; Mon, 11 Aug 2014 07:31:20 +0000 (UTC) Received: by mail-qg0-f42.google.com with SMTP id j5so8109562qga.15 for ; Mon, 11 Aug 2014 00:31:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=Emqxi1U+S3jlxmgswSR5DHUHVDj0MuIiYpEBTsdlDh8=; b=LVcMjUwm97ddRLalaBlVWek/4Obs8dTzsWozjDZ3sDR5Y5bQlMMiWs/+ZIqLlSDb2l bgqg04a4jSq+N6bz4tbsT/LjHBcH70K1Xhet+BWZLezHHpDLWrjVlbfA4FdbgTrfXP2I IGvYarQl5lqAFfLNHtLMzU4Nl9TNG6uh08e4yFxd2mg7csiKEy5/9WroAM7dmIKRWT7Z Ni7nrq4SNla1PArtIuaWvmutBWk8aVugbwfdhKIrtrsb/6OuWiMZ6qM2xfPXoM4MQp9M LOQ6bOE++wQaWrI3jLhTK8gcmrrKlgKlpf3XNWZ+nFUoEa4wqKlVO52PBFsbftZ+LUUu rL/w== MIME-Version: 1.0 X-Received: by 10.140.41.38 with SMTP id y35mr44554655qgy.69.1407742279326; Mon, 11 Aug 2014 00:31:19 -0700 (PDT) Received: by 10.224.65.65 with HTTP; Mon, 11 Aug 2014 00:31:19 -0700 (PDT) In-Reply-To: <0CF85443-26AC-4931-9D00-3396C18C7690@lurchi.franken.de> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> 
<17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> <0CF85443-26AC-4931-9D00-3396C18C7690@lurchi.franken.de> Date: Mon, 11 Aug 2014 15:31:19 +0800 Message-ID: Subject: Re: A problem on TCP in High RTT Environment. From: Niu Zhixiong To: Michael Tuexen Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18 Cc: "freebsd-net@freebsd.org" , John-Mark Gurney , "Eggert, Lars" , Bill Yuan X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 07:31:20 -0000
During TCP benchmark.
root@Freetest0: # netstat -m
35806/2129/37935 mbufs in use (current/cache/total)
35251/1037/36288/1011638 mbuf clusters in use (current/cache/total/max)
35251/1030 mbuf+clusters out of packet secondary zone in use (current/cache)
0/6/6/505818 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/149872 9k jumbo clusters in use (current/cache/total/max)
0/0/0/84303 16k jumbo clusters in use (current/cache/total/max)
79481K/2630K/82111K bytes allocated to network (current/cache/total)
15/18307/31 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
During SCTP benchmark
36131/1804/37935 mbufs in use (current/cache/total)
35394/894/36288/1011638 mbuf clusters in use (current/cache/total/max)
35394/887 mbuf+clusters out of packet secondary zone in use (current/cache)
0/6/6/505818 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/149872 9k jumbo clusters in use (current/cache/total/max)
0/0/0/84303 16k jumbo clusters in use (current/cache/total/max)
79950K/2263K/82213K bytes allocated to network (current/cache/total)
15/18307/31 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
Regards, Niu Zhixiong －－－－－－－－－－－－－－－ kaiaixi@gmail.com On Mon, Aug 11, 2014 at 3:17 PM, Michael Tuexen < Michael.Tuexen@lurchi.franken.de> wrote: > On 11 Aug 2014, at 09:08, Eggert, Lars wrote: > > > Hi, > > > > On 2014-8-10, at 5:48, Niu Zhixiong wrote: > >> I am using Intel I350-T4 NIC. > > > > igb driver? > > > > I've been having weird issues with this driver under 10-RELEASE, too. On > one machine, I had to limit hw.igb.num_queues=2 in order to get any sort of > useful connectivity. On another machine, I had to severely bump > kern.ipc.nmbclusters & friends. I'm not sure this is the issue here, since > SCTP seems to be working OK, but I'm not trusting igb NICs at the moment. > Was there any suspicious output provided by netstat -m when the problems > occur? 
> > Best regards > Michael > > > > Lars > > From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 07:32:03 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 542AB340 for ; Mon, 11 Aug 2014 07:32:03 +0000 (UTC) Received: from mail-qc0-x22e.google.com (mail-qc0-x22e.google.com [IPv6:2607:f8b0:400d:c01::22e]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0C43B2A6C for ; Mon, 11 Aug 2014 07:32:02 +0000 (UTC) Received: by mail-qc0-f174.google.com with SMTP id l6so1276959qcy.19 for ; Mon, 11 Aug 2014 00:32:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=dVyiNqOPXmvK5L5n1Bfbz+0QWpl7Zn2CeASi9k5vefM=; b=jCr2yiCccO6mpdI07cGy6lw1H0NcsVmSrBxSC1IZlyv2nGhWzZc+8n/gKNybjOGRiD dJH4nCKDxiKAH4r0a1PjmtM7XfT/X2tCrIof9GoDwuK+liwE/svJmPvGPUmE3EUm5ATv VF5ERABQQBNdUA9IH35K+CAKvv7oZWpiatDG69ZQfUbCmpgzCgYvO/csulhqzE5WiNyt CnrwW9AwxDD1BeAiXEU4vBs4XXA/zuxwafSr2Fb/f1HxnF5PDcJ04eflltbVYMXeOw3+ ScyoxqQzLyF5xxsqgQrhLnB/qwusXKqDv7om7tk7ACxtxnXIuZrdafb1pkgxwBcNEU2c JjiQ== MIME-Version: 1.0 X-Received: by 10.229.137.131 with SMTP id w3mr60120853qct.23.1407742322120; Mon, 11 Aug 2014 00:32:02 -0700 (PDT) Received: by 10.224.65.65 with HTTP; Mon, 11 Aug 2014 00:32:01 -0700 (PDT) In-Reply-To: <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> 
<17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> Date: Mon, 11 Aug 2014 15:32:01 +0800 Message-ID: Subject: Re: A problem on TCP in High RTT Environment. From: Niu Zhixiong To: "Eggert, Lars" Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18 Cc: Michael Tuexen , John-Mark Gurney , Bill Yuan , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 07:32:03 -0000 Thanks for the reminder. I tried hw.igb.num_queues=2 just now, but the throughput is still slow. When I ran the same tests in my virtual-machine-based environment (using Virtio), the result was similar (SCTP throughput is 2x that of TCP). Regards, Niu Zhixiong －－－－－－－－－－－－－－－ kaiaixi@gmail.com On Mon, Aug 11, 2014 at 3:08 PM, Eggert, Lars wrote: > Hi, > > On 2014-8-10, at 5:48, Niu Zhixiong wrote: > > I am using Intel I350-T4 NIC. > > igb driver? > > I've been having weird issues with this driver under 10-RELEASE, too. On > one machine, I had to limit hw.igb.num_queues=2 in order to get any sort of > useful connectivity. On another machine, I had to severely bump > kern.ipc.nmbclusters & friends. I'm not sure this is the issue here, since > SCTP seems to be working OK, but I'm not trusting igb NICs at the moment. 
> > Lars > From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 08:00:12 2014 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CDAF5907 for ; Mon, 11 Aug 2014 08:00:12 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A2E282D42 for ; Mon, 11 Aug 2014 08:00:12 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.8/8.14.8) with ESMTP id s7B80CaR091800 for ; Mon, 11 Aug 2014 08:00:12 GMT (envelope-from bugzilla-noreply@freebsd.org) Message-Id: <201408110800.s7B80CaR091800@kenobi.freebsd.org> From: bugzilla-noreply@freebsd.org To: freebsd-net@FreeBSD.org Subject: [Bugzilla] Commit Needs MFC MIME-Version: 1.0 X-Bugzilla-Type: whine X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated Date: Mon, 11 Aug 2014 08:00:12 +0000 Content-Type: text/plain X-Content-Filtered-By: Mailman/MimeDel 2.1.18 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 08:00:12 -0000 Hi, You have a bug in the "Needs MFC" state which has not been touched in 7 or more days. This email serves as a reminder that you may want to MFC this bug or mark it as completed. In the event you have a longer MFC timeout you may update this bug with a comment and I won't remind you again for 7 days. This reminder is only sent on Mondays. Please file a bug about concerns you may have. This search was scheduled by eadler@FreeBSD.org. 
(1 bugs) Bug 183659: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183659 Severity: Affects Only Me Priority: Normal Hardware: Any Assignee: freebsd-net@FreeBSD.org Status: Needs MFC Resolution: Summary: [tcp] TCP stack lock contention with short-lived connections From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 09:48:23 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6423048A; Mon, 11 Aug 2014 09:48:23 +0000 (UTC) Received: from na01-bn1-obe.outbound.protection.outlook.com (mail-bn1lp0145.outbound.protection.outlook.com [207.46.163.145]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "MSIT Machine Auth CA 2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7B2822AD7; Mon, 11 Aug 2014 09:48:21 +0000 (UTC) Received: from BY1PR0301MB0902.namprd03.prod.outlook.com (25.160.195.141) by BY1PR0301MB0903.namprd03.prod.outlook.com (25.160.195.142) with Microsoft SMTP Server (TLS) id 15.0.1005.10; Mon, 11 Aug 2014 09:48:12 +0000 Received: from BY1PR0301MB0902.namprd03.prod.outlook.com ([25.160.195.141]) by BY1PR0301MB0902.namprd03.prod.outlook.com ([25.160.195.141]) with mapi id 15.00.1005.008; Mon, 11 Aug 2014 09:48:12 +0000 From: Wei Hu To: Adrian Chadd Subject: RE: vRSS support on FreeBSD Thread-Topic: vRSS support on FreeBSD Thread-Index: AQHPs0Bb+mxjwe28tEGUfZYiw4SKJJvLJUyg Date: Mon, 11 Aug 2014 09:48:12 +0000 Message-ID: References: <184b69414bd246eeacc0d4234a730f2f@BY1PR0301MB0902.namprd03.prod.outlook.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [167.220.232.169] x-microsoft-antispam: BCL:0;PCL:0;RULEID:;UriScan:; x-forefront-prvs: 03008837BD x-forefront-antispam-report: SFV:NSPM; 
SFS:(6009001)(51914003)(66654002)(57704003)(24454002)(13464003)(377454003)(51704005)(189002)(199002)(101416001)(85306004)(110136001)(77096002)(99396002)(86362001)(46102001)(76576001)(107046002)(105586002)(95666004)(106116001)(106356001)(33646002)(21056001)(74316001)(80022001)(99286002)(555904002)(83072002)(85852003)(87936001)(2656002)(79102001)(92566001)(76482001)(77982001)(66066001)(76176999)(86612001)(83322001)(20776003)(50986999)(54356999)(19580405001)(81342001)(81542001)(74502001)(31966008)(19580395003)(21314002)(24736002)(108616003); DIR:OUT; SFP:; SCL:1; SRVR:BY1PR0301MB0903; H:BY1PR0301MB0902.namprd03.prod.outlook.com; FPR:; MLV:sfv; PTR:InfoNoRecords; MX:1; LANG:en; Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 MIME-Version: 1.0 X-OriginatorOrg: microsoft.onmicrosoft.com Cc: "freebsd-net@freebsd.org" , "d@delphij.net" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 09:48:23 -0000 Q0MgZnJlZWJzZC1uZXRAIGZvciB3aWRlciBkaXNjdXNzaW9uLg0KDQpIaSBBZHJpYW4sDQoNCk1h bnkgdGhhbmtzIGZvciB0aGUgZXhwbGFuYXRpb24uICBJIGNoZWNrZWQgdGhlIGlmX2lnYi5jICBh bmQgZm91bmQgdGhlIGZsb3dpZCBmaWVsZCB3YXMgc2V0IGluIHRoZSBSWCBzaWRlIGluIGlnYl9y eGVvZigpOg0KDQpJZ2Jfcnhlb2YoKQ0Kew0KIC4uLg0KI2lmZGVmICBSU1MNCiAgICAgICAgICAg ICAgICAgICAgICAgIC8qIFhYWCBzZXQgZmxvd3R5cGUgb25jZSB0aGlzIHdvcmtzIHJpZ2h0ICov DQogICAgICAgICAgICAgICAgICAgICAgICByeHItPmZtcC0+bV9wa3RoZHIuZmxvd2lkID0NCiAg ICAgICAgICAgICAgICAgICAgICAgICAgICBsZTMydG9oKGN1ci0+d2IubG93ZXIuaGlfZHdvcmQu cnNzKTsNCiAgICAgICAgICAgICAgICAgICAgICAgIHJ4ci0+Zm1wLT5tX2ZsYWdzIHw9IE1fRkxP V0lEOw0KIC4uLg0KfQ0KDQpJIGhhdmUgdHdvIHF1ZXN0aW9ucyByZWdhcmRpbmcgdGhpcy4gDQoN CjEuIElzIHRoZSBSU1MgaGFzaCB2YWx1ZSBzdG9yZWQgaW4gY3VyLT53Yi5sb3dlci5oaV9kd29y ZC5yc3Mgc2V0IGJ5IHRoZSBOSUMgaGFyZHdhcmU/DQoyLiBTbyB0aGUgaGFzaCB2YWx1ZSBhbmQg 
bV9mbGFncyBhcmUgc3RvcmVkIGluIHRoZSBtYnVmIHJlbGF0ZWQgdG8gdGhlIHJlY2VpdmVkIHBh Y2tldCBvbiB0aGUgcnggc2lkZShsZ2Jfcnhlb2YoKSkuIEJ1dCB3ZSBjaGVjayB0aGUgaGFzaCB2 YWx1ZSBhbmQgbV9mbGFncyBpbiBtYnVmIHJlbGF0ZWQgdG8gdGhlIHNlbmQgcGFja2V0IG9uIHRo ZSB0eCBzaWRlIChpbiBpZ2JfbXFfc3RhcnQoKSkuIERvZXMgdGhlIGtlcm5lbCByZS11c2UgdGhl IHNhbWUgbWJ1ZiBmb3IgdHg/IElmIHNvLCBob3cgZG9lcyBpdCBrbm93IGZvciB0aGUgc2FtZSBu ZXR3b3JrIHN0cmVhbSBpdCBzaG91bGQgdXNlIHRoZSBzYW1lIG1idWYgZ290IGZyb20gdGhlIHJ4 IGZvciBwYWNrZXQgc2VuZGluZz8gSWYgbm90LCBob3cgZG9lcyB0aGUga2VybmVsIHByZXNlcnZl IHRoZSBzYW1lIGhhc2ggdmFsdWUgYWNyb3NzIHRoZSByeCBtYnVmIGFuZCB0eCBtYnVmIGZvciBz YW1lIG5ldHdvcmsgc3RyZWFtPyBUaGlzIHNlZW1zIHF1aXRlIG1hZ2ljYWwgdG8gbWUuDQoNCkZv ciB0aGUgSHlwZXItViBjYXNlLCB0aGUgaG9zdCBjb250cm9scyB3aGljaCB2Q1BVIGl0IHdhbnRz IHRvIGludGVycnVwdC4gQW5kIHRoZSBydWxlIGNhbiBjaGFuZ2UgZHluYW1pY2FsbHkgYmFzZWQg b24gdGhlIGxvYWQuIEZvciBhIG5vbi1idXN5IFZNLCBob3N0IHdpbGwgc2VuZCBtb3N0IHBhY2tl dHMgdG8gc2FtZSB2Q1BVIGZvciBwb3dlciBzYXZpbmcgcHVycG9zZS4gRm9yIGEgYnVzeSBWTSwg aG9zdCB3aWxsIGRpc3RyaWJ1dGUgdGhlIHBhY2tldHMgZXZlbmx5IGFjcm9zcyBhbGwgdkNQVXMu IFRoaXMgbWVhbnMgaG9zdCBjb3VsZCBjaGFuZ2UgdGhlIFJTUyBidWNrZXQgbWFwcGluZyBkeW5h bWljYWxseS4gSHlwZXItViBkb2VzIHRoaXMgYnkgc2VuZGluZyBhIG1hcHBpbmcgdGFibGUgdG8g Vk0gd2hlbmV2ZXIgdGhlIGl0IG5lZWRzIHVwZGF0ZS4gVGhpcyBhbHNvIG1lYW5zIHdlIGNhbm5v dCB1c2UgRnJlZUJTRCdzIG93biBidWNrZXQgbWFwcGluZyB3aGljaCBJIGJlbGlldmUgaXMgZml4 ZWQuIEFsc28gSHlwZXItViB1c2UgaXRzIG93biBoYXNoIGtleS4gU28gZG8geW91IHRoaW5rIGl0 IGlzIHBvc3NpYmxlIHdlIHN0aWxsIHVzZSB0aGUgZXhpc2l0aW5nIFJTUyBpbmZyYXN0cnVjdHVy ZSBidWlsdCBpbiBGcmVlQlNEIGluIHRoaXMgcHVycG9zZT8NCg0KVGhhbmtzIHNvIG11Y2gsDQpX ZWkNCg0KDQoNCi0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQpGcm9tOiBhZHJpYW4uY2hhZGRA Z21haWwuY29tIFttYWlsdG86YWRyaWFuLmNoYWRkQGdtYWlsLmNvbV0gT24gQmVoYWxmIE9mIEFk cmlhbiBDaGFkZA0KU2VudDogU2F0dXJkYXksIEF1Z3VzdCA5LCAyMDE0IDM6MzkgQU0NClRvOiBX ZWkgSHUNCkNjOiBkQGRlbHBoaWoubmV0DQpTdWJqZWN0OiBSZTogdlJTUyBzdXBwb3J0IG9uIEZy 
ZWVCU0QNCg0KSGkhDQoNCk9uIDggQXVndXN0IDIwMTQgMDQ6NDMsIFdlaSBIdSA8d2VoQG1pY3Jv c29mdC5jb20+IHdyb3RlOg0KPiBIaSBBZHJpYW4sDQo+DQo+IE15IG5hbWUgaXMgV2VpIEh1LiBJ IHdvcmsgZm9yIE1pY3Jvc29mdCBPU1RDIChPcGVuIFNvdXJjZSBUZWNobm9sb2d5IENlbnRlcikg aW4gU2hhbmdoYWksIENoaW5hLg0KPg0KPiBNaWNyb3NvZnQgaXMgaW52ZXN0aW5nIG9uIEZyZWVC U0QgcnVubmluZyBvbiBpdHMgSHlwZXItViB2aXJ0dWFsaXphdGlvbiBlbnZpcm9ubWVudC4gQXMg dGhlIHJlc3VsdCwgSSBhbSB0cnlpbmcgdG8gYnJpbmcgdGhlIHBlcmZvcm1hbmNlIG9mIEZyZWVC U0Qgb24gSHlwZXItViBpbiBwYXIgd2l0aCBvdGhlciBndWVzdCBvcGVyYXRpbmcgc3lzdGVtcyAo c3VjaCBhcyBMaW51eCBhbmQgV2luZG93cykuIE9uZSBvZiB0aGUga2V5IG5ldHdvcmsgZmVhdHVy ZXMgSSBhbSB0cnlpbmcgdG8gYWRkIGlzIHZSU1MgKFZpcnR1YWwgUmVjZWl2ZSBTaWRlIFNjYWxp bmcpIGludG8gb3VyIGV4aXN0aW5nIG5ldHZzYyBkcml2ZXIgb24gRnJlZUJTRC4NCg0KQ29vbCEN Cg0KPiBDdXJyZW50bHkgd2UgYWxyZWFkeSBoYXZlIE5JQyBkcml2ZXIgY2FsbGVkIG5ldHZzYyAo dW5kZXIgc3lzL2Rldi9oeXBlcnYvbmV0dnNjKSB3aGljaCBkcml2ZXMgYSBzeW50aGV0aWMgdmly dHVhbCBOSUMgZGV2aWNlIHByb3ZpZGVkIGJ5IEh5cGVyLVYuIFRoZSBkcml2ZXIgb25seSBzdXBw b3J0cyBvbmUgSC9XIG5ldHdvcmsgcXVldWUgYXMgb2YgRnJlZUJTRCAxMC4gSSBhbSByZXNwb25z aWJsZSBvZiBhZGRpbmcgYm90aCBtdWx0aXF1ZXVlIGFuZCBSU1Mgc3VwcG9ydCBpbnRvIHRoaXMg ZHJpdmVyLiBYaW4gTGkgdG9sZCBtZSB0aGF0IHlvdSBoYXZlIGRvbmUgd29ya3Mgb24gUlNTLiBJ IHdvbmRlciBpZiB5b3UgY2FuIGhlbHAgbWUgd2l0aCBzb21lIHF1ZXN0aW9ucyBJIGFtIGhhdmlu ZyBjdXJyZW50bHkgd2l0aCByZWdhcmQgdG8gdGhlc2UgdHdvIGZlYXR1cmVzLg0KDQpJIGNhbiB0 cnkhDQoNCj4gMS4gVHggbXVsdGlxdWV1ZSBzdXBwb3J0LiBJIGFtIGxvb2tpbmcgYXQgYW4gZXhp c3RpbmcgZHJpdmVyIGlmX2lnYi5jIGZvciBzb21lIGNsdWVzLiBJdCBsb29rcyB0byBtZSBJIGp1 c3QgbmVlZCB0byBzZXQgYSBwcm9wZXIgbXVsdGlxdWV1ZSBhd2FyZSBmdW5jdGlvbiB0byBpZnAt PmlmX3RyYW5zaXQuIEluc2lkZSB0aGlzIGZ1bmN0aW9uIEkgY2FuIHNlbGVjdCB3aGljaCB0eCBx dWV1ZSBJIG5lZWQgdG8gc2VuZCB0aGUgcGFja2V0LiBJcyB0aGlzIGFsbCBJIG5lZWQgdG8gZG8g Zm9yIHRoZSB0eCBxdWV1ZT8gSG93IGRvIEkgbWFrZSBzdXJlIHRoaXMgcHJvY2VkdXJlIGFsc28g aGFwcGVucyBvbiB0aGUgQ1BVIHdoYXQgdGhpcyB0eCBxdWV1ZSBiaW5kcyB0bz8gSSB3YW50IHRv 
IGRpc3RyaWJ1dGUgdGhlIHNlbmQgd29ya2xvYWQgb24gYWxsIGF2YWlsYWJsZSBDUFVzLiBEb2Vz IHRoZSBrZXJuZWwgYXV0b21hdGljYWxseSBzZWxlY3QgYSBwcm9wZXIgQ1BVIHRvIGNhbGwgaWZw LT5pZl90cmFuc2l0PyBPciBJIGhhdmUgdG8gZG8gdGhlIGRpc3RyaWJ1dGlvbiBieSBkcml2ZXIg aXRzZWxmIGluc2lkZSBpZnAtPmlmX3RyYW5zaXQgcm91dGluZT8NCg0KU28gdGhlIG5ldHdvcmsg c3RhY2sgZG9lc24ndCBlbmZvcmNlIGFmZmluaXR5IC0gaXQncyBqdXN0IGRvaW5nIHBhcmFsbGVs aXNtLiBUaGUgUlNTIHN0dWZmIGknbSB3b3JraW5nIHdpdGggaXMgdHJ5aW5nIHRvIGVuZm9yY2Ug YWZmaW5pdHkuDQoNClRoZSBpZl90cmFuc21pdCgpIG1ldGhvZCBmb3IgYSBnaXZlbiBtYnVmIGNh biBiZSBjYWxsZWQgb24gYW55IENQVSBhbmQgdGhleSBjYW4gYmUgY2FsbGVkIGZyb20gbXVsdGlw bGUgQ1BVcyBhdCB0aGUgc2FtZSB0aW1lLg0KDQpFYWNoIGRyaXZlciBoYXMgYSBkaWZmZXJlbnQg d2F5IG9mIG1hcHBpbmcgYW4gbWJ1ZiB0byBhIGdpdmVuIGRlc3RpbmF0aW9uIFRYIHF1ZXVlLiBU aGUgbmFpdmUgd2F5IGlzIGp1c3QgdG8gcXVldWUgdGhlIG1idWYgaW50byB0aGUgVFggcXVldWUg b2YgdGhlIGN1cnJlbnQgQ1BVLiBUaGUgIm5pY2VyIiB3YXkgaXMgdG8gaGFzaCB0aGUgbV9wa3Ro ZHIuZmxvd2lkIHZhbHVlIHRvIGNob29zZSBhIFRYIHF1ZXVlLiBUaGUgUlNTIHdheSBpcyB0byBz ZWUgaWYgbV9wa3RoZHIuZmxvd2lkIGlzIGFuIFJTUyBoYXNoIGFuZCBpZiBzbywgY2hvb3NlIHRo ZSBkZXN0aW5hdGlvbiBUWCBxdWV1ZSBiYXNlZCBvbiB0aGUgUlNTIGJ1Y2tldCBmb3IgdGhlIGdp dmVuIGhhc2guDQoNCkl0J2xsIGJlIHVwIHRvIHRoZSBoaWdoZXIgbGF5ZXJzIChpZSwgbmV0d29y ayBzdGFjayBhbmQgdXNlcmxhbmQpIHRvIGRpc3RyaWJ1dGUgdHJhbnNtaXQgd29yayB0byBhbGwg Q1BVcy4NCg0KU28geWVzLCB5b3UgaGF2ZSB0byBkbyB0aGUgZGlzdHJpYnV0aW9uIGluc2lkZSB0 aGUgZHJpdmVyIHlvdXJzZWxmIGZvciBub3cuIFlvdSBzYXcgd2hhdCBJIGRpZCBmb3IgaXhnYmUv aWdiIGZvciBSU1MgdHJhbnNtaXQuDQoNCj4gMi4gUnggbXVsdGlxdWV1ZSBzdXBwb3J0LiBUaGUg cmVjZWl2ZWQgcGFja2V0IGNvdWxkIGVuZCB1cCBvbiBkaWZmZXJlbnQgcnggcXVldWVzLiBFYWNo IHJ4IHF1ZXVlIGlzIGJpbmQgdG8gYSBDUFUuIFNvIGRlcGVuZGluZyBvbiB0aGUgd2hpY2ggcXVl dWUgdGhlIHBhY2tldCBhcnJpdmVzLCBpdCB3aWxsIGJlIHByb2Nlc3NlZCBvbiBhIGRpZmZlcmVu dCBDUFUuIE15IHF1ZXN0aW9uIGlzOiBkbyBJIG5lZWQgdG8gc2V0IGFueXRoaW5nIGluIHRoZSBy ZWNlaXZlIHBhY2tldCB0byBpbmZvcm0gdGhlIHVwcGVyIGxheWVyIHdoaWNoIHF1ZXVlIHRoZSBw 
YWNrZXQgd2FzIHJlY2VpdmVkPyBJZiBpdCBpcyBSU1MgZW5hYmxlZCwgZG8gSSBuZWVkIHRvIGlu Zm9ybWF0aW9uIHRoZSB1cHBlciBsYXllciBJUCBmaWVsZHMgd2VyZSBoYXNoZWQgKElQLCBUQ1As IGV0YykgdG8gc2VsZWN0IHRoZSBxdWV1ZSBzbyB0aGUgdXBwZXIgbGF5ZXIga25vd3Mgd2hpY2gg cXVldWUgdGhlIHJlc3BvbnNlIHNob3VsZCBiZSBzZW5kIHRvPw0KDQpUaGlzIGlzIHdoZXJlIGZs b3dpZCBjb21lcyBpbnRvIGl0Lg0KDQpJbml0aWFsbHksIGZsb3dpZCB3YXMganVzdCBhbiBvcGFx dWUgdG9rZW4gcHJvdmlkZWQgYnkgdGhlIE5JQyBkcml2ZXIgdG8gcmVwcmVzZW50IHdoaWNoIHF1 ZXVlIHRoZSBnaXZlbiBwYWNrZXQgY2FtZSBpbiBvbi4gSXQgd2FzIHRoZW4gcHJvcGFnYXRlZCB0 aHJvdWdob3V0IHRoZSBuZXR3b3JrIHN0YWNrIHNvIHRyYW5zbWl0IHdvdWxkIGFsc28gb2NjdXIg dXNpbmcgdGhlIHNhbWUgZmxvd2lkLiBUaGlzIHdhcyBkb25lIHB1cmVseSB0byBrZWVwIHBhY2tl dHMgaW4gYSBmbG93IGluLW9yZGVyIG9uIHRoZSBzYW1lIHF1ZXVlIHJhdGhlciB0aGFuIGhhdmlu ZyB0aGVtIGdvIG91dCB0byBkaWZmZXJlbnQgQ1BVcyBiYXNlZCBvbiB3aGljaCBDUFUgdGhlIHNj aGVkdWxlciBkZWNpZGVkIHRvIHJ1biB0aGluZ3Mgb24uDQoNClRoZSBSU1Mgd29yayBtdXRhdGVz IHRoYXQgYSBsaXR0bGUgc28gdGhlIGZsb3dpZCBfY2FuXyBiZSBvbmUgb2YgdGhlIFJTUyBoYXNo dHlwZXMuIElmIGl0IGlzLCB0aGVuIHRoZSBkcml2ZXIgc2hvdWxkIHRhZyB0aGUgbWJ1ZiB3aXRo IHRoZSBSU1MgaGFzaCB2YWx1ZSBpbiBmbG93aWQgYW5kIHNldCB0aGUgbWJ1ZiBoYXNoIHR5cGUg dG8gdGhlIHJlbGV2YW50IFJTUyBoYXNoIHR5cGUuDQoNClNvIHdoYXQgeW91IG5lZWQgdG8gZG8g aXM6DQoNCiogY3JlYXRlIG9uZSByeCBxdWV1ZSBwZXIgUlNTIGJ1Y2tldDsNCiogZ2V0IHRoZSBS U1MgYnVja2V0IC0+IENQVSBtYXBwaW5nIGZyb20gaW5fcnNzLmM7DQoqIENQVSBwaW4gdGhpbmdz IGFwcHJvcHJpYXRlbHk7DQoqIG1ha2Ugc3VyZSB0aGUgdHJhZmZmaWMgZm9yIGEgZ2l2ZW4gUlNT IGhhc2ggLT4gUlNTIGJ1Y2tldCBlbmRzIHVwIGluIHRoZSByaWdodCBSWCBxdWV1ZTsNCiogbWFr ZSBzdXJlIG1idWYgaGFzaCBmbG93aWQgc2V0LCB0aGUgaGFzaCB0eXBlIHNldCwgYW5kIHRoZSBN X0ZMT1dJRCBmbGFnIHNldC4NCg0KPiAzLiBSU1MgaGFzaC4gSSBmb3VuZCB0aGUgZmlsZSBzeXMv bmV0aW5ldC9pbl9yc3MuYyBjb250YWlucyBSU1Mgc3VwcG9ydCBhbHJlYWR5IGluIEZyZWVCU0Qg aGVhZC4gQnV0IEkgZG9uJ3Qga25vdyBob3cgdG8gdXNlIGl0LiBJIGNoZWNrZWQgdGhlIHNhbWUg ZHJpdmVyIGlmX2lnYi5jLiBJdCBoYXMgZm9sbG93aW5nIGNvZGU6DQoNCj4gc3RhdGljIGludA0K 
PiBpZ2JfbXFfc3RhcnQoc3RydWN0IGlmbmV0ICppZnAsIHN0cnVjdCBtYnVmICptKSB7DQo+ICAg Li4uDQo+ICNpZmRlZiAgUlNTDQo+ICAgICAgICAgdWludDMyX3QgICAgICAgICAgICAgICAgYnVj a2V0X2lkOw0KPiAjZW5kaWYNCj4NCj4gICAgICAgICAvKiBXaGljaCBxdWV1ZSB0byB1c2UgKi8N Cj4gICAgICAgICAvKg0KPiAgICAgICAgICAqIFdoZW4gZG9pbmcgUlNTLCBtYXAgaXQgdG8gdGhl IHNhbWUgb3V0Ym91bmQgcXVldWUNCj4gICAgICAgICAgKiBhcyB0aGUgaW5jb21pbmcgZmxvdyB3 b3VsZCBiZSBtYXBwZWQgdG8uDQo+ICAgICAgICAgICoNCj4gICAgICAgICAgKiBJZiBldmVyeXRo aW5nIGlzIHNldHVwIGNvcnJlY3RseSwgaXQgc2hvdWxkIGJlIHRoZQ0KPiAgICAgICAgICAqIHNh bWUgYnVja2V0IHRoYXQgdGhlIGN1cnJlbnQgQ1BVIHdlJ3JlIG9uIGlzLg0KPiAgICAgICAgICAq Lw0KPiAgICAgICAgIGlmICgobS0+bV9mbGFncyAmIE1fRkxPV0lEKSAhPSAwKSB7ICNpZmRlZiAg UlNTDQo+ICAgICAgICAgICAgICAgICBpZiAocnNzX2hhc2gyYnVja2V0KG0tPm1fcGt0aGRyLmZs b3dpZCwNCj4gICAgICAgICAgICAgICAgICAgICBNX0hBU0hUWVBFX0dFVChtKSwgJmJ1Y2tldF9p ZCkgPT0gMCkgew0KPiAgICAgICAgICAgICAgICAgICAgICAgICAvKiBYWFggVE9ETzogc3BpdCBv dXQgc29tZXRoaW5nIGlmIGJ1Y2tldF9pZCA+IA0KPiBudW1fcXVldWUgcz8gKi8NCj4gICAgICAg ICAgICAgICAgICAgICAgICAgaSA9IGJ1Y2tldF9pZCAlIGFkYXB0ZXItPm51bV9xdWV1ZXM7DQo+ ICAgICAgICAgICAgICAgICB9IGVsc2Ugew0KPiAjZW5kaWYNCj4gICAgICAgICAgICAgICAgICAg ICAgICAgaSA9IG0tPm1fcGt0aGRyLmZsb3dpZCAlIGFkYXB0ZXItPm51bV9xdWV1ZXM7IA0KPiAu Li4NCj4gfQ0KPg0KPiBXaGF0IGV4YWN0bHkgaXMgdGhlIG1fcGt0aGRyLmZsb3dpZCBmaWVsZD8g RG9lcyBpdCBhbHJlYWR5IGNvbnRhaW4gdGhlIElQIGFuZCBUQ1AgaGFzaCBvZiB0aGUgc2VuZGlu ZyBwYWNrZXQ/IFdoZW4gd2FzIGl0IHNldCBhbmQgTV9IQVNIVFlQRV9HRVQobSkgcmV0dXJuaW5n IHByb3BlciBoYXNoIHR5cGU/DQoNClRoaXMgaXMgZm9yIHRyYW5zbWl0LiBUaGlzIGlzIGRvbmUg dG8gZW5zdXJlIHRoYXQgd2hlbiB0cmFuc21pdHRpbmcsIHRyYWZmaWMgZ29pbmcgdG8gYSBnaXZl biBSU1MgYnVja2V0IGVuZHMgdXAgaW4gdGhlIHJpZ2h0IGRlc3RpbmF0aW9uIFJTUyBidWNrZXQg LyBUWCBxdWV1ZS4gSWYgaXQncyBhbGwgc2V0dXAgY29ycmVjdGx5IGFuZCB1c2VybGFuZCBrbm93 cyBob3cgdG8gQ1BVIHBpbiB0aGluZ3MsIGl0J2xsIGJlIHNjaGVkdWxlZCB0byB0aGUgc2FtZSBD UFUgdGhlIHVzZXJsYW5kDQp0aHJlYWQocykgYXJlIHJ1bm5pbmcgb24uIElmIGl0J3Mgbm90IChl 
ZyBhIG5vbi1SU1MtYXdhcmUgcHJvZ3JhbSkgdGhlbiBpdCdsbCBzY2hlZHVsZSB0aGUgdHJhbnNt aXQgdG8gZ28gb3V0IG9uIHRoZSBjb3JyZWN0LCBub24tQ1BVLWxvY2FsIGRlc3RpbmF0aW9uIFRY IHF1ZXVlIHNvIHBhY2tldHMgYXJlIGtlcHQgaW4tb3JkZXIuDQoNCkkgaG9wZSB0aGF0IGhlbHBz IQ0KDQpJZiB5b3UnZCBsaWtlIHRvIGFzayBtb3JlIHF1ZXN0aW9ucyBwbGVhc2UgQ0Mgb25lIG9m IHRoZSBtYWlsaW5nIGxpc3RzIGxpa2UgZnJlZWJzZC1uZXRALiBJJ2QgbGlrZSBvdGhlcnMgdG8g c2VlIHRoZSBhbnN3ZXJzIGFuZCBwYXJ0aWNpcGF0ZSBpbiB0aGUgZGlzY3Vzc2lvbi4gOikNCg0K DQoNCi1hDQo= From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 11:49:27 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D5ABF43A for ; Mon, 11 Aug 2014 11:49:27 +0000 (UTC) Received: from mail-n.franken.de (drew.ipv6.franken.de [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "mail-n.franken.de", Issuer "Thawte DV SSL CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6468529CA for ; Mon, 11 Aug 2014 11:49:27 +0000 (UTC) Received: from [192.168.1.200] (p548181A9.dip0.t-ipconnect.de [84.129.129.169]) (Authenticated sender: macmic) by mail-n.franken.de (Postfix) with ESMTP id 59F9D1C0E96E9; Mon, 11 Aug 2014 13:49:23 +0200 (CEST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: A problem on TCP in High RTT Environment. 
From: Michael Tuexen In-Reply-To: Date: Mon, 11 Aug 2014 13:49:22 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> <0CF85443-26AC-4931-9D00-3396C18C7690@lurchi.franken.de> To: Niu Zhixiong X-Mailer: Apple Mail (2.1878.6) Cc: "freebsd-net@freebsd.org" , John-Mark Gurney , "Eggert, Lars" , Bill Yuan X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 11:49:28 -0000 On 11 Aug 2014, at 09:31, Niu Zhixiong wrote: > During TCP benchmark. >=20 > root@Freetest0: # netstat -m > 35806/2129/37935 mbufs in use (current/cache/total) > 35251/1037/36288/1011638 mbuf clusters in use = (current/cache/total/max) > 35251/1030 mbuf+clusters out of packet secondary zone in use = (current/cache) > 0/6/6/505818 4k (page size) jumbo clusters in use = (current/cache/total/max) > 0/0/0/149872 9k jumbo clusters in use (current/cache/total/max) > 0/0/0/84303 16k jumbo clusters in use (current/cache/total/max) > 79481K/2630K/82111K bytes allocated to network (current/cache/total) > 15/18307/31 requests for mbufs denied (mbufs/clusters/mbuf+clusters) > 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters) > 0/0/0 requests for jumbo clusters delayed (4k/9k/16k) > 0/0/0 requests for jumbo clusters denied (4k/9k/16k) > 0 requests for sfbufs denied > 0 requests for sfbufs delayed > 0 requests for I/O initiated by sendfile >=20 > During SCTP benchmark > 36131/1804/37935 mbufs in use (current/cache/total) > 35394/894/36288/1011638 mbuf clusters in use 
(current/cache/total/max) > 35394/887 mbuf+clusters out of packet secondary zone in use = (current/cache) > 0/6/6/505818 4k (page size) jumbo clusters in use = (current/cache/total/max) > 0/0/0/149872 9k jumbo clusters in use (current/cache/total/max) > 0/0/0/84303 16k jumbo clusters in use (current/cache/total/max) > 79950K/2263K/82213K bytes allocated to network (current/cache/total) > 15/18307/31 requests for mbufs denied (mbufs/clusters/mbuf+clusters) > 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters) > 0/0/0 requests for jumbo clusters delayed (4k/9k/16k) > 0/0/0 requests for jumbo clusters denied (4k/9k/16k) > 0 requests for sfbufs denied > 0 requests for sfbufs delayed > 0 requests for I/O initiated by sendfile Thanks. I was more asking Lars to have a look if the igb problems he = observed are related to mbuf shortages... I observed that a long time ago and I = think it was fixed. Not sure about 10.0, since this was on head at that = time... Best regards Michael >=20 >=20 > Regards, > Niu Zhixiong > =EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D= =EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D > kaiaixi@gmail.com >=20 >=20 > On Mon, Aug 11, 2014 at 3:17 PM, Michael Tuexen = wrote: > On 11 Aug 2014, at 09:08, Eggert, Lars wrote: >=20 > > Hi, > > > > On 2014-8-10, at 5:48, Niu Zhixiong wrote: > >> I am using Intel I350-T4 NIC. > > > > igb driver? > > > > I've been having weird issues with this driver under 10-RELEASE, = too. On one machine, I had to limit hw.igb.num_queues=3D2 in order to = get any sort of useful connectivity. On another machine, I had to = severely bump kern.ipc.nmbclusters & friends. I'm not sure this is the = issue here, since SCTP seems to be working OK, but I'm not trusting igb = NICs at the moment. > Was there any suspicious output provided by netstat -m when the = problems occur? 
>=20 > Best regards > Michael > > > > Lars >=20 >=20 From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 11:50:29 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4EF22520 for ; Mon, 11 Aug 2014 11:50:29 +0000 (UTC) Received: from mail-n.franken.de (drew.ipv6.franken.de [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "mail-n.franken.de", Issuer "Thawte DV SSL CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 08A1929F1 for ; Mon, 11 Aug 2014 11:50:29 +0000 (UTC) Received: from [192.168.1.200] (p548181A9.dip0.t-ipconnect.de [84.129.129.169]) (Authenticated sender: macmic) by mail-n.franken.de (Postfix) with ESMTP id 0C88B1C0E96E9; Mon, 11 Aug 2014 13:50:25 +0200 (CEST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: A problem on TCP in High RTT Environment. 
From: Michael Tuexen In-Reply-To: Date: Mon, 11 Aug 2014 13:50:24 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <31C0C883-A847-4746-96FA-FE3148B4BD31@lurchi.franken.de> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> To: Niu Zhixiong X-Mailer: Apple Mail (2.1878.6) Cc: "freebsd-net@freebsd.org" , John-Mark Gurney , "Eggert, Lars" , Bill Yuan X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 11:50:29 -0000 On 11 Aug 2014, at 09:32, Niu Zhixiong wrote: > Thanks for your reminding. I tried hw.igb.num_queues=3D2 just now. = But, the throughput is still slow. And When I tested same things in my = Virtual machine-based environment(use Virtio), the throughput is = similar(SCTP is 2x than TCP). So is the curve still the same or is it now constantly half of the SCTP = throughput which was almost constant for different RTTs... Best regards Michael >=20 > Regards, > Niu Zhixiong > =EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D= =EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D > kaiaixi@gmail.com >=20 >=20 > On Mon, Aug 11, 2014 at 3:08 PM, Eggert, Lars wrote: > Hi, >=20 > On 2014-8-10, at 5:48, Niu Zhixiong wrote: > > I am using Intel I350-T4 NIC. >=20 > igb driver? >=20 > I've been having weird issues with this driver under 10-RELEASE, too. = On one machine, I had to limit hw.igb.num_queues=3D2 in order to get any = sort of useful connectivity. On another machine, I had to severely bump = kern.ipc.nmbclusters & friends. 
I'm not sure this is the issue here, = since SCTP seems to be working OK, but I'm not trusting igb NICs at the = moment. >=20 > Lars >=20 From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 12:12:46 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E6559DC1 for ; Mon, 11 Aug 2014 12:12:45 +0000 (UTC) Received: from mx12.netapp.com (mx12.netapp.com [216.240.18.77]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "mx12.netapp.com", Issuer "VeriSign Class 3 International Server CA - G3" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 9B8C72D69 for ; Mon, 11 Aug 2014 12:12:45 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.01,841,1400050800"; d="asc'?scan'208";a="181245356" Received: from vmwexceht04-prd.hq.netapp.com ([10.106.77.34]) by mx12-out.netapp.com with ESMTP; 11 Aug 2014 05:12:46 -0700 Received: from HIOEXCMBX07-PRD.hq.netapp.com (10.122.105.40) by vmwexceht04-prd.hq.netapp.com (10.106.77.34) with Microsoft SMTP Server (TLS) id 14.3.123.3; Mon, 11 Aug 2014 05:12:17 -0700 Received: from HIOEXCMBX07-PRD.hq.netapp.com (10.122.105.40) by hioexcmbx07-prd.hq.netapp.com (10.122.105.40) with Microsoft SMTP Server (TLS) id 15.0.913.22; Mon, 11 Aug 2014 05:12:17 -0700 Received: from HIOEXCMBX07-PRD.hq.netapp.com ([::1]) by hioexcmbx07-prd.hq.netapp.com ([fe80::f0de:b572:dd26:36b5%21]) with mapi id 15.00.0913.011; Mon, 11 Aug 2014 05:12:16 -0700 From: "Eggert, Lars" To: Michael Tuexen Subject: Re: A problem on TCP in High RTT Environment. Thread-Topic: A problem on TCP in High RTT Environment. 
Thread-Index: AQHPswaXaqmGo9mF3Ua6+epBNevn4ZvJEqcAgAATYoCAAA7VAIAAA8CAgABXqQCAAANCAP//lMnkgAB+UQCAAASwAIAByiAAgAACkYCAAFJhAA== Date: Mon, 11 Aug 2014 12:12:16 +0000 Message-ID: <7A4120EE-60F3-4D32-89C4-C694B8DFEAE4@netapp.com> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> <0CF85443-26AC-4931-9D00-3396C18C7690@lurchi.franken.de> In-Reply-To: <0CF85443-26AC-4931-9D00-3396C18C7690@lurchi.franken.de> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: x-mailer: Apple Mail (2.1878.6) x-originating-ip: [10.122.56.79] Content-Type: multipart/signed; boundary="Apple-Mail=_703FE0A3-9BEC-4B5B-975E-BC76BDF12757"; protocol="application/pgp-signature"; micalg=pgp-sha1 MIME-Version: 1.0 Cc: "freebsd-net@freebsd.org" , John-Mark Gurney , Niu Zhixiong , Bill Yuan X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 12:12:46 -0000 --Apple-Mail=_703FE0A3-9BEC-4B5B-975E-BC76BDF12757 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii On 2014-8-11, at 9:17, Michael Tuexen = wrote: > Was there any suspicious output provided by netstat -m when the = problems occur? 
root@laurel:~ # netstat -m 8186/2179/10365 mbufs in use (current/cache/total) 8184/1214/9398/2036224 mbuf clusters in use (current/cache/total/max) 8184/885 mbuf+clusters out of packet secondary zone in use = (current/cache) 0/5/5/1018111 4k (page size) jumbo clusters in use = (current/cache/total/max) 0/0/0/301662 9k jumbo clusters in use (current/cache/total/max) 0/0/0/169685 16k jumbo clusters in use (current/cache/total/max) 18414K/2992K/21407K bytes allocated to network (current/cache/total) 544/57/8194 requests for mbufs denied (mbufs/clusters/mbuf+clusters) 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters) 0/0/0 requests for jumbo clusters delayed (4k/9k/16k) 0/0/0 requests for jumbo clusters denied (4k/9k/16k) 0 requests for sfbufs denied 0 requests for sfbufs delayed 0 requests for I/O initiated by sendfile root@laurel:~ # uptime 2:12PM up 37 mins, 3 users, load averages: 0.20, 0.25, 0.15 Lars --Apple-Mail=_703FE0A3-9BEC-4B5B-975E-BC76BDF12757 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="signature.asc" Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- iQCVAwUBU+izOtZcnpRveo1xAQIkEQP+Mz8Hv0qDDQQWTHCIb5r/cLGP8+g/lO/D /WUQI7peSV9Tuse0+0LQiVr9KIcl3fAkd/KFnRF+4RCq517YhwnVojq4FwJpiZvc K1TNOMUH26e9tlDN4oROC3NeL5qfVR8vNbS3Hw8glu/P6jMiZJIjMqWxhSwyrkRn U5ES2+JblwI= =4vxi -----END PGP SIGNATURE----- --Apple-Mail=_703FE0A3-9BEC-4B5B-975E-BC76BDF12757-- From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 12:16:51 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 372C8F73 for ; Mon, 11 Aug 2014 12:16:51 +0000 (UTC) Received: from mail-wi0-f180.google.com (mail-wi0-f180.google.com [209.85.212.180]) (using TLSv1 
with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C30DC2DC8 for ; Mon, 11 Aug 2014 12:16:50 +0000 (UTC) Received: by mail-wi0-f180.google.com with SMTP id n3so4084885wiv.1 for ; Mon, 11 Aug 2014 05:16:42 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :subject:content-type; bh=8JT8gd9IXxo6HxMcVbI3moGNgmZO+SAgWvUxjdreIUY=; b=N50f3nsK6NbWqy0QI+b1tC6keNxpMkCOIa2ZJrsH5zNkXBRaI20zylRZYe074Rkh64 C8LRFyyg++sFZBy8I6TRMZSRAVaFioYMEqbJn6DD5ctr/SZku3dqq8IqMy8ScYSzSZRw zkEg7c49nLddVyGsbI2RR8Bok4fE35uAzCsi3cPoKmPcDkO8VLM9Gi80ILlJ+MVlPMUG /AsuzaQ0vPqkF/YofM/m6IpPYKZhCMXoVdbwBa+UBMimnYiieOlRMSkk/L6wXIuMHXoM Z+CCGDhCGVNPqjbqPFZeo/nrCuX5QgSzvonYZo0rI0SKk1LRlicPpVz4iGClR/CN3lJp 9HXQ== X-Gm-Message-State: ALoCoQms27bjVYjlYCnU/yE43bVF8Pt3k9zsG/WrfTpufGTyUIMcBKqeWD/IblIMcrpzooQLn34L X-Received: by 10.180.86.1 with SMTP id l1mr6531120wiz.62.1407759402799; Mon, 11 Aug 2014 05:16:42 -0700 (PDT) Received: from [10.0.0.164] (84.94.198.183.cable.012.net.il. [84.94.198.183]) by mx.google.com with ESMTPSA id kt3sm1146350wjb.37.2014.08.11.05.16.42 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 11 Aug 2014 05:16:42 -0700 (PDT) Message-ID: <53E8B424.2000904@cloudius-systems.com> Date: Mon, 11 Aug 2014 15:16:36 +0300 From: Vlad Zolotarov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.7.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: TCP Rx window auto sizing relies on TCP timestamp option? 
Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.18 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 12:16:51 -0000

Hi,
I have a rather strange question about the TCP Rx window auto-sizing implementation in the FreeBSD networking stack.
When I looked at the FreeBSD code (hash 9abce0e567c9a5a0520cdd94d5c633c7baf9a184) I noticed that the above-mentioned feature will not be "enabled" if there isn't a TCP timestamp option present in the current TCP session.

See sys/netinet/tcp_input.c, line 1813, in the tcp_do_segment() function:

    if (V_tcp_do_autorcvbuf &&
        *to.to_tsecr* &&        <-------- this is what I'm talking about
        (so->so_rcv.sb_flags & SB_AUTOSIZE))

So, if I read the code correctly, if there isn't a TS option (negotiated and thus present in every received packet), the receive socket buffer won't grow, which prevents the Rx window from growing.
If that's the case, this is very strange, since the TS option is not guaranteed and, even more, in many cases it won't be present.
For example in Linux this feature is disabled by default (controlled by /proc/sys/net/ipv4/tcp_timestamps).
This is how I actually noticed the problem in the first place: I ran an iperf test where Linux was the initiator and the transmitter (iperf -c) and the FreeBSD box was the receiver (iperf -s), and I noticed that the Rx window wasn't opening up because the Linux box hadn't negotiated the TS option in the SYN.
As a result, the throughput numbers were significantly lower compared to a Linux-to-Linux setup (Linux uses a Dynamic Right-Sizing (DRS) algorithm, http://public.lanl.gov/radiant/pubs.html#DRS, which doesn't rely on TS).

Could anybody comment on this, please?
Did I miss anything?
Is it true that FreeBSD assumes that TS option is always present and if not how can I cause an Rx Window to open up when TS option hasn't been negotiated? thanks in advance, vlad From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 13:01:13 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 31587CE0 for ; Mon, 11 Aug 2014 13:01:13 +0000 (UTC) Received: from mail-qg0-x22f.google.com (mail-qg0-x22f.google.com [IPv6:2607:f8b0:400d:c04::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id DA88223DC for ; Mon, 11 Aug 2014 13:01:12 +0000 (UTC) Received: by mail-qg0-f47.google.com with SMTP id i50so8339237qgf.6 for ; Mon, 11 Aug 2014 06:01:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=aBuE0SGJhE7SxhaYHl37SmHKTE18wtQtTqKRLP50vJI=; b=hbX0NR/Z6WLbDW2l7R3rvv4GniyTUVY5V0pFNdMkaVeab5bXZC+3WVzhbF7p+sHYqf oIPgA3/lH/NGEk2XzwU/ok4XAzPy8pw2MUrxrsSKW/GNaZUGJD9ed7ZyH0wvQJzg2/n3 7eBE9Qid5oyqd8PXeVEa3AOYZwL0Lhe/C3omscDSdIR5sc12aEICgY+0rGmQlhdi413Z eWThKtHXIou7tG/JyySFCmqUr0aaDzGlRh6po4MEx5HZw2gDMBqn9Gm24Bc3saSQGfEY tC4xc7Gfg0NG7XE9En2wBLyVXBWa0WlHCPA7hQhxVR2/Di+MJ4Um4pzKA9etpxnUb9t3 5Iaw== MIME-Version: 1.0 X-Received: by 10.140.30.180 with SMTP id d49mr45765670qgd.63.1407762071512; Mon, 11 Aug 2014 06:01:11 -0700 (PDT) Received: by 10.224.65.65 with HTTP; Mon, 11 Aug 2014 06:01:11 -0700 (PDT) In-Reply-To: <31C0C883-A847-4746-96FA-FE3148B4BD31@lurchi.franken.de> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> 
<3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> <31C0C883-A847-4746-96FA-FE3148B4BD31@lurchi.franken.de> Date: Mon, 11 Aug 2014 21:01:11 +0800 Message-ID: Subject: Re: A problem on TCP in High RTT Environment. From: Niu Zhixiong To: Michael Tuexen Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18 Cc: "freebsd-net@freebsd.org" , John-Mark Gurney , "Eggert, Lars" , Bill Yuan X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 13:01:13 -0000

For 60s test at 20Mbps, 200ms delay with hw.igb.num_queues=2 speed is 9.5Mbps, and without it the speed is 9.1Mbps. However, SCTP is 17.10.
For 60s test at 20Mbps, 150ms delay with hw.igb.num_queues=2 speed is 11.1Mbps, and without it the speed is 9.92Mbps. However, SCTP is 18.59.
For 60s test at 20Mbps, 150ms delay with hw.igb.num_queues=2 speed is 11.13Mbps, and without it the speed is 9.92Mbps. However, SCTP is 18.59.
For 60s test at 20Mbps, 100ms delay with hw.igb.num_queues=2 speed is 9.9Mbps, and without it the speed is 11.13Mbps.
For 60s test at 20Mbps, 50ms delay with hw.igb.num_queues=2 speed is 13.1Mbps, and without it the speed is 13.1Mbps.
For 60s test at 20Mbps, 10ms delay with hw.igb.num_queues=2 speed is 18.0619Mbps, and without it the speed is 18.12Mbps.

Regards,
Niu Zhixiong
－－－－－－－－－－－－－－－
kaiaixi@gmail.com

On Mon, Aug 11, 2014 at 7:50 PM, Michael Tuexen <Michael.Tuexen@lurchi.franken.de> wrote:
> On 11 Aug 2014, at 09:32, Niu Zhixiong wrote:
>
> > Thanks for your reminding.
I tried hw.igb.num_queues=3D2 just now. But, > the throughput is still slow. And When I tested same things in my Virtual > machine-based environment(use Virtio), the throughput is similar(SCTP is = 2x > than TCP). > So is the curve still the same or is it now constantly half of the SCTP > throughput which > was almost constant for different RTTs... > > Best regards > Michael > > > > Regards, > > Niu Zhixiong > > =EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC= =8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D=EF=BC=8D > > kaiaixi@gmail.com > > > > > > On Mon, Aug 11, 2014 at 3:08 PM, Eggert, Lars wrote: > > Hi, > > > > On 2014-8-10, at 5:48, Niu Zhixiong wrote: > > > I am using Intel I350-T4 NIC. > > > > igb driver? > > > > I've been having weird issues with this driver under 10-RELEASE, too. O= n > one machine, I had to limit hw.igb.num_queues=3D2 in order to get any sor= t of > useful connectivity. On another machine, I had to severely bump > kern.ipc.nmbclusters & friends. I'm not sure this is the issue here, sinc= e > SCTP seems to be working OK, but I'm not trusting igb NICs at the moment. 
> > > > Lars > > > > From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 17:06:10 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CF617B4F for ; Mon, 11 Aug 2014 17:06:10 +0000 (UTC) Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "funkthat.com", Issuer "funkthat.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id EB7382584 for ; Mon, 11 Aug 2014 17:06:09 +0000 (UTC) Received: from h2.funkthat.com (localhost [127.0.0.1]) by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id s7BH66rN034590 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 11 Aug 2014 10:06:07 -0700 (PDT) (envelope-from jmg@h2.funkthat.com) Received: (from jmg@localhost) by h2.funkthat.com (8.14.3/8.14.3/Submit) id s7BH66HH034589; Mon, 11 Aug 2014 10:06:06 -0700 (PDT) (envelope-from jmg) Date: Mon, 11 Aug 2014 10:06:06 -0700 From: John-Mark Gurney To: Vlad Zolotarov Subject: Re: TCP Rx window auto sizing relies on TCP timestamp option? Message-ID: <20140811170606.GV83475@funkthat.com> Mail-Followup-To: Vlad Zolotarov , freebsd-net@freebsd.org References: <53E8B424.2000904@cloudius-systems.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <53E8B424.2000904@cloudius-systems.com> User-Agent: Mutt/1.4.2.3i X-Operating-System: FreeBSD 7.2-RELEASE i386 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? 
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (h2.funkthat.com [127.0.0.1]); Mon, 11 Aug 2014 10:06:07 -0700 (PDT) Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 17:06:11 -0000 Vlad Zolotarov wrote this message on Mon, Aug 11, 2014 at 15:16 +0300: > Hi, I have the most strange question about the TCP Rx window auto sizing > implementation in a FreeBSD networking stack. > When I looked at the FreeBSD code (hash > 9abce0e567c9a5a0520cdd94d5c633c7baf9a184) I noticed that > the mentioned above feature will not be "enabled" if there isn't a TCP > timestamp option present in the current TCP session: > > See sys/netinet/tcp_input.c: line 1813 in tcp_do_segment() function: > > if (V_tcp_do_autorcvbuf && > *to.to_tsecr* && <-------- this is what I'm > talking about > (so->so_rcv.sb_flags & SB_AUTOSIZE)) > > So, if i read the code correctly, if there isn't a TS option (negotiated > and thus present in every received packet) the receive socket buffer > won't grow thus preventing the growth of the Rx window. > If that's the case this is very strange since TS option is not promised > and even more - in many cases it won't be present. > For example in Linux this feature is disabled by default (controlled by > /proc/sys/net/ipv4/tcp_timestamps). > This is how I actually noticed the problem the first place: I ran iperf > test where Linux was an initiator and a transmitter (iperf -c) FreeBSD > box was a receiver (iperf -s) and I noticed that the Rx window wasn't > opening up because Linux box hasn't negotiated the TS option in the SYN. 
> As a result, the throughput numbers were significantly lower compared to
> Linux-to-Linux setup (Linux uses a Dynamic Right-Sizing (DRS) algorithm
> http://public.lanl.gov/radiant/pubs.html#DRS, which doesn't rely on TS).
>
> Could anybody comment on this, pls.?
> Did I miss anything?
> Is it true that FreeBSD assumes that TS option is always present and if
> not how can I cause an Rx Window to open up when TS option hasn't been
> negotiated?

This means the receive buffer won't grow beyond the default of 64k...

But, as the comment says:
 * On the receive side the socket buffer memory is only rarely
 * used to any significant extent. This allows us to be much

The receive buffer will only get used if the application takes too long to read its buffer, or it isn't currently waiting... If that's the case, then the application should be fixed to be able to process the data as quickly as it comes in...

So, I don't see much of an issue w/ the code you pointed out. Yes, the receive buffer won't grow, but there are options that you can set (the net.inet.tcp.recvspace sysctl, and SO_RCVBUF in the application) that will address it otherwise... Obviously setting the default too large will just waste memory...

--
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
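
For anyone who wants to try the SO_RCVBUF route mentioned above, here is a minimal sketch. It assumes a plain TCP socket; the helper names bdp_bytes() and set_rcvbuf() are made up for illustration and are not part of any FreeBSD API. The first helper estimates the bandwidth-delay product, which is roughly how much receive buffer a single connection needs to keep the path full (for example, 20 Mbit/s at 200 ms RTT comes to about 500000 bytes, well above a 64k buffer). The second asks the kernel for a buffer of that size and reports what it actually granted.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: bandwidth-delay product in bytes.
 * bits_per_sec * (rtt_ms / 1000) gives bits in flight; divide by 8 for bytes.
 */
static long long
bdp_bytes(long long bits_per_sec, long long rtt_ms)
{
	return (bits_per_sec * rtt_ms / 8000);
}

/*
 * Hypothetical helper: request a receive buffer of `bytes` on `fd` and
 * return the size the kernel reports afterwards, or -1 on error.
 * The reported size may differ from the request (limits, rounding).
 */
static int
set_rcvbuf(int fd, int bytes)
{
	int got = 0;
	socklen_t len = sizeof(got);

	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) == -1)
		return (-1);
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len) == -1)
		return (-1);
	return (got);
}
```

A receiver would typically call set_rcvbuf(fd, (int)bdp_bytes(...)) on the listening socket before listen(), or on a client socket before connect(), since the TCP window scale is negotiated during the handshake. Whether the request is granted in full depends on system limits such as kern.ipc.maxsockbuf, so it is worth checking the returned value.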
From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 17:15:19 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7B4A3E3F for ; Mon, 11 Aug 2014 17:15:19 +0000 (UTC) Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "funkthat.com", Issuer "funkthat.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 290E82660 for ; Mon, 11 Aug 2014 17:15:18 +0000 (UTC) Received: from h2.funkthat.com (localhost [127.0.0.1]) by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id s7BHFHeM034726 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 11 Aug 2014 10:15:18 -0700 (PDT) (envelope-from jmg@h2.funkthat.com) Received: (from jmg@localhost) by h2.funkthat.com (8.14.3/8.14.3/Submit) id s7BHFH6B034725; Mon, 11 Aug 2014 10:15:17 -0700 (PDT) (envelope-from jmg) Date: Mon, 11 Aug 2014 10:15:17 -0700 From: John-Mark Gurney To: Niu Zhixiong Subject: Re: A problem on TCP in High RTT Environment. Message-ID: <20140811171517.GW83475@funkthat.com> Mail-Followup-To: Niu Zhixiong , Michael Tuexen , "freebsd-net@freebsd.org" , Bill Yuan References: <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <20140810045355.GM83475@funkthat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Operating-System: FreeBSD 7.2-RELEASE i386 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE X-to-the-FBI-CIA-and-NSA: HI! 
HOW YA DOIN? can i haz chizburger? X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (h2.funkthat.com [127.0.0.1]); Mon, 11 Aug 2014 10:15:18 -0700 (PDT) Cc: Michael Tuexen , Bill Yuan , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 17:15:19 -0000

Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 20:27 +0800:
> Hi, I am not sure whether my last email is filtered by mailing list.
> After disabled tso, the speed become even poorer.
> This is the packets captures. Plz see google drive.
> tcp_with_tso_off.pcapng.gz
>

So, the reason that this is also slow is that it only ever really has one segment on the wire at a time... This is similar to the previous packet capture...

Which side was this captured on? Was this the receiving side? Because it looks like packets are getting merged still...
22:19:25.628087 IP 10.0.10.2.62995 > 10.0.10.3.9000: Flags [.], seq 149171:152067, ack 1, win 32783, options [nop,nop,TS val 61731427 ecr 2405797018], length 2896 and as before: 22:19:25.634095 IP 10.0.10.2.62995 > 10.0.10.3.9000: Flags [.], seq 165099:166547, ack 1, win 32783, options [nop,nop,TS val 61731431 ecr 2405797022], length 1448 22:19:25.635084 IP 10.0.10.3.9000 > 10.0.10.2.62995: Flags [.], ack 167995, win 32745, options [nop,nop,TS val 2405797438 ecr 61731431], length 0 22:19:25.635097 IP 10.0.10.2.62995 > 10.0.10.3.9000: Flags [.], seq 166547:167995, ack 1, win 32783, options [nop,nop,TS val 61731431 ecr 2405797022], length 1448 22:19:25.636073 IP 10.0.10.2.62995 > 10.0.10.3.9000: Flags [.], seq 167995:170891, ack 1, win 32783, options [nop,nop,TS val 61731431 ecr 2405797022], length 2896 22:19:25.636266 IP 10.0.10.3.9000 > 10.0.10.2.62995: Flags [.], ack 170891, win 32745, options [nop,nop,TS val 2405797439 ecr 61731431], length 0 Though the other thing I noticed is that we appear to be ack'ing before the segment was received, which is a bit odd... And it happens quite consistently... We really need someone who knows our TCP stack to comment on this... > On Sun, Aug 10, 2014 at 1:24 PM, Niu Zhixiong wrote: > > > Hi, > > After disabling TSO, the speed became even poorer. > > This is the packets captures. Plz see google drive. > > > > tcp_with_tso_off.pcapng.gz > > > > > > > > > > John-Mark Gurney wrote on 2014-08-10: > > > > Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 11:48 +0800: > >> > I am using Intel I350-T4 NIC. The LRO is closed by default. And by the > >> way, > >> > when I am using KVM-based virtual machine(virtio NIC) do the exactly > >> same > >> > test. The results are same. > >> > >> Have you tried disabling tso? I asked that in an earlier email, but > >> never heard from you if that changed anything... 
> >> > >> a lot of the trace looks like: > >> 19:29:57.223574 IP 10.0.10.2.61010 > 10.0.10.3.9000: . > >> 251521:257313(5792) ack 1 win 32783 > >> 19:29:57.223798 IP 10.0.10.3.9000 > 10.0.10.2.61010: . ack 257313 win > >> 32745 > >> 19:29:57.225570 IP 10.0.10.2.61010 > 10.0.10.3.9000: . > >> 257313:263105(5792) ack 1 win 32783 > >> > >> Notice how the ack comes back immediately, but for some reason, we decide > >> to > >> wait almost 2ms before sending out the next frame... > >> > >> For some reason, we just aren't filling our window out... tcptrace's > >> graphs show the window at 2MB, but we only ever have 4 segments > >> outstanding at once... > >> > >> > ifconfig igb0 > >> > igb0: flags=8843 metric 0 mtu > >> 1500 > >> > > >> options=403bb > >> > ether a0:36:9f:38:27:d0 > >> > inet 10.0.10.3 netmask 0xffffff00 broadcast 10.0.10.255 > >> > inet6 fe80::a236:9fff:fe38:27d0%igb0 prefixlen 64 scopeid 0x1 > >> > nd6 options=29 > >> > media: Ethernet autoselect (1000baseT ) > >> > status: active > >> > > >> > Regards, > >> > Niu Zhixiong > >> > > >> > kaiaixi@gmail.com > >> > > >> > > >> > On Sun, Aug 10, 2014 at 11:32 AM, John-Mark Gurney > >> wrote: > >> > > >> > > Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:50 +0800: > >> > > > I am sorry that I upload a WRONG SCTP capture. But, the throughput > >> is > >> > > same. > >> > > > SCTP is double than TCP, about 18Mbps. > >> > > > > >> > > > sctp_2.pcapng.gz > >> > > > < > >> > > > >> https://docs.google.com/file/d/0By8sTL79ob4tMlh4WDlTSndHX0k/edit?usp=drive_web > >> > > > > >> > > > > >> > > > >> > > Ok, the owin graph is very interesting... We do have a full 2MB > >> window > >> > > on the receiver side, but for some reason, we only ever have just > >> under > >> > > 6k outstanding on the connection... > >> > > > >> > > So, it looks like we send for a short period of time, and then stop > >> > > sending... Do you have LRO enabled? 
I think it might be related to: > >> > > https://svnweb.freebsd.org/changeset/base/r256920 > >> > > > >> > > As I'm seeing >100ms gaps where the sender doesn't send any data, and > >> > > as soon as more than one ack comes in, the next segment goes out... > >> If > >> > > we only receive a single ack, then we wait for a timeout before > >> sending > >> > > the next segment.. > >> > > > >> > > Can you try to disable LRO on the receiving host? > >> > > > >> > > ifconfig -lro > >> > > > >> > > And see if that helps... If it does... Applying the patch, or > >> compiling > >> > > a more recent kernel from stable/10 that is after r257367 as that is > >> was > >> > > the date that the change was merged... > >> > > > >> > > > On Sun, Aug 10, 2014 at 10:42 AM, Niu Zhixiong > >> > > wrote: > >> > > > > >> > > > > I am sure that wnd is about 2MB all the time. > >> > > > > This is my latest capture, plz see Google Drive. > >> > > > > In the latest test, TCP(0s-120s) is about 9Mbps and SCTP(0s-120s) > >> is > >> > > about > >> > > > > 18Mbps. > >> > > > > (The bandwidth(20Mbps) and delay(200ms) is set by dummynet) > >> > > > > The SCTP and TCP are tested in same environment. > >> > > > > > >> > > > > ??? > >> > > > > sctp.pcapng.gz > >> > > > > < > >> > > > >> https://docs.google.com/file/d/0By8sTL79ob4tYl9sM2V5a19iNVU/edit?usp=drive_web > >> > > > > >> > > > > ?????? > >> > > > > tcp.pcapng.gz > >> > > > > < > >> > > > >> https://docs.google.com/file/d/0By8sTL79ob4tV0NMR1FYLUQ3MWs/edit?usp=drive_web > >> > > > > >> > > > > ??? > >> > > > > > >> > > > > > >> > > > > > >> > > > > Regards, > >> > > > > Niu Zhixiong > >> > > > > ????????????????????????????????????????????? > >> > > > > kaiaixi@gmail.com > >> > > > > > >> > > > > > >> > > > > On Sun, Aug 10, 2014 at 10:23 AM, John-Mark Gurney < > >> jmg@funkthat.com> > >> > > > > wrote: > >> > > > > > >> > > > >> Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:12 > >> +0800: > >> > > > >> > During the TCP4 transmission. 
> >> > > > >> > Proto Recv-Q Send-Q Local Address Foreign Address > >> > > > >> (state) > >> > > > >> > tcp4 0 2097346 10.0.10.2.13504 10.0.10.3.9000 > >> > > > >> > ESTABLISHED > >> > > > >> > >> > > > >> Ok, so you are getting a full 2MB in there, and w/ that, you > >> should > >> > > > >> easily be saturating your pipe... > >> > > > >> > >> > > > >> The next thing would be to get a tcpdump, and take a look at the > >> > > > >> window size.. Wireshark has lots of neat tools to make this > >> analysis > >> > > > >> easy... Another tool that is good is tcptrace.. It can output a > >> > > > >> variety of different graphs that will help you track down, and > >> see > >> > > > >> what part of the system is the problem... > >> > > > >> > >> > > > >> You probably only need a few tens of seconds of the tcpdump... > >> > > > >> > >> > > > >> > On Sun, Aug 10, 2014 at 4:58 AM, Michael Tuexen < > >> > > > >> > Michael.Tuexen@lurchi.franken.de> wrote: > >> > > > >> > > >> > > > >> > > > >> > > > >> > > On 09 Aug 2014, at 22:45, John-Mark Gurney >> > > >> > > wrote: > >> > > > >> > > > >> > > > >> > > > Michael Tuexen wrote this message on Sat, Aug 09, 2014 at > >> 21:51 > >> > > > >> +0200: > >> > > > >> > > >> > >> > > > >> > > >> On 09 Aug 2014, at 20:42, John-Mark Gurney < > >> jmg@funkthat.com> > >> > > > >> wrote: > >> > > > >> > > >> > >> > > > >> > > >>> Niu Zhixiong wrote this message on Fri, Aug 08, 2014 at > >> 20:34 > >> > > > >> +0800: > >> > > > >> > > >>>> Dear all, > >> > > > >> > > >>>> > >> > > > >> > > >>>> Last month, I send problems related to FTP/TCP in a > >> high RTT > >> > > > >> > > environment. > >> > > > >> > > >>>> After that, I setup a simulation environment(Dummynet) > >> to > >> > > test > >> > > > >> TCP > >> > > > >> > > and SCTP > >> > > > >> > > >>>> in high delay environment. After finishing the test, I > >> can > >> > > see > >> > > > >> TCP is > >> > > > >> > > >>>> always slower than SCTP. But, I think it is not > >> possible. 
> >> > > (Plz > >> > > > >> see the > >> > > > >> > > >>>> figure in the attachment). When the delay is 200ms(means > >> > > > >> RTT=400ms). > >> > > > >> > > >>>> Besides, the TCP is extremely slow. > >> > > > >> > > >>>> > >> > > > >> > > >>>> ALL BW=20Mbps, DELAY= 0 ~ 200MS, Packet LOSS = 0 (by > >> > > dummynet) > >> > > > >> > > >>>> > >> > > > >> > > >>>> This is my parameters: > >> > > > >> > > >>>> FreeBSD vfreetest0 10.0-RELEASE FreeBSD 10.0-RELEASE > >> #0: Thu > >> > > Aug > >> > > > >> 7 > >> > > > >> > > >>>> 11:04:15 HKT 2014 > >> > > > >> > > >>>> > >> > > > >> > > >>>> sysctl net.inet.tcp > >> > > > >> > > >>> > >> > > > >> > > >>> [...] > >> > > > >> > > >>> > >> > > > >> > > >>>> net.inet.tcp.recvbuf_auto: 0 > >> > > > >> > > >>> > >> > > > >> > > >>> [...] > >> > > > >> > > >>> > >> > > > >> > > >>>> net.inet.tcp.sendbuf_auto: 0 > >> > > > >> > > >>> > >> > > > >> > > >>> Try enabling this... This should allow the buffer to > >> grow > >> > > large > >> > > > >> enough > >> > > > >> > > >>> to deal w/ the higher latency... > >> > > > >> > > >>> > >> > > > >> > > >>> Also, make sure your program isn't setting the recv > >> buffer > >> > > size > >> > > > >> as that > >> > > > >> > > >>> will disable the auto growing... > >> > > > >> > > >> I think the program sets the buffer to 2MB, which it also > >> does > >> > > for > >> > > > >> SCTP. > >> > > > >> > > >> So having both statically at the same size makes sense > >> for the > >> > > > >> > > comparison. > >> > > > >> > > >> I remember that there was a bug in the combination of LRO > >> and > >> > > > >> delayed > >> > > > >> > > ACK, > >> > > > >> > > >> which was fixed, but I don't remember it was fixed before > >> > > 10.0... > >> > > > >> > > > > >> > > > >> > > > Sounds like disabling LRO and TSO would be a useful test > >> to see > >> > > if > >> > > > >> that > >> > > > >> > > > improves things... But hiren said that the fix made it, > >> so... 
> >> > > > >> > > > > >> > > > >> > > >>> If you use netstat -a, you should be able to see the > >> send-q > >> > > on the > >> > > > >> > > >>> sender grow as necessary... > >> > > > >> > > > > >> > > > >> > > > Also, getting the send-q output while it's running would > >> let us > >> > > know > >> > > > >> > > > if the buffer is getting to 2MB or not... > >> > > > >> > > That is correct. Niu: Can you provide this? -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not." From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 17:40:12 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id F11DB63B for ; Mon, 11 Aug 2014 17:40:12 +0000 (UTC) Received: from mail-la0-x22c.google.com (mail-la0-x22c.google.com [IPv6:2a00:1450:4010:c03::22c]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7BE7B2894 for ; Mon, 11 Aug 2014 17:40:12 +0000 (UTC) Received: by mail-la0-f44.google.com with SMTP id el20so5997306lab.31 for ; Mon, 11 Aug 2014 10:40:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=RZamDK5lZh0SAScR0GnwIVjH2CaYGmX1j+ZVFLF7eaw=; b=SfhOrz2wKnBvPqg3ZR9vnRv5ZmYKla19RXAFUndWKN2O3YIJRWOkjl2iiBsa0W6pDm 7OTGRQy6p6/1zOHiXH+cNMq7s+lrKbdXiUvXMZuLuV2O1DHRrE7JAHRxm5OtkZ6bWaRf y+HBTIGRJabY57MSGKMAtyKowmunp8JpQWqRQc0+Ra4jzPTW1UJUym9bZf684wEoYa34 YaZ1sxirSkQTWlArM7LyDBE9ZIyuvowN/fF0qJdGjlMHD1w9N7nCEdOjCQDmQRXhLA0O jKJ5SX/KMl9MKkSbsJ8JA6XPwKdemqYuq0zcLPh4r7Z6I2VcU3hB7b7OjlbUp4u2v+Bq 195Q== MIME-Version: 1.0 X-Received: by 10.112.166.200 with SMTP id 
zi8mr2601316lbb.102.1407778810279; Mon, 11 Aug 2014 10:40:10 -0700 (PDT) Received: by 10.114.81.73 with HTTP; Mon, 11 Aug 2014 10:40:10 -0700 (PDT) In-Reply-To: <53E8B424.2000904@cloudius-systems.com> References: <53E8B424.2000904@cloudius-systems.com> Date: Mon, 11 Aug 2014 10:40:10 -0700 Message-ID: Subject: Re: TCP Rx window auto sizing relies on TCP timestamp option? From: hiren panchasara To: Vlad Zolotarov Content-Type: text/plain; charset=UTF-8 Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 17:40:13 -0000 On Mon, Aug 11, 2014 at 5:16 AM, Vlad Zolotarov wrote: > Hi, I have the most strange question about the TCP Rx window auto sizing > implementation in a FreeBSD networking stack. > When I looked at the FreeBSD code (hash > 9abce0e567c9a5a0520cdd94d5c633c7baf9a184) I noticed that > the mentioned above feature will not be "enabled" if there isn't a TCP > timestamp option present in the current TCP session: > > See sys/netinet/tcp_input.c: line 1813 in tcp_do_segment() function: > > if (V_tcp_do_autorcvbuf && > *to.to_tsecr* && <-------- this is what I'm > talking about > (so->so_rcv.sb_flags & SB_AUTOSIZE)) > > So, if i read the code correctly, if there isn't a TS option (negotiated and > thus present in every received packet) the receive socket buffer won't grow > thus preventing the growth of the Rx window. > If that's the case this is very strange since TS option is not promised and > even more - in many cases it won't be present. > For example in Linux this feature is disabled by default (controlled by > /proc/sys/net/ipv4/tcp_timestamps). 
> This is how I actually noticed the problem the first place: I ran iperf test > where Linux was an initiator and a transmitter (iperf -c) FreeBSD box was a > receiver (iperf -s) and I noticed that the Rx window wasn't opening up > because Linux box hasn't negotiated the TS option in the SYN. As a result, > the throughput numbers were significantly lower compared to Linux-to-Linux > setup (Linux uses a Dynamic Right-Sizing (DRS) algorithm > http://public.lanl.gov/radiant/pubs.html#DRS, which doesn't rely on TS). > > Could anybody comment on this, pls.? > Did I miss anything? > Is it true that FreeBSD assumes that TS option is always present and if not > how can I cause an Rx Window to open up when TS option hasn't been > negotiated? In my limited understanding, we (FreeBSD) want to base Rx window size increase on accurate RTT which is not possible if TS option is not negotiated. And that's why this requirement. Do you know how does it work in Linux? cheers, Hiren From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 18:27:27 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 94AE2AD8 for ; Mon, 11 Aug 2014 18:27:27 +0000 (UTC) Received: from mail-qg0-x22d.google.com (mail-qg0-x22d.google.com [IPv6:2607:f8b0:400d:c04::22d]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 526AF2FF2 for ; Mon, 11 Aug 2014 18:27:27 +0000 (UTC) Received: by mail-qg0-f45.google.com with SMTP id f51so8745879qge.32 for ; Mon, 11 Aug 2014 11:27:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject 
:from:to:cc:content-type:content-transfer-encoding; bh=OTHAGzKtSRgs/rv+efJJCe6336hpo/TLjtDUocAa73U=; b=uCRQxrg9Q4MllEczT9881d3DYZKjQu4Aiwc8yUNcj71Fh+5AbaQLwzPypCzev0QHpM CvBUOPsMYZAOmHunfBc0PCxFWi87vuPD6fIra2G5na3H9Jxg7ydvHlZO0Wfsjz6oZ0SY jmwnlNKYhFss0DiOrHp6ysb28eWs3VFOB51shWyeO+dnerQTD9v4h/ixEKvaT9dd0Fr+ D42zfF1ffudKWhNdEjUiOaFtisrhVL0FsZBvR1Y1lS1ZzvbtrL7ECD3J839ZbNHhnOhH CR1nT8i7FbW89zXQT4J8z7JyRDJaPtVYTIFEYCHfWPEOkxm8uvyHe6RNlPFvmbVscUHq kttw== MIME-Version: 1.0 X-Received: by 10.224.0.141 with SMTP id 13mr66589004qab.26.1407781646212; Mon, 11 Aug 2014 11:27:26 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.224.41.6 with HTTP; Mon, 11 Aug 2014 11:27:26 -0700 (PDT) In-Reply-To: References: <184b69414bd246eeacc0d4234a730f2f@BY1PR0301MB0902.namprd03.prod.outlook.com> Date: Mon, 11 Aug 2014 11:27:26 -0700 X-Google-Sender-Auth: ADlHDhCYsBf4RHQ3q73VSraYYj8 Message-ID: Subject: Re: vRSS support on FreeBSD From: Adrian Chadd To: Wei Hu Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: "freebsd-net@freebsd.org" , "d@delphij.net" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 18:27:27 -0000 On 11 August 2014 02:48, Wei Hu wrote: > CC freebsd-net@ for wider discussion. > > Hi Adrian, > > Many thanks for the explanation. I checked the if_igb.c and found the flowid field was set in the RX side in igb_rxeof(): > > igb_rxeof() > { > ... > #ifdef RSS > /* XXX set flowtype once this works right */ > rxr->fmp->m_pkthdr.flowid = > le32toh(cur->wb.lower.hi_dword.rss); > rxr->fmp->m_flags |= M_FLOWID; > ... > } > > I have two questions regarding this. > > 1. Is the RSS hash value stored in cur->wb.lower.hi_dword.rss set by the NIC hardware? Yup. > 2. 
So the hash value and m_flags are stored in the mbuf related to the received packet on the rx side (igb_rxeof()). But we check the hash value and m_flags in mbuf related to the send packet on the tx side (in igb_mq_start()). Does the kernel re-use the same mbuf for tx? If so, how does it know for the same network stream it should use the same mbuf got from the rx for packet sending? If not, how does the kernel preserve the same hash value across the rx mbuf and tx mbuf for same network stream? This seems quite magical to me. The mbuf flowid/flowtype ends up in the inpcb->inp_flowid / inpcb->inp_flowtype as part of the TCP receive path. Then whenever the TCP code outputs an mbuf, it copies the inpcb flow details out to outbound mbufs. > > For the Hyper-V case, the host controls which vCPU it wants to interrupt. And the rule can change dynamically based on the load. For a non-busy VM, host will send most packets to same vCPU for power saving purpose. For a busy VM, host will distribute the packets evenly across all vCPUs. This means host could change the RSS bucket mapping dynamically. Hyper-V does this by sending a mapping table to VM whenever it needs update. This also means we cannot use FreeBSD's own bucket mapping which I believe is fixed. Also Hyper-V use its own hash key. So do you think it is possible we still use the existing RSS infrastructure built in FreeBSD in this purpose? Eventually. Doing rebalancing in RSS is on the TODO list, after I get the rest of the basic packet handling / routing done. How does vRSS notify the VM that the mapping table has changed? What does the format of it look like? 
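To make the flowid hand-off described above a bit more concrete, here is a rough user-space sketch. The struct and function names (pkt, conn, conn_input, conn_output) are made up for illustration and are not the real mbuf/inpcb definitions or kernel functions; the only thing taken from the discussion is the idea that the receive path stores the hash on the connection and the output path copies it onto each outbound packet, so no mbuf has to be reused.

```c
#include <stdint.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct pkt {                 /* plays the role of an mbuf packet header */
    uint32_t flowid;         /* RSS hash as reported by the NIC */
    int      has_flowid;     /* plays the role of the M_FLOWID flag */
};

struct conn {                /* plays the role of an inpcb */
    uint32_t flowid;         /* last flow id seen on the receive path */
    int      has_flowid;
};

/* RX path: remember the hash the NIC computed for this connection. */
static void conn_input(struct conn *c, const struct pkt *rx)
{
    if (rx->has_flowid) {
        c->flowid = rx->flowid;
        c->has_flowid = 1;
    }
}

/* TX path: stamp a freshly allocated outbound packet with the stored id. */
static void conn_output(const struct conn *c, struct pkt *tx)
{
    tx->flowid = c->flowid;
    tx->has_flowid = c->has_flowid;
}
```

A driver's multiqueue transmit routine could then pick a TX queue from tx->flowid, which is how the same stream ends up on a consistent queue even though the rx and tx packets are different buffers.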
-a From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 18:41:25 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 483B261C for ; Mon, 11 Aug 2014 18:41:25 +0000 (UTC) Received: from mail-we0-f171.google.com (mail-we0-f171.google.com [74.125.82.171]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id D23D521EC for ; Mon, 11 Aug 2014 18:41:24 +0000 (UTC) Received: by mail-we0-f171.google.com with SMTP id p10so9116058wes.2 for ; Mon, 11 Aug 2014 11:41:22 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=mocSbU7uDg/nKOVerPnLfyqIRTWoaN/hV6D2hUFshmI=; b=UExUdEZu9MfGBzT7U4Q0Fgd9wzV5dxq7BdJZMbYyZKAO3v2GGn646vMTcnGtN2CF6Z ALfTDBcdxzfuencLR5sXqPNKv93eZSjCIMvaMu41N4hgMJ+XiIAkOOSw9uyHVtVxBUqX 3KuEELorahjRthU1wSSkc7XnhSerQ72NpBBjqhIMko0K/+I0aLog/Y+Q90nTFCi9tz/C xXPPyWhPfYed2+XjVHeXhV4ikUErb1MCZrFHTs3cA1y9/6AwlWWmdqGAmLKPI9lB1RDm SrVf0gZYfyo3FiHxGONK5HsdQeq22rlPOPd7FplOWjeD2OChRpJw2I8oWKlUij473zYt Fmrw== X-Gm-Message-State: ALoCoQlLAdmKIFNut87x1gRoJBaDUoA0X7n78qsCAfHRSKkuPn17agwZ6izHFScTMnzMHydKCW7y MIME-Version: 1.0 X-Received: by 10.194.8.35 with SMTP id o3mr55769500wja.3.1407782481964; Mon, 11 Aug 2014 11:41:21 -0700 (PDT) Received: by 10.194.88.66 with HTTP; Mon, 11 Aug 2014 11:41:21 -0700 (PDT) Received: by 10.194.88.66 with HTTP; Mon, 11 Aug 2014 11:41:21 -0700 (PDT) In-Reply-To: References: <53E8B424.2000904@cloudius-systems.com> Date: Mon, 11 Aug 2014 21:41:21 +0300 Message-ID: Subject: Re: TCP Rx window auto sizing relies on TCP timestamp option? 
From: Vladislav Zolotarov To: hiren panchasara Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 18:41:25 -0000 On Aug 11, 2014 8:40 PM, "hiren panchasara" wrote: > > On Mon, Aug 11, 2014 at 5:16 AM, Vlad Zolotarov > wrote: > > Hi, I have the most strange question about the TCP Rx window auto sizing > > implementation in a FreeBSD networking stack. > > When I looked at the FreeBSD code (hash > > 9abce0e567c9a5a0520cdd94d5c633c7baf9a184) I noticed that > > the mentioned above feature will not be "enabled" if there isn't a TCP > > timestamp option present in the current TCP session: > > > > See sys/netinet/tcp_input.c: line 1813 in tcp_do_segment() function: > > > > if (V_tcp_do_autorcvbuf && > > *to.to_tsecr* && <-------- this is what I'm > > talking about > > (so->so_rcv.sb_flags & SB_AUTOSIZE)) > > > > So, if i read the code correctly, if there isn't a TS option (negotiated and > > thus present in every received packet) the receive socket buffer won't grow > > thus preventing the growth of the Rx window. > > If that's the case this is very strange since TS option is not promised and > > even more - in many cases it won't be present. > > For example in Linux this feature is disabled by default (controlled by > > /proc/sys/net/ipv4/tcp_timestamps). > > This is how I actually noticed the problem the first place: I ran iperf test > > where Linux was an initiator and a transmitter (iperf -c) FreeBSD box was a > > receiver (iperf -s) and I noticed that the Rx window wasn't opening up > > because Linux box hasn't negotiated the TS option in the SYN. 
As a result, > > the throughput numbers were significantly lower compared to Linux-to-Linux > > setup (Linux uses a Dynamic Right-Sizing (DRS) algorithm > > http://public.lanl.gov/radiant/pubs.html#DRS, which doesn't rely on TS). > > > > Could anybody comment on this, pls.? > > Did I miss anything? > > Is it true that FreeBSD assumes that TS option is always present and if not > > how can I cause an Rx Window to open up when TS option hasn't been > > negotiated? > > In my limited understanding, we (FreeBSD) want to base Rx window size > increase on accurate RTT which is not possible if TS option is not > negotiated. And that's why this requirement. > > Do you know how does it work in Linux? Linux uses an algorithm mentioned above for estimating the RTT and that algorithm doesn't require a TS. They use a TS if it is present though. > > cheers, > Hiren From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 19:27:51 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C7C0F6A2 for ; Mon, 11 Aug 2014 19:27:51 +0000 (UTC) Received: from mail-n.franken.de (drew.ipv6.franken.de [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "mail-n.franken.de", Issuer "Thawte DV SSL CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7E12C27D1 for ; Mon, 11 Aug 2014 19:27:51 +0000 (UTC) Received: from [192.168.1.200] (p548181A9.dip0.t-ipconnect.de [84.129.129.169]) (Authenticated sender: macmic) by mail-n.franken.de (Postfix) with ESMTP id BBD891C0C069E; Mon, 11 Aug 2014 21:27:46 +0200 (CEST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: A problem on TCP in High RTT Environment. 
From: Michael Tuexen In-Reply-To: <7A4120EE-60F3-4D32-89C4-C694B8DFEAE4@netapp.com> Date: Mon, 11 Aug 2014 21:27:45 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <5E8A6382-7096-495A-907C-86CE26A163A2@lurchi.franken.de> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> <0CF85443-26AC-4931-9D00-3396C18C7690@lurchi.franken.de> <7A4120EE-60F3-4D32-89C4-C694B8DFEAE4@netapp.com> To: "Eggert, Lars" X-Mailer: Apple Mail (2.1878.6) Cc: "freebsd-net@freebsd.org" , John-Mark Gurney , Niu Zhixiong , Bill Yuan X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 19:27:51 -0000 On 11 Aug 2014, at 14:12, Eggert, Lars wrote: > On 2014-8-11, at 9:17, Michael Tuexen wrote: >> Was there any suspicious output provided by netstat -m when the problems occur? > > root@laurel:~ # netstat -m > 8186/2179/10365 mbufs in use (current/cache/total) > 8184/1214/9398/2036224 mbuf clusters in use (current/cache/total/max) > 8184/885 mbuf+clusters out of packet secondary zone in use (current/cache) > 0/5/5/1018111 4k (page size) jumbo clusters in use (current/cache/total/max) > 0/0/0/301662 9k jumbo clusters in use (current/cache/total/max) > 0/0/0/169685 16k jumbo clusters in use (current/cache/total/max) > 18414K/2992K/21407K bytes allocated to network (current/cache/total) > 544/57/8194 requests for mbufs denied (mbufs/clusters/mbuf+clusters) I guess the above is the problem. The card wants a lot of mbufs... 
So the problem should go away if you increase the number of mbufs/clusters so that no requests are denied; then you shouldn't experience any performance issue. I ran into this on machines having several igb and ixgbe cards, each wanting a lot of clusters for each of the receive rings per receive queue... Best regards Michael > 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters) > 0/0/0 requests for jumbo clusters delayed (4k/9k/16k) > 0/0/0 requests for jumbo clusters denied (4k/9k/16k) > 0 requests for sfbufs denied > 0 requests for sfbufs delayed > 0 requests for I/O initiated by sendfile > > root@laurel:~ # uptime > 2:12PM up 37 mins, 3 users, load averages: 0.20, 0.25, 0.15 > > Lars From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 19:41:07 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 23E1FDC9 for ; Mon, 11 Aug 2014 19:41:07 +0000 (UTC) Received: from mx12.netapp.com (mx12.netapp.com [216.240.18.77]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "mx12.netapp.com", Issuer "VeriSign Class 3 International Server CA - G3" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id CA96928F4 for ; Mon, 11 Aug 2014 19:41:06 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.01,843,1400050800"; d="asc'?scan'208";a="181332348" Received: from vmwexceht06-prd.hq.netapp.com ([10.106.77.104]) by mx12-out.netapp.com with ESMTP; 11 Aug 2014 12:41:05 -0700 Received: from HIOEXCMBX06-PRD.hq.netapp.com (10.122.105.39) by vmwexceht06-prd.hq.netapp.com (10.106.77.104) with Microsoft SMTP Server (TLS) id 14.3.123.3; Mon, 11 Aug 2014 12:40:37 -0700 Received: from HIOEXCMBX07-PRD.hq.netapp.com (10.122.105.40) by hioexcmbx06-prd.hq.netapp.com (10.122.105.39) with Microsoft SMTP Server (TLS) id 15.0.913.22; Mon, 11 Aug 2014 12:40:25 -0700 
Received: from HIOEXCMBX07-PRD.hq.netapp.com ([::1]) by hioexcmbx07-prd.hq.netapp.com ([fe80::f0de:b572:dd26:36b5%21]) with mapi id 15.00.0913.011; Mon, 11 Aug 2014 12:40:24 -0700 From: "Eggert, Lars" To: Michael Tuexen Subject: Re: A problem on TCP in High RTT Environment. Thread-Topic: A problem on TCP in High RTT Environment. Thread-Index: AQHPswaXaqmGo9mF3Ua6+epBNevn4ZvJEqcAgAATYoCAAA7VAIAAA8CAgABXqQCAAANCAP//lMnkgAB+UQCAAASwAIAByiAAgAACkYCAAFJhAIAAeY2AgAADqYA= Date: Mon, 11 Aug 2014 19:40:24 +0000 Message-ID: References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> <0CF85443-26AC-4931-9D00-3396C18C7690@lurchi.franken.de> <7A4120EE-60F3-4D32-89C4-C694B8DFEAE4@netapp.com> <5E8A6382-7096-495A-907C-86CE26A163A2@lurchi.franken.de> In-Reply-To: <5E8A6382-7096-495A-907C-86CE26A163A2@lurchi.franken.de> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: x-mailer: Apple Mail (2.1878.6) x-originating-ip: [10.120.60.36] Content-Type: multipart/signed; boundary="Apple-Mail=_AE6527FD-6380-4027-8A3F-C288BC009E0E"; protocol="application/pgp-signature"; micalg=pgp-sha1 MIME-Version: 1.0 Cc: "freebsd-net@freebsd.org" , John-Mark Gurney , Niu Zhixiong , Bill Yuan X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 19:41:07 -0000 --Apple-Mail=_AE6527FD-6380-4027-8A3F-C288BC009E0E Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii Hi, On 2014-8-11, at 21:27, Michael Tuexen = wrote: > On 11 Aug 2014, at 14:12, Eggert, Lars wrote: >> 544/57/8194 requests for 
mbufs denied (mbufs/clusters/mbuf+clusters) > I guess the above is the problem. The card wants a lot of mbufs... > So the problem should go away if you increase the number of = mbufs/clusters, > which means no requests are denied and you don't experience any = performance > issue. And I have six of those ports in that box... So I bump kern.ipc.nmbclusters? Any additional sysctls I should bump? Thanks, Lars --Apple-Mail=_AE6527FD-6380-4027-8A3F-C288BC009E0E Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="signature.asc" Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- iQCVAwUBU+kcQ9ZcnpRveo1xAQLaRQP/RUpRTNYwhG6nWv+Tkz4XYPUsZwtoaFL3 zm8+MRmvsy1ZjVDKgijzGPpcLkfs7btYaj4DektXsyWFTDy0s8oAytDX7Vd6k8Ip TXsWJ0BZaiY7N6stcfDo+zgR2JtArm2ugOqOCCJo0m+Bh8AdYwvUJWHor556tcsV CpTGC5E6htE= =rihH -----END PGP SIGNATURE----- --Apple-Mail=_AE6527FD-6380-4027-8A3F-C288BC009E0E-- From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 19:59:45 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C4F923EE for ; Mon, 11 Aug 2014 19:59:45 +0000 (UTC) Received: from mail-n.franken.de (drew.ipv6.franken.de [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "mail-n.franken.de", Issuer "Thawte DV SSL CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7B21C2B2B for ; Mon, 11 Aug 2014 19:59:45 +0000 (UTC) Received: from [192.168.1.200] (p548181A9.dip0.t-ipconnect.de [84.129.129.169]) (Authenticated sender: macmic) by mail-n.franken.de (Postfix) with ESMTP id 73E681C0C069E; Mon, 11 Aug 2014 21:59:42 +0200 (CEST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) 
Subject: Re: A problem on TCP in High RTT Environment. From: Michael Tuexen In-Reply-To: Date: Mon, 11 Aug 2014 21:59:41 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <5D3CBFDC-362E-4DB6-A132-BA842EF5B1B2@lurchi.franken.de> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> <0CF85443-26AC-4931-9D00-3396C18C7690@lurchi.franken.de> <7A4120EE-60F3-4D32-89C4-C694B8DFEAE4@netapp.com> <5E8A6382-7096-495A-907C-86CE26A163A2@lurchi.franken.de> To: "Eggert, Lars" X-Mailer: Apple Mail (2.1878.6) Cc: "freebsd-net@freebsd.org" , John-Mark Gurney , Niu Zhixiong , Bill Yuan X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 19:59:45 -0000 On 11 Aug 2014, at 21:40, Eggert, Lars wrote: > Hi, > > On 2014-8-11, at 21:27, Michael Tuexen wrote: >> On 11 Aug 2014, at 14:12, Eggert, Lars wrote: >>> 544/57/8194 requests for mbufs denied (mbufs/clusters/mbuf+clusters) >> I guess the above is the problem. The card wants a lot of mbufs... >> So the problem should go away if you increase the number of mbufs/clusters, >> which means no requests are denied and you don't experience any performance >> issue. > > And I have six of those ports in that box... > > So I bump kern.ipc.nmbclusters? Any additional sysctls I should bump? If I remember correctly, I increased kern.ipc.nmbufs and kern.ipc.nmbclusters in /boot/loader.conf. After reboot you shouldn't see any denied requests in netstat -m. With these settings the card worked fine (for SCTP... ).
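For anyone who wants to script that check, here is a minimal sketch in C. It is not from this thread: it only parses a "requests for mbufs denied" line in the format quoted above and reports whether any counter is non-zero. The helper name mbuf_requests_denied() is made up for illustration, and the line format is assumed to match the netstat -m output shown earlier.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of FreeBSD): parse one line of the form
 *   "544/57/8194 requests for mbufs denied (mbufs/clusters/mbuf+clusters)"
 * Returns 1 if any of the three counters is non-zero, 0 if all are zero,
 * and -1 if the line does not have the expected shape.
 */
static int
mbuf_requests_denied(const char *line, unsigned long *mbufs,
    unsigned long *clusters, unsigned long *both)
{
	int consumed = -1;	/* stays -1 unless the literal text matched too */
	int got;

	got = sscanf(line, "%lu/%lu/%lu requests for mbufs denied%n",
	    mbufs, clusters, both, &consumed);
	if (got != 3 || consumed < 0)
		return (-1);
	return (*mbufs != 0 || *clusters != 0 || *both != 0);
}
```

Fed the line quoted above, the helper fills in 544, 57 and 8194 and returns 1. After raising the limits and rebooting, the same line should show zeros, and the helper then returns 0.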
Best regards Michael > > Thanks, > Lars > From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 21:29:17 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 57A06B3F for ; Mon, 11 Aug 2014 21:29:17 +0000 (UTC) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3054A271E for ; Mon, 11 Aug 2014 21:29:17 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 29D44B94A; Mon, 11 Aug 2014 17:29:16 -0400 (EDT) From: John Baldwin To: freebsd-net@freebsd.org Subject: Re: Question about tcp keep-alive timer Date: Mon, 11 Aug 2014 15:50:31 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.4-CBSD-20140415; KDE/4.5.5; amd64; ; ) References: In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201408111550.31642.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 11 Aug 2014 17:29:16 -0400 (EDT) Cc: David Bar X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 21:29:17 -0000 On Sunday, August 10, 2014 4:09:55 pm David Bar wrote: > Hi > > > (Forgive me if this topic has been discussed before. I didn't find it in > the archives) > > In tcp_input(), when a packet is received on an established socket the code > re-arms the keep-alive timer, for each packet.
> Here: > https://svnweb.freebsd.org/base/release/10.0.0/sys/netinet/tcp_input.c?revision=260789&view=markup#l1518 > > Isn't this a waste to do this for each packet? > > The setting of the timer when the connection becomes established should > suffice if there was a small change in tcp_timer_keep(). > If tcp_timer_keep() would first checks if tp->t_rcvtime is recent (newer > than the TT_KEEPIDLE time), and would just re-arm the timer to go off > later, then we would keep the same functionality. > > I can't think of any downsides to this idea. Any good reason why this > hasn't been done before? I think it is just a tradeoff between having the timer run at all. However, I suspect that with high packet rates it probably is cheaper to have the timer run periodically and reschedule itself if it notices it isn't needed as you suggested. Do you want to write up a patch and test it? -- John Baldwin From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 21:29:18 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D0684B43; Mon, 11 Aug 2014 21:29:18 +0000 (UTC) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 90C62271F; Mon, 11 Aug 2014 21:29:18 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 63AE0B972; Mon, 11 Aug 2014 17:29:17 -0400 (EDT) From: John Baldwin To: freebsd-net@freebsd.org Subject: Re: NFS client READ performance on -current Date: Mon, 11 Aug 2014 16:53:42 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.4-CBSD-20140415; KDE/4.5.5; amd64; ; ) References: <2136988575.13956627.1405199640153.JavaMail.root@uoguelph.ca> 
<53C7B774.60304@freebsd.org> <1780417.KfjTWjeQCU@pippin.baldwin.cx> In-Reply-To: <1780417.KfjTWjeQCU@pippin.baldwin.cx> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201408111653.42283.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 11 Aug 2014 17:29:17 -0400 (EDT) Cc: pyunyh@gmail.com, "Russell L. Carter" , Rick Macklem X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 21:29:18 -0000 On Saturday, July 19, 2014 1:28:19 pm John Baldwin wrote: > On Thursday 17 July 2014 19:45:56 Julian Elischer wrote: > > On 7/15/14, 10:34 PM, John Baldwin wrote: > > > On Saturday, July 12, 2014 5:14:00 pm Rick Macklem wrote: > > >> Yonghyeon Pyun wrote: > > >>> On Fri, Jul 11, 2014 at 09:54:23AM -0400, John Baldwin wrote: > > >>>> On Thursday, July 10, 2014 6:31:43 pm Rick Macklem wrote: > > >>>>> John Baldwin wrote: > > >>>>>> On Thursday, July 03, 2014 8:51:01 pm Rick Macklem wrote: > > >>>>>>> Russell L. Carter wrote: > > >>>>>>>> On 07/02/14 19:09, Rick Macklem wrote: > > >>>>>>>>> Could you please post the dmesg stuff for the network > > >>>>>>>>> interface, > > >>>>>>>>> so I can tell what driver is being used? I'll take a look > > >>>>>>>>> at > > >>>>>>>>> it, > > >>>>>>>>> in case it needs to be changed to use m_defrag(). 
> > >>>>>>>> > > >>>>>>>> em0: port > > >>>>>>>> 0xd020-0xd03f > > >>>>>>>> mem > > >>>>>>>> 0xfe4a0000-0xfe4bffff,0xfe480000-0xfe49ffff irq 44 at > > >>>>>>>> device 0.0 > > >>>>>>>> on > > >>>>>>>> pci2 > > >>>>>>>> em0: Using an MSI interrupt > > >>>>>>>> em0: Ethernet address: 00:15:17:bc:29:ba > > >>>>>>>> 001.000007 [2323] netmap_attach success for em0 > > >>>>>>>> tx > > >>>>>>>> 1/1024 > > >>>>>>>> rx > > >>>>>>>> 1/1024 queues/slots > > >>>>>>>> > > >>>>>>>> This is one of those dual nic cards, so there is em1 as > > >>>>>>>> well... > > >>>>>>> > > >>>>>>> Well, I took a quick look at the driver and it does use > > >>>>>>> m_defrag(), > > >>>>>>> but > > >>>>>>> I think that the "retry:" label it does a goto after doing so > > >>>>>>> might > > >>>>>>> be in > > >>>>>>> the wrong place. > > >>>>>>> > > >>>>>>> The attached untested patch might fix this. > > >>>>>>> > > >>>>>>> Is it convenient to build a kernel with this patch applied > > >>>>>>> and then > > >>>>>>> try > > >>>>>>> it with TSO enabled? > > >>>>>>> > > >>>>>>> rick > > >>>>>>> ps: It does have the transmit segment limit set to 32. I have > > >>>>>>> no > > >>>>>>> idea if > > >>>>>>> > > >>>>>>> this is a hardware limitation. > > >>>>>> > > >>>>>> I think the retry is not in the wrong place, but the overhead > > >>>>>> of all > > >>>>>> those > > >>>>>> pullups is apparently quite severe. > > >>>>> > > >>>>> The m_defrag() call after the first failure will just barely > > >>>>> squeeze > > >>>>> the just under 64K TSO segment into 32 mbuf clusters. Then I > > >>>>> think any > > >>>>> m_pullup() done during the retry will allocate an mbuf > > >>>>> (at a glance it seems to always do this when the old mbuf is a > > >>>>> cluster) > > >>>>> and prepend that to the list. > > >>>>> --> Now the list is > 32 mbufs again and the > > >>>>> bus_dmammap_load_mbuf_sg() > > >>>>> > > >>>>> will fail again on the retry, this time fatally, I think? 
> > >>>>> > > >>>>> I can't see any reason to re-do all the stuff using m_pullup() > > >>>>> and Russell > > >>>>> reported that moving the "retry:" fixed his problem, from what I > > >>>>> understood. > > >>>> > > >>>> Ah, I had assumed (incorrectly) that the m_pullup()s would all be > > >>>> nops in this > > >>>> case. It seems the NIC would really like to have all those things > > >>>> in a single > > >>>> segment, but it is not required, so I agree that your patch is > > >>>> fine. > > >>> > > >>> I recall em(4) controllers have various limitation in TSO. Driver > > >>> has to update IP header to make TSO work so driver has to get a > > >>> writable mbufs. bpf(4) consumers will see IP packet length is 0 > > >>> after this change. I think tcpdump has a compile time option to > > >>> guess correct IP packet length. The firmware of controller also > > >>> should be able to access complete IP/TCP header in a single buffer. > > >>> I don't remember more details in TSO limitation but I guess you may > > >>> be able to get more details TSO limitation from publicly available > > >>> Intel data sheet. > > >> > > >> I think that the patch should handle this ok. All of the m_pullup() > > >> stuff gets done the first time. Then, if the result is more than 32 > > >> mbufs in the list, m_defrag() is called to copy the chain. This should > > >> result in all the header stuff in the first mbuf cluster and the map > > >> call is done again with this list of clusters. (Without the patch, > > >> m_pullup() would allocate another prepended mbuf and make the chain > > >> more than 32mbufs again.) > > > > > > Hmm, I am surprised by the m_pullup() behavior that it doesn't just > > > notice that the first mbuf with a cluster has the desired data already > > > and returns without doing anything. 
That is, I'm surprised the first > > > > > > statement in m_pullup() isn't just: > > > if (n->m_len >= len) > > > > > > return (n); > > > > I seem to remember that the standard behaviour is for the caller to do > > exactly that. > > Huh, the manpage doesn't really state that, and it does check in one case. > However, I think that means that the code in em(4) is busted and should be > checking m_len before all the calls to m_pullup(). I think this will fix > the issue the same as Rick's change but it might also avoid unnecessary > pullups in some cases when defrag isn't needed in the first place. FYI, I still think this patch is worth testing if someone is up for it. -- John Baldwin From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 21:29:20 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id EA48CBE0 for ; Mon, 11 Aug 2014 21:29:20 +0000 (UTC) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id AD0F62723 for ; Mon, 11 Aug 2014 21:29:20 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id A3AA5B94A; Mon, 11 Aug 2014 17:29:19 -0400 (EDT) From: John Baldwin To: freebsd-net@freebsd.org Subject: Re: zero window and persist timer not set Date: Mon, 11 Aug 2014 17:20:18 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.4-CBSD-20140415; KDE/4.5.5; amd64; ; ) References: In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201408111720.18544.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 11 Aug 2014 17:29:19 
-0400 (EDT) Cc: Jeremiah Lott X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 21:29:21 -0000 On Wednesday, August 06, 2014 5:25:38 pm Jeremiah Lott wrote: > Hello, > > We've been seeing a problem where a tcp connection is stuck in a zero > window condition and even though the client has opened more window space, > our FreeBSD box never sends any more. After some analysis it appears that > the FreeBSD box is not sending zero window probes, because the persist > timer did not get set (we can see in kgdb that the tcpcb shows 0 window, > there is data in the socket buffer, but the persist timer is not active). > > After looking over the code for a while, I think I see the problem. When > tcp_output chooses to send a packet, it never arms the persist timer. This > causes a problem in the following scenario: > > 1. A --> B: packet containing enough data to fill the window > 2. B --> A: ACK for #1 + new data (0 window advertisement) > 3. A --> B: ACK for #2, 0 len packet > > In this case, A will not activate the persist timer, because it chose to > send a packet. Unless tcp_output is called for some other reason (delayed > ack timer, another input packet from B, socket syscall), A will not send > zero window probes. I was finally able to recreate this condition by > setting an very small window and running programs that send very specific > sequences of packets without calling recv (purposefully forcing a zero > window condition). 
Here is a packet capture that shows the sequence: > > A == 10.2.15.69 == FreeBSD 9.2 > B == 10.2.14.61 == FreeBSD 8.2 > > 16:19:49.664790 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [S], seq > 2362665163, win 4300, options [mss 1460,nop,wscale 6,sackOK,TS val 88804503 > ecr 0], length 0 > 16:19:49.664821 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [S.], seq > 3306387947, ack 2362665164, win 65535, options [mss 1460,nop,wscale > 6,sackOK,TS val 1605043666 ecr 88804503], length 0 > 16:19:49.664859 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [.], ack 1, > win 67, options [nop,nop,TS val 88804503 ecr 1605043666], length 0 > 16:19:49.664921 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq > 1:101, ack 1, win 67, options [nop,nop,TS val 88804503 ecr 1605043666], > length 100 > 16:19:49.665137 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [P.], seq > 1:3001, ack 101, win 2046, options [nop,nop,TS val 1605043666 ecr > 88804503], length 3000 > 16:19:49.665208 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq > 101:1321, ack 1449, win 45, options [nop,nop,TS val 88804503 ecr > 1605043666], length 1220 > 16:19:49.666195 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [.], seq > 1321:2769, ack 3001, win 21, options [nop,nop,TS val 88804504 ecr > 1605043666], length 1448 > 16:19:49.666205 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [.], ack > 2769, win 2004, options [nop,nop,TS val 1605043667 ecr 88804503], length 0 > 16:19:49.666207 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq > 2769:2771, ack 3001, win 21, options [nop,nop,TS val 88804504 ecr > 1605043666], length 2 > 16:19:49.667183 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [.], seq > 2771:4219, ack 3001, win 21, options [nop,nop,TS val 88804505 ecr > 1605043667], length 1448 > 16:19:49.667190 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [.], seq > 3001:4345, ack 4219, win 1982, options [nop,nop,TS val 1605043668 ecr > 88804504], length 1344 > 16:19:49.667193 IP 10.2.14.61.23133 > 
10.2.15.69.12345: Flags [P.], seq > 4219:4221, ack 3001, win 21, options [nop,nop,TS val 88804505 ecr > 1605043667], length 2 > 16:19:49.766487 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq > 4221:4321, ack 4345, win 0, options [nop,nop,TS val 88804605 ecr > 1605043668], length 100 > 16:19:49.766499 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [.], ack > 4321, win 1980, options [nop,nop,TS val 1605043768 ecr 88804505], length 0 > > The important packets are the last four: > > 1. A --> B: length 1344, fills the remaining window > 2. B --> A: length 2, does not ack additional data, delayed ack timer is set > 3. B --> A: length 100, acks #1, immediate ack (delayed ack timer > cancelled, tcp_output called with ACKNOW) > 4. A --> B: length 0, acks #1 and #2, because a packet is sent tcp_output > does not activate the persist timer. > > I would normally expect A to begin sending zero-window probes, but (since > it didn't activate the persist timer) it does not. Using kgdb, I can see > that the persist timer is not set, only the keep timer is set. 
This is > kgdb on "A": > > (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_nxt > $5 = 3306392292 > (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_max > $6 = 3306392292 > (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_una > $7 = 3306392292 > (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_wnd > $8 = 0 > (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_cwnd > $9 = 4380 > (kgdb) print ((struct > tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_rexmt->c_flags > $11 = 16 > (kgdb) print ((struct > tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_persist->c_flags > $12 = 16 > (kgdb) print ((struct > tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_keep->c_flags > $13 = 22 > (kgdb) print ((struct > tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_2msl->c_flags > $14 = 16 > (kgdb) print ((struct > tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_delack->c_flags > $15 = 16 > (kgdb) print ((struct > tcpcb*)(0xfffffe02ae289b70))->t_inpcb->inp_socket.so_snd.sb_cc > $16 = 1656 > > There is zero window, data in the socket buffer, and the persist timer is > not set. > > My proposed fix follows. If you send a 0-length packet, but there is data > is the socket buffer, and neither the rexmt or persist timer is already > set, then activate the persist timer. > > --- sys/netinet/tcp_output.c (revision 269644) > +++ sys/netinet/tcp_output.c (working copy) > @@ -1290,7 +1290,12 @@ > tp->t_rxtshift = 0; > } > tcp_timer_activate(tp, TT_REXMT, tp->t_rxtcur); > - } > + } else if (len == 0 && so->so_snd.sb_cc && > + !tcp_timer_active(tp, TT_REXMT) && > + !tcp_timer_active(tp, TT_PERSIST)) { > + tp->t_rxtshift = 0; > + tcp_setpersist(tp); > + } > > } else { > /* > * Persist case, update snd_max but since we are in > > Let me know any comments. Thanks, I think your patch is correct, but please file this as a bug report so we can hopefully wrangle another person to review this. 
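To make the condition in the patch easier to reason about outside the kernel, here is a small user-space sketch in C. It is not kernel code: struct conn_model and should_arm_persist() are hypothetical stand-ins for the few tcpcb and socket-buffer fields that the new else-if branch reads, and the values in the note below come from the kgdb output quoted above.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified, hypothetical model of the state the proposed check looks at.
 * The real code reads so->so_snd.sb_cc and calls tcp_timer_active().
 */
struct conn_model {
	size_t sb_cc;		/* bytes still queued in the send buffer */
	bool rexmt_active;	/* is the retransmit timer armed? */
	bool persist_active;	/* is the persist timer armed? */
};

/*
 * Mirrors the else-if condition in the patch: a zero-length segment was
 * just sent, data is still waiting in the send buffer, and neither timer
 * is armed, so nothing would otherwise call tcp_output again.
 */
static bool
should_arm_persist(const struct conn_model *c, size_t len_sent)
{
	return (len_sent == 0 && c->sb_cc != 0 &&
	    !c->rexmt_active && !c->persist_active);
}
```

With sb_cc = 1656 and neither timer armed, as in the kgdb dump, a zero-length send gives true. A non-zero send such as the 1344-byte segment gives false, since the patch only touches the zero-length case, and an already armed retransmit or persist timer also gives false.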
-- John Baldwin From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 21:29:19 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CE69BB78; Mon, 11 Aug 2014 21:29:19 +0000 (UTC) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A82292722; Mon, 11 Aug 2014 21:29:19 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 9D264B9C0; Mon, 11 Aug 2014 17:29:18 -0400 (EDT) From: John Baldwin To: freebsd-net@freebsd.org Subject: Re: Multicast races on vlan & lagg Date: Mon, 11 Aug 2014 17:10:46 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.4-CBSD-20140415; KDE/4.5.5; amd64; ; ) References: <53DA50E2.6070606@FreeBSD.org> In-Reply-To: <53DA50E2.6070606@FreeBSD.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201408111710.46443.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 11 Aug 2014 17:29:18 -0400 (EDT) Cc: Alexander Motin X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 21:29:19 -0000 On Thursday, July 31, 2014 10:21:22 am Alexander Motin wrote: > Hi. > > Doing some tests on FreeNAS (FreeBSD 9.2+) I hit series of panic during > active interfaces manipulation in some scenarios including multicast and > several vlans on top of lagg. 
I am not ready to reproduce the full > environment on head, but the code looks equal, so probably the bugs. > > I've made a patch to improve locking in that area, that seems fixes the > problems: http://people.freebsd.org/~mav/mcast_vlan_lagg.patch > > Could somebody with more experience in the area please take a look? Can't you use IF_ADDR_RLOCK instead of IF_ADDR_WLOCK? Also, strictly speaking it might be best to use if_maddr_rlock() instead of directly using IF_ADDR_RLOCK(). -- John Baldwin From owner-freebsd-net@FreeBSD.ORG Mon Aug 11 23:52:15 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 72DFA415 for ; Mon, 11 Aug 2014 23:52:15 +0000 (UTC) Received: from mail-lb0-x236.google.com (mail-lb0-x236.google.com [IPv6:2a00:1450:4010:c04::236]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E62262661 for ; Mon, 11 Aug 2014 23:52:14 +0000 (UTC) Received: by mail-lb0-f182.google.com with SMTP id z11so6534455lbi.27 for ; Mon, 11 Aug 2014 16:52:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=E0M23eE8Xe+1rdedMUsqr48v1Zl80FBhqrBp/XMI/NE=; b=ogjcjB1mVBh7W8XJqL7Qa3U7HOT7364CxUVvEzHY4zscSjX8SIe0+hxx1QxFz0sCX/ SahFTB1JfqxSg+BwzjcVqS7f6EZy8RaLzpoNHRd8D3nzbrawG9nfkXND5PQFcf850cYZ reOVD/+p5nA4rytaKjJvdhvPPcfIQRlBZmH7U+OhVz9/n7bMNmR3gpKD3hO6AMD5Bnou jhVaySRruuxQIO2ELzVU0SEPl5MncEVWgBbIxywdmvYe+96yx5BSjE0I8+njjBdPLzf7 zKCYiKuyFsFKzMsW9Qq5lO1OS9pZ+pbI8RcDSlzc15Y2dhoRWmPJnrPJ+pG6+Bu7njIE PRDA== MIME-Version: 1.0 X-Received: by 10.112.91.196 with SMTP id cg4mr720494lbb.42.1407801132270; Mon, 11 Aug 2014 16:52:12 
-0700 (PDT) Received: by 10.114.81.73 with HTTP; Mon, 11 Aug 2014 16:52:12 -0700 (PDT) In-Reply-To: <5D3CBFDC-362E-4DB6-A132-BA842EF5B1B2@lurchi.franken.de> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> <0CF85443-26AC-4931-9D00-3396C18C7690@lurchi.franken.de> <7A4120EE-60F3-4D32-89C4-C694B8DFEAE4@netapp.com> <5E8A6382-7096-495A-907C-86CE26A163A2@lurchi.franken.de> <5D3CBFDC-362E-4DB6-A132-BA842EF5B1B2@lurchi.franken.de> Date: Mon, 11 Aug 2014 16:52:12 -0700 Message-ID: Subject: Re: A problem on TCP in High RTT Environment. From: hiren panchasara To: Michael Tuexen Content-Type: text/plain; charset=UTF-8 Cc: "freebsd-net@freebsd.org" , John-Mark Gurney , "Eggert, Lars" , Niu Zhixiong , Bill Yuan X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Aug 2014 23:52:15 -0000 On Mon, Aug 11, 2014 at 12:59 PM, Michael Tuexen wrote: > On 11 Aug 2014, at 21:40, Eggert, Lars wrote: > >> Hi, >> >> On 2014-8-11, at 21:27, Michael Tuexen wrote: >>> On 11 Aug 2014, at 14:12, Eggert, Lars wrote: >>>> 544/57/8194 requests for mbufs denied (mbufs/clusters/mbuf+clusters) >>> I guess the above is the problem. The card wants a lot of mbufs... >>> So the problem should go away if you increase the number of mbufs/clusters, >>> which means no requests are denied and you don't experience any performance >>> issue. >> >> And I have six of those ports in that box... >> >> So I bump kern.ipc.nmbclusters? Any additional sysctls I should bump? 
> If I remember correctly, I increased > kern.ipc.nmbufs and kern.ipc.nmbclusters in /boot/loader.conf I believe, you just need to set kern.ipc.nmbclusters (max mbuf clusters allowed) and kern.ipc.nmbufs (max mbufs allowed) should be adjusted based on that. cheers, Hiren From owner-freebsd-net@FreeBSD.ORG Tue Aug 12 00:02:13 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 65C09581; Tue, 12 Aug 2014 00:02:13 +0000 (UTC) Received: from smtp2.wemm.org (smtp2.wemm.org [192.203.228.78]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "smtp2.wemm.org", Issuer "StartCom Class 1 Primary Intermediate Server CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 43DD9272B; Tue, 12 Aug 2014 00:02:12 +0000 (UTC) Received: from overcee.wemm.org (canning.wemm.org [192.203.228.65]) by smtp2.wemm.org (Postfix) with ESMTP id 54AF8D82; Mon, 11 Aug 2014 17:02:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=wemm.org; s=m20140428; t=1407801726; bh=ySEwImV61sOjSj9+CFifR6biedyMy1fgTynZ2MF6t8U=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=sMNOmDf5seHivOYhP3eN28FHsWoDLOtg4ZndqKmfFCysTZDt+aDLg0Hnby0k01M5N x0FG1Qn0o5SHcTcMaAWT9KHuR/+ICLy3ImHD79XT83QewAiLUtsGNvNqTxkFK7Q4p3 h/A9DxXFuwVglI/aKnjE9NzEOl1YoagtLSKzONEA= From: Peter Wemm To: freebsd-net@freebsd.org Subject: Re: zero window and persist timer not set Date: Mon, 11 Aug 2014 17:02:01 -0700 Message-ID: <24778594.oyrJY37Iyv@overcee.wemm.org> User-Agent: KMail/4.12.5 (FreeBSD/11.0-CURRENT; KDE/4.12.5; amd64; ; ) In-Reply-To: <201408111720.18544.jhb@freebsd.org> References: <201408111720.18544.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart26642615.nu2nT7dpWp"; micalg="pgp-sha1"; 
protocol="application/pgp-signature" Cc: Jeremiah Lott , John Baldwin X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Aug 2014 00:02:13 -0000 --nextPart26642615.nu2nT7dpWp Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="us-ascii" On Monday 11 August 2014 17:20:18 John Baldwin wrote: > On Wednesday, August 06, 2014 5:25:38 pm Jeremiah Lott wrote: > > Hello, > > > > We've been seeing a problem where a tcp connection is stuck in a zero > > window condition and even though the client has opened more window space, > > our FreeBSD box never sends any more. After some analysis it appears that > > the FreeBSD box is not sending zero window probes, because the persist > > timer did not get set (we can see in kgdb that the tcpcb shows 0 window, > > there is data in the socket buffer, but the persist timer is not active). [..] > > My proposed fix follows. If you send a 0-length packet, but there is data > > is the socket buffer, and neither the rexmt or persist timer is already > > set, then activate the persist timer. > > > > --- sys/netinet/tcp_output.c (revision 269644) > > +++ sys/netinet/tcp_output.c (working copy) > > @@ -1290,7 +1290,12 @@ > > > > tp->t_rxtshift = 0; > > > > } > > tcp_timer_activate(tp, TT_REXMT, tp->t_rxtcur); > > > > - } > > + } else if (len == 0 && so->so_snd.sb_cc && > > + !tcp_timer_active(tp, TT_REXMT) && > > + !tcp_timer_active(tp, TT_PERSIST)) { > > + tp->t_rxtshift = 0; > > + tcp_setpersist(tp); > > + } > > > > } else { > > > > /* > > > > * Persist case, update snd_max but since we are in > > > > Let me know any comments. Thanks, > > I think your patch is correct, but please file this as a bug report so we > can hopefully wrangle another person to review this.
This sounds suspiciously like one of the failures we have been seeing between machines in the cluster that are doing package mirroring.. We had been attributing it to a mystery rsync bug but this seems to fit. It might also have been implicated in the svn mirroring too (eg: to the portsnap builder). We've also had problems with ftp mirrors that also might fit this. -- Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV UTF-8: for when a ' or ... just won\342\200\231t do\342\200\246 --nextPart26642615.nu2nT7dpWp Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part. Content-Transfer-Encoding: 7Bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABAgAGBQJT6Vl+AAoJEDXWlwnsgJ4ELNMIAKWyuETT6yxO019zf/OAhwlZ c8HnD3tKuR/Uh+dcn45J1cDJEpxv+RxFNulbtq6nZomMUgdUAxV+cb+ajldDnb13 lXgtSb5F0hWJG/ihckad5Y5k1qEihNBEawP2uliObBhuZ3ntm9UBKWSEGZsusPIK uEG2LZIJdp369P7xdL0dcCpCWajRTfiMaJbJqAR8bUUPwzx+eVP9jBpURbRFAVUK yUBPHKkavu4kAG54dQoRcfHO09mmbBzuMl7WWOsCgGGYQK/aubS5Ry+fy82/3qJz vOop37kk7WcAwKBDVWTEmwR1E9OOlpAzmG5ciuDltDOmwKTFvZqbb3/ZnjNS0SI= =s7al -----END PGP SIGNATURE----- --nextPart26642615.nu2nT7dpWp-- From owner-freebsd-net@FreeBSD.ORG Tue Aug 12 00:05:10 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D45C8652; Tue, 12 Aug 2014 00:05:10 +0000 (UTC) Received: from mail-lb0-x234.google.com (mail-lb0-x234.google.com [IPv6:2a00:1450:4010:c04::234]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1FD7F2747; Tue, 12 Aug 2014 00:05:09 +0000 (UTC) Received: by mail-lb0-f180.google.com with SMTP id v6so6515383lbi.11 for ; Mon, 11
Aug 2014 17:05:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=hp25wjswTkqvinyQR0bxUrEI9XzliPffiueQCF33E4w=; b=AbbfQrljWHGHuIrT8/VYWsxNTDhHjym3DyzcBD1WcPzQb9t9wvyvC5Hs8FNUVgSOWA 2f+YbhtnoXeyseTqH0CbFOM/GbpKJrYx1aDpC0zdRNWuyqm0A9OpAKNOLzkFWxVpcR/Z tKMHeIPvv4lWHdIhvure+Q5Oha6pI3MyBzDNHP7DOa0uNP4cr+5sX1d3s/nGIbAC/XfL xsHE3n6f3GjbJFPqSyrV+KbgDqvabDOH+sPHcHL/ijrbmRiLgPSQoVe/dBd/zUpznxUK MWkDKJYpr0TjZ4BIZDvBrYUyoRKc0lrQTM/OA893K2xyHueBt47VZDeHVJH2hGkl9V/O Uxqg== MIME-Version: 1.0 X-Received: by 10.152.161.225 with SMTP id xv1mr807768lab.71.1407801908038; Mon, 11 Aug 2014 17:05:08 -0700 (PDT) Received: by 10.114.81.73 with HTTP; Mon, 11 Aug 2014 17:05:07 -0700 (PDT) In-Reply-To: <201408111720.18544.jhb@freebsd.org> References: <201408111720.18544.jhb@freebsd.org> Date: Mon, 11 Aug 2014 17:05:07 -0700 Message-ID: Subject: Re: zero window and persist timer not set From: hiren panchasara To: John Baldwin Content-Type: text/plain; charset=UTF-8 Cc: "freebsd-net@freebsd.org" , Jeremiah Lott X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Aug 2014 00:05:10 -0000 On Mon, Aug 11, 2014 at 2:20 PM, John Baldwin wrote: > On Wednesday, August 06, 2014 5:25:38 pm Jeremiah Lott wrote: >> Hello, >> >> We've been seeing a problem where a tcp connection is stuck in a zero >> window condition and even though the client has opened more window space, >> our FreeBSD box never sends any more. After some analysis it appears that >> the FreeBSD box is not sending zero window probes, because the persist >> timer did not get set (we can see in kgdb that the tcpcb shows 0 window, >> there is data in the socket buffer, but the persist timer is not active). 
>> >> After looking over the code for a while, I think I see the problem. When >> tcp_output chooses to send a packet, it never arms the persist timer. This >> causes a problem in the following scenario: >> >> 1. A --> B: packet containing enough data to fill the window >> 2. B --> A: ACK for #1 + new data (0 window advertisement) >> 3. A --> B: ACK for #2, 0 len packet >> >> In this case, A will not activate the persist timer, because it chose to >> send a packet. Unless tcp_output is called for some other reason (delayed >> ack timer, another input packet from B, socket syscall), A will not send >> zero window probes. I was finally able to recreate this condition by >> setting an very small window and running programs that send very specific >> sequences of packets without calling recv (purposefully forcing a zero >> window condition). Here is a packet capture that shows the sequence: >> >> A == 10.2.15.69 == FreeBSD 9.2 >> B == 10.2.14.61 == FreeBSD 8.2 >> >> 16:19:49.664790 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [S], seq >> 2362665163, win 4300, options [mss 1460,nop,wscale 6,sackOK,TS val 88804503 >> ecr 0], length 0 >> 16:19:49.664821 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [S.], seq >> 3306387947, ack 2362665164, win 65535, options [mss 1460,nop,wscale >> 6,sackOK,TS val 1605043666 ecr 88804503], length 0 >> 16:19:49.664859 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [.], ack 1, >> win 67, options [nop,nop,TS val 88804503 ecr 1605043666], length 0 >> 16:19:49.664921 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq >> 1:101, ack 1, win 67, options [nop,nop,TS val 88804503 ecr 1605043666], >> length 100 >> 16:19:49.665137 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [P.], seq >> 1:3001, ack 101, win 2046, options [nop,nop,TS val 1605043666 ecr >> 88804503], length 3000 >> 16:19:49.665208 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq >> 101:1321, ack 1449, win 45, options [nop,nop,TS val 88804503 ecr >> 1605043666], length 1220 
>> 16:19:49.666195 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [.], seq >> 1321:2769, ack 3001, win 21, options [nop,nop,TS val 88804504 ecr >> 1605043666], length 1448 >> 16:19:49.666205 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [.], ack >> 2769, win 2004, options [nop,nop,TS val 1605043667 ecr 88804503], length 0 >> 16:19:49.666207 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq >> 2769:2771, ack 3001, win 21, options [nop,nop,TS val 88804504 ecr >> 1605043666], length 2 >> 16:19:49.667183 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [.], seq >> 2771:4219, ack 3001, win 21, options [nop,nop,TS val 88804505 ecr >> 1605043667], length 1448 >> 16:19:49.667190 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [.], seq >> 3001:4345, ack 4219, win 1982, options [nop,nop,TS val 1605043668 ecr >> 88804504], length 1344 >> 16:19:49.667193 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq >> 4219:4221, ack 3001, win 21, options [nop,nop,TS val 88804505 ecr >> 1605043667], length 2 >> 16:19:49.766487 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq >> 4221:4321, ack 4345, win 0, options [nop,nop,TS val 88804605 ecr >> 1605043668], length 100 >> 16:19:49.766499 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [.], ack >> 4321, win 1980, options [nop,nop,TS val 1605043768 ecr 88804505], length 0 >> >> The important packets are the last four: >> >> 1. A --> B: length 1344, fills the remaining window >> 2. B --> A: length 2, does not ack additional data, delayed ack timer is set >> 3. B --> A: length 100, acks #1, immediate ack (delayed ack timer >> cancelled, tcp_output called with ACKNOW) >> 4. A --> B: length 0, acks #1 and #2, because a packet is sent tcp_output >> does not activate the persist timer. >> >> I would normally expect A to begin sending zero-window probes, but (since >> it didn't activate the persist timer) it does not. Using kgdb, I can see >> that the persist timer is not set, only the keep timer is set. 
This is >> kgdb on "A": >> >> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_nxt >> $5 = 3306392292 >> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_max >> $6 = 3306392292 >> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_una >> $7 = 3306392292 >> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_wnd >> $8 = 0 >> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_cwnd >> $9 = 4380 >> (kgdb) print ((struct >> tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_rexmt->c_flags >> $11 = 16 >> (kgdb) print ((struct >> tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_persist->c_flags >> $12 = 16 >> (kgdb) print ((struct >> tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_keep->c_flags >> $13 = 22 >> (kgdb) print ((struct >> tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_2msl->c_flags >> $14 = 16 >> (kgdb) print ((struct >> tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_delack->c_flags >> $15 = 16 >> (kgdb) print ((struct >> tcpcb*)(0xfffffe02ae289b70))->t_inpcb->inp_socket.so_snd.sb_cc >> $16 = 1656 >> >> There is zero window, data in the socket buffer, and the persist timer is >> not set. >> >> My proposed fix follows. If you send a 0-length packet, but there is data >> is the socket buffer, and neither the rexmt or persist timer is already >> set, then activate the persist timer. >> >> --- sys/netinet/tcp_output.c (revision 269644) >> +++ sys/netinet/tcp_output.c (working copy) >> @@ -1290,7 +1290,12 @@ >> tp->t_rxtshift = 0; >> } >> tcp_timer_activate(tp, TT_REXMT, tp->t_rxtcur); >> - } >> + } else if (len == 0 && so->so_snd.sb_cc && >> + !tcp_timer_active(tp, TT_REXMT) && >> + !tcp_timer_active(tp, TT_PERSIST)) { >> + tp->t_rxtshift = 0; >> + tcp_setpersist(tp); >> + } >> >> } else { >> /* >> * Persist case, update snd_max but since we are in >> >> Let me know any comments. Thanks, > > I think your patch is correct, but please file this as a bug report so we can > hopefully wrangle another person to review this. Looks okay to me also from the looks of it. 
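The condition added by the proposed patch can be restated outside the kernel as a minimal sketch. It assumes a simplified, hypothetical `struct sender_state` in place of the real `struct tcpcb`, and `should_arm_persist` is an illustrative name, not a kernel function; it only mirrors the `else if` test shown in the diff above.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for the sender state the patch
 * inspects. This is NOT the kernel's struct tcpcb.
 */
struct sender_state {
	size_t sndbuf_bytes;   /* bytes queued in the send buffer (so_snd.sb_cc) */
	bool   rexmt_active;   /* retransmit timer currently armed */
	bool   persist_active; /* persist timer currently armed */
};

/*
 * Mirrors the test in the proposed "else if" branch: a zero-length
 * segment was just sent, data is still queued, and neither the
 * retransmit nor the persist timer is armed.
 */
static bool
should_arm_persist(const struct sender_state *s, size_t len_sent)
{
	return len_sent == 0 && s->sndbuf_bytes > 0 &&
	    !s->rexmt_active && !s->persist_active;
}
```

With the values from the kgdb session above (1656 bytes queued, retransmit and persist timers not set) and a 0-length ACK just sent, the predicate is true, which is the case where the patch calls tcp_setpersist().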
cheers, Hiren From owner-freebsd-net@FreeBSD.ORG Tue Aug 12 00:07:12 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 77CA2710 for ; Tue, 12 Aug 2014 00:07:12 +0000 (UTC) Received: from mail-la0-x232.google.com (mail-la0-x232.google.com [IPv6:2a00:1450:4010:c03::232]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 017032764 for ; Tue, 12 Aug 2014 00:07:11 +0000 (UTC) Received: by mail-la0-f50.google.com with SMTP id pi18so282323lab.9 for ; Mon, 11 Aug 2014 17:07:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:date:message-id:subject:from:to:content-type; bh=MMLk7793MfwKJP1OWHWSqOAMbEGcpsq/nDoFMbyNO8I=; b=Z+f+7Pcx2lfzvTlZKN+i44hoH6ej7tjz550gsxsVnbBpN3Cj6ogHgR8JMigc48FHFC q7atapsrc0YbdmOdwoaVY4+xau1aKaUyDJjBZy/Npx/1ONmOiaalTBS1P0xps91tQz/a dxCm43rehbGGkMtiB5Rj58i+ypS66RFJg9r9UoESSez1CzH9lwh27FP8aM1YqlO5HDPz BbsqTAtxniR+kFlpTJ3fMr+JM+lxnErPreuebHnYVT1+NH072A1QX/iEDk4rF2bv2PqX b4nCiHZDtxVeZCqR3XdquMpacZulNhkqY8uHZm24JKty9ao+b5RJH2d3X94DL0xy9QW4 vuLg== MIME-Version: 1.0 X-Received: by 10.112.138.102 with SMTP id qp6mr781190lbb.60.1407802029824; Mon, 11 Aug 2014 17:07:09 -0700 (PDT) Sender: hiren.panchasara@gmail.com Received: by 10.114.81.73 with HTTP; Mon, 11 Aug 2014 17:07:09 -0700 (PDT) Date: Mon, 11 Aug 2014 17:07:09 -0700 X-Google-Sender-Auth: st602qmmBIl2kukmM7ja5nHT28A Message-ID: Subject: Regression test suite for TCP From: hiren panchasara To: "freebsd-net@freebsd.org" Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Aug 2014 00:07:12 -0000 I was looking for one and found https://wiki.freebsd.org/SummerOfCode2008#TCP.2FIP_regression_test_suite_.28tcptest.29 which is a good start but needs a lot of love (work). Please share if you are aware of any covering basic scenarios. cheers, Hiren From owner-freebsd-net@FreeBSD.ORG Tue Aug 12 00:35:54 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 53F96669; Tue, 12 Aug 2014 00:35:54 +0000 (UTC) Received: from mail-qa0-x22a.google.com (mail-qa0-x22a.google.com [IPv6:2607:f8b0:400d:c00::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 02BB72A1A; Tue, 12 Aug 2014 00:35:53 +0000 (UTC) Received: by mail-qa0-f42.google.com with SMTP id j15so8598220qaq.29 for ; Mon, 11 Aug 2014 17:35:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=Laqn9vwhjqIzb3bqjUxxkabMv4ItF8fdkdbparmeyiU=; b=FJX3aQ1n/0FCQhFhRumowU3R0hlRfBG+ijiQPgqjW9ZY90emymraR96XbEB48yU3ut pfX5JecWOnxdFzxwfwyhRcEBW6vKss9mGYn430Xc1vbDEbo6/AgmgpjHqz27iH8LN61I XNDS2hURNyx4zKPaDPhNmVVHRSXmh5Mdjkmb1i6+gwowLQQhq0IJ8l5XeuPSXNzrMYzx f3qAWhZ9BY/PqJk+HFDPNvn9d8YXtnel2KAU3XYSv7cRGAay0a6/jjJGW7U26bHFQSk5 61vCzQYyHbehCNHkVxTArG0mDFL6F9jtUNs1w99SPpuf5FvVmRF+Nfv1rpM2tk0JBsVa 6xYQ== MIME-Version: 1.0 X-Received: by 10.224.15.195 with SMTP id l3mr1309061qaa.98.1407803752666; Mon, 11 Aug 2014 17:35:52 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.224.41.6 with HTTP; Mon, 11 Aug 2014 17:35:52 
-0700 (PDT) In-Reply-To: References: <201408111720.18544.jhb@freebsd.org> Date: Mon, 11 Aug 2014 17:35:52 -0700 X-Google-Sender-Auth: B5HrG-VyE303YFGZps8uAlZz2jM Message-ID: Subject: Re: zero window and persist timer not set From: Adrian Chadd To: hiren panchasara Content-Type: text/plain; charset=UTF-8 Cc: "freebsd-net@freebsd.org" , John Baldwin , Jeremiah Lott X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Aug 2014 00:35:54 -0000 Sweet, I can trigger this at home when doing high connection rate TCP tests. Lemme give this a go tonight/tomorrow and see if it changes the behaviour. Thanks! And yes ,please do file a PR! -a On 11 August 2014 17:05, hiren panchasara wrote: > On Mon, Aug 11, 2014 at 2:20 PM, John Baldwin wrote: >> On Wednesday, August 06, 2014 5:25:38 pm Jeremiah Lott wrote: >>> Hello, >>> >>> We've been seeing a problem where a tcp connection is stuck in a zero >>> window condition and even though the client has opened more window space, >>> our FreeBSD box never sends any more. After some analysis it appears that >>> the FreeBSD box is not sending zero window probes, because the persist >>> timer did not get set (we can see in kgdb that the tcpcb shows 0 window, >>> there is data in the socket buffer, but the persist timer is not active). >>> >>> After looking over the code for a while, I think I see the problem. When >>> tcp_output chooses to send a packet, it never arms the persist timer. This >>> causes a problem in the following scenario: >>> >>> 1. A --> B: packet containing enough data to fill the window >>> 2. B --> A: ACK for #1 + new data (0 window advertisement) >>> 3. A --> B: ACK for #2, 0 len packet >>> >>> In this case, A will not activate the persist timer, because it chose to >>> send a packet. 
Unless tcp_output is called for some other reason (delayed >>> ack timer, another input packet from B, socket syscall), A will not send >>> zero window probes. I was finally able to recreate this condition by >>> setting an very small window and running programs that send very specific >>> sequences of packets without calling recv (purposefully forcing a zero >>> window condition). Here is a packet capture that shows the sequence: >>> >>> A == 10.2.15.69 == FreeBSD 9.2 >>> B == 10.2.14.61 == FreeBSD 8.2 >>> >>> 16:19:49.664790 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [S], seq >>> 2362665163, win 4300, options [mss 1460,nop,wscale 6,sackOK,TS val 88804503 >>> ecr 0], length 0 >>> 16:19:49.664821 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [S.], seq >>> 3306387947, ack 2362665164, win 65535, options [mss 1460,nop,wscale >>> 6,sackOK,TS val 1605043666 ecr 88804503], length 0 >>> 16:19:49.664859 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [.], ack 1, >>> win 67, options [nop,nop,TS val 88804503 ecr 1605043666], length 0 >>> 16:19:49.664921 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq >>> 1:101, ack 1, win 67, options [nop,nop,TS val 88804503 ecr 1605043666], >>> length 100 >>> 16:19:49.665137 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [P.], seq >>> 1:3001, ack 101, win 2046, options [nop,nop,TS val 1605043666 ecr >>> 88804503], length 3000 >>> 16:19:49.665208 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq >>> 101:1321, ack 1449, win 45, options [nop,nop,TS val 88804503 ecr >>> 1605043666], length 1220 >>> 16:19:49.666195 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [.], seq >>> 1321:2769, ack 3001, win 21, options [nop,nop,TS val 88804504 ecr >>> 1605043666], length 1448 >>> 16:19:49.666205 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [.], ack >>> 2769, win 2004, options [nop,nop,TS val 1605043667 ecr 88804503], length 0 >>> 16:19:49.666207 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq >>> 2769:2771, ack 3001, win 21, options 
[nop,nop,TS val 88804504 ecr >>> 1605043666], length 2 >>> 16:19:49.667183 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [.], seq >>> 2771:4219, ack 3001, win 21, options [nop,nop,TS val 88804505 ecr >>> 1605043667], length 1448 >>> 16:19:49.667190 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [.], seq >>> 3001:4345, ack 4219, win 1982, options [nop,nop,TS val 1605043668 ecr >>> 88804504], length 1344 >>> 16:19:49.667193 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq >>> 4219:4221, ack 3001, win 21, options [nop,nop,TS val 88804505 ecr >>> 1605043667], length 2 >>> 16:19:49.766487 IP 10.2.14.61.23133 > 10.2.15.69.12345: Flags [P.], seq >>> 4221:4321, ack 4345, win 0, options [nop,nop,TS val 88804605 ecr >>> 1605043668], length 100 >>> 16:19:49.766499 IP 10.2.15.69.12345 > 10.2.14.61.23133: Flags [.], ack >>> 4321, win 1980, options [nop,nop,TS val 1605043768 ecr 88804505], length 0 >>> >>> The important packets are the last four: >>> >>> 1. A --> B: length 1344, fills the remaining window >>> 2. B --> A: length 2, does not ack additional data, delayed ack timer is set >>> 3. B --> A: length 100, acks #1, immediate ack (delayed ack timer >>> cancelled, tcp_output called with ACKNOW) >>> 4. A --> B: length 0, acks #1 and #2, because a packet is sent tcp_output >>> does not activate the persist timer. >>> >>> I would normally expect A to begin sending zero-window probes, but (since >>> it didn't activate the persist timer) it does not. Using kgdb, I can see >>> that the persist timer is not set, only the keep timer is set. 
This is >>> kgdb on "A": >>> >>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_nxt >>> $5 = 3306392292 >>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_max >>> $6 = 3306392292 >>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_una >>> $7 = 3306392292 >>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_wnd >>> $8 = 0 >>> (kgdb) print ((struct tcpcb*)(0xfffffe02ae289b70))->snd_cwnd >>> $9 = 4380 >>> (kgdb) print ((struct >>> tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_rexmt->c_flags >>> $11 = 16 >>> (kgdb) print ((struct >>> tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_persist->c_flags >>> $12 = 16 >>> (kgdb) print ((struct >>> tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_keep->c_flags >>> $13 = 22 >>> (kgdb) print ((struct >>> tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_2msl->c_flags >>> $14 = 16 >>> (kgdb) print ((struct >>> tcpcb*)(0xfffffe02ae289b70))->t_timers->tt_delack->c_flags >>> $15 = 16 >>> (kgdb) print ((struct >>> tcpcb*)(0xfffffe02ae289b70))->t_inpcb->inp_socket.so_snd.sb_cc >>> $16 = 1656 >>> >>> There is zero window, data in the socket buffer, and the persist timer is >>> not set. >>> >>> My proposed fix follows. If you send a 0-length packet, but there is data >>> is the socket buffer, and neither the rexmt or persist timer is already >>> set, then activate the persist timer. >>> >>> --- sys/netinet/tcp_output.c (revision 269644) >>> +++ sys/netinet/tcp_output.c (working copy) >>> @@ -1290,7 +1290,12 @@ >>> tp->t_rxtshift = 0; >>> } >>> tcp_timer_activate(tp, TT_REXMT, tp->t_rxtcur); >>> - } >>> + } else if (len == 0 && so->so_snd.sb_cc && >>> + !tcp_timer_active(tp, TT_REXMT) && >>> + !tcp_timer_active(tp, TT_PERSIST)) { >>> + tp->t_rxtshift = 0; >>> + tcp_setpersist(tp); >>> + } >>> >>> } else { >>> /* >>> * Persist case, update snd_max but since we are in >>> >>> Let me know any comments. 
Thanks, >> >> I think your patch is correct, but please file this as a bug report so we can >> hopefully wrangle another person to review this. > > Looks okay to me also from the looks of it. > > cheers, > Hiren > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Tue Aug 12 03:34:52 2014 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CFEFB5E2 for ; Tue, 12 Aug 2014 03:34:52 +0000 (UTC) Received: from mail-wi0-x231.google.com (mail-wi0-x231.google.com [IPv6:2a00:1450:400c:c05::231]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6EBCA2C86 for ; Tue, 12 Aug 2014 03:34:52 +0000 (UTC) Received: by mail-wi0-f177.google.com with SMTP id ho1so5123131wib.10 for ; Mon, 11 Aug 2014 20:34:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=4qMLOOZbhd0V7Gr1qhEkTcy19hswvBH6Bk6kny0Jwck=; b=UBLs4rqJxg0ktG6ScB72vcbTl8oIs/PFE4vw5CpMl4TXKvXgynQihm5WsPvOMR78T2 ep5OGBiytVF+bcBslOjvaOZ9B7RBYMENnEz/BWjJs9zlaSpS2y5ENODNnRBbWRAE46mQ AOYIwkN+H/ttbWuJ1jEuIEyf+u+dx5R6iQSp02TBFDemX81lvhFZtlA06IurFIKXgCOM eR3MreFifzVU29z5vnpdb2Z4/+RjA5OqWnsvoMRPtPbTkrrDXZ5yt+d++DfkjWhnz/Ha L4tMkBj9UmJ9+ER4IrHZjUZImy77IzjAyHqABGGV4dwlXQVufYgJbkddPeaUbmNa9H/Y V1CQ== MIME-Version: 1.0 X-Received: by 10.194.184.166 with SMTP id ev6mr2009875wjc.61.1407814490405; Mon, 11 Aug 2014 20:34:50 -0700 (PDT) Received: by 10.216.112.138 with HTTP; Mon, 11 Aug 2014 20:34:50 -0700 (PDT) Date: Tue, 12 
Aug 2014 00:34:50 -0300 Message-ID: Subject: Netmap instalation error: No rule to make target `/root/netmap/LINUX/e1000/e1000_main.o' From: Emerson Barea To: net@freebsd.org Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Aug 2014 03:34:52 -0000 I'm trying to install Netmap, but I got this error: [root@localhost LINUX]# make clean; make make[1]: Entering directory `/usr/src/kernels/3.15.8-200.fc20.x86_64' CLEAN /root/netmap/LINUX/.tmp_versions make[1]: Leaving directory `/usr/src/kernels/3.15.8-200.fc20.x86_64' LIN_VER 30f08 ---- Building from /lib/modules/3.15.8-200.fc20.x86_64/build/drivers/net/ethernet ---- copying e1000 e1000e forcedeth.c igb ixgbe virtio_net.c --- From /lib/modules/3.15.8-200.fc20.x86_64/build/drivers/net/ethernet/nvidia : From /lib/modules/3.15.8-200.fc20.x86_64/build/drivers/net/ethernet/realtek : From /lib/modules/3.15.8-200.fc20.x86_64/build/drivers/net/ethernet/intel : drwxr-xr-x. 2 root root 4096 Ago 11 22:04 e1000/ drwxr-xr-x. 2 root root 4096 Ago 11 22:04 e1000e/ drwxr-xr-x. 2 root root 4096 Ago 11 22:04 igb/ drwxr-xr-x. 2 root root 4096 Ago 11 22:04 ixgbe/ From /lib/modules/3.15.8-200.fc20.x86_64/build/drivers/net/ethernet/broadcom : From /lib/modules/3.15.8-200.fc20.x86_64/build/drivers/net/ethernet : From /lib/modules/3.15.8-200.fc20.x86_64/build/drivers/net : ** patch with diff--e1000--20620--99999 The text leading up to this was: -------------------------- |diff --git a/e1000/e1000_main.c b/e1000/e1000_main.c |index bcd192c..5de7009 100644 |--- a/e1000/e1000_main.c |+++ b/e1000/e1000_main.c -------------------------- No file to patch. Skipping patch. 
9 out of 9 hunks ignored ** patch with diff--e1000e--30900--99999 The text leading up to this was: -------------------------- |diff --git a/e1000e/netdev.c b/e1000e/netdev.c |index 7e615e2..f9d8a88 100644 |--- a/e1000e/netdev.c |+++ b/e1000e/netdev.c -------------------------- No file to patch. Skipping patch. 8 out of 8 hunks ignored ** patch with diff--forcedeth.c--20626--99999 The text leading up to this was: -------------------------- |diff --git a/forcedeth.c b/forcedeth.c |index 9c0b1ba..b081d6b 100644 |--- a/forcedeth.c |+++ b/forcedeth.c -------------------------- No file to patch. Skipping patch. 5 out of 5 hunks ignored ** patch with diff--igb--30b00--99999 The text leading up to this was: -------------------------- |diff --git a/igb/igb_main.c b/igb/igb_main.c |index c1d72c0..9815796 100644 |--- a/igb/igb_main.c |+++ b/igb/igb_main.c -------------------------- No file to patch. Skipping patch. 10 out of 10 hunks ignored ** patch with diff--ixgbe--30a00--99999 The text leading up to this was: -------------------------- |diff --git a/ixgbe/ixgbe_main.c b/ixgbe/ixgbe_main.c |index d30fbdd..7418c57 100644 |--- a/ixgbe/ixgbe_main.c |+++ b/ixgbe/ixgbe_main.c -------------------------- No file to patch. Skipping patch. 9 out of 9 hunks ignored ** patch with diff--virtio_net.c--30b00--99999 The text leading up to this was: -------------------------- |diff --git a/virtio_net.c b/virtio_net.c |index 3d2a90a..ae899a4 100644 |--- a/virtio_net.c |+++ b/virtio_net.c -------------------------- No file to patch. Skipping patch. 
7 out of 7 hunks ignored Building the following drivers: e1000 e1000e forcedeth.c igb ixgbe virtio_net.c make -C /lib/modules/3.15.8-200.fc20.x86_64/build M=/root/netmap/LINUX CONFIG_NETMAP=m CONFIG_E1000=m CONFIG_E1000E=m CONFIG_IXGBE=m CONFIG_IGB=m CONFIG_BNX2X=m CONFIG_MLX4=m CONFIG_VIRTIO_NET=m \ EXTRA_CFLAGS='-I/root/netmap/LINUX -I/root/netmap/LINUX/../sys -I/root/netmap/LINUX/../sys/dev -DCONFIG_NETMAP -Wno-unused-but-set-variable' \ O_DRIVERS="e1000/ e1000e/ igb/ ixgbe/" modules make[1]: Entering directory `/usr/src/kernels/3.15.8-200.fc20.x86_64' CC [M] /root/netmap/LINUX/netmap.o CC [M] /root/netmap/LINUX/netmap_mem2.o CC [M] /root/netmap/LINUX/netmap_generic.o CC [M] /root/netmap/LINUX/netmap_mbq.o CC [M] /root/netmap/LINUX/netmap_vale.o CC [M] /root/netmap/LINUX/netmap_offloadings.o CC [M] /root/netmap/LINUX/netmap_pipe.o CC [M] /root/netmap/LINUX/netmap_linux.o LD [M] /root/netmap/LINUX/netmap_lin.o make[3]: *** No rule to make target `/root/netmap/LINUX/e1000/e1000_main.o', needed by `/root/netmap/LINUX/e1000/e1000.o'. Stop. make[2]: *** [/root/netmap/LINUX/e1000] Error 2 make[1]: *** [_module_/root/netmap/LINUX] Error 2 make[1]: Leaving directory `/usr/src/kernels/3.15.8-200.fc20.x86_64' make: *** [build] Error 2 [root@localhost LINUX]# I already tried to install in another server with broadcom interface, but I got the same error. There are 3 Intel e1000 devices in this server. 
[root@localhost LINUX]# ethtool -i p8p1 driver: e1000 version: 7.3.21-k8-NAPI firmware-version: bus-info: 0000:00:09.0 supports-statistics: yes supports-test: yes supports-eeprom-access: yes supports-register-dump: yes supports-priv-flags: no [root@localhost LINUX]# I take Netmap from git clone https://code.google.com/p/netmap/ Thank you From owner-freebsd-net@FreeBSD.ORG Tue Aug 12 08:03:05 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3509AD2B for ; Tue, 12 Aug 2014 08:03:05 +0000 (UTC) Received: from mx11.netapp.com (mx11.netapp.com [216.240.18.76]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "mx11.netapp.com", Issuer "VeriSign Class 3 International Server CA - G3" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id F2D4726F0 for ; Tue, 12 Aug 2014 08:03:04 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.01,847,1400050800"; d="asc'?scan'208";a="139648611" Received: from vmwexceht06-prd.hq.netapp.com ([10.106.77.104]) by mx11-out.netapp.com with ESMTP; 12 Aug 2014 01:03:00 -0700 Received: from HIOEXCMBX02-PRD.hq.netapp.com (10.122.105.35) by vmwexceht06-prd.hq.netapp.com (10.106.77.104) with Microsoft SMTP Server (TLS) id 14.3.123.3; Tue, 12 Aug 2014 01:02:30 -0700 Received: from HIOEXCMBX07-PRD.hq.netapp.com (10.122.105.40) by hioexcmbx02-prd.hq.netapp.com (10.122.105.35) with Microsoft SMTP Server (TLS) id 15.0.913.22; Tue, 12 Aug 2014 01:02:29 -0700 Received: from HIOEXCMBX07-PRD.hq.netapp.com ([::1]) by hioexcmbx07-prd.hq.netapp.com ([fe80::f0de:b572:dd26:36b5%21]) with mapi id 15.00.0913.011; Tue, 12 Aug 2014 01:02:29 -0700 From: "Eggert, Lars" To: hiren panchasara Subject: Re: A problem on TCP in High RTT Environment. Thread-Topic: A problem on TCP in High RTT Environment. 
Thread-Index: AQHPswaXaqmGo9mF3Ua6+epBNevn4ZvJEqcAgAATYoCAAA7VAIAAA8CAgABXqQCAAANCAP//lMnkgAB+UQCAAASwAIAByiAAgAACkYCAAFJhAIAAeY2AgAADqYCAAAVDgIAAQPcAgACJIIA= Date: Tue, 12 Aug 2014 08:02:28 +0000 Message-ID: References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> <0CF85443-26AC-4931-9D00-3396C18C7690@lurchi.franken.de> <7A4120EE-60F3-4D32-89C4-C694B8DFEAE4@netapp.com> <5E8A6382-7096-495A-907C-86CE26A163A2@lurchi.franken.de> <5D3CBFDC-362E-4DB6-A132-BA842EF5B1B2@lurchi.franken.de> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: x-mailer: Apple Mail (2.1878.6) x-originating-ip: [10.122.56.79] Content-Type: multipart/signed; boundary="Apple-Mail=_037FD837-EC43-4A48-9F0A-AA9B508B397E"; protocol="application/pgp-signature"; micalg=pgp-sha1 MIME-Version: 1.0 Cc: Michael Tuexen , John-Mark Gurney , Niu Zhixiong , Bill Yuan , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Aug 2014 08:03:05 -0000 --Apple-Mail=_037FD837-EC43-4A48-9F0A-AA9B508B397E Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii Hi, On 2014-8-12, at 1:52, hiren panchasara wrote: > On Mon, Aug 11, 2014 at 12:59 PM, Michael Tuexen > wrote: >> If I remember correctly, I increased >> kern.ipc.nmbufs and kern.ipc.nmbclusters in /boot/loader.conf > > I believe, you just need to set kern.ipc.nmbclusters (max mbuf > clusters allowed) and kern.ipc.nmbufs (max mbufs allowed) should be > adjusted based on that.
I bumped kern.ipc.nmbclusters by a factor of 100 (from 2036224 to 203622400). As Hiren said, kern.ipc.nmbufs auto-adjusted (from 13031835 to 205111860).

However, I still see "requests for mbufs denied" immediately after reboot.

root@laurel:~ # netstat -m
12280/1580/13860 mbufs in use (current/cache/total)
12279/827/13106/203622400 mbuf clusters in use (current/cache/total/max)
12279/819 mbuf+clusters out of packet secondary zone in use (current/cache)
0/3/3/1018111 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/301662 9k jumbo clusters in use (current/cache/total/max)
0/0/0/169685 16k jumbo clusters in use (current/cache/total/max)
27628K/2061K/29689K bytes allocated to network (current/cache/total)
253/5481/12473 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile

I just noticed that the total "mbufs in use" didn't seem to have increased when I did the 100x scaling of kern.ipc.nmbclusters (and kern.ipc.nmbufs auto-adjusted). Neither did "bytes allocated to network". Is that expected?
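To compare the "requests for mbufs denied" counters before and after a tuning change, the three numbers can be pulled out of that netstat -m line. The following is a minimal sketch that assumes the line format shown above; `parse_mbufs_denied` is a hypothetical helper written for illustration, not part of netstat or any FreeBSD library.

```c
#include <stdio.h>

/*
 * Parses a line of the form
 *   "253/5481/12473 requests for mbufs denied (mbufs/clusters/mbuf+clusters)"
 * and stores the three counters. Returns 1 on success and 0 if the
 * line is not the "requests for mbufs denied" line.
 */
static int
parse_mbufs_denied(const char *line, unsigned long *mbufs,
    unsigned long *clusters, unsigned long *both)
{
	int consumed = 0;

	/* %n is only stored if the literal text after the numbers matched. */
	if (sscanf(line, "%lu/%lu/%lu requests for mbufs denied%n",
	    mbufs, clusters, both, &consumed) != 3)
		return 0;
	return consumed > 0;
}
```

Feeding it each line of the output and recording the values after every reboot would show whether the denials stop growing once the limits are raised; the "delayed" line is deliberately rejected because its literal text differs.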
Lars --Apple-Mail=_037FD837-EC43-4A48-9F0A-AA9B508B397E Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="signature.asc" Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- iQCVAwUBU+nKM9ZcnpRveo1xAQJ2VwQAjfA/6j0hSNjq+2iTPE5xH2CRPFCKPY0z tT/REyB8OUKPmDyqD8Gt7RciOa4UHk+BEVrD3pGtM+WKW/pjNd0dumeUzrLhKBdx MG69c1PSr0ILhl2VY3Q7ObHZqhLwLswbi4ugCJH3D++BFIt1EU6Kka26oG8Kkyhu +S90EPhqZCE= =7rj+ -----END PGP SIGNATURE----- --Apple-Mail=_037FD837-EC43-4A48-9F0A-AA9B508B397E-- From owner-freebsd-net@FreeBSD.ORG Tue Aug 12 08:22:55 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id DDE0B227 for ; Tue, 12 Aug 2014 08:22:55 +0000 (UTC) Received: from mail-qc0-x236.google.com (mail-qc0-x236.google.com [IPv6:2607:f8b0:400d:c01::236]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 93F8528E9 for ; Tue, 12 Aug 2014 08:22:55 +0000 (UTC) Received: by mail-qc0-f182.google.com with SMTP id i8so2595499qcq.27 for ; Tue, 12 Aug 2014 01:22:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=LYPXX9ciZjEXTlmrPOYF3MMdpBwV3vBNB9mYK4p6Txc=; b=lYXLKU3trnYJYTg4hZ5oepn9lM48TRL8PwaIxkg+wUEpR5DaeXX+gXRLAFE9SfWKk5 g6O/YB9wPHKj4+tDKyGK4+CxuW0GkRy+nwXTHwi2Cb8Pzo3496klqNc9mUwY0bKYcreO tXBO+LxBlY2SsYAk0zDORGQi4oRGqGFILBP2UEY1DTGcpofsOwm66nQP5xYoAo1gt/VS dU8g+im0ym86smnM+dpYyoioUVm+F0v4q2zZWboSCxUH2dUSSdN6+ND0JMmIX8kBxdFw uN4Y2gyo5/YGWyAVt/GB1ti8AUCvIWJzxnXb78fDggVDkUyv/L56z7rl9XEXdwLnXxsO jD+w== MIME-Version: 1.0 
X-Received: by 10.224.95.74 with SMTP id c10mr3983970qan.35.1407831774591; Tue, 12 Aug 2014 01:22:54 -0700 (PDT) Received: by 10.224.65.65 with HTTP; Tue, 12 Aug 2014 01:22:54 -0700 (PDT) In-Reply-To: <20140811171517.GW83475@funkthat.com> References: <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <20140810045355.GM83475@funkthat.com> <20140811171517.GW83475@funkthat.com> Date: Tue, 12 Aug 2014 16:22:54 +0800 Message-ID: Subject: Re: A problem on TCP in High RTT Environment. From: Niu Zhixiong To: Niu Zhixiong , Michael Tuexen , "freebsd-net@freebsd.org" , Bill Yuan Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Aug 2014 08:22:56 -0000 I use a switch and capture on the sender's mirror port, and I also noticed that some ACKs appear before the segment they acknowledge. I am not sure how to solve the problem. However, in my KVM-based virtual machine test environment there are no such issues. testtest.tar.gz Regards, Niu Zhixiong －－－－－－－－－－－－－－－ kaiaixi@gmail.com On Tue, Aug 12, 2014 at 1:15 AM, John-Mark Gurney wrote: > Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 20:27 +0800: > > Hi, I am not sure whether my last email was filtered by the mailing list. > > After disabling TSO, the speed became even poorer. > > This is the packet capture. Please see Google Drive.
> > tcp_with_tso_off.pcapng.gz
> > <https://docs.google.com/file/d/0By8sTL79ob4tYXQ0N0lZN0FUNVE/edit?usp=drive_web>
>
> So, the reason that this is also slow is that it only ever really has one
> segment on the wire at a time... This is similar to the previous
> packet capture...
>
> Which side was this captured on? Was this the receiving
> side? Because it looks like packets are getting merged still...
>
> 22:19:25.628087 IP 10.0.10.2.62995 > 10.0.10.3.9000: Flags [.], seq 149171:152067, ack 1, win 32783, options [nop,nop,TS val 61731427 ecr 2405797018], length 2896
>
> and as before:
> 22:19:25.634095 IP 10.0.10.2.62995 > 10.0.10.3.9000: Flags [.], seq 165099:166547, ack 1, win 32783, options [nop,nop,TS val 61731431 ecr 2405797022], length 1448
> 22:19:25.635084 IP 10.0.10.3.9000 > 10.0.10.2.62995: Flags [.], ack 167995, win 32745, options [nop,nop,TS val 2405797438 ecr 61731431], length 0
> 22:19:25.635097 IP 10.0.10.2.62995 > 10.0.10.3.9000: Flags [.], seq 166547:167995, ack 1, win 32783, options [nop,nop,TS val 61731431 ecr 2405797022], length 1448
> 22:19:25.636073 IP 10.0.10.2.62995 > 10.0.10.3.9000: Flags [.], seq 167995:170891, ack 1, win 32783, options [nop,nop,TS val 61731431 ecr 2405797022], length 2896
> 22:19:25.636266 IP 10.0.10.3.9000 > 10.0.10.2.62995: Flags [.], ack 170891, win 32745, options [nop,nop,TS val 2405797439 ecr 61731431], length 0
>
> Though the other thing I noticed is that we appear to be ack'ing before
> the segment was received, which is a bit odd... And it happens quite
> consistently...
>
> We really need someone who knows our TCP stack to comment on this...
>
> > On Sun, Aug 10, 2014 at 1:24 PM, Niu Zhixiong wrote:
> >
> > > Hi,
> > > After disabled tso, the speed become even poorer.
> > > This is the packets captures. Plz see google drive.
> > > tcp_with_tso_off.pcapng.gz > > > < > https://docs.google.com/file/d/0By8sTL79ob4tYXQ0N0lZN0FUNVE/edit?usp=3Ddr= ive_web > > > > > ??? > > > > > > > > > John-Mark Gurney >???2014???8???10????????????????????? > > > > > > Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 11:48 +0800: > > >> > I am using Intel I350-T4 NIC. The LRO is closed by default. And by > the > > >> way, > > >> > when I am using KVM-based virtual machine(virtio NIC) do the exact= ly > > >> same > > >> > test. The results are same. > > >> > > >> Have you tried disabling tso? I asked that in an earlier email, but > > >> never heard from you if that changed anything... > > >> > > >> a lot of the trace looks like: > > >> 19:29:57.223574 IP 10.0.10.2.61010 > 10.0.10.3.9000: . > > >> 251521:257313(5792) ack 1 win 32783 1047294279> > > >> 19:29:57.223798 IP 10.0.10.3.9000 > 10.0.10.2.61010: . ack 257313 wi= n > > >> 32745 > > >> 19:29:57.225570 IP 10.0.10.2.61010 > 10.0.10.3.9000: . > > >> 257313:263105(5792) ack 1 win 32783 1047294279> > > >> > > >> Notice how the ack comes back immediately, but for some reason, we > decide > > >> to > > >> wait almost 2ms before sending out the next frame... > > >> > > >> For some reason, we just aren't filling our window out... tcptcace'= s > > >> graphs shows the winow at 2MB, but we only ever have 4 segments > > >> outstanding at once... > > >> > > >> > ifconfig igb0 > > >> > igb0: flags=3D8843 metric = 0 > mtu > > >> 1500 > > >> > > > >> > options=3D403bb > > >> > ether a0:36:9f:38:27:d0 > > >> > inet 10.0.10.3 netmask 0xffffff00 broadcast 10.0.10.255 > > >> > inet6 fe80::a236:9fff:fe38:27d0%igb0 prefixlen 64 scopeid 0x1 > > >> > nd6 options=3D29 > > >> > media: Ethernet autoselect (1000baseT ) > > >> > status: active > > >> > > > >> > Regards, > > >> > Niu Zhixiong > > >> > ????????????????????????????????????????????? 
> > >> > kaiaixi@gmail.com > > >> > > > >> > > > >> > On Sun, Aug 10, 2014 at 11:32 AM, John-Mark Gurney < > jmg@funkthat.com> > > >> wrote: > > >> > > > >> > > Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:50 > +0800: > > >> > > > I am sorry that I upload a WRONG SCTP capture. But, the > throughput > > >> is > > >> > > same. > > >> > > > SCTP is double than TCP, about 18Mbps. > > >> > > > ??? > > >> > > > sctp_2.pcapng.gz > > >> > > > < > > >> > > > > >> > https://docs.google.com/file/d/0By8sTL79ob4tMlh4WDlTSndHX0k/edit?usp=3Ddr= ive_web > > >> > > > > > >> > > > ??? > > >> > > > > >> > > Ok, the owin graph is very interesting... We do have a full 2MB > > >> window > > >> > > on the receiver side, but for some reason, we only ever have jus= t > > >> under > > >> > > 6k outstanding on the connection... > > >> > > > > >> > > So, it looks like we send for a short period of time, and then > stop > > >> > > sending... Do you have LRO enabled? I think it might be relate= d > to: > > >> > > https://svnweb.freebsd.org/changeset/base/r256920 > > >> > > > > >> > > As I'm seeing >100ms gaps where the sender doesn't send any data= , > and > > >> > > as soon as more than one ack comes in, the next segment goes > out... > > >> If > > >> > > we only receive a single ack, then we wait for a timeout before > > >> sending > > >> > > the next segment.. > > >> > > > > >> > > Can you try to disable LRO on the receiving host? > > >> > > > > >> > > ifconfig -lro > > >> > > > > >> > > And see if that helps... If it does... Applying the patch, or > > >> compiling > > >> > > a more recent kernel from stable/10 that is after r257367 as tha= t > is > > >> was > > >> > > the date that the change was merged... > > >> > > > > >> > > > On Sun, Aug 10, 2014 at 10:42 AM, Niu Zhixiong < > kaiaixi@gmail.com> > > >> > > wrote: > > >> > > > > > >> > > > > I am sure that wnd is about 2MB all the time. > > >> > > > > This is my latest capture, plz see Google Drive. 
> > >> > > > > In the latest test, TCP(0s-120s) is about 9Mbps and > SCTP(0s-120s) > > >> is > > >> > > about > > >> > > > > 18Mbps. > > >> > > > > (The bandwidth(20Mbps) and delay(200ms) is set by dummynet) > > >> > > > > The SCTP and TCP are tested in same environment. > > >> > > > > > > >> > > > > ??? > > >> > > > > sctp.pcapng.gz > > >> > > > > < > > >> > > > > >> > https://docs.google.com/file/d/0By8sTL79ob4tYl9sM2V5a19iNVU/edit?usp=3Ddr= ive_web > > >> > > > > > >> > > > > ?????? > > >> > > > > tcp.pcapng.gz > > >> > > > > < > > >> > > > > >> > https://docs.google.com/file/d/0By8sTL79ob4tV0NMR1FYLUQ3MWs/edit?usp=3Ddr= ive_web > > >> > > > > > >> > > > > ??? > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > Regards, > > >> > > > > Niu Zhixiong > > >> > > > > ????????????????????????????????????????????? > > >> > > > > kaiaixi@gmail.com > > >> > > > > > > >> > > > > > > >> > > > > On Sun, Aug 10, 2014 at 10:23 AM, John-Mark Gurney < > > >> jmg@funkthat.com> > > >> > > > > wrote: > > >> > > > > > > >> > > > >> Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:= 12 > > >> +0800: > > >> > > > >> > During the TCP4 transmission. > > >> > > > >> > Proto Recv-Q Send-Q Local Address Foreign Addres= s > > >> > > > >> (state) > > >> > > > >> > tcp4 0 2097346 10.0.10.2.13504 10.0.10.3.900= 0 > > >> > > > >> > ESTABLISHED > > >> > > > >> > > >> > > > >> Ok, so you are getting a full 2MB in there, and w/ that, yo= u > > >> should > > >> > > > >> easily be saturating your pipe... > > >> > > > >> > > >> > > > >> The next thing would be to get a tcpdump, and take a look a= t > the > > >> > > > >> window size.. Wireshark has lots of neat tools to make this > > >> analysis > > >> > > > >> easy... Another tool that is good is tcptrace.. It can > output a > > >> > > > >> variety of different graphs that will help you track down, > and > > >> see > > >> > > > >> what part of the system is the problem... 
> > >> > > > >> > > >> > > > >> You probably only need a few tens of seconds of the > tcpdump... > > >> > > > >> > > >> > > > >> > On Sun, Aug 10, 2014 at 4:58 AM, Michael Tuexen < > > >> > > > >> > Michael.Tuexen@lurchi.franken.de> wrote: > > >> > > > >> > > > >> > > > >> > > > > >> > > > >> > > On 09 Aug 2014, at 22:45, John-Mark Gurney < > jmg@funkthat.com > > >> > > > >> > > wrote: > > >> > > > >> > > > > >> > > > >> > > > Michael Tuexen wrote this message on Sat, Aug 09, 201= 4 > at > > >> 21:51 > > >> > > > >> +0200: > > >> > > > >> > > >> > > >> > > > >> > > >> On 09 Aug 2014, at 20:42, John-Mark Gurney < > > >> jmg@funkthat.com> > > >> > > > >> wrote: > > >> > > > >> > > >> > > >> > > > >> > > >>> Niu Zhixiong wrote this message on Fri, Aug 08, 201= 4 > at > > >> 20:34 > > >> > > > >> +0800: > > >> > > > >> > > >>>> Dear all, > > >> > > > >> > > >>>> > > >> > > > >> > > >>>> Last month, I send problems related to FTP/TCP in = a > > >> high RTT > > >> > > > >> > > environment. > > >> > > > >> > > >>>> After that, I setup a simulation > environment(Dummynet) > > >> to > > >> > > test > > >> > > > >> TCP > > >> > > > >> > > and SCTP > > >> > > > >> > > >>>> in high delay environment. After finishing the > test, I > > >> can > > >> > > see > > >> > > > >> TCP is > > >> > > > >> > > >>>> always slower than SCTP. But, I think it is not > > >> possible. > > >> > > (Plz > > >> > > > >> see the > > >> > > > >> > > >>>> figure in the attachment). When the delay is > 200ms(means > > >> > > > >> RTT=3D400ms). > > >> > > > >> > > >>>> Besides, the TCP is extremely slow. 
> > >> > > > >> > > >>>> > > >> > > > >> > > >>>> ALL BW=3D20Mbps, DELAY=3D 0 ~ 200MS, Packet LOSS = =3D 0 (by > > >> > > dummynet) > > >> > > > >> > > >>>> > > >> > > > >> > > >>>> This is my parameters: > > >> > > > >> > > >>>> FreeBSD vfreetest0 10.0-RELEASE FreeBSD 10.0-RELEA= SE > > >> #0: Thu > > >> > > Aug > > >> > > > >> 7 > > >> > > > >> > > >>>> 11:04:15 HKT 2014 > > >> > > > >> > > >>>> > > >> > > > >> > > >>>> sysctl net.inet.tcp > > >> > > > >> > > >>> > > >> > > > >> > > >>> [...] > > >> > > > >> > > >>> > > >> > > > >> > > >>>> net.inet.tcp.recvbuf_auto: 0 > > >> > > > >> > > >>> > > >> > > > >> > > >>> [...] > > >> > > > >> > > >>> > > >> > > > >> > > >>>> net.inet.tcp.sendbuf_auto: 0 > > >> > > > >> > > >>> > > >> > > > >> > > >>> Try enabling this... This should allow the buffer = to > > >> grow > > >> > > large > > >> > > > >> enough > > >> > > > >> > > >>> to deal w/ the higher latency... > > >> > > > >> > > >>> > > >> > > > >> > > >>> Also, make sure your program isn't setting the recv > > >> buffer > > >> > > size > > >> > > > >> as that > > >> > > > >> > > >>> will disable the auto growing... > > >> > > > >> > > >> I think the program sets the buffer to 2MB, which it > also > > >> does > > >> > > for > > >> > > > >> SCTP. > > >> > > > >> > > >> So having both statically at the same size makes sen= se > > >> for the > > >> > > > >> > > comparison. > > >> > > > >> > > >> I remember that there was a bug in the combination o= f > LRO > > >> and > > >> > > > >> delayed > > >> > > > >> > > ACK, > > >> > > > >> > > >> which was fixed, but I don't remember it was fixed > before > > >> > > 10.0... > > >> > > > >> > > > > > >> > > > >> > > > Sounds like disabling LRO and TSO would be a useful > test > > >> to see > > >> > > if > > >> > > > >> that > > >> > > > >> > > > improves things... But hiren said that the fix made > it, > > >> so... 
> > >> > > > >> > > > > > >> > > > >> > > >>> If you use netstat -a, you should be able to see th= e > > >> send-q > > >> > > on the > > >> > > > >> > > >>> sender grow as necessary... > > >> > > > >> > > > > > >> > > > >> > > > Also, getting the send-q output while it's running > would > > >> let us > > >> > > know > > >> > > > >> > > > if the buffer is getting to 2MB or not... > > >> > > > >> > > That is correct. Niu: Can you provide this? > > -- > John-Mark Gurney Voice: +1 415 225 5579 > > "All that I will do, has been done, All that I have, has not." > From owner-freebsd-net@FreeBSD.ORG Tue Aug 12 10:31:21 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A69EEB81 for ; Tue, 12 Aug 2014 10:31:21 +0000 (UTC) Received: from mail-n.franken.de (drew.ipv6.franken.de [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "mail-n.franken.de", Issuer "Thawte DV SSL CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 5985827F7 for ; Tue, 12 Aug 2014 10:31:21 +0000 (UTC) Received: from [192.168.1.200] (p54819867.dip0.t-ipconnect.de [84.129.152.103]) (Authenticated sender: macmic) by mail-n.franken.de (Postfix) with ESMTP id 888D71C0E96AC; Tue, 12 Aug 2014 12:31:16 +0200 (CEST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: A problem on TCP in High RTT Environment. 
From: Michael Tuexen In-Reply-To: Date: Tue, 12 Aug 2014 12:31:15 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <45B75806-ECE3-43D0-A75B-1B71A5969234@lurchi.franken.de> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> <0CF85443-26AC-4931-9D00-3396C18C7690@lurchi.franken.de> <7A4120EE-60F3-4D32-89C4-C694B8DFEAE4@netapp.com> <5E8A6382-7096-495A-907C-86CE26A163A2@lurchi.franken.de> <5D3CBFDC-362E-4DB6-A132-BA842EF5B1B2@lurchi.franken.de> To: "Eggert, Lars" X-Mailer: Apple Mail (2.1878.6) Cc: "freebsd-net@freebsd.org" , John-Mark Gurney , Niu Zhixiong , hiren panchasara , Bill Yuan X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Aug 2014 10:31:21 -0000

On 12 Aug 2014, at 10:02, Eggert, Lars wrote:

> Hi,
>
> On 2014-8-12, at 1:52, hiren panchasara wrote:
>> On Mon, Aug 11, 2014 at 12:59 PM, Michael Tuexen wrote:
>>> If I remember correctly, I increased
>>> kern.ipc.nmbufs and kern.ipc.nmbclusters in /boot/loader.conf
>>
>> I believe, you just need to set kern.ipc.nmbclusters (max mbuf
>> clusters allowed) and kern.ipc.nmbufs (max mbufs allowed) should be
>> adjusted based on that.
>
> I bumped kern.ipc.nmbclusters by a factor of 100 (from 2036224 to 203622400). As Hiren said, kern.ipc.nmbufs auto-adjusted (from 13031835 to 205111860).
Just to double check: You changed it in /boot/loader.conf, right?
>
> However, I still see "requests for mbufs denied" immediately after reboot.
>
> root@laurel:~ # netstat -m
> 12280/1580/13860 mbufs in use (current/cache/total)
> 12279/827/13106/203622400 mbuf clusters in use (current/cache/total/max)
> 12279/819 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/3/3/1018111 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/301662 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/169685 16k jumbo clusters in use (current/cache/total/max)
> 27628K/2061K/29689K bytes allocated to network (current/cache/total)
> 253/5481/12473 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
>
> I just noticed that the total "mbufs in use" didn't seem to have increase when I did the 100x scaling of kern.ipc.nmbclusters (and kern.ipc.nmbufs auto-adjusted). Neither did "bytes allocated to network". Is that expected?
I don't think so...
Best regards Michael >=20 > Lars From owner-freebsd-net@FreeBSD.ORG Tue Aug 12 10:43:52 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B8BCD1FB for ; Tue, 12 Aug 2014 10:43:52 +0000 (UTC) Received: from mx1.netapp.com (mx1.netapp.com [216.240.18.38]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "mx1.netapp.com", Issuer "VeriSign Class 3 Secure Server CA - G3" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 814A329E2 for ; Tue, 12 Aug 2014 10:43:52 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.01,848,1400050800"; d="asc'?scan'208";a="338527006" Received: from vmwexceht03-prd.hq.netapp.com ([10.106.76.241]) by mx1-out.netapp.com with ESMTP; 12 Aug 2014 03:43:53 -0700 Received: from HIOEXCMBX05-PRD.hq.netapp.com (10.122.105.38) by vmwexceht03-prd.hq.netapp.com (10.106.76.241) with Microsoft SMTP Server (TLS) id 14.3.123.3; Tue, 12 Aug 2014 03:43:21 -0700 Received: from HIOEXCMBX07-PRD.hq.netapp.com (10.122.105.40) by hioexcmbx05-prd.hq.netapp.com (10.122.105.38) with Microsoft SMTP Server (TLS) id 15.0.913.22; Tue, 12 Aug 2014 03:43:02 -0700 Received: from HIOEXCMBX07-PRD.hq.netapp.com ([::1]) by hioexcmbx07-prd.hq.netapp.com ([fe80::f0de:b572:dd26:36b5%21]) with mapi id 15.00.0913.011; Tue, 12 Aug 2014 03:43:03 -0700 From: "Eggert, Lars" To: Michael Tuexen Subject: Re: A problem on TCP in High RTT Environment. Thread-Topic: A problem on TCP in High RTT Environment. 
Thread-Index: AQHPswaXaqmGo9mF3Ua6+epBNevn4ZvJEqcAgAATYoCAAA7VAIAAA8CAgABXqQCAAANCAP//lMnkgAB+UQCAAASwAIAByiAAgAACkYCAAFJhAIAAeY2AgAADqYCAAAVDgIAAQPcAgACJIICAACltgIAAA24A Date: Tue, 12 Aug 2014 10:43:02 +0000 Message-ID: <56F2C841-6192-4C64-92B0-18BF12D07EAB@netapp.com> References: <20140809184232.GF83475@funkthat.com> <8AE1AC56-D52F-4F13-AAA3-BB96042B37DD@lurchi.franken.de> <20140809204500.GG83475@funkthat.com> <3F6BC212-4223-4AAC-8668-A27075DC55C2@lurchi.franken.de> <20140810022350.GI83475@funkthat.com> <20140810033212.GL83475@funkthat.com> <17A804F3-BEA6-46F4-887F-B68750618FD9@netapp.com> <0CF85443-26AC-4931-9D00-3396C18C7690@lurchi.franken.de> <7A4120EE-60F3-4D32-89C4-C694B8DFEAE4@netapp.com> <5E8A6382-7096-495A-907C-86CE26A163A2@lurchi.franken.de> <5D3CBFDC-362E-4DB6-A132-BA842EF5B1B2@lurchi.franken.de> <45B75806-ECE3-43D0-A75B-1B71A5969234@lurchi.franken.de> In-Reply-To: <45B75806-ECE3-43D0-A75B-1B71A5969234@lurchi.franken.de> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: x-mailer: Apple Mail (2.1878.6) x-originating-ip: [10.122.56.79] Content-Type: multipart/signed; boundary="Apple-Mail=_6D45AB0D-ECFA-49AD-AC44-649F95916F0E"; protocol="application/pgp-signature"; micalg=pgp-sha1 MIME-Version: 1.0 Cc: "freebsd-net@freebsd.org" , John-Mark Gurney , Niu Zhixiong , hiren panchasara , Bill Yuan X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Aug 2014 10:43:52 -0000 --Apple-Mail=_6D45AB0D-ECFA-49AD-AC44-649F95916F0E Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii On 2014-8-12, at 12:31, Michael Tuexen = wrote: > On 12 Aug 2014, at 10:02, Eggert, Lars wrote: >> I bumped kern.ipc.nmbclusters by a factor of 100 (from 2036224 to = 203622400). 
As Hiren said, kern.ipc.nmbufs auto-adjusted (from 13031835 to 205111860).
> Just to double check: You changed it in /boot/loader.conf, right?

Yep, and it has taken effect:

# sysctl -a | egrep 'nmb|mbuf'
kern.ipc.maxmbufmem: 16680744960
kern.ipc.nmbclusters: 203622400
kern.ipc.nmbjumbop: 1018111
kern.ipc.nmbjumbo9: 904986
kern.ipc.nmbjumbo16: 678740
kern.ipc.nmbufs: 205111860
net.inet.sctp.max_chained_mbufs: 5

>> I just noticed that the total "mbufs in use" didn't seem to have increase when I did the 100x scaling of kern.ipc.nmbclusters (and kern.ipc.nmbufs auto-adjusted). Neither did "bytes allocated to network". Is that expected?
> I don't think so...

Am I hitting some other kernel limit on how much memory in total the stack can use?

Lars

--Apple-Mail=_6D45AB0D-ECFA-49AD-AC44-649F95916F0E Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="signature.asc" Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- iQCVAwUBU+nv1dZcnpRveo1xAQI0/AP+MwQU7XgTj+ppVbrmflFiQJeL27NrE+pb Y2qxEqzpVEHu6Qzat5at7lMkjeiu88kGL1GNwi5nJeylNgfUYIA9mX+5XaGk1itR /IHtJXd/2w6GQtK1ryFM/u2tUGPYLzvcxJoy+YLMgGyfaiLLKqVbN8LVCLZEzXt5 QBjJmKkUmY4= =Rm5E -----END PGP SIGNATURE----- --Apple-Mail=_6D45AB0D-ECFA-49AD-AC44-649F95916F0E-- From owner-freebsd-net@FreeBSD.ORG Tue Aug 12 11:55:14 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D2D942E9 for ; Tue, 12 Aug 2014 11:55:14 +0000 (UTC) Received: from mail-wi0-f176.google.com (mail-wi0-f176.google.com [209.85.212.176]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS
id 5B07522C4 for ; Tue, 12 Aug 2014 11:55:13 +0000 (UTC) Received: by mail-wi0-f176.google.com with SMTP id bs8so5706383wib.9 for ; Tue, 12 Aug 2014 04:55:06 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :cc:subject:references:in-reply-to:content-type; bh=oU1UE2/RfgRT+JNVjv54jI+Fsp63LmbqH0IzM10m+5k=; b=ca/BQkHUrtPezxC8V9AVp0hZu4bvmRRqKJxugA7fFYgAQFokv0tL3jhwTsco1SPZTC bGwRouMa28nkQFb4bQq7l0u+qKtxiTSYiTB1OXGA8jBSJ7lhSjbu0faJ0DQ9lbt4dBrZ af6j2cVVl/ELymadt+L/4jAM/bqPpBh0K9rBoouicEkkel3HjuT8Q5AYWUpU9re2zeNP LKr8tX5Ljs5X+EnuY/gy85wgIw6TuSfOJabaVkji5ohbn7dop89H9AKQY7F5zK8tDhXr 7mUBlxxwJluxPRIAhD+QhGWDtKqIpEtmcsaNr+uAJ4kZRlw2zn9d//eadjDM9vlC9IrT Q7Tg== X-Gm-Message-State: ALoCoQl/+3Ea0HMnADI+sQwUmk0E3kdAawcqoX7yjVYpaWb73QPkdOrwm7bxj7QD/ns4NydA6Rvn X-Received: by 10.194.22.166 with SMTP id e6mr4729704wjf.88.1407844148004; Tue, 12 Aug 2014 04:49:08 -0700 (PDT) Received: from [10.0.0.3] (bzq-79-182-26-155.red.bezeqint.net. [79.182.26.155]) by mx.google.com with ESMTPSA id es9sm8808024wjd.1.2014.08.12.04.49.06 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Tue, 12 Aug 2014 04:49:07 -0700 (PDT) Message-ID: <53E9FF32.3010802@cloudius-systems.com> Date: Tue, 12 Aug 2014 14:49:06 +0300 From: Vlad Zolotarov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.7.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: TCP Rx window auto sizing relies on TCP timestamp option? 
References: <53E8B424.2000904@cloudius-systems.com> <20140811170606.GV83475@funkthat.com> In-Reply-To: <20140811170606.GV83475@funkthat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.18 Cc: Osv Dev X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Aug 2014 11:55:15 -0000 On Aug 11, 2014 8:06 PM, "John-Mark Gurney" > wrote: > > Vlad Zolotarov wrote this message on Mon, Aug 11, 2014 at 15:16 +0300: > > Hi, I have the most strange question about the TCP Rx window auto sizing > > implementation in a FreeBSD networking stack. > > When I looked at the FreeBSD code (hash > > 9abce0e567c9a5a0520cdd94d5c633c7baf9a184) I noticed that > > the mentioned above feature will not be "enabled" if there isn't a TCP > > timestamp option present in the current TCP session: > > > > See sys/netinet/tcp_input.c: line 1813 in tcp_do_segment() function: > > > > if (V_tcp_do_autorcvbuf && > > *to.to_tsecr* && <-------- this is what I'm > > talking about > > (so->so_rcv.sb_flags & SB_AUTOSIZE)) > > > > So, if i read the code correctly, if there isn't a TS option (negotiated > > and thus present in every received packet) the receive socket buffer > > won't grow thus preventing the growth of the Rx window. > > If that's the case this is very strange since TS option is not promised > > and even more - in many cases it won't be present. > > For example in Linux this feature is disabled by default (controlled by > > /proc/sys/net/ipv4/tcp_timestamps). 
> > This is how I actually noticed the problem the first place: I ran iperf > > test where Linux was an initiator and a transmitter (iperf -c) FreeBSD > > box was a receiver (iperf -s) and I noticed that the Rx window wasn't > > opening up because Linux box hasn't negotiated the TS option in the SYN. > > As a result, the throughput numbers were significantly lower compared to > > Linux-to-Linux setup (Linux uses a Dynamic Right-Sizing (DRS) algorithm > > http://public.lanl.gov/radiant/pubs.html#DRS, which doesn't rely on TS). > > > > Could anybody comment on this, pls.? > > Did I miss anything? > > Is it true that FreeBSD assumes that TS option is always present and if > > not how can I cause an Rx Window to open up when TS option hasn't been > > negotiated? > > This means the receive buffer won't grow beyond the default of 64k... > But, as the comment says: > * On the receive side the socket buffer memory is only rarely > * used to any significant extent. This allows us to be much > > The receive buffer will only get used if the application takes too long > to read it's buffer, or it isn't currently waiting... If that's the > case, then the application should be fixed to be able to process the > data as quickly as it comes in... U r right about the Rx buffer and as a result the Rx window will not grow beyond this value too. See the following lines: tcp_output.c: tcp_output(): line 509: recwin = sbspace(&so->so_rcv); line 1034: /* * According to RFC1323 the window field in a SYN (i.e., a * or ) segment itself is never scaled. The * case is handled in syncache. */ if (flags & TH_SYN) th->th_win = htons((u_short) (min(sbspace(&so->so_rcv), TCP_MAXWIN))); else th->th_win = htons((u_short)(recwin >> tp->rcv_scale)); As a result the Tx window of a transmitter will not grow beyond 64K as well and this is a single full LSO/LRO frame. 
So this will limit a transmitter by a single LSO frame (64K) frame per RTT since the receiver will only "see" the new bytes only after they are delivered by a HW and this will be after all 64KB (full LRO aggregation) are received and only then it will send an ACK. Now let's consider u have a 0.2ms RTT like I have on my setup with 40Gbps ConnectX 3 NICs connected back to back. So, in this case the best throughput u'll ever get with the 64K window will be 8*64K/0.2ms ~ 2.5Gbps which is 1/16 of a line rate and u need at least 64K*16 ~ 1MB window to reach the line rate. And the higher RTT the larger Window we'll need. And this is in case the application frees the socket buffer immediately once it arrives which may never be the case of course. I suppose use cases like above were exactly the motivation for Window Scaling option in RFC 1323. > > So, I don't see much of an issue w/ the code you pointed out, yes, > the receive buffer won't grow, > but there are options that you can set > (sysctl net.inet.tcp.recvspace) and SO_RCVBUF in the application that > will address it otherwise... Exactly! If there is no TS - it won't and FreeBSD will not be able to utilize the network link. Frankly, I don't understand your advice - u suggest for each and every application to go and manually configure a receive socket buffer size? Or increase the initial socket buffer globally, which is even worse?! And which value should we choose? As u may see above the proper value depends on the RTT and RTT may change while application runs due to routing change. I doubt your suggestion is feasible. So, my first question stands - doesn't FreeBSD community think that it would be beneficial for FreeBSD to use a DRS (or similar?) algorithm when there are no TS negotiated? thanks, vlad > > Obviously setting the default too large will just waste memory... > > -- > John-Mark Gurney Voice: +1 415 225 5579 > > "All that I will do, has been done, All that I have, has not." 
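
For reference, the window arithmetic in the message above can be reproduced with a small C sketch. The function names bdp_bytes and max_throughput_bps are made up for illustration and are not FreeBSD APIs; the inputs (64K window, 0.2 ms RTT, 40 Gbps link) are the figures quoted in the message, and the model assumes at most one full window is delivered per RTT.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bandwidth-delay product: the number of bytes that must be in flight
 * to keep a link of link_bps bits/s busy at a round-trip time of rtt_s
 * seconds. Rounded to the nearest byte.
 */
uint64_t bdp_bytes(double link_bps, double rtt_s)
{
    assert(link_bps > 0.0 && rtt_s > 0.0);
    return (uint64_t)(link_bps * rtt_s / 8.0 + 0.5);
}

/*
 * Best-case throughput in bits/s when the sender can have at most
 * window_bytes outstanding per round trip.
 */
double max_throughput_bps(uint64_t window_bytes, double rtt_s)
{
    assert(rtt_s > 0.0);
    return (double)window_bytes * 8.0 / rtt_s;
}
```

With these definitions, max_throughput_bps(65536, 0.0002) comes out at roughly 2.6 Gbit/s, in line with the ~2.5Gbps estimate above, and bdp_bytes(40e9, 0.0002) is 1000000 bytes, i.e. the ~1MB window needed to fill the 40Gbps link.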
From owner-freebsd-net@FreeBSD.ORG Tue Aug 12 12:03:17 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 65D0D872 for ; Tue, 12 Aug 2014 12:03:17 +0000 (UTC) Received: from mail-vc0-x231.google.com (mail-vc0-x231.google.com [IPv6:2607:f8b0:400c:c03::231]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2421F24CA for ; Tue, 12 Aug 2014 12:03:17 +0000 (UTC) Received: by mail-vc0-f177.google.com with SMTP id hy4so12878017vcb.8 for ; Tue, 12 Aug 2014 05:03:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=mnfVcARQKO5dDXoq1LB85KCrpn47TDcacZ8meM4xjMA=; b=y0Gr6AYSXPRgIPGJ6E9karO6Yu1YgBlyvnNECKU1KH1arAxGcnwgIwH4H6ngYA8cNf eaPAv3tEaE3AbRPVw/lvyOImoPBT+VlNCFoEQBOXjvCRYcm/D4Htq7QV+42UVuvebXpO m//17HnEiC48RSI9a7rVu9rUbk5zIk4ucy54QyflhcpVIeQ9USZzk/vq9sFtxmhP3d9S WDjaRpgXlsrCUh0JXFYmIgsnaPCORXn0otzxECQ8CUfYiBkaQj8aXzSlGNPPFegZ6JzN wJu0M5BRiqNtMb7xugTxTCILqtzKRtoZRvcb/FJLtmZZHe2bhqj/iNxLSmE3q/ovDtQA 3qSQ== MIME-Version: 1.0 X-Received: by 10.220.59.65 with SMTP id k1mr3207393vch.22.1407844996153; Tue, 12 Aug 2014 05:03:16 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.220.186.193 with HTTP; Tue, 12 Aug 2014 05:03:15 -0700 (PDT) In-Reply-To: <53E9FF32.3010802@cloudius-systems.com> References: <53E8B424.2000904@cloudius-systems.com> <20140811170606.GV83475@funkthat.com> <53E9FF32.3010802@cloudius-systems.com> Date: Tue, 12 Aug 2014 05:03:15 -0700 X-Google-Sender-Auth: YhygqUDc36zT0rEaNk2Ph4YUg0c Message-ID: Subject: Re: TCP Rx window auto sizing relies on TCP timestamp option? 
From: Adrian Chadd To: Vlad Zolotarov Content-Type: text/plain; charset=UTF-8 Cc: FreeBSD Net , Osv Dev X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Aug 2014 12:03:17 -0000 The TL;DR is - yes, I bet it'd be nice to have. :) -a On 12 August 2014 04:49, Vlad Zolotarov wrote: > > On Aug 11, 2014 8:06 PM, "John-Mark Gurney" > wrote: >> >> Vlad Zolotarov wrote this message on Mon, Aug 11, 2014 at 15:16 +0300: >> > Hi, I have the most strange question about the TCP Rx window auto sizing >> > implementation in a FreeBSD networking stack. >> > When I looked at the FreeBSD code (hash >> > 9abce0e567c9a5a0520cdd94d5c633c7baf9a184) I noticed that >> > the mentioned above feature will not be "enabled" if there isn't a TCP >> > timestamp option present in the current TCP session: >> > >> > See sys/netinet/tcp_input.c: line 1813 in tcp_do_segment() function: >> > >> > if (V_tcp_do_autorcvbuf && >> > *to.to_tsecr* && <-------- this is what I'm >> > talking about >> > (so->so_rcv.sb_flags & SB_AUTOSIZE)) >> > >> > So, if i read the code correctly, if there isn't a TS option (negotiated >> > and thus present in every received packet) the receive socket buffer >> > won't grow thus preventing the growth of the Rx window. >> > If that's the case this is very strange since TS option is not promised >> > and even more - in many cases it won't be present. >> > For example in Linux this feature is disabled by default (controlled by >> > /proc/sys/net/ipv4/tcp_timestamps). >> > This is how I actually noticed the problem the first place: I ran iperf >> > test where Linux was an initiator and a transmitter (iperf -c) FreeBSD >> > box was a receiver (iperf -s) and I noticed that the Rx window wasn't >> > opening up because Linux box hasn't negotiated the TS option in the SYN. 
>> > As a result, the throughput numbers were significantly lower compared to >> > Linux-to-Linux setup (Linux uses a Dynamic Right-Sizing (DRS) algorithm >> > http://public.lanl.gov/radiant/pubs.html#DRS, which doesn't rely on TS). >> > >> > Could anybody comment on this, pls.? >> > Did I miss anything? >> > Is it true that FreeBSD assumes that TS option is always present and if >> > not how can I cause an Rx Window to open up when TS option hasn't been >> > negotiated? >> >> This means the receive buffer won't grow beyond the default of 64k... >> But, as the comment says: >> * On the receive side the socket buffer memory is only >> rarely >> * used to any significant extent. This allows us to be >> much >> >> The receive buffer will only get used if the application takes too long >> to read it's buffer, or it isn't currently waiting... If that's the >> case, then the application should be fixed to be able to process the >> data as quickly as it comes in... > > U r right about the Rx buffer and as a result the Rx window will not grow > beyond this value too. > > See the following lines: > > tcp_output.c: tcp_output(): > > line 509: > > recwin = sbspace(&so->so_rcv); > > > line 1034: > > /* > * According to RFC1323 the window field in a SYN (i.e., a > * or ) segment itself is never scaled. The > * case is handled in syncache. > */ > if (flags & TH_SYN) > th->th_win = htons((u_short) > (min(sbspace(&so->so_rcv), TCP_MAXWIN))); > else > th->th_win = htons((u_short)(recwin >> tp->rcv_scale)); > > > As a result the Tx window of a transmitter will not grow beyond 64K as well > and this is a single full LSO/LRO frame. > So this will limit a transmitter by a single LSO frame (64K) frame per RTT > since the receiver will only "see" the new bytes only after they are > delivered by a HW and this will be after all 64KB (full LRO aggregation) are > received and only then it will send an ACK. 
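The quoted tcp_output() branch can be mirrored in a few lines to show why the advertised window stays at the buffer size unless the receive buffer itself grows. This is only a sketch: the function names are made up for illustration, TCP_MAXWIN is taken as 65535 (the largest value of the 16-bit window field), and the scale factor of 6 in the usage note is an arbitrary assumption, not a value from the thread.

```python
TCP_MAXWIN = 65535  # largest value that fits the 16-bit TCP window field


def window_field(recwin, rcv_scale, is_syn):
    """Value placed in th_win, following the quoted tcp_output() branch.

    recwin is the free space in the receive socket buffer (sbspace()).
    The window in a SYN segment is never scaled; otherwise it is shifted
    right by the negotiated receive scale and truncated to 16 bits.
    """
    if is_syn:
        return min(recwin, TCP_MAXWIN)
    return (recwin >> rcv_scale) & 0xFFFF


def window_seen_by_peer(field, rcv_scale):
    """Window the sender reconstructs from the 16-bit field."""
    return field << rcv_scale


if __name__ == "__main__":
    scale = 6  # assumed shift, for illustration only
    for buf in (64 * 1024, 1024 * 1024):
        field = window_field(buf, scale, is_syn=False)
        print(buf, field, window_seen_by_peer(field, scale))
```

With a 64 KiB buffer the peer still sees a 64 KiB window no matter what scale was negotiated; only a larger receive buffer (here 1 MiB) produces a larger advertised window.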
> > Now let's consider u have a 0.2ms RTT like I have on my setup with 40Gbps > ConnectX 3 NICs connected back to back. > So, in this case the best throughput u'll ever get with the 64K window will > be 8*64K/0.2ms ~ 2.5Gbps which is 1/16 of a line rate and u need at least > 64K*16 ~ 1MB window to reach the line rate. And the higher RTT the larger > Window we'll need. And this is in case the application frees the socket > buffer immediately once it arrives which may never be the case of course. > > I suppose use cases like above were exactly the motivation for Window > Scaling option in RFC 1323. > > >> >> So, I don't see much of an issue w/ the code you pointed out, yes, >> the receive buffer won't grow, > >> but there are options that you can set >> (sysctl net.inet.tcp.recvspace) and SO_RCVBUF in the application that >> will address it otherwise... > > Exactly! If there is no TS - it won't and FreeBSD will not be able to > utilize the network link. > Frankly, I don't understand your advice - u suggest for each and every > application to go and manually configure a receive socket buffer size? Or > increase the initial socket buffer globally, which is even worse?! And which > value should we choose? As u may see above the proper value depends on the > RTT and RTT may change while application runs due to routing change. I doubt > your suggestion is feasible. > > So, my first question stands - doesn't FreeBSD community think that it would > be beneficial for FreeBSD to use a DRS (or similar?) algorithm when there > are no TS negotiated? > > thanks, > vlad > > >> >> Obviously setting the default too large will just waste memory... >> >> -- >> John-Mark Gurney Voice: +1 415 225 5579 >> >> "All that I will do, has been done, All that I have, has not." 
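The window arithmetic in the message quoted above (8*64K/0.2ms, and roughly 1 MB for line rate) can be checked with a short sketch. It assumes the simple model used in the message, one full window delivered per round trip, and the helper names are hypothetical.

```python
def window_limited_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput when one window is sent per RTT."""
    return window_bytes * 8 / rtt_seconds


def bdp_bytes(link_bps, rtt_seconds):
    """Bandwidth-delay product: window needed to keep the link full."""
    return link_bps * rtt_seconds / 8


if __name__ == "__main__":
    rtt = 0.0002        # 0.2 ms, as in the message
    window = 64 * 1024  # 64K receive window
    link = 40e9         # 40 Gbps link
    print("window-limited throughput (Gbit/s):",
          window_limited_bps(window, rtt) / 1e9)
    print("window needed for line rate (bytes):", bdp_bytes(link, rtt))
```

Under these assumptions the 64K window gives about 2.6 Gbit/s, in line with the "~ 2.5Gbps" estimate, and a 40 Gbps link at 0.2 ms RTT needs a window of about 1 MB.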
> > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Tue Aug 12 16:08:04 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4AFB5E57 for ; Tue, 12 Aug 2014 16:08:04 +0000 (UTC) Received: from mail-oa0-x231.google.com (mail-oa0-x231.google.com [IPv6:2607:f8b0:4003:c02::231]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0BE352535 for ; Tue, 12 Aug 2014 16:08:03 +0000 (UTC) Received: by mail-oa0-f49.google.com with SMTP id eb12so7353654oac.36 for ; Tue, 12 Aug 2014 09:08:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=averesystems.com; s=google; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=8PezDCxqfK354myV0FzxaW3TOgn1Fmbu8Km7IzCK9V8=; b=W7VVY9+CMnsk0MWMMzUj+QhaN0b8JSt5tgcTfxDKVbRY4kAuz0AZwoL2PGHTeqIpjY wArdN7s/oel0O4iYWk5cBd78zsc8NN3bsgILRK/nJmtLuNpWOWhfoC9v2Qi45btXYD3j iUTewfZjlnLSdtza560DYR+NET0hctaiy/0HI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=8PezDCxqfK354myV0FzxaW3TOgn1Fmbu8Km7IzCK9V8=; b=iwbK26YEX+uQGtzMoJnxbwLMBj6ZShAmXEzX6slAqtukyE6qyvnQQ0pzEBLksaL5VF edZxNGo8h/9/qnvJLCyBpeXEs28LDo6r6jduzkbKGNvUE8T28ctMalSZrgpfHMK5L9LZ NHfXdPUMtMafsXPIhMDyNSJbpR4ubhq7sWhqr+cvbckQdr4QZKVKvirbmKdotdjU10j+ VDdwmZOyhIfsmO7oRmzcTqLLxDkYkdA4o+ktCCPMwAt8Dn3JhWDUZ+O5l7CBait0YpPt 
mXkjoE027pxFuoYA1RcFsT503fl9zlYnlhmbBjUxZ0mB/iOLcMCesXWLrxSHfIzbQRRW QQDA== X-Gm-Message-State: ALoCoQnKgsyrg1XhnaV1r2RMR7GfWSlY/wavDzlzhfxTUqvZTZ3C8MAXpkowIRV54O18PPb40Tx5 MIME-Version: 1.0 X-Received: by 10.182.32.5 with SMTP id e5mr5884696obi.73.1407859683303; Tue, 12 Aug 2014 09:08:03 -0700 (PDT) Received: by 10.76.93.209 with HTTP; Tue, 12 Aug 2014 09:08:03 -0700 (PDT) In-Reply-To: References: <201408111720.18544.jhb@freebsd.org> Date: Tue, 12 Aug 2014 12:08:03 -0400 Message-ID: Subject: Re: zero window and persist timer not set From: Jeremiah Lott To: Adrian Chadd Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18 Cc: "freebsd-net@freebsd.org" , hiren panchasara , John Baldwin X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Aug 2014 16:08:04 -0000 On Mon, Aug 11, 2014 at 8:35 PM, Adrian Chadd wrote: > Thanks! And yes ,please do file a PR! 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192599 From owner-freebsd-net@FreeBSD.ORG Tue Aug 12 20:36:04 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4964CF3F for ; Tue, 12 Aug 2014 20:36:04 +0000 (UTC) Received: from equinox.hilltopgroup.com (nova.hilltopgroup.com [204.109.63.176]) by mx1.freebsd.org (Postfix) with ESMTP id 21FD42E8F for ; Tue, 12 Aug 2014 20:36:03 +0000 (UTC) Received: from igarinil.com (adsl-072-149-073-165.sip.asm.bellsouth.net [72.149.73.165]) by equinox.hilltopgroup.com (Postfix) with ESMTP id 6BA071A3C12 for ; Tue, 12 Aug 2014 20:26:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=simple; d=hilltopgroup.com; s=mail; t=1407875205; x=1723123886; q=dns/txt; h=From:Subject: Date:Message-ID:Content-Type:Content-Transfer-Encoding: Content-Language; bh=5B0xyqdGqwTZ7kn8eolq6QW2ctPQiFKV2IY/Z9BZxno =; b=hiuoNBcNI8GzBt0HfxLEyGrd8GWmoHIuYNqsZppuYe9A6o2YmGTmlOe8Sve MSVjEMF7w1//qGlB/Fm4UW5Wdn6yubL59vjLFqN0Rk823uHYyD9Ivpxl9BAN9shB 3I3vq79BylL00q/9MbfouVK/u11yYGRN0AJNTFVrgtlXaVoo= Received: from ([50.167.119.14]) by oberth.igarinil.com with ESMTP with TLS id 0810B00368.10871790; Tue, 12 Aug 2014 16:26:44 -0400 From: "Joseph Ward" To: Subject: SPAN port doesn't pick up locally generated traffic Date: Tue, 12 Aug 2014 16:26:59 -0400 Message-ID: <08b701cfb66b$c4ee4820$4ecad860$@com> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 12.0 thread-index: Ac+2a8Sb6s/uTh0eReyVjE5+ZAz03A== Content-Language: en-us X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Aug 2014 20:36:04 -0000 Hi, I have 
built a firewall/routing box utilizing FreeBSD and need to mirror all of the lan-side traffic before it is NATed to another box which will have traffic analysis software running on it. The firewall box has 4 interfaces: 3 wired (re0, re1, re2) and 1 wireless (ath0). re0 is the internet port (WAN), re1 and ath0 are bridged into bridge0 which has my LAN IP (so that both my wired and wireless systems are all on the same physical network), and re2 is a member of bridge0 as a SPAN port. A tcpdump on the SPAN (and on the analysis box) shows that all packets which enter the system via ath0 and re1 are mirrored appropriately, but if the packets originate either on the WAN port (re0) or internal to the firewall box (ping a LAN endpoint from the firewall shell) the packets are not present on the SPAN port. tcpdump on bridge0 captures the packets, so they're definitely on the bridge. In order to eliminate all possibilities I ran a liveCD of FreeBSD 10 on a box with 4 interfaces with em0 and em1 bridged together into bridge0 with em3 as a SPAN port for bridge0. No firewall, no ports, nothing has been installed or configured. On this box, any packets which physically enter either em0 or em1 (the bridged interfaces) are SPANned, but nothing that originates on the fresh box shows up on the SPAN. Again, the packets originating on the system show up on a tcpdump of bridge0. I'm not much of a system-level programmer, but it certainly looks as if my expected behavior is "proper" based on if_bridge.c and the comment before "bridge_output" function which definitely has a "bridge_span" call when sending unicast with locally generated traffic which is what I'm doing here. Am I missing something? A configuration variable somewhere perhaps? Or is this a bug somewhere? Any help would be greatly appreciated! 
From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 01:16:12 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0EC9A234 for ; Wed, 13 Aug 2014 01:16:12 +0000 (UTC) Received: from nm28-vm3.bullet.mail.ne1.yahoo.com (nm28-vm3.bullet.mail.ne1.yahoo.com [98.138.91.158]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C94D52195 for ; Wed, 13 Aug 2014 01:16:11 +0000 (UTC) Received: from [98.138.100.113] by nm28.bullet.mail.ne1.yahoo.com with NNFMP; 13 Aug 2014 01:16:05 -0000 Received: from [98.138.226.162] by tm104.bullet.mail.ne1.yahoo.com with NNFMP; 13 Aug 2014 01:16:05 -0000 Received: from [127.0.0.1] by omp1063.mail.ne1.yahoo.com with NNFMP; 13 Aug 2014 01:16:05 -0000 X-Yahoo-Newman-Property: ymail-3 X-Yahoo-Newman-Id: 550736.13519.bm@omp1063.mail.ne1.yahoo.com Received: (qmail 68937 invoked by uid 60001); 13 Aug 2014 01:16:05 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1407892565; bh=izR5lhGfW09k6Dv31On70x5/6SGdws6o57494Mkf9ME=; h=Message-ID:Date:From:Reply-To:Subject:To:MIME-Version:Content-Type; b=nHopqAhXUAtgg5SiYRBJ5gZ+84g1uw/k75NE2vNeDpn6tRvoh7aPKwhpLFcDc17p/dGcdB+1TDNd8ma4SRmA9H5Op6dUYtJLmjeTNyk91r8ykraUwyFeNqWEVlX/1Lcb0ZldM+iz4J/00U2f5wyunTfhOJRsacODaYyEPiqJ+CM= X-YMail-OSG: BpW2g2IVM1nQIKPMh.U.1F_aXPwbcJaWB1kdsqTbwzA9K9b QnzbL_AQJ8ejFkAnmf8PGmzvntjnGGz94Q.O5UA6nbrXxyM3oWU.XhZfuwBZ blTVNcLVNtR72XIPrZQ9G7KjiKiSgXgZIhs4b8SIlP5IQdAO075Fn_Swp6yb ZJC06qjcS1j.Rm__2XmKaXmsn5TEma5p2P4iKTJ9UHahFIgM33SRVo4kC1Cs _AvoZeZ6mP_8vhv9y4dwWp6q0T.60TPJu5hjGmD7k5qF8lrY3DMpsygQ1STm Gc1y5blHqp9xA_281KFyy93W3xeNZTP1jTEmWVYSsqGC4auEPEC0lQ_x8Yth HFnhIPafc6lI85Cyiokl7ayqiqThnq5Vpiymj9Z0j2NfLFApBElGtOOe_Af1 
yKfnAnP.KlYjt5YBvUR5steWVlyXFCf7sVWxpjupxw4OmwSRdtfdesfFsSG6 vu9qakRptu5_Itfhs7f784NyWYCFwU0u4XbbW1gJ3Vvj1HvjYfYZHbJxn2bB BMc8LPHz5x7c1mkccnXIjSjfLwAIdntWdI9lPzuvkQM3CdYsgKWL.cUjQFdS F0A-- Received: from [76.108.181.232] by web121605.mail.ne1.yahoo.com via HTTP; Tue, 12 Aug 2014 18:16:05 PDT X-Rocket-MIMEInfo: 002.001, SSBub3RpY2UgdGhhdCB0aGVyZSBoYXNuJ3QgYmVlbiBhbiB1cGRhdGUgaW4gdGhlIEludGVsIERvd25sb2FkIENlbnRlciBzaW5jZSBKdWx5LiBJcyB0aGVyZSBubyBvZmZpY2lhbCBzdXBwb3J0IGZvciAxMD8KCldlIGxpa2VkIHRvIHVzZSB0aGUgaW50ZWwgc3R1ZmYgYXMgYW4gYWx0ZXJuYXRpdmUgdG8gdGhlICJsYXRlc3QiIGZyZWVic2QgY29kZSwgYnV0IGl0IGRvZXNudCDCoGNvbXBpbGUuCgpCQwEwAQEBAQ-- X-Mailer: YahooMailWebService/0.8.201.700 Message-ID: <1407892565.51895.YahooMailNeo@web121605.mail.ne1.yahoo.com> Date: Tue, 12 Aug 2014 18:16:05 -0700 From: Barney Cordoba Reply-To: Barney Cordoba Subject: Intel Support for FreeBSD To: "freebsd-net@freebsd.org" MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 01:16:12 -0000 I notice that there hasn't been an update in the Intel Download Center since July. 
Is there no official support for 10? We liked to use the intel stuff as an alternative to the "latest" freebsd code, but it doesnt compile. BC From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 01:56:59 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2B058462 for ; Wed, 13 Aug 2014 01:56:59 +0000 (UTC) Received: from smarthost1.sentex.ca (smarthost1.sentex.ca [IPv6:2607:f3e0:0:1::12]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "smarthost.sentex.ca", Issuer "smarthost.sentex.ca" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id E3FC32660 for ; Wed, 13 Aug 2014 01:56:58 +0000 (UTC) Received: from [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a] (saphire3.sentex.ca [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a]) by smarthost1.sentex.ca (8.14.9/8.14.9) with ESMTP id s7D1urDZ027721; Tue, 12 Aug 2014 21:56:54 -0400 (EDT) (envelope-from mike@sentex.net) Message-ID: <53EAC5E8.2050207@sentex.net> Date: Tue, 12 Aug 2014 21:56:56 -0400 From: Mike Tancsa Organization: Sentex Communications User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.0 MIME-Version: 1.0 To: Barney Cordoba , "freebsd-net@freebsd.org" Subject: Re: Intel Support for FreeBSD References: <1407892565.51895.YahooMailNeo@web121605.mail.ne1.yahoo.com> In-Reply-To: <1407892565.51895.YahooMailNeo@web121605.mail.ne1.yahoo.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.74 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 01:56:59 -0000 On 8/12/2014 9:16 PM, Barney 
Cordoba via freebsd-net wrote: > I notice that there hasn't been an update in the Intel Download Center since July. Is there no official support for 10? Hi, The latest code is committed directly into the tree by Intel eg http://lists.freebsd.org/pipermail/svn-src-head/2014-July/060947.html and http://lists.freebsd.org/pipermail/svn-src-head/2014-June/059904.html They have been MFC'd to RELENG_10 a few weeks ago ---Mike -- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 02:26:59 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0BED57F7 for ; Wed, 13 Aug 2014 02:26:59 +0000 (UTC) Received: from equinox.hilltopgroup.com (nova.hilltopgroup.com [204.109.63.176]) by mx1.freebsd.org (Postfix) with ESMTP id C9E0428A8 for ; Wed, 13 Aug 2014 02:26:58 +0000 (UTC) Received: from igarinil.com (adsl-072-149-073-165.sip.asm.bellsouth.net [72.149.73.165]) by equinox.hilltopgroup.com (Postfix) with ESMTP id AFF421A3C11 for ; Wed, 13 Aug 2014 02:27:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=simple; d=hilltopgroup.com; s=mail; t=1407896815; x=1723123886; q=dns/txt; h=From:Subject: Date:Message-ID:Content-Type:Content-Transfer-Encoding: Content-Language; bh=Hc+zUmIeB3Dtr0ycPT4+MWU6uKm7VsHdzbU1+KgfvIw =; b=CTwgcU4GC1Bn44+EIR5UUs7c+Z7YShwPAE8GbwG+noZh5Yz3T7/133JhvIy PODA5miuWTo4DuNHJNbQJi2TECoeEyWvkyiy+fjc3RvDDNN/mIe+o7yacbuvoWey NhYeAZhFbreZ6TVCtEjavVSPZZjF4stXDX7XxL1t3CqunAZ0= Received: from ([50.167.119.14]) by oberth.igarinil.com with ESMTP with TLS id 0810B00368.10871995; Tue, 12 Aug 2014 22:26:54 -0400 From: "Joseph Ward" To: References: In-Reply-To: Subject: RE: 
SPAN port doesn't pick up locally generated traffic Date: Tue, 12 Aug 2014 22:27:11 -0400 Message-ID: <08f701cfb69e$1698e2c0$43caa840$@com> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 12.0 thread-index: Ac+2a8Sb6s/uTh0eReyVjE5+ZAz03AAMNfZA Content-Language: en-us X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 02:26:59 -0000 I found a workaround that is acceptable. First, I want to thank Hiren Panchasara for recommending the work-around that I hadn't thought about trying. For the archives and anyone struggling with the same issue: I altered the setup below by giving the LAN IP to the wired interface re1 as opposed to bridge0. Doing that magically made the span port (re2) get all the traffic, both passing through in re1 and out ath0 (and vice versa) as well as the packets that originate inside the system and are passed to the bridge. This isn't ideal as it means that if the physical interface re1 goes down, clients on ath0 will lose connectivity to the system, and I had always understood that when bridging it's ideal to give the IPs to the bridge itself to protect against that possibility. However, I can give each interface another IP on a different subnet that will at least allow for remote connectivity in that scenario. Does anyone know if this is known/expected behavior? If no one knows I'll file a bug ticket on the scenario as it certainly doesn't seem kosher to me. 
Thanks everyone, -Joseph -----Original Message----- From: Joseph Ward [mailto:jbward@hilltopgroup.com] Sent: Tuesday, August 12, 2014 4:27 PM To: 'freebsd-net@freebsd.org' Subject: SPAN port doesn't pick up locally generated traffic Hi, I have built a firewall/routing box utilizing FreeBSD and need to mirror all of the lan-side traffic before it is NATed to another box which will have traffic analysis software running on it. The firewall box has 4 interfaces: 3 wired (re0, re1, re2) and 1 wireless (ath0). re0 is the internet port (WAN), re1 and ath0 are bridged into bridge0 which has my LAN IP (so that both my wired and wireless systems are all on the same physical network), and re2 is a member of bridge0 as a SPAN port. A tcpdump on the SPAN (and on the analysis box) shows that all packets which enter the system via ath0 and re1 are mirrored appropriately, but if the packets originate either on the WAN port (re0) or internal to the firewall box (ping a LAN endpoint from the firewall shell) the packets are not present on the SPAN port. tcpdump on bridge0 captures the packets, so they're definitely on the bridge. In order to eliminate all possibilities I ran a liveCD of FreeBSD 10 on a box with 4 interfaces with em0 and em1 bridged together into bridge0 with em3 as a SPAN port for bridge0. No firewall, no ports, nothing has been installed or configured. On this box, any packets which physically enter either em0 or em1 (the bridged interfaces) are SPANned, but nothing that originates on the fresh box shows up on the SPAN. Again, the packets originating on the system show up on a tcpdump of bridge0. I'm not much of a system-level programmer, but it certainly looks as if my expected behavior is "proper" based on if_bridge.c and the comment before "bridge_output" function which definitely has a "bridge_span" call when sending unicast with locally generated traffic which is what I'm doing here. Am I missing something? A configuration variable somewhere perhaps? 
Or is this a bug somewhere? Any help would be greatly appreciated! From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 04:08:06 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2E0FDEC6; Wed, 13 Aug 2014 04:08:06 +0000 (UTC) Received: from na01-bn1-obe.outbound.protection.outlook.com (mail-bn1blp0190.outbound.protection.outlook.com [207.46.163.190]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "MSIT Machine Auth CA 2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 742B82539; Wed, 13 Aug 2014 04:08:03 +0000 (UTC) Received: from BY1PR0301MB0902.namprd03.prod.outlook.com (25.160.195.141) by BY1PR0301MB0903.namprd03.prod.outlook.com (25.160.195.142) with Microsoft SMTP Server (TLS) id 15.0.1005.10; Wed, 13 Aug 2014 04:08:01 +0000 Received: from BY1PR0301MB0902.namprd03.prod.outlook.com ([25.160.195.141]) by BY1PR0301MB0902.namprd03.prod.outlook.com ([25.160.195.141]) with mapi id 15.00.1005.008; Wed, 13 Aug 2014 04:08:00 +0000 From: Wei Hu To: Adrian Chadd Subject: RE: vRSS support on FreeBSD Thread-Topic: vRSS support on FreeBSD Thread-Index: AQHPs0Bb+mxjwe28tEGUfZYiw4SKJJvLJUyggACYAACAAjAAYA== Date: Wed, 13 Aug 2014 04:08:00 +0000 Message-ID: References: <184b69414bd246eeacc0d4234a730f2f@BY1PR0301MB0902.namprd03.prod.outlook.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [167.220.232.169] x-microsoft-antispam: BCL:0;PCL:0;RULEID:;UriScan:; x-forefront-prvs: 0302D4F392 x-forefront-antispam-report: SFV:NSPM; 
SFS:(6009001)(51704005)(377454003)(189002)(199003)(51914003)(66654002)(13464003)(164054003)(24454002)(92566001)(81342001)(74662001)(81542001)(77982001)(66066001)(76482001)(54356999)(20776003)(86612001)(31966008)(74502001)(108616004)(83322001)(2656002)(19580405001)(46102001)(19580395003)(87936001)(50986999)(79102001)(101416001)(76576001)(85306004)(93886004)(106356001)(105586002)(110136001)(107046002)(80022001)(4396001)(77096002)(99286002)(106116001)(95666004)(64706001)(74316001)(99396002)(76176999)(85852003)(21056001)(83072002)(33646002)(21314002)(24736002); DIR:OUT; SFP:; SCL:1; SRVR:BY1PR0301MB0903; H:BY1PR0301MB0902.namprd03.prod.outlook.com; FPR:; MLV:sfv; PTR:InfoNoRecords; MX:1; A:1; LANG:en; Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 MIME-Version: 1.0 X-OriginatorOrg: microsoft.onmicrosoft.com Cc: "freebsd-net@freebsd.org" , "d@delphij.net" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 04:08:06 -0000 Hi Adrian, The send mapping table is an array with fixed the size of elements, say VRSS_TAB_SIZE. It contains the tx queue number on which TX packet should be sent. So the vCPU = Send_table[hash-value % VRSS_TAB_SIZE % number_of_tx_queue] is the way to choose the tx queue. Send_table is updated by the host every few minutes (on a busy system) or hours (on a light system). Since the vNIC doesn't give guest VM the hash value for a rx packet, I am thinking maybe I can put the rx queue number in the m_pkthdr.flowid of the mbuf on the receiving path. So the queue number will be passed to the mbuf on the sending path. This way we choose the same queue to send the packet, and we don't need to calculate the hash value in the software. The other way is calculating the hash value on the send path, and choose the tx queue based on the send table, letting the host to decide which queue to send packet (since the send table is given by host). I may implement the both and see which one has better performance. Thanks, Wei -----Original Message----- From: adrian.chadd@gmail.com [mailto:adrian.chadd@gmail.com] On Behalf Of Adrian Chadd Sent: Tuesday, August 12, 2014 2:27 AM To: Wei Hu Cc: d@delphij.net; freebsd-net@freebsd.org Subject: Re: vRSS support on FreeBSD On 11 August 2014 02:48, Wei Hu <weh@microsoft.com> wrote: > CC freebsd-net@ for wider discussion. > > Hi Adrian, > > Many thanks for the explanation. I checked the if_igb.c and found the flowid field was set in the RX side in igb_rxeof(): > > Igb_rxeof() > { > ... > #ifdef RSS > /* XXX set flowtype once this works right */ > rxr->fmp->m_pkthdr.flowid = > le32toh(cur->wb.lower.hi_dword.rss); > rxr->fmp->m_flags |= M_FLOWID; ... > } > > I have two questions regarding this. > > 1. Is the RSS hash value stored in cur->wb.lower.hi_dword.rss set by the NIC hardware? Yup. > 2. So the hash value and m_flags are stored in the mbuf related to the received packet on the rx side(lgb_rxeof()). But we check the hash value and m_flags in mbuf related to the send packet on the tx side (in igb_mq_start()). Does the kernel re-use the same mbuf for tx? If so, how does it know for the same network stream it should use the same mbuf got from the rx for packet sending? If not, how does the kernel preserve the same hash value across the rx mbuf and tx mbuf for same network stream? This seems quite magical to me. The mbuf flowid/flowtype ends up in the inpcb->inp_flowid / inpcb->inp_flowtype as part of the TCP receive path. Then whenever the TCP code outputs an mbuf, it copies the inpcb flow details out to outbound mbufs. > > For the Hyper-V case, the host controls which vCPU it wants to interrupt. And the rule can change dynamically based on the load. For a non-busy VM, host will send most packets to same vCPU for power saving purpose. For a busy VM, host will distribute the packets evenly across all vCPUs. This means host could change the RSS bucket mapping dynamically. Hyper-V does this by sending a mapping table to VM whenever the it needs update. This also means we cannot use FreeBSD's own bucket mapping which I believe is fixed. Also Hyper-V use its own hash key. So do you think it is possible we still use the exisiting RSS infrastructure built in FreeBSD in this purpose? Eventually. Doing rebalancing in RSS is on the TODO list, after I get the rest of the basic packet handling / routing done. How's vRSS notify the VM that the mapping table has changed? What's the format of it look like? -a 
From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 06:13:50 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 698E5A30 for ; Wed, 13 Aug 2014 06:13:50 +0000 (UTC) Received: from mail-qc0-x22d.google.com (mail-qc0-x22d.google.com [IPv6:2607:f8b0:400d:c01::22d]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2711B2283 for ; Wed, 13 Aug 2014 06:13:50 +0000 (UTC) Received: by mail-qc0-f173.google.com with SMTP id w7so3689113qcr.4 for ; Tue, 12 Aug 2014 23:13:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=nG9dm4l2RqxLStu8ZCc0DapPmSB8IcAXWnAMTbYYlG8=; b=fWm8Msu3WsdakQp8ZuZeSboUNAQMXuOvg4Gv/EzZWnm1LQm/SvtARA+Ee7AN6vxfyh hPCAXXCu/Jo0CkaY+mOYxlKYBN9DzkzmqIGCvamGSfIGbAxcyLqSo5OiDR+wV5j7V/Rx l8N90Xszyw/UqAnDvTmE23cmJ33v5YJlTN4JqI425vZYev9Ai3FJJOKNdBR7DKyyQ5lt qO+FHQQ7PzjXxVChFfQnuannfNGNU9KwA1+moeBONxNwIwgFuGgeVbEzRsr5YCD7amhk k1hR8qd5icMdSLhpoPOwqYNJ4wvhtFuuvKg3c1xEOg4xF3Z1QcxVuCKvRPPQbZ2L/Oi6 7nDQ== MIME-Version: 1.0 X-Received: by 10.224.46.8 with SMTP id h8mr3725833qaf.6.1407910429191; Tue, 12 Aug 2014 23:13:49 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.224.39.139 with HTTP; Tue, 12 Aug 2014 23:13:49 -0700 (PDT) In-Reply-To: References: 
<184b69414bd246eeacc0d4234a730f2f@BY1PR0301MB0902.namprd03.prod.outlook.com> Date: Tue, 12 Aug 2014 23:13:49 -0700 X-Google-Sender-Auth: fDR8xMRLPVX9DPG_7XIqXNd_JvU Message-ID: Subject: Re: vRSS support on FreeBSD From: Adrian Chadd To: Wei Hu Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: "freebsd-net@freebsd.org" , "d@delphij.net" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 06:13:50 -0000 Hi! Is there a spec for this stuff floating around somewhere? What do other platforms do for receive/transmit affinity on hyperv? -a On 12 August 2014 21:08, Wei Hu wrote: > Hi Adrian, > > The send mapping table is an array with fixed the size of elements, say VRSS_TAB_SIZE. It contains the tx queue number on which TX packet should be sent. So the vCPU = Send_table[hash-value % VRSS_TAB_SIZE % number_of_tx_queue] is the way to choose the tx queue. Send_table is updated by the host every few minutes (on a busy system) or hours (on a light system). > > Since the vNIC doesn't give guest VM the hash value for a rx packet, I am thinking maybe I can put the rx queue number in the m_pkthdr.flowid of the mbuf on the receiving path. So the queue number will be passed to the mbuf on the sending path. This way we choose the same queue to send the packet, and we don't need to calculate the hash value in the software. > > The other way is calculating the hash value on the send path, and choose the tx queue based on the send table, letting the host to decide which queue to send packet (since the send table is given by host). > > I may implement the both and see which one has better performance. 
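The two queue-selection options described in the quoted message can be sketched as follows. This is a rough illustration only: the table size of 16, the table contents and the function names are assumptions, the first function follows the index expression exactly as written in the message (with % applied left to right, as in C), and the modulo in the second function is an added guard that the message does not mention.

```python
VRSS_TAB_SIZE = 16  # assumed size; the thread does not give a value


def tx_queue_from_send_table(send_table, hash_value, num_tx_queues):
    """Option 2: hash on the send path, then look up the host's table.

    Uses the index expression from the message as written:
    hash-value % VRSS_TAB_SIZE % number_of_tx_queue.
    """
    index = hash_value % VRSS_TAB_SIZE % num_tx_queues
    return send_table[index]


def tx_queue_from_flowid(flowid, num_tx_queues):
    """Option 1: reuse the rx queue number stored in flowid on receive."""
    return flowid % num_tx_queues  # guard against a stale queue number


if __name__ == "__main__":
    table = [3, 2, 1, 0] * 4  # made-up host-supplied mapping
    print(tx_queue_from_send_table(table, 0x1234, 4))
    print(tx_queue_from_flowid(2, 4))
```

The first option needs a software hash per packet but lets the host steer traffic through the table; the second avoids the hash at the cost of ignoring later table updates for flows already seen.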
> > Thanks, > Wei > > > > -----Original Message----- > From: adrian.chadd@gmail.com [mailto:adrian.chadd@gmail.com] On Behalf Of= Adrian Chadd > Sent: Tuesday, August 12, 2014 2:27 AM > To: Wei Hu > Cc: d@delphij.net; freebsd-net@freebsd.org > Subject: Re: vRSS support on FreeBSD > > On 11 August 2014 02:48, Wei Hu wrote: >> CC freebsd-net@ for wider discussion. >> >> Hi Adrian, >> >> Many thanks for the explanation. I checked the if_igb.c and found the = flowid field was set in the RX side in igb_rxeof(): >> >> Igb_rxeof() >> { >> ... >> #ifdef RSS >> /* XXX set flowtype once this works right */ >> rxr->fmp->m_pkthdr.flowid =3D >> le32toh(cur->wb.lower.hi_dword.rss); >> rxr->fmp->m_flags |=3D M_FLOWID; ... >> } >> >> I have two questions regarding this. >> >> 1. Is the RSS hash value stored in cur->wb.lower.hi_dword.rss set by the= NIC hardware? > > Yup. > >> 2. So the hash value and m_flags are stored in the mbuf related to the r= eceived packet on the rx side(lgb_rxeof()). But we check the hash value and= m_flags in mbuf related to the send packet on the tx side (in igb_mq_start= ()). Does the kernel re-use the same mbuf for tx? If so, how does it know f= or the same network stream it should use the same mbuf got from the rx for = packet sending? If not, how does the kernel preserve the same hash value ac= ross the rx mbuf and tx mbuf for same network stream? This seems quite magi= cal to me. > > The mbuf flowid/flowtype ends up in the inpcb->inp_flowid / > inpcb->inp_flowtype as part of the TCP receive path. > > Then whenever the TCP code outputs an mbuf, it copies the inpcb flow deta= ils out to outbound mbufs. > >> >> For the Hyper-V case, the host controls which vCPU it wants to interrupt= . And the rule can change dynamically based on the load. For a non-busy VM,= host will send most packets to same vCPU for power saving purpose. For a b= usy VM, host will distribute the packets evenly across all vCPUs. 
This mean= s host could change the RSS bucket mapping dynamically. Hyper-V does this b= y sending a mapping table to VM whenever the it needs update. This also mea= ns we cannot use FreeBSD's own bucket mapping which I believe is fixed. Als= o Hyper-V use its own hash key. So do you think it is possible we still use= the exisiting RSS infrastructure built in FreeBSD in this purpose? > > Eventually. Doing rebalancing in RSS is on the TODO list, after I get the= rest of the basic packet handling / routing done. > > How's vRSS notify the VM that the mapping table has changed? What's the f= ormat of it look like? > > > -a From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 06:27:43 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 43A50F88; Wed, 13 Aug 2014 06:27:43 +0000 (UTC) Received: from na01-bl2-obe.outbound.protection.outlook.com (mail-bl2lp0210.outbound.protection.outlook.com [207.46.163.210]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "MSIT Machine Auth CA 2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A6AB224F1; Wed, 13 Aug 2014 06:27:41 +0000 (UTC) Received: from BY1PR0301MB0902.namprd03.prod.outlook.com (25.160.195.141) by BY1PR0301MB0904.namprd03.prod.outlook.com (25.160.195.143) with Microsoft SMTP Server (TLS) id 15.0.1005.10; Wed, 13 Aug 2014 06:27:32 +0000 Received: from BY1PR0301MB0902.namprd03.prod.outlook.com ([25.160.195.141]) by BY1PR0301MB0902.namprd03.prod.outlook.com ([25.160.195.141]) with mapi id 15.00.1005.008; Wed, 13 Aug 2014 06:27:32 +0000 From: Wei Hu To: Adrian Chadd Subject: RE: vRSS support on FreeBSD Thread-Topic: vRSS support on FreeBSD Thread-Index: AQHPs0Bb+mxjwe28tEGUfZYiw4SKJJvLJUyggACYAACAAjAAYIAAJ7GAgAACGnA= Date: Wed, 13 Aug 2014 
06:27:31 +0000 Message-ID: <78ee306703c9492aad278425745032c7@BY1PR0301MB0902.namprd03.prod.outlook.com> References: <184b69414bd246eeacc0d4234a730f2f@BY1PR0301MB0902.namprd03.prod.outlook.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [167.220.232.169] x-microsoft-antispam: BCL:0;PCL:0;RULEID:;UriScan:; x-forefront-prvs: 0302D4F392 x-forefront-antispam-report: SFV:NSPM; SFS:(979002)(6009001)(189002)(164054003)(199003)(51914003)(24454002)(66654002)(51704005)(13464003)(377454003)(77982001)(105586002)(46102001)(4396001)(110136001)(86612001)(77096002)(50986999)(33646002)(81542001)(108616004)(31966008)(106116001)(101416001)(74316001)(86362001)(92566001)(83072002)(2656002)(76482001)(54356999)(87936001)(19580395003)(83322001)(19580405001)(107046002)(99396002)(85852003)(80022001)(99286002)(76576001)(66066001)(106356001)(64706001)(93886004)(74662001)(20776003)(76176999)(81342001)(74502001)(21056001)(79102001)(95666004)(85306004)(21314002)(24736002)(969003)(989001)(999001)(1009001)(1019001); DIR:OUT; SFP:; SCL:1; SRVR:BY1PR0301MB0904; H:BY1PR0301MB0902.namprd03.prod.outlook.com; FPR:; MLV:ovrnspm; PTR:InfoNoRecords; MX:1; A:1; LANG:en; Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 MIME-Version: 1.0 X-OriginatorOrg: microsoft.onmicrosoft.com Cc: "freebsd-net@freebsd.org" , "d@delphij.net" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 06:27:43 -0000

Unfortunately I am not aware of any official spec.

The driver for Linux guest OS does the similar thing for the tx path -- receive the send table update from host, calculate the hash value for each tx packet and use it to find the tx queue from the send table.

I think the Windows guests do the same.

Wei



-----Original Message-----
From: adrian.chadd@gmail.com [mailto:adrian.chadd@gmail.com] On Behalf Of Adrian Chadd
Sent: Wednesday, August 13, 2014 2:14 PM
To: Wei Hu
Cc: d@delphij.net; freebsd-net@freebsd.org
Subject: Re: vRSS support on FreeBSD

Hi!

Is there a spec for this stuff floating around somewhere?

What do other platforms do for receive/transmit affinity on hyperv?



-a


On 12 August 2014 21:08, Wei Hu <weh@microsoft.com> wrote:
> Hi Adrian,
>
> The send mapping table is an array with a fixed number of elements, say VRSS_TAB_SIZE. It contains the tx queue number on which TX packet should be sent. So the vCPU = Send_table[hash-value % VRSS_TAB_SIZE % number_of_tx_queue] is the way to choose the tx queue. Send_table is updated by the host every few minutes (on a busy system) or hours (on a light system).
>
> Since the vNIC doesn't give guest VM the hash value for a rx packet, I am thinking maybe I can put the rx queue number in the m_pkthdr.flowid of the mbuf on the receiving path. So the queue number will be passed to the mbuf on the sending path. This way we choose the same queue to send the packet, and we don't need to calculate the hash value in the software.
>
> The other way is calculating the hash value on the send path, and choose the tx queue based on the send table, letting the host decide which queue to send packet (since the send table is given by host).
>
> I may implement both and see which one has better performance.
>
> Thanks,
> Wei
>
>
>
> -----Original Message-----
> From: adrian.chadd@gmail.com [mailto:adrian.chadd@gmail.com] On Behalf Of Adrian Chadd
> Sent: Tuesday, August 12, 2014 2:27 AM
> To: Wei Hu
> Cc: d@delphij.net; freebsd-net@freebsd.org
> Subject: Re: vRSS support on FreeBSD
>
> On 11 August 2014 02:48, Wei Hu <weh@microsoft.com> wrote:
>> CC freebsd-net@ for wider discussion.
>>
>> Hi Adrian,
>>
>> Many thanks for the explanation. I checked the if_igb.c and found the flowid field was set in the RX side in igb_rxeof():
>>
>> igb_rxeof()
>> {
>>  ...
>> #ifdef RSS
>>         /* XXX set flowtype once this works right */
>>         rxr->fmp->m_pkthdr.flowid =
>>             le32toh(cur->wb.lower.hi_dword.rss);
>>         rxr->fmp->m_flags |= M_FLOWID;  ...
>> }
>>
>> I have two questions regarding this.
>>
>> 1. Is the RSS hash value stored in cur->wb.lower.hi_dword.rss set by the NIC hardware?
>
> Yup.
>
>> 2. So the hash value and m_flags are stored in the mbuf related to the received packet on the rx side (igb_rxeof()). But we check the hash value and m_flags in mbuf related to the send packet on the tx side (in igb_mq_start()). Does the kernel re-use the same mbuf for tx? If so, how does it know for the same network stream it should use the same mbuf got from the rx for packet sending? If not, how does the kernel preserve the same hash value across the rx mbuf and tx mbuf for same network stream? This seems quite magical to me.
>
> The mbuf flowid/flowtype ends up in the inpcb->inp_flowid /
> inpcb->inp_flowtype as part of the TCP receive path.
>
> Then whenever the TCP code outputs an mbuf, it copies the inpcb flow details out to outbound mbufs.
>
>>
>> For the Hyper-V case, the host controls which vCPU it wants to interrupt. And the rule can change dynamically based on the load. For a non-busy VM, host will send most packets to same vCPU for power saving purpose. For a busy VM, host will distribute the packets evenly across all vCPUs. This means host could change the RSS bucket mapping dynamically. Hyper-V does this by sending a mapping table to VM whenever it needs an update. This also means we cannot use FreeBSD's own bucket mapping which I believe is fixed. Also Hyper-V uses its own hash key. So do you think it is possible we still use the existing RSS infrastructure built in FreeBSD for this purpose?
>
> Eventually. Doing rebalancing in RSS is on the TODO list, after I get the rest of the basic packet handling / routing done.
>
> How's vRSS notify the VM that the mapping table has changed? What's the format of it look like?
>
>
> -a

From
owner-freebsd-net@FreeBSD.ORG Wed Aug 13 06:53:53 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 006A96E for ; Wed, 13 Aug 2014 06:53:52 +0000 (UTC) Received: from mail-qc0-x231.google.com (mail-qc0-x231.google.com [IPv6:2607:f8b0:400d:c01::231]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B1E9F2807 for ; Wed, 13 Aug 2014 06:53:52 +0000 (UTC) Received: by mail-qc0-f177.google.com with SMTP id x13so3730869qcv.8 for ; Tue, 12 Aug 2014 23:53:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=iobyXx7ciaytSkmM39tqdARWPKfGz2skBMoCJyL3n/0=; b=tBz+KF6yVCDFF7jWZGfZ3R7rKQ1NAP74T8MusfJuHEl66glipXyzj4+DdplkoXZawu xAhiKk53FHcFE+ziOelFZLuCL3GSLzC21OC/RFdUAjB15hHTcPzljAAySn8TagXSSXlR oYF2EJotgSwALBmdOMv7e0KmL+3AcTqIBAM731R3SdtbZELJc4hB4lGgStC6zB7nTj3/ jQ9p/PGYAbsE/A02Z2u+aUJuqAHOjDLq3UCw4FCsB18hVieFSNQVwG13VimR2vtreJMu SfUnqAxXcij8PXdTpbLnjxR9q1JfbtrSZ/zOY5vq2uwc1v6aqXNlmpqj9uSZlE9IyhvY B1Bg== MIME-Version: 1.0 X-Received: by 10.224.36.4 with SMTP id r4mr3607613qad.69.1407912831912; Tue, 12 Aug 2014 23:53:51 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.224.39.139 with HTTP; Tue, 12 Aug 2014 23:53:51 -0700 (PDT) In-Reply-To: <78ee306703c9492aad278425745032c7@BY1PR0301MB0902.namprd03.prod.outlook.com> References: <184b69414bd246eeacc0d4234a730f2f@BY1PR0301MB0902.namprd03.prod.outlook.com> <78ee306703c9492aad278425745032c7@BY1PR0301MB0902.namprd03.prod.outlook.com> Date: Tue, 12 Aug 2014 23:53:51 -0700 X-Google-Sender-Auth: KRtNOIgaTp782tDRH-uI3h66zmE Message-ID: Subject: Re: vRSS support 
on FreeBSD From: Adrian Chadd To: Wei Hu Content-Type: text/plain; charset=UTF-8 Cc: "freebsd-net@freebsd.org" , "d@delphij.net" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 06:53:53 -0000 On 12 August 2014 23:27, Wei Hu wrote: > Unfortunately I am not aware of any official spec. Damn! Ok, so when they send the update for the redirect table, what's that look like? How do they send it? > The driver for Linux guest OS does the similar thing for the tx path -- receive the send table update from host, calculate the hash value for each tx packet and use it to find the tx queue from the send table. > > I think the Windows guests do the same. I wonder if there's any possibility for getting that added to the receive packet descriptor if the underlying hardware supports it - or well, if the OS has done it already. 
-a From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 07:24:15 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id ECD99951; Wed, 13 Aug 2014 07:24:14 +0000 (UTC) Received: from na01-bl2-obe.outbound.protection.outlook.com (mail-bl2lp0207.outbound.protection.outlook.com [207.46.163.207]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "MSIT Machine Auth CA 2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2BB2F2C88; Wed, 13 Aug 2014 07:24:12 +0000 (UTC) Received: from BY1PR0301MB0902.namprd03.prod.outlook.com (25.160.195.141) by BY1PR0301MB0901.namprd03.prod.outlook.com (25.160.195.140) with Microsoft SMTP Server (TLS) id 15.0.1005.10; Wed, 13 Aug 2014 07:24:08 +0000 Received: from BY1PR0301MB0902.namprd03.prod.outlook.com ([25.160.195.141]) by BY1PR0301MB0902.namprd03.prod.outlook.com ([25.160.195.141]) with mapi id 15.00.1005.008; Wed, 13 Aug 2014 07:24:08 +0000 From: Wei Hu To: Adrian Chadd Subject: RE: vRSS support on FreeBSD Thread-Topic: vRSS support on FreeBSD Thread-Index: AQHPs0Bb+mxjwe28tEGUfZYiw4SKJJvLJUyggACYAACAAjAAYIAAJ7GAgAACGnCAAAkWgIAAAoJw Date: Wed, 13 Aug 2014 07:24:06 +0000 Message-ID: References: <184b69414bd246eeacc0d4234a730f2f@BY1PR0301MB0902.namprd03.prod.outlook.com> <78ee306703c9492aad278425745032c7@BY1PR0301MB0902.namprd03.prod.outlook.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [167.220.232.169] x-microsoft-antispam: BCL:0;PCL:0;RULEID:;UriScan:; x-forefront-prvs: 0302D4F392 x-forefront-antispam-report: SFV:NSPM; 
SFS:(6009001)(189002)(52314003)(13464003)(51704005)(54094003)(24454002)(199003)(377454003)(79102001)(74502001)(87936001)(2656002)(92566001)(86362001)(86612001)(105586002)(106356001)(95666004)(99286002)(20776003)(33646002)(83072002)(21056001)(85306004)(66066001)(76482001)(77982001)(81342001)(74316001)(76576001)(80022001)(50986999)(93886004)(74662001)(76176999)(81542001)(64706001)(107046002)(31966008)(83322001)(19580395003)(106116001)(46102001)(19580405001)(85852003)(110136001)(101416001)(54356999)(4396001)(99396002)(108616004)(77096002)(24736002); DIR:OUT; SFP:; SCL:1; SRVR:BY1PR0301MB0901; H:BY1PR0301MB0902.namprd03.prod.outlook.com; FPR:; MLV:sfv; PTR:InfoNoRecords; MX:1; A:1; LANG:en; Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 MIME-Version: 1.0 X-OriginatorOrg: microsoft.onmicrosoft.com Cc: "freebsd-net@freebsd.org" , "d@delphij.net" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 07:24:15 -0000

The table is sent through an out of band message with a specific header to the synthetic NIC. The driver on the guest side understands it and treats it as an update of send table.

Do you mean the host OS put the tx queue selection directly on the receive packet descriptor? It is a very good idea. The driver doesn't need to calculate the hash -- it already has the tx queue number saved. The host side make the rebalance decision anyway, why not just tell the guest directly? For those packets that don't have tx queue number it still can use the send table or default queue.

Wei


-----Original Message-----
From: adrian.chadd@gmail.com [mailto:adrian.chadd@gmail.com] On Behalf Of Adrian Chadd
Sent: Wednesday, August 13, 2014 2:54 PM
To: Wei Hu
Cc: d@delphij.net; freebsd-net@freebsd.org
Subject: Re: vRSS support on FreeBSD

On 12 August 2014 23:27, Wei Hu <weh@microsoft.com> wrote:
> Unfortunately I am not aware of any official spec.

Damn!

Ok, so when they send the update for the redirect table, what's that look like? How do they send it?

> The driver for Linux guest OS does the similar thing for the tx path -- receive the send table update from host, calculate the hash value for each tx packet and use it to find the tx queue from the send table.
>
> I think the Windows guests do the same.

I wonder if there's any possibility for getting that added to the receive packet descriptor if the underlying hardware supports it - or well, if the OS has done it already.




-a

From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 13:24:20 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 228256D5 for ; Wed, 13 Aug 2014 13:24:20 +0000 (UTC) Received: from nm14-vm1.bullet.mail.ne1.yahoo.com (nm14-vm1.bullet.mail.ne1.yahoo.com [98.138.91.38]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D4A712742 for ; Wed, 13 Aug 2014 13:24:19 +0000 (UTC) Received: from [98.138.100.115] by nm14.bullet.mail.ne1.yahoo.com with NNFMP; 13 Aug 2014 13:24:13 -0000
Received: from [98.138.89.174] by tm106.bullet.mail.ne1.yahoo.com with NNFMP; 13 Aug 2014 13:24:13 -0000 Received: from [127.0.0.1] by omp1030.mail.ne1.yahoo.com with NNFMP; 13 Aug 2014 13:24:13 -0000 X-Yahoo-Newman-Property: ymail-3 X-Yahoo-Newman-Id: 77412.94810.bm@omp1030.mail.ne1.yahoo.com Received: (qmail 51861 invoked by uid 60001); 13 Aug 2014 13:24:13 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1407936253; bh=3McBEEL9X3Hy/T9F1YipzegV12ZRPJiOl/Yj7tHRK3g=; h=References:Message-ID:Date:From:Reply-To:Subject:To:In-Reply-To:MIME-Version:Content-Type; b=Sojeqn/kdvwWk5q3kpa/7WNh8zssILv2fyjFl+DUtnxpTZn9S/xs7Rh/3+MSBo9nrZY9sbwnFw6wzPAX+uBVxbEyVbL5a2FtoTmDPiK+7HokNAFFJBjZ3kTlAfvtLzB0j4xoLSJNRtDzBmnov8mW+hiT+YZb4Dm8bE0ccqZ0w1Q= X-YMail-OSG: w.6XhRoVM1mt4JmSHCwNQlIG0dlCuumgkGZtMPGxX_uL4mh 6cmiEkYPUe38YmZBxyNZwLs5459OuLorAywxi9aSZ8hLZ4fCAU8dgi_0hpua 3DAZ8AUyfUWyi0sS6BcS.uAFvbaIfLRZqgDM4RhgYPWrX04sEcbsGxHuYgj5 GRZMYHxH6vIGQLLCfSLfFcV.2cm5ZU1SqqXBLpKnSwuvvrOvOZdzNEAWUN.2 sR9tb8_H7Znd2xfVGT5eif38JvHvasXBalLz0xPYi2S3I8pjbSzGRHDQGyUe L2H6NL1xeQNcsAv.G8T93GiDf5xVsEnDcnmXNLwesya3MEO3mFmmi.GYRkkU LeicBdERL6XaWJ5kPfF.dFF2Et8wZK0lOSR4vcOYWSByaKqWVfs2Djlo9iE5 8vhDTcEbzZOGTUljR7v3HxGY48hMw4Laivl8q1Hu.Vjsl0ezxEJAewo9h.6D kTcZmksXWXBl8_1YHwSWvhDMMlIQrRAzm7o0DU1FA1nuZvLWd5LVmEEMZ7GP TwYjWszRJkoAlespLykiGONCBZRTMG4sFrddm4X9s2OPYBlshHgEePeyv_Kv S5_ZdScMO7RI4OGIITlv8pzuvLQYPGVzsD4khSwIug_xNVORmm4lxRYkG97B RQmcIcejesdbRhdhZymKKmp_p_lnFhv5S8DaCqulyAvIdwA-- Received: from [76.108.181.232] by web121601.mail.ne1.yahoo.com via HTTP; Wed, 13 Aug 2014 06:24:12 PDT X-Rocket-MIMEInfo: 002.001, 
T2suIEl0IHdhcyBhIGxvdCBtb3JlIGNvbnZlbmllbnQgd2hlbiBpdCB3YXMgYSBzdGFuZGFsb25lIG1vZHVsZS90YXJiYWxsIHNvIHlvdSBkaWRuJ3QgaGF2ZSB0byBzdXJnaWNhbGx5IGV4dHJhY3QgaXQgZnJvbSB0aGUgdHJlZSBhbmQgc3BlbmQgYSB3ZWVrIHRyeWluZyB0byBnZXQgaXQgdG8gY29tcGlsZSB3aXRoIHdoYXRldmVyIHZlcnNpb24geW91IGhhcHBlbmVkIHRvIGJlIHJ1bm5pbmcuIFNvIGlmIHlvdSdyZSBydW5uaW5nIDkuMSBvciA5LjIgeW91IGNvdWxkIHN0aWxsIHVzZSBpdCBzZWFtbGVzc2wBMAEBAQE- X-Mailer: YahooMailWebService/0.8.201.700 References: <1407892565.51895.YahooMailNeo@web121605.mail.ne1.yahoo.com> <53EAC5E8.2050207@sentex.net> Message-ID: <1407936252.96291.YahooMailNeo@web121601.mail.ne1.yahoo.com> Date: Wed, 13 Aug 2014 06:24:12 -0700 From: Barney Cordoba Reply-To: Barney Cordoba Subject: Re: Intel Support for FreeBSD To: Mike Tancsa , "freebsd-net@freebsd.org" In-Reply-To: <53EAC5E8.2050207@sentex.net> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 13:24:20 -0000

Ok. It was a lot more convenient when it was a standalone module/tarball so you didn't have to surgically extract it from the tree and spend a week trying to get it to compile with whatever version you happened to be running. So if you're running 9.1 or 9.2 you could still use it seamlessly.

Negative Progress is inevitable.

BC


On Tuesday, August 12, 2014 9:57 PM, Mike Tancsa wrote:


On 8/12/2014 9:16 PM, Barney Cordoba via freebsd-net wrote:

> I notice that there hasn't been an update in the Intel Download Center since July. Is there no official support for 10?

Hi,
The latest code is committed directly into the tree by Intel

eg
http://lists.freebsd.org/pipermail/svn-src-head/2014-July/060947.html
and
http://lists.freebsd.org/pipermail/svn-src-head/2014-June/059904.html

They have been MFC'd to RELENG_10 a few weeks ago


    ---Mike


-- 
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada  http://www.tancsa.com/
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 15:48:54 2014 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CC177527 for ; Wed, 13 Aug 2014 15:48:54 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B42402848 for ; Wed, 13 Aug 2014 15:48:54 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s7DFmsMi046470 for ; Wed, 13 Aug 2014 15:48:54 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-net@FreeBSD.org Subject: [Bug 187341] [netinet] [patch] CARP addresses in backup state should't be used as source Date: Wed, 13 Aug 2014 15:48:54 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern
X-Bugzilla-Version: unspecified X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: commit-hook@freebsd.org X-Bugzilla-Status: In Discussion X-Bugzilla-Priority: Normal X-Bugzilla-Assigned-To: freebsd-net@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 15:48:54 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187341 --- Comment #5 from commit-hook@freebsd.org --- A commit references this bug: Author: ae Date: Wed Aug 13 15:48:10 UTC 2014 New revision: 269944 URL: http://svnweb.freebsd.org/changeset/base/269944 Log: MFC r269306: Add new rule to source address selection algorithm. It prefers address with better virtual status. Use ifa_preferred() to choose better address. PR: 187341 Changes: _U stable/10/ stable/10/sys/netinet6/in6_src.c -- You are receiving this mail because: You are the assignee for the bug. 
From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 18:03:48 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 156073BA for ; Wed, 13 Aug 2014 18:03:48 +0000 (UTC) Received: from SMTP.CITRIX.COM (smtp.citrix.com [66.165.176.89]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "mail.citrix.com", Issuer "Cybertrust Public SureServer SV CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C2DBC291A for ; Wed, 13 Aug 2014 18:03:45 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.01,857,1400025600"; d="scan'208";a="161448659" Received: from [IPv6:::1] (10.80.16.47) by smtprelay.citrix.com (10.13.107.78) with Microsoft SMTP Server id 14.3.181.6; Wed, 13 Aug 2014 14:02:32 -0400 Message-ID: <53EBA837.1080607@citrix.com> Date: Wed, 13 Aug 2014 20:02:31 +0200 From: Roger Pau Monné User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Subject: bce driver errors with if_bridge and dhcp Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit X-DLP: MIA1 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 18:03:48 -0000

Hello,

While trying to set up a bridge using if_bridge with a single bce interface I've hit the following error on 10.0-RELEASE (it doesn't happen every time):

NMI ISA 20, EISA ff
NMI ISA 30, EISA ff
NMI ISA 20, EISA ff
NMI ... going to debugger
NMI ... going to debugger
NMI ISA 20, EISA ff
NMI ISA 20, EISA ff
NMI ISA 20, EISA ff
[...]
bce0: discard frame w/o leading ethernet header (len 0 pkt len 0)
bce0: discard frame w/o leading ethernet header (len 0 pkt len 0)

See below for a full verbose log. I'm using the following network configuration:

cloned_interfaces="bridge0"
ifconfig_bridge0="addm bce0 up"
ifconfig_bce0="DHCP"

Full verbose dmesg:

SMAP type=01 base=0000000000000000 len=00000000000a0000
SMAP type=01 base=0000000000100000 len=00000000bf599000
SMAP type=02 base=00000000bf699000 len=0000000000016000
SMAP type=03 base=00000000bf6af000 len=000000000001f000
SMAP type=02 base=00000000bf6ce000 len=0000000000932000
SMAP type=02 base=00000000e0000000 len=0000000010000000
SMAP type=02 base=00000000fe000000 len=0000000002000000
SMAP type=01 base=0000000100000000 len=0000000740000000
Table 'FACP' at 0xbf6c3f9c
Table 'APIC' at 0xbf6c3478
APIC: Found table at 0xbf6c3478
APIC: Using the MADT enumerator.
MADT: Found CPU APIC ID 16 ACPI ID 1: enabled
SMP: Added CPU 16 (AP)
MADT: Found CPU APIC ID 0 ACPI ID 2: enabled
SMP: Added CPU 0 (AP)
MADT: Found CPU APIC ID 18 ACPI ID 3: enabled
SMP: Added CPU 18 (AP)
MADT: Found CPU APIC ID 2 ACPI ID 4: enabled
SMP: Added CPU 2 (AP)
MADT: Found CPU APIC ID 20 ACPI ID 5: enabled
SMP: Added CPU 20 (AP)
MADT: Found CPU APIC ID 4 ACPI ID 6: enabled
SMP: Added CPU 4 (AP)
MADT: Found CPU APIC ID 22 ACPI ID 7: enabled
SMP: Added CPU 22 (AP)
MADT: Found CPU APIC ID 6 ACPI ID 8: enabled
SMP: Added CPU 6 (AP)
MADT: Found CPU APIC ID 17 ACPI ID 9: enabled
SMP: Added CPU 17 (AP)
MADT: Found CPU APIC ID 1 ACPI ID 10: enabled
SMP: Added CPU 1 (AP)
MADT: Found CPU APIC ID 19 ACPI ID 11: enabled
SMP: Added CPU 19 (AP)
MADT: Found CPU APIC ID 3 ACPI ID 12: enabled
SMP: Added CPU 3 (AP)
MADT: Found CPU APIC ID 21 ACPI ID 13: enabled
SMP: Added CPU 21 (AP)
MADT: Found CPU APIC ID 5 ACPI ID 14: enabled
SMP: Added CPU 5 (AP)
MADT: Found CPU APIC ID 23 ACPI ID 15: enabled
SMP: Added CPU 23 (AP)
MADT: Found CPU APIC ID 7 ACPI ID 16: enabled
SMP: Added CPU 7 (AP)
MADT: Found CPU APIC ID
48 ACPI ID 17: disabled MADT: Found CPU APIC ID 49 ACPI ID 18: disabled MADT: Found CPU APIC ID 50 ACPI ID 19: disabled MADT: Found CPU APIC ID 51 ACPI ID 20: disabled MADT: Found CPU APIC ID 52 ACPI ID 21: disabled MADT: Found CPU APIC ID 53 ACPI ID 22: disabled MADT: Found CPU APIC ID 54 ACPI ID 23: disabled MADT: Found CPU APIC ID 55 ACPI ID 24: disabled MADT: Found CPU APIC ID 56 ACPI ID 25: disabled MADT: Found CPU APIC ID 57 ACPI ID 26: disabled MADT: Found CPU APIC ID 58 ACPI ID 27: disabled MADT: Found CPU APIC ID 59 ACPI ID 28: disabled MADT: Found CPU APIC ID 60 ACPI ID 29: disabled MADT: Found CPU APIC ID 61 ACPI ID 30: disabled MADT: Found CPU APIC ID 62 ACPI ID 31: disabled MADT: Found CPU APIC ID 63 ACPI ID 32: disabled Copyright (c) 1992-2014 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014 root@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64 FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610 Preloaded elf kernel "/boot/kernel/kernel" at 0xffffffff81a34000. Preloaded elf obj module "/boot/kernel/zfs.ko" at 0xffffffff81a34c58. Preloaded elf obj module "/boot/kernel/opensolaris.ko" at 0xffffffff81a35380. Calibrating TSC clock ... 
TSC clock: 2261044324 Hz CPU: Intel(R) Xeon(R) CPU E5520 @ 2.27GHz (2261.04-MHz K8-class CPU) Origin = "GenuineIntel" Id = 0x106a5 Family = 0x6 Model = 0x1a Stepping = 5 Features=0xbfebfbff Features2=0x9ce3bd AMD Features=0x28100800 AMD Features2=0x1 TSC: P-state invariant, performance statistics real memory = 34359738368 (32768 MB) Physical memory chunk(s): 0x0000000000010000 - 0x000000000009bfff, 573440 bytes (140 pages) 0x0000000000100000 - 0x00000000001fffff, 1048576 bytes (256 pages) 0x0000000001a7b000 - 0x00000000bf698fff, 3183599616 bytes (777246 pages) 0x0000000100000000 - 0x000000080a398fff, 30236315648 bytes (7381913 pages) avail memory = 33283719168 (31741 MB) Event timer "LAPIC" quality 400 ACPI APIC Table: INTR: Adding local APIC 17 as a target INTR: Adding local APIC 18 as a target INTR: Adding local APIC 19 as a target INTR: Adding local APIC 20 as a target INTR: Adding local APIC 21 as a target INTR: Adding local APIC 22 as a target INTR: Adding local APIC 23 as a target INTR: Adding local APIC 0 as a target INTR: Adding local APIC 1 as a target INTR: Adding local APIC 2 as a target INTR: Adding local APIC 3 as a target INTR: Adding local APIC 4 as a target INTR: Adding local APIC 5 as a target INTR: Adding local APIC 6 as a target INTR: Adding local APIC 7 as a target FreeBSD/SMP: Multiprocessor System Detected: 16 CPUs FreeBSD/SMP: 2 package(s) x 4 core(s) x 2 SMT threads cpu0 (BSP): APIC ID: 16 cpu1 (AP): APIC ID: 17 cpu2 (AP): APIC ID: 18 cpu3 (AP): APIC ID: 19 cpu4 (AP): APIC ID: 20 cpu5 (AP): APIC ID: 21 cpu6 (AP): APIC ID: 22 cpu7 (AP): APIC ID: 23 cpu8 (AP): APIC ID: 0 cpu9 (AP): APIC ID: 1 cpu10 (AP): APIC ID: 2 cpu11 (AP): APIC ID: 3 cpu12 (AP): APIC ID: 4 cpu13 (AP): APIC ID: 5 cpu14 (AP): APIC ID: 6 cpu15 (AP): APIC ID: 7 APIC: CPU 0 has ACPI ID 1 APIC: CPU 1 has ACPI ID 9 APIC: CPU 2 has ACPI ID 3 APIC: CPU 3 has ACPI ID 11 APIC: CPU 4 has ACPI ID 5 APIC: CPU 5 has ACPI ID 13 APIC: CPU 6 has ACPI ID 7 APIC: CPU 7 has ACPI ID 15 APIC: 
CPU 8 has ACPI ID 2 APIC: CPU 9 has ACPI ID 10 APIC: CPU 10 has ACPI ID 4 APIC: CPU 11 has ACPI ID 12 APIC: CPU 12 has ACPI ID 6 APIC: CPU 13 has ACPI ID 14 APIC: CPU 14 has ACPI ID 8 APIC: CPU 15 has ACPI ID 16 XEN: CPU 0 has VCPU ID 1 XEN: CPU 1 has VCPU ID 9 XEN: CPU 2 has VCPU ID 3 XEN: CPU 3 has VCPU ID 11 XEN: CPU 4 has VCPU ID 5 XEN: CPU 5 has VCPU ID 13 XEN: CPU 6 has VCPU ID 7 XEN: CPU 7 has VCPU ID 15 XEN: CPU 8 has VCPU ID 2 XEN: CPU 9 has VCPU ID 10 XEN: CPU 10 has VCPU ID 4 XEN: CPU 11 has VCPU ID 12 XEN: CPU 12 has VCPU ID 6 XEN: CPU 13 has VCPU ID 14 XEN: CPU 14 has VCPU ID 8 XEN: CPU 15 has VCPU ID 16 lapic16: CMCI unmasked x86bios: IVT 0x000000-0x0004ff at 0xfffff80000000000 x86bios: SSEG 0x098000-0x098fff at 0xfffffe083dbc4000 x86bios: EBDA 0x09e000-0x09ffff at 0xfffff8000009e000 x86bios: ROM 0x0a0000-0x0fefff at 0xfffff800000a0000 random device not loaded; using insecure entropy ULE: setup cpu 0 ULE: setup cpu 1 ULE: setup cpu 2 ULE: setup cpu 3 ULE: setup cpu 4 ULE: setup cpu 5 ULE: setup cpu 6 ULE: setup cpu 7 ULE: setup cpu 8 ULE: setup cpu 9 ULE: setup cpu 10 ULE: setup cpu 11 ULE: setup cpu 12 ULE: setup cpu 13 ULE: setup cpu 14 ULE: setup cpu 15 ACPI: RSDP 0xf1630 00024 (v02 DELL ) ACPI: XSDT 0xf1734 0009C (v01 DELL PE_SC3 00000001 DELL 00000001) ACPI: FACP 0xbf6c3f9c 000F4 (v03 DELL PE_SC3 00000001 DELL 00000001) ACPI: DSDT 0xbf6af000 0320F (v01 DELL PE_SC3 00000001 INTL 20050624) ACPI: FACS 0xbf6c6000 00040 ACPI: APIC 0xbf6c3478 0015E (v01 DELL PE_SC3 00000001 DELL 00000001) ACPI: SPCR 0xbf6c35d8 00050 (v01 DELL PE_SC3 00000001 DELL 00000001) ACPI: HPET 0xbf6c362c 00038 (v01 DELL PE_SC3 00000001 DELL 00000001) ACPI: DMAR 0xbf6c3668 001C0 (v01 DELL PE_SC3 00000001 DELL 00000001) ACPI: MCFG 0xbf6c38c4 0003C (v01 DELL PE_SC3 00000001 DELL 00000001) ACPI: WD__ 0xbf6c3904 00134 (v01 DELL PE_SC3 00000001 DELL 00000001) ACPI: SLIC 0xbf6c3a3c 00024 (v01 DELL PE_SC3 00000001 DELL 00000001) ACPI: ERST 0xbf6b2390 00270 (v01 DELL PE_SC3 00000001 DELL 
00000001) ACPI: HEST 0xbf6b2600 0027C (v01 DELL PE_SC3 00000001 DELL 00000001) ACPI: BERT 0xbf6b2210 00030 (v01 DELL PE_SC3 00000001 DELL 00000001) ACPI: EINJ 0xbf6b2240 00150 (v01 DELL PE_SC3 00000001 DELL 00000001) ACPI: SRAT 0xbf6c3bc0 00370 (v01 DELL PE_SC3 00000001 DELL 00000001) ACPI: TCPA 0xbf6c3f34 00064 (v02 DELL PE_SC3 00000001 DELL 00000001) ACPI: SSDT 0xbf6c7000 04194 (v01 INTEL PPM RCM 80000001 INTL 20061109) MADT: Found IO APIC ID 0, Interrupt 0 at 0xfec00000 ioapic0: Routing external 8259A's -> intpin 0 MADT: Found IO APIC ID 1, Interrupt 32 at 0xfec80000 ioapic1: Changing APIC ID to 1 ioapic1: WARNING: intbase 32 != expected base 24 lapic: Routing NMI -> LINT1 lapic: LINT1 trigger: edge lapic: LINT1 polarity: high MADT: Interrupt override: source 0, irq 2 lapic: LINT1 trigger: edge lapic: LINT1 polarity: high MADT: Interrupt override: source 0, irq 2 ioapic0: Routing IRQ 0 -> intpin 2 MADT: Interrupt override: source 9, irq 9 ioapic0: intpin 9 trigger: level ioapic0 irqs 0-23 on motherboard ioapic1 irqs 32-55 on motherboard cpu0 BSP: ID: 0x10000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 eder_rate_max=201wlan: <802.11 Link Layer> snd_unit_init() u=0x00ff8000 [512] d=0x00007c00 [32] c=0x000003ff [1024] feeder_register: snd_unit=-1 snd_maxautovchans=16 latency=5 feeder_rate_min=1 fe eder_rate_max=2016000 feeder_rate_round=25 Hardware, Intel IvyBridge+ RNG: RDRAND is not present Hardware, VIA Nehemiah Padlock RNG: VIA Padlock RNG not present null: nfslock: pseudo-device VESA: information block 0000 56 45 53 41 00 03 1a 59 00 c0 01 00 00 00 87 55 0010 00 c0 80 00 08 03 2f 59 00 c0 36 59 00 c0 3f 59 0020 00 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0050 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 0060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0100 4d 61 74 72 6f 78 00 4d 47 41 2d 47 32 30 30 00 0110 30 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 VESA: 21 mode(s) found VESA: v3.0, 8192k memory, flags:0x1, mode table:0xfffff800000c5587 (c0005587) VESA: Matrox Graphics Inc. 
VESA: Matrox MGA-G200 00 io: VMBUS: load kbd: new array size 4 kbd1 at kbdmux0 mem: hptnr: R750/DC7280 controller driver v1.0 hpt27xx: RocketRAID 27xx controller driver v1.1 hptrr: RocketRAID 17xx/2xxx SATA controller driver v1.2 acpi0: on motherboard ACPI: All ACPI Tables successfully acquired PCIe: Memory Mapped configuration base @ 0xe0000000 ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 16 vector 48 acpi0: Power Button (fixed) cpu0: Processor \_PR_.CPU1 (ACPI ID 1) -> APIC ID 0 cpu0: on acpi0 cpu1: Processor \_PR_.CPU2 (ACPI ID 2) -> APIC ID 8 cpu1: on acpi0 cpu2: Processor \_PR_.CPU3 (ACPI ID 3) -> APIC ID 2 cpu2: on acpi0 cpu3: Processor \_PR_.CPU4 (ACPI ID 4) -> APIC ID 10 cpu3: on acpi0 cpu4: Processor \_PR_.CPU5 (ACPI ID 5) -> APIC ID 4 cpu4: on acpi0 cpu5: Processor \_PR_.CPU6 (ACPI ID 6) -> APIC ID 12 cpu5: on acpi0 cpu6: Processor \_PR_.CPU7 (ACPI ID 7) -> APIC ID 6 cpu6: on acpi0 cpu7: Processor \_PR_.CPU8 (ACPI ID 8) -> APIC ID 14 cpu7: on acpi0 cpu8: Processor \_PR_.CPU9 (ACPI ID 9) -> APIC ID 1 cpu8: on acpi0 cpu9: Processor \_PR_.CPUA (ACPI ID 10) -> APIC ID 9 cpu9: on acpi0 cpu10: Processor \_PR_.CPUB (ACPI ID 11) -> APIC ID 3 cpu10: on acpi0 cpu11: Processor \_PR_.CPUC (ACPI ID 12) -> APIC ID 11 cpu11: on acpi0 cpu12: Processor \_PR_.CPUD (ACPI ID 13) -> APIC ID 5 cpu12: on acpi0 cpu13: Processor \_PR_.CPUE (ACPI ID 14) -> APIC ID 13 cpu13: on acpi0 cpu14: Processor \_PR_.CPUF (ACPI ID 15) -> APIC ID 7 cpu14: on acpi0 cpu15: Processor \_PR_.CPUG (ACPI ID 16) -> APIC ID 15 cpu15: on acpi0 ACPI: Processor \_PR_.CP17 (ACPI ID 17) ignored ACPI: Processor \_PR_.CP18 (ACPI ID 18) ignored ACPI: Processor \_PR_.CP19 (ACPI ID 19) ignored ACPI: Processor \_PR_.CP20 (ACPI ID 20) ignored ACPI: Processor \_PR_.CP21 (ACPI ID 21) ignored ACPI: Processor \_PR_.CP22 (ACPI ID 22) ignored ACPI: Processor \_PR_.CP23 (ACPI ID 23) ignored ACPI: Processor \_PR_.CP24 (ACPI ID 24) ignored ACPI: Processor \_PR_.CP25 (ACPI ID 25) ignored ACPI: Processor \_PR_.CP26 (ACPI 
ID 26) ignored ACPI: Processor \_PR_.CP27 (ACPI ID 27) ignored ACPI: Processor \_PR_.CP28 (ACPI ID 28) ignored ACPI: Processor \_PR_.CP29 (ACPI ID 29) ignored ACPI: Processor \_PR_.CP30 (ACPI ID 30) ignored ACPI: Processor \_PR_.CP31 (ACPI ID 31) ignored ACPI: Processor \_PR_.CP32 (ACPI ID 32) ignored atrtc0: port 0x70-0x7f irq 8 on acpi0 atrtc0: registered as a time-of-day clock (resolution 1000000us, adjustment 0.500000000s) ioapic0: routing intpin 8 (ISA IRQ 8) to lapic 16 vector 49 Event timer "RTC" frequency 32768 Hz quality 0 attimer0: port 0x40-0x5f irq 0 on acpi0 Timecounter "i8254" frequency 1193182 Hz quality 0 ioapic0: routing intpin 2 (ISA IRQ 0) to lapic 16 vector 50 Event timer "i8254" frequency 1193182 Hz quality 100 hpet0: iomem 0xfed00000-0xfed003ff on acpi0 hpet0: vendor 0x8086, rev 0x1, 14318180Hz 64bit, 4 timers, legacy route hpet0: t0: irqs 0x00f00000 (0), 64bit, periodic hpet0: t1: irqs 0x00f00000 (0) hpet0: t2: irqs 0x00f00800 (0) hpet0: t3: irqs 0x00f01000 (0) Timecounter "HPET" frequency 14318180 Hz quality 950 ioapic0: routing intpin 20 (PCI IRQ 20) to lapic 16 vector 51 Event timer "HPET" frequency 14318180 Hz quality 350 Event timer "HPET1" frequency 14318180 Hz quality 340 Event timer "HPET2" frequency 14318180 Hz quality 340 Event timer "HPET3" frequency 14318180 Hz quality 340 ACPI timer: 1/0 1/0 1/0 1/0 1/0 1/0 1/0 0/0 1/0 1/0 -> 9 Timecounter "ACPI-safe" frequency 3579545 Hz quality 850 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0 pci_link0: Index IRQ Rtd Ref IRQs Initial Probe 0 15 N 0 3 4 5 6 7 10 11 14 15 Validation 0 15 N 0 3 4 5 6 7 10 11 14 15 After Disable 0 255 N 0 3 4 5 6 7 10 11 14 15 pci_link1: Index IRQ Rtd Ref IRQs Initial Probe 0 14 N 0 3 4 5 6 7 10 11 14 15 Validation 0 14 N 0 3 4 5 6 7 10 11 14 15 After Disable 0 255 N 0 3 4 5 6 7 10 11 14 15 pci_link2: Index IRQ Rtd Ref IRQs Initial Probe 0 11 N 0 3 4 5 6 7 10 11 14 15 Validation 0 11 N 0 3 4 5 6 7 10 11 14 15 After Disable 0 255 N 0 3 4 5 6 
7 10 11 14 15 pci_link3: Index IRQ Rtd Ref IRQs Initial Probe 0 10 N 0 3 4 5 6 7 10 11 14 15 Validation 0 10 N 0 3 4 5 6 7 10 11 14 15 After Disable 0 255 N 0 3 4 5 6 7 10 11 14 15 pci_link4: Index IRQ Rtd Ref IRQs Initial Probe 0 5 N 0 3 4 5 6 7 10 11 14 15 Validation 0 5 N 0 3 4 5 6 7 10 11 14 15 After Disable 0 255 N 0 3 4 5 6 7 10 11 14 15 pci_link5: Index IRQ Rtd Ref IRQs Initial Probe 0 6 N 0 3 4 5 6 7 10 11 14 15 Validation 0 6 N 0 3 4 5 6 7 10 11 14 15 After Disable 0 255 N 0 3 4 5 6 7 10 11 14 15 pci_link6: Index IRQ Rtd Ref IRQs Initial Probe 0 255 N 0 3 4 5 6 7 10 11 14 15 Validation 0 255 N 0 3 4 5 6 7 10 11 14 15 After Disable 0 255 N 0 3 4 5 6 7 10 11 14 15 pci_link7: Index IRQ Rtd Ref IRQs Initial Probe 0 14 N 0 3 4 5 6 7 10 11 14 15 Validation 0 14 N 0 3 4 5 6 7 10 11 14 15 After Disable 0 255 N 0 3 4 5 6 7 10 11 14 15 pcib0: port 0xcf8-0xcff on acpi0 pcib0: decoding 4 range 0-0xcf7 pcib0: decoding 4 range 0xd00-0xffff pcib0: decoding 3 range 0xa0000-0xbffff pcib0: decoding 3 range 0xc0000000-0xfdffffff pcib0: decoding 3 range 0xfed40000-0xfed44fff pci0: on pcib0 pci0: domain=0, physical bus=0 found-> vendor=0x8086, dev=0x3406, revid=0x13 domain=0, bus=0, slot=0, func=0 class=06-00-00, hdrtype=0x00, mfdev=0 cmdreg=0x0000, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=a, irq=15 powerspec 3 supports D0 D3 current D0 MSI supports 2 messages, vector masks pcib0: matched entry for 0.0.INTA pcib0: slot 0 INTA hardwired to IRQ 53 found-> vendor=0x8086, dev=0x3408, revid=0x13 domain=0, bus=0, slot=1, func=0 class=06-04-00, hdrtype=0x01, mfdev=0 cmdreg=0x0147, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x07 (1750 ns), maxlat=0x00 (0 ns) intpin=a, irq=255 powerspec 3 supports D0 D3 current D0 MSI supports 2 messages, vector masks found-> vendor=0x8086, dev=0x340a, revid=0x13 domain=0, bus=0, slot=3, func=0 class=06-04-00, hdrtype=0x01, mfdev=0 cmdreg=0x0147, statreg=0x0010, 
cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x07 (1750 ns), maxlat=0x00 (0 ns) intpin=a, irq=255 powerspec 3 supports D0 D3 current D0 MSI supports 2 messages, vector masks found-> vendor=0x8086, dev=0x340b, revid=0x13 domain=0, bus=0, slot=4, func=0 class=06-04-00, hdrtype=0x01, mfdev=0 cmdreg=0x0147, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x07 (1750 ns), maxlat=0x00 (0 ns) intpin=a, irq=255 powerspec 3 supports D0 D3 current D0 MSI supports 2 messages, vector masks found-> vendor=0x8086, dev=0x340c, revid=0x13 domain=0, bus=0, slot=5, func=0 class=06-04-00, hdrtype=0x01, mfdev=0 cmdreg=0x0147, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x07 (1750 ns), maxlat=0x00 (0 ns) intpin=a, irq=255 powerspec 3 supports D0 D3 current D0 MSI supports 2 messages, vector masks found-> vendor=0x8086, dev=0x340d, revid=0x13 domain=0, bus=0, slot=6, func=0 class=06-04-00, hdrtype=0x01, mfdev=0 cmdreg=0x0147, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x07 (1750 ns), maxlat=0x00 (0 ns) intpin=a, irq=255 powerspec 3 supports D0 D3 current D0 MSI supports 2 messages, vector masks found-> vendor=0x8086, dev=0x340e, revid=0x13 domain=0, bus=0, slot=7, func=0 class=06-04-00, hdrtype=0x01, mfdev=0 cmdreg=0x0147, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x07 (1750 ns), maxlat=0x00 (0 ns) intpin=a, irq=255 powerspec 3 supports D0 D3 current D0 MSI supports 2 messages, vector masks found-> vendor=0x8086, dev=0x3410, revid=0x13 domain=0, bus=0, slot=9, func=0 class=06-04-00, hdrtype=0x01, mfdev=0 cmdreg=0x0147, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x07 (1750 ns), maxlat=0x00 (0 ns) intpin=a, irq=255 powerspec 3 supports D0 D3 current D0 MSI supports 2 messages, vector masks found-> vendor=0x8086, dev=0x342e, revid=0x13 domain=0, bus=0, slot=20, func=0 class=08-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0000, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 
(0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x3422, revid=0x13 domain=0, bus=0, slot=20, func=1 class=08-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0000, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x3423, revid=0x13 domain=0, bus=0, slot=20, func=2 class=08-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0000, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2937, revid=0x02 domain=0, bus=0, slot=26, func=0 class=0c-03-00, hdrtype=0x00, mfdev=1 cmdreg=0x0005, statreg=0x0290, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=a, irq=14 map[20]: type I/O Port, range 32, base 0xcc40, size 5, enabled pcib0: allocated type 4 (0xcc40-0xcc5f) for rid 20 of pci0:0:26:0 pcib0: matched entry for 0.26.INTA pcib0: slot 26 INTA hardwired to IRQ 17 found-> vendor=0x8086, dev=0x2938, revid=0x02 domain=0, bus=0, slot=26, func=1 class=0c-03-00, hdrtype=0x00, mfdev=0 cmdreg=0x0005, statreg=0x0290, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=b, irq=11 map[20]: type I/O Port, range 32, base 0xcc60, size 5, enabled pcib0: allocated type 4 (0xcc60-0xcc7f) for rid 20 of pci0:0:26:1 pcib0: matched entry for 0.26.INTB pcib0: slot 26 INTB hardwired to IRQ 18 found-> vendor=0x8086, dev=0x293c, revid=0x02 domain=0, bus=0, slot=26, func=7 class=0c-03-20, hdrtype=0x00, mfdev=0 cmdreg=0x0146, statreg=0x0290, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=c, irq=10 powerspec 2 supports D0 D3 current D0 map[10]: type Memory, range 32, base 0xdfcff800, size 10, enabled pcib0: allocated type 3 (0xdfcff800-0xdfcffbff) for rid 10 of pci0:0:26:7 pcib0: matched entry for 0.26.INTC pcib0: slot 26 INTC hardwired to IRQ 19 ehci early: SMM active, request owner change found-> vendor=0x8086, dev=0x2934, 
revid=0x02 domain=0, bus=0, slot=29, func=0 class=0c-03-00, hdrtype=0x00, mfdev=1 cmdreg=0x0005, statreg=0x0290, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=a, irq=6 map[20]: type I/O Port, range 32, base 0xcc80, size 5, enabled pcib0: allocated type 4 (0xcc80-0xcc9f) for rid 20 of pci0:0:29:0 pcib0: matched entry for 0.29.INTA pcib0: slot 29 INTA hardwired to IRQ 21 found-> vendor=0x8086, dev=0x2935, revid=0x02 domain=0, bus=0, slot=29, func=1 class=0c-03-00, hdrtype=0x00, mfdev=0 cmdreg=0x0005, statreg=0x0290, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=b, irq=5 map[20]: type I/O Port, range 32, base 0xcca0, size 5, enabled pcib0: allocated type 4 (0xcca0-0xccbf) for rid 20 of pci0:0:29:1 pcib0: matched entry for 0.29.INTB pcib0: slot 29 INTB hardwired to IRQ 20 found-> vendor=0x8086, dev=0x293a, revid=0x02 domain=0, bus=0, slot=29, func=7 class=0c-03-20, hdrtype=0x00, mfdev=0 cmdreg=0x0146, statreg=0x0290, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=a, irq=6 powerspec 2 supports D0 D3 current D0 map[10]: type Memory, range 32, base 0xdfcffc00, size 10, enabled pcib0: allocated type 3 (0xdfcffc00-0xdfcfffff) for rid 10 of pci0:0:29:7 pcib0: matched entry for 0.29.INTA pcib0: slot 29 INTA hardwired to IRQ 21 found-> vendor=0x8086, dev=0x244e, revid=0x92 domain=0, bus=0, slot=30, func=0 class=06-04-01, hdrtype=0x01, mfdev=0 cmdreg=0x0147, statreg=0x0010, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x0b (2750 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2918, revid=0x02 domain=0, bus=0, slot=31, func=0 class=06-01-00, hdrtype=0x00, mfdev=1 cmdreg=0x0147, statreg=0x0210, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2921, revid=0x02 domain=0, bus=0, slot=31, func=2 class=01-01-8f, hdrtype=0x00, mfdev=0 cmdreg=0x0047, statreg=0x02b0, cachelnsz=0 (dwords) 
lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=c, irq=14 powerspec 3 supports D0 D3 current D0 map[10]: type I/O Port, range 32, base 0xcc10, size 3, enabled pcib0: allocated type 4 (0xcc10-0xcc17) for rid 10 of pci0:0:31:2 map[14]: type I/O Port, range 32, base 0xcc08, size 2, enabled pcib0: allocated type 4 (0xcc08-0xcc0b) for rid 14 of pci0:0:31:2 map[18]: type I/O Port, range 32, base 0xcc18, size 3, enabled pcib0: allocated type 4 (0xcc18-0xcc1f) for rid 18 of pci0:0:31:2 map[1c]: type I/O Port, range 32, base 0xcc0c, size 2, enabled pcib0: allocated type 4 (0xcc0c-0xcc0f) for rid 1c of pci0:0:31:2 map[20]: type I/O Port, range 32, base 0xcc20, size 4, enabled pcib0: allocated type 4 (0xcc20-0xcc2f) for rid 20 of pci0:0:31:2 map[24]: type I/O Port, range 32, base 0xcc30, size 4, enabled pcib0: allocated type 4 (0xcc30-0xcc3f) for rid 24 of pci0:0:31:2 pcib0: matched entry for 0.31.INTC pcib0: slot 31 INTC hardwired to IRQ 23 pcib1: at device 1.0 on pci0 pcib0: allocated type 3 (0xd6000000-0xd9ffffff) for rid 20 of pcib1 pcib1: domain 0 pcib1: secondary bus 1 pcib1: subordinate bus 1 pcib1: memory decode 0xd6000000-0xd9ffffff pcib1: special decode ISA pci1: on pcib1 pci1: domain=0, physical bus=1 found-> vendor=0x14e4, dev=0x1639, revid=0x20 domain=0, bus=1, slot=0, func=0 class=02-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=a, irq=15 powerspec 3 supports D0 D3 current D0 MSI supports 16 messages, 64 bit MSI-X supports 9 messages in map 0x10 map[10]: type Memory, range 64, base 0xd6000000, size 25, enabled pcib1: allocated memory range (0xd6000000-0xd7ffffff) for rid 10 of pci0:1:0:0 pcib1: matched entry for 1.0.INTA pcib1: slot 0 INTA hardwired to IRQ 36 found-> vendor=0x14e4, dev=0x1639, revid=0x20 domain=0, bus=1, slot=0, func=1 class=02-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 
(0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=b, irq=14 powerspec 3 supports D0 D3 current D0 MSI supports 16 messages, 64 bit MSI-X supports 9 messages in map 0x10 map[10]: type Memory, range 64, base 0xd8000000, size 25, enabled pcib1: allocated memory range (0xd8000000-0xd9ffffff) for rid 10 of pci0:1:0:1 pcib1: matched entry for 1.0.INTB pcib1: slot 0 INTB hardwired to IRQ 48 bce0: mem 0xd6000000-0xd7ffffff irq 36 at device 0.0 on pci1 bce0: attempting to allocate 1 MSI vectors (16 supported) msi: routing MSI IRQ 256 to local APIC 16 vector 52 bce0: using IRQ 256 for MSI miibus0: on bce0 brgphy0: PHY 1 on miibus0 brgphy0: OUI 0x000af7, model 0x003c, rev. 8 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bce0: bpf attached bce0: Ethernet address: 00:24:e8:39:bb:9b bce0: ASIC (0x57092003); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (4.6.4); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 1.0.6) Coal (RX:6,6,18,18; TX:20,20,80,80) bce1: mem 0xd8000000-0xd9ffffff irq 48 at device 0.1 on pci1 bce1: attempting to allocate 1 MSI vectors (16 supported) msi: routing MSI IRQ 257 to local APIC 16 vector 53 bce1: using IRQ 257 for MSI miibus1: on bce1 brgphy1: PHY 1 on miibus1 brgphy1: OUI 0x000af7, model 0x003c, rev. 
8 brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bce1: bpf attached bce1: Ethernet address: 00:24:e8:39:bb:9d bce1: ASIC (0x57092003); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (4.6.4); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 1.0.6) Coal (RX:6,6,18,18; TX:20,20,80,80) pcib2: at device 3.0 on pci0 pcib0: allocated type 3 (0xda000000-0xddffffff) for rid 20 of pcib2 pcib2: domain 0 pcib2: secondary bus 2 pcib2: subordinate bus 2 pcib2: memory decode 0xda000000-0xddffffff pcib2: special decode ISA pci2: on pcib2 pci2: domain=0, physical bus=2 found-> vendor=0x14e4, dev=0x1639, revid=0x20 domain=0, bus=2, slot=0, func=0 class=02-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=a, irq=15 powerspec 3 supports D0 D3 current D0 MSI supports 16 messages, 64 bit MSI-X supports 9 messages in map 0x10 map[10]: type Memory, range 64, base 0xda000000, size 25, enabled pcib2: allocated memory range (0xda000000-0xdbffffff) for rid 10 of pci0:2:0:0 pcib2: matched entry for 2.0.INTA pcib2: slot 0 INTA hardwired to IRQ 32 found-> vendor=0x14e4, dev=0x1639, revid=0x20 domain=0, bus=2, slot=0, func=1 class=02-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=b, irq=14 powerspec 3 supports D0 D3 current D0 MSI supports 16 messages, 64 bit MSI-X supports 9 messages in map 0x10 map[10]: type Memory, range 64, base 0xdc000000, size 25, enabled pcib2: allocated memory range (0xdc000000-0xddffffff) for rid 10 of pci0:2:0:1 pcib2: matched entry for 2.0.INTB pcib2: slot 0 INTB hardwired to IRQ 42 bce2: mem 0xda000000-0xdbffffff irq 32 at device 0.0 on pci2 bce2: attempting to allocate 1 MSI vectors (16 supported) msi: routing MSI IRQ 258 to local APIC 16 vector 54 bce2: using IRQ 258 for MSI 
miibus2: on bce2 brgphy2: PHY 1 on miibus2 brgphy2: OUI 0x000af7, model 0x003c, rev. 8 brgphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bce2: bpf attached bce2: Ethernet address: 00:24:e8:39:bb:9f bce2: ASIC (0x57092003); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (4.6.4); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 1.0.6) Coal (RX:6,6,18,18; TX:20,20,80,80) bce3: mem 0xdc000000-0xddffffff irq 42 at device 0.1 on pci2 bce3: attempting to allocate 1 MSI vectors (16 supported) msi: routing MSI IRQ 259 to local APIC 16 vector 55 bce3: using IRQ 259 for MSI miibus3: on bce3 brgphy3: PHY 1 on miibus3 brgphy3: OUI 0x000af7, model 0x003c, rev. 8 brgphy3: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bce3: bpf attached bce3: Ethernet address: 00:24:e8:39:bb:a1 bce3: ASIC (0x57092003); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (4.6.4); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 1.0.6) Coal (RX:6,6,18,18; TX:20,20,80,80) pcib3: at device 4.0 on pci0 pcib3: allocating non-ISA range 0xf000-0xf0ff pcib0: allocated type 4 (0xf000-0xf0ff) for rid 1c of pcib3 pcib3: allocating non-ISA range 0xf400-0xf4ff pcib0: allocated type 4 (0xf400-0xf4ff) for rid 1c of pcib3 pcib3: allocating non-ISA range 0xf800-0xf8ff pcib0: allocated type 4 (0xf800-0xf8ff) for rid 1c of pcib3 pcib3: allocating non-ISA range 0xfc00-0xfcff pcib0: allocated type 4 (0xfc00-0xfcff) for rid 1c of pcib3 pcib0: allocated type 3 (0xdfd00000-0xdfefffff) for rid 20 of pcib3 pcib3: domain 0 pcib3: secondary bus 3 pcib3: subordinate bus 3 pcib3: I/O decode 0xf000-0xffff pcib3: memory decode 0xdfd00000-0xdfefffff pcib3: special decode ISA pci3: on pcib3 pci3: domain=0, physical bus=3 found-> vendor=0x1000, dev=0x0058, revid=0x08 domain=0, bus=3, slot=0, func=0 class=01-00-00, hdrtype=0x00, mfdev=0 cmdreg=0x0007, statreg=0x0010, 
cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=a, irq=15 powerspec 2 supports D0 D1 D2 D3 current D0 MSI supports 1 message, 64 bit MSI-X supports 1 message in map 0x14 map[10]: type I/O Port, range 32, base 0xfc00, size 8, enabled pcib3: allocated I/O port range (0xfc00-0xfcff) for rid 10 of pci0:3:0:0 map[14]: type Memory, range 64, base 0xdfeec000, size 14, enabled pcib3: allocated memory range (0xdfeec000-0xdfeeffff) for rid 14 of pci0:3:0:0 map[1c]: type Memory, range 64, base 0xdfef0000, size 16, enabled pcib3: allocated memory range (0xdfef0000-0xdfefffff) for rid 1c of pci0:3:0:0 pcib3: matched entry for 3.0.INTA pcib3: slot 0 INTA hardwired to IRQ 33 mpt0: port 0xfc00-0xfcff mem 0xdfeec000-0xdfeeffff,0xdfef0000-0xdfefffff irq 33 at device 0.0 on pci3 mpt0: attempting to allocate 1 MSI-X vectors (1 supported) msi: routing MSI-X IRQ 260 to local APIC 16 vector 56 mpt0: using IRQ 260 for MSI-X mpt0: MPI Version=1.5.18.0 mpt0: chain depth limited to 34 (from 2040) mpt0: Maximum Segment Count: 306, Maximum CAM Segment Count: 33 mpt0: MsgLength=20 IOCNumber = 0 mpt0: IOCFACTS: GlobalCredits=266 BlockSize=8 bytes Request Frame Size 128 bytes Max Chain Depth 34 mpt0: IOCFACTS: Num Ports 1, FWImageSize 0, Flags=0x2 mpt0: No Handlers For Any Event Notify Frames. Event 0xa (ACK not required). mpt0: No Handlers For Any Event Notify Frames. Event 0x16 (ACK not required). mpt0: No Handlers For Any Event Notify Frames. Event 0x12 (ACK not required). mpt0: No Handlers For Any Event Notify Frames. Event 0x16 (ACK not required). mpt0: No Handlers For Any Event Notify Frames. Event 0x16 (ACK not required). mpt0: No Handlers For Any Event Notify Frames. Event 0x16 (ACK not required). mpt0: No Handlers For Any Event Notify Frames. Event 0xf (ACK not required). mpt0: No Handlers For Any Event Notify Frames. Event 0xf (ACK not required). 
mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 ) mpt0: 0 Active Volumes (2 Max) mpt0: 0 Hidden Drive Members (14 Max) mpt0: No Handlers For Any Event Notify Frames. Event 0xa (ACK not required). pcib4: at device 5.0 on pci0 pcib4: domain 0 pcib4: secondary bus 4 pcib4: subordinate bus 4 pcib4: special decode ISA pci4: on pcib4 pci4: domain=0, physical bus=4 pcib5: at device 6.0 on pci0 pcib5: allocating non-ISA range 0xd000-0xd0ff pcib0: allocated type 4 (0xd000-0xd0ff) for rid 1c of pcib5 pcib5: allocating non-ISA range 0xd400-0xd4ff pcib0: allocated type 4 (0xd400-0xd4ff) for rid 1c of pcib5 pcib5: allocating non-ISA range 0xd800-0xd8ff pcib0: allocated type 4 (0xd800-0xd8ff) for rid 1c of pcib5 pcib5: allocating non-ISA range 0xdc00-0xdcff pcib0: allocated type 4 (0xdc00-0xdcff) for rid 1c of pcib5 pcib5: allocating non-ISA range 0xe000-0xe0ff pcib0: allocated type 4 (0xe000-0xe0ff) for rid 1c of pcib5 pcib5: allocating non-ISA range 0xe400-0xe4ff pcib0: allocated type 4 (0xe400-0xe4ff) for rid 1c of pcib5 pcib5: allocating non-ISA range 0xe800-0xe8ff pcib0: allocated type 4 (0xe800-0xe8ff) for rid 1c of pcib5 pcib5: allocating non-ISA range 0xec00-0xecff pcib0: allocated type 4 (0xec00-0xecff) for rid 1c of pcib5 pcib0: allocated type 3 (0xdf000000-0xdfbfffff) for rid 20 of pcib5 pcib5: domain 0 pcib5: secondary bus 5 pcib5: subordinate bus 8 pcib5: I/O decode 0xd000-0xefff pcib5: memory decode 0xdf000000-0xdfbfffff pcib5: special decode ISA pci5: on pcib5 pci5: domain=0, physical bus=5 found-> vendor=0x111d, dev=0x8018, revid=0x0e domain=0, bus=5, slot=0, func=0 class=06-04-00, hdrtype=0x01, mfdev=0 cmdreg=0x0007, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x07 (1750 ns), maxlat=0x00 (0 ns) powerspec 3 supports D0 D3 current D0 pcib6: at device 0.0 on pci5 pcib6: allocating non-ISA range 0xd000-0xd0ff pcib5: allocated I/O port range (0xd000-0xd0ff) for rid 1c of pcib6 pcib6: allocating non-ISA range 0xd400-0xd4ff pcib5: allocated I/O port 
range (0xd400-0xd4ff) for rid 1c of pcib6 pcib6: allocating non-ISA range 0xd800-0xd8ff pcib5: allocated I/O port range (0xd800-0xd8ff) for rid 1c of pcib6 pcib6: allocating non-ISA range 0xdc00-0xdcff pcib5: allocated I/O port range (0xdc00-0xdcff) for rid 1c of pcib6 pcib6: allocating non-ISA range 0xe000-0xe0ff pcib5: allocated I/O port range (0xe000-0xe0ff) for rid 1c of pcib6 pcib6: allocating non-ISA range 0xe400-0xe4ff pcib5: allocated I/O port range (0xe400-0xe4ff) for rid 1c of pcib6 pcib6: allocating non-ISA range 0xe800-0xe8ff pcib5: allocated I/O port range (0xe800-0xe8ff) for rid 1c of pcib6 pcib6: allocating non-ISA range 0xec00-0xecff pcib5: allocated I/O port range (0xec00-0xecff) for rid 1c of pcib6 pcib5: allocated memory range (0xdf000000-0xdfbfffff) for rid 20 of pcib6 pcib6: domain 0 pcib6: secondary bus 6 pcib6: subordinate bus 8 pcib6: I/O decode 0xd000-0xefff pcib6: memory decode 0xdf000000-0xdfbfffff pcib6: special decode ISA pci6: on pcib6 pci6: domain=0, physical bus=6 found-> vendor=0x111d, dev=0x8018, revid=0x0e domain=0, bus=6, slot=2, func=0 class=06-04-00, hdrtype=0x01, mfdev=0 cmdreg=0x0007, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x07 (1750 ns), maxlat=0x00 (0 ns) powerspec 3 supports D0 D3 current D0 MSI supports 1 message, 64 bit found-> vendor=0x111d, dev=0x8018, revid=0x0e domain=0, bus=6, slot=4, func=0 class=06-04-00, hdrtype=0x01, mfdev=0 cmdreg=0x0007, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x07 (1750 ns), maxlat=0x00 (0 ns) powerspec 3 supports D0 D3 current D0 MSI supports 1 message, 64 bit pcib7: at device 2.0 on pci6 pcib7: allocating non-ISA range 0xe000-0xe0ff pcib6: allocated I/O port range (0xe000-0xe0ff) for rid 1c of pcib7 pcib7: allocating non-ISA range 0xe400-0xe4ff pcib6: allocated I/O port range (0xe400-0xe4ff) for rid 1c of pcib7 pcib7: allocating non-ISA range 0xe800-0xe8ff pcib6: allocated I/O port range (0xe800-0xe8ff) for rid 1c of pcib7 pcib7: 
allocating non-ISA range 0xec00-0xecff pcib6: allocated I/O port range (0xec00-0xecff) for rid 1c of pcib7 pcib6: allocated memory range (0xdf600000-0xdfbfffff) for rid 20 of pcib7 pcib7: domain 0 pcib7: secondary bus 7 pcib7: subordinate bus 7 pcib7: I/O decode 0xe000-0xefff pcib7: memory decode 0xdf600000-0xdfbfffff pcib7: special decode ISA pci7: on pcib7 pci7: domain=0, physical bus=7 found-> vendor=0x8086, dev=0x10d6, revid=0x02 domain=0, bus=7, slot=0, func=0 class=02-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0007, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=a, irq=11 powerspec 2 supports D0 D3 current D0 MSI supports 1 message, 64 bit MSI-X supports 10 messages in map 0x1c map[10]: type Memory, range 32, base 0xdf7c0000, size 17, enabled pcib7: allocated memory range (0xdf7c0000-0xdf7dffff) for rid 10 of pci0:7:0:0 map[14]: type Memory, range 32, base 0xdf800000, size 21, enabled pcib7: allocated memory range (0xdf800000-0xdf9fffff) for rid 14 of pci0:7:0:0 map[18]: type I/O Port, range 32, base 0xecc0, size 5, enabled pcib7: allocated I/O port range (0xecc0-0xecdf) for rid 18 of pci0:7:0:0 map[1c]: type Memory, range 32, base 0xdf7b8000, size 14, enabled pcib7: allocated memory range (0xdf7b8000-0xdf7bbfff) for rid 1c of pci0:7:0:0 pcib5: matched entry for 5.0.INTC pcib5: slot 0 INTC hardwired to IRQ 45 pcib6: slot 2 INTA is routed to irq 45 pcib7: slot 0 INTA is routed to irq 45 found-> vendor=0x8086, dev=0x10d6, revid=0x02 domain=0, bus=7, slot=0, func=1 class=02-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0007, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=b, irq=10 powerspec 2 supports D0 D3 current D0 MSI supports 1 message, 64 bit MSI-X supports 10 messages in map 0x1c map[10]: type Memory, range 32, base 0xdf7e0000, size 17, enabled pcib7: allocated memory range (0xdf7e0000-0xdf7fffff) for rid 10 of pci0:7:0:1 map[14]: type Memory, range 
32, base 0xdfa00000, size 21, enabled pcib7: allocated memory range (0xdfa00000-0xdfbfffff) for rid 14 of pci0:7:0:1 map[18]: type I/O Port, range 32, base 0xece0, size 5, enabled pcib7: allocated I/O port range (0xece0-0xecff) for rid 18 of pci0:7:0:1 map[1c]: type Memory, range 32, base 0xdf7bc000, size 14, enabled pcib7: allocated memory range (0xdf7bc000-0xdf7bffff) for rid 1c of pci0:7:0:1 pcib5: matched entry for 5.0.INTD pcib5: slot 0 INTD hardwired to IRQ 47 pcib6: slot 2 INTB is routed to irq 47 pcib7: slot 0 INTB is routed to irq 47 igb0: port 0xecc0-0xecdf mem 0xdf7c0000-0xdf7dffff,0xdf800000-0xdf9fffff,0xdf7b8000-0xdf7bbfff irq 45 at device 0.0 on pci7 igb0: attempting to allocate 5 MSI-X vectors (10 supported) msi: routing MSI-X IRQ 261 to local APIC 16 vector 57 msi: routing MSI-X IRQ 262 to local APIC 16 vector 58 msi: routing MSI-X IRQ 263 to local APIC 16 vector 59 msi: routing MSI-X IRQ 264 to local APIC 16 vector 60 msi: routing MSI-X IRQ 265 to local APIC 16 vector 61 igb0: using IRQs 261-265 for MSI-X igb0: Using MSIX interrupts with 5 vectors igb0: bpf attached igb0: Ethernet address: 00:1b:21:3e:fe:c8 igb0: Bound queue 0 to cpu 0 igb0: Bound queue 1 to cpu 1 igb0: Bound queue 2 to cpu 2 igb0: Bound queue 3 to cpu 3 igb1: port 0xece0-0xecff mem 0xdf7e0000-0xdf7fffff,0xdfa00000-0xdfbfffff,0xdf7bc000-0xdf7bffff irq 47 at device 0.1 on pci7 igb1: attempting to allocate 5 MSI-X vectors (10 supported) msi: routing MSI-X IRQ 266 to local APIC 16 vector 62 msi: routing MSI-X IRQ 267 to local APIC 16 vector 63 msi: routing MSI-X IRQ 268 to local APIC 16 vector 64 msi: routing MSI-X IRQ 269 to local APIC 16 vector 65 msi: routing MSI-X IRQ 270 to local APIC 16 vector 66 igb1: using IRQs 266-270 for MSI-X igb1: Using MSIX interrupts with 5 vectors igb1: bpf attached igb1: Ethernet address: 00:1b:21:3e:fe:c9 igb1: Bound queue 0 to cpu 4 igb1: Bound queue 1 to cpu 5 igb1: Bound queue 2 to cpu 6 igb1: Bound queue 3 to cpu 7 pcib8: at device 4.0 on pci6 
pcib8: allocating non-ISA range 0xd000-0xd0ff pcib6: allocated I/O port range (0xd000-0xd0ff) for rid 1c of pcib8 pcib8: allocating non-ISA range 0xd400-0xd4ff pcib6: allocated I/O port range (0xd400-0xd4ff) for rid 1c of pcib8 pcib8: allocating non-ISA range 0xd800-0xd8ff pcib6: allocated I/O port range (0xd800-0xd8ff) for rid 1c of pcib8 pcib8: allocating non-ISA range 0xdc00-0xdcff pcib6: allocated I/O port range (0xdc00-0xdcff) for rid 1c of pcib8 pcib6: allocated memory range (0xdf000000-0xdf5fffff) for rid 20 of pcib8 pcib8: domain 0 pcib8: secondary bus 8 pcib8: subordinate bus 8 pcib8: I/O decode 0xd000-0xdfff pcib8: memory decode 0xdf000000-0xdf5fffff pcib8: special decode ISA pci8: on pcib8 pci8: domain=0, physical bus=8 found-> vendor=0x8086, dev=0x10d6, revid=0x02 domain=0, bus=8, slot=0, func=0 class=02-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0007, statreg=0x0010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=a, irq=15 powerspec 2 supports D0 D3 current D0 MSI supports 1 message, 64 bit MSI-X supports 10 messages in map 0x1c map[10]: type Memory, range 32, base 0xdf1c0000, size 17, enabled pcib8: allocated memory range (0xdf1c0000-0xdf1dffff) for rid 10 of pci0:8:0:0 map[14]: type Memory, range 32, base 0xdf200000, size 21, enabled pcib8: allocated memory range (0xdf200000-0xdf3fffff) for rid 14 of pci0:8:0:0 map[18]: type I/O Port, range 32, base 0xdcc0, size 5, enabled pcib8: allocated I/O port range (0xdcc0-0xdcdf) for rid 18 of pci0:8:0:0 map[1c]: type Memory, range 32, base 0xdf1b8000, size 14, enabled pcib8: allocated memory range (0xdf1b8000-0xdf1bbfff) for rid 1c of pci0:8:0:0 pcib5: matched entry for 5.0.INTA pcib5: slot 0 INTA hardwired to IRQ 35 pcib6: slot 4 INTA is routed to irq 35 pcib8: slot 0 INTA is routed to irq 35 found-> vendor=0x8086, dev=0x10d6, revid=0x02 domain=0, bus=8, slot=0, func=1 class=02-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0007, statreg=0x0010, cachelnsz=16 (dwords) 
lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=b, irq=14 powerspec 2 supports D0 D3 current D0 MSI supports 1 message, 64 bit MSI-X supports 10 messages in map 0x1c map[10]: type Memory, range 32, base 0xdf1e0000, size 17, enabled pcib8: allocated memory range (0xdf1e0000-0xdf1fffff) for rid 10 of pci0:8:0:1 map[14]: type Memory, range 32, base 0xdf400000, size 21, enabled pcib8: allocated memory range (0xdf400000-0xdf5fffff) for rid 14 of pci0:8:0:1 map[18]: type I/O Port, range 32, base 0xdce0, size 5, enabled pcib8: allocated I/O port range (0xdce0-0xdcff) for rid 18 of pci0:8:0:1 map[1c]: type Memory, range 32, base 0xdf1bc000, size 14, enabled pcib8: allocated memory range (0xdf1bc000-0xdf1bffff) for rid 1c of pci0:8:0:1 pcib5: matched entry for 5.0.INTB pcib5: slot 0 INTB hardwired to IRQ 46 pcib6: slot 4 INTB is routed to irq 46 pcib8: slot 0 INTB is routed to irq 46 igb2: port 0xdcc0-0xdcdf mem 0xdf1c0000-0xdf1dffff,0xdf200000-0xdf3fffff,0xdf1b8000-0xdf1bbfff irq 35 at device 0.0 on pci8 igb2: attempting to allocate 5 MSI-X vectors (10 supported) msi: routing MSI-X IRQ 271 to local APIC 16 vector 67 msi: routing MSI-X IRQ 272 to local APIC 16 vector 68 msi: routing MSI-X IRQ 273 to local APIC 16 vector 69 msi: routing MSI-X IRQ 274 to local APIC 16 vector 70 msi: routing MSI-X IRQ 275 to local APIC 16 vector 71 igb2: using IRQs 271-275 for MSI-X igb2: Using MSIX interrupts with 5 vectors igb2: bpf attached igb2: Ethernet address: 00:1b:21:3e:fe:cc igb2: Bound queue 0 to cpu 8 igb2: Bound queue 1 to cpu 9 igb2: Bound queue 2 to cpu 10 igb2: Bound queue 3 to cpu 11 igb3: port 0xdce0-0xdcff mem 0xdf1e0000-0xdf1fffff,0xdf400000-0xdf5fffff,0xdf1bc000-0xdf1bffff irq 46 at device 0.1 on pci8 igb3: attempting to allocate 5 MSI-X vectors (10 supported) msi: routing MSI-X IRQ 276 to local APIC 16 vector 72 msi: routing MSI-X IRQ 277 to local APIC 16 vector 73 msi: routing MSI-X IRQ 278 to local APIC 16 vector 74 msi: routing MSI-X IRQ 279 to 
local APIC 16 vector 75 msi: routing MSI-X IRQ 280 to local APIC 16 vector 76 igb3: using IRQs 276-280 for MSI-X igb3: Using MSIX interrupts with 5 vectors igb3: bpf attached igb3: Ethernet address: 00:1b:21:3e:fe:cd igb3: Bound queue 0 to cpu 12 igb3: Bound queue 1 to cpu 13 igb3: Bound queue 2 to cpu 14 igb3: Bound queue 3 to cpu 15 pcib9: at device 7.0 on pci0 pcib9: domain 0 pcib9: secondary bus 9 pcib9: subordinate bus 9 pcib9: special decode ISA pci9: on pcib9 pci9: domain=0, physical bus=9 pcib10: at device 9.0 on pci0 pcib10: domain 0 pcib10: secondary bus 10 pcib10: subordinate bus 10 pcib10: special decode ISA pci10: on pcib10 pci10: domain=0, physical bus=10 pci0: at device 20.0 (no driver attached) pci0: at device 20.1 (no driver attached) pci0: at device 20.2 (no driver attached) uhci0: port 0xcc40-0xcc5f irq 17 at device 26.0 on pci0 ioapic0: routing intpin 17 (PCI IRQ 17) to lapic 16 vector 77 usbus0 on uhci0 uhci0: usbpf: Attached uhci1: port 0xcc60-0xcc7f irq 18 at device 26.1 on pci0 ioapic0: routing intpin 18 (PCI IRQ 18) to lapic 16 vector 78 usbus1 on uhci1 uhci1: usbpf: Attached ehci0: mem 0xdfcff800-0xdfcffbff irq 19 at device 26.7 on pci0 ioapic0: routing intpin 19 (PCI IRQ 19) to lapic 16 vector 79 usbus2: EHCI version 1.0 usbus2 on ehci0 ehci0: usbpf: Attached uhci2: port 0xcc80-0xcc9f irq 21 at device 29.0 on pci0 ioapic0: routing intpin 21 (PCI IRQ 21) to lapic 16 vector 80 usbus3 on uhci2 uhci2: usbpf: Attached uhci3: port 0xcca0-0xccbf irq 20 at device 29.1 on pci0 usbus4 on uhci3 uhci3: usbpf: Attached ehci1: mem 0xdfcffc00-0xdfcfffff irq 21 at device 29.7 on pci0 usbus5: EHCI version 1.0 usbus5 on ehci1 ehci1: usbpf: Attached pcib11: at device 30.0 on pci0 pcib0: allocated type 3 (0xde000000-0xdeffffff) for rid 20 of pcib11 pcib0: allocated type 3 (0xd5800000-0xd5ffffff) for rid 24 of pcib11 pcib11: domain 0 pcib11: secondary bus 11 pcib11: subordinate bus 11 pcib11: memory decode 0xde000000-0xdeffffff pcib11: prefetched decode 
0xd5800000-0xd5ffffff pcib11: special decode VGA, subtractive pci11: on pcib11 pci11: domain=0, physical bus=11 found-> vendor=0x102b, dev=0x0532, revid=0x0a domain=0, bus=11, slot=3, func=0 class=03-00-00, hdrtype=0x00, mfdev=0 cmdreg=0x0007, statreg=0x0290, cachelnsz=16 (dwords) lattimer=0x20 (960 ns), mingnt=0x10 (4000 ns), maxlat=0x20 (8000 ns) intpin=a, irq=10 powerspec 1 supports D0 D3 current D0 map[10]: type Prefetchable Memory, range 32, base 0xd5800000, size 23, enabled pcib11: allocated prefetch range (0xd5800000-0xd5ffffff) for rid 10 of pci0:11:3:0 map[14]: type Memory, range 32, base 0xde7fc000, size 14, enabled pcib11: allocated memory range (0xde7fc000-0xde7fffff) for rid 14 of pci0:11:3:0 map[18]: type Memory, range 32, base 0xde800000, size 23, enabled pcib11: allocated memory range (0xde800000-0xdeffffff) for rid 18 of pci0:11:3:0 pcib11: matched entry for 11.3.INTA pcib11: slot 3 INTA hardwired to IRQ 19 vgapci0: mem 0xd5800000-0xd5ffffff,0xde7fc000-0xde7fffff,0xde800000-0xdeffffff irq 19 at device 3.0 on pci11 vgapci0: Boot video device isab0: at device 31.0 on pci0 isa0: on isab0 atapci0: port 0xcc10-0xcc17,0xcc08-0xcc0b,0xcc18-0xcc1f,0xcc0c-0xcc0f,0xcc20-0xcc2f,0xcc30-0xcc3f irq 23 at device 31.2 on pci0 ioapic0: routing intpin 23 (PCI IRQ 23) to lapic 16 vector 81 ata2: at channel 0 on atapci0 ata3: at channel 1 on atapci0 ata3: SControl registers are not functional: 00000000 uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 uart0: console (115200,n,8,1) ioapic0: routing intpin 4 (ISA IRQ 4) to lapic 16 vector 82 uart0: fast interrupt uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0 ioapic0: routing intpin 3 (ISA IRQ 3) to lapic 16 vector 83 uart1: fast interrupt ACPI: Enabled 1 GPEs in block 00 to 3F qpi0: on motherboard pcib12: pcibus 255 on qpi0 pci255: on pcib12 pci255: domain=0, physical bus=255 found-> vendor=0x8086, dev=0x2c40, revid=0x05 domain=0, bus=255, slot=0, func=0 class=06-00-00, hdrtype=0x00, 
mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c01, revid=0x05 domain=0, bus=255, slot=0, func=1 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c10, revid=0x05 domain=0, bus=255, slot=2, func=0 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c11, revid=0x05 domain=0, bus=255, slot=2, func=1 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c14, revid=0x05 domain=0, bus=255, slot=2, func=4 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c15, revid=0x05 domain=0, bus=255, slot=2, func=5 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c18, revid=0x05 domain=0, bus=255, slot=3, func=0 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c19, revid=0x05 domain=0, bus=255, slot=3, func=1 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c1a, revid=0x05 domain=0, bus=255, slot=3, func=2 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, 
dev=0x2c1c, revid=0x05 domain=0, bus=255, slot=3, func=4 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c20, revid=0x05 domain=0, bus=255, slot=4, func=0 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c21, revid=0x05 domain=0, bus=255, slot=4, func=1 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c22, revid=0x05 domain=0, bus=255, slot=4, func=2 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c23, revid=0x05 domain=0, bus=255, slot=4, func=3 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c28, revid=0x05 domain=0, bus=255, slot=5, func=0 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c29, revid=0x05 domain=0, bus=255, slot=5, func=1 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c2a, revid=0x05 domain=0, bus=255, slot=5, func=2 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c2b, revid=0x05 domain=0, bus=255, slot=5, func=3 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) 
lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c30, revid=0x05 domain=0, bus=255, slot=6, func=0 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c31, revid=0x05 domain=0, bus=255, slot=6, func=1 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c32, revid=0x05 domain=0, bus=255, slot=6, func=2 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c33, revid=0x05 domain=0, bus=255, slot=6, func=3 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) pcib13: pcibus 254 on qpi0 pci254: on pcib13 pci254: domain=0, physical bus=254 found-> vendor=0x8086, dev=0x2c40, revid=0x05 domain=0, bus=254, slot=0, func=0 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c01, revid=0x05 domain=0, bus=254, slot=0, func=1 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c10, revid=0x05 domain=0, bus=254, slot=2, func=0 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c11, revid=0x05 domain=0, bus=254, slot=2, func=1 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> 
vendor=0x8086, dev=0x2c14, revid=0x05 domain=0, bus=254, slot=2, func=4 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c15, revid=0x05 domain=0, bus=254, slot=2, func=5 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c18, revid=0x05 domain=0, bus=254, slot=3, func=0 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c19, revid=0x05 domain=0, bus=254, slot=3, func=1 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c1a, revid=0x05 domain=0, bus=254, slot=3, func=2 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c1c, revid=0x05 domain=0, bus=254, slot=3, func=4 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c20, revid=0x05 domain=0, bus=254, slot=4, func=0 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c21, revid=0x05 domain=0, bus=254, slot=4, func=1 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c22, revid=0x05 domain=0, bus=254, slot=4, func=2 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, 
cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c23, revid=0x05 domain=0, bus=254, slot=4, func=3 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c28, revid=0x05 domain=0, bus=254, slot=5, func=0 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c29, revid=0x05 domain=0, bus=254, slot=5, func=1 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c2a, revid=0x05 domain=0, bus=254, slot=5, func=2 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c2b, revid=0x05 domain=0, bus=254, slot=5, func=3 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c30, revid=0x05 domain=0, bus=254, slot=6, func=0 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c31, revid=0x05 domain=0, bus=254, slot=6, func=1 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c32, revid=0x05 domain=0, bus=254, slot=6, func=2 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) found-> vendor=0x8086, dev=0x2c33, revid=0x05 domain=0, bus=254, 
slot=6, func=3 class=06-00-00, hdrtype=0x00, mfdev=1 cmdreg=0x0006, statreg=0x0000, cachelnsz=0 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) acpi0: wakeup code va 0xfffffe085f0fc000 pa 0x90000 ahc_isa_identify 0: ioport 0xc00 alloc failed ahc_isa_identify 1: ioport 0x1c00 alloc failed ahc_isa_identify 2: ioport 0x2c00 alloc failed ahc_isa_identify 3: ioport 0x3c00 alloc failed ahc_isa_identify 4: ioport 0x4c00 alloc failed ahc_isa_identify 5: ioport 0x5c00 alloc failed ahc_isa_identify 6: ioport 0x6c00 alloc failed ahc_isa_identify 7: ioport 0x7c00 alloc failed ahc_isa_identify 8: ioport 0x8c00 alloc failed ahc_isa_identify 9: ioport 0x9c00 alloc failed ahc_isa_identify 10: ioport 0xac00 alloc failed ahc_isa_identify 11: ioport 0xbc00 alloc failed ahc_isa_identify 12: ioport 0xcc00 alloc failed ahc_isa_identify 13: ioport 0xdc00 alloc failed ahc_isa_identify 14: ioport 0xec00 alloc failed ex_isa_identify() pcib0: allocated type 3 (0xa0000-0xa07ff) for rid 0 of orm0 pcib0: allocated type 3 (0xa0800-0xa0fff) for rid 0 of orm0 pcib0: allocated type 3 (0xa1000-0xa17ff) for rid 0 of orm0 pcib0: allocated type 3 (0xa1800-0xa1fff) for rid 0 of orm0 pcib0: allocated type 3 (0xa2000-0xa27ff) for rid 0 of orm0 pcib0: allocated type 3 (0xa2800-0xa2fff) for rid 0 of orm0 pcib0: allocated type 3 (0xa3000-0xa37ff) for rid 0 of orm0 pcib0: allocated type 3 (0xa3800-0xa3fff) for rid 0 of orm0 pcib0: allocated type 3 (0xa4000-0xa47ff) for rid 0 of orm0 pcib0: allocated type 3 (0xa4800-0xa4fff) for rid 0 of orm0 pcib0: allocated type 3 (0xa5000-0xa57ff) for rid 0 of orm0 pcib0: allocated type 3 (0xa5800-0xa5fff) for rid 0 of orm0 pcib0: allocated type 3 (0xa6000-0xa67ff) for rid 0 of orm0 pcib0: allocated type 3 (0xa6800-0xa6fff) for rid 0 of orm0 pcib0: allocated type 3 (0xa7000-0xa77ff) for rid 0 of orm0 pcib0: allocated type 3 (0xa7800-0xa7fff) for rid 0 of orm0 pcib0: allocated type 3 (0xa8000-0xa87ff) for rid 0 of orm0 pcib0: allocated type 3 
(0xa8800-0xa8fff) for rid 0 of orm0 pcib0: allocated type 3 (0xa9000-0xa97ff) for rid 0 of orm0 pcib0: allocated type 3 (0xa9800-0xa9fff) for rid 0 of orm0 pcib0: allocated type 3 (0xaa000-0xaa7ff) for rid 0 of orm0 pcib0: allocated type 3 (0xaa800-0xaafff) for rid 0 of orm0 pcib0: allocated type 3 (0xab000-0xab7ff) for rid 0 of orm0 pcib0: allocated type 3 (0xab800-0xabfff) for rid 0 of orm0 pcib0: allocated type 3 (0xac000-0xac7ff) for rid 0 of orm0 pcib0: allocated type 3 (0xac800-0xacfff) for rid 0 of orm0 pcib0: allocated type 3 (0xad000-0xad7ff) for rid 0 of orm0 pcib0: allocated type 3 (0xad800-0xadfff) for rid 0 of orm0 pcib0: allocated type 3 (0xae000-0xae7ff) for rid 0 of orm0 pcib0: allocated type 3 (0xae800-0xaefff) for rid 0 of orm0 pcib0: allocated type 3 (0xaf000-0xaf7ff) for rid 0 of orm0 pcib0: allocated type 3 (0xaf800-0xaffff) for rid 0 of orm0 pcib0: allocated type 3 (0xb0000-0xb07ff) for rid 0 of orm0 pcib0: allocated type 3 (0xb0800-0xb0fff) for rid 0 of orm0 pcib0: allocated type 3 (0xb1000-0xb17ff) for rid 0 of orm0 pcib0: allocated type 3 (0xb1800-0xb1fff) for rid 0 of orm0 pcib0: allocated type 3 (0xb2000-0xb27ff) for rid 0 of orm0 pcib0: allocated type 3 (0xb2800-0xb2fff) for rid 0 of orm0 pcib0: allocated type 3 (0xb3000-0xb37ff) for rid 0 of orm0 pcib0: allocated type 3 (0xb3800-0xb3fff) for rid 0 of orm0 pcib0: allocated type 3 (0xb4000-0xb47ff) for rid 0 of orm0 pcib0: allocated type 3 (0xb4800-0xb4fff) for rid 0 of orm0 pcib0: allocated type 3 (0xb5000-0xb57ff) for rid 0 of orm0 pcib0: allocated type 3 (0xb5800-0xb5fff) for rid 0 of orm0 pcib0: allocated type 3 (0xb6000-0xb67ff) for rid 0 of orm0 pcib0: allocated type 3 (0xb6800-0xb6fff) for rid 0 of orm0 pcib0: allocated type 3 (0xb7000-0xb77ff) for rid 0 of orm0 pcib0: allocated type 3 (0xb7800-0xb7fff) for rid 0 of orm0 pcib0: allocated type 3 (0xb8000-0xb87ff) for rid 0 of orm0 pcib0: allocated type 3 (0xb8800-0xb8fff) for rid 0 of orm0 pcib0: allocated type 3 (0xb9000-0xb97ff) 
for rid 0 of orm0 pcib0: allocated type 3 (0xb9800-0xb9fff) for rid 0 of orm0 pcib0: allocated type 3 (0xba000-0xba7ff) for rid 0 of orm0 pcib0: allocated type 3 (0xba800-0xbafff) for rid 0 of orm0 pcib0: allocated type 3 (0xbb000-0xbb7ff) for rid 0 of orm0 pcib0: allocated type 3 (0xbb800-0xbbfff) for rid 0 of orm0 pcib0: allocated type 3 (0xbc000-0xbc7ff) for rid 0 of orm0 pcib0: allocated type 3 (0xbc800-0xbcfff) for rid 0 of orm0 pcib0: allocated type 3 (0xbd000-0xbd7ff) for rid 0 of orm0 pcib0: allocated type 3 (0xbd800-0xbdfff) for rid 0 of orm0 pcib0: allocated type 3 (0xbe000-0xbe7ff) for rid 0 of orm0 pcib0: allocated type 3 (0xbe800-0xbefff) for rid 0 of orm0 pcib0: allocated type 3 (0xbf000-0xbf7ff) for rid 0 of orm0 pcib0: allocated type 3 (0xbf800-0xbffff) for rid 0 of orm0 isa_probe_children: disabling PnP devices atrtc: atrtc0 already exists; skipping it attimer: attimer0 already exists; skipping it sc: sc0 already exists; skipping it uart: uart0 already exists; skipping it uart: uart1 already exists; skipping it isa_probe_children: probing non-PnP devices orm0: at iomem 0xc0000-0xc7fff,0xce000-0xcefff,0xec000-0xeffff on isa0 sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> sc0: fb0, kbd1, terminal emulator: scteken (teken terminal) vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 pcib0: allocated type 4 (0x3c0-0x3df) for rid 0 of vga0 pcib0: allocated type 3 (0xa0000-0xbffff) for rid 0 of vga0 pcib0: allocated type 4 (0x60-0x60) for rid 0 of atkbdc0 pcib0: allocated type 4 (0x64-0x64) for rid 1 of atkbdc0 atkbdc0: at port 0x60,0x64 on isa0 pcib0: allocated type 4 (0x60-0x60) for rid 0 of atkbdc0 pcib0: allocated type 4 (0x64-0x64) for rid 1 of atkbdc0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd: the current kbd controller command byte 0065 atkbd: keyboard ID 0xffffffff (1) kbdc: RESET_KBD return code:00fe kbdc: RESET_KBD return code:00fe kbdc: RESET_KBD return code:00fe kbdc: DIAGNOSE status:0055 kbdc: TEST_KBD_PORT 
status:0000 atkbd: failed to reset the keyboard. kbd0: atkbd0, AT 84 (1), config:0x0, flags:0x3d0000 ioapic0: routing intpin 1 (ISA IRQ 1) to lapic 16 vector 84 atkbd0: [GIANT-LOCKED] psm0: unable to allocate IRQ pcib0: allocated type 4 (0x3f0-0x3f5) for rid 0 of fdc0 pcib0: allocated type 4 (0x3f7-0x3f7) for rid 1 of fdc0 fdc0 failed to probe at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0 ppc0: cannot reserve I/O port range ppc0 failed to probe at irq 7 on isa0 wbwd0 failed to probe on isa0 isa_probe_children: probing PnP devices est0: on cpu0 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est0 attach returned 6 p4tcc0: on cpu0 est1: on cpu1 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est1 attach returned 6 p4tcc1: on cpu1 est2: on cpu2 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est2 attach returned 6 p4tcc2: on cpu2 est3: on cpu3 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est3 attach returned 6 p4tcc3: on cpu3 est4: on cpu4 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est4 attach returned 6 p4tcc4: on cpu4 est5: on cpu5 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est5 attach returned 6 p4tcc5: on cpu5 est6: on cpu6 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est6 attach returned 6 p4tcc6: on cpu6 est7: on cpu7 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est7 attach returned 6 p4tcc7: on cpu7 est8: on cpu8 est: CPU supports Enhanced Speedstep, but is not recognized. 
est: cpu_vendor GenuineIntel, msr 12 device_attach: est8 attach returned 6 p4tcc8: on cpu8 est9: on cpu9 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est9 attach returned 6 p4tcc9: on cpu9 est10: on cpu10 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est10 attach returned 6 p4tcc10: on cpu10 est11: on cpu11 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est11 attach returned 6 p4tcc11: on cpu11 est12: on cpu12 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est12 attach returned 6 p4tcc12: on cpu12 est13: on cpu13 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est13 attach returned 6 p4tcc13: on cpu13 est14: on cpu14 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est14 attach returned 6 p4tcc14: on cpu14 est15: on cpu15 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 12 device_attach: est15 attach returned 6 p4tcc15: on cpu15 Device configuration finished. bce0: link state changed to DOWN bce1: link state changed to DOWN bce2: link state changed to DOWN bce3: link state changed to DOWN procfs registered ZFS filesystem version: 5 ZFS storage pool version: features support (5000) lapic: Divisor 2, Frequency 66501308 Hz Timecounters tick every 1.000 msec vlan: initialized, using hash tables with chaining tcp_init: net.inet.tcp.tcbhashsize auto tuned to 262144 lo0: bpf attached hptnr: no controller detected. hpt27xx: no controller detected. hptrr: no controller detected. random: unblocking device. 
usbus0: 12Mbps Full Speed USB v1.0 usbus1: 12Mbps Full Speed USB v1.0 usbus2: 480Mbps High Speed USB v2.0 usbus3: 12Mbps Full Speed USB v1.0 ugen0.1: at usbus0 uhub0: on usbus0 ugen1.1: at usbus1 uhub1: on usbus1 ugen2.1: at usbus2 uhub2: on usbus2 ugen3.1: at usbus3 uhub3: on usbus3 usbus4: 12Mbps Full Speed USB v1.0 usbus5: 480Mbps High Speed USB v2.0 ugen4.1: at usbus4 uhub4: on usbus4 ugen5.1: at usbus5 uhub5: on usbus5 ata2: SATA reset: ports status=0x01 ata2: p0: SATA connect time=0ms status=00000113 ata2: reset tp1 mask=01 ostat0=00 ostat1=00 ata2: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb ata2: reset tp2 stat0=00 stat1=00 devices=0x10000 ata3: SATA reset: ports status=0x00 uhub1: 2 ports with 2 removable, self powered uhub0: 2 ports with 2 removable, self powered ses0 at mpt0 bus 0 scbus0 target 8 lun 0 ses0: Fixed Enclosure Services SCSI-5 device ses0: 300.000MB/s transfers ses0: SCSI-3 ENC Device GEOM: new disk da0 da0 at mpt0 bus 0 scbus0 target 0 lun 0 da0: Fixed Direct Access SCSI-5 device da0: Serial Number WD-WCAT1C667462 da0: 300.000MB/s transfers da0: Command Queueing enabled da0: 238418MB (488281250 512 byte sectors: 255H 63S/T 30394C) pass0 at mpt0 bus 0 scbus0 target 0 lun 0 pass0: Fixed Direct Access SCSI-5 device pass0: Serial Number WD-WCAT1C667462 pass0: 300.000MB/s transfers pass0: Command Queueing enabled pass1 at mpt0 bus 0 scbus0 target 8 lun 0 pass1: Fixed Enclosure Services SCSI-5 device pass1: 300.000MB/s transfers da0: Delete methods: pass2 at ata2 bus 0 scbus2 target 0 lun 0 pass2: Removable CD-ROM SCSI-0 device pass2: Serial Number 09070110080122 uhub3: 2 ports with 2 removable, self powered uhub4: 2 ports with 2 removable, self powered pass2: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes) cd0 at ata2 bus 0 scbus2 target 0 lun 0 cd0: Removable CD-ROM SCSI-0 device cd0: Serial Number 09070110080122 cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes) cd0: Attempt to query device size failed: 
NOT READY, Medium not present - tray closed GEOM: new disk cd0 Netvsc initializing... lapic19: CMCI unmasked lapic6: CMCI unmasked lapic22: CMCI unmasked lapic23: CMCI unmasked lapic18: CMCI unmasked lapic5: CMCI unmasked SMP: AP CPU #1 Launched! lapic3: CMCI unmasked lapic20: CMCI unmasked lapic21: CMCI unmasked cpu1 AP: lapic0: CMCI unmasked ID: 0x11000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lapic7: CMCI unmasked lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff lapic2: CMCI unmasked timer: 0x000100ef therm: 0x00010000 err: 0x000000f0lapic4: CMCI unmasked pmc: 0x00010400lapic1: CMCI unmasked cmci: 0x000100f2 SMP: AP CPU #9 Launched! cpu9 AP: ID: 0x01000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 SMP: AP CPU #5 Launched! cpu5 AP: ID: 0x15000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 SMP: AP CPU #2 Launched! cpu2 AP: ID: 0x12000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 SMP: AP CPU #13 Launched! cpu13 AP: ID: 0x05000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 SMP: AP CPU #10 Launched! cpu10 AP: ID: 0x02000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 SMP: AP CPU #15 Launched! 
cpu15 AP: ID: 0x07000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 SMP: AP CPU #7 Launched! cpu7 AP: ID: 0x17000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 SMP: AP CPU #6 Launched! cpu6 AP: ID: 0x16000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 SMP: AP CPU #14 Launched! cpu14 AP: ID: 0x06000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 SMP: AP CPU #4 Launched! cpu4 AP: ID: 0x14000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 SMP: AP CPU #11 Launched! cpu11 AP: ID: 0x03000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 SMP: AP CPU #8 Launched! cpu8 AP: ID: 0x00000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 SMP: AP CPU #3 Launched! 
cpu3 AP: ID: 0x13000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 SMP: AP CPU #12 Launched! cpu12 AP: ID: 0x04000000 VER: 0x00060015 LDR: 0x00000000 DFR: 0xffffffff lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff timer: 0x000100ef therm: 0x00010000 err: 0x000000f0 pmc: 0x00010400 cmci: 0x000000f2 ioapic0: routing intpin 1 (ISA IRQ 1) to lapic 17 vector 48 ioapic0: routing intpin 3 (ISA IRQ 3) to lapic 18 vector 48 ioapic0: routing intpin 4 (ISA IRQ 4) to lapic 19 vector 48 ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 20 vector 48 ioapic0: routing intpin 17 (PCI IRQ 17) to lapic 21 vector 48 ioapic0: routing intpin 18 (PCI IRQ 18) to lapic 22 vector 48 ioapic0: routing intpin 19 (PCI IRQ 19) to lapic 23 vector 48 ioapic0: routing intpin 21 (PCI IRQ 21) to lapic 0 vector 48 ioapic0: routing intpin 23 (PCI IRQ 23) to lapic 1 vector 48 msi: Assigning MSI IRQ 256 to local APIC 2 vector 48 msi: Assigning MSI IRQ 257 to local APIC 3 vector 48 msi: Assigning MSI IRQ 258 to local APIC 4 vector 48 msi: Assigning MSI IRQ 259 to local APIC 5 vector 48 msi: Assigning MSI-X IRQ 260 to local APIC 6 vector 48 msi: Assigning MSI-X IRQ 262 to local APIC 17 vector 49 msi: Assigning MSI-X IRQ 263 to local APIC 18 vector 49 msi: Assigning MSI-X IRQ 264 to local APIC 19 vector 49 msi: Assigning MSI-X IRQ 265 to local APIC 7 vector 48 msi: Assigning MSI-X IRQ 266 to local APIC 20 vector 49 msi: Assigning MSI-X IRQ 267 to local APIC 21 vector 49 msi: Assigning MSI-X IRQ 268 to local APIC 22 vector 49 msi: Assigning MSI-X IRQ 269 to local APIC 23 vector 49 msi: Assigning MSI-X IRQ 271 to local APIC 0 vector 49 msi: Assigning MSI-X IRQ 272 to local APIC 1 vector 49 msi: Assigning MSI-X IRQ 273 to local APIC 2 vector 49 msi: Assigning MSI-X IRQ 274 to local APIC 3 vector 49 msi: Assigning MSI-X IRQ 275 to local 
APIC 17 vector 50 msi: Assigning MSI-X IRQ 276 to local APIC 4 vector 49 msi: Assigning MSI-X IRQ 277 to local APIC 5 vector 49 msi: Assigning MSI-X IRQ 278 to local APIC 6 vector 49 msi: Assigning MSI-X IRQ 279 to local APIC 7 vector 49 msi: Assigning MSI-X IRQ 280 to local APIC 18 vector 50 SMP: passed TSC synchronization test TSC timecounter discards lower 1 bit(s) Timecounter "TSC-low" frequency 1130522162 Hz quality 1000 Root mount waiting for: usbus5 usbus2 uhub2: 4 ports with 4 removable, self powered uhub5: 4 ports with 4 removable, self powered Root mount waiting for: usbus5 usbus2 ugen2.2: at usbus2 uhub6: on usbus2 uhub6: MTT enabled Root mount waiting for: usbus2 uhub6: 3 ports with 3 removable, self powered Trying to mount root from zfs:tank/root []... ugen3.2: at usbus3 ugen0.2: at usbus0 ukbd0: on usbus3 kbd2 at ukbd0 kbd2: ukbd0, generic (0), config:0x0, flags:0x3d0000 ukbd1: on usbus0 kbd3 at ukbd1 kbd3: ukbd1, generic (0), config:0x0, flags:0x3d0000 start_init: trying /sbin/init Setting hostuuid: 4c4c4544-0048-5110-8052-cac04f46344a. Setting hostid: 0x52385f9a. Entropy harvesting: interrupts ethernet point_to_point swi. Starting file system checks: Mounting local file systems:. Writing entropy file:. bridge0: bpf attached bridge0: Ethernet address: 02:52:38:5f:9a:00 Created clone interfaces: bridge0. bce0: promiscuous mode enabled bridge0: link state changed to DOWN Starting Network: lo0 bce0 bce1 bce2 bce3 igb0 igb1 igb2 igb3 bridge0. 
lo0: flags=8049 metric 0 mtu 16384 options=600003 inet6 ::1 prefixlen 128 inet6 fe80::1%lo0 prefixlen 64 scopeid 0x9 0: inet 127.0.0.1 netmigbask 0xff000000 nd6 options=21 00enk0: flags=8943 me 1tric 0 mtu 1500 options=c01bb<0 RXCSUM,TXCSUM,VLMbAN_MTU,VLAN_HWTApsGGING,JUMBO_MTU, FVLAN_HWCSUM,TSO4ul,VLAN_HWTSO,LINKl STATE> ether 0Dup0:24:e8:39:bb:9ble nd6 options=2x,9 media:w Ethernet autoseColect (none) stntatus: no carrierro bce1: flags=88l:02 onmetric 0 mtu 150e 0 options=c01bigb etherat 00:24:e8:39:bb:e 9d nd6 optionsch=29 medid a: Ethernet autotoselect bce2: fl Uags=8802 metric 0 mtu 1500 options=c01bb i ether 00:24:e8:s 39:bb:9f 0 nd6 oupptions=29 media: EtherneMbt autoselect bcpse3: flags=8802 metrl ic 0 mtu 1500 Duoptions=c01bb ether 00:on24:e8:39:bb:a1 tr nd6 options=29 media: Enethernet autosele ct igb0: flags=ig8c02 metricin 0 mtu 1500 opk tions=403bb d ether 00:1b:21:3toe:fe:c8 nd6 op Utions=29 media: Ethernet autoselect (1000baseT ) status: active igb1: flags=8c02 metric 0 mtu 1500 options=403bb ether 00:1b:21:3e:fe:c9 nd6 options=29 media: Ethernet autoselect status: no carrier igb2: flags=8c02 metric 0 mtu 1500 options=403bb ether 00:1b:21:3e:fe:cc nd6 options=29 media: Ethernet autoselect (1000baseT ) status: active igb3: flags=8c02 metric 0 mtu 1500 options=403bb ether 00:1b:21:3e:fe:cd nd6 options=29 media: Ethernet autoselect status: no carrier bridge0: flags=8843 metric 0 mtu 1500 ether 02:52:38:5f:9a:00 nd6 options=9 id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200 root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0 member: bce0 flags=143 ifmaxaddr 0 port 1 priority 128 path cost 55 Starting devd. Starting Network: bce1. bce1: flags=8802 metric 0 mtu 1500 options=c01bb ether 00:24:e8:39:bb:9d nd6 options=29 media: Ethernet autoselect Starting Network: bce2. 
bce2: flags=8802 metric 0 mtu 1500 options=c01bb ether 00:24:e8:39:bb:9f nd6 options=29 media: Ethernet autoselect Starting Network: bce3. bce3: flags=8802 metric 0 mtu 1500 options=c01bb ether 00:24:e8:39:bb:a1 nd6 options=29 media: Ethernet autoselect Starting Network: igb0. igb0: flags=8c02 metric 0 mtu 1500 options=403bb ether 00:1b:21:3e:fe:c8 nd6 options=29 media: Ethernet autoselect (1000baseT ) status: active Starting Network: igb1. igb1: flags=8c02 metric 0 mtu 1500 options=403bb ether 00:1b:21:3e:fe:c9 nd6 options=29 media: Ethernet autoselect status: no carrier Starting Network: igb2. igb2: flags=8c02 metric 0 mtu 1500 options=403bb ether 00:1b:21:3e:fe:cc nd6 options=29 media: Ethernet autoselect (1000baseT ) status: active NMI ISA 20, EISA ff NMI ISA 30, EISA ff NMI ISA 20, EISA ff NMI ... going to debugger NMI ... going to debugger NMI ISA 20, EISA ff NMI ISA 20, EISA ff NMI ISA 20, EISA ff NMI ... going to debugger NMI ... going to debugger NMI ISA 30, EISA ff NMI ISA 20, EISA ff NMI ISA 30, EISA ff NMI ... going to debugger NMI ... going to debugger NMI ISA 20, EISA ff NMI ISA 20, EISA ff NMI ISA 20, EISA ff NMI ... going to debugger NMI ... going to debugger NMI ... going to debugger NMI ... going to debugger NMI ISA 20, EISA ff NMI ... going to debugger NMI ISA 20, EISA ff NMI ISA 20, EISA ff NMI ... going to debugger NMI ... going to debugger NMI ... going to debugger bce0: link state changed to UP NMI ... going to debugger bridge0: link state changed to UP bce0: NMI ISA 20, EISA ff Gigabit link up! NMI ... going to debugger bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) Starting Network: igb3. igb3: flags=8c02 metric 0 mtu 1500 options=403bb ether 00:1b:21:3e:febc:cd nd6 optione0s=29 medgaia: Ethernet autbit link up! 
boselect statusce: no carrier 0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) ums0: bce0: on usbus3 discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) ums0: bce0: 3 buttons and [Z] coordinates ID=0 discard frame w/o leading ethernet header (len 0 pkt len 0) ums1: bce0: on usbus0 discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) ums1: bce0: 5 buttons and [XYZ] coordinates ID=1 discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) Starting dhclient. 
bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) DHCPDISCOVER on bce0 to 255.255.255.255 port 67 interval 8 bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o 
leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet 
header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) bce0: discard frame w/o leading ethernet header (len 0 pkt len 0) DHCPDISCOVER on bce0 to 255.255.255.255 port 67 interval 8 DHCPDISCOVER on bce0 to 255.255.255.255 port 67 interval 15 DHCPDISCOVER on bce0 to 255.255.255.255 port 67 interval 13 DHCPDISCOVER on bce0 to 255.255.255.255 port 67 interval 16 No DHCPOFFERS received. No working leases in persistent database - sleeping. Starting ums0 moused. Starting ums1 moused. add net fe80::: gateway ::1 add net ff02::: gateway ::1 add net ::ffff:0.0.0.0: gateway ::1 add net ::0.0.0.0: gateway ::1 Generating host.conf. Waiting 30s for the default route interface: ............................. Creating and/or trimming log files. Starting syslogd. ELF ldconfig path: /lib /usr/lib /usr/lib/compat 32-bit compatibility ldconfig path: /usr/lib32 Clearing /tmp (X related). Updating motd:. Mounting late file systems:. Configuring syscons: blanktime. Performing sanity check on sshd configuration. Starting sshd. Starting cron. Starting background file system checks in 60 seconds. 
Wed Aug 13 17:44:10 BST 2014 From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 18:24:11 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6CF4484D for ; Wed, 13 Aug 2014 18:24:11 +0000 (UTC) Received: from mail-ob0-f182.google.com (mail-ob0-f182.google.com [209.85.214.182]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3155B2B5C for ; Wed, 13 Aug 2014 18:24:10 +0000 (UTC) Received: by mail-ob0-f182.google.com with SMTP id wm4so102665obc.41 for ; Wed, 13 Aug 2014 11:24:03 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:references:mime-version:in-reply-to:content-type :content-transfer-encoding:message-id:cc:from:subject:date:to; bh=8hI5sMiVWxmvhwHcbSK/tu4nO1drTeB1Ht0aZqlcwhY=; b=RFc7nMXVc3QbpHGkLJrAUwRNIpeo/t3Z4ClJG5X6yJyoRNn/tjhtwCK4DqWyc+4Gj0 Gy0snqXr+/RdUfD+qDSHGFTDTwFILt3lQrW0yTMz663W4kr02NX0NvehYpdwJ6dp9i+B L8KVBcuDJZcAyGWG60hoWMjW7y0KzxxU8lHdxBJ6clMbK2p48W0Vv4QqqSBABpticU+i zIxYhTep4yOJaN20W5O8xrxLoY2p5aX9aaxi5s33F2Qsxuen+jGj2kxPT46sDlRpXrkq SfR/DWHRMqaBUE7QcevZhas8NkvjTt9T/x0caqsMkhC/uiKtO/n3MKu5XJ06F0XaUUjX yXmg== X-Gm-Message-State: ALoCoQlWaRwsKzVUI/UetJMDQLlVZR75d6nog8WbqWpXJU2qdiTL86zgcStEZODP1ptYoS/54LFl X-Received: by 10.60.62.66 with SMTP id w2mr5823226oer.43.1407947159541; Wed, 13 Aug 2014 09:25:59 -0700 (PDT) Received: from [172.21.0.13] (65-36-83-120.static.grandenetworks.net. 
[65.36.83.120]) by mx.google.com with ESMTPSA id ca1sm3411915oec.16.2014.08.13.09.25.58 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 13 Aug 2014 09:25:58 -0700 (PDT) References: <1407892565.51895.YahooMailNeo@web121605.mail.ne1.yahoo.com> <53EAC5E8.2050207@sentex.net> <1407936252.96291.YahooMailNeo@web121601.mail.ne1.yahoo.com> Mime-Version: 1.0 (1.0) In-Reply-To: <1407936252.96291.YahooMailNeo@web121601.mail.ne1.yahoo.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Message-Id: X-Mailer: iPhone Mail (11D257) From: Jim Thompson Subject: Re: Intel Support for FreeBSD Date: Wed, 13 Aug 2014 11:25:57 -0500 To: Barney Cordoba Cc: "freebsd-net@freebsd.org" , Mike Tancsa X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 18:24:11 -0000

> On Aug 13, 2014, at 8:24, Barney Cordoba via freebsd-net wrote:
>
> Negative Progress is inevitable.

Many here undoubtedly consider the referenced effort to be the opposite.

Jim

From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 18:49:51 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 042AEEE8 for ; Wed, 13 Aug 2014 18:49:51 +0000 (UTC) Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "funkthat.com", Issuer "funkthat.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id B15D72E21 for ; Wed, 13 Aug 2014 18:49:50 +0000 (UTC) Received: from h2.funkthat.com (localhost [127.0.0.1]) by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id s7DInnQv074143 (version=TLSv1/SSLv3
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 13 Aug 2014 11:49:49 -0700 (PDT) (envelope-from jmg@h2.funkthat.com) Received: (from jmg@localhost) by h2.funkthat.com (8.14.3/8.14.3/Submit) id s7DInn2D074142; Wed, 13 Aug 2014 11:49:49 -0700 (PDT) (envelope-from jmg) Date: Wed, 13 Aug 2014 11:49:49 -0700 From: John-Mark Gurney To: Barney Cordoba Subject: Re: Intel Support for FreeBSD Message-ID: <20140813184949.GF83475@funkthat.com> Mail-Followup-To: Barney Cordoba , Mike Tancsa , "freebsd-net@freebsd.org" References: <1407892565.51895.YahooMailNeo@web121605.mail.ne1.yahoo.com> <53EAC5E8.2050207@sentex.net> <1407936252.96291.YahooMailNeo@web121601.mail.ne1.yahoo.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1407936252.96291.YahooMailNeo@web121601.mail.ne1.yahoo.com> User-Agent: Mutt/1.4.2.3i X-Operating-System: FreeBSD 7.2-RELEASE i386 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (h2.funkthat.com [127.0.0.1]); Wed, 13 Aug 2014 11:49:49 -0700 (PDT) Cc: "freebsd-net@freebsd.org" , Mike Tancsa X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 18:49:51 -0000 Barney Cordoba via freebsd-net wrote this message on Wed, Aug 13, 2014 at 06:24 -0700: > Ok. 
It was a lot more convenient when it was a standalone module/tarball so you didn't have to surgically extract it from the tree and spend a week trying to get it to compile with whatever version you happened to be running. So if you're running 9.1 or 9.2 you could still use it seamlessly.
>
> Negative Progress is inevitable.

The problem is that you are using an old version of FreeBSD that only provides security updates... The correct solution is to update your machines... I'd much rather have Intel support it in tree, meaning that supported versions of FreeBSD have an up-to-date driver, than to cater to your wants of using older releases of FreeBSD...

Thanks.

> On Tuesday, August 12, 2014 9:57 PM, Mike Tancsa wrote:
>
> On 8/12/2014 9:16 PM, Barney Cordoba via freebsd-net wrote:
> > I notice that there hasn't been an update in the Intel Download Center since July. Is there no official support for 10?
>
> Hi,
> The latest code is committed directly into the tree by Intel
>
> eg
> http://lists.freebsd.org/pipermail/svn-src-head/2014-July/060947.html
> and
> http://lists.freebsd.org/pipermail/svn-src-head/2014-June/059904.html
>
> They have been MFC'd to RELENG_10 a few weeks ago

--
John-Mark Gurney Voice: +1 415 225 5579
"All that I will do, has been done, All that I have, has not."
From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 20:15:17 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 34843C81; Wed, 13 Aug 2014 20:15:17 +0000 (UTC) Received: from forward-corp1f.mail.yandex.net (forward-corp1f.mail.yandex.net [IPv6:2a02:6b8:0:801::10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "forwards.mail.yandex.net", Issuer "Certum Level IV CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id D496828CC; Wed, 13 Aug 2014 20:15:16 +0000 (UTC) Received: from smtpcorp4.mail.yandex.net (smtpcorp4.mail.yandex.net [95.108.252.2]) by forward-corp1f.mail.yandex.net (Yandex) with ESMTP id B78BA242003E; Thu, 14 Aug 2014 00:15:13 +0400 (MSK) Received: from smtpcorp4.mail.yandex.net (localhost [127.0.0.1]) by smtpcorp4.mail.yandex.net (Yandex) with ESMTP id 842892C0466; Thu, 14 Aug 2014 00:15:13 +0400 (MSK) Received: from unknown (unknown [2a02:6b8:0:401:222:4dff:fe50:cd2f]) by smtpcorp4.mail.yandex.net (nwsmtp/Yandex) with ESMTPSA id D5oUBPMVhc-FDI8IIgA; Thu, 14 Aug 2014 00:15:13 +0400 (using TLSv1 with cipher AES128-SHA (128/128 bits)) (Client certificate not present) X-Yandex-Uniq: bec30993-e604-4ed7-b5f9-0b49c088db7c DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1407960913; bh=e7LzCQGxvt/6/+XjalQzxu+zIdQQ8+xm1C0tENFEx84=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: Content-Type:Content-Transfer-Encoding; b=x4qYgWvKxHN9sguEIgji/qUq1X7+qfZQXsgqMHAc95Gb83WHRYW3pt2pf4hf+gy7q +HddDu2vV3EG7Upwz2DYF9W8vaxQLa+sMvl0yLJezI72Cd+AY8Z0jo6RMUJM59zdHL CREt6TzhbytGOLKd8dYBLw0ggvas8HClDx/pk9lM= Authentication-Results: smtpcorp4.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Message-ID: <53EBC750.1050203@yandex-team.ru> Date: Thu, 14 Aug 2014 00:15:12 +0400 From: "Alexander 
V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: "freebsd-net@freebsd.org" , freebsd-ipfw , Luigi Rizzo Subject: [CFT] new tables for ipfw Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "Andrey V. Elsukov" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Aug 2014 20:15:17 -0000

Hello list.

(sorry for posting twice, patch seems to be too big to be posted as attachment).

I've been hacking ipfw for a while and it seems there is something ready to test/review in the projects/ipfw branch.

Main user-visible changes are related to tables:

1) Tables are now identified by names, not numbers. There can be up to 65k tables with up to 63-byte long names (*1).
2) Tables are now set-aware (default off), so you can switch/move them atomically with rules.
3) More functionality is supported (swap, lock, limits, user-level lookup, batched add/del) by generic table code.
4) New table types are added (flow) so you can match multiple packet fields at once.
5) Ability to add different types of lookup algorithms for a particular table type has been added.
6) New table algorithms are added (cidr:hash, iface:array, number:array and flow:hash) to make certain types of lookup more efficient.
7) No ABI breakage has happened: all functionality supported by old ipfw(8) remains functional. Old & new binaries can work together with the following restrictions:
* Tables named other than ^\d+$ are shown as table(65535) in ruleset in old binaries
* I'm a bit unsure about "lookup src-port|dst-port N" case, something may be broken here. Anyway, this can be fixed for MFC.
Some examples (see ipfw(8) manual page for the description):

0:02 [2] zfscurr0# ipfw table fl2 create type flow:src-ip,proto,dst-port algo flow:hash
0:02 [2] zfscurr0# ipfw table fl2 info
+++ table(fl2), set(0) +++
 kindex: 0, type: flow:src-ip,proto,dst-port
 valtype: number, references: 0
 algorithm: flow:hash
 items: 0, size: 280
0:02 [2] zfscurr0# ipfw table fl2 add 2a02:6b8::333,tcp,443 45000
0:02 [2] zfscurr0# ipfw table fl2 add 10.0.0.92,tcp,80 22000
0:02 [2] zfscurr0# ipfw table fl2 list
+++ table(fl2), set(0) +++
2a02:6b8::333,6,443 45000
10.0.0.92,6,80 22000
0:02 [2] zfscurr0# ipfw add 200 count tcp from me to 78.46.89.105 80 flow 'table(fl2)'

ipfw table mi_test create type cidr algo "cidr:hash masks=/30,/64"
ipfw table mi_test add 10.0.0.8/30
ipfw table mi_test add 2a02:6b8:b010::1/64 25

# ipfw table si add 1.1.1.1/32 1111 2.2.2.2/32 2222
added: 1.1.1.1/32 1111
added: 2.2.2.2/32 2222
# ipfw table si add 2.2.2.2/32 2200 4.4.4.4/32 4444
exists: 2.2.2.2/32 2200
added: 4.4.4.4/32 4444
ipfw: Adding record failed: record already exists
^^^^^ Returns error but keeps inserted items
# ipfw table si list
+++ table(si), set(0) +++
1.1.1.1/32 1111
2.2.2.2/32 2222
4.4.4.4/32 4444
# ipfw table si atomic add 3.3.3.3/32 3333 4.4.4.4/32 4400 5.5.5.5/32 5555
added(reverted): 3.3.3.3/32 3333
exists: 4.4.4.4/32 4400
ignored: 5.5.5.5/32 5555
ipfw: Adding record failed: record already exists
^^^^^ Returns error and reverts added records

IPFW internals have also changed significantly, mostly the userland-interaction part. Changing table identifiers from numbers to names resulted in format modification for most sockopt codes. Old sopt format was compact, but very hard to extend (no versioning, inability to add more opcodes), so

1) All relevant opcodes were converted to TLV-based versioned IP_FW3-based codes.
2) The remaining opcodes (except NAT handlers) were also converted, to be able to eliminate all older opcodes at once
3) All IP_FW3 handlers use a special API instead of calling sooptcopy* directly, to ease adding other communication methods
4) struct ip_fw is now different for kernel and userland
5) tablearg value has been changed to 0 to ease future extensions
6) Batched add/delete has been added to tables code
7) Batched rule addition is coming soon (most of the changes have already been done)
8) Interface tracking API has been added (started on demand) to permit efficient interface table operations
9) O(1) skipto cache (*2), currently turned on by default (eats 512K). This has to be made optional
10) Rule counters were separated from the rule itself and made per-cpu. However, this part is not finished yet (problems with timestamps/api)
11) Make radix entries fit into 128 bytes
12) Make struct ip_fw more compact so more rules will fit into 64 bytes
13) Make interface tables use an array of existing ifindexes for faster match
14) Several steps have been made towards making libipfw:
* most of the new functions were separated into "parse/prepare/show" and "actually-do-stuff" pieces.
* there are separate functions for parsing a text string into "struct ip_fw" and printing "struct ip_fw" to a supplied buffer.
15) Probably some more less significant/forgotten features

This is not the final version: probably more documentation/style work is required, there are definitely some uncaught bugs, and so on. However, test/feedback/review is welcome.

All these changes are available in the projects/ipfw branch (synced to recent -HEAD), but may be easily applied to recent 9/10 (at least the kernel part).
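For readers comparing the two batched-add transcripts earlier in this message, the difference between plain "add" and "atomic add" can be modelled roughly as follows. This is only a minimal sketch in Python, assuming a table is a plain dict keyed by prefix string; batch_add and its status strings are hypothetical names chosen to mirror the transcript, not the kernel or ipfw(8) implementation.

```python
def batch_add(table, entries, atomic=False):
    """Toy model of batched table insertion (hypothetical helper).

    table   -- dict mapping key (e.g. "1.1.1.1/32") to value
    entries -- list of (key, value) pairs to insert in order
    atomic  -- if True, any failure reverts this batch's insertions

    Returns (ok, report); report holds (status, key, value) tuples.
    """
    report = []
    added_idx = []      # positions in report of entries we inserted
    failed = False
    for key, value in entries:
        if atomic and failed:
            # After the first failure an atomic batch skips the rest.
            report.append(("ignored", key, value))
        elif key in table:
            report.append(("exists", key, value))
            failed = True
        else:
            table[key] = value
            added_idx.append(len(report))
            report.append(("added", key, value))
    if atomic and failed:
        # Undo everything this batch inserted before the failure.
        for i in added_idx:
            _, key, value = report[i]
            del table[key]
            report[i] = ("added(reverted)", key, value)
    return (not failed), report
```

In this model both modes report an error when a record already exists; the non-atomic call keeps whatever it managed to insert, while the atomic call leaves the table exactly as it was before the batch.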
Branch: svn://svn.freebsd.org/base/projects/ipfw
Web: http://svnweb.freebsd.org/base/projects/ipfw/

Today's patch to -HEAD is available at http://static.ipfw.ru/patches/ipfw_tables3.diff

From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 21:55:47 2014
References: <1407892565.51895.YahooMailNeo@web121605.mail.ne1.yahoo.com> <53EAC5E8.2050207@sentex.net> <1407936252.96291.YahooMailNeo@web121601.mail.ne1.yahoo.com>
Message-ID: <1407966800.12683.YahooMailNeo@web121605.mail.ne1.yahoo.com>
Date: Wed, 13 Aug 2014 14:53:20 -0700
From: Barney Cordoba
Reply-To: Barney Cordoba
Subject: Re: Intel Support for FreeBSD
To: Jim Thompson
Cc: "freebsd-net@freebsd.org", Mike Tancsa

It's not an either/or. Until last July there was both. Like F'ing Intel isn't making enough money to pay someone to maintain a FreeBSD version.

On Wednesday, August 13, 2014 2:24 PM, Jim Thompson wrote:

> On Aug 13, 2014, at 8:24, Barney Cordoba via freebsd-net wrote:
>
> Negative Progress is inevitable.

Many here undoubtedly consider the referenced effort to be the opposite.

Jim
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 21:58:36 2014
References: <1407892565.51895.YahooMailNeo@web121605.mail.ne1.yahoo.com> <53EAC5E8.2050207@sentex.net> <1407936252.96291.YahooMailNeo@web121601.mail.ne1.yahoo.com> <20140813184949.GF83475@funkthat.com>
Message-ID: <1407967108.71480.YahooMailNeo@web121604.mail.ne1.yahoo.com>
Date: Wed, 13 Aug 2014 14:58:28 -0700
From: Barney Cordoba
Reply-To: Barney Cordoba
Subject: Re: Intel Support for FreeBSD
To: John-Mark Gurney
In-Reply-To: <20140813184949.GF83475@funkthat.com>
Cc: "freebsd-net@freebsd.org", Mike Tancsa

This kind of stupidity really irritates me. The commercial use of FreeBSD is the only reason that there is a project, and anyone with 1/2 a brain knows that companies with products based on freebsd can't just upgrade their tree every time some geek gets around to writing a patch. Maybe its the reason that linux sucks but everyone uses it? 10 years later, some old brain dead mentality.

On Wednesday, August 13, 2014 2:49 PM, John-Mark Gurney wrote:

Barney Cordoba via freebsd-net wrote this message on Wed, Aug 13, 2014 at 06:24 -0700:
> Ok. It was a lot more convenient when it was a standalone module/tarball so you didn't have to surgically extract it from the tree and spend a week trying to get it to compile with whatever version you happened to be running. So if you're running 9.1 or 9.2 you could still use it seamlessly.
>
> Negative Progress is inevitable.

The problem is that you are using an old version of FreeBSD that only
provides security update...  The correct solution is to update your
machines...

I'd much rather have Intel support it in tree, meaning that supported
versions of FreeBSD have an up to date driver, than to cater to your
wants of using older releases of FreeBSD...

Thanks.

> On Tuesday, August 12, 2014 9:57 PM, Mike Tancsa wrote:
>
> On 8/12/2014 9:16 PM, Barney Cordoba via freebsd-net wrote:
>
> > I notice that there hasn't been an update in the Intel Download Center since July. Is there no official support for 10?
>
> Hi,
> The latest code is committed directly into the tree by Intel
>
> eg
> http://lists.freebsd.org/pipermail/svn-src-head/2014-July/060947.html
> and
> http://lists.freebsd.org/pipermail/svn-src-head/2014-June/059904.html
>
> They have been MFC'd to RELENG_10 a few weeks ago

-- 
  John-Mark Gurney                Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."

From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 22:07:47 2014
Subject: Re: Intel Support for FreeBSD
From: Jim Thompson
In-Reply-To: <1407967108.71480.YahooMailNeo@web121604.mail.ne1.yahoo.com>
Date: Wed, 13 Aug 2014 17:07:40 -0500
References: <1407892565.51895.YahooMailNeo@web121605.mail.ne1.yahoo.com> <53EAC5E8.2050207@sentex.net> <1407936252.96291.YahooMailNeo@web121601.mail.ne1.yahoo.com> <20140813184949.GF83475@funkthat.com> <1407967108.71480.YahooMailNeo@web121604.mail.ne1.yahoo.com>
To: Barney Cordoba
Cc: "freebsd-net@freebsd.org", John-Mark Gurney, Mike Tancsa

Barney,

I think everyone on-list understand you’re upset. You’ve made that clear.

However, (and I’ll put my vendor hat on), the project does not exist solely for the benefit of the companies who choose to use it in their product(s).

Given same, your statement that “the commercial use of FreeBSD is the only reason that there is a project” is incorrect. It is a reason, but not the only reason.

Jim

> On Aug 13, 2014, at 4:58 PM, Barney Cordoba via freebsd-net wrote:
>
> This kind of stupidity really irritates me. The commercial use of FreeBSD is the only reason that there is a project, and anyone with 1/2 a brain knows that companies with products based on freebsd can't just upgrade their tree every time some geek gets around to writing a patch. Maybe its the reason that linux sucks but everyone uses it? 10 years later, some old brain dead mentality.
>
>
> On Wednesday, August 13, 2014 2:49 PM, John-Mark Gurney wrote:
>
> Barney Cordoba via freebsd-net wrote this message on Wed, Aug 13, 2014 at 06:24 -0700:
>> Ok. It was a lot more convenient when it was a standalone module/tarball so you didn't have to surgically extract it from the tree and spend a week trying to get it to compile with whatever version you happened to be running. So if you're running 9.1 or 9.2 you could still use it seamlessly.
>>
>> Negative Progress is inevitable.
>
> The problem is that you are using an old version of FreeBSD that only
> provides security update...  The correct solution is to update your
> machines...
>
> I'd much rather have Intel support it in tree, meaning that supported
> versions of FreeBSD have an up to date driver, than to cater to your
> wants of using older releases of FreeBSD...
>
> Thanks.
>
>
>> On Tuesday, August 12, 2014 9:57 PM, Mike Tancsa wrote:
>>
>> On 8/12/2014 9:16 PM, Barney Cordoba via freebsd-net wrote:
>>
>>> I notice that there hasn't been an update in the Intel Download Center since July. Is there no official support for 10?
>>
>> Hi,
>> The latest code is committed directly into the tree by Intel
>>
>> eg
>> http://lists.freebsd.org/pipermail/svn-src-head/2014-July/060947.html
>> and
>> http://lists.freebsd.org/pipermail/svn-src-head/2014-June/059904.html
>>
>> They have been MFC'd to RELENG_10 a few weeks ago
>
> -- 
> John-Mark Gurney                Voice: +1 415 225 5579
>
> "All that I will do, has been done, All that I have, has not."

From owner-freebsd-net@FreeBSD.ORG Wed Aug 13 23:49:57 2014
Date: Wed, 13 Aug 2014 16:49:55 -0700
From: John-Mark Gurney
To: Barney Cordoba
Subject: Re: Intel Support for FreeBSD
Message-ID: <20140813234955.GJ83475@funkthat.com>
References: <1407892565.51895.YahooMailNeo@web121605.mail.ne1.yahoo.com> <53EAC5E8.2050207@sentex.net> <1407936252.96291.YahooMailNeo@web121601.mail.ne1.yahoo.com> <20140813184949.GF83475@funkthat.com> <1407967108.71480.YahooMailNeo@web121604.mail.ne1.yahoo.com>
In-Reply-To: <1407967108.71480.YahooMailNeo@web121604.mail.ne1.yahoo.com>
Cc: "freebsd-net@freebsd.org", Mike Tancsa

Barney Cordoba wrote this message on Wed, Aug 13, 2014 at 14:58 -0700:
> This kind of stupidity really irritates me. The commercial use of FreeBSD is the only reason that there is a project, and anyone with 1/2 a brain knows that companies with products based on freebsd can't just upgrade their tree every time some geek gets around to writing a patch. Maybe its the reason that linux sucks but everyone uses it? 10 years later, some old brain dead mentality.

Clearly your gripe is w/ Intel, not the FreeBSD community..  Intel
changed how they supported their driver..  We cannot change what Intel
does...  Please go complain to your vendor, and as you're a commercial
user of their hardware, they should listen to you...

> On Wednesday, August 13, 2014 2:49 PM, John-Mark Gurney wrote:
>
> Barney Cordoba via freebsd-net wrote this message on Wed, Aug 13, 2014 at 06:24 -0700:
> > Ok. It was a lot more convenient when it was a standalone module/tarball so you didn't have to surgically extract it from the tree and spend a week trying to get it to compile with whatever version you happened to be running. So if you're running 9.1 or 9.2 you could still use it seamlessly.
> >
> > Negative Progress is inevitable.
>
> The problem is that you are using an old version of FreeBSD that only
> provides security update...  The correct solution is to update your
> machines...
>
> I'd much rather have Intel support it in tree, meaning that supported
> versions of FreeBSD have an up to date driver, than to cater to your
> wants of using older releases of FreeBSD...
>
> Thanks.
>
>
> > On Tuesday, August 12, 2014 9:57 PM, Mike Tancsa wrote:
> >
> > On 8/12/2014 9:16 PM, Barney Cordoba via freebsd-net wrote:
> >
> > > I notice that there hasn't been an update in the Intel Download Center since July. Is there no official support for 10?
> >
> > Hi,
> > The latest code is committed directly into the tree by Intel
> >
> > eg
> > http://lists.freebsd.org/pipermail/svn-src-head/2014-July/060947.html
> > and
> > http://lists.freebsd.org/pipermail/svn-src-head/2014-June/059904.html
> >
> > They have been MFC'd to RELENG_10 a few weeks ago

-- 
  John-Mark Gurney                Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 09:23:30 2014
In-Reply-To: <53EBC687.9050503@yandex-team.ru>
References: <53EBC687.9050503@yandex-team.ru>
Date: Thu, 14 Aug 2014 11:23:27 +0200
Subject: Re: [CFT] new tables for ipfw
From: Luigi Rizzo
To: "Alexander V. Chernikov"
Cc: "freebsd-net@freebsd.org", Luigi Rizzo, "Andrey V. Elsukov", freebsd-ipfw

On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov <melifaro@yandex-team.ru> wrote:

> Hello list.
>
> I've been hacking ipfw for a while and It seems there is something ready
> to test/review in projects/ipfw branch.
>

this is a fantastic piece of work, thanks for doing it and for
integrating the feedback.

I have some detailed feedback that will send you privately,
but just a curiosity:

...
>
> Some examples (see ipfw(8) manual page for the description):
>
>
...
>
>
> ipfw table mi_test create type cidr algo "cidr:hash masks=/30,/64"
>

why do we need to specify mask lengths in the above ?
cheers
luigi

From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 09:57:38 2014
Message-ID: <53EC880B.3020903@yandex-team.ru>
Date: Thu, 14 Aug 2014 13:57:31 +0400
From: "Alexander V. Chernikov"
To: Luigi Rizzo
Subject: Re: [CFT] new tables for ipfw
References: <53EBC687.9050503@yandex-team.ru>
Cc: "freebsd-net@freebsd.org", Luigi Rizzo, "Andrey V. Elsukov", freebsd-ipfw

On 14.08.2014 13:23, Luigi Rizzo wrote:
>
> On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov wrote:
>
> Hello list.
>
> I've been hacking ipfw for a while and It seems there is something
> ready to test/review in projects/ipfw branch.
>
>
> this is a fantastic piece of work, thanks for doing it and for
> integrating the feedback.
>
> I have some detailed feedback that will send you privately,
> but just a curiosity:
>
> ...
>
> Some examples (see ipfw(8) manual page for the description):
>
> ...
>
>
> ipfw table mi_test create type cidr algo "cidr:hash masks=/30,/64"
>
>
> why do we need to specify mask lengths in the above ?

Well, since we're hashing IP we have to know mask to cut host bits in advance.
(And the real reason is that I'm too lazy to implement hierarchical matching (check /32, then /31, then /30) like how, for example, this is done in ipset), so this particular algorithm supports only single IPv4 and single IPv6 mask.

Anyway, it is not too hard to add another algo which is doing the above.
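
The single-mask restriction described above can be illustrated with a small sketch. This is Python rather than kernel C, and it is not the ipfw code: the class `MaskedHashTable` and its methods are hypothetical names made up for this example. Only the idea comes from the text: the host bits are cut with a prefix length fixed at table creation (one for IPv4, one for IPv6), so each lookup is a single exact-match hash probe.

```python
import ipaddress


class MaskedHashTable:
    """Toy model (hypothetical, not ipfw code): a hash table keyed by
    network address, with one fixed prefix length per address family."""

    def __init__(self, v4_len, v6_len):
        # Prefix lengths are chosen once, as in masks=/30,/64.
        self.prefix_len = {4: v4_len, 6: v6_len}
        self.entries = {}

    def _key(self, addr):
        # Cut the host bits using the prefix length fixed at creation time.
        host_bits = addr.max_prefixlen - self.prefix_len[addr.version]
        return (addr.version, (int(addr) >> host_bits) << host_bits)

    def add(self, prefix, value):
        net = ipaddress.ip_network(prefix, strict=False)
        if net.prefixlen != self.prefix_len[net.version]:
            raise ValueError("prefix length must match the table mask")
        self.entries[self._key(net.network_address)] = value

    def lookup(self, address):
        # One masking step and one hash probe, whatever the table size.
        return self.entries.get(self._key(ipaddress.ip_address(address)))


table = MaskedHashTable(30, 64)
table.add("10.0.0.8/30", 1)
table.add("2a02:6b8:b010::1/64", 25)
```

Because the key is computed before the table is consulted, a prefix of any other length cannot be matched, which is why this sketch rejects it in `add`.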
>
> cheers
> luigi
>

From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 10:44:25 2014
In-Reply-To: <53EC880B.3020903@yandex-team.ru>
References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru>
Date: Thu, 14 Aug 2014 12:44:22 +0200
Subject: Re: [CFT] new tables for ipfw
From: Luigi Rizzo
To: "Alexander V. Chernikov"
Cc: "freebsd-net@freebsd.org", Luigi Rizzo, "Andrey V. Elsukov", freebsd-ipfw

On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov <melifaro@yandex-team.ru> wrote:

> On 14.08.2014 13:23, Luigi Rizzo wrote:
>
> On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov <melifaro@yandex-team.ru> wrote:
>
>> Hello list.
>>
>> I've been hacking ipfw for a while and It seems there is something ready
>> to test/review in projects/ipfw branch.
>>
>
> this is a fantastic piece of work, thanks for doing it and for
> integrating the feedback.
>
> I have some detailed feedback that will send you privately,
> but just a curiosity:
>
> ...
>>
>> Some examples (see ipfw(8) manual page for the description):
>>
>>
>> ...
>>
>>
>> ipfw table mi_test create type cidr algo "cidr:hash masks=/30,/64"
>>
>
> why do we need to specify mask lengths in the above ?
>
> Well, since we're hashing IP we have to know mask to cut host bits in
> advance.
> (And the real reason is that I'm too lazy to implement hierarchical
> matching (check /32, then /31, then /30) like how, for example,
>

oh well for that we should use cidr:radix

Research results have never shown a strong superiority of
hierarchical hash tables over good radix implementations,
and in those cases one usually adopts partial prefix
expansion so you only have, say, masks that are a
multiple of 2..8 bits so you only need a small number of
hash lookups.
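
A rough sketch of the partial prefix expansion mentioned above, in Python and for IPv4 only. It is an illustration under stated assumptions, not an existing implementation: the names `ExpandedHashLPM`, `add` and `lookup` and the default stride of 8 bits are invented for this example. Each prefix is expanded to the next length that is a multiple of the stride, so a longest-prefix lookup needs at most 32/stride hash probes.

```python
import ipaddress


class ExpandedHashLPM:
    """Longest-prefix match over per-length hash tables, with prefixes
    expanded to lengths that are multiples of `stride` (IPv4 only).
    Hypothetical illustration, not taken from any real code base."""

    def __init__(self, stride=8):
        if 32 % stride != 0:
            raise ValueError("stride must divide 32")
        self.stride = stride
        # One hash table per allowed prefix length: stride, 2*stride, ..., 32.
        self.tables = {plen: {} for plen in range(stride, 33, stride)}

    def add(self, prefix, value):
        net = ipaddress.ip_network(prefix, strict=False)
        if net.version != 4 or net.prefixlen == 0:
            raise ValueError("only IPv4 prefixes of length 1..32 are handled")
        # Round the prefix length up to the next multiple of the stride.
        target = -(-net.prefixlen // self.stride) * self.stride
        base = int(net.network_address)
        step = 1 << (32 - target)
        # Insert every expanded sub-prefix of length `target`.
        for key in range(base, base + net.num_addresses, step):
            old = self.tables[target].get(key)
            # A more specific original prefix wins over a less specific one.
            if old is None or old[0] <= net.prefixlen:
                self.tables[target][key] = (net.prefixlen, value)

    def lookup(self, address):
        addr = int(ipaddress.ip_address(address))
        # Probe from the longest allowed length down to the shortest.
        for plen in sorted(self.tables, reverse=True):
            shift = 32 - plen
            hit = self.tables[plen].get((addr >> shift) << shift)
            if hit is not None:
                return hit[1]
        return None


lpm = ExpandedHashLPM(stride=8)
lpm.add("10.0.0.0/8", "a")
lpm.add("10.0.0.0/15", "b")   # stored as two /16 entries
lpm.add("10.1.2.1/32", "d")
lpm.add("10.1.2.0/30", "c")   # stored as four /32 entries, /32 "d" is kept
```

With a stride of 8 a lookup probes at most four tables (/32, /24, /16, /8); the price is memory, since one prefix can expand into up to 2^(stride-1) entries.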
cheers
luigi

From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 10:57:31 2014
Message-ID: <53EC960A.1030603@yandex-team.ru>
Date: Thu, 14 Aug 2014 14:57:14 +0400
From: "Alexander V. Chernikov"
To: Luigi Rizzo
Subject: Re: [CFT] new tables for ipfw
References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru>
Cc: "freebsd-net@freebsd.org", Luigi Rizzo, "Andrey V. Elsukov", freebsd-ipfw

On 14.08.2014 14:44, Luigi Rizzo wrote:
>
> On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov wrote:
>
> On 14.08.2014 13:23, Luigi Rizzo wrote:
>>
>> On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov wrote:
>>
>> Hello list.
>>
>> I've been hacking ipfw for a while and It seems there is
>> something ready to test/review in projects/ipfw branch.
>>
>>
>> this is a fantastic piece of work, thanks for doing it and for
>> integrating the feedback.
>>
>> I have some detailed feedback that will send you privately,
>> but just a curiosity:
>>
>> ...
>>
>> Some examples (see ipfw(8) manual page for the description):
>>
>> ...
>>
>>
>> ipfw table mi_test create type cidr algo "cidr:hash masks=/30,/64"
>>
>>
>> why do we need to specify mask lengths in the above ?
> Well, since we're hashing IP we have to know mask to cut host bits
> in advance.
> (And the real reason is that I'm too lazy to implement
> hierarchical matching (check /32, then /31, then /30) like how,
> for example,
>
>
> oh well for that we should use cidr:radix
>
> Research results have never shown a strong superiority of
> hierarchical hash tables over good radix implementations,
> and in those cases one usually adopts partial prefix
> expansion so you only have, say, masks that are a
> multiple of 2..8 bits so you only need a small number of
> hash lookups.

Definitely, especially for IPv6. So I was actually thinking about covering some special sparse cases (e.g. someone having a bunch of /32 and a bunch of /30 and that's all).

Btw, since we're talking about "good radix implementation": what license does DXR have? :)
Is it OK to merge it as another cidr implementation?

>
> cheers
> luigi
>

From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 11:15:44 2014
4nqTQHGGnAaJ1NNFAJmzRvwearJqITADKCMYdlagnqdWFEXhjLPlURP1qwRX3dZWm688 2VH+KsFvA/XFSfOsXuxnUgyEJfX99ezdZHZH1fCz08ZsUggBFL6JotRUDvHGyvpRln/7 4K10JuvOXEy6EzwBtj1wk3ck/giZSPOg8GeuBPQ5CmCT6PflNpi6nPVxBCPzSnvmaYFW y52dyq0SereKQ36wXscYfMFNB5h4CGZZlrZ1txTB/RmT2T6mihv2Ar29EeRKZB6rNK6Y xAtA== MIME-Version: 1.0 X-Received: by 10.112.56.206 with SMTP id c14mr2605618lbq.27.1408014941450; Thu, 14 Aug 2014 04:15:41 -0700 (PDT) Sender: rizzo.unipi@gmail.com Received: by 10.114.244.2 with HTTP; Thu, 14 Aug 2014 04:15:41 -0700 (PDT) In-Reply-To: <53EC960A.1030603@yandex-team.ru> References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> <53EC960A.1030603@yandex-team.ru> Date: Thu, 14 Aug 2014 13:15:41 +0200 X-Google-Sender-Auth: 0dMTaJr-5B2dBosDFQ5DsvybrVU Message-ID: Subject: Re: [CFT] new tables for ipfw From: Luigi Rizzo To: "Alexander V. Chernikov" Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: "freebsd-net@freebsd.org" , Luigi Rizzo , "Andrey V. Elsukov" , freebsd-ipfw X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 11:15:44 -0000 On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov < melifaro@yandex-team.ru> wrote: > On 14.08.2014 14:44, Luigi Rizzo wrote: > > > > > On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov < > melifaro@yandex-team.ru> wrote: > >> On 14.08.2014 13:23, Luigi Rizzo wrote: >> >> >> >> >> On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov < >> melifaro@yandex-team.ru> wrote: >> >>> Hello list. >>> >>> I've been hacking ipfw for a while and It seems there is something read= y >>> to test/review in projects/ipfw branch. >>> >> >> =E2=80=8Bthis is a fantastic piece of work, thanks for doing it and for >> integrating the feedback. 
>> =E2=80=8B >> I have some detailed feedback that will send you privately, >> but just a curiosity: >> >> =E2=80=8B...=E2=80=8B >>> >>> Some examples (see ipfw(8) manual page for the description): >>> >>> >>> =E2=80=8B... >>> >>> >>> ipfw table mi_test create type cidr algo "cidr:hash masks=3D/30,/64" >>> >> >> =E2=80=8Bwhy do we need to specify mask lengths in the above=E2=80=8B ? >> >> Well, since we're hashing IP we have to know mask to cut host bits in >> advance. >> (And the real reason is that I'm too lazy to implement hierarchical >> matching (check /32, then /31, then /30) like how, for example, >> > > =E2=80=8Boh well for that we should use cidr:radix > > Research results have never shown a strong superiority of > hierarchical hash tables over good radix implementations, > and in those cases one usually adopts partial prefix > expansion so you only have, say, masks that are a > multiple of 2..8 bits so you only need a small number of > hash lookups. > > Definitely, especially for IPv6. So I was actually thinking about coverin= g > some special sparse cases (e.g. someone having a bunch of /32 and a bunch > of /30 and that's all). > > Btw, since we're talking about "good radix implementation": what license > does DXR have? :) > Is it OK to merge it as another cidr implementation? > "cidr" is a very ugly name, i'd rather use "addr" DXR has a =E2=80=8Bbsd license and of course it is possible to use it. You should ask Marko Zec for his latest version of the code (and probably make sure we have one copy of the code in the source tree). Speaking of features, one thing that would be nice is the ability for tables to reference the in-kernel tables (e.g. fibs, socket lists, interface lists...), perhaps in readonly mode. How complex do you think that would be ? 
cheers luigi From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 11:52:41 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 403EDF25; Thu, 14 Aug 2014 11:52:41 +0000 (UTC) Received: from forward-corp1e.mail.yandex.net (forward-corp1e.mail.yandex.net [IPv6:2a02:6b8:0:202::10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "forwards.mail.yandex.net", Issuer "Certum Level IV CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C98E32ABF; Thu, 14 Aug 2014 11:52:40 +0000 (UTC) Received: from smtpcorp4.mail.yandex.net (smtpcorp4.mail.yandex.net [95.108.252.2]) by forward-corp1e.mail.yandex.net (Yandex) with ESMTP id BF81064057F; Thu, 14 Aug 2014 15:52:37 +0400 (MSK) Received: from smtpcorp4.mail.yandex.net (localhost [127.0.0.1]) by smtpcorp4.mail.yandex.net (Yandex) with ESMTP id 8453D2C05F8; Thu, 14 Aug 2014 15:52:37 +0400 (MSK) Received: from unknown (unknown [2a02:6b8:0:401:222:4dff:fe50:cd2f]) by smtpcorp4.mail.yandex.net (nwsmtp/Yandex) with ESMTPSA id hicEnCyaNo-qbI8Ml1r; Thu, 14 Aug 2014 15:52:37 +0400 (using TLSv1 with cipher AES128-SHA (128/128 bits)) (Client certificate not present) X-Yandex-Uniq: e7c1dd03-f5f2-4e3f-8ed2-29970b6f0154 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1408017157; bh=6AcoeR2+4Di8V0QUtEsMUTVKHMoqYXLh+iK3vgGdRwY=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: References:In-Reply-To:Content-Type; b=pwplYCuiuO/PpnZmmpOh6xwU/gvfBqAZdlde+T/GVS1flCmZRvBvO2T+wGUo8bWDo +9i+7tz9nDTB7WvCb3BTKDaQL6TxyznP5DjTKhEvlrd6zCLlJ7n4BnZoqopXg0t1e2 6Id52j3PPPryDRNOJFkX4tmtXtfZdQJh6sdSLiXk= Authentication-Results: smtpcorp4.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Message-ID: <53ECA302.8010100@yandex-team.ru> Date: Thu, 14 Aug 2014 
15:52:34 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Luigi Rizzo Subject: Re: [CFT] new tables for ipfw References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> <53EC960A.1030603@yandex-team.ru> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: "freebsd-net@freebsd.org" , Luigi Rizzo , "Andrey V. Elsukov" , freebsd-ipfw X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 11:52:41 -0000 On 14.08.2014 15:15, Luigi Rizzo wrote: > > > > On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov > > wrote: > > On 14.08.2014 14:44, Luigi Rizzo wrote: >> >> >> >> On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov >> > wrote: >> >> On 14.08.2014 13:23, Luigi Rizzo wrote: >>> >>> >>> >>> On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov >>> > >>> wrote: >>> >>> Hello list. >>> >>> I've been hacking ipfw for a while and It seems there is >>> something ready to test/review in projects/ipfw branch. >>> >>> >>> ​this is a fantastic piece of work, thanks for doing it and for >>> integrating the feedback. >>> ​ >>> I have some detailed feedback that will send you privately, >>> but just a curiosity: >>> >>> ​...​ >>> >>> Some examples (see ipfw(8) manual page for the description): >>> >>> ​... >>> >>> >>> ipfw table mi_test create type cidr algo "cidr:hash >>> masks=/30,/64" >>> >>> >>> ​why do we need to specify mask lengths in the above​ ? >> Well, since we're hashing IP we have to know mask to cut host >> bits in advance. 
>> (And the real reason is that I'm too lazy to implement >> hierarchical matching (check /32, then /31, then /30) like >> how, for example, >> >> >> ​oh well for that we should use cidr:radix >> >> Research results have never shown a strong superiority of >> hierarchical hash tables over good radix implementations, >> and in those cases one usually adopts partial prefix >> expansion so you only have, say, masks that are a >> multiple of 2..8 bits so you only need a small number of >> hash lookups. > Definitely, especially for IPv6. So I was actually thinking about > covering some special sparse cases (e.g. someone having a bunch of > /32 and a bunch of /30 and that's all). > > Btw, since we're talking about "good radix implementation": what > license does DXR have? :) > Is it OK to merge it as another cidr implementation? > > "cidr" is a very ugly name, i'd rather use "addr" Ok, no problem with that. "addr" really sounds better. > > DXR has a ​bsd license and of course it is possible to use it. > You should ask Marko Zec for his latest version of the code > (and probably make sure we have one copy of the code in the source tree). Great!. I'll ask him :) > > Speaking of features, one thing that would be nice is the ability > for tables to reference the in-kernel tables (e.g. fibs, socket > lists, interface lists...), perhaps in readonly mode. > How complex do you think that would be ? Implementing algo support for particular provider like sockets/iflists shouldn't be hard. Most of the algorithms complexity lies in table modifications. Here we have to support lookup and dump operations, so it is the question of providing necessary bindings to existing mechanisms (via some direct binding or utilizing things like kernel_sysctl for dump support). 
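As a rough illustration of the idea above, the following Python sketch models a table that only binds lookup and dump to an existing data source and rejects modifications. Everything in it is an assumption made for illustration: the class `ReadOnlyTable`, its method names, and the use of a callable returning a dict to stand in for a kernel list. It is not how ipfw implements table algorithms.

```python
class ReadOnlyTable:
    """Toy read-only view over an existing mapping (hypothetical names)."""

    def __init__(self, source):
        # 'source' is a callable returning the current mapping; it stands in
        # for a binding to an existing kernel structure. Nothing is cached.
        self._source = source

    def lookup(self, key):
        return self._source().get(key)

    def dump(self):
        return sorted(self._source().items())

    def add(self, key, value):
        raise PermissionError("table is read-only")

    def delete(self, key):
        raise PermissionError("table is read-only")


interfaces = {"em0": 1}
table = ReadOnlyTable(lambda: interfaces)
interfaces["lo0"] = 2  # a change in the underlying list is visible at once
print(table.lookup("lo0"), table.dump())
```

Because the view holds no copy of the data, there is no insert or delete path to get right, which matches the observation that most of an algorithm's complexity lies in table modifications.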
It looks like the following maps well to the current table concept:
* such tables are not created by default
* the user issues
  `ipfw table kfib create type addr algo "addr:kernel fib=0"`
  or
  `ipfw table ktcp create type flow algo "flow:kernel_tcp fib=0"`
  or
  `ipfw table kiface create type iface algo "iface:kernel"`
* tables have a special "readonly" type; flush_all requests are ignored
* no state is stored internally

So the generic table handling code needs to be modified to support read-only tables (and to make more callbacks optional).
Additionally, we might need to proxy the "info" request into an algo callback (optional; "real" algorithms won't implement it) to be able to show the number of items (and some other info) to the user.
>
> cheers
> luigi
>
From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 12:08:57 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C38AE4A0; Thu, 14 Aug 2014 12:08:57 +0000 (UTC) Received: from mail.fer.hr (mail.fer.hr [161.53.72.233]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (Client CN "mail.fer.hr", Issuer "TERENA SSL CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 274362C8A; Thu, 14 Aug 2014 12:08:56 +0000 (UTC) Received: from x23 (31.147.112.155) by MAIL.fer.hr (161.53.72.233) with Microsoft SMTP Server (TLS) id 14.2.342.3; Thu, 14 Aug 2014 14:07:44 +0200 Date: Thu, 14 Aug 2014 14:08:18 +0200 From: Marko Zec To: "Alexander V. 
Chernikov" Subject: Re: [CFT] new tables for ipfw Message-ID: <20140814140818.3539d9c5@x23> In-Reply-To: <53ECA302.8010100@yandex-team.ru> References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> <53EC960A.1030603@yandex-team.ru> <53ECA302.8010100@yandex-team.ru> Organization: FER X-Mailer: Claws Mail 3.9.2 (GTK+ 2.24.19; amd64-portbld-freebsd9.1) MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Originating-IP: [31.147.112.155] Cc: "freebsd-net@freebsd.org" , Luigi Rizzo , freebsd-ipfw X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 12:08:57 -0000 On Thu, 14 Aug 2014 15:52:34 +0400 "Alexander V. Chernikov" wrote: > On 14.08.2014 15:15, Luigi Rizzo wrote: > > > > > > > > On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov=20 > > > wrote: > > > > On 14.08.2014 14:44, Luigi Rizzo wrote: > >> > >> > >> > >> On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov > >> > > >> wrote: > >> > >> On 14.08.2014 13:23, Luigi Rizzo wrote: > >>> > >>> > >>> > >>> On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov > >>> > > >>> wrote: > >>> > >>> Hello list. > >>> > >>> I've been hacking ipfw for a while and It seems there > >>> is something ready to test/review in projects/ipfw branch. > >>> > >>> > >>> =E2=80=8Bthis is a fantastic piece of work, thanks for doing = it > >>> and for integrating the feedback. > >>> =E2=80=8B > >>> I have some detailed feedback that will send you > >>> privately, but just a curiosity: > >>> > >>> =E2=80=8B...=E2=80=8B > >>> > >>> Some examples (see ipfw(8) manual page for the > >>> description): > >>> > >>> =E2=80=8B... 
> >>> > >>> > >>> ipfw table mi_test create type cidr algo "cidr:hash > >>> masks=3D/30,/64" > >>> > >>> > >>> =E2=80=8Bwhy do we need to specify mask lengths in the above= =E2=80=8B ? > >> Well, since we're hashing IP we have to know mask to cut > >> host bits in advance. > >> (And the real reason is that I'm too lazy to implement > >> hierarchical matching (check /32, then /31, then /30) like > >> how, for example, > >> > >> > >> =E2=80=8Boh well for that we should use cidr:radix > >> > >> Research results have never shown a strong superiority of > >> hierarchical hash tables over good radix implementations, > >> and in those cases one usually adopts partial prefix > >> expansion so you only have, say, masks that are a > >> multiple of 2..8 bits so you only need a small number of > >> hash lookups. > > Definitely, especially for IPv6. So I was actually thinking > > about covering some special sparse cases (e.g. someone having a > > bunch of /32 and a bunch of /30 and that's all). > > > > Btw, since we're talking about "good radix implementation": what > > license does DXR have? :) > > Is it OK to merge it as another cidr implementation? > > > > "cidr" is a very ugly name, i'd rather use "addr" > Ok, no problem with that. "addr" really sounds better. > > > > DXR has a =E2=80=8Bbsd license and of course it is possible to use it. > > You should ask Marko Zec for his latest version of the code > > (and probably make sure we have one copy of the code in the source > > tree). > Great!. I'll ask him :) The so far cleanest DXR implementation is significantly C++ poluted and wrapped inside Click glue (available here: http://www.nxab.fer.hr/dxr) I'll try to backport the fixes to the original C-only / BSD implementation over the weekend and let you know how it goes... Marko > > > > Speaking of features, one thing that would be nice is the ability > > for tables to reference the in-kernel tables (e.g. fibs, socket > > lists, interface lists...), perhaps in readonly mode. 
> > How complex do you think that would be ? > Implementing algo support for particular provider like > sockets/iflists shouldn't be hard. Most of the algorithms complexity > lies in table modifications. Here we have to support > lookup and dump operations, so it is the question of providing > necessary bindings to existing mechanisms (via some direct binding or > utilizing things like kernel_sysctl for dump support). >=20 > It looks like the following maps well to current table concept: > * such tables are not created by default > * user issues > `ipfw table kfib create type addr algo "addr:kernel fib=3D0"` > or > `ipfw table ktcp create type flow algo "flow:kernel_tcp fib=3D0"` > or > `ipfw table kiface create type iface algo "iface:kernel"` > * tables have special "readonly" type, flush_all requests are ignored > * no state stored internally >=20 > So generic table handling code needs to be modified to support > read-only tables (and making more callbacks optional). > Additionally, we might need to proxy "info" request info algo > callback (optional, "real" algorithms won't implement it) to be able > to show number of items (and some other info) to user. 
>=20 >=20 >=20 > > > > cheers > > luigi > > >=20 > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 12:08:57 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9F5EE49D; Thu, 14 Aug 2014 12:08:57 +0000 (UTC) Received: from smtp.digiware.nl (unknown [IPv6:2001:4cb8:90:ffff::3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 351032C8B; Thu, 14 Aug 2014 12:08:57 +0000 (UTC) Received: from rack1.digiware.nl (unknown [127.0.0.1]) by smtp.digiware.nl (Postfix) with ESMTP id 788B41534ED; Thu, 14 Aug 2014 14:08:45 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.nl Received: from smtp.digiware.nl ([127.0.0.1]) by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id b7WALQPu2SGS; Thu, 14 Aug 2014 14:08:24 +0200 (CEST) Received: from [IPv6:2001:4cb8:3:1:cdb4:168:33e1:2262] (unknown [IPv6:2001:4cb8:3:1:cdb4:168:33e1:2262]) by smtp.digiware.nl (Postfix) with ESMTP id 9921C1534EC; Thu, 14 Aug 2014 14:08:24 +0200 (CEST) Message-ID: <53ECA6B2.8010003@digiware.nl> Date: Thu, 14 Aug 2014 14:08:18 +0200 From: Willem Jan Withagen Organization: Digiware Management b.v. User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Luigi Rizzo , "Alexander V. 
Chernikov" Subject: Re: [CFT] new tables for ipfw References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> <53EC960A.1030603@yandex-team.ru> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: "freebsd-net@freebsd.org" , Luigi Rizzo , freebsd-ipfw , "Andrey V. Elsukov" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 12:08:57 -0000 On 2014-08-14 13:15, Luigi Rizzo wrote: > On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov < > melifaro@yandex-team.ru> wrote: > >> On 14.08.2014 14:44, Luigi Rizzo wrote: >> >> >> >> >> On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov < >> melifaro@yandex-team.ru> wrote: >> >>> On 14.08.2014 13:23, Luigi Rizzo wrote: >>> >>> >>> >>> >>> On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov < >>> melifaro@yandex-team.ru> wrote: >>> >>>> Hello list. >>>> >>>> I've been hacking ipfw for a while and It seems there is something ready >>>> to test/review in projects/ipfw branch. >>>> >>> >>> ​this is a fantastic piece of work, thanks for doing it and for >>> integrating the feedback. >>> ​ >>> I have some detailed feedback that will send you privately, >>> but just a curiosity: >>> >>> ​...​ >>>> >>>> Some examples (see ipfw(8) manual page for the description): >>>> >>>> >>>> ​... >>>> >>>> >>>> ipfw table mi_test create type cidr algo "cidr:hash masks=/30,/64" >>>> >>> >>> ​why do we need to specify mask lengths in the above​ ? >>> >>> Well, since we're hashing IP we have to know mask to cut host bits in >>> advance. 
>>> (And the real reason is that I'm too lazy to implement hierarchical >>> matching (check /32, then /31, then /30) like how, for example, >>> >> >> ​oh well for that we should use cidr:radix >> >> Research results have never shown a strong superiority of >> hierarchical hash tables over good radix implementations, >> and in those cases one usually adopts partial prefix >> expansion so you only have, say, masks that are a >> multiple of 2..8 bits so you only need a small number of >> hash lookups. >> >> Definitely, especially for IPv6. So I was actually thinking about covering >> some special sparse cases (e.g. someone having a bunch of /32 and a bunch >> of /30 and that's all). >> >> Btw, since we're talking about "good radix implementation": what license >> does DXR have? :) >> Is it OK to merge it as another cidr implementation? >> > > "cidr" is a very ugly name, i'd rather use "addr" > > DXR has a ​bsd license and of course it is possible to use it. > You should ask Marko Zec for his latest version of the code > (and probably make sure we have one copy of the code in the source tree). > > Speaking of features, one thing that would be nice is the ability > for tables to reference the in-kernel tables (e.g. fibs, socket > lists, interface lists...), perhaps in readonly mode. > How complex do you think that would be ? I'm a very happy user of ipfw and I think these are nice improvements and will make things more flexible... I have 2 nits to pick with the current version. I've found the notation ipnr:something rather frustrating when using ipv6 addresses. Sort of like typing a ipv6 address in a browser, the last :xx is always interpreted as portnumber, UNLESS you wrap it in []'s. compare 2001:4cb8:3:1::1 2001:4cb8:3:1::1:80 [2001:4cb8:3:1::1]:80 The first and the last are the same host but a different port, the middle one is just a different host. Could/should we do the same in ipfw? And I keep running into the ipfw add deny all from table(50) to any notation. 
the ()'s need to be escaped in most any shell. Where as I look at the syntax there is little reason to require the ()'s. the keyword table always needs to be followed by a number (and in the new version a (word|number) ). Thanx for the nice work, --WjW From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 12:46:12 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 982711F1; Thu, 14 Aug 2014 12:46:12 +0000 (UTC) Received: from forward-corp1g.mail.yandex.net (forward-corp1g.mail.yandex.net [IPv6:2a02:6b8:0:1402::10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "forwards.mail.yandex.net", Issuer "Certum Level IV CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3CD4A2194; Thu, 14 Aug 2014 12:46:11 +0000 (UTC) Received: from smtpcorp4.mail.yandex.net (smtpcorp4.mail.yandex.net [95.108.252.2]) by forward-corp1g.mail.yandex.net (Yandex) with ESMTP id F0FC7366005A; Thu, 14 Aug 2014 16:46:07 +0400 (MSK) Received: from smtpcorp4.mail.yandex.net (localhost [127.0.0.1]) by smtpcorp4.mail.yandex.net (Yandex) with ESMTP id AF7062C05E8; Thu, 14 Aug 2014 16:46:07 +0400 (MSK) Received: from unknown (unknown [2a02:6b8:0:401:222:4dff:fe50:cd2f]) by smtpcorp4.mail.yandex.net (nwsmtp/Yandex) with ESMTPSA id iAsmop7sNc-k7IK310a; Thu, 14 Aug 2014 16:46:07 +0400 (using TLSv1 with cipher AES128-SHA (128/128 bits)) (Client certificate not present) X-Yandex-Uniq: 91582161-a429-4f64-ac2f-2c09a2800245 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1408020367; bh=gDd+ENkKuIiuie79a1gdzWuvzE7SKND9vNrKHd6ds7g=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: References:In-Reply-To:Content-Type:Content-Transfer-Encoding; b=Fu+fAS5TS9gkMrIFBGWgDDm0aZrIRDZWvTF5nnQH6fG8+3Ok/vpvwpIt00+wRAjhi 
KExt79k9Z64clx4h1hCi/W+yUbF8MmUqOIXzv1yuR2kkNPq1P5B+iM5Tl5SphHEtL0 Rw325ARHwOuhDyEGiDNO5BgD37o6+Otfi1m9FRuA= Authentication-Results: smtpcorp4.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Message-ID: <53ECAF8C.2020007@yandex-team.ru> Date: Thu, 14 Aug 2014 16:46:04 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Marko Zec Subject: Re: [CFT] new tables for ipfw References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> <53EC960A.1030603@yandex-team.ru> <53ECA302.8010100@yandex-team.ru> <20140814140818.3539d9c5@x23> In-Reply-To: <20140814140818.3539d9c5@x23> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: "freebsd-net@freebsd.org" , Luigi Rizzo , freebsd-ipfw X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 12:46:12 -0000 On 14.08.2014 16:08, Marko Zec wrote: > On Thu, 14 Aug 2014 15:52:34 +0400 > "Alexander V. Chernikov" wrote: > >> On 14.08.2014 15:15, Luigi Rizzo wrote: >>> >>> >>> On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov >>> > wrote: >>> >>> On 14.08.2014 14:44, Luigi Rizzo wrote: >>>> >>>> >>>> On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov >>>> > >>>> wrote: >>>> >>>> On 14.08.2014 13:23, Luigi Rizzo wrote: >>>>> >>>>> >>>>> On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov >>>>> > >>>>> wrote: >>>>> >>>>> Hello list. >>>>> >>>>> I've been hacking ipfw for a while and It seems there >>>>> is something ready to test/review in projects/ipfw branch. >>>>> >>>>> >>>>> ​this is a fantastic piece of work, thanks for doing it >>>>> and for integrating the feedback. 
>>>>> ​ >>>>> I have some detailed feedback that will send you >>>>> privately, but just a curiosity: >>>>> >>>>> ​...​ >>>>> >>>>> Some examples (see ipfw(8) manual page for the >>>>> description): >>>>> >>>>> ​... >>>>> >>>>> >>>>> ipfw table mi_test create type cidr algo "cidr:hash >>>>> masks=/30,/64" >>>>> >>>>> >>>>> ​why do we need to specify mask lengths in the above​ ? >>>> Well, since we're hashing IP we have to know mask to cut >>>> host bits in advance. >>>> (And the real reason is that I'm too lazy to implement >>>> hierarchical matching (check /32, then /31, then /30) like >>>> how, for example, >>>> >>>> >>>> ​oh well for that we should use cidr:radix >>>> >>>> Research results have never shown a strong superiority of >>>> hierarchical hash tables over good radix implementations, >>>> and in those cases one usually adopts partial prefix >>>> expansion so you only have, say, masks that are a >>>> multiple of 2..8 bits so you only need a small number of >>>> hash lookups. >>> Definitely, especially for IPv6. So I was actually thinking >>> about covering some special sparse cases (e.g. someone having a >>> bunch of /32 and a bunch of /30 and that's all). >>> >>> Btw, since we're talking about "good radix implementation": what >>> license does DXR have? :) >>> Is it OK to merge it as another cidr implementation? >>> >>> "cidr" is a very ugly name, i'd rather use "addr" >> Ok, no problem with that. "addr" really sounds better. >>> DXR has a ​bsd license and of course it is possible to use it. >>> You should ask Marko Zec for his latest version of the code >>> (and probably make sure we have one copy of the code in the source >>> tree). >> Great!. I'll ask him :) > The so far cleanest DXR implementation is significantly C++ poluted and > wrapped inside Click glue (available here: http://www.nxab.fer.hr/dxr) > > I'll try to backport the fixes to the original C-only / BSD > implementation over the weekend and let you know how it goes... Great! 
I've got 2012 version half-ported (and radix fix has been merged to the tree), but something definitely has changed since then :) I'd be happy to hear from you :) > > Marko > > >>> Speaking of features, one thing that would be nice is the ability >>> for tables to reference the in-kernel tables (e.g. fibs, socket >>> lists, interface lists...), perhaps in readonly mode. >>> How complex do you think that would be ? >> Implementing algo support for particular provider like >> sockets/iflists shouldn't be hard. Most of the algorithms complexity >> lies in table modifications. Here we have to support >> lookup and dump operations, so it is the question of providing >> necessary bindings to existing mechanisms (via some direct binding or >> utilizing things like kernel_sysctl for dump support). >> >> It looks like the following maps well to current table concept: >> * such tables are not created by default >> * user issues >> `ipfw table kfib create type addr algo "addr:kernel fib=0"` >> or >> `ipfw table ktcp create type flow algo "flow:kernel_tcp fib=0"` >> or >> `ipfw table kiface create type iface algo "iface:kernel"` >> * tables have special "readonly" type, flush_all requests are ignored >> * no state stored internally >> >> So generic table handling code needs to be modified to support >> read-only tables (and making more callbacks optional). >> Additionally, we might need to proxy "info" request info algo >> callback (optional, "real" algorithms won't implement it) to be able >> to show number of items (and some other info) to user. 
>> >> >> >>> cheers >>> luigi >>> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 12:48:04 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4C9F03E6; Thu, 14 Aug 2014 12:48:04 +0000 (UTC) Received: from mail-yk0-x22d.google.com (mail-yk0-x22d.google.com [IPv6:2607:f8b0:4002:c07::22d]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id EFCFE21BF; Thu, 14 Aug 2014 12:48:03 +0000 (UTC) Received: by mail-yk0-f173.google.com with SMTP id 131so898577ykp.32 for ; Thu, 14 Aug 2014 05:48:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=ePQ9LG1QGGFI/AblEX+zFI8vzPgtWO4QCmJkHqbG+JE=; b=0waehPueWCO1MsNUqXE+mWKSMi/SZwett4cvJi7bnnz2BR59w5MMtvWE7bD3wvT3vp eM5sg0FEkX0d02FvJujb/qmO85wrCbGImCrqy/g/AQz79xvVN74oNSprPjL4IVXG2bXY R2340mUnpBU2+HAD+m6fYXgAVXffiJW4J72OO/Mw2/z8YT/h4720fhinT8bfeb7BfBnk 3ZVqHsWZmrl/y3mef5SMi5NqPOd6DhCvu9iWN97sms3Z/UfkAiX3RUlFMMhI/G0dUaND WCHlJoz4PElNkzKBN6JVt9isNSzzkHsi9eMne13jdFzOoKxmE7nrVgj5Ljw/0ITtDrZp JTlg== MIME-Version: 1.0 X-Received: by 10.236.104.133 with SMTP id i5mr17370383yhg.137.1408020483112; Thu, 14 Aug 2014 05:48:03 -0700 (PDT) Received: by 10.170.218.197 with HTTP; Thu, 14 Aug 2014 05:48:03 -0700 (PDT) In-Reply-To: <20140814140818.3539d9c5@x23> References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> 
<53EC960A.1030603@yandex-team.ru> <53ECA302.8010100@yandex-team.ru> <20140814140818.3539d9c5@x23> Date: Thu, 14 Aug 2014 05:48:03 -0700 Message-ID: Subject: Re: [CFT] new tables for ipfw From: Mehmet Erol Sanliturk To: Marko Zec Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: "Alexander V. Chernikov" , "freebsd-net@freebsd.org" , Luigi Rizzo , freebsd-ipfw X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 12:48:04 -0000 On Thu, Aug 14, 2014 at 5:08 AM, Marko Zec wrote: > On Thu, 14 Aug 2014 15:52:34 +0400 > "Alexander V. Chernikov" wrote: > > > On 14.08.2014 15:15, Luigi Rizzo wrote: > > > > > > > > > > > > On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov > > > > wrote: > > > > > > On 14.08.2014 14:44, Luigi Rizzo wrote: > > >> > > >> > > >> > > >> On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov > > >> > > > >> wrote: > > >> > > >> On 14.08.2014 13:23, Luigi Rizzo wrote: > > >>> > > >>> > > >>> > > >>> On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov > > >>> > > > >>> wrote: > > >>> > > >>> Hello list. > > >>> > > >>> I've been hacking ipfw for a while and It seems there > > >>> is something ready to test/review in projects/ipfw branch. > > >>> > > >>> > > >>> this is a fantastic piece of work, thanks for doing it > > >>> and for integrating the feedback. > > >>> > > >>> I have some detailed feedback that will send you > > >>> privately, but just a curiosity: > > >>> > > >>> ... > > >>> > > >>> Some examples (see ipfw(8) manual page for the > > >>> description): > > >>> > > >>> ... 
> > >>> > > >>> > > >>> ipfw table mi_test create type cidr algo "cidr:hash > > >>> masks=/30,/64" > > >>> > > >>> > > >>> why do we need to specify mask lengths in the above ? > > >> Well, since we're hashing IP we have to know mask to cut > > >> host bits in advance. > > >> (And the real reason is that I'm too lazy to implement > > >> hierarchical matching (check /32, then /31, then /30) like > > >> how, for example, > > >> > > >> > > >> oh well for that we should use cidr:radix > > >> > > >> Research results have never shown a strong superiority of > > >> hierarchical hash tables over good radix implementations, > > >> and in those cases one usually adopts partial prefix > > >> expansion so you only have, say, masks that are a > > >> multiple of 2..8 bits so you only need a small number of > > >> hash lookups. > > > Definitely, especially for IPv6. So I was actually thinking > > > about covering some special sparse cases (e.g. someone having a > > > bunch of /32 and a bunch of /30 and that's all). > > > > > > Btw, since we're talking about "good radix implementation": what > > > license does DXR have? :) > > > Is it OK to merge it as another cidr implementation? > > > > > > "cidr" is a very ugly name, i'd rather use "addr" > > Ok, no problem with that. "addr" really sounds better. > > > > > > DXR has a bsd license and of course it is possible to use it. > > > You should ask Marko Zec for his latest version of the code > > > (and probably make sure we have one copy of the code in the source > > > tree). > > Great!. I'll ask him :) > > The so far cleanest DXR implementation is significantly C++ polluted and > wrapped inside Click glue (available here: http://www.nxab.fer.hr/dxr) > > I'll try to backport the fixes to the original C-only / BSD > implementation over the weekend and let you know how it goes... 
> > Marko > > > > > > > > Speaking of features, one thing that would be nice is the ability > > > for tables to reference the in-kernel tables (e.g. fibs, socket > > > lists, interface lists...), perhaps in readonly mode. > > > How complex do you think that would be ? > > Implementing algo support for particular provider like > > sockets/iflists shouldn't be hard. Most of the algorithms complexity > > lies in table modifications. Here we have to support > > lookup and dump operations, so it is the question of providing > > necessary bindings to existing mechanisms (via some direct binding or > > utilizing things like kernel_sysctl for dump support). > > > > It looks like the following maps well to current table concept: > > * such tables are not created by default > > * user issues > > `ipfw table kfib create type addr algo "addr:kernel fib=0"` > > or > > `ipfw table ktcp create type flow algo "flow:kernel_tcp fib=0"` > > or > > `ipfw table kiface create type iface algo "iface:kernel"` > > * tables have special "readonly" type, flush_all requests are ignored > > * no state stored internally > > > > So generic table handling code needs to be modified to support > > read-only tables (and making more callbacks optional). > > Additionally, we might need to proxy "info" request info algo > > callback (optional, "real" algorithms won't implement it) to be able > > to show number of items (and some other info) to user. > > > > > > > > > > > > cheers > > > luigi > > > > > > > _______________________________________________ > > freebsd-net@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-net > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" The link is http://www.nxlab.fer.hr/dxr/ Thank you very much. 
Mehmet Erol Sanliturk From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 13:05:12 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BE29E740; Thu, 14 Aug 2014 13:05:12 +0000 (UTC) Received: from relay.mailchannels.net (si-002-i47.relay.mailchannels.net [184.154.112.221]) by mx1.freebsd.org (Postfix) with ESMTP id 4AC622471; Thu, 14 Aug 2014 13:05:06 +0000 (UTC) X-Sender-Id: totalchoicehosting|x-authuser|lee@dilkie.com Received: from data.snhdns.com (ip-10-236-1-24.us-west-2.compute.internal [10.236.1.24]) by relay.mailchannels.net (Postfix) with ESMTPA id 2C393120101; Thu, 14 Aug 2014 12:46:48 +0000 (UTC) X-Sender-Id: totalchoicehosting|x-authuser|lee@dilkie.com Received: from data.snhdns.com (data.snhdns.com [10.253.92.5]) (using TLSv1 with cipher DHE-RSA-AES256-SHA) by 0.0.0.0:2500 (trex/5.2.12); Thu, 14 Aug 2014 12:46:50 GMT X-MC-Relay: Good X-MailChannels-SenderId: totalchoicehosting|x-authuser|lee@dilkie.com X-MailChannels-Auth-Id: totalchoicehosting X-MC-Ingress-Time: 1408020410499 Received: from [142.46.160.218] (port=51123 helo=[192.168.51.11]) by data.snhdns.com with esmtpsa (TLSv1:DHE-RSA-AES128-SHA:128) (Exim 4.82) (envelope-from ) id 1XHuQY-0002l2-9M; Thu, 14 Aug 2014 08:46:46 -0400 Message-ID: <53ECAFB9.50507@dilkie.com> Date: Thu, 14 Aug 2014 08:46:49 -0400 From: Lee Dilkie User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Willem Jan Withagen , Luigi Rizzo , "Alexander V. 
Chernikov" Subject: Re: [CFT] new tables for ipfw References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> <53EC960A.1030603@yandex-team.ru> <53ECA6B2.8010003@digiware.nl> In-Reply-To: <53ECA6B2.8010003@digiware.nl> X-AuthUser: lee@dilkie.com Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: "freebsd-net@freebsd.org" , Luigi Rizzo , "Andrey V. Elsukov" , freebsd-ipfw X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 13:05:12 -0000 On 8/14/2014 08:08, Willem Jan Withagen wrote: > I've found the notation ipnr:something rather frustrating when using > ipv6 addresses. Sort of like typing a ipv6 address in a browser, the > last :xx is always interpreted as portnumber, UNLESS you wrap it in []'s. > compare > 2001:4cb8:3:1::1 > 2001:4cb8:3:1::1:80 > [2001:4cb8:3:1::1]:80 > The first and the last are the same host but a different port, the > middle one is just a different host. > > Could/should we do the same in ipfw? the first and second forms are valid, but as ipv6 addresses *with no port*. The third is an ipv6 address with a port. If the intent of the second form is an address and port, it will not be parsed that way by standard parsers and violates the ipv6 addressing rfc's. 
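To make the three forms above concrete, here is a small sketch of a parser that treats an unbracketed IPv6 string as an address with no port and only accepts a port after a closing bracket. It is not ipfw's parser: split_host_port is a hypothetical helper, and it relies only on Python's standard ipaddress module.

```python
import ipaddress


def split_host_port(text):
    """Return (address, port); port is None when the text carries no port."""
    if text.startswith("["):
        # Bracketed form: "[v6addr]" or "[v6addr]:port".
        host, bracket, rest = text[1:].partition("]")
        if not bracket:
            raise ValueError("missing closing bracket")
        if rest == "":
            port = None
        elif rest.startswith(":"):
            port = int(rest[1:])
        else:
            raise ValueError("unexpected text after closing bracket")
        return ipaddress.IPv6Address(host), port
    if text.count(":") == 1:
        # Exactly one colon cannot be IPv6, so read it as "v4addr:port".
        host, _, port = text.partition(":")
        return ipaddress.IPv4Address(host), int(port)
    # Anything else is a bare address; a trailing ":80" is part of the address.
    return ipaddress.ip_address(text), None
```

With this reading, "2001:4cb8:3:1::1:80" comes back as a different host with no port, which is the behaviour described above.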
-lee From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 15:21:06 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BCAD3BF5; Thu, 14 Aug 2014 15:21:06 +0000 (UTC) Received: from forward-corp1f.mail.yandex.net (forward-corp1f.mail.yandex.net [IPv6:2a02:6b8:0:801::10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "forwards.mail.yandex.net", Issuer "Certum Level IV CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 61C8F2456; Thu, 14 Aug 2014 15:21:06 +0000 (UTC) Received: from smtpcorp4.mail.yandex.net (smtpcorp4.mail.yandex.net [95.108.252.2]) by forward-corp1f.mail.yandex.net (Yandex) with ESMTP id 2B9442420040; Thu, 14 Aug 2014 19:21:02 +0400 (MSK) Received: from smtpcorp4.mail.yandex.net (localhost [127.0.0.1]) by smtpcorp4.mail.yandex.net (Yandex) with ESMTP id EC9682C05E8; Thu, 14 Aug 2014 19:21:01 +0400 (MSK) Received: from unknown (unknown [2a02:6b8:0:401:222:4dff:fe50:cd2f]) by smtpcorp4.mail.yandex.net (nwsmtp/Yandex) with ESMTPSA id h8jc9gbi16-L1IW4HC6; Thu, 14 Aug 2014 19:21:01 +0400 (using TLSv1 with cipher AES128-SHA (128/128 bits)) (Client certificate not present) X-Yandex-Uniq: 91582161-a429-4f64-ac2f-2c09a2800245 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1408029661; bh=rpQSF12tsRSa8XZPJqei1gilw40bceJOWEK9tZiTCuw=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: References:In-Reply-To:Content-Type:Content-Transfer-Encoding; b=IA6/h/D4rU/JS8Sh9YWCFvNuBcza79FpjUiIRq5P+AbuFDjTdBuxrlFahwKz8mYrm GdWEgMP6ueGODIggrLIYVtsIuf8Ou5tjdKHJcjAxzo8IJQvRjffEWAl3wx4a1b0chN nLS7XxDn8xbLZn15MOq/4mcv3MFH8s4V6lhBqInA= Authentication-Results: smtpcorp4.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Message-ID: <53ECD3DA.6060501@yandex-team.ru> Date: 
Thu, 14 Aug 2014 19:20:58 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Willem Jan Withagen , Luigi Rizzo Subject: Re: [CFT] new tables for ipfw References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> <53EC960A.1030603@yandex-team.ru> <53ECA6B2.8010003@digiware.nl> In-Reply-To: <53ECA6B2.8010003@digiware.nl> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: "freebsd-net@freebsd.org" , Luigi Rizzo , freebsd-ipfw , "Andrey V. Elsukov" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 15:21:06 -0000 On 14.08.2014 16:08, Willem Jan Withagen wrote: > On 2014-08-14 13:15, Luigi Rizzo wrote: >> On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov < >> melifaro@yandex-team.ru> wrote: >> >>> On 14.08.2014 14:44, Luigi Rizzo wrote: >>> >>> >>> >>> >>> On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov < >>> melifaro@yandex-team.ru> wrote: >>> >>>> On 14.08.2014 13:23, Luigi Rizzo wrote: >>>> >>>> >>>> >>>> >>>> On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov < >>>> melifaro@yandex-team.ru> wrote: >>>> >>>>> Hello list. >>>>> >>>>> I've been hacking ipfw for a while and It seems there is something >>>>> ready >>>>> to test/review in projects/ipfw branch. >>>>> >>>> >>>> ​this is a fantastic piece of work, thanks for doing it and for >>>> integrating the feedback. >>>> ​ >>>> I have some detailed feedback that will send you privately, >>>> but just a curiosity: >>>> >>>> ​...​ >>>>> >>>>> Some examples (see ipfw(8) manual page for the description): >>>>> >>>>> >>>>> ​... 
>>>>> >>>>> >>>>> ipfw table mi_test create type cidr algo "cidr:hash masks=/30,/64" >>>>> >>>> >>>> ​why do we need to specify mask lengths in the above​ ? >>>> >>>> Well, since we're hashing IP we have to know mask to cut host >>>> bits in >>>> advance. >>>> (And the real reason is that I'm too lazy to implement hierarchical >>>> matching (check /32, then /31, then /30) like how, for example, >>>> >>> >>> ​oh well for that we should use cidr:radix >>> >>> Research results have never shown a strong superiority of >>> hierarchical hash tables over good radix implementations, >>> and in those cases one usually adopts partial prefix >>> expansion so you only have, say, masks that are a >>> multiple of 2..8 bits so you only need a small number of >>> hash lookups. >>> >>> Definitely, especially for IPv6. So I was actually thinking about >>> covering >>> some special sparse cases (e.g. someone having a bunch of /32 and a >>> bunch >>> of /30 and that's all). >>> >>> Btw, since we're talking about "good radix implementation": what >>> license >>> does DXR have? :) >>> Is it OK to merge it as another cidr implementation? >>> >> >> "cidr" is a very ugly name, i'd rather use "addr" >> >> DXR has a ​bsd license and of course it is possible to use it. >> You should ask Marko Zec for his latest version of the code >> (and probably make sure we have one copy of the code in the source >> tree). >> >> Speaking of features, one thing that would be nice is the ability >> for tables to reference the in-kernel tables (e.g. fibs, socket >> lists, interface lists...), perhaps in readonly mode. >> How complex do you think that would be ? > > I'm a very happy user of ipfw and I think these are nice improvements > and will make things more flexible... > > I have 2 nits to pick with the current version. > > I've found the notation ipnr:something rather frustrating when using > ipv6 addresses. 
Sort of like typing a ipv6 address in a browser, the > last :xx is always interpreted as portnumber, UNLESS you wrap it in []'s. > compare > 2001:4cb8:3:1::1 > 2001:4cb8:3:1::1:80 > [2001:4cb8:3:1::1]:80 > The first and the last are the same host but a different port, the > middle one is just a different host. > > Could/should we do the same in ipfw? Well, we should, but I'm unsure if we have host:port notation anywhere in current (or new) syntax: > > And I keep running into the > ipfw add deny all from table(50) to any > notation. the ()'s need to be escaped in most any shell. Where as I > look at the syntax there is little reason to require the ()'s. > the keyword table always needs to be followed by a number (and in the > new version a (word|number) ). We need _some_ discriminator to ensure that the next parameter after "to" or "from" is not hostname. We also have some other places where tables are used: "via interface|table(X)", lookup X, flow table(X) [new]. I agree that parenthesis might not be the best choice. (and something like :tablename:, %tablename%, or even table:tablename might look better). Theoretically, we can support both (old/new) and show rules with new one by default. 
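A rough sketch of the discriminator question discussed above: given the tokens that follow "from" or "to", decide whether they name a table or a host. The three accepted spellings (table(X), table:X, and the bare "table X" form) are taken from this thread as proposals; classify_endpoint is a hypothetical helper and does not reflect the actual ipfw grammar.

```python
import re

# Spellings that carry the table name inside a single token.
_TABLE_PAREN = re.compile(r"^table\(([A-Za-z0-9_]+)\)$")
_TABLE_COLON = re.compile(r"^table:([A-Za-z0-9_]+)$")


def classify_endpoint(tokens):
    """Return (kind, name, consumed_token_count) for the leading tokens."""
    first = tokens[0]
    for pattern in (_TABLE_PAREN, _TABLE_COLON):
        match = pattern.match(first)
        if match:
            return ("table", match.group(1), 1)
    if first == "table":
        # Bare keyword: "table" is reserved and always consumes the next
        # token as the table identifier, so no host can be called "table".
        if len(tokens) < 2:
            raise ValueError("table keyword without an identifier")
        return ("table", tokens[1], 2)
    return ("host", first, 1)
```

The bare form avoids shell escaping of the parentheses, at the cost of reserving the word "table" as a hostname.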
> > Thanx for the nice work, > --WjW > From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 15:28:05 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 30D9CF7F; Thu, 14 Aug 2014 15:28:05 +0000 (UTC) Received: from smtp.digiware.nl (unknown [IPv6:2001:4cb8:90:ffff::3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DDA872544; Thu, 14 Aug 2014 15:28:04 +0000 (UTC) Received: from rack1.digiware.nl (unknown [127.0.0.1]) by smtp.digiware.nl (Postfix) with ESMTP id 0888D153A8A; Thu, 14 Aug 2014 17:28:01 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.nl Received: from smtp.digiware.nl ([127.0.0.1]) by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id kbaq5KYLrgTo; Thu, 14 Aug 2014 17:27:50 +0200 (CEST) Received: from [192.168.101.102] (vpn.ecoracks.nl [31.223.170.173]) (using TLSv1 with cipher ECDHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by smtp.digiware.nl (Postfix) with ESMTPSA id 26961153A74; Thu, 14 Aug 2014 17:27:50 +0200 (CEST) Message-ID: <53ECD576.8040801@digiware.nl> Date: Thu, 14 Aug 2014 17:27:50 +0200 From: Willem Jan Withagen User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Lee Dilkie , Luigi Rizzo , "Alexander V. Chernikov" Subject: Re: [CFT] new tables for ipfw References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> <53EC960A.1030603@yandex-team.ru> <53ECA6B2.8010003@digiware.nl> <53ECAFB9.50507@dilkie.com> In-Reply-To: <53ECAFB9.50507@dilkie.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" , Luigi Rizzo , "Andrey V. 
Elsukov" , freebsd-ipfw X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 15:28:05 -0000 On 14-8-2014 14:46, Lee Dilkie wrote: > > On 8/14/2014 08:08, Willem Jan Withagen wrote: >> I've found the notation ipnr:something rather frustrating when using >> ipv6 addresses. Sort of like typing a ipv6 address in a browser, the >> last :xx is always interpreted as portnumber, UNLESS you wrap it in []'s. >> compare >> 2001:4cb8:3:1::1 >> 2001:4cb8:3:1::1:80 >> [2001:4cb8:3:1::1]:80 >> The first and the last are the same host but a different port, the >> middle one is just a different host. >> >> Could/should we do the same in ipfw? > > the first and second forms are valid, but as ipv6 addresses *with no port*, > > The third is an ipv6 address with a port. > > If the intent of the second form is an address and port, it will not be > parsed that way by standard parsers and violates the ipv6 addressing rfc's. I agree, but ipfw does not understand [2001:4cb8:3:1::1] last time I tried. So I think you rephrased what I meant to say. 
Thanx, --WjW From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 15:53:23 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id F21DC8C4; Thu, 14 Aug 2014 15:53:22 +0000 (UTC) Received: from smtp.digiware.nl (unknown [IPv6:2001:4cb8:90:ffff::3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 81F912828; Thu, 14 Aug 2014 15:53:22 +0000 (UTC) Received: from rack1.digiware.nl (unknown [127.0.0.1]) by smtp.digiware.nl (Postfix) with ESMTP id CBA65153A51; Thu, 14 Aug 2014 17:53:19 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.nl Received: from smtp.digiware.nl ([127.0.0.1]) by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 0HacgiO8Xawi; Thu, 14 Aug 2014 17:52:48 +0200 (CEST) Received: from [192.168.101.102] (vpn.ecoracks.nl [31.223.170.173]) (using TLSv1 with cipher ECDHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by smtp.digiware.nl (Postfix) with ESMTPSA id A4E9D1534D2; Thu, 14 Aug 2014 17:52:48 +0200 (CEST) Message-ID: <53ECDB51.2030201@digiware.nl> Date: Thu, 14 Aug 2014 17:52:49 +0200 From: Willem Jan Withagen User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: "Alexander V. Chernikov" , Luigi Rizzo Subject: Re: [CFT] new tables for ipfw References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> <53EC960A.1030603@yandex-team.ru> <53ECA6B2.8010003@digiware.nl> <53ECD3DA.6060501@yandex-team.ru> In-Reply-To: <53ECD3DA.6060501@yandex-team.ru> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: "freebsd-net@freebsd.org" , Luigi Rizzo , freebsd-ipfw , "Andrey V. 
Elsukov" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 15:53:23 -0000 On 14-8-2014 17:20, Alexander V. Chernikov wrote: > On 14.08.2014 16:08, Willem Jan Withagen wrote: >> On 2014-08-14 13:15, Luigi Rizzo wrote: >>> On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov < >>> melifaro@yandex-team.ru> wrote: >>> >>>> On 14.08.2014 14:44, Luigi Rizzo wrote: >>>> >>>> >>>> >>>> >>>> On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov < >>>> melifaro@yandex-team.ru> wrote: >>>> >>>>> On 14.08.2014 13:23, Luigi Rizzo wrote: >>>>> >>>>> >>>>> >>>>> >>>>> On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov < >>>>> melifaro@yandex-team.ru> wrote: >>>>> >>>>>> Hello list. >>>>>> >>>>>> I've been hacking ipfw for a while and It seems there is something >>>>>> ready >>>>>> to test/review in projects/ipfw branch. >>>>>> >>>>> >>>>> ​this is a fantastic piece of work, thanks for doing it and for >>>>> integrating the feedback. >>>>> ​ >>>>> I have some detailed feedback that will send you privately, >>>>> but just a curiosity: >>>>> >>>>> ​...​ >>>>>> >>>>>> Some examples (see ipfw(8) manual page for the description): >>>>>> >>>>>> >>>>>> ​... >>>>>> >>>>>> >>>>>> ipfw table mi_test create type cidr algo "cidr:hash masks=/30,/64" >>>>>> >>>>> >>>>> ​why do we need to specify mask lengths in the above​ ? >>>>> >>>>> Well, since we're hashing IP we have to know mask to cut host >>>>> bits in >>>>> advance. 
>>>>> (And the real reason is that I'm too lazy to implement hierarchical >>>>> matching (check /32, then /31, then /30) like how, for example, >>>>> >>>> >>>> ​oh well for that we should use cidr:radix >>>> >>>> Research results have never shown a strong superiority of >>>> hierarchical hash tables over good radix implementations, >>>> and in those cases one usually adopts partial prefix >>>> expansion so you only have, say, masks that are a >>>> multiple of 2..8 bits so you only need a small number of >>>> hash lookups. >>>> >>>> Definitely, especially for IPv6. So I was actually thinking about >>>> covering >>>> some special sparse cases (e.g. someone having a bunch of /32 and a >>>> bunch >>>> of /30 and that's all). >>>> >>>> Btw, since we're talking about "good radix implementation": what >>>> license >>>> does DXR have? :) >>>> Is it OK to merge it as another cidr implementation? >>>> >>> >>> "cidr" is a very ugly name, i'd rather use "addr" >>> >>> DXR has a ​bsd license and of course it is possible to use it. >>> You should ask Marko Zec for his latest version of the code >>> (and probably make sure we have one copy of the code in the source >>> tree). >>> >>> Speaking of features, one thing that would be nice is the ability >>> for tables to reference the in-kernel tables (e.g. fibs, socket >>> lists, interface lists...), perhaps in readonly mode. >>> How complex do you think that would be ? >> >> I'm a very happy user of ipfw and I think these are nice improvements >> and will make things more flexible... >> >> I have 2 nits to pick with the current version. >> >> I've found the notation ipnr:something rather frustrating when using >> ipv6 addresses. Sort of like typing a ipv6 address in a browser, the >> last :xx is always interpreted as portnumber, UNLESS you wrap it in []'s. 
>> compare >> 2001:4cb8:3:1::1 >> 2001:4cb8:3:1::1:80 >> [2001:4cb8:3:1::1]:80 >> The first and the last are the same host but a different port, the >> middle one is just a different host. >> >> Could/should we do the same in ipfw? > Well, we should, but I'm unsure if we have host:port notation anywhere > in current (or new) syntax: I cannot answer that off the top of my head. But I remember digging in the code to see how addresses were converted and how IPv6 fitted in there, because of the problem described above. But the main reason for "reporting" this is that I foresee the possibility that with the new syntax it might be possible to run into this. But you have designed the syntax, so I'll take your word for it that this would not happen. >> And I keep running into the >> ipfw add deny all from table(50) to any >> notation. the ()'s need to be escaped in most any shell. Where as I >> look at the syntax there is little reason to require the ()'s. >> the keyword table always needs to be followed by a number (and in the >> new version a (word|number) ). > We need _some_ discriminator to ensure that the next parameter after > "to" or "from" is not hostname. > We also have some other places where tables are used: "via > interface|table(X)", lookup X, flow table(X) [new]. > I agree that parenthesis might not be the best choice. (and something > like :tablename:, %tablename%, or even table:tablename might look better). > Theoretically, we can support both (old/new) and show rules with new one > by default. (I'm looking at this from a parseable grammar view, perhaps not 100% fitting) Well, my argument is that after the table-keyword "table" there would always be the table-identifier, be it a name or a number. So "table" is a reserved word. Now this would exclude hostnames called "table", is that (your) problem? So while parsing, the keyword would always consume the next token as the qualifier that goes with the table-keyword. 
If that results in a not parseable sentence then the sentence needs to be rejected. Regards, --WjW From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 15:56:23 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2B2439C2 for ; Thu, 14 Aug 2014 15:56:23 +0000 (UTC) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by mx1.freebsd.org (Postfix) with ESMTP id 0F252285D for ; Thu, 14 Aug 2014 15:56:22 +0000 (UTC) Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga101.fm.intel.com with ESMTP; 14 Aug 2014 08:55:13 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.01,863,1400050800"; d="scan'208";a="584907139" Received: from orsmsx101.amr.corp.intel.com ([10.22.225.128]) by fmsmga002.fm.intel.com with ESMTP; 14 Aug 2014 08:55:13 -0700 Received: from orsmsx111.amr.corp.intel.com ([169.254.11.75]) by ORSMSX101.amr.corp.intel.com ([169.254.8.102]) with mapi id 14.03.0195.001; Thu, 14 Aug 2014 08:55:12 -0700 From: "Pieper, Jeffrey E" To: "freebsd-net@freebsd.org" Subject: RE: Intel Support for FreeBSD Thread-Topic: Intel Support for FreeBSD Thread-Index: AQHPtpQ1/Tr6f5B01kC2gD5i7hdHFpvQQhNg Date: Thu, 14 Aug 2014 15:55:12 +0000 Message-ID: <2A35EA60C3C77D438915767F458D65687E9270EB@ORSMSX111.amr.corp.intel.com> References: <1407892565.51895.YahooMailNeo@web121605.mail.ne1.yahoo.com> In-Reply-To: <1407892565.51895.YahooMailNeo@web121605.mail.ne1.yahoo.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.22.254.138] Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 15:56:23 -0000 The updated drivers are here: https://downloadcenter.intel.com/Detail_Desc.aspx?DwnldID=17509&lang=eng&ProdId=3299 https://downloadcenter.intel.com/Detail_Desc.aspx?DwnldID=15815&lang=eng&ProdId=3024 https://downloadcenter.intel.com/Detail_Desc.aspx?DwnldID=14688&lang=eng&ProdId=3413 Jeff -----Original Message----- From: owner-freebsd-net@freebsd.org [mailto:owner-freebsd-net@freebsd.org] On Behalf Of Barney Cordoba via freebsd-net Sent: Tuesday, August 12, 2014 6:16 PM To: freebsd-net@freebsd.org Subject: Intel Support for FreeBSD I notice that there hasn't been an update in the Intel Download Center since July. Is there no official support for 10? We liked to use the intel stuff as an alternative to the "latest" freebsd code, but it doesnt compile. BC _______________________________________________ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 15:58:58 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8BF0EA88; Thu, 14 Aug 2014 15:58:58 +0000 (UTC) Received: from smtp.digiware.nl (unknown [IPv6:2001:4cb8:90:ffff::3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 44CC12884; Thu, 14 Aug 2014 15:58:58 +0000 (UTC) Received: from rack1.digiware.nl (unknown [127.0.0.1]) by smtp.digiware.nl (Postfix) with ESMTP id 1A7D91534C0; Thu, 14 Aug 2014 17:58:56 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.nl Received: from smtp.digiware.nl ([127.0.0.1]) by rack1.digiware.nl (rack1.digiware.nl 
[127.0.0.1]) (amavisd-new, port 10024) with ESMTP id jdMbXqM6EbJy; Thu, 14 Aug 2014 17:58:46 +0200 (CEST) Received: from [192.168.101.102] (vpn.ecoracks.nl [31.223.170.173]) (using TLSv1 with cipher ECDHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by smtp.digiware.nl (Postfix) with ESMTPSA id C55191534D2; Thu, 14 Aug 2014 17:58:46 +0200 (CEST) Message-ID: <53ECDCB7.8090703@digiware.nl> Date: Thu, 14 Aug 2014 17:58:47 +0200 From: Willem Jan Withagen User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Lee Dilkie , Luigi Rizzo , "Alexander V. Chernikov" Subject: Re: [CFT] new tables for ipfw References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> <53EC960A.1030603@yandex-team.ru> <53ECA6B2.8010003@digiware.nl> <53ECAFB9.50507@dilkie.com> <53ECD576.8040801@digiware.nl> <53ECDB62.5030708@dilkie.com> In-Reply-To: <53ECDB62.5030708@dilkie.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" , Luigi Rizzo , freebsd-ipfw , "Andrey V. Elsukov" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 15:58:58 -0000 On 14-8-2014 17:53, Lee Dilkie wrote: > > On 8/14/2014 11:27 AM, Willem Jan Withagen wrote: >> On 14-8-2014 14:46, Lee Dilkie wrote: >>> On 8/14/2014 08:08, Willem Jan Withagen wrote: >>>> I've found the notation ipnr:something rather frustrating when using >>>> ipv6 addresses. Sort of like typing a ipv6 address in a browser, the >>>> last :xx is always interpreted as portnumber, UNLESS you wrap it in []'s. >>>> compare >>>> 2001:4cb8:3:1::1 >>>> 2001:4cb8:3:1::1:80 >>>> [2001:4cb8:3:1::1]:80 >>>> The first and the last are the same host but a different port, the >>>> middle one is just a different host. 
>>>> >>>> Could/should we do the same in ipfw? >>> the first and second forms are valid, but as ipv6 addresses *with no port*, >>> >>> The third is an ipv6 address with a port. >>> >>> If the intent of the second form is an address and port, it will not be >>> parsed that way by standard parsers and violates the ivp6 addressing rfc's. >> I agree, but ipfw does not understand [2001:4cb8:3:1::1] last time I tried. >> So I think you rephrased what I meant to say. >> >> Thanx, >> --WjW >> > > and re-reading your original post, yes you did state it correctly. > > ipfw needs to be fixed to understand the correct format of ipv6 addresses. > > however, this isn't the only offender. netstat's output is also > incorrect (linux example) > > > tcp 0 0 :::22 > :::* LISTEN > > should be > > tcp 0 0 [::]:22 > [::]:* LISTEN > > I don't understand why folks dream up incompatible, and unparsable, ipv6 > address formats. Why bother with rfc's if no-one writes to them. > > (see rfc5952) It think that that was the RFC I found when looking into getting the browser to do the right thing when I want it to go to: [2001:4cb8:3:1::1]:8080 Well the RFC would be an argument to at least spec an IPv6 address in a ipfw rule to be allowed either with or without []'s. And if you run into trouble by not using the []'s, they are "easily" added. 
--WjW From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 16:15:50 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C6D10D19; Thu, 14 Aug 2014 16:15:50 +0000 (UTC) Received: from smtp.digiware.nl (unknown [IPv6:2001:4cb8:90:ffff::3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 826432B2B; Thu, 14 Aug 2014 16:15:50 +0000 (UTC) Received: from rack1.digiware.nl (unknown [127.0.0.1]) by smtp.digiware.nl (Postfix) with ESMTP id E1264153A8B; Thu, 14 Aug 2014 18:15:47 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.nl Received: from smtp.digiware.nl ([127.0.0.1]) by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id aE2DA3iUEnrT; Thu, 14 Aug 2014 18:15:36 +0200 (CEST) Received: from [192.168.101.102] (vpn.ecoracks.nl [31.223.170.173]) (using TLSv1 with cipher ECDHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by smtp.digiware.nl (Postfix) with ESMTPSA id F02DC1534EC; Thu, 14 Aug 2014 18:15:35 +0200 (CEST) Message-ID: <53ECE0A8.7010705@digiware.nl> Date: Thu, 14 Aug 2014 18:15:36 +0200 From: Willem Jan Withagen User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: "Alexander V. Chernikov" , Luigi Rizzo Subject: Re: [CFT] new tables for ipfw References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> <53EC960A.1030603@yandex-team.ru> <53ECA6B2.8010003@digiware.nl> <53ECD3DA.6060501@yandex-team.ru> In-Reply-To: <53ECD3DA.6060501@yandex-team.ru> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" , Luigi Rizzo , freebsd-ipfw , "Andrey V. 
Elsukov" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 16:15:50 -0000 On 14-8-2014 17:20, Alexander V. Chernikov wrote: >> I've found the notation ipnr:something rather frustrating when using >> ipv6 addresses. Sort of like typing a ipv6 address in a browser, the >> last :xx is always interpreted as portnumber, UNLESS you wrap it in []'s. >> compare >> 2001:4cb8:3:1::1 >> 2001:4cb8:3:1::1:80 >> [2001:4cb8:3:1::1]:80 >> The first and the last are the same host but a different port, the >> middle one is just a different host. >> >> Could/should we do the same in ipfw? > Well, we should, but I'm unsure if we have host:port notation anywhere > in current (or new) syntax: I now remember the case, sort of I think: When using an IPv6 address the last time I ran into the snag with: (From the ipfw(8) manual) ip-addr: .... addr:mask Matches all addresses with base addr (specified as an IP address, a network number, or a hostname) and the mask of mask, specified as a dotted quad. As an example, 1.2.3.4:255.0.255.0 or 1.0.3.0:255.0.255.0 will match 1.*.3.*. This form is advised only for non-contiguous masks. It is better to resort to the addr/masklen format for contiguous masks, which is more compact and less Which tried to use the last quad of an IPv6 address in a very convoluted case, which I cannot reproduce any longer. Reading the manual, one of my problems is now clearly a RTFM: how to use ftp-data in a rule without the complaint that data is not a valid port-name. :) again something learned. 
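For readers following along, the addr:mask semantics quoted from ipfw(8) above come down to a bitwise comparison under the mask. The following is only a sketch of that rule in Python, not ipfw's implementation; the function name addr_mask_match is made up for illustration.

```python
import ipaddress

def addr_mask_match(candidate, base, mask):
    """Return True if candidate equals base on every bit set in mask.

    Hypothetical helper: it mirrors the manual's description of
    addr:mask for IPv4 dotted quads, including non-contiguous masks.
    """
    c = int(ipaddress.IPv4Address(candidate))
    b = int(ipaddress.IPv4Address(base))
    m = int(ipaddress.IPv4Address(mask))
    # Only the bits selected by the mask take part in the comparison.
    return (c & m) == (b & m)

# 1.2.3.4:255.0.255.0 behaves like the pattern 1.*.3.*
print(addr_mask_match("1.77.3.200", "1.2.3.4", "255.0.255.0"))
print(addr_mask_match("1.2.4.4", "1.2.3.4", "255.0.255.0"))
```

Because only the masked bits are compared, 1.2.3.4:255.0.255.0 and 1.0.3.0:255.0.255.0 describe the same set of addresses, as the manual says.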
--WjW From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 16:31:03 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id AE62570B; Thu, 14 Aug 2014 16:31:03 +0000 (UTC) Received: from relay.mailchannels.net (si-002-i47.relay.mailchannels.net [184.154.112.221]) by mx1.freebsd.org (Postfix) with ESMTP id 5D9632DEA; Thu, 14 Aug 2014 16:31:01 +0000 (UTC) X-Sender-Id: totalchoicehosting|x-authuser|lee@dilkie.com Received: from data.snhdns.com (ip-10-236-1-24.us-west-2.compute.internal [10.236.1.24]) by relay.mailchannels.net (Postfix) with ESMTPA id E3EA16031D; Thu, 14 Aug 2014 15:53:22 +0000 (UTC) X-Sender-Id: totalchoicehosting|x-authuser|lee@dilkie.com Received: from data.snhdns.com (data.snhdns.com [10.245.145.206]) (using TLSv1 with cipher DHE-RSA-AES256-SHA) by 0.0.0.0:2500 (trex/5.2.12); Thu, 14 Aug 2014 15:53:24 GMT X-MC-Relay: Good X-MailChannels-SenderId: totalchoicehosting|x-authuser|lee@dilkie.com X-MailChannels-Auth-Id: totalchoicehosting X-MC-Ingress-Time: 1408031604671 Received: from [216.191.234.70] (port=49055 helo=[10.39.164.100]) by data.snhdns.com with esmtpsa (TLSv1:DHE-RSA-AES128-SHA:128) (Exim 4.82) (envelope-from ) id 1XHxKu-00089c-UZ; Thu, 14 Aug 2014 11:53:09 -0400 Message-ID: <53ECDB62.5030708@dilkie.com> Date: Thu, 14 Aug 2014 11:53:06 -0400 From: Lee Dilkie User-Agent: Mozilla/5.0 (Windows NT 5.2; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Willem Jan Withagen , Luigi Rizzo , "Alexander V. 
Chernikov" Subject: Re: [CFT] new tables for ipfw References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> <53EC960A.1030603@yandex-team.ru> <53ECA6B2.8010003@digiware.nl> <53ECAFB9.50507@dilkie.com> <53ECD576.8040801@digiware.nl> In-Reply-To: <53ECD576.8040801@digiware.nl> X-AuthUser: lee@dilkie.com Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: "freebsd-net@freebsd.org" , Luigi Rizzo , freebsd-ipfw , "Andrey V. Elsukov" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 16:31:03 -0000 On 8/14/2014 11:27 AM, Willem Jan Withagen wrote: > On 14-8-2014 14:46, Lee Dilkie wrote: >> On 8/14/2014 08:08, Willem Jan Withagen wrote: >>> I've found the notation ipnr:something rather frustrating when using >>> ipv6 addresses. Sort of like typing a ipv6 address in a browser, the >>> last :xx is always interpreted as portnumber, UNLESS you wrap it in []'s. >>> compare >>> 2001:4cb8:3:1::1 >>> 2001:4cb8:3:1::1:80 >>> [2001:4cb8:3:1::1]:80 >>> The first and the last are the same host but a different port, the >>> middle one is just a different host. >>> >>> Could/should we do the same in ipfw? >> the first and second forms are valid, but as ipv6 addresses *with no port*, >> >> The third is an ipv6 address with a port. >> >> If the intent of the second form is an address and port, it will not be >> parsed that way by standard parsers and violates the ivp6 addressing rfc's. > I agree, but ipfw does not understand [2001:4cb8:3:1::1] last time I tried. > So I think you rephrased what I meant to say. > > Thanx, > --WjW > and re-reading your original post, yes you did state it correctly. ipfw needs to be fixed to understand the correct format of ipv6 addresses. 
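To make the three forms compared in this thread concrete, here is a minimal Python sketch of the bracket convention. It is not ipfw code and makes no claim about how ipfw parses rules; the function name split_host_port is hypothetical, and the sketch only handles literal addresses, not hostnames.

```python
import ipaddress

def split_host_port(text):
    """Split 'addr', '[v6addr]', '[v6addr]:port' or 'v4addr:port'.

    Hypothetical helper returning (address, port), with port None
    when no port is given.
    """
    if text.startswith("["):
        host, closing, rest = text[1:].partition("]")
        if not closing:
            raise ValueError("missing closing bracket in %r" % text)
        if rest == "":
            port = None
        elif rest.startswith(":") and rest[1:].isdigit():
            port = int(rest[1:])
        else:
            raise ValueError("unexpected text after bracket in %r" % text)
        return ipaddress.IPv6Address(host), port
    if text.count(":") > 1:
        # Without brackets, several colons can only mean a bare IPv6
        # address, so a trailing ':80' is part of the address itself.
        return ipaddress.IPv6Address(text), None
    host, colon, port = text.partition(":")
    return ipaddress.IPv4Address(host), (int(port) if colon else None)

for form in ("2001:4cb8:3:1::1", "2001:4cb8:3:1::1:80", "[2001:4cb8:3:1::1]:80"):
    print(form, "->", split_host_port(form))
```

The second form parses as a different host with no port, which is exactly the ambiguity the brackets remove.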
however, this isn't the only offender. netstat's output is also incorrect (linux example) tcp 0 0 :::22 :::* LISTEN should be tcp 0 0 [::]:22 [::]:* LISTEN I don't understand why folks dream up incompatible, and unparsable, ipv6 address formats. Why bother with rfc's if no-one writes to them. (see rfc5952) -lee From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 18:28:49 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 60129CF8 for ; Thu, 14 Aug 2014 18:28:49 +0000 (UTC) Received: from quine.pinyon.org (quine.pinyon.org [65.101.5.249]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 31F172B88 for ; Thu, 14 Aug 2014 18:28:49 +0000 (UTC) Received: by quine.pinyon.org (Postfix, from userid 122) id 99B671603D6; Thu, 14 Aug 2014 11:28:47 -0700 (MST) X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on quine.pinyon.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham autolearn_force=no version=3.4.0 Received: from feyerabend.n1.pinyon.org (feyerabend.n1.pinyon.org [10.0.10.6]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by quine.pinyon.org (Postfix) with ESMTPSA id 9EF831602F1 for ; Thu, 14 Aug 2014 11:28:44 -0700 (MST) Message-ID: <53ECFFDC.3000406@pinyon.org> Date: Thu, 14 Aug 2014 11:28:44 -0700 From: "Russell L. 
Carter" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: NFS client READ performance on -current References: <2136988575.13956627.1405199640153.JavaMail.root@uoguelph.ca> <53C7B774.60304@freebsd.org> <1780417.KfjTWjeQCU@pippin.baldwin.cx> <201408111653.42283.jhb@freebsd.org> In-Reply-To: <201408111653.42283.jhb@freebsd.org> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 18:28:49 -0000 I measured some transfer rates, and have appended them. On 08/11/14 13:53, John Baldwin wrote: > On Saturday, July 19, 2014 1:28:19 pm John Baldwin wrote: >> On Thursday 17 July 2014 19:45:56 Julian Elischer wrote: >>> On 7/15/14, 10:34 PM, John Baldwin wrote: >>>> On Saturday, July 12, 2014 5:14:00 pm Rick Macklem wrote: >>>>> Yonghyeon Pyun wrote: >>>>>> On Fri, Jul 11, 2014 at 09:54:23AM -0400, John Baldwin wrote: >>>>>>> On Thursday, July 10, 2014 6:31:43 pm Rick Macklem wrote: >>>>>>>> John Baldwin wrote: >>>>>>>>> On Thursday, July 03, 2014 8:51:01 pm Rick Macklem wrote: >>>>>>>>>> Russell L. Carter wrote: >>>>>>>>>>> On 07/02/14 19:09, Rick Macklem wrote: >>>>>>>>>>>> Could you please post the dmesg stuff for the network >>>>>>>>>>>> interface, >>>>>>>>>>>> so I can tell what driver is being used? I'll take a look >>>>>>>>>>>> at >>>>>>>>>>>> it, >>>>>>>>>>>> in case it needs to be changed to use m_defrag(). 
>>>>>>>>>>> >>>>>>>>>>> em0: port >>>>>>>>>>> 0xd020-0xd03f >>>>>>>>>>> mem >>>>>>>>>>> 0xfe4a0000-0xfe4bffff,0xfe480000-0xfe49ffff irq 44 at >>>>>>>>>>> device 0.0 >>>>>>>>>>> on >>>>>>>>>>> pci2 >>>>>>>>>>> em0: Using an MSI interrupt >>>>>>>>>>> em0: Ethernet address: 00:15:17:bc:29:ba >>>>>>>>>>> 001.000007 [2323] netmap_attach success for em0 >>>>>>>>>>> tx >>>>>>>>>>> 1/1024 >>>>>>>>>>> rx >>>>>>>>>>> 1/1024 queues/slots >>>>>>>>>>> >>>>>>>>>>> This is one of those dual nic cards, so there is em1 as >>>>>>>>>>> well... >>>>>>>>>> >>>>>>>>>> Well, I took a quick look at the driver and it does use >>>>>>>>>> m_defrag(), >>>>>>>>>> but >>>>>>>>>> I think that the "retry:" label it does a goto after doing so >>>>>>>>>> might >>>>>>>>>> be in >>>>>>>>>> the wrong place. >>>>>>>>>> >>>>>>>>>> The attached untested patch might fix this. >>>>>>>>>> >>>>>>>>>> Is it convenient to build a kernel with this patch applied >>>>>>>>>> and then >>>>>>>>>> try >>>>>>>>>> it with TSO enabled? >>>>>>>>>> >>>>>>>>>> rick >>>>>>>>>> ps: It does have the transmit segment limit set to 32. I have >>>>>>>>>> no >>>>>>>>>> idea if >>>>>>>>>> >>>>>>>>>> this is a hardware limitation. >>>>>>>>> >>>>>>>>> I think the retry is not in the wrong place, but the overhead >>>>>>>>> of all >>>>>>>>> those >>>>>>>>> pullups is apparently quite severe. >>>>>>>> >>>>>>>> The m_defrag() call after the first failure will just barely >>>>>>>> squeeze >>>>>>>> the just under 64K TSO segment into 32 mbuf clusters. Then I >>>>>>>> think any >>>>>>>> m_pullup() done during the retry will allocate an mbuf >>>>>>>> (at a glance it seems to always do this when the old mbuf is a >>>>>>>> cluster) >>>>>>>> and prepend that to the list. >>>>>>>> --> Now the list is > 32 mbufs again and the >>>>>>>> bus_dmammap_load_mbuf_sg() >>>>>>>> >>>>>>>> will fail again on the retry, this time fatally, I think? 
>>>>>>>> >>>>>>>> I can't see any reason to re-do all the stuff using m_pullup() >>>>>>>> and Russell >>>>>>>> reported that moving the "retry:" fixed his problem, from what I >>>>>>>> understood. >>>>>>> >>>>>>> Ah, I had assumed (incorrectly) that the m_pullup()s would all be >>>>>>> nops in this >>>>>>> case. It seems the NIC would really like to have all those things >>>>>>> in a single >>>>>>> segment, but it is not required, so I agree that your patch is >>>>>>> fine. >>>>>> >>>>>> I recall em(4) controllers have various limitation in TSO. Driver >>>>>> has to update IP header to make TSO work so driver has to get a >>>>>> writable mbufs. bpf(4) consumers will see IP packet length is 0 >>>>>> after this change. I think tcpdump has a compile time option to >>>>>> guess correct IP packet length. The firmware of controller also >>>>>> should be able to access complete IP/TCP header in a single buffer. >>>>>> I don't remember more details in TSO limitation but I guess you may >>>>>> be able to get more details TSO limitation from publicly available >>>>>> Intel data sheet. >>>>> >>>>> I think that the patch should handle this ok. All of the m_pullup() >>>>> stuff gets done the first time. Then, if the result is more than 32 >>>>> mbufs in the list, m_defrag() is called to copy the chain. This should >>>>> result in all the header stuff in the first mbuf cluster and the map >>>>> call is done again with this list of clusters. (Without the patch, >>>>> m_pullup() would allocate another prepended mbuf and make the chain >>>>> more than 32mbufs again.) >>>> >>>> Hmm, I am surprised by the m_pullup() behavior that it doesn't just >>>> notice that the first mbuf with a cluster has the desired data already >>>> and returns without doing anything. That is, I'm surprised the first >>>> >>>> statement in m_pullup() isn't just: >>>> if (n->m_len >= len) >>>> >>>> return (n); >>> >>> I seem to remember that the standard behaviour is for the caller to do >>> exactly that. 
>> >> Huh, the manpage doesn't really state that, and it does check in one case. >> However, I think that means that the code in em(4) is busted and should be >> checking m_len before all the calls to m_pullup(). I think this will fix >> the issue the same as Rick's change but it might also avoid unnecessary >> pullups in some cases when defrag isn't needed in the first place. > > FYI, I still think this patch is worth testing if someone is up for it. > I realize that it would be better to run e.g. netperf. However I originally noticed the problem by running NFS read tests, and since that's trivial to do with rsync I'm doing that again. The source file is sitting on a 6 drive zfs raidz2 and it's writing to a fast ssd. Same hardware with linux is a trifle faster, so probably the test setup is revealing enough. I like to run rsync -avP so that I can watch the behavior over time. The transfer rates have been fairly steady, +-5MB/s. Eyeballing it, I'm not seeing much in the way of cache effects. r269700 2014-08-07 10G transfer 65MB/s nfs read. after patch and install if_em.ko* to both sides and reboot: 10G transfer 62MB/s nfs read. immediately read different 5G file: 67MB/s then reread original 10G file: 236MB/s ?? faster than wire? that's a cache effect, but still, I find it surprising. then read a different 12G file: 62.9MB/s So, previously I was seeing ~65MB/s from the original patch, which is what I see today. JHB's patch seems to be slightly slower, repeatable. But given the variance of the transfer rates, it's not really much different. HTH, Russell ps: a quick question about quickly building modules: Suppose I have a fully populated /usr/obj from buildworld and buildkernel (and have installed it), what's the most efficient method for rebuilding a single module and getting it installed into /boot/kernel? 
From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 18:35:16 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BC6D7FF3 for ; Thu, 14 Aug 2014 18:35:16 +0000 (UTC) Received: from mail-qc0-x234.google.com (mail-qc0-x234.google.com [IPv6:2607:f8b0:400d:c01::234]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7C5932CD0 for ; Thu, 14 Aug 2014 18:35:16 +0000 (UTC) Received: by mail-qc0-f180.google.com with SMTP id l6so1487738qcy.11 for ; Thu, 14 Aug 2014 11:35:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=ZfHIHL4LyqJ4KnvN7VPegDurTuPGijODMmiAFPahccw=; b=ePKvgvHJWUSeEZMoU/nuWQIn0aAlvxr//9AZBolADtM7MAKfiIVlWsUEI8nVjtz1Xl U8uOy7dU6jtFpoHyT3xu0RrzhTc5VtVirBwrPAKgLYRI77Ie5x6ejc76uxJbtB8Fe5Ut KOfC/adE1HsWgh5lswVWkgkYa+nur5GxwEOpnZrlKzOHeg2yhXwgNQkx5haFAeM4dqON lH1PzBaQIIq4VEbP/SPevznhau9bYsuONp0L6Hf/YcN/vE40gbmMDTvhbSTDJwR0RUEY CM+WHxdEvGmXApsbYj1ezO71z5SxNNtHq5Oz/DgLeKL6m2JXq7Q6myHZwNFE0mzKCXuD LrMg== MIME-Version: 1.0 X-Received: by 10.140.38.17 with SMTP id s17mr18798469qgs.40.1408041314818; Thu, 14 Aug 2014 11:35:14 -0700 (PDT) Received: by 10.96.73.225 with HTTP; Thu, 14 Aug 2014 11:35:14 -0700 (PDT) In-Reply-To: <53ECFFDC.3000406@pinyon.org> References: <2136988575.13956627.1405199640153.JavaMail.root@uoguelph.ca> <53C7B774.60304@freebsd.org> <1780417.KfjTWjeQCU@pippin.baldwin.cx> <201408111653.42283.jhb@freebsd.org> <53ECFFDC.3000406@pinyon.org> Date: Thu, 14 Aug 2014 15:35:14 -0300 Message-ID: Subject: Re: NFS client READ performance on -current From: Christopher Forgeron To: "Russell 
L. Carter" Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 18:35:16 -0000 I've been using : make buildkernel -DKERNFAST which is quite fast compared to the regular buildkernel. There may be a faster way yet. On Thu, Aug 14, 2014 at 3:28 PM, Russell L. Carter wrote: > > > ps: a quick question about quickly building modules: Suppose I have > a fully populated /usr/obj from buildworld and buildkernel (and have > installed it), what's the most efficient method for rebuilding a single > module and getting it installed into /boot/kernel? > _______________________________________________ > > From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 18:55:17 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 138A0DE4 for ; Thu, 14 Aug 2014 18:55:17 +0000 (UTC) Received: from quine.pinyon.org (quine.pinyon.org [65.101.5.249]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DE63D2FFB for ; Thu, 14 Aug 2014 18:55:16 +0000 (UTC) Received: by quine.pinyon.org (Postfix, from userid 122) id 619491603D6; Thu, 14 Aug 2014 11:55:16 -0700 (MST) X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on quine.pinyon.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham autolearn_force=no version=3.4.0 Received: from feyerabend.n1.pinyon.org (feyerabend.n1.pinyon.org [10.0.10.6]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 
(128/128 bits)) (No client certificate requested) by quine.pinyon.org (Postfix) with ESMTPSA id 7203B1602F1 for ; Thu, 14 Aug 2014 11:55:14 -0700 (MST) Message-ID: <53ED0612.3000002@pinyon.org> Date: Thu, 14 Aug 2014 11:55:14 -0700 From: "Russell L. Carter" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: NFS client READ performance on -current References: <2136988575.13956627.1405199640153.JavaMail.root@uoguelph.ca> <53C7B774.60304@freebsd.org> <1780417.KfjTWjeQCU@pippin.baldwin.cx> <201408111653.42283.jhb@freebsd.org> <53ECFFDC.3000406@pinyon.org> In-Reply-To: Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 18:55:17 -0000 On 08/14/14 11:35, Christopher Forgeron wrote: > I've been using : > > make buildkernel -DKERNFAST 6s for the up-to-date run through, 12s to install. Much better! Thanks, Russell > > which is quite fast compared to the regular buildkernel. There may be a > faster way yet. 
> From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 19:04:19 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8328C44E for ; Thu, 14 Aug 2014 19:04:19 +0000 (UTC) Received: from mail-qa0-x236.google.com (mail-qa0-x236.google.com [IPv6:2607:f8b0:400d:c00::236]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 42D8D2113 for ; Thu, 14 Aug 2014 19:04:19 +0000 (UTC) Received: by mail-qa0-f54.google.com with SMTP id k15so1330590qaq.13 for ; Thu, 14 Aug 2014 12:04:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=9CGsOMmWj2dwIWG4W9R9y/XiXrlP1TlbNsIvRJD19XY=; b=kVd7Xm/r9HV2NFmbYBGn6G6gPTjMDuP61fcEXnLlDBPHTZJlMu3Ti6/699o+xKBLs3 3MTW8KQcw/La9I/mwsEqSB9IIRvKytIKlQF/OAB1EEs1y14ZGeIpA5hgmFMHhdl4x6OS 1JwRpFlstoJpzrEsKY3yfmOq+BfHQvG7RUkfQIQjufbRbWQXhPoqzwH+EvCN6l/W6Twq bcHjgfJpux1jhI6kHF4QrHMl80inkiiX1pZieL0PwTRXgeTf1r3vTiEcsXE8PNNF/HoX loxqvnk5GywCPkkfyAUhwvjb5+Y5fOCjNI6ONuC7Si+Al9bwxFqSBOhYdJk9RGufCKuP Y/CA== MIME-Version: 1.0 X-Received: by 10.140.23.37 with SMTP id 34mr19377342qgo.2.1408043058298; Thu, 14 Aug 2014 12:04:18 -0700 (PDT) Received: by 10.96.170.230 with HTTP; Thu, 14 Aug 2014 12:04:18 -0700 (PDT) In-Reply-To: <08f701cfb69e$1698e2c0$43caa840$@com> References: <08f701cfb69e$1698e2c0$43caa840$@com> Date: Thu, 14 Aug 2014 12:04:18 -0700 Message-ID: Subject: Re: SPAN port doesn't pick up locally generated traffic From: hiren panchasara To: Joseph Ward Content-Type: text/plain; charset=UTF-8 Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list 
List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 19:04:19 -0000 On Tue, Aug 12, 2014 at 7:27 PM, Joseph Ward wrote: > I found a workaround that is acceptable. > > First, I want to thank Hiren Panchasara for recommending the work-around > that I hadn't thought about trying. > > For the archives and anyone struggling with the same issue: > > I altered the setup below by giving the LAN IP to the wired interface re1 as > opposed to bridge0. Doing that magically made the span port (re2) get all > the traffic, both passing through in re1 and out ath0 (and vice versa) as > well as the packets that originate inside the system and are passed to the > bridge. > > This isn't ideal as it means that if the physical interface re1 goes down, > clients on ath0 will lose connectivity to the system, and I had always > understood that when bridging it's ideal to give the IPs to the bridge > itself to protect against that possibility. However, I can give each > interface another IP on a different subnet that will at least allow for > remote connectivity in that scenario. > > Does anyone know if this is known/expected behavior? If no one knows I'll > file a bug ticket on the scenario as it certainly doesn't seem kosher to me. I am not sure if this one case of "packets originating from one of the bridge members not showing up on the bridge's span port" is the only one not getting handled correctly or there is more to it. Please file a bug with your testing scenarios and all the details. CC me on the bug and I'll try to take a look. 
cheers, Hiren [skip] From owner-freebsd-net@FreeBSD.ORG Thu Aug 14 22:01:33 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D90C0E80; Thu, 14 Aug 2014 22:01:32 +0000 (UTC) Received: from forward-corp1e.mail.yandex.net (forward-corp1e.mail.yandex.net [IPv6:2a02:6b8:0:202::10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "forwards.mail.yandex.net", Issuer "Certum Level IV CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E908E2E86; Thu, 14 Aug 2014 20:28:37 +0000 (UTC) Received: from smtpcorp4.mail.yandex.net (smtpcorp4.mail.yandex.net [95.108.252.2]) by forward-corp1e.mail.yandex.net (Yandex) with ESMTP id 0F8AE640582; Fri, 15 Aug 2014 00:28:32 +0400 (MSK) Received: from smtpcorp4.mail.yandex.net (localhost [127.0.0.1]) by smtpcorp4.mail.yandex.net (Yandex) with ESMTP id BDA7F2C05F8; Fri, 15 Aug 2014 00:28:32 +0400 (MSK) Received: from unknown (unknown [2a02:6b8:0:c33::a5]) by smtpcorp4.mail.yandex.net (nwsmtp/Yandex) with ESMTPSA id HJ7r5n9jg8-SWIWUmvU; Fri, 15 Aug 2014 00:28:32 +0400 (using TLSv1 with cipher AES128-SHA (128/128 bits)) (Client certificate not present) X-Yandex-Uniq: b1347399-3fcd-4e6f-a81f-1a68edba693f DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1408048112; bh=AMhV+l3u5lxVYwQRR0+Y4s9FBo/KJnT4CWVFgwmsEcM=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: References:In-Reply-To:Content-Type; b=xde01vkgzPfZyegBt7DxlWDZoR69gzEzUGykzvJJugnBOw6wDGiziLPBk8Ll2qmuZ NUzoE70kr6gD6ug1m6sq3Cn2pXfb6CP6rBWvk+ffHpFRz+RAKWWRm1CpUj0WezOw7s qu2ZELAgZ0lmr6Xp8585azQYHLkJAMhWXynv2okM= Authentication-Results: smtpcorp4.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Message-ID: <53ED1BEB.7000409@yandex-team.ru> Date: Fri, 15 Aug 2014 00:28:27 +0400 From: 
"Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Luigi Rizzo Subject: Re: [CFT] new tables for ipfw References: <53EBC687.9050503@yandex-team.ru> <53EC880B.3020903@yandex-team.ru> <53EC960A.1030603@yandex-team.ru> <53ECA302.8010100@yandex-team.ru> In-Reply-To: <53ECA302.8010100@yandex-team.ru> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: "freebsd-net@freebsd.org" , Luigi Rizzo , "Andrey V. Elsukov" , freebsd-ipfw X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Aug 2014 22:01:33 -0000 On 14.08.2014 15:52, Alexander V. Chernikov wrote: > On 14.08.2014 15:15, Luigi Rizzo wrote: >> >> >> >> On Thu, Aug 14, 2014 at 12:57 PM, Alexander V. Chernikov >> > wrote: >> >> On 14.08.2014 14:44, Luigi Rizzo wrote: >>> >>> >>> >>> On Thu, Aug 14, 2014 at 11:57 AM, Alexander V. Chernikov >>> > wrote: >>> >>> On 14.08.2014 13:23, Luigi Rizzo wrote: >>>> >>>> >>>> >>>> On Wed, Aug 13, 2014 at 10:11 PM, Alexander V. Chernikov >>>> > >>>> wrote: >>>> >>>> Hello list. >>>> >>>> I've been hacking ipfw for a while and It seems there >>>> is something ready to test/review in projects/ipfw branch. >>>> >>>> >>>> ​this is a fantastic piece of work, thanks for doing it and for >>>> integrating the feedback. >>>> ​ >>>> I have some detailed feedback that will send you privately, >>>> but just a curiosity: >>>> >>>> ​...​ >>>> >>>> Some examples (see ipfw(8) manual page for the >>>> description): >>>> >>>> >>>> ​... >>>> >>>> >>>> ipfw table mi_test create type cidr algo "cidr:hash >>>> masks=/30,/64" >>>> >>>> >>>> ​why do we need to specify mask lengths in the above​ ? 
>>> Well, since we're hashing IP we have to know mask to cut
>>> host bits in advance.
>>> (And the real reason is that I'm too lazy to implement
>>> hierarchical matching (check /32, then /31, then /30) like
>>> how, for example,
>>>
>>>
>>> oh well for that we should use cidr:radix
>>>
>>> Research results have never shown a strong superiority of
>>> hierarchical hash tables over good radix implementations,
>>> and in those cases one usually adopts partial prefix
>>> expansion so you only have, say, masks that are a
>>> multiple of 2..8 bits so you only need a small number of
>>> hash lookups.
>> Definitely, especially for IPv6. So I was actually thinking about
>> covering some special sparse cases (e.g. someone having a bunch
>> of /32 and a bunch of /30 and that's all).
>>
>> Btw, since we're talking about "good radix implementation": what
>> license does DXR have? :)
>> Is it OK to merge it as another cidr implementation?
>>
>>
>> "cidr" is a very ugly name, i'd rather use "addr"
> Ok, no problem with that. "addr" really sounds better.
>>
>> DXR has a BSD license and of course it is possible to use it.
>> You should ask Marko Zec for his latest version of the code
>> (and probably make sure we have one copy of the code in the source tree).
> Great! I'll ask him :)
>>
>> Speaking of features, one thing that would be nice is the ability
>> for tables to reference the in-kernel tables (e.g. fibs, socket
>> lists, interface lists...), perhaps in readonly mode.
>> How complex do you think that would be ?
Well, the biggest problem is that the table-handling code assumes that we know the number of items in advance and that, since we're holding locks, it won't change, so we don't need a large contiguous buffer to dump data into. This is not the case with "external" tables, so we can't _reliably_ dump them (the same situation as with dynamic states).
Anyway, I've added a cidr:kfib algo ( http://svnweb.freebsd.org/base?view=revision&revision=270001 ) and it looks funny.
Quoting commit message:

# ipfw table fib2 create algo "cidr:kfib fib=2"
# ipfw table fib2 info
+++ table(fib2), set(0) +++
 kindex: 2, type: cidr, locked
 valtype: number, references: 0
 algorithm: cidr:kfib fib=2
 items: 11, size: 288
# ipfw table fib2 list
+++ table(fib2), set(0) +++
10.0.0.0/24 0
127.0.0.1/32 0
::/96 0
::1/128 0
::ffff:0.0.0.0/96 0
2a02:978:2::/112 0
fe80::/10 0
fe80:1::/64 0
fe80:2::/64 0
fe80:3::/64 0
ff02::/16 0
# ipfw table fib2 lookup 10.0.0.5
10.0.0.0/24 0
# ipfw table fib2 lookup 2a02:978:2::11
2a02:978:2::/112 0
# ipfw table fib2 detail
+++ table(fib2), set(0) +++
 kindex: 2, type: cidr, locked
 valtype: number, references: 0
 algorithm: cidr:kfib fib=2
 items: 11, size: 288
 IPv4 algorithm radix info
  items: 0 itemsize: 200
 IPv6 algorithm radix info
  items: 0 itemsize: 200

> Implementing algo support for particular provider like sockets/iflists
> shouldn't be hard. Most of the algorithms complexity lies in table
> modifications. Here we have to support
> lookup and dump operations, so it is the question of providing
> necessary bindings to existing mechanisms (via some direct binding or
> utilizing things like kernel_sysctl for dump support).
>
> It looks like the following maps well to current table concept:
> * such tables are not created by default
> * user issues
> `ipfw table kfib create type addr algo "addr:kernel fib=0"`
> or
> `ipfw table ktcp create type flow algo "flow:kernel_tcp fib=0"`
> or
> `ipfw table kiface create type iface algo "iface:kernel"`
> * tables have special "readonly" type, flush_all requests are ignored
> * no state stored internally
>
> So generic table handling code needs to be modified to support
> read-only tables (and making more callbacks optional).
> Additionally, we might need to proxy "info" request into algo callback
> (optional, "real" algorithms won't implement it) to be able to show
> number of items (and some other info) to user.
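The read-only table idea quoted above (mandatory lookup and dump, optional modification callbacks, ignored flush requests, item count obtained through the algo) can be modelled roughly as follows. This is only a toy sketch in Python, not ipfw code: `TableAlgo`, `Table`, `KERNEL_IFACES` and `make_kiface_algo` are hypothetical names invented for illustration, and the interface list is made-up data standing in for an existing kernel structure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional, Tuple

Entry = Tuple[str, int]  # (key, value) pair, as printed by "table ... list"


@dataclass
class TableAlgo:
    """Hypothetical callback set: lookup and dump are mandatory, the rest optional."""
    name: str
    lookup: Callable[[str], Optional[int]]
    dump: Callable[[], Iterable[Entry]]
    add: Optional[Callable[[str, int], None]] = None
    flush: Optional[Callable[[], None]] = None

    @property
    def readonly(self) -> bool:
        return self.add is None


class Table:
    """Generic table code that tolerates missing (optional) callbacks."""

    def __init__(self, algo: TableAlgo) -> None:
        self.algo = algo

    def lookup(self, key: str) -> Optional[int]:
        return self.algo.lookup(key)

    def add(self, key: str, value: int) -> None:
        if self.algo.add is None:
            raise PermissionError(f"table algo {self.algo.name} is read-only")
        self.algo.add(key, value)

    def flush(self) -> None:
        # Flush requests are silently ignored for read-only tables.
        if self.algo.flush is not None:
            self.algo.flush()

    def info(self) -> Dict[str, object]:
        # The table stores no state itself, so the item count comes from dump().
        return {
            "algorithm": self.algo.name,
            "readonly": self.algo.readonly,
            "items": sum(1 for _ in self.algo.dump()),
        }


# Stand-in for an existing kernel structure (made-up interface list).
KERNEL_IFACES: Dict[str, int] = {"em0": 1, "lo0": 2}


def make_kiface_algo() -> TableAlgo:
    """Read-only algo: only lookup and dump are bound to the external data."""
    return TableAlgo(
        name="iface:kernel",
        lookup=KERNEL_IFACES.get,
        dump=lambda: sorted(KERNEL_IFACES.items()),
    )
```

With this model, `Table(make_kiface_algo()).lookup("em0")` goes straight to the external data, `flush()` is a no-op, and `add()` is rejected, which mirrors the "no state stored internally" point in the list above.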
> > > >> >> cheers >> luigi >> > From owner-freebsd-net@FreeBSD.ORG Fri Aug 15 13:25:13 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9BFD08E0; Fri, 15 Aug 2014 13:25:13 +0000 (UTC) Received: from mail.ipfw.ru (mail.ipfw.ru [IPv6:2a01:4f8:120:6141::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2C7D32D20; Fri, 15 Aug 2014 13:25:13 +0000 (UTC) Received: from [2a02:6b8:0:401:222:4dff:fe50:cd2f] (helo=ptichko.yndx.net) by mail.ipfw.ru with esmtpsa (TLSv1:DHE-RSA-AES128-SHA:128) (Exim 4.82 (FreeBSD)) (envelope-from ) id 1XIDXm-000D82-Q8; Fri, 15 Aug 2014 13:11:30 +0400 Message-ID: <53EE0A30.4020800@FreeBSD.org> Date: Fri, 15 Aug 2014 17:25:04 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Dmitry Selivanov Subject: Re: ipfw named objejcts, table values and syntax change References: <53DC01DE.3000000@FreeBSD.org> <53DCA25C.1000108@FreeBSD.org> <53DF55FA.8010303@FreeBSD.org> <20140804115817.GA13814@onelab2.iet.unipi.it> <53DFE438.5050209@FreeBSD.org> <53E4BE62.4050303@rlan.ru> In-Reply-To: <53E4BE62.4050303@rlan.ru> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: freebsd-ipfw , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Aug 2014 13:25:13 -0000 On 08.08.2014 16:11, Dmitry Selivanov wrote: > 04.08.2014 23:51, Alexander V. 
Chernikov wrote: >> On 04.08.2014 15:58, Luigi Rizzo wrote: >>> On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V. Chernikov wrote: >>>> On 02.08.2014 12:33, Alexander V. Chernikov wrote: >>>>> On 02.08.2014 10:33, Luigi Rizzo wrote: >>>>>> >>>>>> >>>>>> On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov >>>>>> > wrote: >>>>>> >>>>>> Hello all. >>>>>> >>>>>> I'm currently working on to enhance ipfw in some areas. >>>>>> The most notable (and user-visible) change is named table >>>>>> support. >>>>>> The other one is support for different lookup algorithms >>>>>> for different >>>>>> key types. >>>>>> >>>>>> For example, new ipfw permits writing this: >>>>>> >>>>>> ipfw table tb1 create type cidr >>>>>> ipfw add allow ip from table(tl1) to any >>>>>> ipfw add allow ip from any lookup dst-ip tb1 >>>>>> >>>>>> ipfw table if1 create type iface >>>>>> ipfw add skipto tablearg ip from any to any via table(if1) >>>>>> >>>>>> or even this: >>>>>> ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port >>>>>> ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 4444 >>>>>> ipfw add allow ip from any to any flow table(fl1) >>>>>> >>>>>> all these changes fully preserve backward compatibility. >>>>>> (actually tables needs now to be created before use and >>>>>> their type needs >>>>>> to match with opcode used, but new ipfw(8) performs >>>>>> auto-creation >>>>>> for cidr tables). >>>>>> >>>>>> There is another thing I'm going to change and I'm not sure >>>>>> I can keep >>>>>> the same compatibility level.
>>>>>> >>>>>> Table values, from one point of view, can be classified to >>>>>> the following >>>>>> types: >>>>>> >>>>>> - skipto argument >>>>>> - fwd argument (*) >>>>>> - link to another object (nat, pipe, queue) >>>>>> - plain u32 (not bound to any object) >>>>>> (divert/tee,netgraph,tag/utag,limit) >>>>>> >>>>>> There are the following reasons why I think it is necessary >>>>>> to implement >>>>>> explicit table values typing (like tables): >>>>>> - Implementing fwd tablearg for IPv6 hosts requires >>>>>> indirection table >>>>>> - Converting nat/pipe instance ids to names renders values >>>>>> unusable >>>>>> - retiring old hack with storing saved pointer of found >>>>>> object/rule >>>>>> inside rule w/o proper locking >>>>>> - making faster skipto >>>>>> >>>>>> >>>>>> i don't buy the idea that you need typed arguments >>>>>> for all the cases above. Maybe the case that >>>>>> may make sense is the fwd argument (and in the future >>>>>> something else). >>>>>> We already discussed, i think, the fact that now it >>>>>> is legal to have references to non existing things >>>>>> (skipto, pipes etc.) implemented as u32. >>>>>> Removing that would break configurations. >>>>> It depends on actual implementation. This can be preserved by >>>>> auto-creating necessary objects in kernel and/or in userspace, so >>>>> we can (and should) avoid breaking in this particular way. >>>> Can you please explain your vision on values another time? >>>> As far as I understand, you're not against it in general, but the >>>> details matter: >>>> * IP address can be one of the types (it won't break much, and we can >>>> simply skip that one for MFC) >>>> * what about typing for nat/pipes ? we're not going to convert >>>> their ids >>>> to names? (or maybe you can suggest other non-disruptive way?) >>>> * everything else is type "u32" >>> >>> Correct, I am mostly concerned about the details, not on the general >>> concept.
>>> >>> To summarize the discussion Alexander and I had about converting >>> identifiers from numbers to arbitrary strings (this is partly related >>> to the values stored in tables, but I think we should have a coherent >>> behaviour) >>> >>> 1. CURRENTLY ipfw uses numeric identifiers in a small range (16 bits >>> or less) >>> for rules, pipes, queues, tables, probably nat instances. >>> >>> 2. CURRENTLY, in all the above contexts, it is legal to reference a >>> non existing object (rule, pipe, table names, etc.), >>> and the kernel will do something reasonable, namely jump to the >>> next rule, drop traffic for non existing pipes, and so on. >>> >>> 3. of course we want to preserve backward compatibility both for >>> the ioctl interface, and for user configurations. >>> >>> 4. The in-kernel representation of identifiers is not visible to users, >>> so we can use a numeric representation in the kernel for >>> identifiers. >>> Strings like "12345" are converted with atoi() or the like, >>> whereas for other identifiers or numbers outside of the 2^16 range >>> the kernel manages a translation table, allocating new numeric >>> identifiers if a new string appears. >>> This permits backward compatibility for old rulesets, and does not >>> impact performance because the translation table is only >>> used during rules additions or deletion. >> Yes. However this requires either holding either (1) 2 pointers (old&new >> arrays), or (2) 65k+ index array, or (3) chained hash table. >> (1) would require additional pointers for each subsystem (and some >> additional management), >> (2) will definitely upset embedded guys and >> (3) is worse in terms of performance >>> >>> With this in mind, i think we should follow a similar approach for >>> objects stored in tables, hence >>> >>> if an u32 value was available in the past, it must be >>> available also in the new implementation. 
>>> >>> The issue with tables is that some convoluted configuration could >>> use the same table to reference pipes _and_ rules _and_ perhaps >>> other things represented as numbers (the former is not too strange, >>> if i have a large configuration i might place sections at rules >>> 12000, 13000, 14000... and associate pipes with the same numeric >>> identifier to each block of rules). >>> >>> Typed table values would clearly disturb backward compatibility >>> in the above configurations. However it should not be difficult >>> to accept arbitrary strings as the values stored in tables, and >>> then store multiple representations as appropriate, including: >> Well, I've thought about that one. It may be an option, but the details >> are not so promising (below) >>> - the string representation, unconditionally >>> - for names that can be resolved by DNS, the ipv6 and ipv4 address(es) >>> associated with them. ipfw already translates hostnames in rules >>> so this is POLA >> I'm not happy with what ipfw(8) is doing instead of translation. The proper >> way would be not simply using first AF_INET answer but saving ALL >> IPv4+IPv6 records inside rule (and some more tracking should be done >> afterwards, but that's totally different story). Additionally, I'm >> unsure if we really need next-hop value expressed as hostname (how can >> we deal with multiple addresses and different AFs?). We may store strings >> (and I think we should do it) but I'm unsure about this particular >> option of interpreting them. >>> - for other strings, a u32 from the translation table as previously >>> indicated >>> - and for numeric values, the u32 representation (truncated if needed, >>> according to whatever is the existing behaviour) >>> - >>> If we cannot generate an u32 we will put some value (e.g. 0) >>> that hopefully will not cause confusion.
>> As far as I understand, we accept some string "s" as table value inside >> the kernel, then we have some logic that says: >> oh, dummynet pipe has the same name "s", oh, nat entity with name "s" >> has just been created, let's save indices. >> >> That would require additional indirection table like: >> >> index | [ skipto idx | nat idx | pipe idx | queue idx | fwd index ] >> ( so we will have 2-level indirection table for fwd if we do IPv6) >> >> We can optimize this if we use "same name -> same kidx" approach >> regardless of kernel object we're referring to. That might require some >> more memory, but that's OK from my point of view. >> >> So we end up with >> int [ skipto idx | fwd idx | obj idx ] >> >> idx "0" is special value which means the same as 2.CURRENT >> >> That looks better, but still way too complex. >> I do care about compatibility, but it's hard to improve things without >> changing. >> >> I'd like to propose the following: >> * Split values into 3 types ("ip|nexthop", "number", "object") >> * Do not insist on object existence, use value "0" to mimic 2.CURRENT >> behavior. >> * Retain full compatibility by introducing special value type "legacy" >> which matches any type and is backed by given indirection table. >> * Issue warning in ipfw(8) binary on all auto-created tables that >> auto-creation is legacy and this behavior will be dropped in next major >> release (e.g. 11.0) >> * Save this behavior in MFC but drop "legacy" tables in head >> a month after the actual MFC. >> >> What do you think? >>> >>> If we do it this way, we should be able to preserve backward >>> compatibility _and_ add features that people may need. >>> >>> cheers >>> luigi >>> > Here is my idea: tablearg should contain more than one value. I think > getting several values from one table lookup is faster than several > table lookups with one value. > Let tablearg be not just uint32, but array with different value types > inside it.
There are some use cases where we might need a 2-level value lookup (e.g. an algo returning an index into an index table where the actual data resides), and each data item can really be up to 64 bytes long. The problem is in the actual partitioning and compatibility.
>
> For example I have many such rules:
> allow src-ip 1.2.3.4 MAC any 11:22:33:44:55:66 recv vlan1234 dst-ip
> 1.1.1.1
Sorry, what task are you solving with these rules?
>
> These rules can be replaced with such construction:
> allow src-ip table(1) MAC any tablearg[1] recv tablearg[2] dst-ip
> tablearg[3]
>
> But I don't think indexing by value is a good idea. I think
> index==starting byte is a better way:
> allow src-ip table(1) MAC any tablearg:0 recv tablearg:6 dst-ip
> tablearg:32
> where MAC's 6 bytes are from 0 to 5 in tablearg; iface string is from
> 6 and till \0, but less than 26 bytes; and IPv4's 4 bytes are from 32
> to 35.
> So we need to create table for it:
> table 1 set MAC:0 string:6:26 ip:32
> table 1 add 1.2.3.4 11:22:33:44:55:66 vlan1234 1.1.1.1
>
> String can be used both for iface and comment.
> Other possible value types:
> uint16 for nat, pipe, skipto and other 2-bytes actions
> IPv4 4 bytes
> CIDRv4 5 bytes
> IPv6 16 bytes
> CIDRv6 17 bytes
> table_id 2 bytes - link to another table
Well, it seems we have enough space to store most of these; however, the problems seem to remain the same: typing and compatibility.
When you create a new table (or it is auto-created), which value types should be assumed? All of them?
What should `ipfw table X list` show in the "value" field?
How should ipfw(8) treat "add 1.1.1.1 0" input?
What will happen if we want to add another type to this list (an InfiniBand MAC address, for example)?
>
> Table value length can be set for example with loader tunable like
> net.inet.ip.fw.table_value_length.
> Even with default uint32 value length we can get 2 uint16 values or 4
> uint8 values, this can help in some configurations.
> > This way is more complex, but much more flexible. It's like netgraph > subsystem. > I think it suits both Alexander and Luigi requests. > > From owner-freebsd-net@FreeBSD.ORG Fri Aug 15 14:19:16 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id AE9111DF; Fri, 15 Aug 2014 14:19:16 +0000 (UTC) Received: from mail.rlan.ru (mail.rlan.ru [213.234.25.10]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 03C372540; Fri, 15 Aug 2014 14:19:15 +0000 (UTC) Message-ID: <53EE16DE.9020209@rlan.ru> Date: Fri, 15 Aug 2014 18:19:10 +0400 From: Dmitry Selivanov User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:24.0) Gecko/20100101 Thunderbird/24.1.1 MIME-Version: 1.0 To: "Alexander V. Chernikov" Subject: Re: ipfw named objejcts, table values and syntax change References: <53DC01DE.3000000@FreeBSD.org> <53DCA25C.1000108@FreeBSD.org> <53DF55FA.8010303@FreeBSD.org> <20140804115817.GA13814@onelab2.iet.unipi.it> <53DFE438.5050209@FreeBSD.org> <53E4BE62.4050303@rlan.ru> <53EE0A30.4020800@FreeBSD.org> In-Reply-To: <53EE0A30.4020800@FreeBSD.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: freebsd-ipfw , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Aug 2014 14:19:16 -0000 15.08.2014 17:25, Alexander V. Chernikov wrote: > On 08.08.2014 16:11, Dmitry Selivanov wrote: >> 04.08.2014 23:51, Alexander V. Chernikov wrote: >>> On 04.08.2014 15:58, Luigi Rizzo wrote: >>>> On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V.
Chernikov wrote: >>>>> On 02.08.2014 12:33, Alexander V. Chernikov wrote: >>>>>> On 02.08.2014 10:33, Luigi Rizzo wrote: >>>>>>> >>>>>>> >>>>>>> On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov >>>>>>> > wrote: >>>>>>> >>>>>>> Hello all. >>>>>>> >>>>>>> I'm currently working on to enhance ipfw in some areas. >>>>>>> The most notable (and user-visible) change is named table support. >>>>>>> The other one is support for different lookup algorithms for different >>>>>>> key types. >>>>>>> >>>>>>> For example, new ipfw permits writing this: >>>>>>> >>>>>>> ipfw table tb1 create type cidr >>>>>>> ipfw add allow ip from table(tl1) to any >>>>>>> ipfw add allow ip from any lookup dst-ip tb1 >>>>>>> >>>>>>> ipfw table if1 create type iface >>>>>>> ipfw add skipto tablearg ip from any to any via table(if1) >>>>>>> >>>>>>> or even this: >>>>>>> ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port >>>>>>> ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 4444 >>>>>>> ipfw add allow ip from any to any flow table(fl1) >>>>>>> >>>>>>> all these changes fully preserve backward compatibility. >>>>>>> (actually tables needs now to be created before use and their type needs >>>>>>> to match with opcode used, but new ipfw(8) performs auto-creation >>>>>>> for cidr tables). >>>>>>> >>>>>>> There is another thing I'm going to change and I'm not sure I can keep >>>>>>> the same compatibility level. 
>>>>>>> >>>>>>> Table values, from one point of view, can be classified to the following >>>>>>> types: >>>>>>> >>>>>>> - skipto argument >>>>>>> - fwd argument (*) >>>>>>> - link to another object (nat, pipe, queue) >>>>>>> - plain u32 (not bound to any object) >>>>>>> (divert/tee,netgraph,tag/utag,limit) >>>>>>> >>>>>>> There are the following reasons why I think it is necessary to implement >>>>>>> explicit table values typing (like tables): >>>>>>> - Implementing fwd tablearg for IPv6 hosts requires indirection table >>>>>>> - Converting nat/pipe instance ids to names renders values unusable >>>>>>> - retiring old hack with storing saved pointer of found object/rule >>>>>>> inside rule w/o proper locking >>>>>>> - making faster skipto >>>>>>> >>>>>>> >>>>>>> ??????i don't buy the idea that you need typed arguments >>>>>>> for all the cases above. Maybe the case that >>>>>>> may make sense is the fwd argument (and in the future >>>>>>> something else). >>>>>>> We already discussed, i think, the fact that now it >>>>>>> is legal to have references to non existing things >>>>>>> (skipto, pipes etc.) implemented as u32. >>>>>>> Removing that would break configurations. >>>>>> It depends on actual implementation. This can be preserved by >>>>>> auto-creating necessary objects in kernel and/or in userspace, so >>>>>> we can (and should) avoid breaking in this particular way. >>>>> Can you please explain your vision on values another time? >>>>> As far as I understand, you're not against it in general, but the >>>>> details matter: >>>>> * IP address can be one of the types (it won't break much, and we can >>>>> simply skip that one for MFC) >>>>> * what about typing for nat/pipes ? we're not going to convert their ids >>>>> to names? (or maybe you can suggest other non-disruptive way?) >>>>> * everything else is type "u32" >>>> >>>> Correct, I am mostly concerned about the details, not on the general concept. 
>>>> >>>> To summarize the discussion Alexander and I had about converting >>>> identifiers from numbers to arbitrary strings (this is partly related >>>> to the values stored in tables, but I think we should have a coherent >>>> behaviour) >>>> >>>> 1. CURRENTLY ipfw uses numeric identifiers in a small range (16 bits or less) >>>> for rules, pipes, queues, tables, probably nat instances. >>>> >>>> 2. CURRENTLY, in all the above contexts, it is legal to reference a >>>> non existing object (rule, pipe, table names, etc.), >>>> and the kernel will do something reasonable, namely jump to the >>>> next rule, drop traffic for non existing pipes, and so on. >>>> >>>> 3. of course we want to preserve backward compatibility both for >>>> the ioctl interface, and for user configurations. >>>> >>>> 4. The in-kernel representation of identifiers is not visible to users, >>>> so we can use a numeric representation in the kernel for identifiers. >>>> Strings like "12345" are converted with atoi() or the like, >>>> whereas for other identifiers or numbers outside of the 2^16 range >>>> the kernel manages a translation table, allocating new numeric >>>> identifiers if a new string appears. >>>> This permits backward compatibility for old rulesets, and does not >>>> impact performance because the translation table is only >>>> used during rules additions or deletion. >>> Yes. However this requires either holding either (1) 2 pointers (old&new >>> arrays), or (2) 65k+ index array, or (3) chained hash table. >>> (1) would require additional pointers for each subsystem (and some >>> additional management), >>> (2) will definitely upset embedded guys and >>> (3) is worse in terms of performance >>>> >>>> With this in mind, i think we should follow a similar approach for >>>> objects stored in tables, hence >>>> >>>> if an u32 value was available in the past, it must be >>>> available also in the new implementation. 
>>>> >>>> The issue with tables is that some convoluted configuration could >>>> use the same table to reference pipes _and_ rules _and_ perhaps >>>> other things represented as numbers (the former is not too strange, >>>> if i have a large configuration i might place sections at rules >>>> 12000, 13000, 14000... and associate pipes with the same numberic >>>> identifier to each block of rules). >>>> >>>> Typed table values would clearly disturb backward compatibility >>>> in the above configurations. However it should not be difficult >>>> to accept arbitrary strings as the values stored in tables, and >>>> then store multiple representations as appropriate, including: >>> Well, I've thought about thas one. It may be an option, but the details >>> are not so promising (below) >>>> - the string representation, unconditionally >>>> - for names that can be resolved by DNS, the ipv6 and ipv4 address(es) >>>> associated with them. ipfw already translates hostnames in rules >>>> so this is POLA >>> I'm not happy what ipfw(8) is doing instead of translation. The proper >>> way would be not simply using first AF_INET answer but saving ALL >>> IPv4+IPv6 records inside rule (and some more tracking should be done >>> afterwards, but that's totally different story). Additionally, I'm >>> unsure if we really need next-hop value expressed as hostname (how can >>> we deal with multiple addresses and diffrent AFs?). We may store strings >>> (and I think we should do it) but I'm unsure about this particular >>> option of interpreting them. >>>> - for other strings, a u32 from the translation table as previously >>>> indicated >>>> - and for numeric values, the u32 representation (truncated if needed, >>>> according to whatever is the existing behaviour) >>>> - >>>> If we cannot generate an u32 we will put some value (e.g. 0) >>>> that hopefully will not cause confusion. 
>>> As far as I understand, we accept some string "s" as table value inside >>> the kernel, than, we have some logic that says: >>> oh, dummynet pipe has the same name "s"s, oh, nat entity with name "s" >>> has just been created, let's save indices. >>> >>> That would require additional indirection table like: >>> >>> index | [ skipto idx | nat idx | pipe idx | queue idx | fwd index ] >>> ( so we will have 2-level indirection table for fwd if we do IPv6) >>> >>> We can optimize this if we use "same name -> same kidx" approach >>> regardless of kernel object we're refering to. That might require some >>> more memory, but that's OK from my point of view. >>> >>> So we end up with >>> int [ skipto idx | fwd idx | obj idx ] >>> >>> idx "0" is special value which means the same as 2.CURRENT >>> >>> That looks better, but still way to complex. >>> I do care about compatibility, but it's hard to improve things without >>> changing. >>> >>> I'd like to propose the following: >>> * Split values into 3 types ("ip|nexthop", "number", "object") >>> * Do not insist on object existence, use value "0" to mimic 2.CURRENT >>> behavior. >>> * Retain full compatibility by introducing special value type "legacy" >>> which matches any type and is backed by given indirection table. >>> * Issue warning in ipfw(8) binary on all auto-created tables that >>> auto-creation is legacy and this behavior will be dropped in next major >>> release (e.g. 11.0) >>> * Save this behavior in MFC but drop "legacy" tables in head after a >>> month after actual MFC. >>> >>> That do you think? >>>> >>>> If we do it this way, we should be able to preserve backward >>>> compatibility _and_ add features that people may need. >>>> >>>> cheers >>>> luigi >>>> >> Here is my idea: tablearg should contain more than one value. I think getting several values from one table lookup is faster than several table lookups with one value. >> Let tablearg be not just uint32, but array with different value types inside it. 
> There are some use cases where we might need 2-level value lookup (e.g. algo returning index for index table where actual data reside) and each data item can
> really be up to 64-bytes long. The problem is in actual partitioning and compatibility.
>>
>> For example I have many such rules:
>> allow src-ip 1.2.3.4 MAC any 11:22:33:44:55:66 recv vlan1234 dst-ip 1.1.1.1
> Sorry, what task are you solving by using given rules?
A small ISP: clients have static IPs with MAC authorization. The source interface must be checked to prevent IP spoofing. The destination IP is sometimes used for p2p channels.
>>
>> These rules can be replaced with such construction:
>> allow src-ip table(1) MAC any tablearg[1] recv tablearg[2] dst-ip tablearg[3]
>>
>> But I don't think indexing by value is a good idea. I think index==starting byte is a better way:
>> allow src-ip table(1) MAC any tablearg:0 recv tablearg:6 dst-ip tablearg:32
>> where MAC's 6 bytes are from 0 to 5 in tablearg; iface string is from 6 and till \0, but less than 26 bytes; and IPv4's 4 bytes are from 32 to 35.
>
>> So we need to create table for it:
>> table 1 set MAC:0 string:6:26 ip:32
>> table 1 add 1.2.3.4 11:22:33:44:55:66 vlan1234 1.1.1.1
>>
>> String can be used both for iface and comment.
>> Other possible value types:
>> uint16 for nat, pipe, skipto and other 2-bytes actions
>> IPv4 4 bytes
>> CIDRv4 5 bytes
>> IPv6 16 bytes
>> CIDRv6 17 bytes
>> table_id 2 bytes - link to another table
> Well, it seems we have enough space to store most of these, however, problems seem to remain the same: typing and compatibility.
> When you're creating new table (or it is auto-created) which values types should be assumed ? All of them?
The default is uint32, as usual.
> What should `ipfw table X list` show as "value" field ?
I added a table "header" with this line:
table 1 set MAC:0 string:6:26 ip:32
So `ipfw table X list` should show something like this:
---table(0)---
1.2.3.4/32 11:22:33:44:55:66 vlan1234 1.1.1.1
We can also add a "header" description to the output (with or without an additional parameter, depending on compatibility needs), like this:
---table(0)---
addr MAC iface IPv4
> How should ipfw(8) treat "add 1.1.1.1 0" input?
It should look at the table "header" and return an error message like "Value doesn't match table header".
> What will happen if we want to add another type field to this list? (MAC address of Infiniband MAC address, for example).
I don't think it makes sense to mix both MAC[6] and MAC[20] values in one table. It is easier to create 2 tables with different "headers".
For Infiniband we can add another type: MAC20 (or something like this). Or we can use the "MAC" type like the string type (see above): MAC:6:25 (1st and last bytes, or 1st and length).
>
>>
>> Table value length can be set for example with loader tunable like net.inet.ip.fw.table_value_length.
>> Even with default uint32 value length we can get 2 uint16 values or 4 uint8 values, this can help in some configurations.
>>
>> This way is more complex, but much more flexible. It's like netgraph subsystem.
>> I think it suits both Alexander and Luigi requests.
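To make the byte-offset "header" proposal discussed in this message concrete, here is a rough Python sketch of how a value laid out as `MAC:0 string:6:26 ip:32` could be packed and unpacked. It is only an illustration of the layout, not ipfw code: `LAYOUT`, `pack_value` and `unpack_value` are hypothetical names, and the 36-byte total size is an assumption derived from the offsets in the proposal (6 + 26 + 4 bytes).

```python
import ipaddress
import struct

# Assumed layout from the proposed header "MAC:0 string:6:26 ip:32":
# '<' disables alignment padding, so the fields start at offsets 0, 6 and 32.
LAYOUT = struct.Struct("<6s26s4s")


def pack_value(mac: str, iface: str, ip: str) -> bytes:
    """Pack one table value; reject input that does not fit the header."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    iface_bytes = iface.encode("ascii")
    # The string must be shorter than 26 bytes so that a NUL terminator fits.
    if len(mac_bytes) != 6 or len(iface_bytes) >= 26:
        raise ValueError("Value doesn't match table header")
    ip_bytes = ipaddress.IPv4Address(ip).packed
    # struct pads the short interface name with NUL bytes up to 26 bytes.
    return LAYOUT.pack(mac_bytes, iface_bytes, ip_bytes)


def unpack_value(blob: bytes):
    """Split a packed value back into (MAC, iface, IPv4) strings."""
    mac_bytes, iface_bytes, ip_bytes = LAYOUT.unpack(blob)
    mac = ":".join(f"{b:02x}" for b in mac_bytes)
    iface = iface_bytes.split(b"\0", 1)[0].decode("ascii")
    return mac, iface, str(ipaddress.IPv4Address(ip_bytes))
```

For the example entry `table 1 add 1.2.3.4 11:22:33:44:55:66 vlan1234 1.1.1.1`, `pack_value("11:22:33:44:55:66", "vlan1234", "1.1.1.1")` yields a 36-byte value in which `tablearg:0`, `tablearg:6` and `tablearg:32` would address the MAC, the interface name and the IPv4 address respectively.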
>> >> > From owner-freebsd-net@FreeBSD.ORG Fri Aug 15 15:20:23 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 57212ABF; Fri, 15 Aug 2014 15:20:23 +0000 (UTC) Received: from mail.ipfw.ru (mail.ipfw.ru [IPv6:2a01:4f8:120:6141::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CF1652C86; Fri, 15 Aug 2014 15:20:22 +0000 (UTC) Received: from [2a02:6b8:0:401:222:4dff:fe50:cd2f] (helo=ptichko.yndx.net) by mail.ipfw.ru with esmtpsa (TLSv1:DHE-RSA-AES128-SHA:128) (Exim 4.82 (FreeBSD)) (envelope-from ) id 1XIFLD-000EOJ-GP; Fri, 15 Aug 2014 15:06:39 +0400 Message-ID: <53EE252D.10109@FreeBSD.org> Date: Fri, 15 Aug 2014 19:20:13 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Dmitry Selivanov Subject: Re: ipfw named objejcts, table values and syntax change References: <53DC01DE.3000000@FreeBSD.org> <53DCA25C.1000108@FreeBSD.org> <53DF55FA.8010303@FreeBSD.org> <20140804115817.GA13814@onelab2.iet.unipi.it> <53DFE438.5050209@FreeBSD.org> <53E4BE62.4050303@rlan.ru> <53EE0A30.4020800@FreeBSD.org> <53EE16DE.9020209@rlan.ru> In-Reply-To: <53EE16DE.9020209@rlan.ru> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: freebsd-ipfw , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Aug 2014 15:20:23 -0000 On 15.08.2014 18:19, Dmitry Selivanov wrote: > 15.08.2014 17:25, Alexander V. 
> Chernikov wrote:
>> On 08.08.2014 16:11, Dmitry Selivanov wrote:
>>> 04.08.2014 23:51, Alexander V. Chernikov wrote:
>>>> On 04.08.2014 15:58, Luigi Rizzo wrote:
>>>>> On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V. Chernikov wrote:
>>>>>> On 02.08.2014 12:33, Alexander V. Chernikov wrote:
>>>>>>> On 02.08.2014 10:33, Luigi Rizzo wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov wrote:
>>>>>>>>
>>>>>>>> Hello all.
>>>>>>>>
>>>>>>>> I'm currently working on enhancing ipfw in some areas.
>>>>>>>> The most notable (and user-visible) change is named table support.
>>>>>>>> The other one is support for different lookup algorithms
>>>>>>>> for different key types.
>>>>>>>>
>>>>>>>> For example, new ipfw permits writing this:
>>>>>>>>
>>>>>>>> ipfw table tb1 create type cidr
>>>>>>>> ipfw add allow ip from table(tb1) to any
>>>>>>>> ipfw add allow ip from any lookup dst-ip tb1
>>>>>>>>
>>>>>>>> ipfw table if1 create type iface
>>>>>>>> ipfw add skipto tablearg ip from any to any via table(if1)
>>>>>>>>
>>>>>>>> or even this:
>>>>>>>> ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port
>>>>>>>> ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 4444
>>>>>>>> ipfw add allow ip from any to any flow table(fl1)
>>>>>>>>
>>>>>>>> all these changes fully preserve backward compatibility.
>>>>>>>> (actually tables now need to be created before use and
>>>>>>>> their type needs to match the opcode used, but new ipfw(8)
>>>>>>>> performs auto-creation for cidr tables).
>>>>>>>>
>>>>>>>> There is another thing I'm going to change and I'm not
>>>>>>>> sure I can keep the same compatibility level.
>>>>>>>>
>>>>>>>> Table values, from one point of view, can be classified
>>>>>>>> to the following types:
>>>>>>>>
>>>>>>>> - skipto argument
>>>>>>>> - fwd argument (*)
>>>>>>>> - link to another object (nat, pipe, queue)
>>>>>>>> - plain u32 (not bound to any object)
>>>>>>>> (divert/tee,netgraph,tag/utag,limit)
>>>>>>>>
>>>>>>>> There are the following reasons why I think it is
>>>>>>>> necessary to implement
>>>>>>>> explicit table values typing (like tables):
>>>>>>>> - Implementing fwd tablearg for IPv6 hosts requires
>>>>>>>> indirection table
>>>>>>>> - Converting nat/pipe instance ids to names renders
>>>>>>>> values unusable
>>>>>>>> - retiring old hack with storing saved pointer of found
>>>>>>>> object/rule inside rule w/o proper locking
>>>>>>>> - making faster skipto
>>>>>>>>
>>>>>>>>
>>>>>>>> i don't buy the idea that you need typed arguments
>>>>>>>> for all the cases above. Maybe the case that
>>>>>>>> may make sense is the fwd argument (and in the future
>>>>>>>> something else).
>>>>>>>> We already discussed, i think, the fact that now it
>>>>>>>> is legal to have references to non existing things
>>>>>>>> (skipto, pipes etc.) implemented as u32.
>>>>>>>> Removing that would break configurations.
>>>>>>> It depends on actual implementation. This can be preserved by
>>>>>>> auto-creating necessary objects in kernel and/or in userspace, so
>>>>>>> we can (and should) avoid breaking in this particular way.
>>>>>> Can you please explain your vision on values another time?
>>>>>> As far as I understand, you're not against it in general, but the
>>>>>> details matter:
>>>>>> * IP address can be one of the types (it won't break much, and we
>>>>>> can simply skip that one for MFC)
>>>>>> * what about typing for nat/pipes ? we're not going to convert
>>>>>> their ids to names? (or maybe you can suggest other non-disruptive way?)
>>>>>> * everything else is type "u32"
>>>>>
>>>>> Correct, I am mostly concerned about the details, not about the
>>>>> general concept.
>>>>>
>>>>> To summarize the discussion Alexander and I had about converting
>>>>> identifiers from numbers to arbitrary strings (this is partly related
>>>>> to the values stored in tables, but I think we should have a coherent
>>>>> behaviour)
>>>>>
>>>>> 1. CURRENTLY ipfw uses numeric identifiers in a small range (16
>>>>> bits or less)
>>>>> for rules, pipes, queues, tables, probably nat instances.
>>>>>
>>>>> 2. CURRENTLY, in all the above contexts, it is legal to reference a
>>>>> non existing object (rule, pipe, table names, etc.),
>>>>> and the kernel will do something reasonable, namely jump to the
>>>>> next rule, drop traffic for non existing pipes, and so on.
>>>>>
>>>>> 3. of course we want to preserve backward compatibility both for
>>>>> the ioctl interface, and for user configurations.
>>>>>
>>>>> 4. The in-kernel representation of identifiers is not visible to
>>>>> users, so we can use a numeric representation in the kernel for
>>>>> identifiers.
>>>>> Strings like "12345" are converted with atoi() or the like,
>>>>> whereas for other identifiers or numbers outside of the 2^16 range
>>>>> the kernel manages a translation table, allocating new numeric
>>>>> identifiers if a new string appears.
>>>>> This permits backward compatibility for old rulesets, and does not
>>>>> impact performance because the translation table is only
>>>>> used during rules additions or deletion.
>>>> Yes. However this requires holding either (1) 2 pointers (old&new
>>>> arrays), or (2) 65k+ index array, or (3) chained hash table.
>>>> (1) would require additional pointers for each subsystem (and some
>>>> additional management),
>>>> (2) will definitely upset embedded guys and
>>>> (3) is worse in terms of performance
>>>>>
>>>>> With this in mind, i think we should follow a similar approach for
>>>>> objects stored in tables, hence
>>>>>
>>>>> if a u32 value was available in the past, it must be
>>>>> available also in the new implementation.
>>>>>
>>>>> The issue with tables is that some convoluted configuration could
>>>>> use the same table to reference pipes _and_ rules _and_ perhaps
>>>>> other things represented as numbers (the former is not too strange,
>>>>> if i have a large configuration i might place sections at rules
>>>>> 12000, 13000, 14000... and associate pipes with the same numeric
>>>>> identifier to each block of rules).
>>>>>
>>>>> Typed table values would clearly disturb backward compatibility
>>>>> in the above configurations. However it should not be difficult
>>>>> to accept arbitrary strings as the values stored in tables, and
>>>>> then store multiple representations as appropriate, including:
>>>> Well, I've thought about that one. It may be an option, but the
>>>> details are not so promising (below)
>>>>> - the string representation, unconditionally
>>>>> - for names that can be resolved by DNS, the ipv6 and ipv4
>>>>> address(es) associated with them. ipfw already translates
>>>>> hostnames in rules so this is POLA
>>>> I'm not happy with what ipfw(8) is doing instead of translation. The proper
>>>> way would be not simply using first AF_INET answer but saving ALL
>>>> IPv4+IPv6 records inside rule (and some more tracking should be done
>>>> afterwards, but that's a totally different story). Additionally, I'm
>>>> unsure if we really need next-hop value expressed as hostname (how can
>>>> we deal with multiple addresses and different AFs?).
>>>> We may store strings
>>>> (and I think we should do it) but I'm unsure about this particular
>>>> option of interpreting them.
>>>>> - for other strings, a u32 from the translation table as previously
>>>>> indicated
>>>>> - and for numeric values, the u32 representation (truncated if
>>>>> needed, according to whatever is the existing behaviour)
>>>>> -
>>>>> If we cannot generate a u32 we will put some value (e.g. 0)
>>>>> that hopefully will not cause confusion.
>>>> As far as I understand, we accept some string "s" as table value
>>>> inside the kernel, then, we have some logic that says:
>>>> oh, dummynet pipe has the same name "s", oh, nat entity with name "s"
>>>> has just been created, let's save indices.
>>>>
>>>> That would require additional indirection table like:
>>>>
>>>> index | [ skipto idx | nat idx | pipe idx | queue idx | fwd index ]
>>>> ( so we will have 2-level indirection table for fwd if we do IPv6)
>>>>
>>>> We can optimize this if we use "same name -> same kidx" approach
>>>> regardless of kernel object we're referring to. That might require some
>>>> more memory, but that's OK from my point of view.
>>>>
>>>> So we end up with
>>>> int [ skipto idx | fwd idx | obj idx ]
>>>>
>>>> idx "0" is special value which means the same as 2.CURRENT
>>>>
>>>> That looks better, but still way too complex.
>>>> I do care about compatibility, but it's hard to improve things without
>>>> changing.
>>>>
>>>> I'd like to propose the following:
>>>> * Split values into 3 types ("ip|nexthop", "number", "object")
>>>> * Do not insist on object existence, use value "0" to mimic 2.CURRENT
>>>> behavior.
>>>> * Retain full compatibility by introducing special value type "legacy"
>>>> which matches any type and is backed by given indirection table.
>>>> * Issue warning in ipfw(8) binary on all auto-created tables that
>>>> auto-creation is legacy and this behavior will be dropped in next
>>>> major release (e.g.
>>>> 11.0)
>>>> * Save this behavior in MFC but drop "legacy" tables in head a
>>>> month after actual MFC.
>>>>
>>>> What do you think?
>>>>>
>>>>> If we do it this way, we should be able to preserve backward
>>>>> compatibility _and_ add features that people may need.
>>>>>
>>>>> cheers
>>>>> luigi
>>>>>
>>> Here is my idea: tablearg should contain more than one value. I
>>> think getting several values from one table lookup is faster than
>>> several table lookups with one value.
>>> Let tablearg be not just uint32, but array with different value
>>> types inside it.
>> There are some use cases where we might need 2-level value lookup
>> (e.g. algo returning index for index table where actual data reside)
>> and each data item can really be up to 64 bytes long. The problem
>> is in actual partitioning and compatibility.
>>>
>>> For example I have many such rules:
>>> allow src-ip 1.2.3.4 MAC any 11:22:33:44:55:66 recv vlan1234 dst-ip 1.1.1.1
>> Sorry, what task are you solving by using given rules?
> Small ISP, clients have static IP with MAC-authorization. Src iface
> must be checked to prevent IP-spoofing. Dst-IP sometimes is used for
> p2p-channels.
>>>
>>> These rules can be replaced with such construction:
>>> allow src-ip table(1) MAC any tablearg[1] recv tablearg[2] dst-ip tablearg[3]
>>>
>>> But I don't think indexing by value is a good idea. I think
>>> index==starting byte is a better way:
>>> allow src-ip table(1) MAC any tablearg:0 recv tablearg:6 dst-ip tablearg:32
>>> where MAC's 6 bytes are from 0 to 5 in tablearg; iface string is
>>> from 6 and till \0, but less than 26 bytes; and IPv4's 4 bytes are
>>> from 32 to 35.
>>
>>> So we need to create table for it:
>>> table 1 set MAC:0 string:6:26 ip:32
>>> table 1 add 1.2.3.4 11:22:33:44:55:66 vlan1234 1.1.1.1
>>>
>>> String can be used both for iface and comment.
>>> Other possible value types:
>>> uint16 for nat, pipe, skipto and other 2-bytes actions
>>> IPv4 4 bytes
>>> CIDRv4 5 bytes
>>> IPv6 16 bytes
>>> CIDRv6 17 bytes
>>> table_id 2 bytes - link to another table
>> Well, it seems we have enough space to store most of these, however,
>> problems seem to remain the same: typing and compatibility.
>> When you're creating new table (or it is auto-created) which value
>> types should be assumed ? All of them?
> Default - as usual uint32.
I can't see "uint32" value in the list you have specified before.
I'll rephrase: what value types (from the list above or similar) should ipfw(8) or kernel fill in case of "default" table? (And once again, what should we print as value) ?
Please think about
a) old ipfw binaries
b) new ipfw binaries
using exactly the same ruleset they are already using (with, for example, both "skipto tablearg" and "fwd tablearg" tables).
>> What should `ipfw table X list` show as "value" field ?
> I added table "header" in this line:
> table 1 set MAC:0 string:6:26 ip:32
I don't think that user should be able to set any offsets in userland.
Exact offsets of a variable of given type need to be enforced by kernel, so you may specify that you want "mac" and "ip" as values for given table, but not lengths or offsets.
> So `ipfw table X list` should show something like this:
> ---table(0)---
> 1.2.3.4/32 11:22:33:44:55:66 vlan1234 1.1.1.1
> We can also add "header" description in output (with or without
> additional parameter - depends on compatibility needs) like this:
> ---table(0)--- addr MAC iface IPv4
>> How should ipfw(8) treat "add 1.1.1.1 0" input?
> It should look at table "header" and return error message like "Value
> doesn't match table header"
>> What will happen if we want to add another type field to this list?
>> (MAC address of Infiniband MAC address, for example).
> I don't think it makes sense to mix both MAC[6] and MAC[20] values
> in 1 table.
> It is easier to create 2 tables with different "headers".
> For Infiniband we can add another type: MAC20 (or something like
> this). Or we can use "MAC"-type like string type (see above): MAC:6:25
> (1st and last bytes, or 1st and length).
>>
>>>
>>> Table value length can be set for example with loader tunable like
>>> net.inet.ip.fw.table_value_length.
>>> Even with default uint32 value length we can get 2 uint16 values or
>>> 4 uint8 values, this can help in some configurations.
>>>
>>> This way is more complex, but much more flexible. It's like netgraph
>>> subsystem.
>>> I think it suits both Alexander and Luigi requests.
>>>
>>>
>>
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
From owner-freebsd-net@FreeBSD.ORG Fri Aug 15 16:40:20 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2896A80F for ; Fri, 15 Aug 2014 16:40:20 +0000 (UTC) Received: from mail-ob0-f177.google.com (mail-ob0-f177.google.com [209.85.214.177]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E069E27F7 for ; Fri, 15 Aug 2014 16:40:19 +0000 (UTC) Received: by mail-ob0-f177.google.com with SMTP id wp18so2111526obc.36 for ; Fri, 15 Aug 2014 09:40:13 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:content-type:content-transfer-encoding; bh=69EVvCJEooLRxNxKB3EIageqalelt/v9XwfuMRZfA+8=;
b=CmuODtJaPL3dQxrUd/ZgXfG9GETznqB9nGnoMYCyxe+gaPoKKr51s1q2hU33paBl1a 500Ago68sVuRgUV2LRibwQAGFPYwvAwwYbaGVmXLvA+Ek2GKYndLh8gNO41fEBKEvUop Xcf6jNjQMgtZudBqh9HvAzMhnMZvk3KoPZs85yKyduMg52wfdTni8376da/FU1+w8STV ImBOCPRrLCT41pOrWVN3KbAxBTpxHx/bAQsxyweydy6AEPDeZZUWZqnFFYV+DiwEF/P2 aMG0Uo4PeijxdetobX+KVa+RzpagzVweAZSnwIyv8uTRq71VCZYYvLVlESgyRoBJW84d nAcg== X-Gm-Message-State: ALoCoQmeY666pgpkL2MZjh8mUWy6OMtktOEZotEAnk/wklYqN/ELH408VV8/NhI7ULoTe1/2Mf8a MIME-Version: 1.0 X-Received: by 10.60.135.37 with SMTP id pp5mr21196811oeb.54.1408120813075; Fri, 15 Aug 2014 09:40:13 -0700 (PDT) Received: by 10.60.120.37 with HTTP; Fri, 15 Aug 2014 09:40:13 -0700 (PDT) In-Reply-To: <53EBC750.1050203@yandex-team.ru> References: <53EBC750.1050203@yandex-team.ru> Date: Fri, 15 Aug 2014 09:40:13 -0700 Message-ID: Subject: Re: [CFT] new tables for ipfw From: Michael Sierchio To: "freebsd-net@freebsd.org" , freebsd-ipfw Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Aug 2014 16:40:20 -0000

On Wed, Aug 13, 2014 at 1:15 PM, Alexander V. Chernikov wrote:
> I've been hacking ipfw for a while and It seems there is something ready to
> test/review in projects/ipfw branch.

Excellent work! Thanks so much.

Copious examples will be helpful - the ipfw man page is already too huge, and ipfw merits an entire guide on its own, with examples. I'm volunteering, but the new features will be - well, new.
- M From owner-freebsd-net@FreeBSD.ORG Sat Aug 16 19:20:57 2014 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E3823E78; Sat, 16 Aug 2014 19:20:57 +0000 (UTC) Received: from forward-corp1e.mail.yandex.net (forward-corp1e.mail.yandex.net [IPv6:2a02:6b8:0:202::10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "forwards.mail.yandex.net", Issuer "Certum Level IV CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8E07E2D3A; Sat, 16 Aug 2014 19:20:57 +0000 (UTC) Received: from smtpcorp4.mail.yandex.net (smtpcorp4.mail.yandex.net [95.108.252.2]) by forward-corp1e.mail.yandex.net (Yandex) with ESMTP id A87886404E5; Sat, 16 Aug 2014 23:20:53 +0400 (MSK) Received: from smtpcorp4.mail.yandex.net (localhost [127.0.0.1]) by smtpcorp4.mail.yandex.net (Yandex) with ESMTP id 5EB112C0535; Sat, 16 Aug 2014 23:20:53 +0400 (MSK) Received: from unknown (unknown [2a02:6b8:0:401:222:4dff:fe50:cd2f]) by smtpcorp4.mail.yandex.net (nwsmtp/Yandex) with ESMTPSA id VG8XbtLtep-KrIWwTTI; Sat, 16 Aug 2014 23:20:53 +0400 (using TLSv1 with cipher AES128-SHA (128/128 bits)) (Client certificate not present) X-Yandex-Uniq: 3f4a58df-5fdb-4710-8b56-59ffc8d03cfe DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1408216853; bh=RG+JAc5gu5mEGkQ+R4hFCLhCsWQfy5cZanfSPvAvqPI=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: Content-Type; b=kbL7l1RYWj76yjU6WtMCdysoJxQMFP6SN7lYFHGHUd6rLsM356mcpQHwqY6x8t/m2 2Snjd5ZZKIuPa5gyUvcPNBbt80EqVri48wbSr/GP/YdEJb6eg6eKphcO+c8kQ50JoM UVouQKwGDgNT/uTewDo9D9jnId9iyI6wBXahvaO4= Authentication-Results: smtpcorp4.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Message-ID: <53EFAF0B.6060301@yandex-team.ru> Date: Sat, 16 Aug 2014 23:20:43 +0400 From: "Alexander V. 
Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: jfv@FreeBSD.org Subject: ixgbe i2c interface Content-Type: multipart/mixed; boundary="------------040509000500010406020307" Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 16 Aug 2014 19:20:58 -0000

This is a multi-part message in MIME format.
--------------040509000500010406020307
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hello Jack!

Can you please commit (or let me commit) the following one-liner?

--------------040509000500010406020307
Content-Type: text/x-patch; name="ixgbe_i2c.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="ixgbe_i2c.diff"

Index: sys/dev/ixgbe/ixgbe.c
===================================================================
--- sys/dev/ixgbe/ixgbe.c (revision 270040)
+++ sys/dev/ixgbe/ixgbe.c (working copy)
@@ -1055,7 +1055,7 @@ ixgbe_ioctl(struct ifnet * ifp, u_long command, ca
 		error = copyin(ifr->ifr_data, &i2c, sizeof(i2c));
 		if (error)
 			break;
-		if ((i2c.dev_addr != 0xA0) || (i2c.dev_addr != 0xA2)){
+		if ((i2c.dev_addr != 0xA0) && (i2c.dev_addr != 0xA2)){
 			error = EINVAL;
 			break;
 		}

--------------040509000500010406020307--
From owner-freebsd-net@FreeBSD.ORG Sat Aug 16 22:25:19 2014 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id F3776738; Sat, 16 Aug 2014 22:25:18 +0000 (UTC) Received: from mail.ipfw.ru (mail.ipfw.ru [IPv6:2a01:4f8:120:6141::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix)
with ESMTPS id B87782409; Sat, 16 Aug 2014 22:25:18 +0000 (UTC) Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net) by mail.ipfw.ru with esmtpsa (TLSv1:DHE-RSA-AES128-SHA:128) (Exim 4.82 (FreeBSD)) (envelope-from ) id 1XIiRv-0008r2-WA; Sat, 16 Aug 2014 22:11:32 +0400 Message-ID: <53EFDA3C.3010008@FreeBSD.org> Date: Sun, 17 Aug 2014 02:25:00 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: hackers@freebsd.org, "net@freebsd.org" Subject: SIOCGI2C ioctl for NIC drivers X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: Navdeep Parhar X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 16 Aug 2014 22:25:19 -0000

Hello list.

It seems that networking is evolving quite rapidly, so 10G NICs are quite common: we have Intel, Chelsio, Mellanox, Emulex, Solarflare and Myricom drivers in our tree (maybe some others). 40G NICs are also here (Chelsio, Mellanox, Intel). Things like 25G NICs are also getting more interest.

Most of them use SFP+/QSFP+ (either for short range optics or passive/active twinax cabling) and we can improve diagnostics here by providing a standard way to request i2c data (like vendor info and signal levels) from transceivers.

Chelsio and Intel drivers already provide methods to retrieve that information, but in different ways.

I'd like to add SIOCGI2C as a standard ioctl for that, picking a value like 61, if there are no objections.
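
A rough C sketch of how a userland tool might use such a standard request, assuming a small fixed-size request that carries a device address, an offset and a length. The struct layout, the 8-byte data size, the 256-byte page limit and every name below (struct i2c_req, i2c_read, fake_xfer) are assumptions for illustration only, not a committed interface; the transfer callback stands in for the actual ioctl call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request layout; not taken from any committed header. */
struct i2c_req {
	uint8_t	dev_addr;	/* i2c device address, e.g. 0xA0 or 0xA2 */
	uint8_t	offset;		/* first byte to read */
	uint8_t	len;		/* bytes to read, at most sizeof(data) */
	uint8_t	data[8];	/* filled in by the driver */
};

/* Stand-in for the ioctl call; returns 0 on success. */
typedef int (*i2c_xfer_fn)(void *ctx, struct i2c_req *req);

/* Read len bytes starting at offset by issuing small requests. */
static int
i2c_read(i2c_xfer_fn xfer, void *ctx, uint8_t dev_addr, uint8_t offset,
    uint8_t *buf, size_t len)
{
	struct i2c_req req;
	size_t done = 0;

	if ((size_t)offset + len > 256)
		return (-1);
	while (done < len) {
		size_t chunk = len - done;

		if (chunk > sizeof(req.data))
			chunk = sizeof(req.data);
		memset(&req, 0, sizeof(req));
		req.dev_addr = dev_addr;
		req.offset = (uint8_t)(offset + done);
		req.len = (uint8_t)chunk;
		if (xfer(ctx, &req) != 0)
			return (-1);
		memcpy(buf + done, req.data, chunk);
		done += chunk;
	}
	return (0);
}

/* Fake backend for experiments: ctx points at a 256-byte EEPROM image. */
static int
fake_xfer(void *ctx, struct i2c_req *req)
{
	const uint8_t *eeprom = ctx;

	if (req->len > sizeof(req->data) || req->offset + req->len > 256)
		return (-1);
	memcpy(req->data, eeprom + req->offset, req->len);
	return (0);
}
```

Keeping the per-request payload small means longer fields have to be fetched in several round trips, which the loop in i2c_read hides from the caller.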
From owner-freebsd-net@FreeBSD.ORG Sat Aug 16 22:27:20 2014 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B8CB1A15; Sat, 16 Aug 2014 22:27:20 +0000 (UTC) Received: from mail-pa0-x235.google.com (mail-pa0-x235.google.com [IPv6:2607:f8b0:400e:c03::235]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8B9E52433; Sat, 16 Aug 2014 22:27:20 +0000 (UTC) Received: by mail-pa0-f53.google.com with SMTP id rd3so5431502pab.12 for ; Sat, 16 Aug 2014 15:27:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=d8syMA+f8QKXD8sF05iiD9polPw8Ij5GJv3zTvtlBWE=; b=T7Sj4tjk6PYpeSpu42maU9nzpaqfG3gFT5Obbv3HhdUX99nRCvUxneCeNelh78DaVJ j4YYGcsr/lmuVLk0/c7qZ6/7+rHIbSZqGVcNh7OZnte1yKsup/790oA+AOq0mrzpmqiJ 0vrqCY+5rMVUo3b3CbN/fI4tkZR+b5dMeEOk2+gN+UdhPFmLBC6w3hMp70/f1mHV0iEM GEAMDG0iUl+af6ahx+ldbeDin876ioTe8ZAH27dMFh128IOBm349ZE6I4MMDiS4T5dMw GlJ7MfZjhKSKlbdigm4zFC41zb7/3ClpNlcY4StG285/3Mj1eZ4z06nGuuxc6IXC2IsZ 51JQ== MIME-Version: 1.0 X-Received: by 10.68.215.106 with SMTP id oh10mr23137240pbc.98.1408228040185; Sat, 16 Aug 2014 15:27:20 -0700 (PDT) Received: by 10.70.83.35 with HTTP; Sat, 16 Aug 2014 15:27:19 -0700 (PDT) Received: by 10.70.83.35 with HTTP; Sat, 16 Aug 2014 15:27:19 -0700 (PDT) In-Reply-To: <53EFAF0B.6060301@yandex-team.ru> References: <53EFAF0B.6060301@yandex-team.ru> Date: Sat, 16 Aug 2014 15:27:19 -0700 Message-ID: Subject: Re: ixgbe i2c interface From: Eric Joyner To: "Alexander V. 
Chernikov" Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: Jack F Vogel , FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 16 Aug 2014 22:27:20 -0000 I unofficially approve of it! --- Eric Joyner On Aug 16, 2014 12:21 PM, "Alexander V. Chernikov" wrote: > Hello Jack! > > Can you please commit (or let me commit) the following one-liner? > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >
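
For reference, a small C sketch of why the one-liner in this thread matters. With `||` the check is true for every address (no value can equal both 0xA0 and 0xA2), so every request was rejected with EINVAL; with `&&` only addresses other than 0xA0 and 0xA2 are rejected. The two helper names are made up for this illustration and do not appear in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Condition before the patch: true for every possible dev_addr. */
static bool
rejected_before(uint8_t dev_addr)
{
	return ((dev_addr != 0xA0) || (dev_addr != 0xA2));
}

/* Condition after the patch: true only for addresses other than 0xA0/0xA2. */
static bool
rejected_after(uint8_t dev_addr)
{
	return ((dev_addr != 0xA0) && (dev_addr != 0xA2));
}
```

By De Morgan's law the patched condition is the negation of "dev_addr is 0xA0 or dev_addr is 0xA2", which is the intended acceptance rule.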