From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 02:23:45 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 2429EFCC;
 Mon,  2 Dec 2013 02:23:45 +0000 (UTC)
Received: from mail-pb0-x22b.google.com (mail-pb0-x22b.google.com
 [IPv6:2607:f8b0:400e:c01::22b])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id E5AB01689;
 Mon,  2 Dec 2013 02:23:44 +0000 (UTC)
Received: by mail-pb0-f43.google.com with SMTP id rq2so17776899pbb.2
 for <multiple recipients>; Sun, 01 Dec 2013 18:23:44 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=from:date:to:cc:subject:message-id:reply-to:references:mime-version
 :content-type:content-disposition:in-reply-to:user-agent;
 bh=BMrc5f1xZFo5JoUpPS3cIBbLXZ67fmwF0iYqyuIaQqw=;
 b=mX48HI9a1BB+DOE8EnNjBiYMXkoJ/W5SoeBkSpc6FdTW5EL+X95pNrgvg4SD+nOqQc
 SQUfxRUMgdD3EXm3tmApQMaygLRFG53IZVPClFN+F3I4jAkKUnIp4X8h81d+EYlum3FI
 dMKDFY6Gw4nycDx8igeTAUxIE73s0XsXJcO3JJGm+52foR9vERmkoKA2OkFuARzEaqwJ
 b6jNRxL4Kx1fNbAOfdBsKi/J8IJnoEgrhML0Cu+PbJiLbxno7gnGKuNR5cS9EzvA6wGi
 WwBVppcldCwoGfGdV+0mPvUEfaSw4pcOlxFtMvOpk5b5R56oSUNT78VAijeYkkwfXNti
 ZWTA==
X-Received: by 10.68.102.133 with SMTP id fo5mr453880pbb.175.1385951024505;
 Sun, 01 Dec 2013 18:23:44 -0800 (PST)
Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249])
 by mx.google.com with ESMTPSA id ki1sm118525907pbd.1.2013.12.01.18.23.41
 for <multiple recipients>
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Sun, 01 Dec 2013 18:23:43 -0800 (PST)
Received: by pyunyh@gmail.com (sSMTP sendmail emulation);
 Mon, 02 Dec 2013 11:23:38 +0900
From: Yonghyeon PYUN <pyunyh@gmail.com>
Date: Mon, 2 Dec 2013 11:23:38 +0900
To: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
Message-ID: <20131202022338.GA3500@michelle.cdnetworks.com>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
User-Agent: Mutt/1.4.2.3i
Cc: Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.16
Precedence: list
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 02:23:45 -0000

On Fri, Nov 29, 2013 at 06:24:12PM +0100, Michael Tuexen wrote:
> Dear all,
> 
> ifnet(9) says regarding if_transmit():
> 
> Transmit a packet on an interface or queue it if the interface is
> in use.  This function will return ENOBUFS if the devices software
> and hardware queues are both full.
> 
> The drivers for em, igb and ixgbe might also return an error even
> in the case the packet was enqueued. The attached patches fix this
> issue.

How do you know the packet was successfully enqueued even though the
driver returned an error?  Do non-buf-ring-aware drivers also show the
same behavior?

> 
> Any comments?

I'm afraid the patch you posted ignores any errors (e.g. from
m_defrag(9), bus_dma(9), etc.) that happen during TX processing.

> 
> Jack: What do you think? Would you prefer to commit the fix if
> you think it is acceptable?
> 
> Best regards
> Michael

From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 04:17:30 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 2BD69A23;
 Mon,  2 Dec 2013 04:17:30 +0000 (UTC)
Received: from mail-la0-x22a.google.com (mail-la0-x22a.google.com
 [IPv6:2a00:1450:4010:c03::22a])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 0E5EF6EDA;
 Mon,  2 Dec 2013 04:17:28 +0000 (UTC)
Received: by mail-la0-f42.google.com with SMTP id ec20so8179546lab.29
 for <multiple recipients>; Sun, 01 Dec 2013 20:17:27 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=TBydLbV1oTwW4mrWjkravv0PuL225Hkj4WyuqT2q1D0=;
 b=METhUSh9DV/HufXjpgp5NtqgQhsJ0crKGT71Rx4NktT0m70z4FO/lrOQHli+F58BJ/
 hlFw+CljShATBYr88RCpBhD73IrbS+O1kgnvwyvKoQuxAYxD/U4XEJRTLYtCynMco1im
 WzusAUumTtiQHdiGsAcH+wzfFksOkxOYGNUYXio2b1H/8qnWdAJnJPDPeDCniZSqUBVp
 JqZ7k/LpOzhcQS/PNjGYqJHvut4uuZFDcy7Ye0iLncn40oFoepZfdYdGUUMg2Aigf10M
 fWNAw0iUlaieokLs29melEPlqA9CvfXynqNVfILjYWHPFbvIoAcRLfzFt44G8eDtv8p4
 Oagw==
MIME-Version: 1.0
X-Received: by 10.112.29.147 with SMTP id k19mr42224808lbh.9.1385957846897;
 Sun, 01 Dec 2013 20:17:26 -0800 (PST)
Received: by 10.114.166.163 with HTTP; Sun, 1 Dec 2013 20:17:26 -0800 (PST)
In-Reply-To: <CAPBZQG0=bcHyv7aZse=WKfjk5=6D2-+6EQHiAaDZqGtaodhMMA@mail.gmail.com>
References: <CAPBZQG29BEJJ8BK=gn+g_n5o7JSnPbsKQ-=3=6AkFOxzt+=wGQ@mail.gmail.com>
 <4053E074-EDC5-49AB-91A7-E50ABE36602E@freebsd.org>
 <CALDtMrKvwXW-ou8X7zsKx2ST=dKD7FqHvvnQtGo30znTWU+VQQ@mail.gmail.com>
 <CAPBZQG0=bcHyv7aZse=WKfjk5=6D2-+6EQHiAaDZqGtaodhMMA@mail.gmail.com>
Date: Mon, 2 Dec 2013 12:17:26 +0800
Message-ID: <CAMOc5cwFGwk0dS5VT-YxfP3Yt38R8aO-KJTX6W832uOFEdavgA@mail.gmail.com>
Subject: Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
From: Sepherosa Ziehau <sepherosa@gmail.com>
To: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.16
Cc: freebsd-net <freebsd-net@freebsd.org>,
 Oleg Moskalenko <mom040267@gmail.com>, Tim Kientzle <kientzle@freebsd.org>,
 "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.16
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 04:17:30 -0000

On Sat, Nov 30, 2013 at 2:42 AM, Ermal Luçi <eri@freebsd.org> wrote:

> Well seems Dragonfly has some version of it already from commit [1].
>
>
The distribution algorithm was changed a little bit after initial commit to
gain more idle time (bnx(4) output has already been maxed out):
http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/c275f18d832361be28b150d3f4fd518914bdeba6

Well, I also addressed a reasonable concern from the nginx folks (I am not
quite sure about Linux's position on it; Linux's original implementation of
SO_REUSEPORT from Google had this drawback, which I mentioned in the commit
message):
http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/02ad2f0b874fb0a45eb69750219f79f5e8982272

As for nginx, the SO_REUSEPORT patch for nginx (both 1.4.x and 1.5.x) is in
dports, so it should be easy to backport to FreeBSD's ports.  I failed to
convince the nginx folks to merge it into mainline, and I am currently busy
with other things; I will come back to it later.  If FreeBSD is going to
implement Linux's style of SO_REUSEPORT, pushing the patch to the nginx
mainline will be easier.

I also put up a brief description of SO_REUSEPORT in dfly; may be useful to
you:
http://leaf.dragonflybsd.org/~sephe/netisr_so_reuseport.txt

Best Regards,
sephe


> In FreeBSD there is a framework for this, enabled by defining PCBGROUP.
> It is also explained at [2] and [3].
> It can achieve approximately the same features as Linux's SO_REUSEPORT.
> The only things missing are the marketing behind it and, I think, better
> RSS support.
> By looking at dates the support is there before linux so all you guys
> looking for it can experiment with it.
>
> What i was trying to accomplish was something else from performance
> improvement and
> maybe put a sysctl behind it to make it more acceptable..
>
> [1]
> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/740d1d9f7b7bf9c9c021abb8197718d7a2d441c9
> [2]
> http://fxr.watson.org/fxr/source/netinet/in_pcbgroup.c?im=bigexcerpts#L51
> [3] http://lists.freebsd.org/pipermail/svn-src-head/2011-June/028190.html
>
>
> On Fri, Nov 29, 2013 at 7:03 PM, Oleg Moskalenko <mom040267@gmail.com> wrote:
>
> > Tim, you are wrong. Read the definition of "multicast", and read how UDP
> > and TCP sockets work in Linux 3.9+ kernels.
> >
> > Oleg .
> >
> >
> > On Fri, Nov 29, 2013 at 9:59 AM, Tim Kientzle <kientzle@freebsd.org> wrote:
> >
> >>
> >> On Nov 29, 2013, at 4:04 AM, Ermal Luçi <eri@freebsd.org> wrote:
> >>
> >> > Hello,
> >> >
> >> > since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
> >> > share the same port and possibly listening ip …
> >>
> >> These flags are used with TCP-based servers.
> >>
> >> I’ve used them to make software upgrades go more smoothly.
> >> Without them, the following often happens:
> >>
> >> * Old server stops.  In the process, all of its TCP connections are
> >> closed.
> >>
> >> * Connections to old server remain in the TCP connection table until the
> >> remote end can acknowledge.
> >>
> >> * New server starts.
> >>
> >> * New server tries to open port but fails because that port is “still in
> >> use” by connections in the TCP connection table.
> >>
> >> With these flags, the new server can open the port even though
> >> it is “still in use” by existing connections.
> >>
> >>
> >> > This is not the case today.
> >> > Only multicast sockets seem to have the behaviour of broadcasting the data
> >> > to all sockets sharing the same properties through these options!
> >>
> >> That is what multicast is for.
> >>
> >> If you want the same data sent to all listeners, then
> >> that is multicast behavior and you should be using
> >> a multicast socket.
> >>
> >> > The patch at [1] implements/corrects the behaviour for UDP sockets.
> >>
> >> You’re trying to turn all UDP sockets with those options
> >> into multicast sockets.
> >>
> >> If you want a multicast socket, you should ask for one.
> >>
> >> Tim
> >>
> >> _______________________________________________
> >> freebsd-net@freebsd.org mailing list
> >> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> >>
> >
> >
>
>
> --
> Ermal
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>



-- 
Tomorrow Will Never Die

From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 04:29:25 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id A1A36C2B;
 Mon,  2 Dec 2013 04:29:25 +0000 (UTC)
Received: from mail-pd0-x22d.google.com (mail-pd0-x22d.google.com
 [IPv6:2607:f8b0:400e:c02::22d])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 6026E6F4D;
 Mon,  2 Dec 2013 04:29:25 +0000 (UTC)
Received: by mail-pd0-f173.google.com with SMTP id p10so17221296pdj.4
 for <multiple recipients>; Sun, 01 Dec 2013 20:29:24 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=NaCGDx3HFUa3c+mKizZv7n3HQlfVnUN6VxQjv0+vs+c=;
 b=aZgFITjM/m7qc1pCJvCyjNfiGheQPjkmIs8FIqZnwk1kZcX7CBxNFO2a2n6sKbUcsq
 +IAiOMwwnJLZcPEq+ZveIJjTvzIgmUsDBFEkzsoSr2lLC9DIfqOT8CTLh+QAqNrF/4Bp
 +DR0Yus61LH+pLD9Jf5Esvg5BqDtEQ9WOI6tMls3Cf5lDLR2xjdzkMA36QcO7SQp3Mwp
 8cJwI2+yuwGz80so1gUfpyhUy5ZngyGlAmnxrVqTlJUdJqK2Okk8T49Sqh+3egAN6BYn
 XwilHYr6hV4GKsqaN6Te+/kjRDd1VZ/c48kEhNwS03D/RriFq0MjCKWPhTs+JKXynj9x
 77zg==
MIME-Version: 1.0
X-Received: by 10.68.254.164 with SMTP id aj4mr1231772pbd.161.1385958564133;
 Sun, 01 Dec 2013 20:29:24 -0800 (PST)
Received: by 10.68.147.131 with HTTP; Sun, 1 Dec 2013 20:29:24 -0800 (PST)
In-Reply-To: <CAMOc5cwFGwk0dS5VT-YxfP3Yt38R8aO-KJTX6W832uOFEdavgA@mail.gmail.com>
References: <CAPBZQG29BEJJ8BK=gn+g_n5o7JSnPbsKQ-=3=6AkFOxzt+=wGQ@mail.gmail.com>
 <4053E074-EDC5-49AB-91A7-E50ABE36602E@freebsd.org>
 <CALDtMrKvwXW-ou8X7zsKx2ST=dKD7FqHvvnQtGo30znTWU+VQQ@mail.gmail.com>
 <CAPBZQG0=bcHyv7aZse=WKfjk5=6D2-+6EQHiAaDZqGtaodhMMA@mail.gmail.com>
 <CAMOc5cwFGwk0dS5VT-YxfP3Yt38R8aO-KJTX6W832uOFEdavgA@mail.gmail.com>
Date: Sun, 1 Dec 2013 20:29:24 -0800
Message-ID: <CALDtMrLgm-D30u8HWWF=sVda0h4QtYdyiGHpYPw1kfTWbMbJ6Q@mail.gmail.com>
Subject: Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
From: Oleg Moskalenko <mom040267@gmail.com>
To: Sepherosa Ziehau <sepherosa@gmail.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.16
Cc: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>,
 freebsd-net <freebsd-net@freebsd.org>, Tim Kientzle <kientzle@freebsd.org>,
 "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.16
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 04:29:25 -0000

Sepherosa, while reading your description I noticed another long-standing
problem for UDP application developers: UDP sockets are always hashed
by 2-tuple. But UDP sockets can also be "connected" to a remote address
with connect(2). Unfortunately, with 2-tuple hashing that pattern is
useless for large-scale applications: if a large number of UDP sockets on
the same local port are "connected" to remote addresses, the kernel has to
walk the long list of UDP sockets that share the same hash value.

If connected UDP sockets were hashed by 4-tuple, it would be very
helpful for the new generation of UDP-based media applications. For
example, servers that use the DTLS protocol would become simpler and more
efficient.
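
To make the chain-length argument concrete, here is a toy user-space
model. It is not kernel code: the struct udp_ep layout, the XOR-based
hash2()/hash4() functions and the 256-bucket table are all assumptions
chosen for illustration, and real PCB hashing differs in detail.

```c
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 256	/* power of two, so "& (NBUCKETS - 1)" is the modulus */

/* One UDP endpoint pair; all values in host byte order for simplicity. */
struct udp_ep {
	uint32_t laddr, faddr;	/* local / foreign IPv4 address */
	uint16_t lport, fport;	/* local / foreign port */
};

/* 2-tuple hash: only the local address and port pick the bucket. */
unsigned
hash2(const struct udp_ep *e)
{
	return ((e->laddr ^ e->lport) & (NBUCKETS - 1));
}

/* 4-tuple hash: the foreign address and port are mixed in as well. */
unsigned
hash4(const struct udp_ep *e)
{
	return ((e->laddr ^ e->lport ^ e->faddr ^ e->fport) & (NBUCKETS - 1));
}

/*
 * Length of the longest hash chain if the n endpoints in eps[] were
 * inserted into an NBUCKETS-bucket table using the given hash function.
 */
size_t
longest_chain(const struct udp_ep *eps, size_t n,
    unsigned (*hash)(const struct udp_ep *))
{
	size_t count[NBUCKETS] = { 0 };
	size_t i, max = 0;

	for (i = 0; i < n; i++) {
		size_t c = ++count[hash(&eps[i])];
		if (c > max)
			max = c;
	}
	return (max);
}
```

In this model, 1024 sockets that share one local address and port and are
connected to foreign ports 1024..2047 of a single peer all land in one
chain of length 1024 under hash2(), while hash4() spreads them into chains
of length 4.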

Thanks
Oleg




From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 05:02:53 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 83F38D5;
 Mon,  2 Dec 2013 05:02:53 +0000 (UTC)
Received: from mail-qe0-x22d.google.com (mail-qe0-x22d.google.com
 [IPv6:2607:f8b0:400d:c02::22d])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 0AA3310B2;
 Mon,  2 Dec 2013 05:02:52 +0000 (UTC)
Received: by mail-qe0-f45.google.com with SMTP id 6so12836248qea.18
 for <multiple recipients>; Sun, 01 Dec 2013 21:02:52 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date:message-id:subject
 :from:to:cc:content-type;
 bh=5OZpeEVxX+5txyZVFMFemU3TsA/C5ipw0qvzHw7mRw0=;
 b=duNlyUeNGj9wLEaO6kiuT7UiTZfWp6NYk39dDrNzlQJe8LN/AyQsxKWZTd2Chwx/LZ
 0hM6LUGOZ3t2ShHzstqgOwQLuPCWYeUIYLrF6wE1H6qTq6PiaMzSjxpB0n6VpfufrVML
 P7M7EoXNUMk2KmtBuMBzvI/oUDYOQBGJwLBV3ti0KucA6n8k5vrUB7hZ6G3/IijLWN89
 4jIisZdhfjacqpPwU5YBEcX60/18kNIyETFOr6jfI0r9tT/LMxcTNBDpbofzThpAkxA9
 RZ0F5LQpjmRAiy4Tv23GV5VDUIY76SWcx0e4AhFvByMwpfrlOE1LCPTICGiJO8rjhHoL
 vAtA==
MIME-Version: 1.0
X-Received: by 10.229.122.195 with SMTP id m3mr109680144qcr.7.1385960572142;
 Sun, 01 Dec 2013 21:02:52 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.224.53.200 with HTTP; Sun, 1 Dec 2013 21:02:52 -0800 (PST)
In-Reply-To: <CAMOc5cwFGwk0dS5VT-YxfP3Yt38R8aO-KJTX6W832uOFEdavgA@mail.gmail.com>
References: <CAPBZQG29BEJJ8BK=gn+g_n5o7JSnPbsKQ-=3=6AkFOxzt+=wGQ@mail.gmail.com>
 <4053E074-EDC5-49AB-91A7-E50ABE36602E@freebsd.org>
 <CALDtMrKvwXW-ou8X7zsKx2ST=dKD7FqHvvnQtGo30znTWU+VQQ@mail.gmail.com>
 <CAPBZQG0=bcHyv7aZse=WKfjk5=6D2-+6EQHiAaDZqGtaodhMMA@mail.gmail.com>
 <CAMOc5cwFGwk0dS5VT-YxfP3Yt38R8aO-KJTX6W832uOFEdavgA@mail.gmail.com>
Date: Sun, 1 Dec 2013 21:02:52 -0800
X-Google-Sender-Auth: 1Fcg6ebyMlcDG_508SdPJ3VO2Qc
Message-ID: <CAJ-Vmonc7SVxndmVN1jphFRa5svD5BdnMrCudSbYkx4djHXW0A@mail.gmail.com>
Subject: Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
From: Adrian Chadd <adrian@freebsd.org>
To: Sepherosa Ziehau <sepherosa@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>,
 freebsd-net <freebsd-net@freebsd.org>, Oleg Moskalenko <mom040267@gmail.com>,
 Tim Kientzle <kientzle@freebsd.org>,
 "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.16
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 05:02:53 -0000

Hi! Thanks for the writeup!

On 1 December 2013 20:17, Sepherosa Ziehau <sepherosa@gmail.com> wrote:

> I also put up a brief description of SO_REUSEPORT in dfly; may be useful to
> you:
> http://leaf.dragonflybsd.org/~sephe/netisr_so_reuseport.txt

Ok, so given this, how do you guarantee the UTHREAD stays on the given
CPU? You assume it stays on the CPU that the initial listen socket was
created on, right? If it's migrated to another CPU core then the
listen queue still stays in the original hash group that's in a netisr
on a different CPU?


-adrian

From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 08:02:19 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 02EC9FEC
 for <net@freebsd.org>; Mon,  2 Dec 2013 08:02:19 +0000 (UTC)
Received: from shell0.rawbw.com (shell0.rawbw.com [198.144.192.45])
 by mx1.freebsd.org (Postfix) with ESMTP id E516D197B
 for <net@freebsd.org>; Mon,  2 Dec 2013 08:02:18 +0000 (UTC)
Received: from eagle.yuri.org (stunnel@localhost [127.0.0.1])
 (authenticated bits=0)
 by shell0.rawbw.com (8.14.4/8.14.4) with ESMTP id rB282ClP064622
 for <net@freebsd.org>; Mon, 2 Dec 2013 00:02:12 -0800 (PST)
 (envelope-from yuri@rawbw.com)
Message-ID: <529C3E84.1030203@rawbw.com>
Date: Mon, 02 Dec 2013 00:02:12 -0800
From: Yuri <yuri@rawbw.com>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:24.0) Gecko/20100101 Thunderbird/24.1.0
MIME-Version: 1.0
To: net@freebsd.org
Subject: DIOCNATLOOK fails with ipfw
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.16
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 08:02:19 -0000

I have an app with a transparent proxy that should intercept all TCP
connections on the interface.
This is done with an ipfw(8) rule like this:
  ipfw add 200 fwd 192.168.10.1,15020 tcp from 192.168.10.0/24 to any 80 keep-state
The transparent proxy is on 192.168.10.1:15020

The proxy accepts the connections; however, it uses /dev/pf to get the
original destination, and the lookup fails:
  ioctl(DIOCNATLOOK) failed: No such file or directory
It fails because nobody ever calls pf_state_insert. I see from the
source that the ioctl to add the pf_state is DIOCSTART, which is issued by
pfctl(8), but I am not using pfctl(8) at all.

My questions are:
What is the relationship between ipfw(8) and pfctl(8)? Do they do the
same thing? Why are there two of them?
If I only use ipfw, is there a way for the acceptor to find out what the
original destination was without /dev/pf?

Yuri

From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 08:42:48 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 39C4064C
 for <net@freebsd.org>; Mon,  2 Dec 2013 08:42:48 +0000 (UTC)
Received: from mail-wg0-x232.google.com (mail-wg0-x232.google.com
 [IPv6:2a00:1450:400c:c00::232])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id C60041CA1
 for <net@freebsd.org>; Mon,  2 Dec 2013 08:42:47 +0000 (UTC)
Received: by mail-wg0-f50.google.com with SMTP id a1so9778709wgh.29
 for <net@freebsd.org>; Mon, 02 Dec 2013 00:42:46 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=message-id:date:from:user-agent:mime-version:to:cc:subject
 :references:in-reply-to:content-type:content-transfer-encoding;
 bh=29N7aGykL6mBI68fR+uUMezOOpxDVCjVOSFL746EjwY=;
 b=c2Hetn9MISeREG2rah0oje8Ui/Uhbep+RjLwk9Mqy+3ml6JPyVjrGK8I92Ity+d3YR
 NEoKDmoBICUfomXd0opG/clhHERclRC6UYQjFX0vZh3QtAom/XregCw7WNGYd8D3JUcd
 M4VG0qexg/DvGs6MUyORW3qk1UU8Fxrht5HWIDPxr03KHHBnvJKVBU30ctapL0XOYfGd
 U3UsqrtO31h93h+5fVE1ynuxkkAMmAI5m6aymdHo6Vdm8K4aua+nNNXEkHVn3CqyP9dJ
 gnxQVghTu81tLBL6DhblwPLZsz7z3DmVEjdGLHkfVXNPp4Wr3riczV01A5AtRb8tB/qq
 rUNA==
X-Received: by 10.180.89.68 with SMTP id bm4mr17179585wib.0.1385973766311;
 Mon, 02 Dec 2013 00:42:46 -0800 (PST)
Received: from [192.168.2.30] ([2.176.198.47])
 by mx.google.com with ESMTPSA id ll10sm120172426wic.9.2013.12.02.00.42.44
 for <multiple recipients>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Mon, 02 Dec 2013 00:42:45 -0800 (PST)
Message-ID: <529C4801.3010000@gmail.com>
Date: Mon, 02 Dec 2013 12:12:41 +0330
From: Hooman Fazaeli <hoomanfazaeli@gmail.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/20130215 Thunderbird/17.0.3
MIME-Version: 1.0
To: Yuri <yuri@rawbw.com>
Subject: Re: DIOCNATLOOK fails with ipfw
References: <529C3E84.1030203@rawbw.com>
In-Reply-To: <529C3E84.1030203@rawbw.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.16
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 08:42:48 -0000

On 12/2/2013 11:32 AM, Yuri wrote:
> I have an app with transparent proxy that should intercept all TCP connections in the interface.
> This is done with ipfw(8) rule like this:
> ipfw add 200 fwd 192.168.10.1,15020 tcp from 192.168.10.0/24 to any 80 keep-state
> Transparent proxy is on 192.168.10.1:15020
>
> Proxy accepts the connections, however, it is using /dev/pf to get the original destination and the lookup procedure fails:
> ioctl(DIOCNATLOOK) failed: No such file or directory
> It fails because nobody ever calls pf_state_insert. I see from the source that ioctl to add the pf_state is DIOCSTART, which is issued by pfctl(8), but I am not using pfctl(8) at all.
>
> My questions are:
> What is the relationship between ipfw(8) and pfctl(8)? Do they do the same? Why two of them?
> If I only use ipfw, is there a way for the acceptor to find what the original destination was without /dev/pf?
>
> Yuri
ipfw and pf are two completely separate firewalls. You cannot use /dev/pf to control or query ipfw.
Use getsockname(2) to find out the original destination address with ipfw.
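
A minimal sketch of that lookup for an IPv4 TCP socket returned by
accept(2). The helper name local_endpoint() is hypothetical. It relies on
the assumption stated above: that with an ipfw fwd rule the accepted
socket's local address is the destination the client originally connected
to.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fetch the local address and port of a connected
 * IPv4 socket, e.g. one returned by accept(2).  `ip` must have room for
 * INET_ADDRSTRLEN bytes.  Returns 0 on success, -1 on error.
 */
int
local_endpoint(int fd, char *ip, socklen_t iplen, uint16_t *port)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);

	/* Ask the kernel which local address this connection is bound to. */
	if (getsockname(fd, (struct sockaddr *)&sin, &len) < 0)
		return (-1);
	if (sin.sin_family != AF_INET)
		return (-1);
	/* Convert the binary address to dotted-quad text. */
	if (inet_ntop(AF_INET, &sin.sin_addr, ip, iplen) == NULL)
		return (-1);
	*port = ntohs(sin.sin_port);
	return (0);
}
```

In the proxy, this would be called on each descriptor returned by
accept(2) in place of the DIOCNATLOOK ioctl.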


-- 

Best regards.
Hooman Fazaeli


From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 11:06:50 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 7F151B7B
 for <freebsd-net@FreeBSD.org>; Mon,  2 Dec 2013 11:06:50 +0000 (UTC)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 68EB51954
 for <freebsd-net@FreeBSD.org>; Mon,  2 Dec 2013 11:06:50 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id rB2B6oJn007798
 for <freebsd-net@FreeBSD.org>; Mon, 2 Dec 2013 11:06:50 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
 by freefall.freebsd.org (8.14.7/8.14.7/Submit) id rB2B6nZJ007796
 for freebsd-net@FreeBSD.org; Mon, 2 Dec 2013 11:06:49 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 2 Dec 2013 11:06:49 GMT
Message-Id: <201312021106.rB2B6nZJ007796@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
 owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@freebsd.org>
To: freebsd-net@FreeBSD.org
Subject: Current problem reports assigned to freebsd-net@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 11:06:50 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/184311  net        [bge] [panic] kernel panic with bge(4) on SunFire X210
o kern/184084  net        [ral] kernel crash by ral (RT3090)
o bin/183687   net        [patch] route(8): route add -net 172.20 add wrong host
o kern/183659  net        [tcp] TCP stack lock contention with short-lived conn
o conf/183407  net        [rc.d] [patch] Routing restart returns non-zero exitco
o kern/183391  net        [oce] 10gigabit networking problems with Emulex OCE 11
o kern/183390  net        [ixgbe] 10gigabit networking problems
o kern/182917  net        [igb] strange out traffic with igb interfaces
o kern/182847  net        [netinet6] [patch] Remove dead code
o kern/182665  net        [wlan] Kernel panic when creating second wlandev.
o kern/182382  net        [tcp] sysctl to set TCP CC method on BIG ENDIAN system
o kern/182297  net        [cm] ArcNet driver fails to detect the link address - 
o kern/182212  net        [patch] [ng_mppc] ng_mppc(4) blocks on network errors 
o kern/181970  net        [re] LAN Realtek 8111G is not supported by re driver
o kern/181931  net        [vlan] [lagg] vlan over lagg over mlxen crashes the ke
o kern/181823  net        [ip6] [patch] make ipv6 mroute return same errror code
o kern/181741  net        [kernel] [patch] Packet loss when 'control' messages a
o kern/181703  net        [re] [patch] Fix Realtek 8111G Ethernet controller not
o kern/181657  net        [bpf] [patch] BPF_COP/BPF_COPX instruction reservation
o kern/181257  net        [bge] bge link status change
o kern/181236  net        [igb] igb driver unstable work
o kern/181135  net        [netmap] [patch] sys/dev/netmap patch for Linux compat
o kern/181131  net        [netmap] [patch] sys/dev/netmap memory allocation impr
o kern/181006  net        [run] [patch] mbuf leak in run(4) driver
o kern/180893  net        [if_ethersubr] [patch] Packets received with own LLADD
o kern/180844  net        [panic] [re] Intermittent panic (re driver?)
o kern/180775  net        [bxe] if_bxe driver broken with Broadcom BCM57711 card
o kern/180722  net        [bluetooth] bluetooth takes 30-50 attempts to pair to 
s kern/180468  net        [request] LOCAL_PEERCRED support for PF_INET
o kern/180065  net        [netinet6] [patch] Multicast loopback to own host brok
o kern/179926  net        [lacp] [patch] active aggregator selection bug
o kern/179824  net        [ixgbe] System (9.1-p4) hangs on heavy ixgbe network t
o kern/179733  net        [lagg] [patch] interface loses capabilities when proto
o kern/179429  net        [tap] STP enabled tap bridge
o kern/179299  net        [igb] Intel X540-T2 - unstable driver
a kern/179264  net        [vimage] [pf] Core dump with Packet filter and VIMAGE 
o kern/178947  net        [arp] arp rejecting not working
o kern/178782  net        [ixgbe] 82599EB SFP does not work with passthrough und
o kern/178612  net        [run] kernel panic due the problems with run driver
o kern/178472  net        [ip6] [patch] make return code consistent with IPv4 co
o kern/178079  net        [tcp] Switching TCP CC algorithm panics on sparc64 wit
s kern/178071  net        FreeBSD unable to recongize Kontron (Industrial Comput
o kern/177905  net        [xl] [panic] ifmedia_set when pluging CardBus LAN card
o kern/177618  net        [bridge] Problem with bridge firewall with trunk ports
o kern/177402  net        [igb] [pf] problem with ethernet driver igb + pf / alt
o kern/177400  net        [jme] JMC25x 1000baseT establishment issues
o kern/177366  net        [ieee80211] negative malloc(9) statistics for 80211nod
f kern/177362  net        [netinet] [patch] Wrong control used to return TOS
o kern/177194  net        [netgraph] Unnamed netgraph nodes for vlan interfaces
o kern/177184  net        [bge] [patch] enable wake on lan
o kern/177139  net        [igb] igb drops ethernet ports 2 and 3
o kern/176884  net        [re] re0 flapping up/down
o kern/176671  net        [epair] MAC address for epair device not unique
o kern/176484  net        [ipsec] [enc] [patch] panic: IPsec + enc(4); device na
o kern/176446  net        [netinet] [patch] Concurrency in ixgbe driving out-of-
o kern/176420  net        [kernel] [patch] incorrect errno for LOCAL_PEERCRED
o kern/176419  net        [kernel] [patch] socketpair support for LOCAL_PEERCRED
o kern/176401  net        [netgraph] page fault  in netgraph
o kern/176167  net        [ipsec][lagg] using lagg and ipsec causes immediate pa
o kern/176027  net        [em] [patch] flow control systcl consistency for em dr
o kern/176026  net        [tcp] [patch] TCP wrappers caused quite a lot of warni
o kern/175864  net        [re] Intel MB D510MO, onboard ethernet not working aft
o kern/175852  net        [amd64] [patch] in_cksum_hdr() behaves differently on 
o kern/175734  net        no ethernet detected on system with EG20T PCH chipset 
o kern/175267  net        [pf] [tap] pf + tap keep state problem
o kern/175236  net        [epair] [gif] epair and gif Devices On Bridge
o kern/175182  net        [panic] kernel panic on RADIX_MPATH when deleting rout
o kern/175153  net        [tcp] will there miss a FIN when do TSO?
o kern/174959  net        [net] [patch] rnh_walktree_from visits spurious nodes
o kern/174958  net        [net] [patch] rnh_walktree_from makes unreasonable ass
o kern/174897  net        [route] Interface routes are broken
o kern/174851  net        [bxe] [patch] UDP checksum offload is wrong in bxe dri
o kern/174850  net        [bxe] [patch] bxe driver does not receive multicasts
o kern/174849  net        [bxe] [patch] bxe driver can hang kernel when reset
o kern/174822  net        [tcp] Page fault in tcp_discardcb under high traffic
o kern/174602  net        [gif] [ipsec] traceroute issue on gif tunnel with ipse
o kern/174535  net        [tcp] TCP fast retransmit feature works strange
o kern/173871  net        [gif] process of 'ifconfig gif0 create hangs' when if_
o kern/173475  net        [tun] tun(4) stays opened by PID after process is term
o kern/173201  net        [ixgbe] [patch] Missing / broken ixgbe sysctl's and tu
o kern/173137  net        [em] em(4) unable to run at gigabit with 9.1-RC2
o kern/173002  net        [patch] data type size problem in if_spppsubr.c
o kern/172895  net        [ixgb] [ixgbe] do not properly determine link-state
o kern/172683  net        [ip6] Duplicate IPv6 Link Local Addresses
o kern/172675  net        [netinet] [patch] sysctl_tcp_hc_list (net.inet.tcp.hos
p kern/172113  net        [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4
o kern/171840  net        [ip6] IPv6 packets transmitting only on queue 0
o kern/171739  net        [bce] [panic] bce related kernel panic
o kern/171711  net        [dummynet] [panic] Kernel panic in dummynet
o kern/171532  net        [ndis] ndis(4) driver includes 'pccard'-specific code,
o kern/171531  net        [ndis] undocumented dependency for ndis(4)
o kern/171524  net        [ipmi] ipmi driver crashes kernel by reboot or shutdow
s kern/171508  net        [epair] [request] Add the ability to name epair device
o kern/171228  net        [re] [patch] if_re - eeprom write issues
o kern/170701  net        [ppp] killl ppp or reboot with active ppp connection c
o kern/170267  net        [ixgbe] IXGBE_LE32_TO_CPUS is probably an unintentiona
o kern/170081  net        [fxp] pf/nat/jails not working if checksum offloading 
o kern/169898  net        ifconfig(8) fails to set MTU on multiple interfaces.
o kern/169676  net        [bge] [hang] system hangs, fully or partially after re
o kern/169620  net        [ng] [pf] ng_l2tp incoming packet bypass pf firewall
o kern/169459  net        [ppp] umodem/ppp/3g stopped working after update from 
o kern/169438  net        [ipsec] ipv4-in-ipv6 tunnel mode IPsec does not work
p kern/168294  net        [ixgbe] [patch] ixgbe driver compiled in kernel has no
o kern/168246  net        [em] Multiple em(4) not working with qemu
o kern/168245  net        [arp] [regression] Permanent ARP entry not deleted on 
o kern/168244  net        [arp] [regression] Unable to manually remove permanent
o kern/168183  net        [bce] bce driver hang system
o kern/167603  net        [ip] IP fragment reassembly's broken: file transfer ov
o kern/167500  net        [em] [panic] Kernel panics in em driver
o kern/167325  net        [netinet] [patch] sosend sometimes return EINVAL with 
o kern/167202  net        [igmp]: Sending multiple IGMP packets crashes kernel
o kern/166462  net        [gre] gre(4) when using a tunnel source address from c
o kern/166285  net        [arp] FreeBSD v8.1 REL p8 arp: unknown hardware addres
o kern/166255  net        [net] [patch] It should be possible to disable "promis
p kern/165903  net        mbuf leak
o kern/165622  net        [ndis][panic][patch] Unregistered use of FPU in kernel
s kern/165562  net        [request] add support for Intel i350 in FreeBSD 7.4
o kern/165526  net        [bxe] UDP packets checksum calculation whithin if_bxe 
o kern/165488  net        [ppp] [panic] Fatal trap 12 jails and ppp , kernel wit
o kern/165305  net        [ip6] [request] Feature parity between IP_TOS and IPV6
o kern/165296  net        [vlan] [patch] Fix EVL_APPLY_VLID, update EVL_APPLY_PR
o kern/165181  net        [igb] igb freezes after about 2 weeks of uptime
o kern/165174  net        [patch] [tap] allow tap(4) to keep its address on clos
o kern/165152  net        [ip6] Does not work through the issue of ipv6 addresse
o kern/164495  net        [igb] connect double head igb to switch cause system t
o kern/164490  net        [pfil] Incorrect IP checksum on pfil pass from ip_outp
o kern/164475  net        [gre] gre misses RUNNING flag after a reboot
o kern/164265  net        [netinet] [patch] tcp_lro_rx computes wrong checksum i
o kern/163903  net        [igb] "igb0:tx(0)","bpf interface lock" v2.2.5 9-STABL
o kern/163481  net        freebsd do not add itself to ping route packet
o kern/162927  net        [tun] Modem-PPP error ppp[1538]: tun0: Phase: Clearing
o kern/162558  net        [dummynet] [panic] seldom dummynet panics
o kern/162153  net        [em] intel em driver 7.2.4 don't compile
o kern/162110  net        [igb] [panic] RELENG_9 panics on boot in IGB driver - 
o kern/162028  net        [ixgbe] [patch] misplaced #endif in ixgbe.c
o kern/161277  net        [em] [patch] BMC cannot receive IPMI traffic after loa
o kern/160873  net        [igb] igb(4) from HEAD fails to build on 7-STABLE
o kern/160750  net        Intel PRO/1000 connection breaks under load until rebo
o kern/160693  net        [gif] [em] Multicast packet are not passed from GIF0 t
o kern/160293  net        [ieee80211] [panic] kernel panic during network setup 
o kern/160206  net        [gif] gifX stops working after a while (IPv6 tunnel)
o kern/159817  net        [udp] write UDPv4: No buffer space available (code=55)
o kern/159629  net        [ipsec] [panic] kernel panic with IPsec in transport m
o kern/159621  net        [tcp] [panic] panic: soabort: so_count
o kern/159603  net        [netinet] [patch] in_ifscrubprefix() - network route c
o kern/159601  net        [netinet] [patch] in_scrubprefix() - loopback route re
o kern/159294  net        [em] em watchdog timeouts
o kern/159203  net        [wpi] Intel 3945ABG Wireless LAN not support IBSS
o kern/158930  net        [bpf] BPF element leak in ifp->bpf_if->bif_dlist
o kern/158726  net        [ip6] [patch] ICMPv6 Router Announcement flooding limi
o kern/158694  net        [ix] [lagg] ix0 is not working within lagg(4)
o kern/158665  net        [ip6] [panic] kernel pagefault in in6_setscope()
o kern/158635  net        [em] TSO breaks BPF packet captures with em driver
f kern/157802  net        [dummynet] [panic] kernel panic in dummynet
o kern/157785  net        amd64 + jail + ipfw + natd = very slow outbound traffi
o kern/157418  net        [em] em driver lockup during boot on Supermicro X9SCM-
o kern/157410  net        [ip6] IPv6 Router Advertisements Cause Excessive CPU U
o kern/157287  net        [re] [panic] INVARIANTS panic (Memory modified after f
o kern/157200  net        [network.subr] [patch] stf(4) can not communicate betw
o kern/157182  net        [lagg] lagg interface not working together with epair 
o kern/156877  net        [dummynet] [panic] dummynet move_pkt() null ptr derefe
o kern/156667  net        [em] em0 fails to init on CURRENT after March 17
o kern/156408  net        [vlan] Routing failure when using VLANs vs. Physical e
o kern/156328  net        [icmp]: host can ping other subnet but no have IP from
o kern/156317  net        [ip6] Wrong order of IPv6 NS DAD/MLD Report
o kern/156279  net        [if_bridge][divert][ipfw] unable to correctly re-injec
o kern/156226  net        [lagg]: failover does not announce the failover to swi
o kern/156030  net        [ip6] [panic] Crash in nd6_dad_start() due to null ptr
o kern/155680  net        [multicast] problems with multicast
s kern/155642  net        [new driver] [request] Add driver for Realtek RTL8191S
o kern/155597  net        [panic] Kernel panics with "sbdrop" message
o kern/155420  net        [vlan] adding vlan break existent vlan
o kern/155177  net        [route] [panic] Panic when inject routes in kernel
o kern/155010  net        [msk] ntfs-3g via iscsi using msk driver cause kernel 
o kern/154943  net        [gif] ifconfig gifX create on existing gifX clears IP
s kern/154851  net        [new driver] [request]: Port brcm80211 driver from Lin
o kern/154850  net        [netgraph] [patch] ng_ether fails to name nodes when t
o kern/154679  net        [em] Fatal trap 12: "em1 taskq" only at startup (8.1-R
o kern/154600  net        [tcp] [panic] Random kernel panics on tcp_output
o kern/154557  net        [tcp] Freeze tcp-session of the clients, if in the gat
o kern/154443  net        [if_bridge] Kernel module bridgestp.ko missing after u
o kern/154286  net        [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/154255  net        [nfs] NFS not responding
o kern/154214  net        [stf] [panic] Panic when creating stf interface
o kern/154185  net        race condition in mb_dupcl
p kern/154169  net        [multicast] [ip6] Node Information Query multicast add
o kern/154134  net        [ip6] stuck kernel state in LISTEN on ipv6 daemon whic
o kern/154091  net        [netgraph] [panic] netgraph, unaligned mbuf?
o conf/154062  net        [vlan] [patch] change to way of auto-generatation of v
o kern/153937  net        [ral] ralink panics the system (amd64 freeBSDD 8.X) wh
o kern/153936  net        [ixgbe] [patch] MPRC workaround incorrectly applied to
o kern/153816  net        [ixgbe] ixgbe doesn't work properly with the Intel 10g
o kern/153772  net        [ixgbe] [patch] sysctls reference wrong XON/XOFF varia
o kern/153497  net        [netgraph] netgraph panic due to race conditions
o kern/153454  net        [patch] [wlan] [urtw] Support ad-hoc and hostap modes 
o kern/153308  net        [em] em interface use 100% cpu
o kern/153244  net        [em] em(4) fails to send UDP to port 0xffff
o kern/152893  net        [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/152853  net        [em] tftpd (and likely other udp traffic) fails over e
o kern/152828  net        [em] poor performance on 8.1, 8.2-PRE
o kern/152569  net        [net]: Multiple ppp connections and routing table prob
o kern/152235  net        [arp] Permanent local ARP entries are not properly upd
o kern/152141  net        [vlan] [patch] encapsulate vlan in ng_ether before out
o kern/152036  net        [libc] getifaddrs(3) returns truncated sockaddrs for n
o kern/151690  net        [ep] network connectivity won't work until dhclient is
o kern/151681  net        [nfs] NFS mount via IPv6 leads to hang on client with 
o kern/151593  net        [igb] [panic] Kernel panic when bringing up igb networ
o kern/150920  net        [ixgbe][igb] Panic when packets are dropped with heade
o kern/150557  net        [igb] igb0: Watchdog timeout -- resetting
o kern/150251  net        [patch] [ixgbe] Late cable insertion broken
o kern/150249  net        [ixgbe] Media type detection broken
o bin/150224   net        ppp(8) does not reassign static IP after kill -KILL co
f kern/149969  net        [wlan] [ral] ralink rt2661 fails to maintain connectio
o kern/149643  net        [rum] device not sending proper beacon frames in ap mo
o kern/149609  net        [panic] reboot after adding second default route
o kern/149117  net        [inet] [patch] in_pcbbind: redundant test
o kern/149086  net        [multicast] Generic multicast join failure in 8.1
o kern/148018  net        [flowtable] flowtable crashes on ia64
o kern/147912  net        [boot] FreeBSD 8 Beta won't boot on Thinkpad i1300  11
o kern/147894  net        [ipsec] IPv6-in-IPv4 does not work inside an ESP-only 
o kern/147155  net        [ip6] setfb not work with ipv6
o kern/146845  net        [libc] close(2) returns error 54 (connection reset by 
f kern/146792  net        [flowtable] flowcleaner 100% cpu's core load
o kern/146719  net        [pf] [panic] PF or dumynet kernel panic
o kern/146534  net        [icmp6] wrong source address in echo reply
o kern/146427  net        [mwl] Additional virtual access points don't work on m
f kern/146394  net        [vlan] IP source address for outgoing connections
o bin/146377   net        [ppp] [tun] Interface doesn't clear addresses when PPP
o kern/146358  net        [vlan] wrong destination MAC address
o kern/146165  net        [wlan] [panic] Setting bssid in adhoc mode causes pani
o kern/146082  net        [ng_l2tp] a false invaliant check was performed in ng_
o kern/146037  net        [panic] mpd + CoA = kernel panic
o kern/145825  net        [panic] panic: soabort: so_count
o kern/145728  net        [lagg] Stops working lagg between two servers.
p kern/145600  net        TCP/ECN behaves different to CE/CWR than ns2 reference
f kern/144917  net        [flowtable] [panic] flowtable crashes system [regressi
o kern/144882  net        MacBookPro =>4.1 does not connect to BSD in hostap wit
o kern/144874  net        [if_bridge] [patch] if_bridge frees mbuf after pfil ho
o conf/144700  net        [rc.d] async dhclient breaks stuff for too many people
o kern/144616  net        [nat] [panic] ip_nat panic FreeBSD 7.2
f kern/144315  net        [ipfw] [panic] freebsd 8-stable reboot after add ipfw 
o kern/144231  net        bind/connect/sendto too strict about sockaddr length
o kern/143846  net        [gif] bringing gif3 tunnel down causes gif0 tunnel to 
s kern/143673  net        [stf] [request] there should be a way to support multi
o kern/143622  net        [pfil] [patch] unlock pfil lock while calling firewall
o kern/143593  net        [ipsec] When using IPSec, tcpdump doesn't show outgoin
o kern/143591  net        [ral] RT2561C-based DLink card (DWL-510) fails to work
o kern/143208  net        [ipsec] [gif] IPSec over gif interface not working
o kern/143034  net        [panic] system reboots itself in tcp code [regression]
o kern/142877  net        [hang] network-related repeatable 8.0-STABLE hard hang
o kern/142774  net        Problem with outgoing connections on interface with mu
o kern/142772  net        [libc] lla_lookup: new lle malloc failed
f kern/142518  net        [em] [lagg] Problem on 8.0-STABLE with em and lagg
o kern/142018  net        [iwi] [patch] Possibly wrong interpretation of beacon-
o kern/141861  net        [wi] data garbled with WEP and wi(4) with Prism 2.5
f kern/141741  net        Etherlink III NIC won't work after upgrade to FBSD 8, 
o kern/140742  net        rum(4) Two asus-WL167G adapters cannot talk to each ot
o kern/140682  net        [netgraph] [panic] random panic in netgraph
f kern/140634  net        [vlan] destroying if_lagg interface with if_vlan membe
o kern/140619  net        [ifnet] [patch] refine obsolete if_var.h comments desc
o kern/140346  net        [wlan] High bandwidth use causes loss of wlan connecti
o kern/140142  net        [ip6] [panic] FreeBSD 7.2-amd64 panic w/IPv6
o kern/140066  net        [bwi] install report for 8.0 RC 2 (multiple problems)
o kern/139387  net        [ipsec] Wrong lenth of PF_KEY messages in promiscuous 
o bin/139346   net        [patch] arp(8) add option to remove static entries lis
o kern/139268  net        [if_bridge] [patch] allow if_bridge to forward just VL
p kern/139204  net        [arp] DHCP server replies rejected, ARP entry lost bef
o kern/139117  net        [lagg] + wlan boot timing (EBUSY)
o kern/138850  net        [dummynet] dummynet doesn't work correctly on a bridge
o kern/138782  net        [panic] sbflush_internal: cc 0 || mb 0xffffff004127b00
o kern/138688  net        [rum] possibly broken on 8 Beta 4 amd64: able to wpa a
o kern/138678  net        [lo] FreeBSD does not assign linklocal address to loop
o kern/138407  net        [gre] gre(4) interface does not come up after reboot
o kern/138332  net        [tun] [lor] ifconfig tun0 destroy causes LOR if_adata/
o kern/138266  net        [panic] kernel panic when udp benchmark test used as r
f kern/138029  net        [bpf] [panic] periodically kernel panic and reboot
o kern/137881  net        [netgraph] [panic] ng_pppoe fatal trap 12
p bin/137841   net        [patch] wpa_supplicant(8) cannot verify SHA256 signed 
p kern/137776  net        [rum] panic in rum(4) driver on 8.0-BETA2
o bin/137641   net        ifconfig(8): various problems with "vlan_device.vlan_i
o kern/137392  net        [ip] [panic] crash in ip_nat.c line 2577
o kern/137372  net        [ral] FreeBSD doesn't support wireless interface from 
o kern/137089  net        [lagg] lagg falsely triggers IPv6 duplicate address de
o kern/136911  net        [netgraph] [panic] system panic on kldload ng_bpf.ko t
o kern/136618  net        [pf][stf] panic on cloning interface without unit numb
o kern/135502  net        [periodic] Warning message raised by rtfree function i
o kern/134583  net        [hang] Machine with jail freezes after random amount o
o kern/134531  net        [route] [panic] kernel crash related to routes/zebra
o kern/134157  net        [dummynet] dummynet loads cpu for 100% and make a syst
o kern/133969  net        [dummynet] [panic] Fatal trap 12: page fault while in 
o kern/133968  net        [dummynet] [panic] dummynet kernel panic
o kern/133736  net        [udp] ip_id not protected ...
o kern/133595  net        [panic] Kernel Panic at pcpu.h:195
o kern/133572  net        [ppp] [hang] incoming PPTP connection hangs the system
o kern/133490  net        [bpf] [panic] 'kmem_map too small' panic on Dell r900 
o kern/133235  net        [netinet] [patch] Process SIOCDLIFADDR command incorre
f kern/133213  net        arp and sshd errors on 7.1-PRERELEASE
o kern/133060  net        [ipsec] [pfsync] [panic] Kernel panic with ipsec + pfs
o kern/132889  net        [ndis] [panic] NDIS kernel crash on load BCM4321 AGN d
o conf/132851  net        [patch] rc.conf(5): allow to setfib(1) for service run
o kern/132734  net        [ifmib] [panic] panic in net/if_mib.c
o kern/132705  net        [libwrap] [patch] libwrap - infinite loop if hosts.all
o kern/132672  net        [ndis] [panic] ndis with rt2860.sys causes kernel pani
o kern/132354  net        [nat] Getting some packages to ipnat(8) causes crash
o kern/132277  net        [crypto] [ipsec] poor performance using cryptodevice f
o kern/131781  net        [ndis] ndis keeps dropping the link
o kern/131776  net        [wi] driver fails to init
o kern/131753  net        [altq] [panic] kernel panic in hfsc_dequeue
o bin/131365   net        route(8): route add changes interpretation of network 
f kern/130820  net        [ndis] wpa_supplicant(8) returns 'no space on device'
o kern/130628  net        [nfs] NFS / rpc.lockd deadlock on 7.1-R
o kern/130525  net        [ndis] [panic] 64 bit ar5008 ndisgen-erated driver cau
o kern/130311  net        [wlan_xauth] [panic] hostapd restart causing kernel pa
o kern/130109  net        [ipfw] Can not set fib for packets originated from loc
f kern/130059  net        [panic] Leaking 50k mbufs/hour
f kern/129719  net        [nfs] [panic] Panic during shutdown, tcp_ctloutput: in
o kern/129517  net        [ipsec] [panic] double fault / stack overflow
f kern/129508  net        [carp] [panic] Kernel panic with EtherIP (may be relat
o kern/129219  net        [ppp] Kernel panic when using kernel mode ppp
o kern/129197  net        [panic] 7.0 IP stack related panic
o bin/128954   net        ifconfig(8) deletes valid routes
o bin/128602   net        [an] wpa_supplicant(8) crashes with an(4)
o kern/128448  net        [nfs] 6.4-RC1 Boot Fails if NFS Hostname cannot be res
o bin/128295   net        [patch] ifconfig(8) does not print TOE4 or TOE6 capabi
o bin/128001   net        wpa_supplicant(8), wlan(4), and wi(4) issues
o kern/127826  net        [iwi] iwi0 driver has reduced performance and connecti
o kern/127815  net        [gif] [patch] if_gif does not set vlan attributes from
o kern/127724  net        [rtalloc] rtfree: 0xc5a8f870 has 1 refs
f bin/127719   net        [arp] arp: Segmentation fault (core dumped)
f kern/127528  net        [icmp]: icmp socket receives icmp replies not owned by
p kern/127360  net        [socket] TOE socket options missing from sosetopt()
o bin/127192   net        routed(8) removes the secondary alias IP of interface 
f kern/127145  net        [wi]: prism (wi) driver crash at bigger traffic
o kern/126895  net        [patch] [ral] Add antenna selection (marked as TBD)
o kern/126874  net        [vlan]: Zebra problem if ifconfig vlanX destroy
o kern/126695  net        rtfree messages and network disruption upon use of if_
o kern/126339  net        [ipw] ipw driver drops the connection
o kern/126075  net        [inet] [patch] internet control accesses beyond end of
o bin/125922   net        [patch] Deadlock in arp(8)
o kern/125920  net        [arp] Kernel Routing Table loses Ethernet Link status 
o kern/125845  net        [netinet] [patch] tcp_lro_rx() should make use of hard
o kern/125258  net        [socket] socket's SO_REUSEADDR option does not work
o kern/125239  net        [gre] kernel crash when using gre
o kern/124341  net        [ral] promiscuous mode for wireless device ral0 looses
o kern/124225  net        [ndis] [patch] ndis network driver sometimes loses net
o kern/124160  net        [libc] connect(2) function loops indefinitely
o kern/124021  net        [ip6] [panic] page fault in nd6_output()
o kern/123968  net        [rum] [panic] rum driver causes kernel panic with WPA.
o kern/123892  net        [tap] [patch] No buffer space available
o kern/123890  net        [ppp] [panic] crash & reboot on work with PPP low-spee
o kern/123858  net        [stf] [patch] stf not usable behind a NAT
o kern/123758  net        [panic] panic while restarting net/freenet6
o bin/123633   net        ifconfig(8) doesn't set inet and ether address in one 
o kern/123559  net        [iwi] iwi periodically disassociates/associates [regre
o bin/123465   net        [ip6] route(8): route add -inet6 <ipv6_addr> -interfac
o kern/123463  net        [ipsec] [panic] repeatable crash related to ipsec-tool
o conf/123330  net        [nsswitch.conf] Enabling samba wins in nsswitch.conf c
o kern/123160  net        [ip] Panic and reboot at sysctl kern.polling.enable=0
o kern/122989  net        [swi] [panic] 6.3 kernel panic in swi1: net
o kern/122954  net        [lagg] IPv6 EUI64 incorrectly chosen for lagg devices
f kern/122780  net        [lagg] tcpdump on lagg interface during high pps wedge
o kern/122685  net        It is not visible passing packets in tcpdump(1)
o kern/122319  net        [wi] imposible to enable ad-hoc demo mode with Orinoco
o kern/122290  net        [netgraph] [panic] Netgraph related "kmem_map too smal
o kern/122252  net        [ipmi] [bge] IPMI problem with BCM5704 (does not work 
o kern/122033  net        [ral] [lor] Lock order reversal in ral0 at bootup ieee
o bin/121895   net        [patch] rtsol(8)/rtsold(8) doesn't handle managed netw
s kern/121774  net        [swi] [panic] 6.3 kernel panic in swi1: net
o kern/121555  net        [panic] Fatal trap 12: current process = 12 (swi1: net
o kern/121534  net        [ipl] [nat] FreeBSD Release 6.3 Kernel Trap 12:
o kern/121443  net        [gif] [lor] icmp6_input/nd6_lookup
o kern/121437  net        [vlan] Routing to layer-2 address does not work on VLA
o bin/121359   net        [patch] [security] ppp(8): fix local stack overflow in
o kern/121257  net        [tcp] TSO + natd  -> slow outgoing tcp traffic
o kern/121181  net        [panic] Fatal trap 3: breakpoint instruction fault whi
o kern/120966  net        [rum] kernel panic with if_rum and WPA encryption
o kern/120566  net        [request]: ifconfig(8) make order of arguments more fr
o kern/120304  net        [netgraph] [patch] netgraph source assumes 32-bit time
o kern/120266  net        [udp] [panic] gnugk causes kernel panic when closing U
o bin/120060   net        routed(8) deletes link-level routes in the presence of
o kern/119945  net        [rum] [panic] rum device in hostap mode, cause kernel 
o kern/119791  net        [nfs] UDP NFS mount of aliased IP addresses from a Sol
o kern/119617  net        [nfs] nfs error on wpa network when reseting/shutdown
f kern/119516  net        [ip6] [panic] _mtx_lock_sleep: recursed on non-recursi
o kern/119432  net        [arp] route add -host <host> -iface <nic> causes arp e
o kern/119225  net        [wi] 7.0-RC1 no carrier with Prism 2.5 wifi card [regr
o kern/118727  net        [netgraph] [patch] [request] add new ng_pf module
o kern/117423  net        [vlan] Duplicate IP on different interfaces
o bin/117339   net        [patch] route(8): loading routing management commands 
o bin/116643   net        [patch] [request] fstat(1): add INET/INET6 socket deta
o kern/116185  net        [iwi] if_iwi driver leads system to reboot
o kern/115239  net        [ipnat] panic with 'kmem_map too small' using ipnat
o kern/115019  net        [netgraph] ng_ether upper hook packet flow stops on ad
o kern/115002  net        [wi] if_wi timeout. failed allocation (busy bit). ifco
o kern/114915  net        [patch] [pcn] pcn (sys/pci/if_pcn.c) ethernet driver f
o kern/113432  net        [ucom] WARNING: attempt to net_add_domain(netgraph) af
o kern/112722  net        [ipsec] [udp] IP v4 udp fragmented packet reject
o kern/112686  net        [patm] patm driver freezes System (FreeBSD 6.2-p4) i38
o bin/112557   net        [patch] ppp(8) lock file should not use symlink name
o kern/112528  net        [nfs] NFS over TCP under load hangs with "impossible p
o kern/111537  net        [inet6] [patch] ip6_input() treats mbuf cluster wrong
o kern/111457  net        [ral] ral(4) freeze
o kern/110284  net        [if_ethersubr] Invalid Assumption in SIOCSIFADDR in et
o kern/110249  net        [kernel] [regression] [patch] setsockopt() error regre
o kern/109470  net        [wi] Orinoco Classic Gold PC Card Can't Channel Hop
o bin/108895   net        pppd(8): PPPoE dead connections on 6.2 [regression]
f kern/108197  net        [panic] [gif] [ip6] if_delmulti reference counting pan
o kern/107944  net        [wi] [patch] Forget to unlock mutex-locks
o conf/107035  net        [patch] bridge(8): bridge interface given in rc.conf n
o kern/106444  net        [netgraph] [panic] Kernel Panic on Binding to an ip to
o kern/106316  net        [dummynet] dummynet with multipass ipfw drops packets 
o kern/105945  net        Address can disappear from network interface
s kern/105943  net        Network stack may modify read-only mbuf chain copies
o bin/105925   net        problems with ifconfig(8) and vlan(4) [regression]
o kern/104851  net        [inet6] [patch] On link routes not configured when usi
o kern/104751  net        [netgraph] kernel panic, when getting info about my tr
o kern/104738  net        [inet] [patch] Reentrant problem with inet_ntoa in the
o kern/103191  net        Unpredictable reboot
o kern/103135  net        [ipsec] ipsec with ipfw divert (not NAT) encodes a pac
o kern/102540  net        [netgraph] [patch] supporting vlan(4) by ng_fec(4)
o conf/102502  net        [netgraph] [patch] ifconfig name does't rename netgrap
o kern/102035  net        [plip] plip networking disables parallel port printing
o kern/100709  net        [libc] getaddrinfo(3) should return TTL info
o kern/100519  net        [netisr] suggestion to fix suboptimal network polling
o kern/98597   net        [inet6] Bug in FreeBSD 6.1 IPv6 link-local DAD procedu
o bin/98218    net        wpa_supplicant(8) blacklist not working
o kern/97306   net        [netgraph] NG_L2TP locks after connection with failed 
o conf/97014   net        [gif] gifconfig_gif? in rc.conf does not recognize IPv
f kern/96268   net        [socket] TCP socket performance drops by 3000% if pack
o kern/95519   net        [ral] ral0 could not map mbuf
o kern/95288   net        [pppd] [tty] [panic] if_ppp panic in sys/kern/tty_subr
o kern/95277   net        [netinet] [patch] IP Encapsulation mask_match() return
o kern/95267   net        packet drops periodically appear
f kern/93378   net        [tcp] Slow data transfer in Postfix and Cyrus IMAP (wo
o kern/93019   net        [ppp] ppp and tunX problems: no traffic after restarti
o kern/92880   net        [libc] [patch] almost rewritten inet_network(3) functi
s kern/92279   net        [dc] Core faults everytime I reboot, possible NIC issu
o kern/91859   net        [ndis] if_ndis does not work with Asus WL-138
o kern/91364   net        [ral] [wep] WF-511 RT2500 Card PCI and WEP
o kern/91311   net        [aue] aue interface hanging
o kern/87421   net        [netgraph] [panic]: ng_ether + ng_eiface + if_bridge
o kern/86871   net        [tcp] [patch] allocation logic for PCBs in TIME_WAIT s
o kern/86427   net        [lor] Deadlock with FASTIPSEC and nat
o kern/85780   net        'panic: bogus refcnt 0' in routing/ipv6
o bin/85445    net        ifconfig(8): deprecated keyword to ifconfig inoperativ
o bin/82975    net        route change does not parse classfull network as given
o kern/82881   net        [netgraph] [panic] ng_fec(4) causes kernel panic after
o kern/82468   net        Using 64MB tcp send/recv buffers, trafficflow stops, i
o bin/82185    net        [patch] ndp(8) can delete the incorrect entry
o kern/81095   net        IPsec connection stops working if associated network i
o kern/78968   net        FreeBSD freezes on mbufs exhaustion (network interface
o kern/78090   net        [ipf] ipf filtering on bridged packets doesn't work if
o kern/77341   net        [ip6] problems with IPV6 implementation
o kern/75873   net        Usability problem with non-RFC-compliant IP spoof prot
s kern/75407   net        [an] an(4): no carrier after short time
a kern/71474   net        [route] route lookup does not skip interfaces marked d
o kern/71469   net        default route to internet magically disappears with mu
o kern/68889   net        [panic] m_copym, length > size of mbuf chain
o kern/66225   net        [netgraph] [patch] extend ng_eiface(4) control message
o kern/65616   net        IPSEC can't detunnel GRE packets after real ESP encryp
s kern/60293   net        [patch] FreeBSD arp poison patch
a kern/56233   net        IPsec tunnel (ESP) over IPv6: MTU computation is wrong
s bin/41647    net        ifconfig(8) doesn't accept lladdr along with inet addr
o kern/39937   net        ipstealth issue
a kern/38554   net        [patch] changing interface ipaddress doesn't seem to w
o kern/31940   net        ip queue length too short for >500kpps
o kern/31647   net        [libc] socket calls can return undocumented EINVAL
o kern/30186   net        [libc] getaddrinfo(3) does not handle incorrect servna
f kern/24959   net        [patch] proper TCP_NOPUSH/TCP_CORK compatibility
o conf/23063   net        [arp] [patch] for static ARP tables in rc.network
o kern/21998   net        [socket] [patch] ident only for outgoing connections
o kern/5877    net        [socket] sb_cc counts control data as well as data dat

472 problems total.


From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 11:36:48 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 7E21C914;
 Mon,  2 Dec 2013 11:36:48 +0000 (UTC)
Received: from mail-lb0-x22e.google.com (mail-lb0-x22e.google.com
 [IPv6:2a00:1450:4010:c04::22e])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 5CD4B1C65;
 Mon,  2 Dec 2013 11:36:47 +0000 (UTC)
Received: by mail-lb0-f174.google.com with SMTP id c11so8403816lbj.33
 for <multiple recipients>; Mon, 02 Dec 2013 03:36:45 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=qTU7Q/AIp9kHrcHK+lK70NkoSLxkcH8Ceb88oF4YTGA=;
 b=dCpyXHnM47pnGYFAmZzhhNqqcVbBZKgxmy0+ScCJbwPoQVPZ4pJKSqKtMo2MSmqWVg
 IV4mCZKr4iV+zUFDukIiRJVpQ/xWAY7hmU54XyKotPV8XpyWT5v1KsgqDu8IKIfNWYhB
 UqTzc/Ut+WyYlBneXatwFu98v43A/jYH6h+/RD5u34v7GEOHhvtxY0zSH9kZ7cdzJFT5
 Y37mcRR8XvjkAXkAkbdy42TMG744MFBs4/JoPkjM28KOfUZO5LJ4MuXL28jidE3z2Dx/
 kUKh/d2TAD6TObUcgacb1lSoOyus6k5Akz4Kf0xZ03bjOXRMjJMYe3HkHXb3iutseAae
 4s1w==
MIME-Version: 1.0
X-Received: by 10.152.28.230 with SMTP id e6mr39187123lah.3.1385984205170;
 Mon, 02 Dec 2013 03:36:45 -0800 (PST)
Received: by 10.114.166.163 with HTTP; Mon, 2 Dec 2013 03:36:45 -0800 (PST)
In-Reply-To: <CALDtMrLgm-D30u8HWWF=sVda0h4QtYdyiGHpYPw1kfTWbMbJ6Q@mail.gmail.com>
References: <CAPBZQG29BEJJ8BK=gn+g_n5o7JSnPbsKQ-=3=6AkFOxzt+=wGQ@mail.gmail.com>
 <4053E074-EDC5-49AB-91A7-E50ABE36602E@freebsd.org>
 <CALDtMrKvwXW-ou8X7zsKx2ST=dKD7FqHvvnQtGo30znTWU+VQQ@mail.gmail.com>
 <CAPBZQG0=bcHyv7aZse=WKfjk5=6D2-+6EQHiAaDZqGtaodhMMA@mail.gmail.com>
 <CAMOc5cwFGwk0dS5VT-YxfP3Yt38R8aO-KJTX6W832uOFEdavgA@mail.gmail.com>
 <CALDtMrLgm-D30u8HWWF=sVda0h4QtYdyiGHpYPw1kfTWbMbJ6Q@mail.gmail.com>
Date: Mon, 2 Dec 2013 19:36:45 +0800
Message-ID: <CAMOc5cyGZNTwc=gT868J6S=ebEARbTV4+0o7aAhwhygSP-Z6aQ@mail.gmail.com>
Subject: Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
From: Sepherosa Ziehau <sepherosa@gmail.com>
To: Oleg Moskalenko <mom040267@gmail.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.17
Cc: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>,
 freebsd-net <freebsd-net@freebsd.org>, Tim Kientzle <kientzle@freebsd.org>,
 "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 11:36:48 -0000

On Mon, Dec 2, 2013 at 12:29 PM, Oleg Moskalenko <mom040267@gmail.com> wrote:

> Sepherosa, while reading your description I noticed another long-standing
> problem for UDP application developers: the UDP sockets are always hashed
> with 2-tuple. But UDP sockets can be "connected", too, to a remote address,
> with connect(...)
>

Connected UDP sockets live in the connect hash, which is hashed on
faddr/laddr/fport/lport.  SO_REUSEPORT only affects wildcard sockets.
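
To illustrate the difference between the two lookups, here is a minimal
user-space sketch, not the kernel code.  The struct conn_tuple, the FNV-1a
mixing, the 256-bucket table and all function names are assumptions made for
the example.  It takes many connected sockets that share one local addr/port,
hashes them either on laddr/lport only or on faddr/laddr/fport/lport, and
counts how many buckets end up in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 256u          /* hypothetical hash table size */
#define FNV_SEED 2166136261u   /* FNV-1a offset basis */

/* Hypothetical 4-tuple of a connected socket (not a kernel structure). */
struct conn_tuple {
	uint32_t faddr, laddr;
	uint16_t fport, lport;
};

/* FNV-1a over a byte buffer; stands in for whatever the kernel really uses. */
static uint32_t
fnv1a(uint32_t h, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return (h);
}

/* Bucket chosen from the local address/port only. */
static unsigned
local_bucket(const struct conn_tuple *t)
{
	uint32_t h = fnv1a(FNV_SEED, &t->laddr, sizeof(t->laddr));

	h = fnv1a(h, &t->lport, sizeof(t->lport));
	return (h % NBUCKETS);
}

/* Bucket chosen from faddr/laddr/fport/lport. */
static unsigned
connect_bucket(const struct conn_tuple *t)
{
	uint32_t h = fnv1a(FNV_SEED, &t->faddr, sizeof(t->faddr));

	h = fnv1a(h, &t->laddr, sizeof(t->laddr));
	h = fnv1a(h, &t->fport, sizeof(t->fport));
	h = fnv1a(h, &t->lport, sizeof(t->lport));
	return (h % NBUCKETS);
}

/* Count the buckets used by npeers sockets sharing one local addr/port. */
static size_t
buckets_used(unsigned (*bucket)(const struct conn_tuple *), size_t npeers)
{
	unsigned char used[NBUCKETS] = { 0 };
	size_t n = 0;

	assert(bucket != NULL);
	for (size_t i = 0; i < npeers; i++) {
		struct conn_tuple t = {
			.faddr = 0x0a000001u + (uint32_t)i,	/* distinct peers */
			.laddr = 0xc0a80001u,
			.fport = (uint16_t)(1024 + i),
			.lport = 3478,
		};
		unsigned b = bucket(&t);

		if (!used[b]) {
			used[b] = 1;
			n++;
		}
	}
	return (n);
}
```

With a thousand peers on one local port, buckets_used(local_bucket, 1000)
puts every socket into a single bucket, while
buckets_used(connect_bucket, 1000) spreads them over many buckets, so a
lookup no longer has to walk one long chain.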


> function. Unfortunately, with 2-tuple hashing, that pattern is useless for
> large-scale applications: if a large number of UDP sockets on the same
> local port are "connected" to remote address, then the kernel have to go
> thru the long list of UDP sockets with the same hash value.
>
> If the connected UDP sockets would use 4-tuples, then it would be very
> helpful for the new generation of the UDP-based media applications. For
> example, servers which use DTLS protocol would become simpler and more
> efficient.
>
>
If you are talking about RSS, then igb, ixgbe and mxge (and maybe other
drivers) support the RSS extension that includes the UDP fport/lport in the
Toeplitz hash calculation (mxge does not use RSS, but still hashes on the
4-tuple).  However, if the ports are taken into account, the RSS hash of the
leading fragment of a UDP datagram will differ from that of the remaining
fragments, which carry no UDP header; I think that's why MS didn't include
ports for UDP.
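
The fragment problem can be seen with a small user-space Toeplitz sketch.
It is only an illustration: the 40-byte key is the sample key commonly
quoted from Microsoft's RSS documentation, reproduced here from memory, so
treat the exact bytes as an assumption, and the flow layout and function
names are made up for the example.  The first fragment carries the UDP
header, so a 4-tuple hash can cover both addresses and both ports; later
fragments carry no ports, so only the two addresses can be hashed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sample RSS key (assumed; any 40-byte key shows the same effect). */
static const uint8_t rss_key[40] = {
	0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
	0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
	0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
	0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
	0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
};

/* Hypothetical flow: src addr, dst addr, src port, dst port (network order). */
static const uint8_t example_flow[12] = {
	66, 9, 149, 187,	/* source address */
	161, 142, 100, 80,	/* destination address */
	0x0a, 0xea,		/* source port 2794 */
	0x06, 0xe6,		/* destination port 1766 */
};

/*
 * Toeplitz hash: for every set input bit, XOR in the 32-bit window of
 * the key that starts at that bit position.
 */
static uint32_t
toeplitz_hash(const uint8_t *key, size_t keylen, const uint8_t *data,
    size_t len)
{
	uint32_t window, result = 0;

	assert(keylen >= 4);
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
	    ((uint32_t)key[2] << 8) | (uint32_t)key[3];
	for (size_t i = 0; i < len; i++) {
		for (int bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				result ^= window;
			/* Slide the window one bit along the key. */
			window <<= 1;
			if (i + 4 < keylen && (key[i + 4] & (1u << bit)))
				window |= 1;
		}
	}
	return (result);
}

/* First fragment: addresses and ports are visible (12 bytes). */
static uint32_t
rss_hash_first_fragment(const uint8_t flow[12])
{
	return (toeplitz_hash(rss_key, sizeof(rss_key), flow, 12));
}

/* Later fragments: no UDP header, so only the addresses (8 bytes). */
static uint32_t
rss_hash_later_fragment(const uint8_t flow[12])
{
	return (toeplitz_hash(rss_key, sizeof(rss_key), flow, 8));
}
```

For example_flow the two functions return different values, so the first
fragment and the rest of the datagram could be steered to different RX
queues unless the ports are left out of the hash.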

Best Regards,
sephe


> Thanks
> Oleg
>
>
>
> On Sun, Dec 1, 2013 at 8:17 PM, Sepherosa Ziehau <sepherosa@gmail.com> wrote:
>
>>
>>
>>
>> On Sat, Nov 30, 2013 at 2:42 AM, Ermal Luçi <eri@freebsd.org> wrote:
>>
>>> Well seems Dragonfly has some version of it already from commit [1].
>>>
>>>
>> The distribution algorithm was changed a little bit after initial commit
>> to gain more idle time (bnx(4) output has already been maxed out):
>>
>> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/c275f18d832361be28b150d3f4fd518914bdeba6
>>
>> Well, I also addressed a reasonable concern from nginx folks (I am not
>> quite sure about Linux's position on it; Linux original implementation of
>> SO_REUSEPORT from Google had this drawback, which I mentioned in the commit
>> message):
>>
>> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/02ad2f0b874fb0a45eb69750219f79f5e8982272
>>
>> As about nginx, SO_REUSEPORT patch for nginx (both 1.4.x and 1.5.x) is in
>> dports; should be easier to be back ported to FreeBSD's ports.  I failed to
>> convince nginx folks to merge it into mainline and I am currently onto
>> other stuffs, will come back to them later.  If FreeBSD is going to
>> implement Linux's style of SO_REUSEPORT, pushing the patch to the nginx
>> mainline will be easier.
>>
>> I also put up a brief description of SO_REUSEPORT in dfly; may be useful
>> to you:
>> http://leaf.dragonflybsd.org/~sephe/netisr_so_reuseport.txt
>>
>> Best Regards,
>> sephe
>>
>>
>>>  In FreeBSD there is the framework for this with by defining PCBGROUP.
>>> Also the explanation of it at [2] and [3].
>>> It can achieve approximately the same features of SO_REUSEPORT of linux.
>>> The only thing missing is the marketing behind it and i think and better
>>> RSS support.
>>> By looking at dates the support is there before linux so all you guys
>>> looking for it can experiment with it.
>>>
>>> What i was trying to accomplish was something else from performance
>>> improvement and
>>> maybe put a sysctl behind it to make it more acceptable..
>>>
>>> [1]
>>>
>>> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/740d1d9f7b7bf9c9c021abb8197718d7a2d441c9
>>> [2]
>>> http://fxr.watson.org/fxr/source/netinet/in_pcbgroup.c?im=bigexcerpts#L51
>>> [3]
>>> http://lists.freebsd.org/pipermail/svn-src-head/2011-June/028190.html
>>>
>>>
>>> On Fri, Nov 29, 2013 at 7:03 PM, Oleg Moskalenko <mom040267@gmail.com
>>> >wrote:
>>>
>>> > Tim, you are wrong. Read what is "multicast" definition, and read how
>>> UDP
>>> > and TCP sockets work in Linux 3.9+ kernels.
>>> >
>>> > Oleg .
>>> >
>>> >
>>> > On Fri, Nov 29, 2013 at 9:59 AM, Tim Kientzle <kientzle@freebsd.org
>>> >wrote:
>>> >
>>> >>
>>> >> On Nov 29, 2013, at 4:04 AM, Ermal Luçi <eri@freebsd.org> wrote:
>>> >>
>>> >> > Hello,
>>> >> >
>>> >> > since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two
>>> daemons to
>>> >> > share the same port and possibly listening ip ...
>>> >>
>>> >> These flags are used with TCP-based servers.
>>> >>
>>> >> I've used them to make software upgrades go more smoothly.
>>> >> Without them, the following often happens:
>>> >>
>>> >> * Old server stops.  In the process, all of its TCP connections are
>>> >> closed.
>>> >>
>>> >> * Connections to old server remain in the TCP connection table until
>>> the
>>> >> remote end can acknowledge.
>>> >>
>>> >> * New server starts.
>>> >>
>>> >> * New server tries to open port but fails because that port is "still in
>>> >> use" by connections in the TCP connection table.
>>> >>
>>> >> With these flags, the new server can open the port even though
>>> >> it is "still in use" by existing connections.
>>> >>
>>> >>
>>> >> > This is not the case today.
>>> >> > Only multicast sockets seem to have the behaviour of broadcasting
>>> the
>>> >> data
>>> >> > to all sockets sharing the same properties through these options!
>>> >>
>>> >> That is what multicast is for.
>>> >>
>>> >> If you want the same data sent to all listeners, then
>>> >> that is multicast behavior and you should be using
>>> >> a multicast socket.
>>> >>
>>> >> > The patch at [1] implements/corrects the behaviour for UDP sockets.
>>> >>
>>> >> You're trying to turn all UDP sockets with those options
>>> >> into multicast sockets.
>>> >>
>>> >> If you want a multicast socket, you should ask for one.
>>> >>
>>> >> Tim
>>> >>
>>> >> _______________________________________________
>>> >> freebsd-net@freebsd.org mailing list
>>> >> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>> >>
>>> >
>>> >
>>>
>>>
>>> --
>>> Ermal
>>> _______________________________________________
>>> freebsd-current@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>>> To unsubscribe, send any mail to "
>>> freebsd-current-unsubscribe@freebsd.org"
>>>
>>
>>
>>
>> --
>> Tomorrow Will Never Die
>>
>
>


-- 
Tomorrow Will Never Die

From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 11:45:58 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 60110C12;
 Mon,  2 Dec 2013 11:45:58 +0000 (UTC)
Received: from mail-la0-x236.google.com (mail-la0-x236.google.com
 [IPv6:2a00:1450:4010:c03::236])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 208AD1CE7;
 Mon,  2 Dec 2013 11:45:56 +0000 (UTC)
Received: by mail-la0-f54.google.com with SMTP id b8so2107915lan.13
 for <multiple recipients>; Mon, 02 Dec 2013 03:45:55 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=vX313e18J9aw/XnxvVPUlETDS0/4b/gcKO+9Hem8Q6g=;
 b=BJsCbZajKAUpeNeIFQ3Jb3izcl56KP4rNuYRW1egj4kOaFQ+dJ8PjDz3las5sIebAI
 8HXmz9ujfR5ilGahZdQk8/EYLIheuPcBHKtEFNL6/lYYLVunJ6i6H47ehHypwSeZdiyJ
 kQ31KxAu+7oo5IBInrUw2qlZfvOOe3HoprWYYSCPP9cOyxCYkbnc1EigYR22cOsxfiEh
 Gw8Zmv0bAKA0n9+Rubi2Ck7UJQGuhdjogql3zow2q4MDZdxaPIzm2LGrv1yXDlgbrbEC
 f4KJrQRFtvkFpo94KMNFKwmxyf2OlNK2CXa0Em4VqK9kXaFmPYIvEj1BNAYrMjOdTb20
 t0xw==
MIME-Version: 1.0
X-Received: by 10.152.140.193 with SMTP id ri1mr45245856lab.18.1385984754969; 
 Mon, 02 Dec 2013 03:45:54 -0800 (PST)
Received: by 10.114.166.163 with HTTP; Mon, 2 Dec 2013 03:45:54 -0800 (PST)
In-Reply-To: <CAJ-Vmonc7SVxndmVN1jphFRa5svD5BdnMrCudSbYkx4djHXW0A@mail.gmail.com>
References: <CAPBZQG29BEJJ8BK=gn+g_n5o7JSnPbsKQ-=3=6AkFOxzt+=wGQ@mail.gmail.com>
 <4053E074-EDC5-49AB-91A7-E50ABE36602E@freebsd.org>
 <CALDtMrKvwXW-ou8X7zsKx2ST=dKD7FqHvvnQtGo30znTWU+VQQ@mail.gmail.com>
 <CAPBZQG0=bcHyv7aZse=WKfjk5=6D2-+6EQHiAaDZqGtaodhMMA@mail.gmail.com>
 <CAMOc5cwFGwk0dS5VT-YxfP3Yt38R8aO-KJTX6W832uOFEdavgA@mail.gmail.com>
 <CAJ-Vmonc7SVxndmVN1jphFRa5svD5BdnMrCudSbYkx4djHXW0A@mail.gmail.com>
Date: Mon, 2 Dec 2013 19:45:54 +0800
Message-ID: <CAMOc5cyM-+vau7BsZQ5F5L95EQgN=pJqru=9aK_0aJ+VUk=gxQ@mail.gmail.com>
Subject: Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
From: Sepherosa Ziehau <sepherosa@gmail.com>
To: Adrian Chadd <adrian@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.17
Cc: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>,
 freebsd-net <freebsd-net@freebsd.org>, Oleg Moskalenko <mom040267@gmail.com>,
 Tim Kientzle <kientzle@freebsd.org>,
 "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 11:45:58 -0000

On Mon, Dec 2, 2013 at 1:02 PM, Adrian Chadd <adrian@freebsd.org> wrote:

> Hi! Thanks for the writeup!
>
> On 1 December 2013 20:17, Sepherosa Ziehau <sepherosa@gmail.com> wrote:
>
> > I also put up a brief description of SO_REUSEPORT in dfly; may be useful
> > to you:
> > http://leaf.dragonflybsd.org/~sephe/netisr_so_reuseport.txt
>
> Ok, so given this, how do you guarantee the UTHREAD stays on the given
> CPU? You assume it stays on the CPU that the initial listen socket was
> created on, right? If it's migrated to another CPU core then the
> listen queue still stays in the original hash group that's in a netisr
> on a different CPU?
>
>
As I wrote in the above brief introduction, Dfly currently relies on the
scheduler doing the proper thing (the scheduler did a very good job in my
tests).  I still need to export some kind of socket option to make that
information available to user space programs.  Forcing UTHREAD binding in
the kernel is not helpful, since things are different in a reverse proxy
application.  And even if that kind of binding information were exported to
user space, a user space program would still have to poll it periodically
(in Dfly at least), since other programs binding to the same addr/port can
come and go, which causes the inp localgroup to be reorganized in the
current Dfly implementation.

Best Regards,
sephe

-- 
Tomorrow Will Never Die

From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 20:57:52 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id D1F7AF5F;
 Mon,  2 Dec 2013 20:57:52 +0000 (UTC)
Received: from lakerest.net (lakerest.net [162.235.35.161])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 6201C1572;
 Mon,  2 Dec 2013 20:57:52 +0000 (UTC)
Received: from [10.1.1.124] (bsd4.lakerest.net [162.235.35.162])
 (authenticated bits=0)
 by lakerest.net (8.14.4/8.14.3) with ESMTP id rB2KuxhD098908
 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NOT);
 Mon, 2 Dec 2013 15:57:04 -0500 (EST) (envelope-from rrs@lakerest.net)
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
Mime-Version: 1.0 (Apple Message framework v1283)
Content-Type: text/plain; charset=us-ascii
From: Randall Stewart <rrs@lakerest.net>
In-Reply-To: <20131202022338.GA3500@michelle.cdnetworks.com>
Date: Mon, 2 Dec 2013 15:56:59 -0500
Content-Transfer-Encoding: quoted-printable
Message-Id: <1ED6A1C2-6CED-4FDA-9C61-76FBCB2D7452@lakerest.net>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
To: pyunyh@gmail.com
X-Mailer: Apple Mail (2.1283)
Cc: Jack F Vogel <jfv@freebsd.org>,
 Michael Tuexen <Michael.Tuexen@lurchi.franken.de>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 20:57:52 -0000


On Dec 1, 2013, at 9:23 PM, Yonghyeon PYUN wrote:

> On Fri, Nov 29, 2013 at 06:24:12PM +0100, Michael Tuexen wrote:
>> Dear all,
>>
>> ifnet(9) says regarding if_transmit():
>>
>> Transmit a packet on an interface or queue it if the interface is
>> in use.  This function will return ENOBUFS if the devices software
>> and hardware queues are both full.
>>
>> The drivers for em, igb and ixgbe might also return an error even
>> in the case the packet was enqueued. The attached patches fix this
>> issue.
>
> How do you know the packet is successfully enqueued but driver
> returns an error?  Do non-buf-ring-aware drivers also show the same
> behavior?
>
From what I can tell, all of the drivers I have poked at, whether
they use the new format (with buf_ring) or the old one, have
traditionally returned an error from the enqueue only when we hit
the queue limit.

The driver down the road can in theory drop the packet for other
reasons (errors etc.) and there is no communication back up to the
upper layers that this occurred.



>>
>> Any comments?
>
> I'm afraid the patch you posted ignores any errors(i.e.
> m_defrag(9), bus_dma(9) etc) happened during TX processing.

But that is always the case. Most of the time, when you first call
down into if_transmit(), it is your own thread that does the
m_defrag() and bus_dma() work. But if another thread woke the driver
ahead of you, all you get is the return code from queueing into the
buffer ring; you can't know what is happening on the other thread
that is actually pushing the work out.

This has always been the case.
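
A toy user-space model of that behaviour, with everything in it (toy_ring,
toy_txq, toy_transmit(), the four-slot ring, the locked flag standing in for
the TX lock) invented for illustration rather than taken from any driver:
the caller's return value reflects only whether the packet fit into the
software ring, while the drain step, run by whichever thread holds the TX
lock, may drop packets without the caller ever seeing an error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 4	/* hypothetical software ring size */

struct toy_ring {
	int	pkts[RING_SLOTS];
	size_t	head, count;
};

struct toy_txq {
	struct toy_ring	ring;
	bool		locked;		/* stands in for the TX lock */
	bool		hw_fails;	/* simulate m_defrag()/bus_dma() errors */
	int		sent, dropped;
};

/* Enqueue into the software ring; the only step that can report ENOBUFS. */
static int
toy_enqueue(struct toy_ring *r, int pkt)
{
	assert(r->count <= RING_SLOTS);
	if (r->count == RING_SLOTS)
		return (ENOBUFS);
	r->pkts[(r->head + r->count) % RING_SLOTS] = pkt;
	r->count++;
	return (0);
}

/* Drain the ring; failures here are counted but never reported upward. */
static void
toy_drain(struct toy_txq *q)
{
	assert(q->locked);
	while (q->ring.count > 0) {
		q->ring.head = (q->ring.head + 1) % RING_SLOTS;
		q->ring.count--;
		if (q->hw_fails)
			q->dropped++;
		else
			q->sent++;
	}
}

/* Model of if_transmit(): return the enqueue result and nothing else. */
static int
toy_transmit(struct toy_txq *q, int pkt)
{
	int err = toy_enqueue(&q->ring, pkt);

	if (err != 0)
		return (err);
	if (!q->locked) {		/* trylock succeeded */
		q->locked = true;
		toy_drain(q);
		q->locked = false;
	}
	/* Otherwise another thread owns the lock and will drain later. */
	return (0);
}
```

In this model a packet that is dropped during the drain still yields 0 from
toy_transmit(), and ENOBUFS appears only once the ring itself is full, for
instance while another thread holds the lock and has not drained yet.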

I think this patch is *very* much needed on all the ring buffer aware
drivers except maybe Chelsio (since theirs is so different it probably
does not have this issue).

I will be applying this to all of Adara's code and I would *strongly*
encourage Jack to get this in to the Intel side.

I will also pull this patch (and fix all the other drivers) into the
branch I will be creating shortly, per Adrian's suggestion, for the
multi-Q QoS stuff I was working on.

Jack, when can you get this in?

R


>
>>
>> Jack: What do you think? Would you prefer to commit the fix if
>> you think it is acceptable?
>>=20
>> Best regards
>> Michael
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>

------------------------------
Randall Stewart
803-317-4952 (cell)


From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 21:06:37 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id A38201D5;
 Mon,  2 Dec 2013 21:06:37 +0000 (UTC)
Received: from lakerest.net (lakerest.net [162.235.35.161])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 2E5B71631;
 Mon,  2 Dec 2013 21:06:36 +0000 (UTC)
Received: from [10.1.1.124] (bsd4.lakerest.net [162.235.35.162])
 (authenticated bits=0)
 by lakerest.net (8.14.4/8.14.3) with ESMTP id rB2L6UBV099061
 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NOT);
 Mon, 2 Dec 2013 16:06:30 -0500 (EST) (envelope-from rrs@lakerest.net)
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
Mime-Version: 1.0 (Apple Message framework v1283)
Content-Type: text/plain; charset=us-ascii
From: Randall Stewart <rrs@lakerest.net>
In-Reply-To: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
Date: Mon, 2 Dec 2013 16:06:30 -0500
Content-Transfer-Encoding: quoted-printable
Message-Id: <92126112-73DA-42D3-A8CD-DBF5FB8F45E8@lakerest.net>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
To: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
X-Mailer: Apple Mail (2.1283)
Cc: Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 21:06:37 -0000

Michael:


Looking at this patch (as I apply it to my world), I think
you can just take the

> -		if ((err = igb_xmit(txr, &next)) != 0) {
> +		if (igb_xmit(txr, &next) != 0) {

type lines

and leave the

return(err)

since err will get set to 0 by drbr_enqueue(), so the proper response is
returned to the transport above that is sending the packet.

R
On Nov 29, 2013, at 12:24 PM, Michael Tuexen wrote:

> Dear all,
>
> ifnet(9) says regarding if_transmit():
>
> Transmit a packet on an interface or queue it if the interface is
> in use.  This function will return ENOBUFS if the devices software
> and hardware queues are both full.
>
> The drivers for em, igb and ixgbe might also return an error even
> in the case the packet was enqueued. The attached patches fix this
> issue.
>
> Any comments?
>
> Jack: What do you think? Would you prefer to commit the fix if
> you think it is acceptable?
>
> Best regards
> Michael
>
>
> [bsd5:~/head/sys/dev] tuexen% svn diff -x -p
> Index: e1000/if_em.c
> ===================================================================
> --- e1000/if_em.c	(revision 258746)
> +++ e1000/if_em.c	(working copy)
> @@ -930,7 +930,7 @@ em_mq_start_locked(struct ifnet *ifp, struct tx_ri
>
> 	/* Process the queue */
> 	while ((next = drbr_peek(ifp, txr->br)) != NULL) {
> -		if ((err = em_xmit(txr, &next)) != 0) {
> +		if (em_xmit(txr, &next) != 0) {
> 			if (next == NULL)
> 				drbr_advance(ifp, txr->br);
> 			else
> @@ -957,7 +957,7 @@ em_mq_start_locked(struct ifnet *ifp, struct tx_ri
> 		em_txeof(txr);
> 	if (txr->tx_avail < EM_MAX_SCATTER)
> 		ifp->if_drv_flags |= IFF_DRV_OACTIVE;
> -	return (err);
> +	return (0);
> }
>
> /*
> Index: e1000/if_igb.c
> ===================================================================
> --- e1000/if_igb.c	(revision 258746)
> +++ e1000/if_igb.c	(working copy)
> @@ -192,7 +192,7 @@ static int	igb_suspend(device_t);
> static int	igb_resume(device_t);
> #ifndef IGB_LEGACY_TX
> static int	igb_mq_start(struct ifnet *, struct mbuf *);
> -static int	igb_mq_start_locked(struct ifnet *, struct tx_ring *);
> +static void	igb_mq_start_locked(struct ifnet *, struct tx_ring *);
> static void	igb_qflush(struct ifnet *);
> static void	igb_deferred_mq_start(void *, int);
> #else
> @@ -989,31 +989,31 @@ igb_mq_start(struct ifnet *ifp, struct mbuf *m)
> 	if (err)
> 		return (err);
> 	if (IGB_TX_TRYLOCK(txr)) {
> -		err = igb_mq_start_locked(ifp, txr);
> +		igb_mq_start_locked(ifp, txr);
> 		IGB_TX_UNLOCK(txr);
> 	} else
> 		taskqueue_enqueue(que->tq, &txr->txq_task);
>
> -	return (err);
> +	return (0);
> }
>
> -static int
> +static void
> igb_mq_start_locked(struct ifnet *ifp, struct tx_ring *txr)
> {
> 	struct adapter  *adapter = txr->adapter;
>         struct mbuf     *next;
> -        int             err = 0, enq = 0;
> +        int             enq = 0;
>
> 	IGB_TX_LOCK_ASSERT(txr);
>
> 	if (((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) ||
> 	    adapter->link_active == 0)
> -		return (ENETDOWN);
> +		return;
>
>
> 	/* Process the queue */
> 	while ((next = drbr_peek(ifp, txr->br)) != NULL) {
> -		if ((err = igb_xmit(txr, &next)) != 0) {
> +		if (igb_xmit(txr, &next) != 0) {
> 			if (next == NULL) {
> 				/* It was freed, move forward */
> 				drbr_advance(ifp, txr->br);
> @@ -1045,7 +1045,7 @@ igb_mq_start_locked(struct ifnet *ifp, struct tx_r
> 		igb_txeof(txr);
> 	if (txr->tx_avail <= IGB_MAX_SCATTER)
> 		txr->queue_status |= IGB_QUEUE_DEPLETED;
> -	return (err);
> +	return;
> }
>
> /*
> Index: ixgbe/ixgbe.c
> ===================================================================
> --- ixgbe/ixgbe.c	(revision 258746)
> +++ ixgbe/ixgbe.c	(working copy)
> @@ -107,7 +107,7 @@ static void     ixgbe_start(struct ifnet *);
> static void     ixgbe_start_locked(struct tx_ring *, struct ifnet *);
> #else /* ! IXGBE_LEGACY_TX */
> static int	ixgbe_mq_start(struct ifnet *, struct mbuf *);
> -static int	ixgbe_mq_start_locked(struct ifnet *, struct tx_ring *);
> +static void	ixgbe_mq_start_locked(struct ifnet *, struct tx_ring *);
> static void	ixgbe_qflush(struct ifnet *);
> static void	ixgbe_deferred_mq_start(void *, int);
> #endif /* IXGBE_LEGACY_TX */
> @@ -831,35 +831,35 @@ ixgbe_mq_start(struct ifnet *ifp, struct mbuf *m)
> 	if (err)
> 		return (err);
> 	if (IXGBE_TX_TRYLOCK(txr)) {
> -		err = ixgbe_mq_start_locked(ifp, txr);
> +		ixgbe_mq_start_locked(ifp, txr);
> 		IXGBE_TX_UNLOCK(txr);
> 	} else
> 		taskqueue_enqueue(que->tq, &txr->txq_task);
>
> -	return (err);
> +	return (0);
> }
>
> -static int
> +static void
> ixgbe_mq_start_locked(struct ifnet *ifp, struct tx_ring *txr)
> {
> 	struct adapter  *adapter = txr->adapter;
>         struct mbuf     *next;
> -        int             enqueued = 0, err = 0;
> +        int             enqueued = 0;
>
> 	if (((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) ||
> 	    adapter->link_active == 0)
> -		return (ENETDOWN);
> +		return;
>
> 	/* Process the queue */
> #if __FreeBSD_version < 901504
> 	next = drbr_dequeue(ifp, txr->br);
> 	while (next != NULL) {
> -		if ((err = ixgbe_xmit(txr, &next)) != 0) {
> +		if (ixgbe_xmit(txr, &next) != 0) {
> 			if (next != NULL)
> -				err = drbr_enqueue(ifp, txr->br, next);
> +				drbr_enqueue(ifp, txr->br, next);
> #else
> 	while ((next = drbr_peek(ifp, txr->br)) != NULL) {
> -		if ((err = ixgbe_xmit(txr, &next)) != 0) {
> +		if (ixgbe_xmit(txr, &next) != 0) {
> 			if (next == NULL) {
> 				drbr_advance(ifp, txr->br);
> 			} else {
> @@ -890,7 +890,7 @@ ixgbe_mq_start_locked(struct ifnet *ifp, struct tx
> 	if (txr->tx_avail < IXGBE_TX_CLEANUP_THRESHOLD)
> 		ixgbe_txeof(txr);
>
> -	return (err);
> +	return;
> }
>
> /*
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>

------------------------------
Randall Stewart
803-317-4952 (cell)


From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 21:41:16 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 99EE4AEA;
 Mon,  2 Dec 2013 21:41:16 +0000 (UTC)
Received: from mail-qa0-x229.google.com (mail-qa0-x229.google.com
 [IPv6:2607:f8b0:400d:c00::229])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 1D1961AA9;
 Mon,  2 Dec 2013 21:41:16 +0000 (UTC)
Received: by mail-qa0-f41.google.com with SMTP id j5so4920197qaq.7
 for <multiple recipients>; Mon, 02 Dec 2013 13:41:14 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date:message-id:subject
 :from:to:cc:content-type;
 bh=EC57TpugfYxzzijezRYWYhKJlDnMDzob9xIx5rcHmhs=;
 b=emc8FrTlknWZQkrwuFVDodUQvlkf+wFO+HbO9ZO3Z0V8X6HdP9tFP+4cxTnC7LfM0h
 420sOCTDlKA4PcapacIL1YSq0vWQop8bNWo0AhSPq6z/oYCQu/3WUAPUOniJBsc7Gf/q
 yJL/7HKRqsX+hxUJZPpOb/e22bmjJx0+QGikNWUOXGUkvPxK4xFcnHCGeGnHhRVEWnNF
 gS/ph5trkDFhP9agRYvENpYjdG5FTj9V/HPqaYAqrz1uBYqkS6p0DiBFYzQUvwN1QrlS
 DhI54HdejTfLCGTmlSD+yQ43oj3/4v4iSfNCPnx8dspEUgLs3hDCV3bAx/uQacHwK5mH
 wW6A==
MIME-Version: 1.0
X-Received: by 10.49.131.5 with SMTP id oi5mr76884665qeb.38.1386020474382;
 Mon, 02 Dec 2013 13:41:14 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.224.53.200 with HTTP; Mon, 2 Dec 2013 13:41:14 -0800 (PST)
In-Reply-To: <CAMOc5cyM-+vau7BsZQ5F5L95EQgN=pJqru=9aK_0aJ+VUk=gxQ@mail.gmail.com>
References: <CAPBZQG29BEJJ8BK=gn+g_n5o7JSnPbsKQ-=3=6AkFOxzt+=wGQ@mail.gmail.com>
 <4053E074-EDC5-49AB-91A7-E50ABE36602E@freebsd.org>
 <CALDtMrKvwXW-ou8X7zsKx2ST=dKD7FqHvvnQtGo30znTWU+VQQ@mail.gmail.com>
 <CAPBZQG0=bcHyv7aZse=WKfjk5=6D2-+6EQHiAaDZqGtaodhMMA@mail.gmail.com>
 <CAMOc5cwFGwk0dS5VT-YxfP3Yt38R8aO-KJTX6W832uOFEdavgA@mail.gmail.com>
 <CAJ-Vmonc7SVxndmVN1jphFRa5svD5BdnMrCudSbYkx4djHXW0A@mail.gmail.com>
 <CAMOc5cyM-+vau7BsZQ5F5L95EQgN=pJqru=9aK_0aJ+VUk=gxQ@mail.gmail.com>
Date: Mon, 2 Dec 2013 13:41:14 -0800
X-Google-Sender-Auth: nFocPVewXGPEdhU8tnGKtSD_dUk
Message-ID: <CAJ-VmokQ_C_t=pZF5QnWMzjzw6YVqTD4ny3hv_cLDch-m2EOmg@mail.gmail.com>
Subject: Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
From: Adrian Chadd <adrian@freebsd.org>
To: Sepherosa Ziehau <sepherosa@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>,
 freebsd-net <freebsd-net@freebsd.org>, Oleg Moskalenko <mom040267@gmail.com>,
 Tim Kientzle <kientzle@freebsd.org>,
 "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 21:41:16 -0000

On 2 December 2013 03:45, Sepherosa Ziehau <sepherosa@gmail.com> wrote:
>
> On Mon, Dec 2, 2013 at 1:02 PM, Adrian Chadd <adrian@freebsd.org> wrote:
>
>> Ok, so given this, how do you guarantee the UTHREAD stays on the given
>> CPU? You assume it stays on the CPU that the initial listen socket was
>> created on, right? If it's migrated to another CPU core then the
>> listen queue still stays in the original hash group that's in a netisr
>> on a different CPU?
>
> As I wrote in the above brief introduction, Dfly currently relies on the
> scheduler doing the proper thing (the scheduler does do a very good job
> during my tests).  I need to export certain kind of socket option to make
> that information available to user space programs.  Force UTHREAD binding in
> kernel is not helpful, given in reverse proxy application, things are
> different.  And even if that kind of binding information was exported to
> user space, user space program still would have to poll it periodically (in
> Dfly at least), since other programs binding to the same addr/port could
> come and go, which will cause reorganizing of the inp localgroup in the
> current Dfly implementation.

Right. I kinda gathered that. It's fine, I was conceptually thinking
of adding some thread pinning to this anyway.

How do you see this scaling on massively multi-core machines? Like 32,
48, 64, 128 cores? I had some vague handwav-y notion of maybe limiting
the concept of pcbgroup hash / netisr threads to a subset of CPUs, or
have them be able to float between sockets but only have 1 (or n,
maybe) per socket. Or just have a fixed, smaller pool. The idea then
is the scheduler would need to be told that a given userland
thread/process belongs to a given netisr thread, and to schedule them
on the same CPU when possible.
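
The "1 (or n) per socket" idea can be sketched as a mapping from a
connection hash to a restricted set of CPUs. This is only an illustrative
sketch, not FreeBSD or DragonFly code: it reads "socket" as a CPU package,
and the struct, its fields, and both function names are hypothetical. It
also assumes CPU ids are numbered contiguously within each package.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical topology description; all names are illustrative. */
struct netisr_pool {
	unsigned npackages;       /* CPU packages ("sockets") in the box */
	unsigned cores_per_pkg;   /* CPU ids are assumed contiguous per package */
	unsigned threads_per_pkg; /* netisr threads allowed per package (n) */
};

/* Total number of netisr threads in the restricted pool. */
static unsigned
pool_size(const struct netisr_pool *p)
{
	return (p->npackages * p->threads_per_pkg);
}

/* Map a connection hash to the CPU id that hosts its netisr thread. */
static unsigned
hash_to_cpu(const struct netisr_pool *p, uint32_t hash)
{
	unsigned slot, pkg, idx;

	assert(p->threads_per_pkg > 0 &&
	    p->threads_per_pkg <= p->cores_per_pkg);
	slot = hash % pool_size(p);       /* which netisr thread */
	pkg = slot / p->threads_per_pkg;  /* package it lives on */
	idx = slot % p->threads_per_pkg;  /* position within the package */
	return (pkg * p->cores_per_pkg + idx);
}
```

With 4 packages of 16 cores and one thread per package, every hash lands on
CPU 0, 16, 32 or 48, so the scheduler hint only ever has to name one of
those four CPUs.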

Anyway, thanks for doing this work. I only wish that you'd do it for
FreeBSD. :-)



-adrian

From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 21:44:49 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 6609FCAF;
 Mon,  2 Dec 2013 21:44:49 +0000 (UTC)
Received: from mail-n.franken.de (drew.ipv6.franken.de
 [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 8E9831AD9;
 Mon,  2 Dec 2013 21:44:48 +0000 (UTC)
Received: from [192.168.1.102] (p508F2CD2.dip0.t-ipconnect.de [80.143.44.210])
 (Authenticated sender: macmic)
 by mail-n.franken.de (Postfix) with ESMTP id 9CC361C0C0693;
 Mon,  2 Dec 2013 22:44:45 +0100 (CET)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.6 \(1510\))
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <92126112-73DA-42D3-A8CD-DBF5FB8F45E8@lakerest.net>
Date: Mon, 2 Dec 2013 22:44:46 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <E02BDC49-013B-4EC9-B359-198BA0E5C9E1@lurchi.franken.de>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <92126112-73DA-42D3-A8CD-DBF5FB8F45E8@lakerest.net>
To: Randall Stewart <rrs@lakerest.net>
X-Mailer: Apple Mail (2.1510)
Cc: Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 21:44:49 -0000

On Dec 2, 2013, at 10:06 PM, Randall Stewart <rrs@lakerest.net> wrote:

> Michael:
>
>
> Looking at this patch (as I apply it to my world), I think
> you can just take the
>
>> -		if ((err = igb_xmit(txr, &next)) != 0) {
>> +		if (igb_xmit(txr, &next) != 0) {
>
> type lines
>
> and leave the
>
> return(err)
>
> since err will get set to 0 by the drbr_enqueue() and return the proper response to the
> transport above sending the packet.
True. Just thought this is clearer... But the patch is not minimal, you are right.
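
A toy model of the two variants may make the difference concrete. It is
not the driver code: xmit_results[] stands in for the return values of
igb_xmit() on packets already in the ring, enq_err for the result of
drbr_enqueue(), and both function names are made up for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Original pattern: err is overwritten by every xmit attempt. */
static int
start_locked_original(int enq_err, const int *xmit_results, size_t n)
{
	int err = enq_err;
	size_t i;

	assert(xmit_results != NULL || n == 0);
	if (err != 0)
		return (err);           /* packet was not enqueued */
	for (i = 0; i < n; i++) {
		if ((err = xmit_results[i]) != 0)
			break;          /* error of some queued packet leaks out */
	}
	return (err);
}

/* Minimal variant: the xmit result is tested but never stored in err. */
static int
start_locked_minimal(int enq_err, const int *xmit_results, size_t n)
{
	int err = enq_err;
	size_t i;

	assert(xmit_results != NULL || n == 0);
	if (err != 0)
		return (err);
	for (i = 0; i < n; i++) {
		if (xmit_results[i] != 0)
			break;
	}
	return (err);                   /* still the enqueue result */
}
```

In this model, a successful enqueue followed by a failing xmit of any
queued packet makes the original variant report the xmit error, while the
minimal variant keeps reporting the enqueue result.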

Best regards
Michael


From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 21:48:08 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 6DB00D8A;
 Mon,  2 Dec 2013 21:48:08 +0000 (UTC)
Received: from mail-n.franken.de (drew.ipv6.franken.de
 [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 2CA481AFD;
 Mon,  2 Dec 2013 21:48:08 +0000 (UTC)
Received: from [192.168.1.102] (p508F2CD2.dip0.t-ipconnect.de [80.143.44.210])
 (Authenticated sender: macmic)
 by mail-n.franken.de (Postfix) with ESMTP id 4729E1C0C0693;
 Mon,  2 Dec 2013 22:48:06 +0100 (CET)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.6 \(1510\))
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <20131202022338.GA3500@michelle.cdnetworks.com>
Date: Mon, 2 Dec 2013 22:48:07 +0100
Content-Transfer-Encoding: 7bit
Message-Id: <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
To: pyunyh@gmail.com
X-Mailer: Apple Mail (2.1510)
Cc: Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 21:48:08 -0000

On Dec 2, 2013, at 3:23 AM, Yonghyeon PYUN <pyunyh@gmail.com> wrote:

> On Fri, Nov 29, 2013 at 06:24:12PM +0100, Michael Tuexen wrote:
>> Dear all,
>> 
>> ifnet(9) says regarding if_transmit():
>> 
>> Transmit a packet on an interface or queue it if the interface is
>> in use.  This function will return ENOBUFS if the devices software
>> and hardware queues are both full.
>> 
>> The drivers for em, igb and ixgbe might also return an error even
>> in the case the packet was enqueued. The attached patches fix this
>> issue.
> 
> How do you know the packet is successfully enqueued but driver
> returns an error?  Do non-buf-ring-aware drivers also show the same
When debugging the issue, I saw the packet on the wire, but if_transmit()
returned ENOBUFS.
> behavior?
I don't know. I saw this issue with the igb driver.
> 
>> 
>> Any comments?
> 
> I'm afraid the patch you posted ignores any errors(i.e.
> m_defrag(9), bus_dma(9) etc) happened during TX processing.
Correct. I want to make sure that if ENOBUFS is returned, the
packet hasn't made it on the wire. The other errors can occur
for the packet provided to if_transmit() or due to packet
processing of other packets. Am I missing something?
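
The contract described above can be sketched with a small queue model.
This is an assumption-laden illustration rather than driver code: the
ring, its depth RING_SLOTS, and the toy_* names are all hypothetical, and
integers stand in for mbufs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define RING_SLOTS 4            /* hypothetical software queue depth */

struct toy_ring {
	int pkts[RING_SLOTS];   /* packet ids standing in for mbufs */
	size_t head, count;
};

/* Enqueue one packet; the only place ENOBUFS can originate. */
static int
toy_enqueue(struct toy_ring *r, int pkt)
{
	if (r->count == RING_SLOTS)
		return (ENOBUFS);
	r->pkts[(r->head + r->count) % RING_SLOTS] = pkt;
	r->count++;
	return (0);
}

/*
 * Hand up to hw_room packets to the "hardware".  Returns nothing: a
 * failure here may concern a packet queued by some earlier caller.
 */
static void
toy_start_locked(struct toy_ring *r, size_t hw_room)
{
	while (r->count > 0 && hw_room > 0) {
		r->head = (r->head + 1) % RING_SLOTS;
		r->count--;
		hw_room--;
	}
	assert(r->count <= RING_SLOTS);
}

/* Model of if_transmit(): the result describes only this packet. */
static int
toy_transmit(struct toy_ring *r, int pkt, size_t hw_room)
{
	int err;

	err = toy_enqueue(r, pkt);
	if (err != 0)
		return (err);   /* not queued, so it cannot reach the wire */
	toy_start_locked(r, hw_room);
	return (0);
}
```

Here a caller that sees ENOBUFS knows its packet was never queued; any
later TX-path failure is deliberately not reported through this return
value.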

Best regards
Michael
> 
>> 
>> Jack: What do you think? Would you prefer to commit the fix if
>> you think it is acceptable?
>> 
>> Best regards
>> Michael
> 


From owner-freebsd-net@FreeBSD.ORG  Mon Dec  2 22:10:06 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 69AFF3DA
 for <net@freebsd.org>; Mon,  2 Dec 2013 22:10:06 +0000 (UTC)
Received: from shell0.rawbw.com (shell0.rawbw.com [198.144.192.45])
 by mx1.freebsd.org (Postfix) with ESMTP id 5704E1D3C
 for <net@freebsd.org>; Mon,  2 Dec 2013 22:10:05 +0000 (UTC)
Received: from eagle.yuri.org (stunnel@localhost [127.0.0.1])
 (authenticated bits=0)
 by shell0.rawbw.com (8.14.4/8.14.4) with ESMTP id rB2MA5NU043505
 for <net@freebsd.org>; Mon, 2 Dec 2013 14:10:05 -0800 (PST)
 (envelope-from yuri@rawbw.com)
Message-ID: <529D053D.8050700@rawbw.com>
Date: Mon, 02 Dec 2013 14:10:05 -0800
From: Yuri <yuri@rawbw.com>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:24.0) Gecko/20100101 Thunderbird/24.1.0
MIME-Version: 1.0
To: net@freebsd.org
Subject: How to forward UDP packets to another port and get responses with
 port translation?
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Dec 2013 22:10:06 -0000

I would like to translate the port in all DNS requests, so that the
server listens on a different port (e.g. 1053) on the same network while
the clients keep using the original port 53.

I am considering two approaches:
* forward packets to the server:
ipfw add 200 fwd 192.168.10.1,1053 udp from 192.168.10.0/24 to 192.168.10.1 53
The problem with routing the responses is that natd(8) doesn't allow
changing the source port, only the source address. There is an
-alias_address option but no -alias_port option.

* divert and natd(8):
natd -port 8668 -interface tap0 -redirect_port udp 192.168.10.1:1053 53
$IPF 200 divert natd udp from 192.168.10.0/24 to 192.168.10.1 53 via tap0 keep-state

In both cases the reply packets have source port 1053, and it isn't
clear how to make it 53.
It seems that divert only passes packets from one direction to natd(8),
and not from the other.

Is there a way to properly translate the ports back and forth in such a
simple UDP exchange?
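
To pin down what "back and forth" means here, the desired rewrite can be
written as two small functions over the UDP port pair. This is only a
model of the intended behaviour, not an ipfw or natd configuration: the
struct and function names are hypothetical, ports are in host byte order,
and a real translator would also have to update the UDP checksum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLIENT_PORT 53    /* port the clients send to */
#define SERVER_PORT 1053  /* port the server actually listens on */

/* Hypothetical stand-in for the port fields of a UDP header. */
struct udp_ports {
	uint16_t src;
	uint16_t dst;
};

/* Client -> server: rewrite destination 53 to 1053. Returns 1 if rewritten. */
static int
translate_request(struct udp_ports *p)
{
	assert(p != NULL);
	if (p->dst != CLIENT_PORT)
		return (0);
	p->dst = SERVER_PORT;
	return (1);
}

/* Server -> client: rewrite source 1053 back to 53. Returns 1 if rewritten. */
static int
translate_reply(struct udp_ports *p)
{
	assert(p != NULL);
	if (p->src != SERVER_PORT)
		return (0);
	p->src = CLIENT_PORT;
	return (1);
}
```

The second function is the half that is missing in both approaches above:
the reply has to pass through the translator too, so that the client sees
source port 53.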

Yuri

From owner-freebsd-net@FreeBSD.ORG  Tue Dec  3 02:06:28 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 85396B28;
 Tue,  3 Dec 2013 02:06:28 +0000 (UTC)
Received: from mail-pa0-x22e.google.com (mail-pa0-x22e.google.com
 [IPv6:2607:f8b0:400e:c03::22e])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 4FECE1C8E;
 Tue,  3 Dec 2013 02:06:28 +0000 (UTC)
Received: by mail-pa0-f46.google.com with SMTP id kl14so2202129pab.5
 for <multiple recipients>; Mon, 02 Dec 2013 18:06:27 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=from:date:to:cc:subject:message-id:reply-to:references:mime-version
 :content-type:content-disposition:in-reply-to:user-agent;
 bh=nZzME3o/TJ2qf0NkP981vJJVpjvwnP5B3AfryCKhFTQ=;
 b=W+qMX4o8sfuaB3nGJFcDp0SbtqO6Jvkkfz2xHhh0zlSqpU693R1CPx4ETS85JwfJuq
 eiL/XB0VeqJtuhkNRRqTCIkIRxS1dXSDQSSxD13L38nV0hdy0hIl3dEqxu4/nK3Tdaug
 Cpuv/oSOgS/0sXumIr/n6jGyktII/hnSZ6QdN1YTWeB2RAj8D/YkNi6pL7QhfWtydAJb
 ZpRt6Edfoy/eMfGYyVL9eP+VwX95yMUkob8WaTybbcGEbZuaysWDWD8+hn6Tw+IUmw1z
 Swpo0NGp9h+cOyiqHQzIG40bHvY87vUBbhXUjImx3jLgeEHpG0PmWbMUwjl0Xn/VSo1V
 njpA==
X-Received: by 10.68.224.38 with SMTP id qz6mr7547294pbc.156.1386036386327;
 Mon, 02 Dec 2013 18:06:26 -0800 (PST)
Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249])
 by mx.google.com with ESMTPSA id sg1sm125857604pbb.16.2013.12.02.18.06.22
 for <multiple recipients>
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Mon, 02 Dec 2013 18:06:25 -0800 (PST)
Received: by pyunyh@gmail.com (sSMTP sendmail emulation);
 Tue, 03 Dec 2013 11:06:18 +0900
From: Yonghyeon PYUN <pyunyh@gmail.com>
Date: Tue, 3 Dec 2013 11:06:18 +0900
To: Randall Stewart <rrs@lakerest.net>
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
Message-ID: <20131203020618.GB2981@michelle.cdnetworks.com>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <1ED6A1C2-6CED-4FDA-9C61-76FBCB2D7452@lakerest.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1ED6A1C2-6CED-4FDA-9C61-76FBCB2D7452@lakerest.net>
User-Agent: Mutt/1.4.2.3i
Cc: Jack F Vogel <jfv@freebsd.org>,
 Michael Tuexen <Michael.Tuexen@lurchi.franken.de>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 03 Dec 2013 02:06:28 -0000

On Mon, Dec 02, 2013 at 03:56:59PM -0500, Randall Stewart wrote:
> 
> On Dec 1, 2013, at 9:23 PM, Yonghyeon PYUN wrote:
> 
> > On Fri, Nov 29, 2013 at 06:24:12PM +0100, Michael Tuexen wrote:
> >> Dear all,
> >> 
> >> ifnet(9) says regarding if_transmit():
> >> 
> >> Transmit a packet on an interface or queue it if the interface is
> >> in use.  This function will return ENOBUFS if the devices software
> >> and hardware queues are both full.
> >> 
> >> The drivers for em, igb and ixgbe might also return an error even
> >> in the case the packet was enqueued. The attached patches fix this
> >> issue.
> > 
> > How do you know the packet is successfully enqueued but driver
> > returns an error?  Do non-buf-ring-aware drivers also show the same
> > behavior?
> > 
> All of the drivers have traditionally (from what I can tell
> and all the ones I have poked at) no matter if they are the new
> format (with ring-buf) or the old, would only return an error in
> the enqueue if we hit the limit.
> 
> The driver down the road can in theory drop the packet for other
> reasons (errors etc) and there is no communication back up to the
> upper layers that this occurred.
> 

Hmm, I was under the impression that buf_ring changed the behavior
we had in the past.  Before the introduction of if_transmit, queuing
was done in the upper layer, so returning an error in the driver's TX
path didn't affect the upper layer.  With if_transmit, both queuing and
TX processing are done in the driver.  In order to preserve the old
behavior, buf-ring-aware drivers may have to return ENOBUFS as you
said.
The compatibility code introduced in if_transmit for legacy drivers
should return ENOBUFS when there is no room in if_snd.  This is the
reason why I asked whether Michael sees the same behavior on
non-buf-ring-aware drivers.

> 
> 
> >> 
> >> Any comments?
> > 
> > I'm afraid the patch you posted ignores any errors(i.e.
> > m_defrag(9), bus_dma(9) etc) happened during TX processing.
> 
> But that is always the case. Most of the time when you send
> down to if_transmit() the first time you are going to get
> your thread working on those things m_defrag() and bus_dma().. but if
> another thread awoke the driver ahead of you all you get is the return
> code of the queue into the buffers.. you can't know what is happening on
> the other thread that is actually putting the work out.
> 
> This has always been the case.
> 
> This patch I think is *very* much needed on all the ring buffer aware drivers except maybe
> Chelsio (since theirs is so different it probably does not have this issue).
> 
> I will be applying this to all of Adara's code and I would *strongly* encourage Jack to get this
> in to the intel side.
> 
> I will also pull this patch (and fix all the other drivers) in the branch I will be creating
> shortly per Adrian's suggestion on the multi-Q qos stuff I was working on..
> 
> Jack? when can you get this in ??
> 
> R
> 
> 
> > 
> >> 
> >> Jack: What do you think? Would you prefer to commit the fix if
> >> you think it is acceptable?
> >> 
> >> Best regards
> >> Michael

From owner-freebsd-net@FreeBSD.ORG  Tue Dec  3 02:17:06 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 1DEF2D85;
 Tue,  3 Dec 2013 02:17:06 +0000 (UTC)
Received: from mail-pd0-x22c.google.com (mail-pd0-x22c.google.com
 [IPv6:2607:f8b0:400e:c02::22c])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id DD2A91CFB;
 Tue,  3 Dec 2013 02:17:05 +0000 (UTC)
Received: by mail-pd0-f172.google.com with SMTP id g10so19326517pdj.17
 for <multiple recipients>; Mon, 02 Dec 2013 18:17:05 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=from:date:to:cc:subject:message-id:reply-to:references:mime-version
 :content-type:content-disposition:in-reply-to:user-agent;
 bh=Z5kB6anVb5378+2xiqTsMA3VoBzRAM9hfX/pJGPeccU=;
 b=VqgHLbR5y7BTGfFFiMMQ7hpw7rX/DwSWULmHzS2mCPtzm3ekusxrv4DobU/9aNSfA7
 dPIKUZ1SuStkTN9OxyihvJj0uXwojJxX/xk567hdsH6Zqqt38xUJRk0u236F0cdOW3/r
 A6KJsEi1R4Bw7GWM8CphEp1S2KH0YWHAFQyDpxTs9kfkWGyNCyOXUYb4YQben/Ss4LjK
 dC9gBg+7sfZ0PiLwY2lexES7c0MverPr+wsoNYAS2zf+ug9HciYGAytD6gYdvvSXglxG
 Y5A7SxWR99Vg0woj52BUINSIA+JqkvKTTzKvTPtt5qRJp7OlLjp4JVWsrPGOQU9ykmA/
 p4Zw==
X-Received: by 10.68.170.225 with SMTP id ap1mr35390933pbc.117.1386037025514; 
 Mon, 02 Dec 2013 18:17:05 -0800 (PST)
Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249])
 by mx.google.com with ESMTPSA id gg10sm125876650pbc.46.2013.12.02.18.17.02
 for <multiple recipients>
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Mon, 02 Dec 2013 18:17:04 -0800 (PST)
Received: by pyunyh@gmail.com (sSMTP sendmail emulation);
 Tue, 03 Dec 2013 11:16:58 +0900
From: Yonghyeon PYUN <pyunyh@gmail.com>
Date: Tue, 3 Dec 2013 11:16:58 +0900
To: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
Message-ID: <20131203021658.GC2981@michelle.cdnetworks.com>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
User-Agent: Mutt/1.4.2.3i
Cc: Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
Reply-To: pyunyh@gmail.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 03 Dec 2013 02:17:06 -0000

On Mon, Dec 02, 2013 at 10:48:07PM +0100, Michael Tuexen wrote:
> On Dec 2, 2013, at 3:23 AM, Yonghyeon PYUN <pyunyh@gmail.com> wrote:
> 
> > On Fri, Nov 29, 2013 at 06:24:12PM +0100, Michael Tuexen wrote:
> >> Dear all,
> >> 
> >> ifnet(9) says regarding if_transmit():
> >> 
> >> Transmit a packet on an interface or queue it if the interface is
> >> in use.  This function will return ENOBUFS if the devices software
> >> and hardware queues are both full.
> >> 
> >> The drivers for em, igb and ixgbe might also return an error even
> >> in the case the packet was enqueued. The attached patches fix this
> >> issue.
> > 
> > How do you know the packet is successfully enqueued but driver
> > returns an error?  Do non-buf-ring-aware drivers also show the same
> When debugging the issue, I saw the packet on the wire but the if_transmit()
> returning ENOBUFS.
> > behavior?
> I don't know. I saw this issue with the igb driver.

I see.

> > 
> >> 
> >> Any comments?
> > 
> > I'm afraid the patch you posted ignores any errors(i.e.
> > m_defrag(9), bus_dma(9) etc) happened during TX processing.
> Correct. I want to make sure that if ENOBUFS is returned, the
> packet hasn't made it on the wire. The other errors can occur
> for the packet provided to if_transmit() or due to packet
> processing of other packets. Am I missing something?
> 

No.  It seems the only return code buf-ring-aware drivers can
return is ENOBUFS since queuing is done in the driver.
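
To make the distinction above concrete, here is a toy userland model. It
is not the em/igb/ixgbe code, and every name in it (toy_ring,
toy_enqueue, toy_drain, toy_transmit) is made up for illustration. The
transmit routine reports ENOBUFS only when the packet could not be
enqueued; the status of the drain that follows is deliberately not
propagated, because it may belong to other packets already in the ring:

```c
#include <errno.h>

#define TOY_RING_SLOTS	4

/* Hypothetical stand-in for a driver's software queue. */
struct toy_ring {
	int	pkt[TOY_RING_SLOTS];
	int	count;
};

/* Enqueue one packet; ENOBUFS means it was NOT queued. */
static int
toy_enqueue(struct toy_ring *r, int pkt)
{

	if (r->count == TOY_RING_SLOTS)
		return (ENOBUFS);
	r->pkt[r->count++] = pkt;
	return (0);
}

/*
 * Hand up to hw_free queued packets to the (imaginary) hardware.
 * Returns ENOBUFS if packets are left behind, 0 otherwise.
 */
static int
toy_drain(struct toy_ring *r, int hw_free)
{
	int i, sent;

	sent = (hw_free < r->count) ? hw_free : r->count;
	for (i = sent; i < r->count; i++)
		r->pkt[i - sent] = r->pkt[i];
	r->count -= sent;
	return (r->count > 0 ? ENOBUFS : 0);
}

/* ENOBUFS is returned only if this packet was never queued. */
static int
toy_transmit(struct toy_ring *r, int pkt, int hw_free)
{
	int error;

	error = toy_enqueue(r, pkt);
	if (error != 0)
		return (error);
	/* The drain status does not describe this packet, so drop it. */
	(void)toy_drain(r, hw_free);
	return (0);
}
```

With a stalled "hardware" (hw_free of 0), the first four calls to
toy_transmit() return 0 and the fifth returns ENOBUFS, and in that case
the fifth packet really is not in the ring.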

> Best regards
> Michael
> > 
> >> 
> >> Jack: What do you think? Would you prefer to commit the fix if
> >> you think it is acceptable?
> >> 
> >> Best regards
> >> Michael
> > 
> 

From owner-freebsd-net@FreeBSD.ORG  Tue Dec  3 18:44:40 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id D131A16C;
 Tue,  3 Dec 2013 18:44:40 +0000 (UTC)
Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1])
 (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id A7DE718C5;
 Tue,  3 Dec 2013 18:44:40 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
 by bigwig.baldwin.cx (Postfix) with ESMTPSA id 1C200B972;
 Tue,  3 Dec 2013 13:44:39 -0500 (EST)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-net@freebsd.org
Subject: Re: Defaults for if_capenable and detecting user initiated changes
Date: Tue, 3 Dec 2013 12:13:41 -0500
User-Agent: KMail/1.13.5 (FreeBSD/8.4-CBSD-20130906; KDE/4.5.5; amd64; ; )
References: <0E13D481-9D6D-4B52-A5AD-B671BF3A85AF@scsiguy.com>
In-Reply-To: <0E13D481-9D6D-4B52-A5AD-B671BF3A85AF@scsiguy.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <201312031213.41677.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Tue, 03 Dec 2013 13:44:39 -0500 (EST)
Cc: "Justin T. Gibbs" <gibbs@scsiguy.com>,
 Roger Pau =?utf-8?q?Monn=C3=A9?= <royger@freebsd.org>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 03 Dec 2013 18:44:40 -0000

On Wednesday, November 27, 2013 12:59:08 pm Justin T. Gibbs wrote:
> Hi net,
>
> I’m reviewing a patch from Roger Pau Monné for the Xen netfront driver.
> The goal of the change is to avoid disturbing the user’s settings for
> the interface just because the backend device has changed or the
> connection to the backend was reset.  I’ve attached the latest version
> of the patch.
>
> The current patch leaves the interface settings alone if they can be
> supported by the newly attached backend.  What would be ideal is to
> enable capabilities that default to being enabled if they were not
> explicitly disabled by the user and can be supported by the new
> backend.  Unfortunately, I don’t think the if_capenable and
> if_capabilities fields are descriptive enough to deal with an interface
> whose capabilities can change at runtime.  Just as can be done with
> link speed, some of these settings need to allow an “auto/default”
> setting in addition to on or off.  This would allow the user to
> explicitly disable a capability if needed, but generally allow the
> system to chose the most optimal settings when they are supported.
> Would this be difficult to add?

Couldn't you maintain this state in the Xen netfront driver's softc?
You already get the ioctls that track changes to the capenable field,
so when a change explicitly disables a capability you can set that
in a 'forced off' or 'forced on' field.  Perhaps more of a 'forced'
field that you just update by doing:

	sc->capforced |= (oldcapenable ^ newcapenable)

However, it's not clear to me if you can get the underlying adapter's
initial capenable list.  If so, I think capforced should be all you
need to handle this (though it might be easier if you have separate
forcedon and forcedoff fields).
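
A rough userland sketch of the separate forcedon/forcedoff variant, to
show the bit arithmetic only.  This is not the netfront driver: the
struct and function names (toy_softc, toy_user_set, toy_backend_changed)
are hypothetical, and it assumes the backend's supported and default
capability masks are available when the backend changes:

```c
#include <stdint.h>

/* Hypothetical per-interface state; only the capability bits matter here. */
struct toy_softc {
	uint32_t capenable;	/* capabilities currently enabled */
	uint32_t forcedon;	/* bits the user explicitly enabled */
	uint32_t forcedoff;	/* bits the user explicitly disabled */
};

/* Record a user-initiated change, e.g. from the capability ioctl. */
static void
toy_user_set(struct toy_softc *sc, uint32_t newcapenable)
{
	uint32_t changed;

	changed = sc->capenable ^ newcapenable;
	/* A toggled bit moves to whichever side the user chose. */
	sc->forcedon = (sc->forcedon & ~changed) | (changed & newcapenable);
	sc->forcedoff = (sc->forcedoff & ~changed) | (changed & ~newcapenable);
	sc->capenable = newcapenable;
}

/*
 * Recompute capenable after the backend changed: start from the
 * backend defaults, apply the user's explicit choices, then mask with
 * what the new backend can actually do.
 */
static void
toy_backend_changed(struct toy_softc *sc, uint32_t supported,
    uint32_t defaults)
{

	sc->capenable = ((defaults & ~sc->forcedoff) | sc->forcedon) &
	    supported;
}
```

With this layout a capability the user switched off stays off across
backend changes, a capability the user switched on comes back as soon as
a backend supports it again, and everything else follows the defaults.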

-- 
John Baldwin

From owner-freebsd-net@FreeBSD.ORG  Tue Dec  3 22:51:44 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id CD8D4956;
 Tue,  3 Dec 2013 22:51:44 +0000 (UTC)
Received: from olgeni.olgeni.com (host-156-246-171-31.cloudsigma.com
 [31.171.246.156])
 by mx1.freebsd.org (Postfix) with ESMTP id 738BB1918;
 Tue,  3 Dec 2013 22:51:44 +0000 (UTC)
Received: from olgeni.olgeni (unknown [82.84.68.101])
 by olgeni.olgeni.com (Postfix) with ESMTPSA id 9B573174483;
 Tue,  3 Dec 2013 23:51:42 +0100 (CET)
Date: Tue, 3 Dec 2013 23:51:41 +0100 (CET)
From: Jimmy Olgeni <olgeni@olgeni.com>
X-X-Sender: olgeni@olgeni.olgeni
To: freebsd-questions@FreeBSD.org
Subject: ipsec packets apparently not getting to destination
Message-ID: <alpine.BSF.2.00.1312032330100.87957@olgeni.olgeni>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
X-OpenPGP-KeyID: 0x90B7A98E6450AE47
X-OpenPGP-Fingerprint: 7133 AB4D DFC8 0A0D F891 B0D2 90B7 A98E 6450 AE47
X-OpenPGP-URL: http://olgeni.olgeni.com/~olgeni/pgp/olgeni@olgeni.com
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII
Cc: freebsd-net@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 03 Dec 2013 22:51:44 -0000


Hello,

I'm trying to set up a VPN server using L2TP/IPSEC, racoon (from
ipsec-tools) and mpd5, with certificates (to avoid patching racoon for
handling wildcard PSKs). PF disabled for testing, no other firewall is
active, no NAT on the server, NAT on the client using server port 4500.

Server is running 9.2-RELEASE r256712, with this config appended to
GENERIC:

device          crypto          # core crypto support
device          cryptodev       # /dev/crypto for access to h/w
device          enc             # IPsec interface.
options         IPSEC           # IP security (requires device crypto)
options         IPSEC_NAT_T     # NAT-T support, UDP encap of ESP
options         IPSEC_FILTERTUNNEL      # filter ipsec packets from a tunnel

(plus other unrelated things, ALTQ, SW_WATCHDOG, DDB, TEKEN_UTF8).

After tens of tests I got to this point...

If I disable ipsec on the Windows 8 client, the L2TP tunnel comes up
perfectly. A sample PPTP tunnel (unrelated) also works fine. I take it as
proof that mpd5 is configured in a more or less sensible manner.

My /etc/ipsec.conf looks like this:

     flush;
     spdflush;
     spdadd 0.0.0.0/0[0] 0.0.0.0/0[1701] udp -P in  ipsec esp/transport//require;
     spdadd 0.0.0.0/0[1701] 0.0.0.0/0[0] udp -P out ipsec esp/transport//require;

Which translates to this at runtime:

0.0.0.0/0[1701] 0.0.0.0/0[any] udp
         in ipsec
         esp/transport//require
         spid=58 seq=1 pid=43822
         refcnt=1
0.0.0.0/0[any] 0.0.0.0/0[1701] udp
         out ipsec
         esp/transport//require
         spid=57 seq=0 pid=43822
         refcnt=1

When connecting with L2TP/IPSEC from the Windows client, racoon shows this
output:

     (C.C.C.C -> NAT address before Windows client, S.S.S.S -> public address of L2TP server)

     2013-12-03 23:10:03: INFO: respond new phase 1 negotiation: S.S.S.S[500]<=>C.C.C.C[49216]
     2013-12-03 23:10:03: INFO: begin Identity Protection mode.
     2013-12-03 23:10:03: INFO: received broken Microsoft ID: MS NT5 ISAKMPOAKLEY
     2013-12-03 23:10:03: INFO: received Vendor ID: RFC 3947
     2013-12-03 23:10:03: INFO: received Vendor ID: draft-ietf-ipsec-nat-t-ike-02
     2013-12-03 23:10:03: INFO: received Vendor ID: FRAGMENTATION
     2013-12-03 23:10:03: [C.C.C.C] INFO: Selected NAT-T version: RFC 3947
     2013-12-03 23:10:03: ERROR: invalid DH group 20.
     2013-12-03 23:10:03: ERROR: invalid DH group 19.
     2013-12-03 23:10:03: [S.S.S.S] INFO: Hashing S.S.S.S[500] with algo #2
     2013-12-03 23:10:03: INFO: NAT-D payload #0 verified
     2013-12-03 23:10:03: [C.C.C.C] INFO: Hashing C.C.C.C[49216] with algo #2
     2013-12-03 23:10:03: INFO: NAT-D payload #1 doesn't match
     2013-12-03 23:10:03: INFO: NAT detected: PEER
     2013-12-03 23:10:03: [C.C.C.C] INFO: Hashing C.C.C.C[49216] with algo #2
     2013-12-03 23:10:03: [S.S.S.S] INFO: Hashing S.S.S.S[500] with algo #2
     2013-12-03 23:10:03: INFO: Adding remote and local NAT-D payloads.
     2013-12-03 23:10:03: INFO: NAT-T: ports changed to: C.C.C.C[4500]<->S.S.S.S[4500]
     2013-12-03 23:10:03: INFO: KA found: S.S.S.S[4500]->C.C.C.C[4500] (in_use=2)
     2013-12-03 23:10:03: WARNING: unable to get certificate CRL(3) at depth:0 SubjectName:/C=IT/ST=Lombardia/L=Milano/O=MovieReading/CN=LiveSub Client
     2013-12-03 23:10:03: WARNING: unable to get certificate CRL(3) at depth:1 SubjectName:/C=IT/ST=Lombardia/O=MovieReading/CN=ROOT CA
     2013-12-03 23:10:03: INFO: ISAKMP-SA established S.S.S.S[4500]-C.C.C.C[4500] spi:077c160ee905cf2e:062d1918ab2b788f
     2013-12-03 23:10:03: INFO: respond new phase 2 negotiation: S.S.S.S[4500]<=>C.C.C.C[4500]
     2013-12-03 23:10:03: INFO: Adjusting my encmode UDP-Transport->Transport
     2013-12-03 23:10:03: INFO: Adjusting peer's encmode UDP-Transport(4)->Transport(2)
     2013-12-03 23:10:03: INFO: IPsec-SA established: ESP/Transport S.S.S.S[500]->C.C.C.C[500] spi=225553014(0xd71aa76)
     2013-12-03 23:10:03: INFO: IPsec-SA established: ESP/Transport S.S.S.S[500]->C.C.C.C[500] spi=2749046390(0xa3db1e76)

CRL aside, which is not configured right now, certificate handling looks
ok. Client side NAT also looks good.

Also, tcpdump on enc0 shows the relevant packets coming through IPSEC:

     tcpdump: WARNING: enc0: no IPv4 address assigned
     tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
     listening on enc0, link-type ENC (OpenBSD encapsulated IP), capture size 65535 bytes
     23:10:03.521573 (authentic,confidential): SPI 0x0d71aa76: IP client.dialup.tiscali.it.l2f > olgeni.olgeni.com.l2f:  l2tp:[TLS](0/0)Ns=0,Nr=0 *MSGTYPE(SCCRQ) *PROTO_VER(1.0) *FRAMING_CAP(S) *BEARER_CAP() FIRM_VER(1539) *HOST_NAME(moviereading) VENDOR_NAME(Microsoft) *ASSND_TUN_ID(16) *RECV_WIN_SIZE(8)
     23:10:04.513077 (authentic,confidential): SPI 0x0d71aa76: IP client.dialup.tiscali.it.l2f > olgeni.olgeni.com.l2f:  l2tp:[TLS](0/0)Ns=0,Nr=0 *MSGTYPE(SCCRQ) *PROTO_VER(1.0) *FRAMING_CAP(S) *BEARER_CAP() FIRM_VER(1539) *HOST_NAME(moviereading) VENDOR_NAME(Microsoft) *ASSND_TUN_ID(16) *RECV_WIN_SIZE(8)

Now, the really weird part is that mpd5 does not even see the packets
addressed to the l2f (1701) port.

I tried to bind mpd5 both to S.S.S.S and to 0.0.0.0, but nothing
changed.

Also, if I run "socat UDP-LISTEN:1701 STDOUT" in place of mpd5, *nothing*
comes through, even if the dump on enc0 shows that something is coming in.
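
For reference, the socat check above amounts to roughly the following in
C.  This is only a sketch: udp_listen() and udp_recv_one() are
hypothetical helper names, and the code does nothing IPsec-specific, it
just binds a plain IPv4 UDP socket and waits for one datagram:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Open an IPv4 UDP socket bound to addr:port; returns the fd or -1. */
static int
udp_listen(const char *addr, uint16_t port)
{
	struct sockaddr_in sin;
	int fd;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	if (inet_pton(AF_INET, addr, &sin.sin_addr) != 1 ||
	    bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
		close(fd);
		return (-1);
	}
	return (fd);
}

/* Block until one datagram arrives; returns its length, fills in sender. */
static ssize_t
udp_recv_one(int fd, void *buf, size_t len, struct sockaddr_in *from)
{
	socklen_t fromlen = sizeof(*from);

	return (recvfrom(fd, buf, len, 0, (struct sockaddr *)from,
	    &fromlen));
}
```

Calling udp_listen("0.0.0.0", 1701) followed by udp_recv_one() while the
client connects should block forever if the decrypted packets never
reach the socket layer, which is what the socat test suggests.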

Running "setkey -D" shows this:

S.S.S.S C.C.C.C
         esp mode=transport spi=3417968112(0xcbba0df0) reqid=0(0x00000000)
         E: rijndael-cbc  65260e8e fd0d9dbf 8aa363d8 7cc81f41 2eb89aff d6984fb9 b7bdfc56 50774e0a
         A: hmac-sha1  fd5e6716 fe7e2c57 fc1f42b9 ec5307ab dae3ea6f
         seq=0x00000000 replay=4 flags=0x00000000 state=mature
         created: Dec  3 23:24:16 2013   current: Dec  3 23:24:29 2013
         diff: 13(s)     hard: 3600(s)   soft: 2880(s)
         last:                           hard: 0(s)      soft: 0(s)
         current: 0(bytes)       hard: 0(bytes)  soft: 0(bytes)
         allocated: 0    hard: 0 soft: 0
         sadb_seq=1 pid=43884 refcnt=1
C.C.C.C S.S.S.S
         esp mode=transport spi=253016163(0x0f14b863) reqid=0(0x00000000)
         E: rijndael-cbc  1463f10b 87e52b9b 9d32ee04 350198ae 6779d06d 3f57389b 71bffd18 72211b36
         A: hmac-sha1  1037b02e 7ec2cf51 50351bb6 cf8ab693 25d87e0a
         seq=0x00000004 replay=4 flags=0x00000000 state=mature
         created: Dec  3 23:24:16 2013   current: Dec  3 23:24:29 2013
         diff: 13(s)     hard: 3600(s)   soft: 2880(s)
         last: Dec  3 23:24:23 2013      hard: 0(s)      soft: 0(s)
         current: 532(bytes)     hard: 0(bytes)  soft: 0(bytes)
         allocated: 4    hard: 0 soft: 0
         sadb_seq=0 pid=43884 refcnt=1

I cannot imagine any obvious reason for packets getting "lost" after enc0,
so any hint would be much appreciated :)

--
jimmy

From owner-freebsd-net@FreeBSD.ORG  Wed Dec  4 12:52:11 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id CDA4AA4A
 for <freebsd-net@freebsd.org>; Wed,  4 Dec 2013 12:52:11 +0000 (UTC)
Received: from smtp-out.dnepro.net (mail.dnepro.net [178.219.93.41])
 by mx1.freebsd.org (Postfix) with ESMTP id 629701E98
 for <freebsd-net@freebsd.org>; Wed,  4 Dec 2013 12:52:10 +0000 (UTC)
Received: from traktor.dnepro.net (localhost [127.0.0.1])
 by traktor.dnepro.net (8.14.3/8.14.3) with ESMTP id rB4CLGlL010929
 for <freebsd-net@freebsd.org>; Wed, 4 Dec 2013 14:21:16 +0200 (EET)
 (envelope-from john@traktor.dnepro.net)
Received: (from john@localhost)
 by traktor.dnepro.net (8.14.3/8.14.3/Submit) id rB4CLGfw010927
 for freebsd-net@freebsd.org; Wed, 4 Dec 2013 14:21:16 +0200 (EET)
 (envelope-from john)
Date: Wed, 4 Dec 2013 14:21:16 +0200
From: Eugene Perevyazko <john@dnepro.net>
To: freebsd-net@freebsd.org
Subject: Re: ipsec packets apparently not getting to destination
Message-ID: <20131204122115.GA46835@traktor.dnepro.net>
Mail-Followup-To: freebsd-net@freebsd.org
References: <alpine.BSF.2.00.1312032330100.87957@olgeni.olgeni>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <alpine.BSF.2.00.1312032330100.87957@olgeni.olgeni>
User-Agent: Mutt/1.4.2.3i
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.6
 (traktor.dnepro.net [127.0.0.1]); Wed, 04 Dec 2013 14:21:16 +0200 (EET)
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 04 Dec 2013 12:52:11 -0000

On Tue, Dec 03, 2013 at 11:51:41PM +0100, Jimmy Olgeni wrote:
> 
> Hello,
> 
> I'm trying to setup a VPN server using L2TP/IPSEC, racoon (from
> ipsec-tools) and mpd5, with certificates (to avoid patching racoon for
> handling wildcard PSKs). PF disabled for testing, no other firewall is
> active, no NAT on the server, NAT on the client using server port 4500.
> 
> Server is running 9.2-RELEASE r256712, with this config appended to
> GENERIC:
> 
> device          crypto          # core crypto support
> device          cryptodev       # /dev/crypto for access to h/w
> device          enc             # IPsec interface.
> options         IPSEC           # IP security (requires device crypto)
> options         IPSEC_NAT_T     # NAT-T support, UDP encap of ESP
> options         IPSEC_FILTERTUNNEL      # filter ipsec packets from a tunnel
> 
> (plus other unrelated things, ALTQ, SW_WATCHDOG, DDB, TEKEN_UTF8).
> 
> After tens of tests I got to this point...
> 
> If I disable ipsec on the Windows 8 client, the L2TP tunnel comes up
> perfectly. A sample PPTP tunnel (unrelated) also works fine. I take it as
> proof that mpd5 is configured in a more or less sensible manner.
> 
> My /etc/ipsec.conf looks like this:
> 
>     flush;
>     spdflush;
>     spdadd 0.0.0.0/0[0] 0.0.0.0/0[1701] udp -P in  ipsec 
>     esp/transport//require;
>     spdadd 0.0.0.0/0[1701] 0.0.0.0/0[0] udp -P out ipsec 
>     esp/transport//require;
> 
> Which translates to this at runtime:
> 
> 0.0.0.0/0[1701] 0.0.0.0/0[any] udp
>         in ipsec
>         esp/transport//require
>         spid=58 seq=1 pid=43822
>         refcnt=1
> 0.0.0.0/0[any] 0.0.0.0/0[1701] udp
>         out ipsec
>         esp/transport//require
>         spid=57 seq=0 pid=43822
>         refcnt=1
> 
> When connecting with L2TP/IPSEC from the Windows client, racoon shows this
> output:
> 
>     (C.C.C.C -> NAT address before Windows client, S.S.S.S -> public 
>     address of L2TP server)
> 
>     2013-12-03 23:10:03: INFO: respond new phase 1 negotiation: 
>     S.S.S.S[500]<=>C.C.C.C[49216]
>     2013-12-03 23:10:03: INFO: begin Identity Protection mode.
>     2013-12-03 23:10:03: INFO: received broken Microsoft ID: MS NT5 
>     ISAKMPOAKLEY
>     2013-12-03 23:10:03: INFO: received Vendor ID: RFC 3947
>     2013-12-03 23:10:03: INFO: received Vendor ID: 
>     draft-ietf-ipsec-nat-t-ike-02
>     2013-12-03 23:10:03: INFO: received Vendor ID: FRAGMENTATION
>     2013-12-03 23:10:03: [C.C.C.C] INFO: Selected NAT-T version: RFC 3947
>     2013-12-03 23:10:03: ERROR: invalid DH group 20.
>     2013-12-03 23:10:03: ERROR: invalid DH group 19.
>     2013-12-03 23:10:03: [S.S.S.S] INFO: Hashing S.S.S.S[500] with algo #2
>     2013-12-03 23:10:03: INFO: NAT-D payload #0 verified
>     2013-12-03 23:10:03: [C.C.C.C] INFO: Hashing C.C.C.C[49216] with algo #2
>     2013-12-03 23:10:03: INFO: NAT-D payload #1 doesn't match
>     2013-12-03 23:10:03: INFO: NAT detected: PEER
>     2013-12-03 23:10:03: [C.C.C.C] INFO: Hashing C.C.C.C[49216] with algo #2
>     2013-12-03 23:10:03: [S.S.S.S] INFO: Hashing S.S.S.S[500] with algo #2
>     2013-12-03 23:10:03: INFO: Adding remote and local NAT-D payloads.
>     2013-12-03 23:10:03: INFO: NAT-T: ports changed to: 
>     C.C.C.C[4500]<->S.S.S.S[4500]
>     2013-12-03 23:10:03: INFO: KA found: S.S.S.S[4500]->C.C.C.C[4500] 
>     (in_use=2)
>     2013-12-03 23:10:03: WARNING: unable to get certificate CRL(3) at 
>     depth:0 
>     SubjectName:/C=IT/ST=Lombardia/L=Milano/O=MovieReading/CN=LiveSub Client
>     2013-12-03 23:10:03: WARNING: unable to get certificate CRL(3) at 
>     depth:1 SubjectName:/C=IT/ST=Lombardia/O=MovieReading/CN=ROOT CA
>     2013-12-03 23:10:03: INFO: ISAKMP-SA established 
>     S.S.S.S[4500]-C.C.C.C[4500] spi:077c160ee905cf2e:062d1918ab2b788f
>     2013-12-03 23:10:03: INFO: respond new phase 2 negotiation: 
>     S.S.S.S[4500]<=>C.C.C.C[4500]
>     2013-12-03 23:10:03: INFO: Adjusting my encmode UDP-Transport->Transport
>     2013-12-03 23:10:03: INFO: Adjusting peer's encmode 
>     UDP-Transport(4)->Transport(2)
>     2013-12-03 23:10:03: INFO: IPsec-SA established: ESP/Transport 
>     S.S.S.S[500]->C.C.C.C[500] spi=225553014(0xd71aa76)
>     2013-12-03 23:10:03: INFO: IPsec-SA established: ESP/Transport 
>     S.S.S.S[500]->C.C.C.C[500] spi=2749046390(0xa3db1e76)
> 
> CRL aside, which is not configured right now, certificate handling looks
> ok. Client side NAT also looks good.
> 
> Also, tcpdump on enc0 shows the relevant packets coming through IPSEC:
> 
>     tcpdump: WARNING: enc0: no IPv4 address assigned
>     tcpdump: verbose output suppressed, use -v or -vv for full protocol 
>     decode
>     listening on enc0, link-type ENC (OpenBSD encapsulated IP), capture 
>     size 65535 bytes
>     23:10:03.521573 (authentic,confidential): SPI 0x0d71aa76: IP 
>     client.dialup.tiscali.it.l2f > olgeni.olgeni.com.l2f:  
>     l2tp:[TLS](0/0)Ns=0,Nr=0 *MSGTYPE(SCCRQ) *PROTO_VER(1.0) 
>     *FRAMING_CAP(S) *BEARER_CAP() FIRM_VER(1539) *HOST_NAME(moviereading) 
>     VENDOR_NAME(Microsoft) *ASSND_TUN_ID(16) *RECV_WIN_SIZE(8)
>     23:10:04.513077 (authentic,confidential): SPI 0x0d71aa76: IP 
>     client.dialup.tiscali.it.l2f > olgeni.olgeni.com.l2f:  
>     l2tp:[TLS](0/0)Ns=0,Nr=0 *MSGTYPE(SCCRQ) *PROTO_VER(1.0) 
>     *FRAMING_CAP(S) *BEARER_CAP() FIRM_VER(1539) *HOST_NAME(moviereading) 
>     VENDOR_NAME(Microsoft) *ASSND_TUN_ID(16) *RECV_WIN_SIZE(8)
> 
> Now, the really weird part is that mpd5 does not even see the packets
> addressed to the l2f (1701) port.
> 
> I tried to bind mpd5 both to S.S.S.S and to 0.0.0.0, but nothing
> changed.
> 
> Also, if I run "socat UDP-LISTEN:1701 STDOUT" in place of mpd5, *nothing*
> comes through, even if the dump on enc0 shows that something is coming in.
> 
> Running "setkey -D" shows this:
> 
> S.S.S.S C.C.C.C
>         esp mode=transport spi=3417968112(0xcbba0df0) reqid=0(0x00000000)
>         E: rijndael-cbc  65260e8e fd0d9dbf 8aa363d8 7cc81f41 2eb89aff 
>         d6984fb9 b7bdfc56 50774e0a
>         A: hmac-sha1  fd5e6716 fe7e2c57 fc1f42b9 ec5307ab dae3ea6f
>         seq=0x00000000 replay=4 flags=0x00000000 state=mature
>         created: Dec  3 23:24:16 2013   current: Dec  3 23:24:29 2013
>         diff: 13(s)     hard: 3600(s)   soft: 2880(s)
>         last:                           hard: 0(s)      soft: 0(s)
>         current: 0(bytes)       hard: 0(bytes)  soft: 0(bytes)
>         allocated: 0    hard: 0 soft: 0
>         sadb_seq=1 pid=43884 refcnt=1
> C.C.C.C S.S.S.S
>         esp mode=transport spi=253016163(0x0f14b863) reqid=0(0x00000000)
>         E: rijndael-cbc  1463f10b 87e52b9b 9d32ee04 350198ae 6779d06d 
>         3f57389b 71bffd18 72211b36
>         A: hmac-sha1  1037b02e 7ec2cf51 50351bb6 cf8ab693 25d87e0a
>         seq=0x00000004 replay=4 flags=0x00000000 state=mature
>         created: Dec  3 23:24:16 2013   current: Dec  3 23:24:29 2013
>         diff: 13(s)     hard: 3600(s)   soft: 2880(s)
>         last: Dec  3 23:24:23 2013      hard: 0(s)      soft: 0(s)
>         current: 532(bytes)     hard: 0(bytes)  soft: 0(bytes)
>         allocated: 4    hard: 0 soft: 0
>         sadb_seq=0 pid=43884 refcnt=1
> 
> I cannot imagine any obvious reason for packets getting "lost" after enc0,
> so any hint would be much appreciated :)
> 

mpd uses netgraph for most if not all processing. Could it be that the
IPsec-processed packets do not reach the corresponding netgraph node?
You can look at the netgraph tree to see where mpd expects to see incoming
packets.

-- 
Eugene Perevyazko

From owner-freebsd-net@FreeBSD.ORG  Thu Dec  5 08:41:56 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 0BDFDAC0
 for <freebsd-net@freebsd.org>; Thu,  5 Dec 2013 08:41:56 +0000 (UTC)
Received: from segfault.kiev.ua (segfault.kiev.ua [193.193.193.4])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 3EDC3186B
 for <freebsd-net@freebsd.org>; Thu,  5 Dec 2013 08:41:54 +0000 (UTC)
Received: from segfault.kiev.ua (localhost.segfault.kiev.ua [127.0.0.1])
 by segfault.kiev.ua (8.14.5/8.14.5/8.Who.Cares) with ESMTP id rB58flqe031839; 
 Thu, 5 Dec 2013 10:41:47 +0200 (EET)
 (envelope-from netch@segfault.kiev.ua)
Received: (from netch@localhost)
 by segfault.kiev.ua (8.14.5/8.14.5/Submit) id rB58fghP031836;
 Thu, 5 Dec 2013 10:41:42 +0200 (EET) (envelope-from netch)
Date: Thu, 5 Dec 2013 10:41:42 +0200
From: Valentin Nechayev <netch@netch.kiev.ua>
To: freebsd-net@freebsd.org
Subject: SCTP huge connect delays (at amd64) and yet another question
Message-ID: <20131205084142.GA31113@netch.kiev.ua>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-42: On
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Dec 2013 08:41:56 -0000

Hi,

I've got some surprising test results and would like a
clarification.

A simple connection is created between two one-to-one SCTP sockets
(AF_INET, SOCK_STREAM, IPPROTO_SCTP) on loopback (127.0.0.1). The
server side sends six 3-byte messages to the client side and optionally
shuts down its writing side. The client receives all of them and
measures the time before each receive.
The code is shown at the end of this message.
Tested systems are:
* FreeBSD 9.2-release/amd64
* FreeBSD 9.1-release/amd64
* FreeBSD 9.1-release/i386
* Linux OpenSuSE 12.2, kernel 3.4.63-2.44-default, x86_64
* Linux RHEL 6.3, kernel 2.6.32-279.22.1.38.0.el6.x86_64

The first discrepancy is specific to FreeBSD on amd64 and does not
appear on the i386 version: connection setup takes 2-4 seconds (!!).
Tcpdump output suggests that a message is being lost:

tcpdump: listening on lo0, link-type NULL (BSD loopback), capture size 65535 bytes
08:18:34.639422 IP (tos 0x0, ttl 64, id 65094, offset 0, flags [none], proto SCTP (132), length 188, bad cksum 0 (->f274)!)
    10.0.0.2.50025 > 127.0.0.1.2500: sctp
        1) [INIT] [init tag: 3943463987] [rwnd: 1864135] [OS: 10] [MIS: 2048] [init TSN: 3475830004]
08:18:34.639450 IP (tos 0x0, ttl 64, id 42621, offset 0, flags [none], proto SCTP (132), length 524, bad cksum 0 (->48ee)!)
    127.0.0.1.2500 > 10.0.0.2.50025: sctp
        1) [INIT ACK] [init tag: 59811639] [rwnd: 1864135] [OS: 10] [MIS: 2048] [init TSN: 466863335]
08:18:34.639467 IP (tos 0x0, ttl 64, id 52783, offset 0, flags [none], proto SCTP (132), length 424, bad cksum 0 (->21a0)!)
    10.0.0.2.50025 > 127.0.0.1.2500: sctp
        1) [COOKIE ECHO]
08:18:35.639618 IP (tos 0x0, ttl 64, id 12109, offset 0, flags [DF], proto SCTP (132), length 424, bad cksum 0 (->8082)!)
    10.0.0.2.50025 > 127.0.0.1.2500: sctp
        1) [COOKIE ECHO]
08:18:36.692628 IP (tos 0x0, ttl 64, id 48682, offset 0, flags [DF], proto SCTP (132), length 76, bad cksum 0 (->7e01)!)
    127.0.0.1.2500 > 127.0.0.1.50025: sctp
        1) [HB REQ]
08:18:36.692668 IP (tos 0x0, ttl 64, id 10809, offset 0, flags [DF], proto SCTP (132), length 76, bad cksum 0 (->86f2)!)
    10.0.0.2.50025 > 127.0.0.1.2500: sctp
        1) [HB ACK] 
08:18:36.692707 IP (tos 0x2,ECT(0), ttl 64, id 16588, offset 0, flags [DF], proto SCTP (132), length 52, bad cksum 0 (->fb75)!)
    127.0.0.1.2500 > 127.0.0.1.50025: sctp
        1) [DATA] (B)(E) [TSN: 466863335] [SID: 0] [SSEQ 0] [PPID 0x0] [Payload:
        0x0000:  6162 63                                  abc
[...]

At 08:18:34.639467 the COOKIE ECHO was sent but apparently ignored. One
second later it was resent. Then there was another strange delay of
about one second before the HB REQ.

Test series show that this can take more than 4 seconds; the average
is about 3 seconds. Two series of 20 runs each took between 58 and 63
seconds in total, which gives an average connect time of 2.9 to 3.15
seconds.

Neither Linux nor 32-bit FreeBSD shows this.

The second discrepancy is the well-known case of the so-called "Nagle"
algorithm adapted for SCTP, but the details are confusing. If
SCTP_NODELAY isn't turned on on the server side, tcpdump shows that the
second packet is sent by the sender without delay, but the receiver's
SACK is delayed by 200 ms by default. These results are identical for
FreeBSD (32 bit) and Linux, but not for amd64 FreeBSD (see below). But
why? Common sense suggests that if the client receives everything
immediately and the server has already prepared its data, no additional
delay should be introduced. By analogy with TCP, I would expect that
"def" isn't sent until the acknowledgement for "abc" arrives, but that
it is then sent immediately.

09:28:11.374335 IP (tos 0x2,ECT(0), ttl 64, id 24204, offset 0, flags [DF], proto SCTP (132), length 52, bad cksum 0 (->ddb5)!)
    127.0.0.1.2500 > 127.0.0.1.41007: sctp
        1) [DATA] (B)(E) [TSN: 183313025] [SID: 0] [SSEQ 0] [PPID 0x0] [Payload:
        0x0000:  6162 63                                  abc
09:28:11.374349 IP (tos 0x0, ttl 64, id 522, offset 0, flags [none], proto SCTP (132), length 48, bad cksum 0 (->7a3e)!)
    127.0.0.1.41007 > 127.0.0.1.2500: sctp
        1) [SACK] [cum ack 183313025] [a_rwnd 1863876] [#gap acks 0] [#dup tsns 0]
09:28:11.374368 IP (tos 0x2,ECT(0), ttl 64, id 64629, offset 0, flags [DF], proto SCTP (132), length 52, bad cksum 0 (->3fcc)!)
    127.0.0.1.2500 > 127.0.0.1.41007: sctp
        1) [DATA] (B)(E) [TSN: 183313026] [SID: 0] [SSEQ 1] [PPID 0x0] [Payload:
        0x0000:  6465 66                                  def
09:28:11.573780 IP (tos 0x0, ttl 64, id 12179, offset 0, flags [none], proto SCTP (132), length 48, bad cksum 0 (->4cb5)!)
    127.0.0.1.41007 > 127.0.0.1.2500: sctp
        1) [SACK] [cum ack 183313026] [a_rwnd 1864135] [#gap acks 0] [#dup tsns 0]

But if the server shuts down its writing side ("s" in argv[]), this
delay disappears. Again, the logic is opaque and confusing.

64-bit (amd64) FreeBSD shows different behavior (both 9.1 and 9.2): in
addition to the setup delay (see above), the delay between the 2nd and
3rd received packets (when SCTP_NODELAY isn't activated) can be longer
than the minimum needed, and ranges from a few hundred microseconds up
to the full 0.2 second delay shown on the other platforms.
On average, about 1/8 of the runs show this delay:

$ fgrep ghi ll | sort -rn -k2,2 -t= | uniq -c
   1 got: ghi (with MSG_EOR) tdiff=200835
   1 got: ghi (with MSG_EOR) tdiff=200829
   1 got: ghi (with MSG_EOR) tdiff=200826
   1 got: ghi (with MSG_EOR) tdiff=200822
   1 got: ghi (with MSG_EOR) tdiff=200819
   1 got: ghi (with MSG_EOR) tdiff=200800
   1 got: ghi (with MSG_EOR) tdiff=200792
   1 got: ghi (with MSG_EOR) tdiff=199885
   1 got: ghi (with MSG_EOR) tdiff=163816
   1 got: ghi (with MSG_EOR) tdiff=55849
   1 got: ghi (with MSG_EOR) tdiff=1825
  21 got: ghi (with MSG_EOR) tdiff=2
  38 got: ghi (with MSG_EOR) tdiff=1

That is definitely better than a delay on every run, as on the other
platforms (but the initial delay is seriously annoying).
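
The "ghi" listing above can be cross-checked with a small tally. This is a hedged sketch: the struct and function names are invented for illustration, the data is copied from the listing as (tdiff, count) pairs, and the thresholds are arbitrary choices for what counts as "delayed".

```c
#include <stddef.h>

struct tdiff_bucket {
  int usec;   /* tdiff value printed by the test program */
  int runs;   /* number of runs that showed it (the uniq -c count) */
};

/* Count the runs whose tdiff is at least threshold_usec. */
int
runs_at_or_above(const struct tdiff_bucket *b, size_t n, int threshold_usec)
{
  int total = 0;

  for (size_t i = 0; i < n; i++)
    if (b[i].usec >= threshold_usec)
      total += b[i].runs;
  return total;
}

/* The "ghi" listing, as (tdiff, count) pairs. */
const struct tdiff_bucket ghi_runs[] = {
  {200835, 1}, {200829, 1}, {200826, 1}, {200822, 1}, {200819, 1},
  {200800, 1}, {200792, 1}, {199885, 1}, {163816, 1}, {55849, 1},
  {1825, 1}, {2, 21}, {1, 38},
};
const size_t ghi_runs_len = sizeof(ghi_runs) / sizeof(ghi_runs[0]);
```

With this data a threshold of 0 counts all 70 runs, a threshold of 199000 counts 8 runs near the full 200 ms, and a threshold of 1000 counts 11 runs with any noticeable delay, so the fraction lies between 8/70 and 11/70.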

The testing code:
===
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/poll.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <netinet/sctp.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <err.h>

#define PORT 2500

int main(int argc, char *argv[])
{
  int s_li, s_ac, s_cl;
  struct sockaddr_in sia;
  struct iovec iov[1];
  struct msghdr msg;
  socklen_t slen;
  struct timeval tv0, tv1;
  int tdiff;
  int i;

  s_li = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
  if (s_li < 0)
    err(1, "socket");
  memset(&sia, 0, sizeof(sia));
  sia.sin_family = AF_INET;
  sia.sin_addr.s_addr = htonl(0x7F000001);
  sia.sin_port = htons(PORT);
  if (bind(s_li, (struct sockaddr*)&sia, sizeof(sia)) < 0)
    err(1, "bind");
  if (listen(s_li, 1) < 0)
    err(1, "listen");
  s_cl = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
  if (s_cl < 0)
    err(1, "socket");
  if (connect(s_cl, (struct sockaddr*)&sia, sizeof(sia)) < 0)
    err(1, "connect");
  slen = sizeof(sia);
  s_ac = accept(s_li, (struct sockaddr*) &sia, &slen);
  if (s_ac < 0)
    err(1, "accept");
  for (i = 1; i < argc; ++i) {
    if (!strcmp(argv[i], "nn")) {
      const int one = 1;
      if (setsockopt(s_ac, IPPROTO_SCTP, SCTP_NODELAY, &one, sizeof(one)) < 0)
        warn("setsockopt(SCTP_NODELAY)");
    }
  }
  if (send(s_ac, "abc", 3, 0) != 3)
    err(1, "send");
  if (send(s_ac, "def", 3, MSG_EOR) != 3)
    err(1, "send");
  if (send(s_ac, "ghi", 3, 0) != 3)
    err(1, "send");
  if (send(s_ac, "jkl", 3, MSG_EOR) != 3)
    err(1, "send");
  if (send(s_ac, "mno", 3, 0) != 3)
    err(1, "send");
  if (send(s_ac, "pqr", 3, MSG_EOR) != 3)
    err(1, "send");
  for (i = 1; i < argc; ++i) {
    if (!strcmp(argv[i], "s"))
      shutdown(s_ac, SHUT_WR);
  }
  for(;;) {
    char buf[1024];
    memset(&msg, 0, sizeof(msg));
    iov[0].iov_base = buf; iov[0].iov_len = sizeof(buf) - 1;
    msg.msg_iov = iov; msg.msg_iovlen = 1;
    gettimeofday(&tv0, NULL);
    ssize_t got = recvmsg(s_cl, &msg, 0);
    gettimeofday(&tv1, NULL);
    /* Elapsed microseconds, including whole seconds (a wait may exceed 1 s). */
    tdiff = (int)(tv1.tv_sec - tv0.tv_sec) * 1000000 +
        ((int)tv1.tv_usec - (int)tv0.tv_usec);
    if (got == 0)
      break;
    if (got == -1) {
      perror("recvmsg");
      break;
    }
    buf[got] = 0;
    printf("got: %s (%s MSG_EOR) tdiff=%d\n",
        buf,
        (msg.msg_flags & MSG_EOR) ? "with" : "without",
        tdiff);
    if (!strncmp(buf, "pqr", 3))
      break;
  }
  return 0;
}
// vim:ts=2:sts=2:sw=2:et:si:
===


-netch-

From owner-freebsd-net@FreeBSD.ORG  Thu Dec  5 10:32:05 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id E85085A5
 for <freebsd-net@freebsd.org>; Thu,  5 Dec 2013 10:32:04 +0000 (UTC)
Received: from mail-n.franken.de (drew.ipv6.franken.de
 [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 1F4ED103B
 for <freebsd-net@freebsd.org>; Thu,  5 Dec 2013 10:32:04 +0000 (UTC)
Received: from [10.225.9.5] (unknown [194.95.73.101])
 (Authenticated sender: macmic)
 by mail-n.franken.de (Postfix) with ESMTP id ED9271C0C0692;
 Thu,  5 Dec 2013 11:32:00 +0100 (CET)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.6 \(1510\))
Subject: Re: SCTP huge connect delays (at amd64) and yet another question
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <20131205084142.GA31113@netch.kiev.ua>
Date: Thu, 5 Dec 2013 11:32:03 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <11932BA9-A734-4D4F-BCBB-6A0D926A22A9@lurchi.franken.de>
References: <20131205084142.GA31113@netch.kiev.ua>
To: Valentin Nechayev <netch@netch.kiev.ua>
X-Mailer: Apple Mail (2.1510)
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Dec 2013 10:32:05 -0000

On Dec 5, 2013, at 9:41 AM, Valentin Nechayev <netch@netch.kiev.ua> wrote:

> Hi,
>=20
> I've got some test results which are surprising and I would get
> a clarification.
>=20
> A simple connection is created between two one-to-one SCTP sockets
> (AF_INET, SOCK_STREAM, IPPROTO_SCTP) at loopback (127.0.0.1). The
> server side sends 6 3-byte messages to client side and optionally
> designates writing shutdown. Client receives all them and measures
> a time before each receiving.
> Code is showed at the end of this message.
> Tested systems are:
> * FreeBSD 9.2-release/amd64
> * FreeBSD 9.1-release/amd64
> * FreeBSD 9.1-release/i386
> * Linux OpenSuSE 12.2, kernel 3.4.63-2.44-default, x86_64
> * Linux RHEL 6.3, kernel 2.6.32-279.22.1.38.0.el6.x86_64
>=20
> The first discrepancy found is specific for FreeBSD on amd64 and not
> for i386 version; it's that connection setup lasts 2-4 seconds (!!)
>  Tcpdump shows indication that could be parsed as message miss:
Hi Valentin,

Could you send me the .pcap file instead of the tcpdump output?
I would like to see the addresses listed in the INIT and INIT-ACK.

You can send that file to tuexen@freebsd.org.
>=20
> tcpdump: listening on lo0, link-type NULL (BSD loopback), capture size =
65535 byt
> es
> 08:18:34.639422 IP (tos 0x0, ttl 64, id 65094, offset 0, flags [none], =
proto SCT
> P (132), length 188, bad cksum 0 (->f274)!)
>    10.0.0.2.50025 > 127.0.0.1.2500: sctp
I'm wondering why 10.0.0.2 is the source address and not 127.0.0.1
>        1) [INIT] [init tag: 3943463987] [rwnd: 1864135] [OS: 10] [MIS: =
2048] [i
> nit TSN: 3475830004]
> 08:18:34.639450 IP (tos 0x0, ttl 64, id 42621, offset 0, flags [none], =
proto SCT
> P (132), length 524, bad cksum 0 (->48ee)!)
>    127.0.0.1.2500 > 10.0.0.2.50025: sctp
>        1) [INIT ACK] [init tag: 59811639] [rwnd: 1864135] [OS: 10] =
[MIS: 2048]
> [init TSN: 466863335]
> 08:18:34.639467 IP (tos 0x0, ttl 64, id 52783, offset 0, flags [none], =
proto SCT
> P (132), length 424, bad cksum 0 (->21a0)!)
>    10.0.0.2.50025 > 127.0.0.1.2500: sctp
>        1) [COOKIE ECHO]
> 08:18:35.639618 IP (tos 0x0, ttl 64, id 12109, offset 0, flags [DF], =
proto SCTP
> (132), length 424, bad cksum 0 (->8082)!)
>    10.0.0.2.50025 > 127.0.0.1.2500: sctp
>        1) [COOKIE ECHO]
> 08:18:36.692628 IP (tos 0x0, ttl 64, id 48682, offset 0, flags [DF], =
proto SCTP
> (132), length 76, bad cksum 0 (->7e01)!)
>    127.0.0.1.2500 > 127.0.0.1.50025: sctp
The retransmission goes from 127.0.0.1. Hmm. Not sure why.
>        1) [HB REQ]
> 08:18:36.692668 IP (tos 0x0, ttl 64, id 10809, offset 0, flags [DF], =
proto SCTP (132), length 76, bad cksum 0 (->86f2)!)
>    10.0.0.2.50025 > 127.0.0.1.2500: sctp
>        1) [HB ACK]=20
> 08:18:36.692707 IP (tos 0x2,ECT(0), ttl 64, id 16588, offset 0, flags =
[DF], proto SCTP (132), length 52, bad cksum 0 (->fb75)!)
>    127.0.0.1.2500 > 127.0.0.1.50025: sctp
>        1) [DATA] (B)(E) [TSN: 466863335] [SID: 0] [SSEQ 0] [PPID 0x0] =
[Payload:
>        0x0000:  6162 63                                  abc
> [...]
>=20
> At 08:18:34.639467, cookie echo was sent but likely ignored. One
> second later it was resent. Then, yet another strange timeout was
> invented before HB REQ.
>=20
> Test series show this can spend more than 4 seconds, average value
> is about 3 seconds. Two 20-times run summary times are 58 to 63
> seconds, so, I've got 2.9...3.15 average connect time.
>=20
> Neither Linux nor 32-bit FreeBSD shows this.
FreeBSD shouldn't either... Do you see this on FreeBSD 9.2 amd64?
>=20
> The second discrepancy is well known case of so-called "Nagle"
> algorithm adapted for SCTP but details are confusing. If
> SCTP_NODELAY isn't turned on on server side, tcpdump shows that the
> second packet is sent from sender side without delay, but receiver's
> SACK is delayed for 200 ms by default. These results are identical for
> FreeBSD (32 bit) and Linux, but not amd64 FreeBSD (see below). But
> why? A common sense suggests that, if client receives all immediately,
> and server has already prepared its data, no additional delay shall be
> invented. In analogue to TCP, I would expect that, until acknoledge
> for "abc" is got, "def" isn't sent, but then the latter is sent
> immediately.
>=20
> 09:28:11.374335 IP (tos 0x2,ECT(0), ttl 64, id 24204, offset 0, flags =
[DF], prot
> o SCTP (132), length 52, bad cksum 0 (->ddb5)!)
>    127.0.0.1.2500 > 127.0.0.1.41007: sctp
>        1) [DATA] (B)(E) [TSN: 183313025] [SID: 0] [SSEQ 0] [PPID 0x0] =
[Payload:
>        0x0000:  6162 63                                  abc
> 09:28:11.374349 IP (tos 0x0, ttl 64, id 522, offset 0, flags [none], =
proto SCTP=20
> (132), length 48, bad cksum 0 (->7a3e)!)
>    127.0.0.1.41007 > 127.0.0.1.2500: sctp
>        1) [SACK] [cum ack 183313025] [a_rwnd 1863876] [#gap acks 0] =
[#dup tsns=20
> 0]=20
> 09:28:11.374368 IP (tos 0x2,ECT(0), ttl 64, id 64629, offset 0, flags =
[DF], prot
> o SCTP (132), length 52, bad cksum 0 (->3fcc)!)
>    127.0.0.1.2500 > 127.0.0.1.41007: sctp
>        1) [DATA] (B)(E) [TSN: 183313026] [SID: 0] [SSEQ 1] [PPID 0x0] =
[Payload:
>        0x0000:  6465 66                                  def
> 09:28:11.573780 IP (tos 0x0, ttl 64, id 12179, offset 0, flags [none], =
proto SCT
> P (132), length 48, bad cksum 0 (->4cb5)!)
>    127.0.0.1.41007 > 127.0.0.1.2500: sctp
>        1) [SACK] [cum ack 183313026] [a_rwnd 1864135] [#gap acks 0] =
[#dup tsns=20
> 0]=20
>=20
Please note that the first SACK is returned without the 200 ms delay. This is
required by the RFC, and the above trace seems to show that.
> But, if server shuts its writing side down ("s" in argv[]), this
> laziness disappears. Again, the logic is too opaque and confusing.
What do you mean by this?
>=20
> 64-bit (amd64) FreeBSD shows another behavior (both 9.1 and 9.2): in
> addition to setup delay (see above), the delay between 2nd and 3rd
> received packet (case SCTP_NODELAY isn't activated) could be longer
> than minimally needed one and spreads between a few hundreds of
> microseconds up to full 0.2 second delay shown on other platforms.
> In average, 1/8 of runs show this delay:
>=20
> $ fgrep ghi ll | sort -rn -k2,2 -t=3D | uniq -c
>   1 got: ghi (with MSG_EOR) tdiff=3D200835
>   1 got: ghi (with MSG_EOR) tdiff=3D200829
>   1 got: ghi (with MSG_EOR) tdiff=3D200826
>   1 got: ghi (with MSG_EOR) tdiff=3D200822
>   1 got: ghi (with MSG_EOR) tdiff=3D200819
>   1 got: ghi (with MSG_EOR) tdiff=3D200800
>   1 got: ghi (with MSG_EOR) tdiff=3D200792
>   1 got: ghi (with MSG_EOR) tdiff=3D199885
>   1 got: ghi (with MSG_EOR) tdiff=3D163816
>   1 got: ghi (with MSG_EOR) tdiff=3D55849
>   1 got: ghi (with MSG_EOR) tdiff=3D1825
>  21 got: ghi (with MSG_EOR) tdiff=3D2
>  38 got: ghi (with MSG_EOR) tdiff=3D1
>=20
> It's definitely better than delay each run, as on other platforms
> (but the initial delay annoys roughly).
Without SCTP_NODELAY, bundling may or may not happen; it depends on timing.
It would be great if you could provide a .pcap file for a transfer you
think shows some buggy behaviour. Then we can figure out what is going on.
>=20
> The testing code:
> =3D=3D=3D
> #include <sys/types.h>
> #include <sys/socket.h>
> #include <sys/poll.h>
> #include <sys/time.h>
> #include <netinet/in.h>
> #include <netinet/sctp.h>
> #include <arpa/inet.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <stdio.h>
> #include <string.h>
> #include <err.h>
>=20
> #define PORT 2500
>=20
> int main(int argc, char *argv[])
> {
>  int s_li, s_ac, s_cl;
>  struct sockaddr_in sia;
>  struct iovec iov[1];
>  struct msghdr msg;
>  socklen_t slen;
>  struct timeval tv0, tv1;
>  int tdiff;
>  int i;
>=20
>  s_li =3D socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
>  if (s_li < 0)
>    err(1, "socket");
>  memset(&sia, 0, sizeof(sia));
>  sia.sin_family =3D AF_INET;
>  sia.sin_addr.s_addr =3D htonl(0x7F000001);
>  sia.sin_port =3D htons(PORT);
>  if (bind(s_li, (struct sockaddr*)&sia, sizeof(sia)) < 0)
>    err(1, "bind");
>  if (listen(s_li, 1) < 0)
>    err(1, "listen");
>  s_cl =3D socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
>  if (s_cl < 0)
>    err(1, "socket");
>  if (connect(s_cl, (struct sockaddr*)&sia, sizeof(sia)) < 0)
>    err(1, "connect");
>  slen =3D sizeof(sia);
>  s_ac =3D accept(s_li, (struct sockaddr*) &sia, &slen);
>  if (s_ac < 0)
>    err(1, "accept");
>  for (i =3D 1; i < argc; ++i) {
>    if (!strcmp(argv[i], "nn")) {
>      const int one =3D 1;
>      if (setsockopt(s_ac, IPPROTO_SCTP, SCTP_NODELAY, &one, =
sizeof(one)) < 0)
>        warn("setsockopt(SCTP_NODELAY)");
>    }
>  }
>  if (send(s_ac, "abc", 3, 0) !=3D 3)
>    err(1, "send");
>  if (send(s_ac, "def", 3, MSG_EOR) !=3D 3)
MSG_EOR is not something you pass to a send() call. The flag is only
returned by the recvmsg() call.
>    err(1, "send");
>  if (send(s_ac, "ghi", 3, 0) !=3D 3)
>    err(1, "send");
>  if (send(s_ac, "jkl", 3, MSG_EOR) !=3D 3)
>    err(1, "send");
>  if (send(s_ac, "mno", 3, 0) !=3D 3)
>    err(1, "send");
>  if (send(s_ac, "pqr", 3, MSG_EOR) !=3D 3)
>    err(1, "send");
>  for (i =3D 1; i < argc; ++i) {
>    if (!strcmp(argv[i], "s"))
>      shutdown(s_ac, SHUT_WR);
>  }
>  for(;;) {
>    char buf[1024];
>    memset(&msg, 0, sizeof(msg));
>    iov[0].iov_base =3D buf; iov[0].iov_len =3D sizeof(buf) - 1;
>    msg.msg_iov =3D iov; msg.msg_iovlen =3D 1;
>    gettimeofday(&tv0, NULL);
>    ssize_t got =3D recvmsg(s_cl, &msg, 0);
>    gettimeofday(&tv1, NULL);
>    tdiff =3D (int)tv1.tv_usec - (int)tv0.tv_usec;
>    if (tdiff < 0)
>      tdiff +=3D 1000000;
>    if (got =3D=3D 0)
>      break;
>    if (got =3D=3D -1) {
>      perror("recvmsg");
>      break;
>    }
>    buf[got] =3D 0;
>    printf("got: %s (%s MSG_EOR) tdiff=3D%d\n",
>        buf,
>        (msg.msg_flags & MSG_EOR) ? "with" : "without",
>        tdiff);
>    if (!strncmp(buf, "pqr", 3))
>      break;
>  }
>  return 0;
> }
OK. Here is what I would expect on the wire:

Without SCTP_NODELAY:

  > INIT
  < INIT_ACK
  > COOKIE_ECHO
  < COOKIE_ACK
  < DATA(abc)
  > SACK
  < DATA(def);DATA(ghi);DATA(jkl);DATA(mno);DATA(pqr)
  > SACK
  > SHUTDOWN
  < SHUTDOWN_ACK
  > SHUTDOWN_COMPLETE

There should be no substantial delay between any messages above.

With SCTP_NODELAY:

  > INIT
  < INIT_ACK
  > COOKIE_ECHO
  < COOKIE_ACK
  < DATA(abc)
  < DATA(def)
  < DATA(ghi)
  < DATA(jkl)
  < DATA(mno)
  < DATA(pqr)
  > SHUTDOWN
  < SHUTDOWN_ACK
  > SHUTDOWN_COMPLETE

There will be three SACKs somewhere between the DATA chunks, depending on
the timing.

There should be no substantial delay between any messages above.

I think if you see anything else, there is a bug. So do you see a different
behavior on FreeBSD 9.2 (i386/amd64)? If yes, can you provide a .pcap file?


Here is what I see on a 9.2 amd64 system:

tuexen@bsd9:~ % uname -a
FreeBSD bsd9.fh-muenster.de 9.2-RELEASE FreeBSD 9.2-RELEASE #0 r255898: Thu Sep 26 22:50:31 UTC 2013     root@bake.isc.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
tuexen@bsd9:~ % ./valentin
got: abc (with MSG_EOR) tdiff=3
got: def (with MSG_EOR) tdiff=1
got: ghi (with MSG_EOR) tdiff=1
got: jkl (with MSG_EOR) tdiff=1
got: mno (with MSG_EOR) tdiff=1
got: pqr (with MSG_EOR) tdiff=0
tuexen@bsd9:~ % ./valentin nn
got: abc (with MSG_EOR) tdiff=4
got: def (with MSG_EOR) tdiff=2
got: ghi (with MSG_EOR) tdiff=1
got: jkl (with MSG_EOR) tdiff=1
got: mno (with MSG_EOR) tdiff=1
got: pqr (with MSG_EOR) tdiff=1

Do you have any special routing setup?

Best regards
Michael
> // vim:ts=3D2:sts=3D2:sw=3D2:et:si:
> =3D=3D=3D
>=20
>=20
> -netch-
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>=20


From owner-freebsd-net@FreeBSD.ORG  Thu Dec  5 10:57:45 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id BDE1DC5C
 for <freebsd-net@freebsd.org>; Thu,  5 Dec 2013 10:57:45 +0000 (UTC)
Received: from mail-n.franken.de (drew.ipv6.franken.de
 [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id E5B7F11AF
 for <freebsd-net@freebsd.org>; Thu,  5 Dec 2013 10:57:44 +0000 (UTC)
Received: from [10.225.9.5] (unknown [194.95.73.101])
 (Authenticated sender: macmic)
 by mail-n.franken.de (Postfix) with ESMTP id 655081C0C0693;
 Thu,  5 Dec 2013 11:57:43 +0100 (CET)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.6 \(1510\))
Subject: Re: SCTP huge connect delays (at amd64) and yet another question
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <11932BA9-A734-4D4F-BCBB-6A0D926A22A9@lurchi.franken.de>
Date: Thu, 5 Dec 2013 11:57:46 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <45DB7B10-68DE-41F2-A5E9-22AFFC65999E@lurchi.franken.de>
References: <20131205084142.GA31113@netch.kiev.ua>
 <11932BA9-A734-4D4F-BCBB-6A0D926A22A9@lurchi.franken.de>
To: Valentin Nechayev <netch@netch.kiev.ua>
X-Mailer: Apple Mail (2.1510)
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Dec 2013 10:57:45 -0000

More thinking and testing.

Without SCTP_NODELAY, the following can also happen:

  > INIT
  < INIT-ACK
  > COOKIE-ECHO
  < COOKIE-ACK
  < DATA(abc)
  > SACK
  < DATA(def) possibly more...
200 ms delay
  > SACK
  < all remaining DATA chunks
  > SHUTDOWN
  < SHUTDOWN-ACK
  > SHUTDOWN-COMPLETE

Timing comes into play. The question is whether all send() calls have
completed before the first SACK is received. I'm not sure this depends on
i386 vs. amd64, but timing is important. On a Raspberry Pi I reproducibly
saw:

  > INIT
  < INIT-ACK
  > COOKIE-ECHO
  < COOKIE-ACK
  < DATA(abc)
  > SACK
  < DATA(def)
200 ms delay
  > SACK
  < DATA(ghi);DATA(jkl);DATA(mno);DATA(pqr);
  > SHUTDOWN
  < SHUTDOWN-ACK
  > SHUTDOWN-COMPLETE

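The two traces above differ only in how many send() calls had finished when the first SACK came back. A toy model of that dependence follows. It assumes the simplified rule that small data is held back while anything is unacknowledged; `bundle_plan`, `push_packet` and their parameters are hypothetical names, and this is not the FreeBSD implementation.

```c
#include <stddef.h>

/* Record one packet carrying `chunks` DATA chunks; returns the new count. */
size_t
push_packet(size_t *pkt_chunks, size_t cap, size_t npkts, size_t chunks)
{
  if (npkts < cap)
    pkt_chunks[npkts] = chunks;
  return npkts + 1;
}

/*
 * n_msgs small messages are queued by back-to-back send() calls, and
 * sent_before_sack of them have been queued when the first SACK arrives.
 * Fills pkt_chunks[] with the DATA chunks per packet and returns the
 * number of packets.
 */
size_t
bundle_plan(size_t n_msgs, size_t sent_before_sack, size_t *pkt_chunks,
    size_t cap)
{
  size_t npkts = 0, left = n_msgs, take;

  if (n_msgs == 0)
    return 0;
  if (sent_before_sack < 1)
    sent_before_sack = 1;
  if (sent_before_sack > n_msgs)
    sent_before_sack = n_msgs;

  /* Packet 1: nothing is in flight, so the first message leaves at once. */
  npkts = push_packet(pkt_chunks, cap, npkts, 1);
  left -= 1;

  /* Packet 2: whatever was queued when the first SACK arrived, or the
   * next single message if nothing was queued yet. */
  if (left > 0) {
    take = (sent_before_sack >= 2) ? sent_before_sack - 1 : 1;
    npkts = push_packet(pkt_chunks, cap, npkts, take);
    left -= take;
  }

  /* Packet 3: everything else waits for the delayed second SACK. */
  if (left > 0)
    npkts = push_packet(pkt_chunks, cap, npkts, left);
  return npkts;
}
```

For the six 3-byte messages, bundle_plan(6, 6, ...) yields two packets (1 chunk, then 5 bundled), while bundle_plan(6, 2, ...) yields three (1, 1, then 4 after the delayed SACK), which is the shape of the Raspberry Pi trace.
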
Best regards
Michael
On Dec 5, 2013, at 11:32 AM, Michael Tuexen =
<Michael.Tuexen@lurchi.franken.de> wrote:

> On Dec 5, 2013, at 9:41 AM, Valentin Nechayev <netch@netch.kiev.ua> =
wrote:
>=20
>> Hi,
>>=20
>> I've got some test results which are surprising and I would get
>> a clarification.
>>=20
>> A simple connection is created between two one-to-one SCTP sockets
>> (AF_INET, SOCK_STREAM, IPPROTO_SCTP) at loopback (127.0.0.1). The
>> server side sends 6 3-byte messages to client side and optionally
>> designates writing shutdown. Client receives all them and measures
>> a time before each receiving.
>> Code is showed at the end of this message.
>> Tested systems are:
>> * FreeBSD 9.2-release/amd64
>> * FreeBSD 9.1-release/amd64
>> * FreeBSD 9.1-release/i386
>> * Linux OpenSuSE 12.2, kernel 3.4.63-2.44-default, x86_64
>> * Linux RHEL 6.3, kernel 2.6.32-279.22.1.38.0.el6.x86_64
>>=20
>> The first discrepancy found is specific for FreeBSD on amd64 and not
>> for i386 version; it's that connection setup lasts 2-4 seconds (!!)
>> Tcpdump shows indication that could be parsed as message miss:
> Hi Valentin,
>=20
> could you send me the .pcap file instead of the tcpdump output.
> I would like to see the addresses listed in the INIT and INIT-ACK.
>=20
> You can send that file to tuexen@freebsd.org.
>>=20
>> tcpdump: listening on lo0, link-type NULL (BSD loopback), capture =
size 65535 byt
>> es
>> 08:18:34.639422 IP (tos 0x0, ttl 64, id 65094, offset 0, flags =
[none], proto SCT
>> P (132), length 188, bad cksum 0 (->f274)!)
>>   10.0.0.2.50025 > 127.0.0.1.2500: sctp
> I'm wondering why 10.0.0.2 is the source address and not 127.0.0.1
>>       1) [INIT] [init tag: 3943463987] [rwnd: 1864135] [OS: 10] [MIS: =
2048] [i
>> nit TSN: 3475830004]
>> 08:18:34.639450 IP (tos 0x0, ttl 64, id 42621, offset 0, flags =
[none], proto SCT
>> P (132), length 524, bad cksum 0 (->48ee)!)
>>   127.0.0.1.2500 > 10.0.0.2.50025: sctp
>>       1) [INIT ACK] [init tag: 59811639] [rwnd: 1864135] [OS: 10] =
[MIS: 2048]
>> [init TSN: 466863335]
>> 08:18:34.639467 IP (tos 0x0, ttl 64, id 52783, offset 0, flags =
[none], proto SCT
>> P (132), length 424, bad cksum 0 (->21a0)!)
>>   10.0.0.2.50025 > 127.0.0.1.2500: sctp
>>       1) [COOKIE ECHO]
>> 08:18:35.639618 IP (tos 0x0, ttl 64, id 12109, offset 0, flags [DF], =
proto SCTP
>> (132), length 424, bad cksum 0 (->8082)!)
>>   10.0.0.2.50025 > 127.0.0.1.2500: sctp
>>       1) [COOKIE ECHO]
>> 08:18:36.692628 IP (tos 0x0, ttl 64, id 48682, offset 0, flags [DF], =
proto SCTP
>> (132), length 76, bad cksum 0 (->7e01)!)
>>   127.0.0.1.2500 > 127.0.0.1.50025: sctp
> The retransmission goes from 127.0.0.1. Hmm. Not sure why.
>>       1) [HB REQ]
>> 08:18:36.692668 IP (tos 0x0, ttl 64, id 10809, offset 0, flags [DF], =
proto SCTP (132), length 76, bad cksum 0 (->86f2)!)
>>   10.0.0.2.50025 > 127.0.0.1.2500: sctp
>>       1) [HB ACK]=20
>> 08:18:36.692707 IP (tos 0x2,ECT(0), ttl 64, id 16588, offset 0, flags =
[DF], proto SCTP (132), length 52, bad cksum 0 (->fb75)!)
>>   127.0.0.1.2500 > 127.0.0.1.50025: sctp
>>       1) [DATA] (B)(E) [TSN: 466863335] [SID: 0] [SSEQ 0] [PPID 0x0] =
[Payload:
>>       0x0000:  6162 63                                  abc
>> [...]
>>=20
>> At 08:18:34.639467, cookie echo was sent but likely ignored. One
>> second later it was resent. Then, yet another strange timeout was
>> invented before HB REQ.
>>=20
>> Test series show this can spend more than 4 seconds, average value
>> is about 3 seconds. Two 20-times run summary times are 58 to 63
>> seconds, so, I've got 2.9...3.15 average connect time.
>>=20
>> Neither Linux nor 32-bit FreeBSD shows this.
> FreeBSD should neither... Do you see this on FreeBSD 9.2 amd64?
>>=20
>> The second discrepancy is well known case of so-called "Nagle"
>> algorithm adapted for SCTP but details are confusing. If
>> SCTP_NODELAY isn't turned on on server side, tcpdump shows that the
>> second packet is sent from sender side without delay, but receiver's
>> SACK is delayed for 200 ms by default. These results are identical =
for
>> FreeBSD (32 bit) and Linux, but not amd64 FreeBSD (see below). But
>> why? A common sense suggests that, if client receives all =
immediately,
>> and server has already prepared its data, no additional delay shall =
be
>> invented. In analogue to TCP, I would expect that, until acknoledge
>> for "abc" is got, "def" isn't sent, but then the latter is sent
>> immediately.
>>=20
>> 09:28:11.374335 IP (tos 0x2,ECT(0), ttl 64, id 24204, offset 0, flags =
[DF], prot
>> o SCTP (132), length 52, bad cksum 0 (->ddb5)!)
>>   127.0.0.1.2500 > 127.0.0.1.41007: sctp
>>       1) [DATA] (B)(E) [TSN: 183313025] [SID: 0] [SSEQ 0] [PPID 0x0] =
[Payload:
>>       0x0000:  6162 63                                  abc
>> 09:28:11.374349 IP (tos 0x0, ttl 64, id 522, offset 0, flags [none], =
proto SCTP=20
>> (132), length 48, bad cksum 0 (->7a3e)!)
>>   127.0.0.1.41007 > 127.0.0.1.2500: sctp
>>       1) [SACK] [cum ack 183313025] [a_rwnd 1863876] [#gap acks 0] =
[#dup tsns=20
>> 0]=20
>> 09:28:11.374368 IP (tos 0x2,ECT(0), ttl 64, id 64629, offset 0, flags =
[DF], prot
>> o SCTP (132), length 52, bad cksum 0 (->3fcc)!)
>>   127.0.0.1.2500 > 127.0.0.1.41007: sctp
>>       1) [DATA] (B)(E) [TSN: 183313026] [SID: 0] [SSEQ 1] [PPID 0x0] =
[Payload:
>>       0x0000:  6465 66                                  def
>> 09:28:11.573780 IP (tos 0x0, ttl 64, id 12179, offset 0, flags =
[none], proto SCT
>> P (132), length 48, bad cksum 0 (->4cb5)!)
>>   127.0.0.1.41007 > 127.0.0.1.2500: sctp
>>       1) [SACK] [cum ack 183313026] [a_rwnd 1864135] [#gap acks 0] =
[#dup tsns=20
>> 0]=20
>>=20
> Please note, that the first SACK is returned without the 200ms delay. =
This is
> required by the RFC and the above trace seems to show that.
>> But, if server shuts its writing side down ("s" in argv[]), this
>> laziness disappears. Again, the logic is too opaque and confusing.
> What do you mean by this?
>>=20
>> 64-bit (amd64) FreeBSD shows another behavior (both 9.1 and 9.2): in
>> addition to setup delay (see above), the delay between 2nd and 3rd
>> received packet (case SCTP_NODELAY isn't activated) could be longer
>> than minimally needed one and spreads between a few hundreds of
>> microseconds up to full 0.2 second delay shown on other platforms.
>> In average, 1/8 of runs show this delay:
>>=20
>> $ fgrep ghi ll | sort -rn -k2,2 -t=3D | uniq -c
>>  1 got: ghi (with MSG_EOR) tdiff=3D200835
>>  1 got: ghi (with MSG_EOR) tdiff=3D200829
>>  1 got: ghi (with MSG_EOR) tdiff=3D200826
>>  1 got: ghi (with MSG_EOR) tdiff=3D200822
>>  1 got: ghi (with MSG_EOR) tdiff=3D200819
>>  1 got: ghi (with MSG_EOR) tdiff=3D200800
>>  1 got: ghi (with MSG_EOR) tdiff=3D200792
>>  1 got: ghi (with MSG_EOR) tdiff=3D199885
>>  1 got: ghi (with MSG_EOR) tdiff=3D163816
>>  1 got: ghi (with MSG_EOR) tdiff=3D55849
>>  1 got: ghi (with MSG_EOR) tdiff=3D1825
>> 21 got: ghi (with MSG_EOR) tdiff=3D2
>> 38 got: ghi (with MSG_EOR) tdiff=3D1
>>=20
>> It's definitely better than delay each run, as on other platforms
>> (but the initial delay annoys roughly).
> Without SCTP_NODELAY bundling can happen or not, it depends on timing.
> It would be great, if you can provide a .pcap file for a transfer you
> think shows some buggy behaviour. Then we can figure out what is going =
on.
>>=20
>> The testing code:
>> =3D=3D=3D
>> #include <sys/types.h>
>> #include <sys/socket.h>
>> #include <sys/poll.h>
>> #include <sys/time.h>
>> #include <netinet/in.h>
>> #include <netinet/sctp.h>
>> #include <arpa/inet.h>
>> #include <fcntl.h>
>> #include <unistd.h>
>> #include <stdio.h>
>> #include <string.h>
>> #include <err.h>
>>=20
>> #define PORT 2500
>>=20
>> int main(int argc, char *argv[])
>> {
>> int s_li, s_ac, s_cl;
>> struct sockaddr_in sia;
>> struct iovec iov[1];
>> struct msghdr msg;
>> socklen_t slen;
>> struct timeval tv0, tv1;
>> int tdiff;
>> int i;
>>=20
>> s_li =3D socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
>> if (s_li < 0)
>>   err(1, "socket");
>> memset(&sia, 0, sizeof(sia));
>> sia.sin_family =3D AF_INET;
>> sia.sin_addr.s_addr =3D htonl(0x7F000001);
>> sia.sin_port =3D htons(PORT);
>> if (bind(s_li, (struct sockaddr*)&sia, sizeof(sia)) < 0)
>>   err(1, "bind");
>> if (listen(s_li, 1) < 0)
>>   err(1, "listen");
>> s_cl =3D socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
>> if (s_cl < 0)
>>   err(1, "socket");
>> if (connect(s_cl, (struct sockaddr*)&sia, sizeof(sia)) < 0)
>>   err(1, "connect");
>> slen =3D sizeof(sia);
>> s_ac =3D accept(s_li, (struct sockaddr*) &sia, &slen);
>> if (s_ac < 0)
>>   err(1, "accept");
>> for (i =3D 1; i < argc; ++i) {
>>   if (!strcmp(argv[i], "nn")) {
>>     const int one =3D 1;
>>     if (setsockopt(s_ac, IPPROTO_SCTP, SCTP_NODELAY, &one, =
sizeof(one)) < 0)
>>       warn("setsockopt(SCTP_NODELAY)");
>>   }
>> }
>> if (send(s_ac, "abc", 3, 0) !=3D 3)
>>   err(1, "send");
>> if (send(s_ac, "def", 3, MSG_EOR) !=3D 3)
> MSG_EOR is nothing you provide at a send() call. The flag is only
> returned by the recvmsg() call.
>>   err(1, "send");
>> if (send(s_ac, "ghi", 3, 0) !=3D 3)
>>   err(1, "send");
>> if (send(s_ac, "jkl", 3, MSG_EOR) !=3D 3)
>>   err(1, "send");
>> if (send(s_ac, "mno", 3, 0) !=3D 3)
>>   err(1, "send");
>> if (send(s_ac, "pqr", 3, MSG_EOR) !=3D 3)
>>   err(1, "send");
>> for (i =3D 1; i < argc; ++i) {
>>   if (!strcmp(argv[i], "s"))
>>     shutdown(s_ac, SHUT_WR);
>> }
>> for(;;) {
>>   char buf[1024];
>>   memset(&msg, 0, sizeof(msg));
>>   iov[0].iov_base =3D buf; iov[0].iov_len =3D sizeof(buf) - 1;
>>   msg.msg_iov =3D iov; msg.msg_iovlen =3D 1;
>>   gettimeofday(&tv0, NULL);
>>   ssize_t got =3D recvmsg(s_cl, &msg, 0);
>>   gettimeofday(&tv1, NULL);
>>   tdiff =3D (int)tv1.tv_usec - (int)tv0.tv_usec;
>>   if (tdiff < 0)
>>     tdiff +=3D 1000000;
>>   if (got =3D=3D 0)
>>     break;
>>   if (got =3D=3D -1) {
>>     perror("recvmsg");
>>     break;
>>   }
>>   buf[got] =3D 0;
>>   printf("got: %s (%s MSG_EOR) tdiff=3D%d\n",
>>       buf,
>>       (msg.msg_flags & MSG_EOR) ? "with" : "without",
>>       tdiff);
>>   if (!strncmp(buf, "pqr", 3))
>>     break;
>> }
>> return 0;
>> }
> OK. Here is what I would expect on the wire:
>=20
> Without SCTP_NODELAY:
>=20
>> INIT
> < INIT_ACK
>> COOKIE_ECHO
> < COOKIE_ACK
> < DATA(abc)
>> SACK
> < DATA(def);DATA(ghi);DATA(jkl);DATA(mno);DATA(pqr)
>> SACK
>> SHUTDOWN
> < SHUTDOWN_ACK
>> SHUTDOWN_COMPLETE
>=20
> There should be no substantial delay between any messages above.
>=20
> With SCTP_NODELAY
>> INIT
> < INIT_ACK
>> COOKIE_ECHO
> < COOKIE_ACK
> < DATA(abc)
> < DATA(def)
> < DATA(ghi)
> < DATA(mno)
> < DATA(pqr)
> > SHUTDOWN
> < SHUTDOWN_ACK
> > SHUTDOWN_COMPLETE
>
> There will be three SACK somewhere between the DATA chunks depending on
> the timing.
>
> There should be no substantial delay between any messages above.
>
> I think if you see anything else, there is a bug. So do you see a different
> behavior on FreeBSD 9.2 (i386/amd64)? If yes, can you provide a .pcap file?
>
>
> Here is what I see on a 9.2 amd64 system:
>
> tuexen@bsd9:~ % uname -a
> FreeBSD bsd9.fh-muenster.de 9.2-RELEASE FreeBSD 9.2-RELEASE #0 r255898: Thu Sep 26 22:50:31 UTC 2013     root@bake.isc.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
> tuexen@bsd9:~ % ./valentin
> got: abc (with MSG_EOR) tdiff=3
> got: def (with MSG_EOR) tdiff=1
> got: ghi (with MSG_EOR) tdiff=1
> got: jkl (with MSG_EOR) tdiff=1
> got: mno (with MSG_EOR) tdiff=1
> got: pqr (with MSG_EOR) tdiff=0
> tuexen@bsd9:~ % ./valentin nn
> got: abc (with MSG_EOR) tdiff=4
> got: def (with MSG_EOR) tdiff=2
> got: ghi (with MSG_EOR) tdiff=1
> got: jkl (with MSG_EOR) tdiff=1
> got: mno (with MSG_EOR) tdiff=1
> got: pqr (with MSG_EOR) tdiff=1
>
> Do you have any special routing setup?
>
> Best regards
> Michael
>> // vim:ts=2:sts=2:sw=2:et:si:
>> ===
>>
>>
>> -netch-
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>


From owner-freebsd-net@FreeBSD.ORG  Thu Dec  5 11:46:11 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 2750216F
 for <freebsd-net@freebsd.org>; Thu,  5 Dec 2013 11:46:11 +0000 (UTC)
Received: from mail-ve0-f177.google.com (mail-ve0-f177.google.com
 [209.85.128.177])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id D96CC158B
 for <freebsd-net@freebsd.org>; Thu,  5 Dec 2013 11:46:10 +0000 (UTC)
Received: by mail-ve0-f177.google.com with SMTP id db12so13229678veb.8
 for <freebsd-net@freebsd.org>; Thu, 05 Dec 2013 03:46:04 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:date:message-id:subject:from:to
 :content-type;
 bh=cZ0SuuLqXcu1J5zThNldubYBkpS6gR2tDX0Hzf/jIEU=;
 b=BsmnILlK0YkOwK11zmMLln3azyvhJUxRKPS3jaKkk4K3sG1PwTljotlcfHR+qqOyR3
 x8G7rKN+hNJXRjOpTtKBgAIF18klyEZue7p0xMxmEZK+6rFkVtkvdJD7VEJidR2MQr1m
 EDX6+V4iuGLa1AjycMRY5JmZEdSxCinRJAr+R2/gD/62s/t3JCKBTOYrUFZ82zD0mxX4
 ZV5VsuSJbt6wpEIJo/xBqdligGyfjNB0L1SIQS2DzB+C9eODtvAFrlnTWYz98pmjHdeg
 SDRRcHTrz/9GHM8ybQTZvHjEJDZRg5UdfC/K9EyomUGRiSEkPq3sKe1Fvgkmk3MqA7iv
 CtAw==
X-Gm-Message-State: ALoCoQlaocxIH4l9bH6seEsPkp9dyhitHRHCile58Z6PLWgnVaopKzrgZnTzl8rEIL+IVJzgAteS
MIME-Version: 1.0
X-Received: by 10.220.86.69 with SMTP id r5mr62999959vcl.9.1386243964186; Thu,
 05 Dec 2013 03:46:04 -0800 (PST)
Received: by 10.221.48.3 with HTTP; Thu, 5 Dec 2013 03:46:04 -0800 (PST)
Date: Thu, 5 Dec 2013 13:46:04 +0200
Message-ID: <CA+0O9FEh2CVLbkyjJ2KFtCSmqqCu8s7BfBWN77qShcdxzNP17Q@mail.gmail.com>
Subject: Relayd and load balancing modes
From: Ilias Bertsimas <ilias@synthesio.com>
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.17
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Dec 2013 11:46:11 -0000

Hello All,

We are baffled by the relayd modes for relays as it seems some of them are
not working at all.

We are running FreeBSD 9.1-RELEASE-p7 and the latest relayd from
ports, relayd-5.4.20131122.

On relays with 2 hosts, both of them 100% up, we see traffic sent to
only one server. We tried the loadbalance, hash and source-hash modes
without any success.

We only get load balancing between the 2 hosts with roundrobin or random.

Both clients and target hosts are on the same vlan.

We also had issues with any version apart from the "stable" packaged one
that comes with FreeBSD 9.1.

We end up with 2-3 relayd child processes at 100% CPU without doing anything.
I tried ktrace on them, but there were no syscalls or anything else going on.

They are unresponsive, survive reloads, and can only be terminated with
kill -9.

Any ideas what is going on?


Kind Regards,

Ilias Bertsimas.

From owner-freebsd-net@FreeBSD.ORG  Thu Dec  5 12:30:27 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 5549ADB3
 for <freebsd-net@freebsd.org>; Thu,  5 Dec 2013 12:30:27 +0000 (UTC)
Received: from segfault.kiev.ua (segfault.kiev.ua [193.193.193.4])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 9303D1856
 for <freebsd-net@freebsd.org>; Thu,  5 Dec 2013 12:30:26 +0000 (UTC)
Received: from segfault.kiev.ua (localhost.segfault.kiev.ua [127.0.0.1])
 by segfault.kiev.ua (8.14.5/8.14.5/8.Who.Cares) with ESMTP id rB5CUAtj057583; 
 Thu, 5 Dec 2013 14:30:10 +0200 (EET)
 (envelope-from netch@segfault.kiev.ua)
Received: (from netch@localhost)
 by segfault.kiev.ua (8.14.5/8.14.5/Submit) id rB5CU5Pe057580;
 Thu, 5 Dec 2013 14:30:05 +0200 (EET) (envelope-from netch)
Date: Thu, 5 Dec 2013 14:30:05 +0200
From: Valentin Nechayev <netch@netch.kiev.ua>
To: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
Subject: Re: SCTP huge connect delays (at amd64) and yet another question
Message-ID: <20131205123005.GE71737@netch.kiev.ua>
References: <20131205084142.GA31113@netch.kiev.ua>
 <11932BA9-A734-4D4F-BCBB-6A0D926A22A9@lurchi.franken.de>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="bCsyhTFzCvuiizWE"
Content-Disposition: inline
In-Reply-To: <11932BA9-A734-4D4F-BCBB-6A0D926A22A9@lurchi.franken.de>
X-42: On
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Dec 2013 12:30:27 -0000


--bCsyhTFzCvuiizWE
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi,

 Thu, Dec 05, 2013 at 11:32:03, Michael.Tuexen wrote about "Re: SCTP huge connect delays (at amd64) and yet another question": 

> > The first discrepancy found is specific for FreeBSD on amd64 and not
> > for i386 version; it's that connection setup lasts 2-4 seconds (!!)
> >  Tcpdump shows indication that could be parsed as message miss:
> Hi Valentin,
> 
> could you send me the .pcap file instead of the tcpdump output.
> I would like to see the addresses listed in the INIT and INIT-ACK.

I've sent them, thanks.

> > tcpdump: listening on lo0, link-type NULL (BSD loopback), capture size 65535 bytes
> > 08:18:34.639422 IP (tos 0x0, ttl 64, id 65094, offset 0, flags [none], proto SCTP (132), length 188, bad cksum 0 (->f274)!)
> >    10.0.0.2.50025 > 127.0.0.1.2500: sctp
> I'm wondering why 10.0.0.2 is the source address and not 127.0.0.1

I've shown the code; it doesn't do any explicit binding or suggest an
address. On this host (9.1/i386), 10.0.0.2 resides on xl0. There are
no routing specifics that would force it to select 10.0.0.2:

$ route -n get 127.0.0.1
   route to: 127.0.0.1
destination: 127.0.0.1
  interface: lo0
      flags: <UP,HOST,DONE,LOCAL>
 recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
       0         0         0         0     16384         1         0
$ telnet 127.0.0.1 25
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 iv.local ESMTP Sendmail 8.14.5/8.14.5; Thu, 5 Dec 2013 13:48:31 +0200 (EET)
ehlo zzz
250-iv.local Hello netch@localhost [127.0.0.1], pleased to meet you
[...]

At least for TCP and UDP, it's quite straightforward.

> > At 08:18:34.639467, cookie echo was sent but likely ignored. One
> > second later it was resent. Then, yet another strange timeout was
> > invented before HB REQ.
> > 
> > Test series show this can spend more than 4 seconds, average value
> > is about 3 seconds. Two 20-times run summary times are 58 to 63
> > seconds, so, I've got 2.9...3.15 average connect time.
> > 
> > Neither Linux nor 32-bit FreeBSD shows this.
> FreeBSD should neither... Do you see this on FreeBSD 9.2 amd64?

Yes. A fresh dump has reproduced this.

> > It's definitely better than delay each run, as on other platforms
> > (but the initial delay annoys roughly).
> Without SCTP_NODELAY bundling can happen or not, it depends on timing.
> It would be great, if you can provide a .pcap file for a transfer you
> think shows some buggy behaviour. Then we can figure out what is going on.

> MSG_EOR is nothing you provide at a send() call. The flag is only
> returned by the recvmsg() call.

Yes, I know. It is left over from code that exercises
SOCK_SEQPACKET specifics across different transport families (e.g.
FreeBSD keeps this flag over AF_UNIX but Linux doesn't). I didn't pay
attention to it here, but if it's needed for clarity, I'll remove it :)
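
For reference, a minimal sketch of that kind of probe, assuming an
AF_UNIX socket pair is enough to show the behaviour. seqpacket_probe()
is a made-up helper name, not part of the test program, and the result
is expected to differ between systems:

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send one record over an AF_UNIX SOCK_SEQPACKET socket pair with the
 * given send() flags and read it back with recvmsg(). On success,
 * *saw_eor is set to 1 if the receiver's msg_flags carried MSG_EOR and
 * to 0 otherwise. Returns the number of bytes received, or -1 on error.
 */
ssize_t
seqpacket_probe(const char *payload, size_t len, int send_flags,
    char *out, size_t outlen, int *saw_eor)
{
  int sv[2];
  struct iovec iov;
  struct msghdr msg;
  ssize_t got;

  assert(out != NULL && saw_eor != NULL);
  if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) == -1)
    return -1;
  if (send(sv[0], payload, len, send_flags) != (ssize_t)len) {
    close(sv[0]);
    close(sv[1]);
    return -1;
  }
  memset(&msg, 0, sizeof(msg));
  iov.iov_base = out;
  iov.iov_len = outlen;
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  got = recvmsg(sv[1], &msg, 0);
  if (got != -1)
    *saw_eor = (msg.msg_flags & MSG_EOR) != 0;
  close(sv[0]);
  close(sv[1]);
  return got;
}
```

Calling it once with send_flags set to 0 and once with MSG_EOR, and
comparing *saw_eor, shows whether a given system carries the flag from
the sender to the receiver on this transport.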

> > }
> OK. Here is what I would expect on the wire:
> 
> Without SCTP_NODELAY:
> 
> > INIT
> < INIT_ACK
> > COOKIE_ECHO
> < COOKIE_ACK
> < DATA(abc)
> > SACK
> < DATA(def);DATA(ghi);DATA(jkl);DATA(mno);DATA(pqr)
> > SACK
> > SHUTDOWN
> < SHUTDOWN_ACK
> > SHUTDOWN_COMPLETE
> 
> There should be no substantial delay between any messages above.
> 
> With SCTP_NODELAY
> > INIT
> < INIT_ACK
> > COOKIE_ECHO
> < COOKIE_ACK
> < DATA(abc)
> < DATA(def)
> < DATA(ghi)
> < DATA(jkl)
> < DATA(mno)
> < DATA(pqr)
> > SHUTDOWN
> < SHUTDOWN_ACK
> > SHUTDOWN_COMPLETE
> 
> There will be three SACK somewhere between the DATA chunks depending on
> the timing.
> 
> There should be no substantial delay between any messages above.
> 
> I think if you see anything else, there is a bug. So do you see a different
> behavior on FreeBSD 9.2 (i386/amd64)? If yes, can you provide a .pcap file?

Sorry, I don't have 9.2/i386 yet. The dump from 9.1 is attached. It
has no address mix-up, but the event sequence is as follows:

> INIT
< INIT_ACK
> COOKIE_ECHO
< COOKIE_ACK
< DATA(abc)
> SACK
< DATA(def)
... delay 200ms...
> SACK
< DATA(ghi); DATA(jkl); DATA(mno); DATA(pqr)

Compared to your description, it shows an unexplained wait after
DATA(def) on the server side and a SACK delay on the client side.
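
To make the two sequences easier to compare, here is a toy model of
them. It is only a sketch under stated assumptions, not the FreeBSD
stack: without SCTP_NODELAY the sender is assumed to hold new data
while an unacknowledged packet is in flight, the first SACK of the
association is assumed to be immediate, and every later SACK is
assumed to wait 200 ms. simulate() and its parameter are invented for
the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NMSG 6
static const char *const msgs[NMSG] = { "abc", "def", "ghi", "jkl", "mno", "pqr" };

/* Append s to the NUL-terminated trace buffer, truncating if it is full. */
static void
emit(char *trace, size_t len, const char *s)
{
  size_t used = strlen(trace);

  snprintf(trace + used, len - used, "%s", s);
}

/* Put messages [from, to) on the wire as one packet of bundled DATA chunks. */
static void
transmit(char *trace, size_t len, int from, int to)
{
  emit(trace, len, "< ");
  for (int i = from; i < to; i++) {
    emit(trace, len, "DATA(");
    emit(trace, len, msgs[i]);
    emit(trace, len, i + 1 < to ? ");" : ")\n");
  }
}

/*
 * sends_before_first_sack: how many further send() calls complete before
 * the first SACK reaches the sender. Returns the number of 200 ms
 * delayed-SACK waits needed before all data is on the wire; the trace
 * stops at the last DATA packet.
 */
int
simulate(int sends_before_first_sack, char *trace, size_t len)
{
  int queued = 0;      /* messages handed to the stack by send() */
  int sent = 0;        /* messages already transmitted */
  int outstanding = 0; /* is an unacknowledged packet in flight? */
  int first = 1;       /* is the next SACK the first of the association? */
  int delays = 0;

  assert(len > 0);
  trace[0] = '\0';
  while (sent < NMSG) {
    if (outstanding &&
        (queued == NMSG || (first && sends_before_first_sack == 0))) {
      /* A SACK arrives: at once for the first packet, delayed afterwards. */
      if (!first) {
        emit(trace, len, "... delay 200ms ...\n");
        delays++;
      }
      emit(trace, len, "> SACK\n");
      first = 0;
      outstanding = 0;
      if (sent < queued) { /* flush everything that was held back */
        transmit(trace, len, sent, queued);
        sent = queued;
        outstanding = 1;
      }
      continue;
    }
    queued++; /* the application calls send() */
    if (outstanding) {
      if (first)
        sends_before_first_sack--; /* held back behind the unacked packet */
    } else {
      transmit(trace, len, sent, queued);
      sent = queued;
      outstanding = 1;
    }
  }
  return delays;
}
```

In this model simulate(5, ...) gives the expected sequence (one bundle
of five chunks, no wait), while simulate(0, ...) gives the sequence
from the dump: DATA(def) goes out alone and the remaining four chunks
wait for the delayed SACK.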

If you think it's fixed in 9.2, we can postpone this part of
discussion until my upgrade to 9.2.

> Do you have any special routing setup?

This box (9.1/i386) is trivial, with no routing specifics at all.
For the amd64 boxes, I've sent the routing details privately. But it seems
there is nothing fundamentally "special" there either, except multiple
addresses on loopback.

> Please note, that the first SACK is returned without the 200ms delay. This is
> required by the RFC and the above trace seems to show that.
> > But, if server shuts its writing side down ("s" in argv[]), this
> > laziness disappears. Again, the logic is too opaque and confusing.
> What do you mean by this?

At least, removing this delay by shutdown(,SHUT_WR) is unexpected.


-netch-

--bCsyhTFzCvuiizWE
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="dump.blocking.91.i386"
Content-Transfer-Encoding: base64

1MOyoQIABAAAAAAAAAAAAP//AAAAAAAA7GmgUpLSBgCoAAAAqAAAAAIAAABFAACkjqAAAECE
AAB/AAABfwAAAbCOCcQAAAAAAAAAAAEAAITCIlkIABxxxwAKCACDBLlQAAwACAAFAAbABgAI
UExSU4AAAATAAAAEgAgACsGAwIGCDwAAgAIAJJXYsfcXBl4xhHUBiJhYaQrSXyWD9MxR7Ovu
h+UaZfMDgAQACAABAAOAAwAGgMEAAAAFAAjBwcEEAAUACAoAAAEABQAIfwAAAexpoFK/0gYA
8AEAAPABAAACAAAARQAB7PA7AABAhAAAfwAAAX8AAAEJxLCOwiJZCAAAAAACAAHMmtfahgAc
cccACggAjNQE78AGAAhQTFJTgAAABMAAAASACAAKwYDAgYIPAACAAgAk9Jph42rEKuidkZoz
GJX+yxzBaaTvAJD628lchrCAA6yABAAIAAEAA4ADAAaAwQAAAAcBaEtBTUUtQlNEIDEuMQAA
AAAp8nAAfAEEAGDqAAAAAAAAAAAAAMIiWQia19qGfwAAAQAAAAAAAAAAAAAAAAUAAAB/AAAB
AAAAAAAAAAAAAAAABQAAAAAAAACwjgnEAQAAAQEBAAAAAAAAAQAAhMIiWQgAHHHHAAoIAIME
uVAADAAIAAUABsAGAAhQTFJTgAAABMAAAASACAAKwYDAgYIPAACAAgAkldix9xcGXjGEdQGI
mFhpCtJfJYP0zFHs6+6H5Rpl8wOABAAIAAEAA4ADAAaAwQAAAAUACMHBwQQABQAICgAAAQAF
AAh/AAABAgABzJrX2oYAHHHHAAoIAIzUBO/ABgAIUExSU4AAAATAAAAEgAgACsGAwIGCDwAA
gAIAJPSaYeNqxCronZGaMxiV/sscwWmk7wCQ+tvJXIawgAOsgAQACAABAAOAAwAGgMEAABNy
y/c5fNDpoXfpgYzvgMFLDkul7GmgUtrSBgCMAQAAjAEAAAIAAABFAAGIpeUAAECEAAB/AAAB
fwAAAbCOCcSa19qGAAAAAAoAAWhLQU1FLUJTRCAxLjEAAAAAKfJwAHwBBABg6gAAAAAAAAAA
AADCIlkImtfahn8AAAEAAAAAAAAAAAAAAAAFAAAAfwAAAQAAAAAAAAAAAAAAAAUAAAAAAAAA
sI4JxAEAAAEBAQAAAAAAAAEAAITCIlkIABxxxwAKCACDBLlQAAwACAAFAAbABgAIUExSU4AA
AATAAAAEgAgACsGAwIGCDwAAgAIAJJXYsfcXBl4xhHUBiJhYaQrSXyWD9MxR7Ovuh+UaZfMD
gAQACAABAAOAAwAGgMEAAAAFAAjBwcEEAAUACAoAAAEABQAIfwAAAQIAAcya19qGABxxxwAK
CACM1ATvwAYACFBMUlOAAAAEwAAABIAIAArBgMCBgg8AAIACACT0mmHjasQq6J2RmjMYlf7L
HMFppO8AkPrbyVyGsIADrIAEAAgAAQADgAMABoDBAAATcsv3OXzQ6aF36YGM74DBSw5Lpexp
oFIM0wYAKAAAACgAAAACAAAARQAAJHjVQABAhAAAfwAAAX8AAAEJxLCOwiJZCAAAAAALAAAE
7GmgUpbTBgA4AAAAOAAAAAIAAABFAgA0aepAAECEAAB/AAABfwAAAQnEsI7CIlkIAAAAAAAD
ABOM1ATvAAAAAAAAAABhYmMA7GmgUqTTBgA0AAAANAAAAAIAAABFAAAwx1YAAECEAAB/AAAB
fwAAAbCOCcSa19qGAAAAAAMAABCM1ATvABxwxAAAAADsaaBSt9MGADgAAAA4AAAAAgAAAEUC
ADRihEAAQIQAAH8AAAF/AAABCcSwjsIiWQgAAAAAAAMAE4zUBPAAAAABAAAAAGRlZgDsaaBS
e94JADQAAAA0AAAAAgAAAEUAADChDwAAQIQAAH8AAAF/AAABsI4JxJrX2oYAAAAAAwAAEIzU
BPAAHHHHAAAAAOxpoFKZ3gkAdAAAAHQAAAACAAAARQIAcKjqQABAhAAAfwAAAX8AAAEJxLCO
wiJZCAAAAAAAAwATjNQE8QAAAAIAAAAAZ2hpAAADABOM1ATyAAAAAwAAAABqa2wAAAMAE4zU
BPMAAAAEAAAAAG1ubwAAAwATjNQE9AAAAAUAAAAAcHFyAOxpoFIg3wkALAAAACwAAAACAAAA
RQAAKALUQABAhAAAfwAAAX8AAAGwjgnEmtfahgAAAAAHAAAIjNQE9OxpoFIt3wkAKAAAACgA
AAACAAAARQAAJDX4QABAhAAAfwAAAX8AAAEJxLCOwiJZCAAAAAAIAAAE7GmgUjffCQAoAAAA
KAAAAAIAAABFAAAkAjRAAECEAAB/AAABfwAAAbCOCcSa19qGAAAAAA4AAAQ=

--bCsyhTFzCvuiizWE
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="dump.blocking.91.i386.with_shutdown"
Content-Transfer-Encoding: base64

1MOyoQIABAAAAAAAAAAAAP//AAAAAAAAR3GgUo43AQCoAAAAqAAAAAIAAABFAACk7TkAAECE
AAB/AAABfwAAAbqICcQAAAAAAAAAAAEAAIR66MIVABxxxwAKCAD9YvR3AAwACAAFAAbABgAI
UExSU4AAAATAAAAEgAgACsGAwIGCDwAAgAIAJEhbs8Eq+J+5eMsDCDZeYbQkOOToNyfE9mtW
kOtYMkRpgAQACAABAAOAAwAGgMEAAAAFAAjBwcEEAAUACAoAAAEABQAIfwAAAUdxoFLLNwEA
8AEAAPABAAACAAAARQAB7B05AABAhAAAfwAAAX8AAAEJxLqIeujCFQAAAAACAAHM1zt92QAc
cccACggA+kkUysAGAAhQTFJTgAAABMAAAASACAAKwYDAgYIPAACAAgAkf1KiUVKTzCLyCOhR
3JrwxqftPTy4UkCNqSAfMInuC9CABAAIAAEAA4ADAAaAwQAAAAcBaEtBTUUtQlNEIDEuMQAA
AACD+XAAJqoNAGDqAAAAAAAAAAAAAHrowhXXO33ZfwAAAQAAAAAAAAAAAAAAAAUAAAB/AAAB
AAAAAAAAAAAAAAAABQAAAAAAAAC6iAnEAQAAAQEBAAAAAAAAAQAAhHrowhUAHHHHAAoIAP1i
9HcADAAIAAUABsAGAAhQTFJTgAAABMAAAASACAAKwYDAgYIPAACAAgAkSFuzwSr4n7l4ywMI
Nl5htCQ45Og3J8T2a1aQ61gyRGmABAAIAAEAA4ADAAaAwQAAAAUACMHBwQQABQAICgAAAQAF
AAh/AAABAgABzNc7fdkAHHHHAAoIAPpJFMrABgAIUExSU4AAAATAAAAEgAgACsGAwIGCDwAA
gAIAJH9SolFSk8wi8gjoUdya8Man7T08uFJAjakgHzCJ7gvQgAQACAABAAOAAwAGgMEAAPpS
mvBRdTocxYM2w4bfg3L/e5M5R3GgUuo3AQCMAQAAjAEAAAIAAABFAAGIkH4AAECEAAB/AAAB
fwAAAbqICcTXO33ZAAAAAAoAAWhLQU1FLUJTRCAxLjEAAAAAg/lwACaqDQBg6gAAAAAAAAAA
AAB66MIV1zt92X8AAAEAAAAAAAAAAAAAAAAFAAAAfwAAAQAAAAAAAAAAAAAAAAUAAAAAAAAA
uogJxAEAAAEBAQAAAAAAAAEAAIR66MIVABxxxwAKCAD9YvR3AAwACAAFAAbABgAIUExSU4AA
AATAAAAEgAgACsGAwIGCDwAAgAIAJEhbs8Eq+J+5eMsDCDZeYbQkOOToNyfE9mtWkOtYMkRp
gAQACAABAAOAAwAGgMEAAAAFAAjBwcEEAAUACAoAAAEABQAIfwAAAQIAAczXO33ZABxxxwAK
CAD6SRTKwAYACFBMUlOAAAAEwAAABIAIAArBgMCBgg8AAIACACR/UqJRUpPMIvII6FHcmvDG
p+09PLhSQI2pIB8wie4L0IAEAAgAAQADgAMABoDBAAD6UprwUXU6HMWDNsOG34Ny/3uTOUdx
oFIcOAEAKAAAACgAAAACAAAARQAAJJFZQABAhAAAfwAAAX8AAAEJxLqIeujCFQAAAAALAAAE
R3GgUks5AQA4AAAAOAAAAAIAAABFAgA0h0JAAECEAAB/AAABfwAAAQnEuoh66MIVAAAAAAAD
ABP6SRTKAAAAAAAAAABhYmMAR3GgUlo5AQA0AAAANAAAAAIAAABFAAAwEKkAAECEAAB/AAAB
fwAAAbqICcTXO33ZAAAAAAMAABD6SRTKABxwxAAAAABHcaBSszkBADgAAAA4AAAAAgAAAEUC
ADTXKkAAQIQAAH8AAAF/AAABCcS6iHrowhUAAAAAAAMAE/pJFMsAAAABAAAAAGRlZgBHcaBS
yzkBAHQAAAB0AAAAAgAAAEUCAHBuJkAAQIQAAH8AAAF/AAABCcS6iHrowhUAAAAAAAMAE/pJ
FMwAAAACAAAAAGdoaQAAAwAT+kkUzQAAAAMAAAAAamtsAAADABP6SRTOAAAABAAAAABtbm8A
AAMAE/pJFM8AAAAFAAAAAHBxcgBHcaBS1DkBADQAAAA0AAAAAgAAAEUAADC2AwAAQIQAAH8A
AAF/AAABuogJxNc7fdkAAAAAAwAAEPpJFM8AHGu1AAAAAEdxoFLhOQEALAAAACwAAAACAAAA
RQAAKNpBQABAhAAAfwAAAX8AAAEJxLqIeujCFQAAAAAHAAAI/WL0dkdxoFLpOQEAKAAAACgA
AAACAAAARQAAJDaaQABAhAAAfwAAAX8AAAG6iAnE1zt92QAAAAAIAAAER3GgUu85AQAoAAAA
KAAAAAIAAABFAAAkdutAAECEAAB/AAABfwAAAQnEuoh66MIVAAAAAA4AAAQ=

--bCsyhTFzCvuiizWE--

From owner-freebsd-net@FreeBSD.ORG  Thu Dec  5 13:39:05 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 956EED4A
 for <freebsd-net@freebsd.org>; Thu,  5 Dec 2013 13:39:05 +0000 (UTC)
Received: from mail-n.franken.de (drew.ipv6.franken.de
 [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id A74FD1D47
 for <freebsd-net@freebsd.org>; Thu,  5 Dec 2013 13:39:04 +0000 (UTC)
Received: from [10.225.9.5] (unknown [194.95.73.101])
 (Authenticated sender: macmic)
 by mail-n.franken.de (Postfix) with ESMTP id 7B7691C0C0693;
 Thu,  5 Dec 2013 14:39:01 +0100 (CET)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.6 \(1510\))
Subject: Re: SCTP huge connect delays (at amd64) and yet another question
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <20131205123005.GE71737@netch.kiev.ua>
Date: Thu, 5 Dec 2013 14:39:01 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <1564E942-DC9E-4142-89F3-B82EEF1A103C@lurchi.franken.de>
References: <20131205084142.GA31113@netch.kiev.ua>
 <11932BA9-A734-4D4F-BCBB-6A0D926A22A9@lurchi.franken.de>
 <20131205123005.GE71737@netch.kiev.ua>
To: Valentin Nechayev <netch@netch.kiev.ua>
X-Mailer: Apple Mail (2.1510)
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Dec 2013 13:39:05 -0000

On Dec 5, 2013, at 1:30 PM, Valentin Nechayev <netch@netch.kiev.ua> wrote:

> Hi,
>
> Thu, Dec 05, 2013 at 11:32:03, Michael.Tuexen wrote about "Re: SCTP huge connect delays (at amd64) and yet another question":
>
>>> The first discrepancy found is specific for FreeBSD on amd64 and not
>>> for i386 version; it's that connection setup lasts 2-4 seconds (!!)
>>> Tcpdump shows indication that could be parsed as message miss:
>> Hi Valentin,
>>
>> could you send me the .pcap file instead of the tcpdump output.
>> I would like to see the addresses listed in the INIT and INIT-ACK.
>
> I've sent them, thanks.
I answered...
>
>>> tcpdump: listening on lo0, link-type NULL (BSD loopback), capture size 65535 bytes
>>> 08:18:34.639422 IP (tos 0x0, ttl 64, id 65094, offset 0, flags [none], proto SCTP (132), length 188, bad cksum 0 (->f274)!)
>>>   10.0.0.2.50025 > 127.0.0.1.2500: sctp
>> I'm wondering why 10.0.0.2 is the source address and not 127.0.0.1
>
> I've shown the code; it doesn't do any explicit binding or suggest an
> address. On this host (9.1/i386), 10.0.0.2 resides on xl0. There are
> no routing specifics that would force it to select 10.0.0.2:
>
> $ route -n get 127.0.0.1
>   route to: 127.0.0.1
> destination: 127.0.0.1
>  interface: lo0
>      flags: <UP,HOST,DONE,LOCAL>
> recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
>       0         0         0         0     16384         1         0
> $ telnet 127.0.0.1 25
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
> 220 iv.local ESMTP Sendmail 8.14.5/8.14.5; Thu, 5 Dec 2013 13:48:31 +0200 (EET)
> ehlo zzz
> 250-iv.local Hello netch@localhost [127.0.0.1], pleased to meet you
> [...]
>
> At least for TCP and UDP, it's quite straightforward.
There might be an issue in the SCTP stack. It handles addresses differently
than UDP. However, I wasn't able to reproduce your problem. I need to test a
setup similar to yours, which I haven't done yet.
>
>>> At 08:18:34.639467, cookie echo was sent but likely ignored. One
>>> second later it was resent. Then, yet another strange timeout was
>>> invented before HB REQ.
>>>
>>> Test series show this can spend more than 4 seconds, average value
>>> is about 3 seconds. Two 20-times run summary times are 58 to 63
>>> seconds, so, I've got 2.9...3.15 average connect time.
>>>
>>> Neither Linux nor 32-bit FreeBSD shows this.
>> FreeBSD should neither... Do you see this on FreeBSD 9.2 amd64?
>
> Yes. A fresh dump has reproduced this.
OK. Fine. This might be an issue in the address handling... I'll try
to reproduce this.
>
>>> It's definitely better than delay each run, as on other platforms
>>> (but the initial delay annoys roughly).
>> Without SCTP_NODELAY bundling can happen or not, it depends on timing.
>> It would be great, if you can provide a .pcap file for a transfer you
>> think shows some buggy behaviour. Then we can figure out what is going on.
>
>> MSG_EOR is nothing you provide at a send() call. The flag is only
>> returned by the recvmsg() call.
>
> Yes, I know. It is left over from code that exercises
> SOCK_SEQPACKET specifics across different transport families (e.g.
> FreeBSD keeps this flag over AF_UNIX but Linux doesn't). I didn't pay
> attention to it here, but if it's needed for clarity, I'll remove it :)
>
>>> }
>> OK. Here is what I would expect on the wire:
>>
>> Without SCTP_NODELAY:
>>
>> > INIT
>> < INIT_ACK
>> > COOKIE_ECHO
>> < COOKIE_ACK
>> < DATA(abc)
>> > SACK
>> < DATA(def);DATA(ghi);DATA(jkl);DATA(mno);DATA(pqr)
>> > SACK
>> > SHUTDOWN
>> < SHUTDOWN_ACK
>> > SHUTDOWN_COMPLETE
>>
>> There should be no substantial delay between any messages above.
>>
>> With SCTP_NODELAY
>> > INIT
>> < INIT_ACK
>> > COOKIE_ECHO
>> < COOKIE_ACK
>> < DATA(abc)
>> < DATA(def)
>> < DATA(ghi)
>> < DATA(jkl)
>> < DATA(mno)
>> < DATA(pqr)
>> > SHUTDOWN
>> < SHUTDOWN_ACK
>> > SHUTDOWN_COMPLETE
>>
>> There will be three SACK somewhere between the DATA chunks depending on
>> the timing.
>>
>> There should be no substantial delay between any messages above.
>>
>> I think if you see anything else, there is a bug. So do you see a different
>> behavior on FreeBSD 9.2 (i386/amd64)? If yes, can you provide a .pcap file?
>
> Sorry, I don't have 9.2/i386 yet. The dump from 9.1 is attached. It
I actually don't expect a difference between 32-bit and 64-bit. I guess
it might be more related to different address setup or timing.
> has no address mix-up, but the event sequence is as follows:
>
> > INIT
> < INIT_ACK
> > COOKIE_ECHO
> < COOKIE_ACK
> < DATA(abc)
> > SACK
> < DATA(def)
> ... delay 200ms...
> > SACK
> < DATA(ghi); DATA(jkl); DATA(mno); DATA(pqr)
>
> Compared to your description, it shows an unexplained wait after
> DATA(def) on the server side and a SACK delay on the client side.
It is timing related, as described in my other mail: it depends on whether
the SACK is received before the send() calls finish or vice versa.
>
> If you think it's fixed in 9.2, we can postpone this part of
> discussion until my upgrade to 9.2.
>
>> Do you have any special routing setup?
>
> This box (9.1/i386) is trivial, with no routing specifics at all.
> For the amd64 boxes, I've sent the routing details privately. But it seems
> there is nothing fundamentally "special" there either, except multiple
> addresses on loopback.
>
>> Please note, that the first SACK is returned without the 200ms delay. This is
>> required by the RFC and the above trace seems to show that.
>>> But, if server shuts its writing side down ("s" in argv[]), this
>>> laziness disappears. Again, the logic is too opaque and confusing.
>> What do you mean by this?
>
> At least, removing this delay by shutdown(,SHUT_WR) is unexpected.
When you shutdown(,SHUT_WR) we send out pending data without waiting
for a SACK, since there will be no more data from the user. This is
shown by your attached traces and is intended.

So it seems that
* the timing is as expected for the data transmission phase
* there is an issue with setting up associations when there
  are specific addresses on loopback.

Do you agree?

Best regards
Michael
>
>
> -netch-
> <dump.blocking.91.i386><dump.blocking.91.i386.with_shutdown>


From owner-freebsd-net@FreeBSD.ORG  Thu Dec  5 18:29:52 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 1709778F;
 Thu,  5 Dec 2013 18:29:52 +0000 (UTC)
Received: from mail-qe0-x232.google.com (mail-qe0-x232.google.com
 [IPv6:2607:f8b0:400d:c02::232])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id B69FB11D2;
 Thu,  5 Dec 2013 18:29:51 +0000 (UTC)
Received: by mail-qe0-f50.google.com with SMTP id 1so15503204qec.23
 for <multiple recipients>; Thu, 05 Dec 2013 10:29:51 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date:message-id:subject
 :from:to:cc:content-type;
 bh=G0HpOnWOZgQvdeO5/FtHLsKbi5gBRJlunrlZSQE22TM=;
 b=bzUxqf/p7idy6hQlp6lmK6lcJ7yE7OI8fH2oSe1XGsowXlAApXqZlHnpvdD+r1l7aQ
 SzKWgdUgoUuGxlea7QluVlCo5ySlCy4GHzZfS61pXJ+ksXrYlVB/igAKtc9HzRUNewYg
 wRkJGO6y5yXApYCnGEx0JYRuoxWXkSqh3qa9rkVgffmRzcdOUphtadGS6ZRrrX9KUM/r
 0fDJ32Z84Lc4WJEuOzDejnpWLe604pp3dfHcuR+bTmiyjATgd0oq8VxHHVjHT3v3EvvG
 /zkeyAlzTmRYDPD72f7qiqTbxiaDNKcZZtpyf0sX6XAx+qx+NMPdo5iHVxGyA5l+/hQv
 OyXQ==
MIME-Version: 1.0
X-Received: by 10.49.24.163 with SMTP id v3mr87399765qef.78.1386268190994;
 Thu, 05 Dec 2013 10:29:50 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.224.53.200 with HTTP; Thu, 5 Dec 2013 10:29:50 -0800 (PST)
In-Reply-To: <20131203021658.GC2981@michelle.cdnetworks.com>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
Date: Thu, 5 Dec 2013 10:29:50 -0800
X-Google-Sender-Auth: GgzO7v3Q3v1O6TEyiMPh6f7avAU
Message-ID: <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
From: Adrian Chadd <adrian@freebsd.org>
To: Yong-Hyeon Pyun <pyunyh@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: Jack F Vogel <jfv@freebsd.org>,
 Michael Tuexen <Michael.Tuexen@lurchi.franken.de>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Dec 2013 18:29:52 -0000

Hi,

Yes. Looking at the ixgbe code, ixgbe_mq_start_locked() returns an
error from ixgbe_xmit() but if it fails, it puts the buffer back. But
it's already successfully queued a frame to the driver, so in this
instance it shouldn't return the error from ixgbe_mq_start_locked().

The same applies to if_em.c and if_igb.c.

Now, drbr_putback() used to fail and now it doesn't, as you've said.
So we should change the xxx_mq_start_locked() to set err=0 if we go
via the drbr_putback() routine, as it hasn't actually failed to
transmit.
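
To illustrate the err=0 idea, here is a toy userspace model. It is not
driver code: toy_ring, toy_xmit(), toy_mq_start_locked() and
toy_transmit() are made-up stand-ins for the buf_ring, xxx_xmit(),
xxx_mq_start_locked() and xxx_transmit(), and packets are plain ints.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define RING_SIZE 8

struct toy_ring {
  int pkts[RING_SIZE];
  size_t head;  /* index of the oldest queued packet */
  size_t count; /* number of queued packets */
};

/* Queue a packet at the tail; fails with ENOBUFS when the ring is full. */
static int
toy_enqueue(struct toy_ring *r, int pkt)
{
  if (r->count == RING_SIZE)
    return ENOBUFS;
  r->pkts[(r->head + r->count) % RING_SIZE] = pkt;
  r->count++;
  return 0;
}

/* Pretend hardware: accepts a frame only while descriptors are free. */
static int
toy_xmit(int *hw_slots)
{
  if (*hw_slots == 0)
    return ENOBUFS;
  (*hw_slots)--;
  return 0;
}

/* Drain the ring from the front until the hardware refuses a frame. */
static int
toy_mq_start_locked(struct toy_ring *r, int *hw_slots)
{
  int err = 0;

  assert(r != NULL && hw_slots != NULL);
  while (r->count > 0) {
    err = toy_xmit(hw_slots);
    if (err != 0) {
      /*
       * "Putback": the frame simply stays at the head of the ring.
       * It is still queued, so this is not reported as a failure.
       */
      err = 0;
      break;
    }
    r->head = (r->head + 1) % RING_SIZE; /* frame went out, advance */
    r->count--;
  }
  return err;
}

/* Entry point: queue the new packet at the tail, then try to drain. */
static int
toy_transmit(struct toy_ring *r, int *hw_slots, int pkt)
{
  int err = toy_enqueue(r, pkt);

  if (err != 0)
    return err; /* only a failed enqueue is reported to the caller */
  return toy_mq_start_locked(r, hw_slots);
}
```

With one free hardware slot, the first toy_transmit() sends its packet
and the second one leaves its packet on the ring, and both return 0.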

Now the very dirty thing is this - the error from xxx_transmit() is
for the mbuf being queued at the end; but xxx_mq_start_locked()
failures are for transmitting from the front. If there's only one packet
in the queue and that fails then they're the same thing and returning
the error from xxx_mq_start_locked() matches the current mbuf being
queued. But otherwise, they're referring to totally different packets.
For TCP this may hurt; the TCP stack treats ENOBUFS a certain way and
kicks off a timer to schedule a retransmit. I don't think we can fix
_this_ right now.

So Michael - can you redo your patch to set err=0 if drbr_putback() is
called, and retest?

Thanks!




-adrian

From owner-freebsd-net@FreeBSD.ORG  Thu Dec  5 19:07:04 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id B0C5158E;
 Thu,  5 Dec 2013 19:07:04 +0000 (UTC)
Received: from mail-n.franken.de (drew.ipv6.franken.de
 [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 3E0021437;
 Thu,  5 Dec 2013 19:07:04 +0000 (UTC)
Received: from [192.168.1.102] (p508F016D.dip0.t-ipconnect.de [80.143.1.109])
 (Authenticated sender: macmic)
 by mail-n.franken.de (Postfix) with ESMTP id 5A0D01C0C0692;
 Thu,  5 Dec 2013 20:07:01 +0100 (CET)
Content-Type: text/plain; charset=iso-8859-1
Mime-Version: 1.0 (Mac OS X Mail 6.6 \(1510\))
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
Date: Thu, 5 Dec 2013 20:07:00 +0100
Content-Transfer-Encoding: 7bit
Message-Id: <AFE80228-4B6F-46FE-BF74-A27DCC0E8F52@lurchi.franken.de>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
 <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
To: Adrian Chadd <adrian@freebsd.org>
X-Mailer: Apple Mail (2.1510)
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Dec 2013 19:07:04 -0000


On Dec 5, 2013, at 7:29 PM, Adrian Chadd <adrian@freebsd.org> wrote:

> Hi,
> 
> Yes. Looking at the ixgbe code, ixgbe_mq_start_locked() returns an
> error from ixgbe_xmit() but if it fails, it puts the buffer back. But
> it's already successfully queued a frame to the driver, so in this
> instance it shouldn't return the error from ixgbe_mq_start_locked().
> 
> The same deal applies in if_em.c and if_igb.c.
> 
> Now, drbr_putback() used to fail and now it doesn't, as you've said.
> So we should change the xxx_mq_start_locked() to set err=0 if we go
> via the drbr_putback() routine, as it hasn't actually failed to
> transmit.
> 
> Now the very dirty thing is this - the error from xxx_transmit() is
> for the mbuf being queued at the end; but xxx_mq_start_locked()
> failures are for transmitting from the front. If there's only one packet
> in the queue and that fails then they're the same thing and returning
> the error from xxx_mq_start_locked() matches the current mbuf being
> queued. But otherwise, they're referring to totally different packets.
> For TCP this may hurt; the TCP stack treats ENOBUFS a certain way and
> kicks off a timer to schedule a retransmit. I don't think we can fix
> _this_ right now.
> 
> So Michael - can you redo your patch to set err=0 if drbr_putback() is
> called, and retest?
Sure. I'll report the result.

Best regards
Michael
> 
> Thanks!
> 
> 
> 
> 
> -adrian
> 


From owner-freebsd-net@FreeBSD.ORG  Thu Dec  5 21:05:21 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id F1DB2297;
 Thu,  5 Dec 2013 21:05:20 +0000 (UTC)
Received: from mail-n.franken.de (drew.ipv6.franken.de
 [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 83B6C1C7A;
 Thu,  5 Dec 2013 21:05:20 +0000 (UTC)
Received: from [192.168.1.102] (p508F016D.dip0.t-ipconnect.de [80.143.1.109])
 (Authenticated sender: macmic)
 by mail-n.franken.de (Postfix) with ESMTP id 778861C0C0695;
 Thu,  5 Dec 2013 22:05:18 +0100 (CET)
Content-Type: text/plain; charset=iso-8859-1
Mime-Version: 1.0 (Mac OS X Mail 6.6 \(1510\))
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
Date: Thu, 5 Dec 2013 22:05:16 +0100
Content-Transfer-Encoding: 7bit
Message-Id: <B89B1E2D-BAF0-4815-B3AB-EB226F4F76DE@lurchi.franken.de>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
 <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
To: Adrian Chadd <adrian@freebsd.org>
X-Mailer: Apple Mail (2.1510)
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Dec 2013 21:05:21 -0000

On Dec 5, 2013, at 7:29 PM, Adrian Chadd <adrian@freebsd.org> wrote:

> Hi,
> 
> Yes. Looking at the ixgbe code, ixgbe_mq_start_locked() returns an
> error from ixgbe_xmit() but if it fails, it puts the buffer back. But
> it's already successfully queued a frame to the driver, so in this
> instance it shouldn't return the error from ixgbe_mq_start_locked().
> 
> The same deal applies in if_em.c and if_igb.c.
> 
> Now, drbr_putback() used to fail and now it doesn't, as you've said.
> So we should change the xxx_mq_start_locked() to set err=0 if we go
> via the drbr_putback() routine, as it hasn't actually failed to
> transmit.
> 
> Now the very dirty thing is this - the error from xxx_transmit() is
> for the mbuf being queued at the end; but xxx_mq_start_locked()
> failures are for transmitting from the front. If there's only one packet
> in the queue and that fails then they're the same thing and returning
> the error from xxx_mq_start_locked() matches the current mbuf being
> queued. But otherwise, they're referring to totally different packets.
> For TCP this may hurt; the TCP stack treats ENOBUFS a certain way and
> kicks off a timer to schedule a retransmit. I don't think we can fix
> _this_ right now.
Just to be clear: This would mean that xxx_transmit() would return
an error even if the packet provided in the call xxx_transmit() is
enqueued and not dropped?
This would also be a problem with the current SCTP stack.

Best regards
Michael
> 
> So Michael - can you redo your patch to set err=0 if drbr_putback() is
> called, and retest?
> 
> Thanks!
> 
> 
> 
> 
> -adrian
> 


From owner-freebsd-net@FreeBSD.ORG  Thu Dec  5 22:01:39 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 9F26FD79;
 Thu,  5 Dec 2013 22:01:39 +0000 (UTC)
Received: from mail-qe0-x22d.google.com (mail-qe0-x22d.google.com
 [IPv6:2607:f8b0:400d:c02::22d])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 47B811FB8;
 Thu,  5 Dec 2013 22:01:39 +0000 (UTC)
Received: by mail-qe0-f45.google.com with SMTP id 6so18006147qea.32
 for <multiple recipients>; Thu, 05 Dec 2013 14:01:38 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date:message-id:subject
 :from:to:cc:content-type;
 bh=u1PSICOLJ42xaneOr7sr51F8D9iFoAXIuPsswbBOP5E=;
 b=cRFfFZLRwcikolmBhurz5tIabVn/+/lUEIpmiyuE5iF+PfzeBZ1/4ixW93cJhphNT9
 BcZMpQ8HlMJzcNGn1jo0kR+jBTATcwskL+f4OsqkxD+Ct3lvMqx+IARB7E0CDB+08zjk
 xodUV8VRLGGRCThZB2UNCVauIgvWdPnVL6TIEed27bp86T6E7BbO+sRZu7Eu+725AWv9
 gGDOo0eAp2hWkmHrzKTfQqndoLtW6t2IlkNSQOMphzvZn1n7rAV97lxhCzEosE/j7M/x
 iJniXwcNVsONYF87mkK+0mp6gdY721NUK7G12faPXuT8/Ixe9H7/qllP1hP7+pUjo036
 evCg==
MIME-Version: 1.0
X-Received: by 10.229.137.69 with SMTP id v5mr633018qct.4.1386280898446; Thu,
 05 Dec 2013 14:01:38 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.224.53.200 with HTTP; Thu, 5 Dec 2013 14:01:38 -0800 (PST)
In-Reply-To: <B89B1E2D-BAF0-4815-B3AB-EB226F4F76DE@lurchi.franken.de>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
 <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
 <B89B1E2D-BAF0-4815-B3AB-EB226F4F76DE@lurchi.franken.de>
Date: Thu, 5 Dec 2013 14:01:38 -0800
X-Google-Sender-Auth: Q9_u_SdHfVKBB2BXEE5tt5WEKS4
Message-ID: <CAJ-Vmo=4Zwv5V6ZYDuDLtt+owgbvmqyvrnrfnU+HeXQ3vAn-KA@mail.gmail.com>
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
From: Adrian Chadd <adrian@freebsd.org>
To: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
Content-Type: text/plain; charset=ISO-8859-1
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Dec 2013 22:01:39 -0000

On 5 December 2013 13:05, Michael Tuexen
<Michael.Tuexen@lurchi.franken.de> wrote:

> Just to be clear: This would mean that xxx_transmit() would return
> an error even if the packet provided in the call xxx_transmit() is
> enqueued and not dropped?
> This would also be a problem with the current SCTP stack.

I think it'll return an error only if:

* it queued the frame to the tail of the drbr;
* it then tried to transmit a frame from the head of the drbr;
* it failed to transmit the first frame in the drbr and it couldn't
put it back into the queue for whatever reason.

So I think it should be "ok enough" for both TCP and SCTP.

Give it a go and let me know how it goes.

It's an interesting architectural problem to completely solve.


-adrian

From owner-freebsd-net@FreeBSD.ORG  Thu Dec  5 22:37:13 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id EBF493B9;
 Thu,  5 Dec 2013 22:37:13 +0000 (UTC)
Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id A4117117D;
 Thu,  5 Dec 2013 22:37:13 +0000 (UTC)
Received: from h2.funkthat.com (localhost [127.0.0.1])
 by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id rB5MbBfr074799
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
 Thu, 5 Dec 2013 14:37:12 -0800 (PST)
 (envelope-from jmg@h2.funkthat.com)
Received: (from jmg@localhost)
 by h2.funkthat.com (8.14.3/8.14.3/Submit) id rB5MbBH3074798;
 Thu, 5 Dec 2013 14:37:11 -0800 (PST) (envelope-from jmg)
Date: Thu, 5 Dec 2013 14:37:11 -0800
From: John-Mark Gurney <jmg@funkthat.com>
To: Adrian Chadd <adrian@freebsd.org>
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
Message-ID: <20131205223711.GB55638@funkthat.com>
Mail-Followup-To: Adrian Chadd <adrian@freebsd.org>,
 Michael Tuexen <Michael.Tuexen@lurchi.franken.de>,
 Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
 <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
 <B89B1E2D-BAF0-4815-B3AB-EB226F4F76DE@lurchi.franken.de>
 <CAJ-Vmo=4Zwv5V6ZYDuDLtt+owgbvmqyvrnrfnU+HeXQ3vAn-KA@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAJ-Vmo=4Zwv5V6ZYDuDLtt+owgbvmqyvrnrfnU+HeXQ3vAn-KA@mail.gmail.com>
User-Agent: Mutt/1.4.2.3i
X-Operating-System: FreeBSD 7.2-RELEASE i386
X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88  9322 9CB1 8F74 6D3F A396
X-Files: The truth is out there
X-URL: http://resnet.uoregon.edu/~gurney_j/
X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html
X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger?
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
 (h2.funkthat.com [127.0.0.1]); Thu, 05 Dec 2013 14:37:12 -0800 (PST)
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>,
 Michael Tuexen <Michael.Tuexen@lurchi.franken.de>,
 Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Dec 2013 22:37:14 -0000

Adrian Chadd wrote this message on Thu, Dec 05, 2013 at 14:01 -0800:
> On 5 December 2013 13:05, Michael Tuexen
> <Michael.Tuexen@lurchi.franken.de> wrote:
> 
> > Just to be clear: This would mean that xxx_transmit() would return
> > an error even if the packet provided in the call xxx_transmit() is
> > enqueued and not dropped?
> > This would also be a problem with the current SCTP stack.
> 
> I think it'll return an error only if:
> 
> * it queued the frame to the tail of the drbr;
> * it then tried to transmit a frame from the head of the drbr;
> * it failed to transmit the first frame in the drbr and it couldn't
> put it back into the queue for whatever reason.
> 
> So I think it should be "ok enough" for both TCP and SCTP.

IMO it should only return an error if the specific frame failed to be
sent or queued.  If you cannot determine at return time if the frame
failed to be transmitted/queued, then it should return success.

In the above case, if there were other frames queued ahead, and the
first one failed, then it sounds like the frame may eventually be sent
and we will end up sending a duplicate frame.
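
The duplicate can be shown with a toy userland model. It is only a
sketch under assumptions: struct txq and the txq_*, wire_copies and
send_with_retry names are hypothetical, frames are integer ids, and the
"upper layer" simply resubmits at once on error instead of arming a
retransmit timer as a real transport would.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define QLEN 8
#define WIRE_MAX 16

/* Hypothetical software queue plus a record of what left the NIC. */
struct txq {
	int frames[QLEN];	/* frame ids, index 0 is the head */
	size_t count;
	int hw_free;		/* free hardware descriptors */
	int wire[WIRE_MAX];	/* ids in the order they were sent */
	size_t wire_count;
};

/* Send from the head while descriptors remain; ENOBUFS if the head is stuck. */
int txq_drain(struct txq *q)
{
	while (q->count > 0) {
		if (q->hw_free == 0)
			return ENOBUFS;	/* head frame stays queued */
		assert(q->wire_count < WIRE_MAX);
		q->wire[q->wire_count++] = q->frames[0];
		q->hw_free--;
		for (size_t i = 1; i < q->count; i++)
			q->frames[i - 1] = q->frames[i];
		q->count--;
	}
	return 0;
}

/* Behaviour under discussion: pass the drain error back to the caller. */
int txq_transmit_leaky(struct txq *q, int id)
{
	if (q->count == QLEN)
		return ENOBUFS;		/* genuinely dropped */
	q->frames[q->count++] = id;
	return txq_drain(q);		/* may describe a different frame */
}

/* Hypothetical upper layer that resubmits once when it sees an error. */
void send_with_retry(struct txq *q, int id)
{
	if (txq_transmit_leaky(q, id) != 0)
		(void)txq_transmit_leaky(q, id);
}

/* Count how many copies of a frame id went out on the wire. */
size_t wire_copies(const struct txq *q, int id)
{
	size_t n = 0;

	for (size_t i = 0; i < q->wire_count; i++)
		if (q->wire[i] == id)
			n++;
	return n;
}
```

If frame 1 is already stuck at the head and send_with_retry() submits
frame 7, the model queues frame 7 twice, and both copies go out once
descriptors become available.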

> Give it a go and let me know how it goes.
> 
> It's an interesting architectural problem to completely solve.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."

From owner-freebsd-net@FreeBSD.ORG  Thu Dec  5 23:10:31 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 5602BF9B;
 Thu,  5 Dec 2013 23:10:31 +0000 (UTC)
Received: from mail-qa0-x233.google.com (mail-qa0-x233.google.com
 [IPv6:2607:f8b0:400d:c00::233])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id E19B61371;
 Thu,  5 Dec 2013 23:10:30 +0000 (UTC)
Received: by mail-qa0-f51.google.com with SMTP id o15so51607qap.17
 for <multiple recipients>; Thu, 05 Dec 2013 15:10:30 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date:message-id:subject
 :from:to:content-type;
 bh=jAK6VOKg3hHSGafZSkKKlWuv5na1bc7HdO1VllnyIj4=;
 b=tOE0j8P8w/2uYvUv9eFVoN8O7gSQGKlRyyx1vSwcGg5r92wP84fSDwVUoEMZTl010o
 sITmErzYkSCdDAPsU2ek1IiekIm+yGmxpSlOyB0A9IJeH03jmULYxbWwxMAvnKbcG94/
 WwbAKM3U/QiC0n/gfvekenMJu6R2VqR7pIPcFeLkOP6QF5hHWLs1Z9SMhyAMM3vvHXXr
 yPF2AQ0Lv5TgMe51oBz2dVqSRxpJihu6PkdWKm8UQulmbIjOD+j9fDBXfEL56QNyMdrb
 0ACmljiCMezsmKFkHc519Naoza2JZR2Cs9lf7NT90wizH/I0Wao/EzwTY5dLOXD1s9o3
 LPtA==
MIME-Version: 1.0
X-Received: by 10.49.17.232 with SMTP id r8mr625571qed.74.1386285030050; Thu,
 05 Dec 2013 15:10:30 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.224.53.200 with HTTP; Thu, 5 Dec 2013 15:10:29 -0800 (PST)
In-Reply-To: <20131205223711.GB55638@funkthat.com>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
 <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
 <B89B1E2D-BAF0-4815-B3AB-EB226F4F76DE@lurchi.franken.de>
 <CAJ-Vmo=4Zwv5V6ZYDuDLtt+owgbvmqyvrnrfnU+HeXQ3vAn-KA@mail.gmail.com>
 <20131205223711.GB55638@funkthat.com>
Date: Thu, 5 Dec 2013 15:10:29 -0800
X-Google-Sender-Auth: uIGcF04FtMbrYJKt4ij-9g8qE7s
Message-ID: <CAJ-VmonMRgMoJSe8x4Jk6iPB+MTV3___hdCd7LnEus=OHNnDmQ@mail.gmail.com>
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
From: Adrian Chadd <adrian@freebsd.org>
To: Adrian Chadd <adrian@freebsd.org>,
 Michael Tuexen <Michael.Tuexen@lurchi.franken.de>, 
 Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>, 
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Dec 2013 23:10:31 -0000

On 5 December 2013 14:37, John-Mark Gurney <jmg@funkthat.com> wrote:
> Adrian Chadd wrote this message on Thu, Dec 05, 2013 at 14:01 -0800:
>> On 5 December 2013 13:05, Michael Tuexen
>> <Michael.Tuexen@lurchi.franken.de> wrote:
>>
>> > Just to be clear: This would mean that xxx_transmit() would return
>> > an error even if the packet provided in the call xxx_transmit() is
>> > enqueued and not dropped?
>> > This would also be a problem with the current SCTP stack.
>>
>> I think it'll return an error only if:
>>
>> * it queued the frame to the tail of the drbr;
>> * it then tried to transmit a frame from the head of the drbr;
>> * it failed to transmit the first frame in the drbr and it couldn't
>> put it back into the queue for whatever reason.
>>
>> So I think it should be "ok enough" for both TCP and SCTP.
>
> IMO it should only return an error if the specific frame failed to be
> sent or queued.  If you cannot determine at return time if the frame
> failed to be transmitted/queued, then it should return success.

For the long term solution, I agree.

> In the above case, if there were other frames queued ahead, and the
> first one failed, then it sounds like the frame may eventually be sent
> and we will end up sending a duplicate frame.

Right. We should also fix this properly.

I think the right thing, long term, is something like this:

* xxx_mq_start_locked() returns whether the head frame was transmitted or not;
* the if_transmit() entry point(s) return whether the given frame was
queued to the software queue or not;
* the if_transmit() entry point(s) ignore the return value of
xxx_mq_start_locked(), as the stack _should_ handle the case of a
frame handed to the driver but dropped.
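
As a rough userland sketch of those three points (an illustration only:
struct swq, swq_start_locked and swq_transmit are hypothetical names,
frames are integer ids, and no real driver or ifnet code is shown):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define QLEN 4

/* Hypothetical software queue in front of a NIC with hw_free descriptors. */
struct swq {
	int frames[QLEN];	/* frame ids, index 0 is the head */
	size_t count;
	int hw_free;
	int sent;
};

/* Reports only on the head frame: 0 if the queue drained, ENOBUFS if stuck. */
int swq_start_locked(struct swq *q)
{
	assert(q->count <= QLEN);
	while (q->count > 0) {
		if (q->hw_free == 0)
			return ENOBUFS;	/* head frame stays queued */
		q->hw_free--;
		q->sent++;
		for (size_t i = 1; i < q->count; i++)
			q->frames[i - 1] = q->frames[i];
		q->count--;
	}
	return 0;
}

/* Reports only on the caller's frame: was it accepted into the queue? */
int swq_transmit(struct swq *q, int id)
{
	if (q->count == QLEN)
		return ENOBUFS;		/* this frame was not queued */
	q->frames[q->count++] = id;
	(void)swq_start_locked(q);	/* result deliberately ignored */
	return 0;
}
```

In this model the caller sees ENOBUFS only when its own frame could not
be queued; a stalled head frame never leaks into the return value.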

So, I'd like to get Michael to first test fixing up
xxx_mq_start_locked() to only return an error if it failed to transmit
a frame and the frame was dropped. Then, once we get feedback from
that, I was going to propose that we also do what Michael initially
did - and that's ignore the error from calling xxx_mq_start_locked().
Followed, hopefully, with some comments explaining how this all holds
together.

How's that sound?



-adrian

From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 02:51:40 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 2DB0EAD8
 for <freebsd-net@freebsd.org>; Fri,  6 Dec 2013 02:51:40 +0000 (UTC)
Received: from nm11-vm1.bullet.mail.bf1.yahoo.com
 (nm11-vm1.bullet.mail.bf1.yahoo.com [98.139.213.152])
 by mx1.freebsd.org (Postfix) with SMTP id C3BA111F4
 for <freebsd-net@freebsd.org>; Fri,  6 Dec 2013 02:51:39 +0000 (UTC)
Received: from [66.196.81.172] by nm11.bullet.mail.bf1.yahoo.com with NNFMP;
 06 Dec 2013 02:51:33 -0000
Received: from [68.142.230.65] by tm18.bullet.mail.bf1.yahoo.com with NNFMP;
 06 Dec 2013 02:51:32 -0000
Received: from [127.0.0.1] by smtp222.mail.bf1.yahoo.com with NNFMP;
 06 Dec 2013 02:51:32 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
 t=1386298292; bh=KgMYV567kAtn8o8hpPVZL2Q667pb0084Lr7nz9T18Ec=;
 h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:X-Rocket-Received:Message-ID:Date:From:User-Agent:MIME-Version:To:Subject:Content-Type;
 b=5GqqE2L6HEGF7l+1sDKTOwkJzVveksqHORTdJ6n3klcJPDYymWS7ORdyK08j4Y2Uak50uH06UwNGFuT07mb5UK7Ba5dMLtVJd/pS97a2GacnE1pXCvwokxcJPz/Iiuy+c5y2Lr8jRa4ByG4GPHaejKjJHEZzeDw78SdIGjA8LiY=
X-Yahoo-Newman-Id: 957067.31842.bm@smtp222.mail.bf1.yahoo.com
X-Yahoo-Newman-Property: ymail-3
X-YMail-OSG: 91Vh7LAVM1niDGiwgV5R_KnKbYP67F12qauK4WQWJDqokmd
 EMq441Hjgq6E6P3g13wSNzwQlYcQtgStoIrETCb4VxSMH6acapw_6ScI5llJ
 iBB1Nk72uPDT9L0V1oqDrIcWgxoJ.12LEqLyc6Kzivnq5twArb9hXza8jQCc
 5RefqM319e4oqihJKwGk8QBQXelnnOlUiLKO6Rj2HVgbMy9ok8qrKYEt31gt
 0RqwEGGivcsC1hn4P2l3Pp5mDux3HQQgVyMrRTBadyOlh4IceqOzLGW8loT3
 rDRc7Tghv_uatRR3dxn4lvknROJoTySNH1K3Er73uOD.DRL7aAWhamD4x5dU
 PZJ7HIPm7BgMnd3nVTI2ZZRAsfpVH5PASr2NDfdLD24K3DW..X64SkaJHsFX
 SCS7Vg5BeN_XKq5LXif1I7fUfe.8yLbMuygso.WWQ5_ptnSr3.AwvyGTuuh8
 EVAcEpyrjaqZbShDJ1aF7Vi59kkWGApdYFMoA2BMEYchQGf6whxmjAt4T29d
 2Ob.PR52qfmbAYFUlbAMpwdvAoWHr.oZv4nVOnajeFESbLTtz9q9Qzd9LKw- -
X-Yahoo-SMTP: sHqPI42swBDl6e.0QxkIIsC77EttkMXsaRT5OA--
X-Rocket-Received: from [192.168.1.18] (blue_phoenix316@76.4.203.61 with )
 by smtp222.mail.bf1.yahoo.com with SMTP; 06 Dec 2013 02:51:32 +0000 UTC
Message-ID: <52A13BB1.50106@yahoo.com>
Date: Thu, 05 Dec 2013 21:51:29 -0500
From: Darryl Lyle <blue_phoenix316@yahoo.com>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:24.0) Gecko/20100101 Thunderbird/24.1.0
MIME-Version: 1.0
To: freebsd-net@freebsd.org, yongari@freebsd.org
Subject: Can't connect to network with my NIC
Content-Type: multipart/mixed; boundary="------------050205050605040301090407"
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 02:51:40 -0000

This is a multi-part message in MIME format.
--------------050205050605040301090407
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hey guys,

     I just installed the latest PC-BSD 10-STABLE image, and I can't
connect to my network with my wired NIC, but I can connect fine via Wi-Fi.

ifconfig re0
re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
         ether f8:b1:56:9d:84:3a
         inet6 fe80::fab1:56ff:fe9d:843a%re0 prefixlen 64 scopeid 0x1
         inet 0.0.0.0 netmask 0xff000000 broadcast 255.255.255.255
         nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
         media: Ethernet autoselect (1000baseT <full-duplex>)
         status: active

Attached are the outputs of dmesg and pciconf -lv.

If I run dhclient re0, I receive no DHCPOFFERs.




v/r
Darryl Lyle

--------------050205050605040301090407
Content-Type: text/plain; charset=UTF-8;
 name="dmesg.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="dmesg.txt"

Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10-STABLE-p4 #0 403eae4(stable/10): Mon Nov 18 16:35:51 EST 2013
    root@avenger:/usr/obj/root/pcbsd-build/git/freebsd/sys/GENERIC amd64
FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
CPU: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (3392.21-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x306c3  Family = 0x6  Model = 0x3c  Stepping = 3
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffafbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,<b11>,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x21<LAHF,ABM>
  Standard Extended Features=0x2fbb<GSFSBASE,TSCADJ,BMI1,HLE,AVX2,SMEP,BMI2,ENHMOVSB,INVPCID,RTM>
  TSC: P-state invariant, performance statistics
real memory  = 9110028288 (8688 MB)
avail memory = 8212455424 (7832 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <DELL   FX09   >
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s) x 2 SMT threads
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  2
 cpu3 (AP): APIC ID:  3
 cpu4 (AP): APIC ID:  4
 cpu5 (AP): APIC ID:  5
 cpu6 (AP): APIC ID:  6
 cpu7 (AP): APIC ID:  7
ioapic0 <Version 2.0> irqs 0-23 on motherboard
kbd1 at kbdmux0
random: <Software, Yarrow> initialized
cryptosoft0: <software crypto> on motherboard
aesni0: <AES-CBC,AES-XTS> on motherboard
acpi0: <DELL FX09   > on motherboard
acpi0: Power Button (fixed)
acpi0: reservation of 67, 1 (4) failed
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
cpu4: <ACPI CPU> on acpi0
cpu5: <ACPI CPU> on acpi0
cpu6: <ACPI CPU> on acpi0
cpu7: <ACPI CPU> on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 550
atrtc0: <AT realtime clock> port 0x70-0x77 irq 8 on acpi0
atrtc0: Warning: Couldn't map I/O.
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1808-0x180b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
vgapci0: <VGA-compatible display> port 0xe000-0xe07f mem 0xf6000000-0xf6ffffff,0xe8000000-0xefffffff,0xf0000000-0xf1ffffff irq 16 at device 0.0 on pci1
nvidia0: <GeForce GTX 660> on vgapci0
vgapci0: child nvidia0 requested pci_enable_io
vgapci0: child nvidia0 requested pci_enable_io
hdac0: <NVIDIA (0x0e0a) HDA Controller> mem 0xf7080000-0xf7083fff irq 17 at device 0.1 on pci1
xhci0: <Intel Lynx Point USB 3.0 controller> mem 0xf7300000-0xf730ffff irq 16 at device 20.0 on pci0
xhci0: 32 byte context size.
xhci0: Port routing mask set to 0xffffffff
usbus0 on xhci0
pci0: <simple comms> at device 22.0 (no driver attached)
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xf7318000-0xf73183ff irq 16 at device 26.0 on pci0
usbus1: EHCI version 1.0
usbus1 on ehci0
hdac1: <Intel Lynx Point HDA Controller> mem 0xf7310000-0xf7313fff irq 22 at device 27.0 on pci0
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> irq 18 at device 28.2 on pci0
pci3: <ACPI PCI bus> on pcib3
re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xd000-0xd0ff mem 0xf7200000-0xf7200fff,0xf2100000-0xf2103fff irq 18 at device 0.0 on pci3
re0: Using 1 MSI-X message
re0: Chip rev. 0x4c000000
re0: MAC rev. 0x00000000
miibus0: <MII bus> on re0
rgephy0: <RTL8251 1000BASE-T media interface> PHY 1 on miibus0
rgephy0:  none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
re0: Ethernet address: f8:b1:56:9d:84:3a
pcib4: <ACPI PCI-PCI bridge> irq 19 at device 28.7 on pci0
pci4: <ACPI PCI bus> on pcib4
ath0: <Atheros AR9485> mem 0xf7100000-0xf717ffff irq 19 at device 0.0 on pci4
ar9300_set_stub_functions: setting stub functions
ar9300_set_stub_functions: setting stub functions
ar9300_attach: calling ar9300_hw_attach
ar9300_hw_attach: calling ar9300_eeprom_attach
ar9300_flash_map: unimplemented for now
Restoring Cal data from DRAM
Restoring Cal data from EEPROM
Restoring Cal data from Flash
Restoring Cal data from Flash
Restoring Cal data from OTP
ar9300_hw_attach: ar9300_eeprom_attach returned 0
ath0: RX status length: 48
ath0: RX buffer size: 4096
ath0: TX descriptor length: 128
ath0: TX status length: 36
ath0: TX buffers per descriptor: 4
ar9300_freebsd_setup_x_tx_desc: called, 0x0/0, 0x0/0, 0x0/0
ath0: ath_edma_setup_rxfifo: type=0, FIFO depth = 16 entries
ath0: ath_edma_setup_rxfifo: type=1, FIFO depth = 128 entries
ath0: [HT] enabling HT modes
ath0: [HT] enabling short-GI in 20MHz mode
ath0: [HT] 1 stream STBC receive enabled
ath0: [HT] 1 RX streams; 1 TX streams
ath0: AR9485 mac 576.1 RF5110 phy 0.0
ath0: 2GHz radio: 0x0000; 5GHz radio: 0x0000
ehci1: <EHCI (generic) USB 2.0 controller> mem 0xf7317000-0xf73173ff irq 23 at device 29.0 on pci0
usbus2: EHCI version 1.0
usbus2 on ehci1
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
ahci0: <Intel Lynx Point AHCI SATA controller> port 0xf070-0xf077,0xf060-0xf063,0xf050-0xf057,0xf040-0xf043,0xf020-0xf03f mem 0xf7316000-0xf73167ff irq 19 at device 31.2 on pci0
ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahciem0: <AHCI enclosure management bridge> on ahci0
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_button0: <Power Button> on acpi0
acpi_tz0: <Thermal Zone> on acpi0
acpi_tz1: <Thermal Zone> on acpi0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
ppc0: cannot reserve I/O port range
est0: <Enhanced SpeedStep Frequency Control> on cpu0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
p4tcc1: <CPU Frequency Thermal Control> on cpu1
est2: <Enhanced SpeedStep Frequency Control> on cpu2
p4tcc2: <CPU Frequency Thermal Control> on cpu2
est3: <Enhanced SpeedStep Frequency Control> on cpu3
p4tcc3: <CPU Frequency Thermal Control> on cpu3
est4: <Enhanced SpeedStep Frequency Control> on cpu4
p4tcc4: <CPU Frequency Thermal Control> on cpu4
est5: <Enhanced SpeedStep Frequency Control> on cpu5
p4tcc5: <CPU Frequency Thermal Control> on cpu5
est6: <Enhanced SpeedStep Frequency Control> on cpu6
p4tcc6: <CPU Frequency Thermal Control> on cpu6
est7: <Enhanced SpeedStep Frequency Control> on cpu7
p4tcc7: <CPU Frequency Thermal Control> on cpu7
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Timecounters tick every 1.000 msec
vboxdrv: fAsync=0 offMin=0x468 offMax=0x668
hdacc0: <NVIDIA (0x0040) HDA CODEC> at cad 0 on hdac0
hdaa0: <NVIDIA (0x0040) Audio Function Group> at nid 1 on hdacc0
pcm0: <NVIDIA (0x0040) (HDMI/DP 8ch)> at nid 4 on hdaa0
pcm1: <NVIDIA (0x0040) (HDMI/DP 8ch)> at nid 5 on hdaa0
pcm2: <NVIDIA (0x0040) (HDMI/DP 8ch)> at nid 6 on hdaa0
pcm3: <NVIDIA (0x0040) (HDMI/DP 8ch)> at nid 7 on hdaa0
hdacc1: <Realtek ALC899 HDA CODEC> at cad 0 on hdac1
hdaa1: <Realtek ALC899 Audio Function Group> at nid 1 on hdacc1
pcm4: <Realtek ALC899 (Analog 7.1+HP/2.0)> at nid 20,22,21,23,27 and 24,25,26 on hdaa1
random: unblocking device.
usbus0: 5.0Gbps Super Speed USB v3.0
usbus1: 480Mbps High Speed USB v2.0
usbus2: 480Mbps High Speed USB v2.0
ugen2.1: <Intel> at usbus2
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
ugen1.1: <Intel> at usbus1
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
ugen0.1: <0x8086> at usbus0
uhub2: <0x8086 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <ST1000DM003-1CH162 CC47> ATA-9 SATA 3.x device
ada0: Serial Number Z1D7GHVB
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada0: quirks=0x1<4K>
ada0: Previously was known as ad4
ses0 at ahciem0 bus 0 scbus2 target 0 lun 0
ses0: <AHCI SGPIO Enclosure 1.00 0001> SEMB S-E-S 2.00 device
ses0: SEMB SES Device
cd0 at ahcich1 bus 0 scbus1 target 0 lun 0
cd0: <TSSTcorp DVD+-RW SH-216DB D100> Removable CD-ROM SCSI-0 device 
cd0: Serial Number S10Q6YBD800E6C
cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
Netvsc initializing... SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #6 Launched!
SMP: AP CPU #7 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #5 Launched!
Timecounter "TSC-low" frequency 1696106832 Hz quality 1000
Root mount waiting for: usbus2 usbus1 usbus0
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub2: 21 ports with 21 removable, self powered
Root mount waiting for: usbus2 usbus1 usbus0
xhci0: Port routing mask set to 0x00000000
usb_alloc_device: device init 2 failed (USB_ERR_IOERROR, ignored)
ugen0.2: <Unknown> at usbus0 (disconnected)
uhub_reattach_port: could not allocate new device
ugen2.2: <vendor 0x8087> at usbus2
uhub3: <vendor 0x8087 product 0x8000, class 9/0, rev 2.00/0.05, addr 2> on usbus2
ugen1.2: <vendor 0x8087> at usbus1
uhub4: <vendor 0x8087 product 0x8008, class 9/0, rev 2.00/0.05, addr 2> on usbus1
uhub4: 6 ports with 6 removable, self powered
uhub3: 8 ports with 8 removable, self powered
Root mount waiting for: usbus2 usbus1
ugen1.3: <Generic> at usbus1
umass0: <Bulk-In, Bulk-Out, Interface> on usbus1
umass0:  SCSI over Bulk-Only; quirks = 0x4000
umass0:3:0:-1: Attached to scbus3
da0 at umass-sim0 bus 0 scbus3 target 0 lun 0
da0: <Generic- Compact Flash 1.00> Removable Direct Access SCSI-0 device 
da0: Serial Number 20100818841300000
da0: 40.000MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present
da0: quirks=0x2<NO_6_BYTE>
da1 at umass-sim0 bus 0 scbus3 target 0 lun 1
da1: <Generic- SM/xD-Picture 1.00> Removable Direct Access SCSI-0 device 
da1: Serial Number 20100818841300000
da1: 40.000MB/s transfers
da1: Attempt to query device size failed: NOT READY, Medium not present
da1: quirks=0x2<NO_6_BYTE>
ugen2.3: <DELL> at usbus2
ukbd0: <DELL Dell USB Wired Multimedia Keyboard, class 0/0, rev 1.10/1.07, addr 3> on usbus2
da2 at umass-sim0 bus 0 scbus3 target 0 lun 2
da2: <Generic- SD/MMC 1.00> Removable Direct Access SCSI-0 device 
da2: Serial Number 20100818841300000
da2: 40.000MB/s transfers
da2: Attempt to query device size failed: NOT READY, Medium not present
da2: quirks=0x2<NO_6_BYTE>
kbd2 at ukbd0
da3 at umass-sim0 bus 0 scbus3 target 0 lun 3
da3: <Generic- M.S./M.S.Pro/HG 1.00> Removable Direct Access SCSI-0 device 
da3: Serial Number 20100818841300000
da3: 40.000MB/s transfers
da3: Attempt to query device size failed: NOT READY, Medium not present
da3: quirks=0x2<NO_6_BYTE>
ugen2.4: <DELL> at usbus2
Root mount waiting for: usbus2
ugen2.5: <Atheros Communications> at usbus2
Trying to mount root from zfs:tank/ROOT/default []...
wlan0: Ethernet address: 80:56:f2:3b:a5:67
uhid0: <DELL Dell USB Wired Multimedia Keyboard, class 0/0, rev 1.10/1.07, addr 3> on usbus2
ums0: <DELL DELL USB Laser Mouse, class 0/0, rev 2.00/57.00, addr 4> on usbus2
ums0: 8 buttons and [XYZT] coordinates ID=0
Cuse4BSD v0.1.30 @ /dev/cuse
pefs: AESNI hardware acceleration enabled
ipfw2 (+ipv6) initialized, divert loadable, nat loadable, default to deny, logging disabled
WARNING: attempt to domain_add(bluetooth) after domainfinalize()
pid 3347 (VBoxSVC), uid 0: exited on signal 6
re0: link state changed to DOWN
re0: link state changed to UP
wlan0: Ethernet address: 80:56:f2:3b:a5:67
ath0: ath_edma_recv_tasklet: sc_inreset_cnt > 0; skipping
wlan0: link state changed to UP
wlan0: link state changed to DOWN
re0: link state changed to DOWN
re0: link state changed to UP
wlan0: Ethernet address: 80:56:f2:3b:a5:67
wlan0: link state changed to UP

--------------050205050605040301090407
Content-Type: text/plain; charset=UTF-8;
 name="pciconf.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="pciconf.txt"

hostb0@pci0:0:0:0:	class=0x060000 card=0x05b71028 chip=0x0c008086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Haswell DRAM Controller'
    class      = bridge
    subclass   = HOST-PCI
pcib1@pci0:0:1:0:	class=0x060400 card=0x05b71028 chip=0x0c018086 rev=0x06 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Haswell PCI Express x16 Controller'
    class      = bridge
    subclass   = PCI-PCI
xhci0@pci0:0:20:0:	class=0x0c0330 card=0x05b71028 chip=0x8c318086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Lynx Point USB xHCI Host Controller'
    class      = serial bus
    subclass   = USB
none0@pci0:0:22:0:	class=0x078000 card=0x05b71028 chip=0x8c3a8086 rev=0x04 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Lynx Point MEI Controller'
    class      = simple comms
ehci0@pci0:0:26:0:	class=0x0c0320 card=0x05b71028 chip=0x8c2d8086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Lynx Point USB Enhanced Host Controller'
    class      = serial bus
    subclass   = USB
hdac1@pci0:0:27:0:	class=0x040300 card=0x05b71028 chip=0x8c208086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Lynx Point High Definition Audio Controller'
    class      = multimedia
    subclass   = HDA
pcib2@pci0:0:28:0:	class=0x060400 card=0x05b71028 chip=0x8c108086 rev=0xd5 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Lynx Point PCI Express Root Port'
    class      = bridge
    subclass   = PCI-PCI
pcib3@pci0:0:28:2:	class=0x060400 card=0x05b71028 chip=0x8c148086 rev=0xd5 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Lynx Point PCI Express Root Port'
    class      = bridge
    subclass   = PCI-PCI
pcib4@pci0:0:28:7:	class=0x060400 card=0x05b71028 chip=0x8c1e8086 rev=0xd5 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Lynx Point PCI Express Root Port'
    class      = bridge
    subclass   = PCI-PCI
ehci1@pci0:0:29:0:	class=0x0c0320 card=0x05b71028 chip=0x8c268086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Lynx Point USB Enhanced Host Controller'
    class      = serial bus
    subclass   = USB
isab0@pci0:0:31:0:	class=0x060100 card=0x05b71028 chip=0x8c448086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Lynx Point LPC Controller'
    class      = bridge
    subclass   = PCI-ISA
ahci0@pci0:0:31:2:	class=0x010601 card=0x05b71028 chip=0x8c028086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Lynx Point 6-port SATA Controller 1 [AHCI mode]'
    class      = mass storage
    subclass   = SATA
none1@pci0:0:31:3:	class=0x0c0500 card=0x05b71028 chip=0x8c228086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Lynx Point SMBus Controller'
    class      = serial bus
    subclass   = SMBus
vgapci0@pci0:1:0:0:	class=0x030000 card=0x098a10de chip=0x118510de rev=0xa1 hdr=0x00
    vendor     = 'NVIDIA Corporation'
    class      = display
    subclass   = VGA
hdac0@pci0:1:0:1:	class=0x040300 card=0x098a10de chip=0x0e0a10de rev=0xa1 hdr=0x00
    vendor     = 'NVIDIA Corporation'
    device     = 'GK104 HDMI Audio Controller'
    class      = multimedia
    subclass   = HDA
re0@pci0:3:0:0:	class=0x020000 card=0x05b71028 chip=0x816810ec rev=0x0c hdr=0x00
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168B PCI Express Gigabit Ethernet controller'
    class      = network
    subclass   = ethernet
ath0@pci0:4:0:0:	class=0x028000 card=0x02091028 chip=0x0032168c rev=0x01 hdr=0x00
    vendor     = 'Atheros Communications Inc.'
    device     = 'AR9485 Wireless Network Adapter'
    class      = network

--------------050205050605040301090407--

From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 03:04:33 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 67052F0;
 Fri,  6 Dec 2013 03:04:33 +0000 (UTC)
Received: from mail-la0-x22d.google.com (mail-la0-x22d.google.com
 [IPv6:2a00:1450:4010:c03::22d])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 2E7BF133A;
 Fri,  6 Dec 2013 03:04:32 +0000 (UTC)
Received: by mail-la0-f45.google.com with SMTP id eh20so39634lab.4
 for <multiple recipients>; Thu, 05 Dec 2013 19:04:29 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=mm46h5ABagl0FCh8JpvgvRzoQH2L1l1AWnt4Tz+UZ3s=;
 b=FX2egA1tHxz2DbTswccBj2H9zS1iSgk2BW8PI4eK/xa2zMhDL8jvWRA0lPNf+sWsFD
 8+cg15Xe/tNrLLAg0SNt9YIHnlMvzOWgpSIRWhGnbMqWNzKYKYoDBGJee4dFKRwkh6Uh
 trg1L5vy/2vFtnp6hPYPvv3SFLVWI5KeQlYVFoDPg8cOUGm6Oj8i/TSel1cDJVf8mMf4
 0o1yGvAclB4x0NoXChxW0rAdASsFcy+RO6vSKVM0kiejyWcr58BDe/g15mCCLK0bPVlM
 dOOLBQinezRSfVCsO4HqwQHzW/V4Ov+YNGkpWlqwB8QKXM9QeXsdsLQCj/sFoxEHLR+Z
 PZ7w==
MIME-Version: 1.0
X-Received: by 10.152.234.170 with SMTP id uf10mr238362lac.43.1386299069530;
 Thu, 05 Dec 2013 19:04:29 -0800 (PST)
Received: by 10.114.166.163 with HTTP; Thu, 5 Dec 2013 19:04:29 -0800 (PST)
In-Reply-To: <CAJ-VmokQ_C_t=pZF5QnWMzjzw6YVqTD4ny3hv_cLDch-m2EOmg@mail.gmail.com>
References: <CAPBZQG29BEJJ8BK=gn+g_n5o7JSnPbsKQ-=3=6AkFOxzt+=wGQ@mail.gmail.com>
 <4053E074-EDC5-49AB-91A7-E50ABE36602E@freebsd.org>
 <CALDtMrKvwXW-ou8X7zsKx2ST=dKD7FqHvvnQtGo30znTWU+VQQ@mail.gmail.com>
 <CAPBZQG0=bcHyv7aZse=WKfjk5=6D2-+6EQHiAaDZqGtaodhMMA@mail.gmail.com>
 <CAMOc5cwFGwk0dS5VT-YxfP3Yt38R8aO-KJTX6W832uOFEdavgA@mail.gmail.com>
 <CAJ-Vmonc7SVxndmVN1jphFRa5svD5BdnMrCudSbYkx4djHXW0A@mail.gmail.com>
 <CAMOc5cyM-+vau7BsZQ5F5L95EQgN=pJqru=9aK_0aJ+VUk=gxQ@mail.gmail.com>
 <CAJ-VmokQ_C_t=pZF5QnWMzjzw6YVqTD4ny3hv_cLDch-m2EOmg@mail.gmail.com>
Date: Fri, 6 Dec 2013 11:04:29 +0800
Message-ID: <CAMOc5cznTT-0qOUrG1F55=yPpbWfn3EqWT4+V_RDqG6+DCckOQ@mail.gmail.com>
Subject: Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
From: Sepherosa Ziehau <sepherosa@gmail.com>
To: Adrian Chadd <adrian@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>,
 freebsd-net <freebsd-net@freebsd.org>, Oleg Moskalenko <mom040267@gmail.com>,
 Tim Kientzle <kientzle@freebsd.org>,
 "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 03:04:33 -0000

On Tue, Dec 3, 2013 at 5:41 AM, Adrian Chadd <adrian@freebsd.org> wrote:
>
> On 2 December 2013 03:45, Sepherosa Ziehau <sepherosa@gmail.com> wrote:
> >
> > On Mon, Dec 2, 2013 at 1:02 PM, Adrian Chadd <adrian@freebsd.org> wrote:
> >
> >> Ok, so given this, how do you guarantee the UTHREAD stays on the given
> >> CPU? You assume it stays on the CPU that the initial listen socket was
> >> created on, right? If it's migrated to another CPU core then the
> >> listen queue still stays in the original hash group that's in a netisr
> >> on a different CPU?
> >
> > As I wrote in the brief introduction above, Dfly currently relies on the
> > scheduler doing the proper thing (the scheduler did a very good job
> > during my tests).  I need to export some kind of socket option to make
> > that information available to user space programs.  Forcing UTHREAD
> > binding in the kernel is not helpful, given that things are different in
> > a reverse proxy application.  And even if that kind of binding information
> > were exported to user space, a user space program would still have to poll
> > it periodically (in Dfly at least), since other programs binding to the
> > same addr/port could come and go, which causes the inp localgroup to be
> > reorganized in the current Dfly implementation.
>
> Right. I kinda gathered that. It's fine, I was conceptually thinking
> of doing some thread pinning into this anyway.
>
> How do you see this scaling on massively multi-core machines? Like 32,
> 48, 64, 128 cores? I had some vague handwav-y notion of maybe limiting

We do have a 48-core box.  It is mainly used for package building and
other things.  I didn't run network stress tests on it.  However, we
did address some message passing problems on it that do not show up
on 8-CPU boxes.

> the concept of pcbgroup hash / netisr threads to a subset of CPUs, or
> have them be able to float between sockets but only have 1 (or n,

Floating around may be good, but by pinning netisr to a specific CPU
you could enjoy lockless per-cpu data.

> maybe) per socket. Or just have a fixed, smaller pool. The idea then

We used to have dedicated threads for UDP and TCP processing, but it
turns out that one netisr per cpu works best in Dfly.  You probably
need to try and measure before deciding to move to 1 or N netisrs per
cpu.

Best Regards,
sephe

> is the scheduler would need to be told that a given userland
> thread/process belongs to a given netisr thread, and to schedule them
> on the same CPU when possible.
>
> Anyway, thanks for doing this work. I only wish that you'd do it for
> FreeBSD. :-)
>
>
>
> -adrian




-- 
Tomorrow Will Never Die

From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 03:50:59 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 9BF7C194;
 Fri,  6 Dec 2013 03:50:59 +0000 (UTC)
Received: from mail-qe0-x230.google.com (mail-qe0-x230.google.com
 [IPv6:2607:f8b0:400d:c02::230])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 1E0B21669;
 Fri,  6 Dec 2013 03:50:59 +0000 (UTC)
Received: by mail-qe0-f48.google.com with SMTP id gc15so119188qeb.35
 for <multiple recipients>; Thu, 05 Dec 2013 19:50:58 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date:message-id:subject
 :from:to:cc:content-type;
 bh=J41ZYyv2zM12gpm4hTJRsBZIJO9j6ai5ubyi0NvaAhQ=;
 b=bAT6sFNJnbcw8+8RESL27KPy3tmRymlOWsfAsiBp9OTBi1Hk6mjcJSnIp5KCUExQbk
 6nkdztjH/VR2pXgLmwf194RU9HEVZFvKFfKYKcm3Xjt0ktE9W7VqHtQ7pz5ZQfvJYrOw
 pdCUziOT+7Jdt3Min6Hm+lJb33/fK0vF8q04oZeKPQ3ZRN6OCvtLB52DXZQ09j6WEEes
 LEfFGPothrY8eQy4cH4U0pv/nD3W4h4sW3bJCehQSon0e8FjTiHD9DAsQ1x8hm+zbfUU
 xyBO2D5FrCDKQW8wPDS/NyMEPjikV+f39LHz9mx867vrn6CsLXXiYgLqzRxScDP6LmK1
 3jeA==
MIME-Version: 1.0
X-Received: by 10.49.116.141 with SMTP id jw13mr2419321qeb.2.1386301858239;
 Thu, 05 Dec 2013 19:50:58 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.224.53.200 with HTTP; Thu, 5 Dec 2013 19:50:58 -0800 (PST)
In-Reply-To: <CAMOc5cznTT-0qOUrG1F55=yPpbWfn3EqWT4+V_RDqG6+DCckOQ@mail.gmail.com>
References: <CAPBZQG29BEJJ8BK=gn+g_n5o7JSnPbsKQ-=3=6AkFOxzt+=wGQ@mail.gmail.com>
 <4053E074-EDC5-49AB-91A7-E50ABE36602E@freebsd.org>
 <CALDtMrKvwXW-ou8X7zsKx2ST=dKD7FqHvvnQtGo30znTWU+VQQ@mail.gmail.com>
 <CAPBZQG0=bcHyv7aZse=WKfjk5=6D2-+6EQHiAaDZqGtaodhMMA@mail.gmail.com>
 <CAMOc5cwFGwk0dS5VT-YxfP3Yt38R8aO-KJTX6W832uOFEdavgA@mail.gmail.com>
 <CAJ-Vmonc7SVxndmVN1jphFRa5svD5BdnMrCudSbYkx4djHXW0A@mail.gmail.com>
 <CAMOc5cyM-+vau7BsZQ5F5L95EQgN=pJqru=9aK_0aJ+VUk=gxQ@mail.gmail.com>
 <CAJ-VmokQ_C_t=pZF5QnWMzjzw6YVqTD4ny3hv_cLDch-m2EOmg@mail.gmail.com>
 <CAMOc5cznTT-0qOUrG1F55=yPpbWfn3EqWT4+V_RDqG6+DCckOQ@mail.gmail.com>
Date: Thu, 5 Dec 2013 19:50:58 -0800
X-Google-Sender-Auth: BJDVu1NITTCeRaUA2CYm0PA57Pw
Message-ID: <CAJ-Vmoko+Gz4PdYJTaLg-22BrQqG0qO9hq+ZadPwwCPgkTsGNg@mail.gmail.com>
Subject: Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
From: Adrian Chadd <adrian@freebsd.org>
To: Sepherosa Ziehau <sepherosa@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>,
 freebsd-net <freebsd-net@freebsd.org>, Oleg Moskalenko <mom040267@gmail.com>,
 Tim Kientzle <kientzle@freebsd.org>,
 "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 03:50:59 -0000

I was thinking of n netisrs per m CPUs, where n < m; or maybe 1 netisr
for m CPUs, where m is less than the total number.

Having 48 cores contending on netisr stuff is a bit crazy. It's highly
unlikely you need that many cores doing packet pushing.
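
As a rough illustration of the "n netisrs for m CPUs" idea, here is a
minimal sketch in C. netisr_for_cpu() is a hypothetical helper, not an
existing FreeBSD or DragonFly interface; it assumes CPU ids are dense
(0..m-1) and simply lets contiguous blocks of CPUs share one netisr.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a FreeBSD or DragonFly API): map a CPU id
 * to one of nnetisr netisr threads so that contiguous blocks of CPUs
 * share a netisr.  Assumes CPU ids run from 0 to ncpus - 1.
 */
static int
netisr_for_cpu(int cpu, int ncpus, int nnetisr)
{
	assert(ncpus > 0 && nnetisr > 0 && nnetisr <= ncpus);
	assert(cpu >= 0 && cpu < ncpus);

	/* Integer scaling: CPUs [k*ncpus/nnetisr, ...) land on netisr k. */
	return (cpu * nnetisr / ncpus);
}
```

With 48 CPUs and 8 netisrs this maps CPUs 0-5 to netisr 0 and CPUs
42-47 to netisr 7.  Whether block-wise sharing is the right policy
would still need measuring.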


-a

From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 08:47:05 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 415BE8B8;
 Fri,  6 Dec 2013 08:47:05 +0000 (UTC)
Received: from mail-ve0-x22b.google.com (mail-ve0-x22b.google.com
 [IPv6:2607:f8b0:400c:c01::22b])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id E071118DB;
 Fri,  6 Dec 2013 08:47:04 +0000 (UTC)
Received: by mail-ve0-f171.google.com with SMTP id pa12so421249veb.16
 for <multiple recipients>; Fri, 06 Dec 2013 00:47:04 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=isOs9QJ92qWiJyDuRr0Jnv4a07JtfaIu3Fe/Mbn4pcE=;
 b=N0n52CteadD35VO+wTzmMtM8DFaPSA5tf1wIAW1RZ+RrmbixCYUndlL6c4HX8q713V
 KXFWvEffysJcbWN1evreurLjPV2664bmC7Ca5opoECDhYExYvzbxRKEmpR4p7OqV5YCJ
 wl4H8t8LBiQnK7reHgoAPJ2/X7gqUW5d915Y5jbO7UlKriUEP4hstn7MsJTgbFEUAn9a
 ON9xqHJ5IbKlyW2/HmlAAAWr3T5ASUN+LkBnrFq6EBFH5F1zi+CSsq6YzOEG/vPNfko9
 jJhnWPtQtInOupUAGtQuzGrf2MacSZ1fdf9LZUclCYPx+/QAgoDkMIeL5C2YecooJ67Z
 4WkQ==
MIME-Version: 1.0
X-Received: by 10.220.174.200 with SMTP id u8mr1309687vcz.6.1386319623977;
 Fri, 06 Dec 2013 00:47:03 -0800 (PST)
Received: by 10.58.7.169 with HTTP; Fri, 6 Dec 2013 00:47:03 -0800 (PST)
In-Reply-To: <52A13BB1.50106@yahoo.com>
References: <52A13BB1.50106@yahoo.com>
Date: Fri, 6 Dec 2013 12:47:03 +0400
Message-ID: <CAAr=-G6HUut2G=32H8PgR8Ht-4BceQJbpKQe+EF122S8ucWhuQ@mail.gmail.com>
Subject: Re: Can't connect to network with my NIC
From: Mikhail Vorobyev <compasmih@gmail.com>
To: Darryl Lyle <blue_phoenix316@yahoo.com>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.17
Cc: freebsd-net@freebsd.org, yongari@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 08:47:05 -0000

Hi. It looks like the network interface needs to be configured properly.


2013/12/6 Darryl Lyle <blue_phoenix316@yahoo.com>

> Hey guys,
>
>     I just installed the latest pc-bsd 10 stable image, and I can't
> connect to my network with my nic, but I can connect fine via wifi.
>
> ifconfig re0
> re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
>         ether f8:b1:56:9d:84:3a
>         inet6 fe80::fab1:56ff:fe9d:843a%re0 prefixlen 64 scopeid 0x1
>         inet 0.0.0.0 netmask 0xff000000 broadcast 255.255.255.255
>         nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (1000baseT <full-duplex>)
>         status: active
>
> Attached is dmesg and pciconf -lv
>
> If I try dhclient re0 I get no DHCPOFFERS
>
>
>
>
> v/r
> Darryl Lyle
>



-- 
Regards, Vorobyev Mikhail.

From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 09:42:55 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id E4233825
 for <freebsd-net@freebsd.org>; Fri,  6 Dec 2013 09:42:55 +0000 (UTC)
Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au
 [122.100.2.194])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 7BA541DDB
 for <freebsd-net@freebsd.org>; Fri,  6 Dec 2013 09:42:54 +0000 (UTC)
Received: from vps.rulingia.com (localhost [127.0.0.1])
 by vps.rulingia.com (8.14.7/8.14.7) with ESMTP id rB69RZhY052065
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
 Fri, 6 Dec 2013 20:27:35 +1100 (EST)
 (envelope-from peter@vps.rulingia.com)
Received: (from peter@localhost)
 by vps.rulingia.com (8.14.7/8.14.7/Submit) id rB69RZqu052064;
 Fri, 6 Dec 2013 20:27:35 +1100 (EST) (envelope-from peter)
Date: Fri, 6 Dec 2013 20:27:35 +1100
From: Peter Jeremy <peter@vps.rulingia.com>
To: Darryl Lyle <blue_phoenix316@yahoo.com>
Subject: Re: Can't connect to network with my NIC
Message-ID: <20131206092735.GA51955@vps.rulingia.com>
References: <52A13BB1.50106@yahoo.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="IJpNTDwzlM2Ie8A6"
Content-Disposition: inline
In-Reply-To: <52A13BB1.50106@yahoo.com>
X-PGP-Key: http://www.rulingia.com/keys/peter.pgp
User-Agent: Mutt/1.5.22 (2013-10-16)
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 09:42:56 -0000


--IJpNTDwzlM2Ie8A6
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On 2013-Dec-05 21:51:29 -0500, Darryl Lyle <blue_phoenix316@yahoo.com> wrote:
>     I just installed the latest pc-bsd 10 stable image, and I can't
>connect to my network with my nic, but I can connect fine via wifi.

I don't see anything immediately obvious that's wrong.

>If I try dhclient re0 I get no DHCPOFFERS

I presume there's a DHCP server visible from whatever re0 is plugged into.

As further debugging steps:
If you "tcpdump -n -i re0" on the affected box, do you see any network
traffic?  Can you see the outgoing DHCP requests?  Is there any response?

If you run "tcpdump -n -i ... ether host f8:b1:56:9d:84:3a" on another
box on the network (ideally the DHCP server), do you see the DHCP
requests?  If it's the switch or DHCP server, do you see any responses?

-- 
Peter

--IJpNTDwzlM2Ie8A6
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (FreeBSD)

iEYEARECAAYFAlKhmIcACgkQ/opHv/APuIfRzACeNK7stJoPNq2rdU4OhbQ4mGtA
CuIAoLSLaOUOJJ/GQVJFdQx2iYH5nP86
=1xw2
-----END PGP SIGNATURE-----

--IJpNTDwzlM2Ie8A6--

From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 16:26:04 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id C7AD37A7;
 Fri,  6 Dec 2013 16:26:04 +0000 (UTC)
Received: from aslan.scsiguy.com (mail.scsiguy.com [70.89.174.89])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 9BEC31B52;
 Fri,  6 Dec 2013 16:26:04 +0000 (UTC)
Received: from raycaruso-lt.sldomain.com (207-225-98-3.dia.static.qwest.net
 [207.225.98.3]) (authenticated bits=0)
 by aslan.scsiguy.com (8.14.7/8.14.5) with ESMTP id rB6GPsHg047332
 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO);
 Fri, 6 Dec 2013 09:25:57 -0700 (MST)
 (envelope-from gibbs@scsiguy.com)
Mime-Version: 1.0 (Mac OS X Mail 7.0 \(1822\))
Subject: Re: Defaults for if_capenable and detecting user initiated changes
From: "Justin T. Gibbs" <gibbs@scsiguy.com>
In-Reply-To: <201312031213.41677.jhb@freebsd.org>
Date: Fri, 6 Dec 2013 09:25:48 -0700
Message-Id: <526A243B-7B66-45BD-9B45-3BFB04F1E16D@scsiguy.com>
References: <0E13D481-9D6D-4B52-A5AD-B671BF3A85AF@scsiguy.com>
 <201312031213.41677.jhb@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
X-Mailer: Apple Mail (2.1822)
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3
 (aslan.scsiguy.com [70.89.174.89]); Fri, 06 Dec 2013 09:25:58 -0700 (MST)
Content-Type: text/plain;
	charset=windows-1252
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.17
Cc: freebsd-net@freebsd.org,
 =?iso-8859-1?Q?Roger_Pau_Monn=E9?= <royger@freebsd.org>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 16:26:04 -0000

On Dec 3, 2013, at 10:13 AM, John Baldwin <jhb@freebsd.org> wrote:

> On Wednesday, November 27, 2013 12:59:08 pm Justin T. Gibbs wrote:
>> Hi net,
>>
>> I’m reviewing a patch from Roger Pau Monné for the Xen netfront
>> driver.  The goal of the change is to avoid disturbing the user’s
>> settings for the interface just because the backend device has
>> changed or the connection to the backend was reset.  I’ve attached
>> the latest version of the patch.
>>
>> The current patch leaves the interface settings alone if they can be
>> supported by the newly attached backend.  What would be ideal is to
>> enable capabilities that default to being enabled if they were not
>> explicitly disabled by the user and can be supported by the new
>> backend.  Unfortunately, I don’t think the if_capenable and
>> if_capabilities fields are descriptive enough to deal with an
>> interface whose capabilities can change at runtime.  Just as can be
>> done with link speed, some of these settings need to allow an
>> “auto/default” setting in addition to on or off.  This would allow
>> the user to explicitly disable a capability if needed, but generally
>> allow the system to choose the best settings when they are
>> supported.  Would this be difficult to add?
>
> Couldn't you maintain this state in the Xen netfront driver's softc?
> You already get the ioctls that track changes to the capenable field,
> so when a change explicitly disables a capability you can set that
> in a 'forced off' or 'forced on' field.  Perhaps more of a 'forced'
> field that you just update by doing:
>
> 	sc->capforced |= (oldcapenable ^ newcapenable)
>
> However, it's not clear to me if you can get the underlying adapter's
> initial capenable list.  If so, I think capforced should be all you
> need to handle this (though it might be easier if you have separate
> forcedon and forcedoff fields).
>
> -- 
> John Baldwin

Certainly this could be done in the Xen driver.  The reason I posted my
question, however, was to ask whether this should be more generically
tracked by the if layer instead of handled by the underlying driver.
Lots of user interfaces support a “restore defaults” capability (e.g.
for the novice administrator who screws up, or as a step in writing a
script/procedure that starts by getting to a known state), so I think
this is interesting for more than this particular Xen issue.

=97
Justin
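
For readers following along, the 'forced' mask idea quoted above could be
sketched roughly as below. This is a standalone illustration under stated
assumptions, not netfront code: the CAP_* bits, struct xn_softc, and both
function names are hypothetical stand-ins, and only the capforced update
expression comes from the quoted mail.

```c
#include <stdint.h>

/* Hypothetical capability bits, standing in for IFCAP_* flags. */
#define CAP_TXCSUM 0x1u
#define CAP_TSO4   0x2u
#define CAP_LRO    0x4u

/* Hypothetical slice of a driver softc; field names are illustrative. */
struct xn_softc {
	uint32_t capenable;	/* capabilities currently enabled */
	uint32_t capforced;	/* bits the user has explicitly changed */
};

/* Called when the user changes capabilities (e.g. while handling an ioctl). */
static void
xn_user_setcap(struct xn_softc *sc, uint32_t newcapenable)
{
	/* Every bit that differs from the old value was touched by the user. */
	sc->capforced |= (sc->capenable ^ newcapenable);
	sc->capenable = newcapenable;
}

/* Called when a backend (re)attaches and advertises what it supports. */
static void
xn_backend_attach(struct xn_softc *sc, uint32_t backend_caps,
    uint32_t defaults)
{
	/* Bits the user touched keep their current value ... */
	uint32_t user = sc->capenable & sc->capforced;
	/* ... untouched bits follow the defaults ... */
	uint32_t automatic = defaults & ~sc->capforced;

	/* ... and nothing is enabled that the new backend cannot do. */
	sc->capenable = (user | automatic) & backend_caps;
}
```

One limitation of this sketch: with a single mask, a bit the user forced on
is forgotten once a backend that lacks it attaches, since capenable no longer
records the request. That may be one reason separate forcedon and forcedoff
fields were suggested as the easier option.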


From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 20:08:16 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 9E6301F7;
 Fri,  6 Dec 2013 20:08:16 +0000 (UTC)
Received: from mail-n.franken.de (drew.ipv6.franken.de
 [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id C44A51B48;
 Fri,  6 Dec 2013 20:08:15 +0000 (UTC)
Received: from [192.168.1.200] (p508F3521.dip0.t-ipconnect.de [80.143.53.33])
 (Authenticated sender: macmic)
 by mail-n.franken.de (Postfix) with ESMTP id A43481C0C0692;
 Fri,  6 Dec 2013 21:08:12 +0100 (CET)
Content-Type: text/plain; charset=iso-8859-1
Mime-Version: 1.0 (Mac OS X Mail 6.6 \(1510\))
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
Date: Fri, 6 Dec 2013 21:08:13 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <2D0F95A6-1321-4F8E-87FB-1B9DD33FD319@lurchi.franken.de>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
 <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
To: Adrian Chadd <adrian@freebsd.org>
X-Mailer: Apple Mail (2.1510)
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 20:08:16 -0000

On Dec 5, 2013, at 7:29 PM, Adrian Chadd <adrian@freebsd.org> wrote:

> Hi,
>
> Yes. Looking at the ixgbe code, ixgbe_mq_start_locked() returns an
> error from ixgbe_xmit() but if it fails, it puts the buffer back. But
> it's already successfully queued a frame to the driver, so in this
> instance it shouldn't return the error from ixgbe_mq_start_locked().
>
> The same deal in if_em.c and igb.c
>
> Now, drbr_putback() used to fail and now it doesn't, as you've said.
> So we should change the xxx_mq_start_locked() to set err=0 if we go
> via the drbr_putback() routine, as it hasn't actually failed to
> transmit.
>
> Now the very dirty thing is this - the error from xxx_transmit() is
> for the mbuf being queued at the end; but xxx_mq_start_locked()
> failures are for transmitting from the front. If there's only packet
> in the queue and that fails then they're the same thing and returning
> the error from xxx_mq_start_locked() matches the current mbuf being
> queued. But otherwise, they're referring to totally different packets.
> For TCP this may hurt; the TCP stack treats ENOBUFS a certain way and
> kicks off a timer to schedule a retransmit. I don't think we can fix
> _this_ right now.
>
> So Michael - can you redo your patch to set err=0 if drbr_putback() is
> called, and retest?
Hi Adrian,

I guess you are talking about a patch like:

[bsd5:~/head/sys/dev] tuexen% svn diff -x -p
Index: e1000/if_em.c
===================================================================
--- e1000/if_em.c	(revision 259039)
+++ e1000/if_em.c	(working copy)
@@ -935,6 +935,7 @@ em_mq_start_locked(struct ifnet *ifp, struct tx_ri
 				drbr_advance(ifp, txr->br);
 			else 
 				drbr_putback(ifp, txr->br, next);
+				err = 0;
 			break;
 		}
 		drbr_advance(ifp, txr->br);
Index: e1000/if_igb.c
===================================================================
--- e1000/if_igb.c	(revision 259039)
+++ e1000/if_igb.c	(working copy)
@@ -1024,6 +1024,7 @@ igb_mq_start_locked(struct ifnet *ifp, struct tx_r
 				 * may have changed it.
 				 */
 				drbr_putback(ifp, txr->br, next);
+				err = 0;
 			}
 			break;
 		}
Index: ixgbe/ixgbe.c
===================================================================
--- ixgbe/ixgbe.c	(revision 259039)
+++ ixgbe/ixgbe.c	(working copy)
@@ -864,6 +864,7 @@ ixgbe_mq_start_locked(struct ifnet *ifp, struct tx
 				drbr_advance(ifp, txr->br);
 			} else {
 				drbr_putback(ifp, txr->br, next);
+				err = 0;
 			}
 #endif
 			break;
Index: ixgbe/ixv.c
===================================================================
--- ixgbe/ixv.c	(revision 259039)
+++ ixgbe/ixv.c	(working copy)
@@ -629,6 +629,7 @@ ixv_mq_start_locked(struct ifnet *ifp, struct tx_r
 				drbr_advance(ifp, txr->br);
 			} else {
 				drbr_putback(ifp, txr->br, next);
+				err = 0;
 			}
 			break;
 		}
Index: virtio/network/if_vtnet.c
===================================================================
--- virtio/network/if_vtnet.c	(revision 259039)
+++ virtio/network/if_vtnet.c	(working copy)
@@ -2242,9 +2242,10 @@ vtnet_txq_mq_start_locked(struct vtnet_txq *txq, s
 	while ((m = drbr_peek(ifp, br)) != NULL) {
 		error = vtnet_txq_encap(txq, &m);
 		if (error) {
-			if (m != NULL)
+			if (m != NULL) {
 				drbr_putback(ifp, br, m);
-			else
+				error = 0;
+			} else
 				drbr_advance(ifp, br);
 			break;
 		}

I looked for drivers using drbr_putback() and applied a similar fix. Please
note that sys/dev/oce/oce_if.c seems strange: it uses both drbr_putback() and
drbr_enqueue(), so I left it out for now.

I tested the igb driver and the above patch fixes the problem I saw.

Based on your description above, I think the patch is valid.
However, xxx_transmit() can still return an error, even if there is no
problem with the provided packet. This is an issue for transport protocols...

Best regards
Michael



>
> Thanks!
>
>
>
>
> -adrian
>
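
The effect of the err = 0 change can be illustrated outside the kernel with a
small model of the peek/putback/advance loop. This is only a sketch under
stated assumptions: every toy_* name is hypothetical, packets are plain
integers rather than mbufs, and the real drivers' locking, statistics, and
the "packet was freed" (next == NULL) path are left out.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_RING_SIZE 8

/* Toy stand-in for a buf_ring: a FIFO of packet ids (0 means "empty"). */
struct toy_ring {
	int	slots[TOY_RING_SIZE];
	size_t	head, count;
};

struct toy_txq {
	struct toy_ring	br;
	int		hw_free;	/* free hardware descriptors */
	int		sent;		/* packets handed to the hardware */
};

static int
toy_enqueue(struct toy_ring *r, int pkt)
{
	if (r->count == TOY_RING_SIZE)
		return (ENOBUFS);	/* this packet really is dropped */
	r->slots[(r->head + r->count) % TOY_RING_SIZE] = pkt;
	r->count++;
	return (0);
}

static int
toy_peek(const struct toy_ring *r)
{
	return (r->count != 0 ? r->slots[r->head] : 0);
}

static void
toy_advance(struct toy_ring *r)
{
	r->head = (r->head + 1) % TOY_RING_SIZE;
	r->count--;
}

static void
toy_putback(struct toy_ring *r, int pkt)
{
	r->slots[r->head] = pkt;	/* packet stays at the head */
}

static int
toy_xmit(struct toy_txq *q, int pkt)
{
	(void)pkt;
	if (q->hw_free == 0)
		return (ENOBUFS);	/* no descriptors right now */
	q->hw_free--;
	q->sent++;
	return (0);
}

/* Drain from the head; a put-back packet is not reported as an error. */
static int
toy_mq_start_locked(struct toy_txq *q)
{
	int err = 0, next;

	while ((next = toy_peek(&q->br)) != 0) {
		if ((err = toy_xmit(q, next)) != 0) {
			toy_putback(&q->br, next);
			err = 0;	/* still queued, will be retried */
			break;
		}
		toy_advance(&q->br);
	}
	return (err);
}

/* Enqueue at the tail, then try to drain from the head. */
static int
toy_transmit(struct toy_txq *q, int pkt)
{
	int err = toy_enqueue(&q->br, pkt);

	if (err != 0)
		return (err);
	return (toy_mq_start_locked(q));
}
```

In this model toy_transmit() reports an error only when the packet passed in
could not be enqueued, which matches the rule argued for in this thread: the
return value should describe the caller's packet, not whichever packet
happened to be at the head of the ring.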


From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 20:15:22 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 5B29968E;
 Fri,  6 Dec 2013 20:15:22 +0000 (UTC)
Received: from mail-n.franken.de (drew.ipv6.franken.de
 [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id E000E1D0D;
 Fri,  6 Dec 2013 20:15:21 +0000 (UTC)
Received: from [192.168.1.200] (p508F3521.dip0.t-ipconnect.de [80.143.53.33])
 (Authenticated sender: macmic)
 by mail-n.franken.de (Postfix) with ESMTP id E5B941C0C069B;
 Fri,  6 Dec 2013 21:15:19 +0100 (CET)
Content-Type: text/plain; charset=iso-8859-1
Mime-Version: 1.0 (Mac OS X Mail 6.6 \(1510\))
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <CAJ-Vmo=4Zwv5V6ZYDuDLtt+owgbvmqyvrnrfnU+HeXQ3vAn-KA@mail.gmail.com>
Date: Fri, 6 Dec 2013 21:15:17 +0100
Content-Transfer-Encoding: 7bit
Message-Id: <4E82B807-12DE-441E-BCB3-261866CC5B28@lurchi.franken.de>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
 <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
 <B89B1E2D-BAF0-4815-B3AB-EB226F4F76DE@lurchi.franken.de>
 <CAJ-Vmo=4Zwv5V6ZYDuDLtt+owgbvmqyvrnrfnU+HeXQ3vAn-KA@mail.gmail.com>
To: Adrian Chadd <adrian@freebsd.org>
X-Mailer: Apple Mail (2.1510)
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 20:15:22 -0000

On Dec 5, 2013, at 11:01 PM, Adrian Chadd <adrian@freebsd.org> wrote:

> On 5 December 2013 13:05, Michael Tuexen
> <Michael.Tuexen@lurchi.franken.de> wrote:
> 
>> Just to be clear: This would mean that xxx_transmit() would return
>> an error even if the packet provided in the call xxx_transmit() is
>> enqueued and not dropped?
>> This would also be a problem with the current SCTP stack.
> 
> I think it'll return an error only if:
> 
> * it queued the frame to the tail of the drbr;
> * it then tried to transmit a frame from the head of the drbr;
> * it failed to transmit the first frame in the drbr and it couldn't
> put it back into the queue for whatever reason.
> 
> So I think it should be "ok enough" for both TCP and SCTP.
No it isn't. The transport layer calls ip_output() (or the v6 variant),
and it needs to know if the packet provided will not be put on the wire.
If it knows for sure that the provided packet was dropped by the local
stack it can do some special treatment. In all other cases, the packet
may or may not make it to the peer and the transport layer will take
care, but can't optimize.

If the above describes what I get from ip_output(), I can only ignore
it, since it doesn't help.

Which layer can make use of the above information?
> 
> Give it a go and let me know how it goes.
The patch in the other mail fixes the problem and improves the
driver.
> 
> It's an interesting architectural problem to completely solve.
Yes, it is. I think setting err=0 makes sense.
However, the information returned by xxx_transmit() as described
above seems useless to me. This is an architectural point, as
you said, and I'm interested in knowing which consumer of the
return code of xxx_transmit() can make use of it.

Best regards
Michael
> 
> 
> -adrian
> 


From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 20:17:16 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id F029D7B0;
 Fri,  6 Dec 2013 20:17:16 +0000 (UTC)
Received: from mail-n.franken.de (drew.ipv6.franken.de
 [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 804C51D38;
 Fri,  6 Dec 2013 20:17:16 +0000 (UTC)
Received: from [192.168.1.200] (p508F3521.dip0.t-ipconnect.de [80.143.53.33])
 (Authenticated sender: macmic)
 by mail-n.franken.de (Postfix) with ESMTP id A267B1C0C0692;
 Fri,  6 Dec 2013 21:17:14 +0100 (CET)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.6 \(1510\))
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <20131205223711.GB55638@funkthat.com>
Date: Fri, 6 Dec 2013 21:17:15 +0100
Content-Transfer-Encoding: 7bit
Message-Id: <3576B69E-E943-46E0-83E5-0B2194A44ED0@lurchi.franken.de>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
 <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
 <B89B1E2D-BAF0-4815-B3AB-EB226F4F76DE@lurchi.franken.de>
 <CAJ-Vmo=4Zwv5V6ZYDuDLtt+owgbvmqyvrnrfnU+HeXQ3vAn-KA@mail.gmail.com>
 <20131205223711.GB55638@funkthat.com>
To: John-Mark Gurney <jmg@funkthat.com>
X-Mailer: Apple Mail (2.1510)
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 Adrian Chadd <adrian@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 20:17:17 -0000

On Dec 5, 2013, at 11:37 PM, John-Mark Gurney <jmg@funkthat.com> wrote:

> Adrian Chadd wrote this message on Thu, Dec 05, 2013 at 14:01 -0800:
>> On 5 December 2013 13:05, Michael Tuexen
>> <Michael.Tuexen@lurchi.franken.de> wrote:
>> 
>>> Just to be clear: This would mean that xxx_transmit() would return
>>> an error even if the packet provided in the call xxx_transmit() is
>>> enqueued and not dropped?
>>> This would also be a problem with the current SCTP stack.
>> 
>> I think it'll return an error only if:
>> 
>> * it queued the frame to the tail of the drbr;
>> * it then tried to transmit a frame from the head of the drbr;
>> * it failed to transmit the first frame in the drbr and it couldn't
>> put it back into the queue for whatever reason.
>> 
>> So I think it should be "ok enough" for both TCP and SCTP.
> 
> IMO it should only return an error if the specific frame failed to be
> sent or queued.  If you cannot determine at return time if the frame
> failed to be transmitted/queued, then it should return success.
Yes, this is exactly what I think too. This is what my first patch
implements.
> 
> In the above case, if there were other frames queued ahead, and the
> first one failed, then it sounds like the frame may eventually be sent
> and we will end up sending a duplicate frame.
Correct. SCTP will even consider the frame unsent... So the SCTP stack
behaves strangely and sends a packet at wire speed over and over again (which
is not good...).

Best regards
Michael
> 
>> Give it a go and let me know how it goes.
>> 
>> It's an interesting architectural problem to completely solve.
> 
> -- 
>  John-Mark Gurney				Voice: +1 415 225 5579
> 
>     "All that I will do, has been done, All that I have, has not."
> 


From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 20:17:55 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id B2017882;
 Fri,  6 Dec 2013 20:17:55 +0000 (UTC)
Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 88DFA1D4E;
 Fri,  6 Dec 2013 20:17:55 +0000 (UTC)
Received: from h2.funkthat.com (localhost [127.0.0.1])
 by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id rB6KHmgo092246
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
 Fri, 6 Dec 2013 12:17:48 -0800 (PST)
 (envelope-from jmg@h2.funkthat.com)
Received: (from jmg@localhost)
 by h2.funkthat.com (8.14.3/8.14.3/Submit) id rB6KHmEr092245;
 Fri, 6 Dec 2013 12:17:48 -0800 (PST) (envelope-from jmg)
Date: Fri, 6 Dec 2013 12:17:48 -0800
From: John-Mark Gurney <jmg@funkthat.com>
To: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
Message-ID: <20131206201748.GF55638@funkthat.com>
Mail-Followup-To: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>,
 Adrian Chadd <adrian@freebsd.org>,
 Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
 <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
 <2D0F95A6-1321-4F8E-87FB-1B9DD33FD319@lurchi.franken.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <2D0F95A6-1321-4F8E-87FB-1B9DD33FD319@lurchi.franken.de>
User-Agent: Mutt/1.4.2.3i
X-Operating-System: FreeBSD 7.2-RELEASE i386
X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88  9322 9CB1 8F74 6D3F A396
X-Files: The truth is out there
X-URL: http://resnet.uoregon.edu/~gurney_j/
X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html
X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger?
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
 (h2.funkthat.com [127.0.0.1]); Fri, 06 Dec 2013 12:17:49 -0800 (PST)
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 Adrian Chadd <adrian@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 20:17:55 -0000

Michael Tuexen wrote this message on Fri, Dec 06, 2013 at 21:08 +0100:
> On Dec 5, 2013, at 7:29 PM, Adrian Chadd <adrian@freebsd.org> wrote:
> 
> > Yes. Looking at the ixgbe code, ixgbe_mq_start_locked() returns an
> > error from ixgbe_xmit() but if it fails, it puts the buffer back. But
> > it's already successfully queued a frame to the driver, so in this
> > instance it shouldn't return the error from ixgbe_mq_start_locked().
> > 
> > The same deal in if_em.c and igb.c
> > 
> > Now, drbr_putback() used to fail and now it doesn't, as you've said.
> > So we should change the xxx_mq_start_locked() to set err=0 if we go
> > via the drbr_putback() routine, as it hasn't actually failed to
> > transmit.
> > 
> > Now the very dirty thing is this - the error from xxx_transmit() is
> > for the mbuf being queued at the end; but xxx_mq_start_locked()
> > failures are for transmitting from the front. If there's only packet
> > in the queue and that fails then they're the same thing and returning
> > the error from xxx_mq_start_locked() matches the current mbuf being
> > queued. But otherwise, they're referring to totally different packets.
> > For TCP this may hurt; the TCP stack treats ENOBUFS a certain way and
> > kicks off a timer to schedule a retransmit. I don't think we can fix
> > _this_ right now.
> > 
> > So Michael - can you redo your patch to set err=0 if drbr_putback() is
> > called, and retest?
> Hi Adrian,
> 
> I guess you are talking about a patch like:
> 
> [bsd5:~/head/sys/dev] tuexen% svn diff -x -p
> Index: e1000/if_em.c
> ===================================================================
> --- e1000/if_em.c	(revision 259039)
> +++ e1000/if_em.c	(working copy)
> @@ -935,6 +935,7 @@ em_mq_start_locked(struct ifnet *ifp, struct tx_ri
>  				drbr_advance(ifp, txr->br);
>  			else 
>  				drbr_putback(ifp, txr->br, next);
> +				err = 0;

You probably want curly braces around this...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
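
To spell out why the braces matter: without them only drbr_putback() belongs
to the else, so err = 0 runs on both paths and the error is also cleared when
the packet was dropped (next == NULL). The sketch below is a standalone
illustration, not driver code; as_parsed() and braced() are hypothetical
names, the two counters merely stand in for drbr_advance() and
drbr_putback(), and the failing xmit call is modelled by starting with
err = ENOBUFS.

```c
#include <errno.h>
#include <stddef.h>

static int advanced;	/* stands in for calls to drbr_advance() */
static int putback;	/* stands in for calls to drbr_putback() */

/* The unbraced hunk, written the way the compiler actually parses it. */
static int
as_parsed(const void *next)
{
	int err = ENOBUFS;	/* pretend the xmit call just failed */

	if (next == NULL)
		advanced++;
	else
		putback++;
	err = 0;		/* not part of the else: always runs */
	return (err);
}

/* The intended behaviour, with braces around the else body. */
static int
braced(const void *next)
{
	int err = ENOBUFS;	/* pretend the xmit call just failed */

	if (next == NULL)
		advanced++;
	else {
		putback++;
		err = 0;	/* only cleared when the packet was put back */
	}
	return (err);
}
```

With a NULL next, as_parsed() returns 0 while braced() still returns ENOBUFS,
which is the difference the review comment is pointing at.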

From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 20:20:13 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id BC69E988;
 Fri,  6 Dec 2013 20:20:13 +0000 (UTC)
Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 8FB451D6E;
 Fri,  6 Dec 2013 20:20:13 +0000 (UTC)
Received: from h2.funkthat.com (localhost [127.0.0.1])
 by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id rB6KKCLE092298
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
 Fri, 6 Dec 2013 12:20:12 -0800 (PST)
 (envelope-from jmg@h2.funkthat.com)
Received: (from jmg@localhost)
 by h2.funkthat.com (8.14.3/8.14.3/Submit) id rB6KKC3Z092297;
 Fri, 6 Dec 2013 12:20:12 -0800 (PST) (envelope-from jmg)
Date: Fri, 6 Dec 2013 12:20:12 -0800
From: John-Mark Gurney <jmg@funkthat.com>
To: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
Message-ID: <20131206202012.GG55638@funkthat.com>
Mail-Followup-To: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>,
 Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 Adrian Chadd <adrian@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
 <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
 <B89B1E2D-BAF0-4815-B3AB-EB226F4F76DE@lurchi.franken.de>
 <CAJ-Vmo=4Zwv5V6ZYDuDLtt+owgbvmqyvrnrfnU+HeXQ3vAn-KA@mail.gmail.com>
 <20131205223711.GB55638@funkthat.com>
 <3576B69E-E943-46E0-83E5-0B2194A44ED0@lurchi.franken.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <3576B69E-E943-46E0-83E5-0B2194A44ED0@lurchi.franken.de>
User-Agent: Mutt/1.4.2.3i
X-Operating-System: FreeBSD 7.2-RELEASE i386
X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88  9322 9CB1 8F74 6D3F A396
X-Files: The truth is out there
X-URL: http://resnet.uoregon.edu/~gurney_j/
X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html
X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger?
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
 (h2.funkthat.com [127.0.0.1]); Fri, 06 Dec 2013 12:20:13 -0800 (PST)
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 Adrian Chadd <adrian@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 20:20:13 -0000

Michael Tuexen wrote this message on Fri, Dec 06, 2013 at 21:17 +0100:
> On Dec 5, 2013, at 11:37 PM, John-Mark Gurney <jmg@funkthat.com> wrote:
> 
> > Adrian Chadd wrote this message on Thu, Dec 05, 2013 at 14:01 -0800:
> >> On 5 December 2013 13:05, Michael Tuexen
> >> <Michael.Tuexen@lurchi.franken.de> wrote:
> >> 
> >>> Just to be clear: This would mean that xxx_transmit() would return
> >>> an error even if the packet provided in the call xxx_transmit() is
> >>> enqueued and not dropped?
> >>> This would also be a problem with the current SCTP stack.
> >> 
> >> I think it'll return an error only if:
> >> 
> >> * it queued the frame to the tail of the drbr;
> >> * it then tried to transmit a frame from the head of the drbr;
> >> * it failed to transmit the first frame in the drbr and it couldn't
> >> put it back into the queue for whatever reason.
> >> 
> >> So I think it should be "ok enough" for both TCP and SCTP.
> > 
> > IMO it should only return an error if the specific frame failed to be
> > sent or queued.  If you cannot determine at return time if the frame
> > failed to be transmitted/queued, then it should return success.
> Yes, this is exactly what I think too. This is what my first patch
> implements.
> > 
> > In the above case, if there were other frames queued ahead, and the
> > first one failed, then it sounds like the frame may eventually be sent
> > and we will end up sending a duplicate frame.
> Correct. SCTP will even consider the frame unsent... So the SCTP stack
> behaves strangely and sends a packet at wire speed over and over again (which
> is not good...).

Sounds like a bug in SCTP: if it gets an error like that, it needs to back
off a bit.  Though when to wake up, etc., is harder to decide...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."

From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 21:04:43 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 3D3C0184;
 Fri,  6 Dec 2013 21:04:43 +0000 (UTC)
Received: from mail-n.franken.de (drew.ipv6.franken.de
 [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 66D3B10B6;
 Fri,  6 Dec 2013 21:04:42 +0000 (UTC)
Received: from [192.168.1.200] (p508F3521.dip0.t-ipconnect.de [80.143.53.33])
 (Authenticated sender: macmic)
 by mail-n.franken.de (Postfix) with ESMTP id 08CA31C0C0695;
 Fri,  6 Dec 2013 22:04:39 +0100 (CET)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.6 \(1510\))
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <20131206201748.GF55638@funkthat.com>
Date: Fri, 6 Dec 2013 22:04:41 +0100
Content-Transfer-Encoding: 7bit
Message-Id: <956436B1-5E20-4470-B415-3311F5CC24B8@lurchi.franken.de>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
 <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
 <2D0F95A6-1321-4F8E-87FB-1B9DD33FD319@lurchi.franken.de>
 <20131206201748.GF55638@funkthat.com>
To: John-Mark Gurney <jmg@funkthat.com>
X-Mailer: Apple Mail (2.1510)
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 Adrian Chadd <adrian@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 21:04:43 -0000

On Dec 6, 2013, at 9:17 PM, John-Mark Gurney <jmg@funkthat.com> wrote:

> Michael Tuexen wrote this message on Fri, Dec 06, 2013 at 21:08 +0100:
>> On Dec 5, 2013, at 7:29 PM, Adrian Chadd <adrian@freebsd.org> wrote:
>> 
>>> Yes. Looking at the ixgbe code, ixgbe_mq_start_locked() returns an
>>> error from ixgbe_xmit() but if it fails, it puts the buffer back. But
>>> it's already successfully queued a frame to the driver, so in this
>>> instance it shouldn't return the error from ixgbe_mq_start_locked().
>>> 
>>> The same deal in if_em.c and igb.c
>>> 
>>> Now, drbr_putback() used to fail and now it doesn't, as you've said.
>>> So we should change the xxx_mq_start_locked() to set err=0 if we go
>>> via the drbr_putback() routine, as it hasn't actually failed to
>>> transmit.
>>> 
>>> Now the very dirty thing is this - the error from xxx_transmit() is
>>> for the mbuf being queued at the end; but xxx_mq_start_locked()
>>> failures are for transmitting from the front. If there's only packet
>>> in the queue and that fails then they're the same thing and returning
>>> the error from xxx_mq_start_locked() matches the current mbuf being
>>> queued. But otherwise, they're referring to totally different packets.
>>> For TCP this may hurt; the TCP stack treats ENOBUFS a certain way and
>>> kicks off a timer to schedule a retransmit. I don't think we can fix
>>> _this_ right now.
>>> 
>>> So Michael - can you redo your patch to set err=0 if drbr_putback() is
>>> called, and retest?
>> Hi Adrian,
>> 
>> I guess you are talking about a patch like:
>> 
>> [bsd5:~/head/sys/dev] tuexen% svn diff -x -p
>> Index: e1000/if_em.c
>> ===================================================================
>> --- e1000/if_em.c	(revision 259039)
>> +++ e1000/if_em.c	(working copy)
>> @@ -935,6 +935,7 @@ em_mq_start_locked(struct ifnet *ifp, struct tx_ri
>> 				drbr_advance(ifp, txr->br);
>> 			else 
>> 				drbr_putback(ifp, txr->br, next);
>> +				err = 0;
> 
> You probably want curly braces around this...
For sure. Thanks for catching it:

[bsd5:~/head/sys/dev] tuexen% svn diff -x -p
Index: e1000/if_em.c
===================================================================
--- e1000/if_em.c	(revision 259039)
+++ e1000/if_em.c	(working copy)
@@ -933,8 +933,10 @@ em_mq_start_locked(struct ifnet *ifp, struct tx_ri
 		if ((err = em_xmit(txr, &next)) != 0) {
 			if (next == NULL)
 				drbr_advance(ifp, txr->br);
-			else 
+			else {
 				drbr_putback(ifp, txr->br, next);
+				err = 0;
+			}
 			break;
 		}
 		drbr_advance(ifp, txr->br);
Index: e1000/if_igb.c
===================================================================
--- e1000/if_igb.c	(revision 259039)
+++ e1000/if_igb.c	(working copy)
@@ -1024,6 +1024,7 @@ igb_mq_start_locked(struct ifnet *ifp, struct tx_r
 				 * may have changed it.
 				 */
 				drbr_putback(ifp, txr->br, next);
+				err = 0;
 			}
 			break;
 		}
Index: ixgbe/ixgbe.c
===================================================================
--- ixgbe/ixgbe.c	(revision 259039)
+++ ixgbe/ixgbe.c	(working copy)
@@ -864,6 +864,7 @@ ixgbe_mq_start_locked(struct ifnet *ifp, struct tx
 				drbr_advance(ifp, txr->br);
 			} else {
 				drbr_putback(ifp, txr->br, next);
+				err = 0;
 			}
 #endif
 			break;
Index: ixgbe/ixv.c
===================================================================
--- ixgbe/ixv.c	(revision 259039)
+++ ixgbe/ixv.c	(working copy)
@@ -629,6 +629,7 @@ ixv_mq_start_locked(struct ifnet *ifp, struct tx_r
 				drbr_advance(ifp, txr->br);
 			} else {
 				drbr_putback(ifp, txr->br, next);
+				err = 0;
 			}
 			break;
 		}
Index: virtio/network/if_vtnet.c
===================================================================
--- virtio/network/if_vtnet.c	(revision 259039)
+++ virtio/network/if_vtnet.c	(working copy)
@@ -2242,9 +2242,10 @@ vtnet_txq_mq_start_locked(struct vtnet_txq *txq, s
 	while ((m = drbr_peek(ifp, br)) != NULL) {
 		error = vtnet_txq_encap(txq, &m);
 		if (error) {
-			if (m != NULL)
+			if (m != NULL) {
 				drbr_putback(ifp, br, m);
-			else
+				error = 0;
+			} else
 				drbr_advance(ifp, br);
 			break;
 		}
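
For readers who want to see the intended return-value semantics outside
the kernel, here is a minimal user-space sketch. It is not driver code:
the toy_* names, the fixed-size ring and the hw_room counter are all
hypothetical stand-ins for drbr_peek()/drbr_putback()/drbr_advance() and
xxx_xmit(). It only shows that a frame which stays queued after a
putback is not reported as an error, while a full ring still is.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for the kernel buf_ring; every name here is hypothetical. */
#define RING_SLOTS 8

struct toy_ring {
	int pkts[RING_SLOTS];	/* packet ids */
	size_t head, tail;	/* consume at head, produce at tail */
};

static int
toy_enqueue(struct toy_ring *r, int pkt)
{
	assert(r->tail - r->head <= RING_SLOTS);
	if (r->tail - r->head == RING_SLOTS)
		return (ENOBUFS);	/* this packet really was dropped */
	r->pkts[r->tail++ % RING_SLOTS] = pkt;
	return (0);
}

/* Hardware model: accepts hw_room more frames, then reports ENOBUFS. */
static int hw_room;

static int
toy_xmit(int pkt)
{
	(void)pkt;
	if (hw_room == 0)
		return (ENOBUFS);
	hw_room--;
	return (0);
}

/* Drain from the head; a frame that stays queued is not an error. */
static int
toy_mq_start_locked(struct toy_ring *r)
{
	int err = 0;

	while (r->head != r->tail) {
		int next = r->pkts[r->head % RING_SLOTS];	/* peek */

		if ((err = toy_xmit(next)) != 0) {
			/* putback: in this toy the frame is still at the head */
			err = 0;
			break;
		}
		r->head++;	/* advance */
	}
	return (err);
}

/* The returned error refers only to the packet passed in. */
static int
toy_transmit(struct toy_ring *r, int pkt)
{
	int err = toy_enqueue(r, pkt);

	if (err != 0)
		return (err);
	return (toy_mq_start_locked(r));
}
```

With hw_room set to 1, a second toy_transmit() call returns 0 even though
its frame is still sitting in the ring; only a full ring yields ENOBUFS.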

> 
> -- 
>  John-Mark Gurney				Voice: +1 415 225 5579
> 
>     "All that I will do, has been done, All that I have, has not."
> 


From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 21:10:51 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id D5DB730A;
 Fri,  6 Dec 2013 21:10:51 +0000 (UTC)
Received: from mail-n.franken.de (drew.ipv6.franken.de
 [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 6410F10FC;
 Fri,  6 Dec 2013 21:10:51 +0000 (UTC)
Received: from [192.168.1.200] (p508F3521.dip0.t-ipconnect.de [80.143.53.33])
 (Authenticated sender: macmic)
 by mail-n.franken.de (Postfix) with ESMTP id 5E9881C0C0695;
 Fri,  6 Dec 2013 22:10:49 +0100 (CET)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.6 \(1510\))
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
From: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
In-Reply-To: <20131206202012.GG55638@funkthat.com>
Date: Fri, 6 Dec 2013 22:10:50 +0100
Content-Transfer-Encoding: 7bit
Message-Id: <609C63CD-9332-4EAE-AACE-5B911416DF80@lurchi.franken.de>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
 <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
 <B89B1E2D-BAF0-4815-B3AB-EB226F4F76DE@lurchi.franken.de>
 <CAJ-Vmo=4Zwv5V6ZYDuDLtt+owgbvmqyvrnrfnU+HeXQ3vAn-KA@mail.gmail.com>
 <20131205223711.GB55638@funkthat.com>
 <3576B69E-E943-46E0-83E5-0B2194A44ED0@lurchi.franken.de>
 <20131206202012.GG55638@funkthat.com>
To: John-Mark Gurney <jmg@funkthat.com>
X-Mailer: Apple Mail (2.1510)
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 Adrian Chadd <adrian@freebsd.org>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 21:10:52 -0000

On Dec 6, 2013, at 9:20 PM, John-Mark Gurney <jmg@funkthat.com> wrote:

> Michael Tuexen wrote this message on Fri, Dec 06, 2013 at 21:17 +0100:
>> On Dec 5, 2013, at 11:37 PM, John-Mark Gurney <jmg@funkthat.com> wrote:
>> 
>>> Adrian Chadd wrote this message on Thu, Dec 05, 2013 at 14:01 -0800:
>>>> On 5 December 2013 13:05, Michael Tuexen
>>>> <Michael.Tuexen@lurchi.franken.de> wrote:
>>>> 
>>>>> Just to be clear: This would mean that xxx_transmit() would return
>>>>> an error even if the packet provided in the call xxx_transmit() is
>>>>> enqueued and not dropped?
>>>>> This would also be problem with the current SCTP stack.
>>>> 
>>>> I think it'll return an error only if:
>>>> 
>>>> * it queued the frame to the tail of the drbr;
>>>> * it then tried to transmit a frame from the head of the drbr;
>>>> * it failed to transmit the first frame in the drbr and it couldn't
>>>> put it back into the queue for whatever reason.
>>>> 
>>>> So I think it should be "ok enough" for both TCP and SCTP.
>>> 
>>> IMO it should only return an error if the specific frame failed to be
>>> sent or queued.  If you cannot determine at return time if the frame
>>> failed to be transmitted/queued, then it should return success.
>> Yes, this is exactly what I think too. This is what my first patch
>> realizes.
>>> 
>>> In the above case, if there were other frames queued ahead, and the
>>> first one failed, then it sounds like the frame may eventually be sent
>>> and we will end up sending a duplicate frame.
>> Correct. SCTP will even consider the frame unsent... So the SCTP stack
>> behaves strangely and sends a packet at wire speed over and over again
>> (which is not good...).
> 
> Sounds like a bug in SCTP, if it gets an error like that, it needs to back
> off a bit.. Though when to wake up, etc, is harder to decide...
Well, this is what happens:
The sender takes a packet from the send-queue and calls ip_output(). Since
it returns an error, we don't move the packet to the sent-queue, but leave
it in the send-queue (assuming it didn't go out on the wire).
However, the driver puts it on the wire, it makes it to the peer,
the peer sends a SACK, and we receive the SACK. Since the packet is
not on the sent-queue, we don't realize that it is acked. Receiving
a SACK is a trigger for sending a packet. So we take the next one
from the send-queue (the one from the beginning), and send it again.
So it is a wire speed ping pong...
So in case the lower layer tells us that there was a problem in
sending the packet, we
* don't consider it sent
* wait for the next normal protocol trigger for sending another packet.
This sounds OK to me...

That is why I need to know what an error from ip_output() means.
If I can't conclude that the provided packet was dropped, I can just
consider it sent and not try to do any optimisation.
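
The ping pong can be reproduced in a few lines of user-space C. This is
only a toy model, not SCTP stack code: toy_ip_output() is a hypothetical
lower layer that always reports ENOBUFS although the frame goes out, and
every copy on the wire is assumed to produce one SACK that triggers the
next send attempt.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical lower layer: reports an error although the frame is sent. */
static int
toy_ip_output(int *on_wire)
{
	(*on_wire)++;		/* the driver still puts the frame on the wire */
	return (ENOBUFS);	/* stale error that belongs to another frame */
}

/*
 * Count how many copies of one packet reach the wire within max_sacks
 * SACK-triggered send attempts.  trust_error selects the sender policy:
 * true  = an error means "not sent", keep the packet in the send-queue;
 * false = ignore the error and consider the packet sent.
 */
static int
toy_copies_on_wire(bool trust_error, int max_sacks)
{
	int on_wire = 0;
	bool in_send_queue = true;

	assert(max_sacks >= 0);
	/* Trigger 0 is the application send; later triggers are SACKs. */
	for (int trigger = 0; trigger <= max_sacks && in_send_queue; trigger++) {
		int err = toy_ip_output(&on_wire);

		if (err == 0 || !trust_error)
			in_send_queue = false;	/* moved to the sent-queue */
		/*
		 * Otherwise the packet stays in the send-queue, the SACK
		 * cannot match it, and the SACK triggers another send.
		 */
	}
	return (on_wire);
}
```

In this model, trusting the error puts one copy on the wire per trigger,
while treating the packet as sent puts exactly one copy on the wire.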

Best regards
Michael
> 
> -- 
>  John-Mark Gurney				Voice: +1 415 225 5579
> 
>     "All that I will do, has been done, All that I have, has not."
> 


From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 22:54:42 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 0024BA3D
 for <freebsd-net@freebsd.org>; Fri,  6 Dec 2013 22:54:41 +0000 (UTC)
Received: from mail-yh0-x230.google.com (mail-yh0-x230.google.com
 [IPv6:2607:f8b0:4002:c01::230])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id B5CBB192C
 for <freebsd-net@freebsd.org>; Fri,  6 Dec 2013 22:54:41 +0000 (UTC)
Received: by mail-yh0-f48.google.com with SMTP id f73so1008990yha.21
 for <freebsd-net@freebsd.org>; Fri, 06 Dec 2013 14:54:41 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=message-id:date:from:user-agent:mime-version:to:cc:subject
 :content-type:content-transfer-encoding;
 bh=xjntvCHfUWHfvlMVypviH0HxyFQaECAq6HOHVseK69s=;
 b=PS0CtaqHwcoA763gFgdQnSb1JFTMMgUwGMTsDd3rk8ReaoQBr54xbaataZmR1Z8xld
 TKvoYzkstPsg+gVId2/hMtRSGYOKrGOmMaJ066jLJGY9H+qsdrfvS2nfx0vERrCAGh9k
 qbb/03fBXsXGm1f1G7PDl5YW7vOFQPtSj4jCbDO5FUhUao7gVLVaZ01o2xMd5XtITaEc
 nI72NTpgsM5UQUzTcBq8/zF7jkS1s93jXlcj+OKzRd/30Ht+k7lkVyoLnrzWhI39ejfC
 3JrHw3Xjmil8Pi+WPfFW/myg++BBCdmKfzgAmn7hKiXpbpZ5Req2MH28EYs/b1Nk2Bif
 iEJQ==
X-Received: by 10.236.174.37 with SMTP id w25mr4623808yhl.36.1386370480977;
 Fri, 06 Dec 2013 14:54:40 -0800 (PST)
Received: from [10.10.1.35] ([192.252.130.194])
 by mx.google.com with ESMTPSA id b30sm106230yhm.5.2013.12.06.14.54.40
 for <multiple recipients>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Fri, 06 Dec 2013 14:54:40 -0800 (PST)
Message-ID: <52A255AB.8040905@gmail.com>
Date: Fri, 06 Dec 2013 17:54:35 -0500
From: Karim Fodil-Lemelin <fodillemlinkarim@gmail.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:24.0) Gecko/20100101 Thunderbird/24.1.1
MIME-Version: 1.0
To: freebsd-net@FreeBSD.org
Subject: Avoiding an infinite loop in e1000 82575
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Jack Vogel <jfvogel@gmail.com>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 22:54:42 -0000

Hi,

I have encountered a strange issue where the igb driver goes into an
infinite loop (I'm using version 2.3.10) when many invocations of
ifconfig are run very quickly in a while loop. The following patch
fixed it for me:

@@ -1052,12 +1052,11 @@ static void e1000_release_swfw_sync_82575(struct e1000_hw *hw, u16 mask)
  {
         u32 swfw_sync;

         DEBUGFUNC("e1000_release_swfw_sync_82575");

-       while (e1000_get_hw_semaphore_generic(hw) != E1000_SUCCESS)
-               ; /* Empty */
+       e1000_get_hw_semaphore_generic(hw);

         swfw_sync = E1000_READ_REG(hw, E1000_SW_FW_SYNC);
         swfw_sync &= ~mask;
         E1000_WRITE_REG(hw, E1000_SW_FW_SYNC, swfw_sync);

Now, I haven't seen any side effects of this change other than that it
fixed my issue, but I wonder what side effects there might be and what
effect this change will have on the system?
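
For comparison, a middle ground between spinning forever and ignoring
the return value would be a bounded retry that reports failure. The
sketch below is a user-space toy, not the e1000 shared code: the toy_*
names, TOY_MAX_TRIES and the fake register are assumptions made up for
illustration, and it does not address why the semaphore stays busy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_SUCCESS	0
#define TOY_ERR_BUSY	1
#define TOY_MAX_TRIES	100	/* arbitrary bound chosen for this sketch */

/* Fake hardware state; none of this is the real e1000 interface. */
static int toy_busy_polls;	/* semaphore stays busy for this many polls */
static uint32_t toy_swfw_sync;	/* stand-in for the SW_FW_SYNC register */

static int
toy_get_hw_semaphore(void)
{
	if (toy_busy_polls > 0) {
		toy_busy_polls--;
		return (TOY_ERR_BUSY);
	}
	return (TOY_SUCCESS);
}

/* Clear the mask bits only if the semaphore was actually obtained. */
static bool
toy_release_swfw_sync(uint16_t mask)
{
	int tries;

	assert(mask != 0);	/* nothing to release otherwise */
	for (tries = 0; tries < TOY_MAX_TRIES; tries++) {
		if (toy_get_hw_semaphore() == TOY_SUCCESS)
			break;
	}
	if (tries == TOY_MAX_TRIES)
		return (false);	/* give up instead of spinning forever */

	toy_swfw_sync &= ~(uint32_t)mask;
	/* Dropping the semaphore again is left out of this toy model. */
	return (true);
}
```

The caller can then see that the release did not happen, rather than
having the register modified without the semaphore being held.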

Thanks,

Karim.

PS: Some more information on the devices:

dmesg:

igb0: <Intel(R) PRO/1000 Network Connection version - 2.3.10> port 0xc880-0xc89f mem 0xfba80000-0xfbafffff,0xfbb78000-0xfbb7bfff irq 16 at device 0.0 on pci4
igb0: Using MSIX interrupts with 2 vectors
igb0: Ethernet address: 00:90:0b:2f:b8:00
igb0: [ITHREAD]
igb0: [ITHREAD]
igb1: <Intel(R) PRO/1000 Network Connection version - 2.3.10> port 0xcc00-0xcc1f mem 0xfbb80000-0xfbbfffff,0xfbb7c000-0xfbb7ffff irq 17 at device 0.1 on pci4
igb1: Using MSIX interrupts with 2 vectors
igb1: Ethernet address: 00:90:0b:2f:b8:01
igb1: [ITHREAD]
igb1: [ITHREAD]
igb2: <Intel(R) PRO/1000 Network Connection version - 2.3.10> port 0xd880-0xd89f mem 0xfbc80000-0xfbcfffff,0xfbd78000-0xfbd7bfff irq 16 at device 0.0 on pci5
igb2: Using MSIX interrupts with 2 vectors
igb2: Ethernet address: 00:90:0b:2f:b8:02
igb2: [ITHREAD]
igb2: [ITHREAD]
igb3: <Intel(R) PRO/1000 Network Connection version - 2.3.10> port 0xdc00-0xdc1f mem 0xfbd80000-0xfbdfffff,0xfbd7c000-0xfbd7ffff irq 17 at device 0.1 on pci5
igb3: Using MSIX interrupts with 2 vectors
igb3: Ethernet address: 00:90:0b:2f:b8:03
igb3: [ITHREAD]
igb3: [ITHREAD]

pciconf

igb0@pci0:4:0:0:        class=0x020000 card=0x00008086 chip=0x150e8086 rev=0x01 hdr=0x00
igb1@pci0:4:0:1:        class=0x020000 card=0x00008086 chip=0x150e8086 rev=0x01 hdr=0x00
igb2@pci0:5:0:0:        class=0x020000 card=0x00008086 chip=0x150e8086 rev=0x01 hdr=0x00
igb3@pci0:5:0:1:        class=0x020000 card=0x00008086 chip=0x150e8086 rev=0x01 hdr=0x00


From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 23:25:05 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id E3BE9BB9;
 Fri,  6 Dec 2013 23:25:05 +0000 (UTC)
Received: from mail-qc0-x235.google.com (mail-qc0-x235.google.com
 [IPv6:2607:f8b0:400d:c01::235])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 892E11CB8;
 Fri,  6 Dec 2013 23:25:05 +0000 (UTC)
Received: by mail-qc0-f181.google.com with SMTP id e9so994543qcy.26
 for <multiple recipients>; Fri, 06 Dec 2013 15:25:04 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date:message-id:subject
 :from:to:cc:content-type;
 bh=DY2p0OeQFtAmRu+8iIRt/bYIZT6fpNHMpFhsaXt/uUc=;
 b=qWk0cLZBCEW0B1R32tyTNUpVChOJNnJ6EccV4cH/tY4QCsE9ZTSNsQkGOLeYJlxAGb
 R7Lr18F+oEt1VX74E2W05IymVU0cuWlMRfwhwIyOTilXsvvT3SNGwY0vzLDgQqHSDPUD
 L9iYhi23//2/BjVvj74HohfERHSUxYS8ZAyO65RE9BS9+X/B+aS7tDJJZBe300dJJVeB
 KYnXsIb0tcdplFU0JFzTMj2ZFs6ucjnN10u8XG1Zc7Ak8Rk1FtaHdUT8MC0I96vxOsT4
 Qhv/Bo6h+V5b5kp4adBydJepZxC0603mUSwf9T93hNm17C5g1poS/oXkZpPxj6qrepGY
 /eYA==
MIME-Version: 1.0
X-Received: by 10.224.89.73 with SMTP id d9mr11480031qam.5.1386372304778; Fri,
 06 Dec 2013 15:25:04 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.224.53.200 with HTTP; Fri, 6 Dec 2013 15:25:04 -0800 (PST)
In-Reply-To: <609C63CD-9332-4EAE-AACE-5B911416DF80@lurchi.franken.de>
References: <521B9C2A-EECC-4412-9F68-2235320EF324@lurchi.franken.de>
 <20131202022338.GA3500@michelle.cdnetworks.com>
 <B9593E83-E687-49E9-ABDC-B2DD615180E9@lurchi.franken.de>
 <20131203021658.GC2981@michelle.cdnetworks.com>
 <CAJ-Vmo=kfoPMYjZ0WAtqmoJMz1utXH50SW9N92RA83EMUzY7WA@mail.gmail.com>
 <B89B1E2D-BAF0-4815-B3AB-EB226F4F76DE@lurchi.franken.de>
 <CAJ-Vmo=4Zwv5V6ZYDuDLtt+owgbvmqyvrnrfnU+HeXQ3vAn-KA@mail.gmail.com>
 <20131205223711.GB55638@funkthat.com>
 <3576B69E-E943-46E0-83E5-0B2194A44ED0@lurchi.franken.de>
 <20131206202012.GG55638@funkthat.com>
 <609C63CD-9332-4EAE-AACE-5B911416DF80@lurchi.franken.de>
Date: Fri, 6 Dec 2013 15:25:04 -0800
X-Google-Sender-Auth: zRzZdbNmCEGXZcUFoZOuhoLJDug
Message-ID: <CAJ-Vmomnu4VLE0Q8A+QS6+7LA7ry_kD9j05=TvNZeocRjsuE7A@mail.gmail.com>
Subject: Re: A small fix for if_em.c, if_igb.c, if_ixgbe.c
From: Adrian Chadd <adrian@freebsd.org>
To: Michael Tuexen <Michael.Tuexen@lurchi.franken.de>
Content-Type: text/plain; charset=ISO-8859-1
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>, Jack F Vogel <jfv@freebsd.org>,
 John-Mark Gurney <jmg@funkthat.com>,
 "freebsd-net@freebsd.org list" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 23:25:06 -0000

On 6 December 2013 13:10, Michael Tuexen
<Michael.Tuexen@lurchi.franken.de> wrote:

> Well, this is what happens:
> The sender takes a packet from the send-queue and calls ip_output(). Since
> it returns an error, we don't move the packet to the sent-queue, but leave
> it in the send-queue (assuming it didn't go out on the wire).
> However, the driver puts it on the wire, it makes it to the peer,
> the peer sends a SACK, and we receive the SACK. Since the packet is
> not on the sent-queue, we don't realize that it is acked. Receiving
> a SACK is a trigger for sending a packet. So we take the next one
> from the send-queue (the one from the beginning), and send it again.
> So it is a wire speed ping pong...
> So in case the lower layer tells us that there was a problem in
> sending the packet, we
> * don't consider it sent
> * wait for the next normal protocol trigger for sending another packet.
> This sounds OK to me...
>
> That is why I need to know what an error from ip_output() means.
> If I can't conclude that the provided packet was dropped, I can just
> consider it sent and not try to do any optimisation.

We're heading down the right path.

I'm increasingly believing that ignoring the return value is the
correct thing to do.


-adrian

From owner-freebsd-net@FreeBSD.ORG  Fri Dec  6 23:26:28 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 2BF31CC1
 for <freebsd-net@freebsd.org>; Fri,  6 Dec 2013 23:26:28 +0000 (UTC)
Received: from mail-qa0-x22c.google.com (mail-qa0-x22c.google.com
 [IPv6:2607:f8b0:400d:c00::22c])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id E022E1CDB
 for <freebsd-net@freebsd.org>; Fri,  6 Dec 2013 23:26:27 +0000 (UTC)
Received: by mail-qa0-f44.google.com with SMTP id i13so1041941qae.17
 for <freebsd-net@freebsd.org>; Fri, 06 Dec 2013 15:26:27 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date:message-id:subject
 :from:to:cc:content-type;
 bh=+hx6Q8PAcqLWlg81Rd0NBizlRy1v2VkAwDD6V6lpLak=;
 b=aZ4cECot0lL1Z1FzzewUunkFXxE+7siwmYk9xaiIbwBLKmSfJl8wwW+OG/cZoYKTby
 MkPYXugH5ra9JDAKnqzW6xbRicOAYEp9UQO+YIuiBz2sF4peSHx4P8XnppDliq39oyTm
 Y8lJc67LXJ1ubc6BqUQY6gZH7Kz77BFErO6fHMbequMe8f7TRikIBl5eBix49bFiaTJ5
 3mCuu4vFHg7jQgBU0nWw2Vnys9wKN+1nlrAAB1E6VrPTabDP9eQgeZDuTyg9CX9t3L9f
 9EVWZjmxXq7MOqQHxSTp9dqzqZbj2VdjiU775GB4XJ5WmfDL20QsZufgYa9VMYnT3Lip
 6Dgw==
MIME-Version: 1.0
X-Received: by 10.229.56.200 with SMTP id z8mr10765856qcg.1.1386372387046;
 Fri, 06 Dec 2013 15:26:27 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.224.53.200 with HTTP; Fri, 6 Dec 2013 15:26:27 -0800 (PST)
In-Reply-To: <52A255AB.8040905@gmail.com>
References: <52A255AB.8040905@gmail.com>
Date: Fri, 6 Dec 2013 15:26:27 -0800
X-Google-Sender-Auth: aUEcvx-obMbMaL5hhxIhlI-mkgs
Message-ID: <CAJ-Vmomy6ZCYTHsY3GnfESCFAjFDMyyzV3k9oCnbqHksjaBmEQ@mail.gmail.com>
Subject: Re: Avoiding an infinite loop in e1000 82575
From: Adrian Chadd <adrian@freebsd.org>
To: Karim Fodil-Lemelin <fodillemlinkarim@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: FreeBSD Net <freebsd-net@freebsd.org>, Jack Vogel <jfvogel@gmail.com>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Dec 2013 23:26:28 -0000

Heh, your solution isn't correct. There's a higher level race
condition somewhere.. :(

-a

On 6 December 2013 14:54, Karim Fodil-Lemelin
<fodillemlinkarim@gmail.com> wrote:
> Hi,
>
> I have encountered a strange issue where the igb driver goes into an
> infinite loop (I'm using version 2.3.10) when many invocations of
> ifconfig are run very quickly in a while loop. The following patch
> fixed it for me:
>
> @@ -1052,12 +1052,11 @@ static void e1000_release_swfw_sync_82575(struct
> e1000_hw *hw, u16 mask)
>  {
>         u32 swfw_sync;
>
>         DEBUGFUNC("e1000_release_swfw_sync_82575");
>
> -       while (e1000_get_hw_semaphore_generic(hw) != E1000_SUCCESS)
> -               ; /* Empty */
> +       e1000_get_hw_semaphore_generic(hw);
>
>         swfw_sync = E1000_READ_REG(hw, E1000_SW_FW_SYNC);
>         swfw_sync &= ~mask;
>         E1000_WRITE_REG(hw, E1000_SW_FW_SYNC, swfw_sync);
>
> Now, I haven't seen any side effect of this change except that it fixed my
> issue although I wonder what they are and what effect will this change have
> on the system?
>
> Thanks,
>
> Karim.
>
> PS: Some more information on the devices:
>
> dmesg:
>
> igb0: <Intel(R) PRO/1000 Network Connection version - 2.3.10> port
> 0xc880-0xc89f mem 0xfba80000-0xfbafffff,0xfbb78000-0xfbb7bfff irq 16 at
> device 0.0 on pci4
> igb0: Using MSIX interrupts with 2 vectors
> igb0: Ethernet address: 00:90:0b:2f:b8:00
> igb0: [ITHREAD]
> igb0: [ITHREAD]
> igb1: <Intel(R) PRO/1000 Network Connection version - 2.3.10> port
> 0xcc00-0xcc1f mem 0xfbb80000-0xfbbfffff,0xfbb7c000-0xfbb7ffff irq 17 at
> device 0.1 on pci4
> igb1: Using MSIX interrupts with 2 vectors
> igb1: Ethernet address: 00:90:0b:2f:b8:01
> igb1: [ITHREAD]
> igb1: [ITHREAD]
> igb2: <Intel(R) PRO/1000 Network Connection version - 2.3.10> port
> 0xd880-0xd89f mem 0xfbc80000-0xfbcfffff,0xfbd78000-0xfbd7bfff irq 16 at
> device 0.0 on pci5
> igb2: Using MSIX interrupts with 2 vectors
> igb2: Ethernet address: 00:90:0b:2f:b8:02
> igb2: [ITHREAD]
> igb2: [ITHREAD]
> igb3: <Intel(R) PRO/1000 Network Connection version - 2.3.10> port
> 0xdc00-0xdc1f mem 0xfbd80000-0xfbdfffff,0xfbd7c000-0xfbd7ffff irq 17 at
> device 0.1 on pci5
> igb3: Using MSIX interrupts with 2 vectors
> igb3: Ethernet address: 00:90:0b:2f:b8:03
> igb3: [ITHREAD]
> igb3: [ITHREAD]
>
> pciconf
>
> igb0@pci0:4:0:0:        class=0x020000 card=0x00008086 chip=0x150e8086
> rev=0x01 hdr=0x00
> igb1@pci0:4:0:1:        class=0x020000 card=0x00008086 chip=0x150e8086
> rev=0x01 hdr=0x00
> igb2@pci0:5:0:0:        class=0x020000 card=0x00008086 chip=0x150e8086
> rev=0x01 hdr=0x00
> igb3@pci0:5:0:1:        class=0x020000 card=0x00008086 chip=0x150e8086
> rev=0x01 hdr=0x00
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@FreeBSD.ORG  Sat Dec  7 23:16:31 2013
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 9342F3B7
 for <freebsd-net@freebsd.org>; Sat,  7 Dec 2013 23:16:31 +0000 (UTC)
Received: from mail-qe0-x231.google.com (mail-qe0-x231.google.com
 [IPv6:2607:f8b0:400d:c02::231])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 4D50A10F5
 for <freebsd-net@freebsd.org>; Sat,  7 Dec 2013 23:16:31 +0000 (UTC)
Received: by mail-qe0-f49.google.com with SMTP id w7so1667508qeb.8
 for <freebsd-net@freebsd.org>; Sat, 07 Dec 2013 15:16:30 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=eitanadler.com; s=0xdeadbeef;
 h=mime-version:in-reply-to:references:from:date:message-id:subject:to
 :cc:content-type;
 bh=NBeLw8RIAncRFcU2iYwYlfWLIemjg6JQfKq2RO2ft7I=;
 b=qO9ZaYLgrVV26a+N2T1hgssjQKV4qO9aHSXO4kn1H8w3V1wzFCNUZBsvefmoanxyIh
 4nXywBVZU5TySi/G/j2B+rbxEqMoS6ghusBeX8EIr3CXuokP6+uzBQekWfdrZlvnfslH
 uwAIEGSwjJCuWQLUn1I9e+ZuWHsh09L5D9vrs=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:in-reply-to:references:from:date
 :message-id:subject:to:cc:content-type;
 bh=NBeLw8RIAncRFcU2iYwYlfWLIemjg6JQfKq2RO2ft7I=;
 b=aMdQktbor6lBx8zda/9i2GO6zapRu6hGgeTAPlz6Xow6LJnv8sdujkLWJejV8h55BH
 emaWKmPtGQD0s4MkSK74McDFSZWkf7/KVZMcTZye1qTuwdu4PORXLEP43n78c4n3yHKz
 TjQq296jTWtQR6+k2OTpiCJ+NCthGTi+fEeFHZcufhjQ6eLCWamgEtswCu/l009lWmCz
 9VbnL00FbKmKm14l3wxW1KRtHV3RAjdpsKP5D4LAK350LmV4vTOUemm+QvQ8m3IKhShi
 IfX6Jwx8ib4ickDH2xl0nKM4ahD1hap5bVGfTj4pW58jcaH0lGE42/MaT1MkLKVFgXn3
 E6BQ==
X-Gm-Message-State: ALoCoQmwoqbQg6BubdUbsIhCAFXA1DskhK5ORvwu5SuJWTiX1spMP7rorPDrurvO8rnSkBoWD3Ja
X-Received: by 10.49.24.211 with SMTP id w19mr20056909qef.9.1386458190505;
 Sat, 07 Dec 2013 15:16:30 -0800 (PST)
MIME-Version: 1.0
Received: by 10.96.86.42 with HTTP; Sat, 7 Dec 2013 15:16:00 -0800 (PST)
In-Reply-To: <523457A1.3090606@debian.org>
References: <523457A1.3090606@debian.org>
From: Eitan Adler <lists@eitanadler.com>
Date: Sat, 7 Dec 2013 18:16:00 -0500
Message-ID: <CAF6rxgntjNFdr8unFQC=OWCNs7-UDYJaE30v4heWh_EeOg1JGA@mail.gmail.com>
Subject: Re: IPSEC
To: Robert Millan <rmh@debian.org>
Content-Type: text/plain; charset=UTF-8
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>,
 "debian-bsd@lists.debian.org" <debian-bsd@lists.debian.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 07 Dec 2013 23:16:31 -0000

Hi all,

I understand this is an old thread but I do not see an answer here.
Can anyone answer the question below?

On Sat, Sep 14, 2013 at 8:33 AM, Robert Millan <rmh@debian.org> wrote:
>
> Hi!
>
> Is there any particular reason (performance, stability concerns...)
> IPSEC support is not enabled in GENERIC?
>
> In Debian GNU/kFreeBSD we're considering enabling it in our default
> builds, due to increased user demand and as it is already enabled for
> our Linux-based flavours.
>
> However we're concerned about diverging from FreeBSD as there might be
> unforeseen consequences. Is there any specific concern on your side?
>
> If not, perhaps it could be considered for HEAD after 10.0 release?



-- 
Eitan Adler