Date:      Thu, 17 Aug 2017 21:36:17 -0700
From:      Bakul Shah <bakul@bitblocks.com>
To:        Mike Karels <mike@karels.net>
Cc:        Gopakumar Pillai <gpillai@vmware.com>, "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>, "freebsd-net@FreeBSD.org" <freebsd-net@FreeBSD.org>
Subject:   Re: Only last IP frag sent if ARP entry absent
Message-ID:  <80D8F4C8-7A02-4F0E-88A5-9A316FC6E99F@bitblocks.com>
In-Reply-To: <9B1B1A12-CD9F-4A9F-B596-A2F6E5BAED1E@karels.net>
References:  <F9ABB88D-108D-4EF0-8962-091662F488FD@vmware.com> <43CC3432-DB42-4170-B3E7-E305561973F3@lists.zabbadoz.net> <AFD0C317-D4E2-4A9E-B6F2-CCA2B0B7464F@vmware.com> <9B1B1A12-CD9F-4A9F-B596-A2F6E5BAED1E@karels.net>

RFC 826 is the one that says this:

    If it does not, it probably informs the caller that it is throwing
    the packet away (on the assumption the packet will be retransmitted
    by a higher network layer)

Not worth fixing for the reasons you mention.

> On Aug 17, 2017, at 8:33 PM, Mike Karels <mike@karels.net> wrote:
>
> Another $.02 (inline):
>
> On 17 Aug 2017, at 18:39, Gopakumar Pillai wrote:
>
>> Thank You Bjoern. Could you please point me to the RFC?
>
> I don’t know if there is anything more recent than RFC1122 on this.  IIRC, it requires queuing at least one packet.  Queuing one packet is what BSD has done essentially since ARP was implemented.
>
>> If this is not a MUST behavior in RFC, would my fix be good? I agree that this would affect only ICMP/UDP traffic.
>
> People have been asking for queuing of multiple packets for years.  That is a more general change.  Consider another dumb application that starts out by sending multiple UDP packets back-to-back.  However, well-designed application protocols don’t experience problems like this.  I’ll quickly note that ping isn’t an application, but a network measuring tool.  If you ask the question “what happens if I start off a session with a single large packet and I don’t support retransmission”, ping answers that question correctly.
>
> If badly-designed protocols get bad performance, that doesn’t seem like a bug to me, but a feature.
>
>> On 8/17/17, 2:40 PM, "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net> wrote:
>>
>>    On 17 Aug 2017, at 21:16, Gopakumar Pillai wrote:
>>
>>    > Hi FreeBSD Networking Gurus,
>>    > I came across an issue with an old version of FreeBSD and looking at
>>    > the latest FreeBSD code, seems it exists even now. I am assuming that
>>    > this issue is not reported.
>>    >
>>    > Observation:
>>    > When a ping was performed with larger payload than MTU, the first ping
>>    > failed when the ARP entry was absent for that IP.
>>
>>    That is because ping/ICMP has no retransmit.
>>
>>
>>    > Noticed on the wire that the last IP fragment was sent for the first
>>    > request and then the subsequent requests were fine.
>>    >
>>    > Root Cause:
>>    >   * ip_output fragments the packets and loops through the fragments to
>>    > send them to ether_output.
>>    >   * ether_output does an arpresolve and if there is no existing ARP
>>    > entry it'll return EWOULDBLOCK after sending ARP Request.
>>    >   * ether_output ignores the error and propagates success to ip_output
>>    > and it continues to send the remaining fragments.
>>    >   * llentry keeps only one mbuf and the last fragment is retained when
>>    > the ARP Reply comes and the fragment is sent.
>>
>>    Yes, according to the spec (RFC) we are supposed to throw the packet
>>    away entirely and simply report that to the next upper layer.  However
>>    over the years people realised that this sucks for a TCP SYN packet with
>>    a retransmit timer and hence we store one of them.
>>
>>    A large UDP packet would btw see the same behaviour to your ping.
>>    There’s no guarantee any of these packets will not be dropped anywhere
>>    on the network, so we can as well.
>>=20
>>    Just my 2ct
>>
>>    /bz
>
> 		Mike
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
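
The root cause quoted in the thread (fragments handed down one by one while
the ARP entry is unresolved, with only one held packet per entry) can be
illustrated with a small user-space model. This is only a sketch under
stated assumptions, not FreeBSD code: the names Neighbor, hold_slots,
fragment and simulate are hypothetical, and the hold queue is modelled as
"newest frame replaces the oldest", which matches the observation that the
last fragment is the one that reaches the wire.

```python
# Toy model of "only the last IP fragment is sent if the ARP entry is absent".
# All names here are hypothetical; this is not how the kernel is structured.

class Neighbor:
    """Stand-in for one link-layer entry with a bounded hold queue."""

    def __init__(self, hold_slots=1):
        self.resolved = False         # no ARP reply seen yet
        self.hold_slots = hold_slots  # how many frames may wait for ARP
        self.held = []                # frames waiting for resolution
        self.wire = []                # frames actually transmitted

    def output(self, frame):
        """Send a frame, or hold it while the address is unresolved."""
        if self.resolved:
            self.wire.append(frame)
            return
        self.held.append(frame)
        if len(self.held) > self.hold_slots:
            self.held.pop(0)          # oldest held frame is dropped silently

    def arp_reply(self):
        """Address resolved: flush whatever is still being held."""
        self.resolved = True
        self.wire.extend(self.held)
        self.held.clear()


def fragment(ip_payload_len, mtu=1500, ip_header=20):
    """Return (offset, length) pairs; fragment data is a multiple of 8."""
    per_frag = (mtu - ip_header) // 8 * 8
    return [(off, min(per_frag, ip_payload_len - off))
            for off in range(0, ip_payload_len, per_frag)]


def simulate(ip_payload_len, hold_slots=1):
    """Send all fragments before the ARP reply arrives; return the wire."""
    nbr = Neighbor(hold_slots)
    for frag in fragment(ip_payload_len):
        nbr.output(frag)              # sender sees success for every fragment
    nbr.arp_reply()
    return nbr.wire


if __name__ == "__main__":
    # 4000 bytes of ping data plus the 8-byte ICMP header.
    print(simulate(4008, hold_slots=1))
    print(simulate(4008, hold_slots=3))
```

With a single hold slot only the final fragment survives until the ARP
reply, so the receiver cannot reassemble the first datagram. A larger
hold_slots value stands in for the "queue multiple packets" change that
the thread mentions as a more general request.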



