Date:      Sat, 19 Aug 2017 17:03:45 +0800
From:      Julian Elischer <julian@freebsd.org>
To:        Gopakumar Pillai <gpillai@vmware.com>, Mike Karels <mike@karels.net>
Cc:        "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>, "freebsd-net@FreeBSD.org" <freebsd-net@FreeBSD.org>
Subject:   Re: Only last IP frag sent if ARP entry absent
Message-ID:  <23e6afd8-bdfd-1bd3-ff56-353a44e0d806@freebsd.org>
In-Reply-To: <DA5C9C72-44E2-4D9F-B8C1-850784F36320@vmware.com>
References:  <F9ABB88D-108D-4EF0-8962-091662F488FD@vmware.com> <43CC3432-DB42-4170-B3E7-E305561973F3@lists.zabbadoz.net> <AFD0C317-D4E2-4A9E-B6F2-CCA2B0B7464F@vmware.com> <9B1B1A12-CD9F-4A9F-B596-A2F6E5BAED1E@karels.net> <DA5C9C72-44E2-4D9F-B8C1-850784F36320@vmware.com>

On 18/8/17 12:36 pm, Gopakumar Pillai wrote:
> Thank You Bjoern and Mike.
>
> While I agree with you, Mike, that ping can fail, a UDP application could also be affected if it’s sending more than an MTU of data and the ARP entry is absent. And ether_output wouldn’t even tell the app whether the send failed or not (as per the existing code). I agree that badly written applications would suffer the most. My fix only helps applications.
>
> I guess I am not totally out of line.
>
> But I’m not sure whether I should check in this fix or not! ☺
>
> Thank You again.
>
> --Gopu
>
>
>
> From: Mike Karels <mike@karels.net>
> Date: Thursday, August 17, 2017 at 8:33 PM
> To: Gopakumar Pillai <gpillai@vmware.com>
> Cc: "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>, "freebsd-net@FreeBSD.org" <freebsd-net@FreeBSD.org>
> Subject: Re: Only last IP frag sent if ARP entry absent
>
>
> Another $.02 (inline):
>
> On 17 Aug 2017, at 18:39, Gopakumar Pillai wrote:
>
> Thank You Bjoern. Could you please point me to the RFC?
>
> I don’t know if there is anything more recent than RFC1122 on this. IIRC, it requires queuing at least one packet. Queuing one packet is what BSD has done essentially since ARP was implemented.
>
> If this is not a MUST behavior in RFC, would my fix be good? I agree that this would affect only ICMP/UDP traffic.
>
> People have been asking for queuing of multiple packets for years. That is a more general change. Consider another dumb application that starts out by sending multiple UDP packets back-to-back. However, well-designed application protocols don’t experience problems like this. I’ll quickly note that ping isn’t an application, but a network measuring tool. If you ask the question “what happens if I start off a session with a single large packet and I don’t support retransmission”, ping answers that question correctly.
>
> If badly-designed protocols get bad performance, that doesn’t seem like a bug to me, but a feature.
>
> On 8/17/17, 2:40 PM, "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net> wrote:
>
> On 17 Aug 2017, at 21:16, Gopakumar Pillai wrote:
>
>> Hi FreeBSD Networking Gurus,
>> I came across an issue with an old version of FreeBSD and looking at
>> the latest FreeBSD code, seems it exists even now. I am assuming that
>> this issue is not reported.
>>
>> Observation:
>> When a ping was performed with larger payload than MTU, the first ping
>> failed when the ARP entry was absent for that IP.
> That is because ping/ICMP has no retransmit.
>
>
>> Noticed on the wire that the last IP fragment was sent for the first
>> request and then the subsequent requests were fine.
>>
>> Root Cause:
>> * ip_output fragments the packets and loops through the fragments to
>> send them to ether_output.
>> * ether_output does an arpresolve and if there is no existing ARP
>> entry it'll return EWOULDBLOCK after sending ARP Request.
>> * ether_output ignores the error and propagates success to ip_output
>> and it continues to send the remaining fragments.
>> * llentry keeps only one mbuf and the last fragment is retained when
>> the ARP Reply comes and the fragment is sent.
> Yes, according to the spec (RFC) we are supposed to throw the packet
> away entirely and simply report that to the next upper layer. However
> over the years people realised that this sucks for a TCP SYN packet with
> a retransmit timer and hence we store one of them.
The canonical example of this was always NFS over UDP, where after
sitting idle for a while, the first NFS packet would need to be
retransmitted because the first part of the 8K NFS packet went AWOL.
>
> A large UDP packet would btw see the same behaviour as your ping.
> There’s no guarantee any of these packets will not be dropped anywhere
> on the network, so we can as well.
>
> Just my 2ct
>
> /bz
>
>      Mike
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
>
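The root-cause steps quoted above (fragments handed down one by one, a hold slot that keeps a single packet, success reported to the caller regardless) can be illustrated with a toy model. This is only a sketch of the behaviour described in the thread, not FreeBSD code: NeighborEntry, output, arp_reply, fragment and send are hypothetical names, and the 1480-byte fragment payload assumes a 1500-byte MTU with a 20-byte IPv4 header.

```python
from dataclasses import dataclass, field

FRAG_PAYLOAD = 1480  # assumption: 1500-byte MTU minus a 20-byte IPv4 header


@dataclass
class NeighborEntry:
    """Toy stand-in for an unresolved ARP entry with a one-packet hold slot."""
    resolved: bool = False
    held: object = None                           # the single queued packet
    wire: list = field(default_factory=list)      # packets actually sent
    dropped: list = field(default_factory=list)   # packets silently replaced

    def output(self, pkt):
        """Model of the reported behaviour: always returns 0 (success)."""
        if self.resolved:
            self.wire.append(pkt)
            return 0
        if self.held is not None:
            self.dropped.append(self.held)  # older packet is replaced
        self.held = pkt                     # only the newest one is kept
        return 0                            # the caller never sees the drop

    def arp_reply(self):
        """Address resolved: flush whatever is in the hold slot."""
        self.resolved = True
        if self.held is not None:
            self.wire.append(self.held)
            self.held = None


def fragment(datagram_id, payload_len):
    """Split a payload into (id, byte offset, length) tuples."""
    return [(datagram_id, off, min(FRAG_PAYLOAD, payload_len - off))
            for off in range(0, payload_len, FRAG_PAYLOAD)]


def send(entry, datagram_id, payload_len):
    """Hand every fragment to the link layer, ignoring nothing but the drop."""
    for frag in fragment(datagram_id, payload_len):
        entry.output(frag)


if __name__ == "__main__":
    entry = NeighborEntry()
    send(entry, 1, 4000)   # first datagram, no ARP entry yet
    entry.arp_reply()      # ARP reply arrives
    send(entry, 2, 4000)   # second datagram, entry now resolved
    print("on the wire:", entry.wire)
    print("dropped:", entry.dropped)
```

In this model the first 4000-byte datagram leaves only its last fragment on the wire, while the second datagram goes out whole, which matches the observation in the original report. A receiver can never reassemble the first datagram, so a sender without retransmission (ping, or a UDP application that does not retry) loses it.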



