Date:      Wed, 21 Sep 2005 13:48:45 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Sten Daniel Sørsdal <lists@wm-access.no>
Cc:        freebsd-net@freebsd.org
Subject:   Re: UDP dont fragment bit
Message-ID:  <20050921134029.M34322@fledge.watson.org>
In-Reply-To: <4331539D.9030204@wm-access.no>
References:  <20050918212110.61962.qmail@web54501.mail.yahoo.com> <20050920134408.Y34322@fledge.watson.org> <43313924.9050009@wm-access.no> <20050921114511.D34322@fledge.watson.org> <4331539D.9030204@wm-access.no>

On Wed, 21 Sep 2005, Sten Daniel Sørsdal wrote:

> Robert Watson wrote:
>>
>> So if someone could generate some application pseudo-code that suggests
>> what specifically is necessary from the socket layer in order for the
>> application to function, we can talk about socket service extensions
>> that might support the application.  For example, a way to query
>> detailed error information rather than just the SO_ERROR socket option.
>> Or a longer haul PMTU data gathering mechanism for UDP sockets.  Or ways
>> for UDP applications to more usefully query the kernel for the TCP PMTU
>> data already being recorded.
>>
>> It sounds like for the bandwidth tester, IP raw sockets already provide
>> what you need, since you want to be able to do fairly irregular UDP
>> things (i.e., receive UDP packets with bad checksums, and see
>> fragments).
>
> IP raw sockets? Sure, Everything can be solved the complicated way :o)
> Some userland applications could benefit from having the option of DF
> flag set/unset.

UDP sockets are defined as being a way to send and receive valid UDP
datagrams.  Your list of things to receive included fragments and invalid
datagrams.  While I agree with your comments below about things UDP
applications want to do, I don't agree that we should teach UDP sockets to
receive UDP datagram fragments or packets with bad checksums.
Applications looking for non-accepted IP packets and complete ICMP
messages should be using the raw socket interface.  Applications looking
for post-processed abstracted interfaces to a datagram service should be
using UDP sockets.  See below for discussion of enhancing UDP sockets.

> What about applications that want to have a way of optimizing UDP
> transfers in their network path?
>
> Some networks filter icmp and fragments irresponsibly (imho) and
> sometimes the combination of two or more networks that would cause
> problems for multicast/video/voip applications.
>
> Sometimes in one network udp packets need fragmenting and in the next
> network fragments need to get reassembled to pass a firewall which in
> turn runs out of reassembling resources. ( It is more common to block
> icmp messages about reassembly problems than DF problems IF a message is
> generated in the first place. )
>
> Sure, all of this could be fixed the complicated way but what if one
> already has an application that runs in unprivileged userland. How many
> lines of code would a simple socket option plus the "tuning" code
> require?

You're still not answering my question about application pseudo-code,
however. Adding an IP_DF option to UDP sockets is easy, and can be done in
ten lines of code or less.  Adding a way to provide detailed feedback on
error conditions associated with UDP packets sent at arbitrary points in
the path is not something that falls naturally out of the socket API, and
will require non-trivial amounts of work.  Hence my asking about the
structure and event model of your application: what exactly do you want to
know about UDP packet delivery?
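
To make the "easy" half concrete, here is a minimal application-side
sketch of requesting the DF bit on a UDP socket.  It is not code from this
thread and says nothing about the kernel change itself.  It assumes a
system that exposes either IP_DONTFRAG (documented in FreeBSD's ip(4)) or
Linux's IP_MTU_DISCOVER; the helper name udp_set_dont_fragment() is
hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>

/*
 * Ask the stack to set (on != 0) or clear (on == 0) the DF bit on
 * datagrams sent from UDP socket `s`.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
udp_set_dont_fragment(int s, int on)
{
#if defined(IP_DONTFRAG)
	/* Boolean per-socket option, as documented in FreeBSD's ip(4). */
	return (setsockopt(s, IPPROTO_IP, IP_DONTFRAG, &on, sizeof(on)));
#elif defined(IP_MTU_DISCOVER)
	/* Linux expresses the same intent through the PMTU discovery mode. */
	int mode = on ? IP_PMTUDISC_DO : IP_PMTUDISC_DONT;

	return (setsockopt(s, IPPROTO_IP, IP_MTU_DISCOVER, &mode,
	    sizeof(mode)));
#else
	/* Assumption: no equivalent option is available on this platform. */
	(void)s;
	(void)on;
	errno = ENOPROTOOPT;
	return (-1);
#endif
}
```

A caller would create the socket with socket(AF_INET, SOCK_DGRAM, 0) and
call udp_set_dont_fragment(s, 1) before sending.  This covers only the
option itself; it does nothing about reporting errors that occur further
along the path.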

Specifically, what information do you as a developer need in order to
handle asynchronous error delivery from UDP packet send, and how will it
affect your application's interaction with the network stack? We can
already deliver a synchronous EMSGSIZE when you try to send a UDP packet
out of an interface with an MTU that is lower than the packet size, given
a socket option to force IP_DF. However, if the packet hits a potential
fragmentation problem out in the wide area network, that notification is
completely asynchronous from packet transmission, and we will need a way
to feed more detailed ICMP information to the application.  Right now
asynchronous error delivery on a UDP socket is already fairly messy due to
the fact that generally applications can only pick up the error when doing
further I/O, confusing the issue of which operation actually generated the
error.
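
The two error paths described above can be sketched from the application
side roughly as follows.  This is an illustration under stated
assumptions, not a proposed API: both helper names are hypothetical, and
only the standard send(2) and getsockopt(2) with SO_ERROR are used.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>

/*
 * Send one datagram on a connected UDP socket.
 * Returns 0 if the datagram was accepted, otherwise the errno value
 * reported by send(2), for example EMSGSIZE.  Note that the stack may
 * also report a pending asynchronous error here, in which case the
 * value belongs to an earlier datagram, not to this one.
 */
int
udp_send_sync_error(int s, const void *buf, size_t len)
{
	if (send(s, buf, len, 0) == -1)
		return (errno);
	return (0);
}

/*
 * Fetch and clear the socket's pending asynchronous error.
 * Returns 0 if nothing is pending.  The result is a bare errno value:
 * it carries no ICMP detail and does not identify which earlier send
 * triggered it.
 */
int
udp_pending_async_error(int s)
{
	int err = 0;
	socklen_t len = sizeof(err);

	if (getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
		return (errno);
	return (err);
}
```

An application would call udp_send_sync_error() per datagram and poll
udp_pending_async_error() between sends.  Since the second helper returns
only an errno value, it cannot say which datagram triggered the error.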

Robert N M Watson



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20050921134029.M34322>