From: Robert Watson
Date: Wed, 21 Sep 2005 13:48:45 +0100 (BST)
To: Sten Daniel Sørsdal
Cc: freebsd-net@freebsd.org
Subject: Re: UDP dont fragment bit
Message-ID: <20050921134029.M34322@fledge.watson.org>
In-Reply-To: <4331539D.9030204@wm-access.no>
List-Id: Networking and TCP/IP with FreeBSD

On Wed, 21 Sep 2005, Sten Daniel Sørsdal wrote:

> Robert Watson wrote:
>>
>> So if someone could generate some application pseudo-code that suggests
>> what specifically is necessary from the socket layer in order for the
>> application to function, we can talk about socket service extensions
>> that might support the application. For example, a way to query
>> detailed error information rather than just the SO_ERROR socket option.
>> Or a longer haul PMTU data gathering mechanism for UDP sockets. Or ways
>> for UDP applications to more usefully query the kernel for the TCP PMTU
>> data already being recorded.
>>
>> It sounds like for the bandwidth tester, IP raw sockets already provide
>> what you need, since you want to be able to do fairly irregular UDP
>> things (i.e., receive UDP packets with bad checksums, and see
>> fragments).
>
> IP raw sockets? Sure, everything can be solved the complicated way :o)
> Some userland applications could benefit from having the option of DF
> flag set/unset.

UDP sockets are defined as being a way to send and receive valid UDP
datagrams. Your list of things to receive included fragments and invalid
datagrams. While I agree with your comments below about things UDP
applications want to do, I don't agree that we should teach UDP sockets to
receive UDP datagram fragments or packets with bad checksums.
Applications looking for non-accepted IP packets and complete ICMP
messages should be using the raw socket interface. Applications looking
for post-processed abstracted interfaces to a datagram service should be
using UDP sockets. See below for discussion of enhancing UDP sockets.

> What about applications that want to have a way of optimizing UDP
> transfers in their network path?
>
> Some networks filter ICMP and fragments irresponsibly (imho), and
> sometimes it is the combination of two or more networks that causes
> problems for multicast/video/VoIP applications.
>
> Sometimes in one network UDP packets need fragmenting and in the next
> network fragments need to get reassembled to pass a firewall which in
> turn runs out of reassembling resources. (It is more common to block
> ICMP messages about reassembly problems than DF problems, IF a message is
> generated in the first place.)
>
> Sure, all of this could be fixed the complicated way but what if one
> already has an application that runs in unprivileged userland? How many
> lines of code would a simple socket option plus the "tuning" code
> require?

You're still not answering my question about application pseudo-code,
however. Adding an IP_DF option to UDP sockets is easy, and can be done in
ten lines of code or less. Adding a way to provide detailed feedback on
error conditions that UDP packets hit at arbitrary points in the path is
not something that falls naturally out of the socket API, and will require
non-trivial amounts of work. Hence my asking about the structure and event
model of your application: what exactly do you want to know about UDP
packet delivery?

Specifically, what information do you as a developer need in order to
handle asynchronous error delivery from UDP packet send, and how will it
affect your application's interaction with the network stack? We can
already deliver a synchronous EMSGSIZE when you try to send a UDP packet
out of an interface with an MTU that is lower than the packet size, given
a socket option to force IP_DF.
However, if the packet hits a potential fragmentation problem out in the
wide area network, that notification is completely asynchronous with
respect to packet transmission, and we will need a way to feed more
detailed ICMP information to the application. Right now asynchronous error
delivery on a UDP socket is already fairly messy, because applications can
generally only pick up the error when doing further I/O, which obscures
which operation actually generated the error.

Robert N M Watson
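
The two error paths described above can be sketched in C. This is a
minimal illustration under stated assumptions, not code from the thread.
The helper names udp_set_dontfrag, udp_send_checked and udp_pending_error
are hypothetical. The DF option is assumed to be IP_DONTFRAG where the
platform headers define it (as current FreeBSD does), or IP_MTU_DISCOVER
with IP_PMTUDISC_DO on Linux; the message itself only speaks of a
prospective "IP_DF" option.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>
#include <stddef.h>

/*
 * Ask the stack to set DF on datagrams sent from this UDP socket.
 * Assumption: the option is IP_DONTFRAG or IP_MTU_DISCOVER, whichever
 * the platform headers provide.  Returns 0 on success, -1 with errno set.
 */
static int
udp_set_dontfrag(int s)
{
#if defined(IP_DONTFRAG)
	int on = 1;

	return (setsockopt(s, IPPROTO_IP, IP_DONTFRAG, &on, sizeof(on)));
#elif defined(IP_MTU_DISCOVER) && defined(IP_PMTUDISC_DO)
	int mode = IP_PMTUDISC_DO;

	return (setsockopt(s, IPPROTO_IP, IP_MTU_DISCOVER, &mode,
	    sizeof(mode)));
#else
	(void)s;
	errno = ENOPROTOOPT;
	return (-1);
#endif
}

/*
 * Synchronous case: send one datagram on a connected UDP socket.
 * Returns 0 if the datagram was accepted, otherwise the errno from
 * send().  With DF forced, EMSGSIZE here means the datagram exceeds the
 * MTU known locally, and the failure belongs to this very call.
 */
static int
udp_send_checked(int s, const void *buf, size_t len)
{
	if (send(s, buf, len, 0) >= 0)
		return (0);
	return (errno);
}

/*
 * Asynchronous case: fetch and clear the error pending on the socket.
 * A non-zero result was caused by some earlier datagram; SO_ERROR does
 * not say which one, which is the ambiguity discussed above.
 */
static int
udp_pending_error(int s)
{
	int err = 0;
	socklen_t len = sizeof(err);

	if (getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
		return (errno);
	return (err);
}
```

A caller would connect() the socket, call udp_set_dontfrag() once, and
treat EMSGSIZE from udp_send_checked() as a signal to shrink the datagram
before retrying. Polling udp_pending_error() between sends keeps a late
ICMP-derived error from being mistaken for a failure of the next send,
but it still cannot identify the datagram that triggered it.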