Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 23 Feb 2001 03:49:52 +0100
From:      Jesper Skriver <jesper@skriver.dk>
To:        Jonathan Lemon <jlemon@flugsvamp.com>
Cc:        net@freebsd.org
Subject:   Re: ICMP unreachables, take II.
Message-ID:  <20010223034952.A6694@skriver.dk>
In-Reply-To: <20010222185412.E5714@prism.flugsvamp.com>; from jlemon@flugsvamp.com on Thu, Feb 22, 2001 at 06:54:12PM -0600
References:  <20010222185412.E5714@prism.flugsvamp.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, Feb 22, 2001 at 06:54:12PM -0600, Jonathan Lemon wrote:

I was just about to send a MFC of the current code out for review, will
ditch that ...

> I recently had a bug report regarding kqueue, where the kevent() call
> for a TCP socket would return because so_error was set, but the
> connection was still valid.
> 
> The cause of this was because a non-blocking connect() call was made,
> and then the socket was monitored for writability.  However, an ICMP
> error was returned, which eventually (after several retransmits) caused
> so_error to be set in tcp_notify() without changing the connection state.
> 
> However, it doesn't really seem reasonable to leave the connection
> pending in a SYN_SENT state at this point.  From the user's perspective,
> the select/kevent call returns, indicating writability, but the next
> operation (probably write) would fail, returning the contents of so_error.
> 
> I would propose that instead of this behavior, the connection should 
> be dropped instead.

Makes sense.

> Also, RFC 1122 indicates that the application layer SHOULD report
> soft errors, and it seems that a half-hearted attempt is made in 
> tcp_notify(), by calling so{rw}wakeup.
> 
> However, this also seems to be incorrect, since select will not return
> and any tests performed on the socket state will show no change, so it
> seems that the wakeup calls should be removed.
> 
> Also, (still reading?) while I'm in this bit of code, we currently
> react to all ICMP unreachable errors during setup of a connection;
> this is incorrect, only port/protocol and administrative icmp subtypes
> should be valid, so fix this as well.  In this case, return ENETRESET
> as the error code to the user layer.

Agree, it should be a different from ECONNREFUSED

I still think we should react to the following as a minimum 
 - type 3 code  0  net unreachable
 - type 3 code  1  host unreachable
 - type 3 code  9  net administrative prohibited
 - type 3 code 10  host administrative prohibited

in addition to

 - type 3 code  2  protocol unreachable
 - type 3 code  3  port unreachable

The first too, so connections won't wait for timeout if the routers tell
you that the net/host is unreachable.

> Finally, ICMP errors should never be allowed to kill an existing TCP
> connection; if an administrative filter is installed across some existing
> flows, then those flows should be allowed to time out per the TCP protocol.

What I submitted was what we agreed upon earlier, but the above is fine
by me.

> Patch attached, please review.

See comments inline, I havn't build a kernel with it to verify
functionality.

> Index: ip_icmp.c
> ===================================================================
> RCS file: /ncvs/src/sys/netinet/ip_icmp.c,v
> retrieving revision 1.52
> diff -u -r1.52 ip_icmp.c
> --- ip_icmp.c	2001/02/18 09:34:51	1.52
> +++ ip_icmp.c	2001/02/22 23:48:25
> @@ -315,67 +315,35 @@
>  	case ICMP_UNREACH:
>  		switch (code) {
>  			case ICMP_UNREACH_NET:
> -				code = PRC_UNREACH_HOST;
> -				break;
> -
>  			case ICMP_UNREACH_HOST:
> -				code = PRC_UNREACH_HOST;
> -				break;

These 2 I don't agree upon, see above.

> Index: tcp_subr.c
> ===================================================================
> RCS file: /ncvs/src/sys/netinet/tcp_subr.c,v
> retrieving revision 1.91
> diff -u -r1.91 tcp_subr.c
> --- tcp_subr.c	2001/02/22 21:23:45	1.91
> +++ tcp_subr.c	2001/02/22 23:43:33
> @@ -134,32 +134,9 @@
>  SYSCTL_INT(_net_inet_tcp, OID_AUTO, pcbcount, CTLFLAG_RD, 
>      &tcbinfo.ipi_count, 0, "Number of active PCBs");
>  
> -/*
> - * Treat ICMP unreachables like a TCP RST as required by rfc1122 section 3.2.2.1
> - *
> - * Administatively prohibited kill's sessions regardless of
> - * their current state, other unreachable by default only kill 
> - * sessions if they are in SYN-SENT state, this ensure temporary 
> - * routing problems doesn't kill existing TCP sessions.
> - * This can be overridden by icmp_like_rst_syn_sent_only.
> - */
> - 
> -static int	icmp_unreach_like_rst = 1;
> -SYSCTL_INT(_net_inet_tcp, OID_AUTO, icmp_unreach_like_rst, CTLFLAG_RW,
> -	&icmp_unreach_like_rst, 0, 
> -	"Treat ICMP unreachable messages like TCP RST, rfc1122 section 3.2.2.1");
> -
> -/*
> - * Control if ICMP unreachable messages other that administratively prohibited
> - * ones will kill sessions not in SYN-SENT state.
> - *
> - * Has no effect unless icmp_unreach_like_rst is enabled.
> - */
> -
> -static int	icmp_like_rst_syn_sent_only = 1;
> -SYSCTL_INT(_net_inet_tcp, OID_AUTO, icmp_like_rst_syn_sent_only, CTLFLAG_RW,
> -	&icmp_like_rst_syn_sent_only, 0, 
> -	"When icmp_unreach_like_rst is enabled, only act on sessions in SYN-SENT state");

Perhaps you could keep some of the comment ...
/*
 * Treat some ICMP unreachables like a TCP RST as required by 
 * rfc1122 section 3.2.2.1
 */

>  	if (cmd == PRC_QUENCH)
>  		notify = tcp_quench;
> -	else if ((icmp_unreach_like_rst == 1) && ((cmd == PRC_UNREACH_HOST) ||
> -			(cmd == PRC_UNREACH_ADMIN_PROHIB)) && (ip) && 
> -			((IP_VHL_HL(ip->ip_vhl) << 2) == sizeof(struct ip))) {

Sure we'll not try to read off the end of the recieved packet, when we
remove the check for the header length.

I put it there as a extra check against "attackers" sending us malformed
ICMP messages with only part of the attached IP header, or even without
it.

/Jesper

-- 
Jesper Skriver, jesper(at)skriver(dot)dk  -  CCIE #5456
Work:    Network manager   @ AS3292 (Tele Danmark DataNetworks)
Private: FreeBSD committer @ AS2109 (A much smaller network ;-)

One Unix to rule them all, One Resolver to find them,
One IP to bring them all and in the zone to bind them.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-net" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20010223034952.A6694>