Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 07 Jan 2004 23:35:04 -0000
From:      Andre Oppermann <andre@freebsd.org>
To:        rw@codeburst.co.uk, freebsd-net@freebsd.org
Subject:   Re: kern/60889 - zero IP id change issues in 5.2RC2
Message-ID:  <3FFC97A1.4F397CC3@freebsd.org>
References:  <200401072023.UAA18922@starburst.demon.co.uk> <3FFC8CBE.13B527A3@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
Andre Oppermann wrote:
> 
> Richard,
> 
> I've attached a patch that fixes the problem with FIN/ACK and one more
> case which got it wrong.  I also fixed the host byte order problem in
> ip_output.c for the normal ip_id incrementor.
> 
> Some comments and questions:
> 
>  1. Do you think it is neccessary to do a htons() on the randomized
>     ip_id too?  I'd say yes if there is a case where it has to
>     monotonically increase afterwards.  Does it?
> 
>  2. I have a Win2k machine but have check out how I can get tcp header
>     compression to work with my Cisco AS5300 (if it doesn't do that by
>     default).  Will I see the problem when I do a download from a FreeBSD
>     5.2RC2 machine or do I have to use the Windoze as router sending
>     packets upwards?`
> 
>  3. There are indeed devices clearing the DF bit.  For example Cisco
>     is recommending this in it's trouble shooting section for broadband
>     access via DSL/L2TP where the MTU is lower than 1500 because of the
>     tunnelling overhead.  Make a route-map to unset the DF bit and apply
>     it to the incoming interface.

4. After reading the pf.conf man page from OpenBSD (where it talks about
   scrubbing IP packets) I don't think it's a good idea to set the ip_id
   to zero in the DF case.  It seems to cause various kinds of problems
   when for some reason DF is removed along the path (which it shouldn't
   but whatever).  I think it is clearly better to put a ip_id into every
   packet no matter what.  Although the ip_randomid() function doesn't
   look really cheap to run...  but then security comes at a cost.

5. Random ip_id is already a kernel option but it is *not* enabled by
   default in 5.2RC2 GENERIC or -current.  On the other hand the code
   *does* zero the ip_id when DF is set in any case which is troublesome
   as we just found out.

Updated fix attached.  Same tab-brokenness due to c&p.

-- 
Andre

Index: ip_output.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/ip_output.c,v
retrieving revision 1.203
diff -u -p -r1.203 ip_output.c
--- ip_output.c 20 Nov 2003 20:07:38 -0000      1.203
+++ ip_output.c 7 Jan 2004 23:31:56 -0000
@@ -229,10 +229,10 @@ ip_output(struct mbuf *m0, struct mbuf *

        /*
         * Fill in IP header.  If we are not allowing fragmentation,
-        * then the ip_id field is meaningless, so send it as zero
-        * to reduce information leakage.  Otherwise, if we are not
-        * randomizing ip_id, then don't bother to convert it to network
-        * byte order -- it's just a nonce.  Note that a 16-bit counter
+        * then the ip_id field is meaningless, but we don't set it
+        * to zero.  Doing so causes various problems when devices along
+        * the path (routers, load balancers, firewalls, etc.) illegally
+        * disable DF on our packet.  Note that a 16-bit counter
         * will wrap around in less than 10 seconds at 100 Mbit/s on a
         * medium with MTU 1500.  See Steven M. Bellovin, "A Technique
         * for Counting NATted Hosts", Proc. IMW'02, available at
@@ -241,17 +241,11 @@ ip_output(struct mbuf *m0, struct mbuf *
        if ((flags & (IP_FORWARDING|IP_RAWOUTPUT)) == 0) {
                ip->ip_v = IPVERSION;
                ip->ip_hl = hlen >> 2;
-               if ((ip->ip_off & IP_DF) == 0) {
-                       ip->ip_off = 0;
 #ifdef RANDOM_IP_ID
-                       ip->ip_id = ip_randomid();
+               ip->ip_id = ip_randomid();
 #else
-                       ip->ip_id = ip_id++;
+               ip->ip_id = htons(ip_id++);
 #endif
-               } else {
-                       ip->ip_off = IP_DF;
-                       ip->ip_id = 0;
-               }
                ipstat.ips_localout++;
        } else {
                hlen = ip->ip_hl << 2;
Index: tcp_subr.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/tcp_subr.c,v
retrieving revision 1.172
diff -u -p -r1.172 tcp_subr.c
--- tcp_subr.c  6 Jan 2004 23:29:46 -0000       1.172
+++ tcp_subr.c  7 Jan 2004 23:31:57 -0000
@@ -459,6 +459,8 @@ tcp_respond(tp, ipgen, th, m, ack, seq,
        tlen += sizeof (struct tcpiphdr);
        ip->ip_len = tlen;
        ip->ip_ttl = ip_defttl;
+       if (path_mtu_discovery)
+               ip->ip_off |= IP_DF;
       }
        m->m_len = tlen;
        m->m_pkthdr.len = tlen;
@@ -1733,6 +1735,8 @@ tcp_twrespond(struct tcptw *tw, struct s
                m->m_pkthdr.csum_flags = CSUM_TCP;
                m->m_pkthdr.csum_data = offsetof(struct tcphdr, th_sum);
                ip->ip_len = m->m_pkthdr.len;
+               if (path_mtu_discovery)
+                       ip->ip_off |= IP_DF;
                error = ip_output(m, inp->inp_options, NULL,
                    (tw->tw_so_options & SO_DONTROUTE), NULL, inp);
        }


> If these questions are answered I can prepare the final patch and
> commit it pending review and re@ approval.
> 
> --
> Andre
> 
> (Note: tabs converted to whitespace because of c&p)
> 
> Index: ip_output.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/netinet/ip_output.c,v
> retrieving revision 1.203
> diff -u -p -r1.203 ip_output.c
> --- ip_output.c 20 Nov 2003 20:07:38 -0000      1.203
> +++ ip_output.c 7 Jan 2004 22:43:54 -0000
> @@ -246,7 +246,7 @@ ip_output(struct mbuf *m0, struct mbuf *
>  #ifdef RANDOM_IP_ID
>                         ip->ip_id = ip_randomid();
>  #else
> -                       ip->ip_id = ip_id++;
> +                       ip->ip_id = htons(ip_id++);
>  #endif
>                 } else {
>                         ip->ip_off = IP_DF;
> Index: tcp_subr.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/netinet/tcp_subr.c,v
> retrieving revision 1.172
> diff -u -p -r1.172 tcp_subr.c
> --- tcp_subr.c  6 Jan 2004 23:29:46 -0000       1.172
> +++ tcp_subr.c  7 Jan 2004 22:43:55 -0000
> @@ -459,6 +459,8 @@ tcp_respond(tp, ipgen, th, m, ack, seq,
>         tlen += sizeof (struct tcpiphdr);
>         ip->ip_len = tlen;
>         ip->ip_ttl = ip_defttl;
> +       if (path_mtu_discovery)
> +               ip->ip_off |= IP_DF;
>        }
>         m->m_len = tlen;
>         m->m_pkthdr.len = tlen;
> @@ -1733,6 +1735,8 @@ tcp_twrespond(struct tcptw *tw, struct s
>                 m->m_pkthdr.csum_flags = CSUM_TCP;
>                 m->m_pkthdr.csum_data = offsetof(struct tcphdr, th_sum);
>                 ip->ip_len = m->m_pkthdr.len;
> +               if (path_mtu_discovery)
> +                       ip->ip_off |= IP_DF;
>                 error = ip_output(m, inp->inp_options, NULL,
>                     (tw->tw_so_options & SO_DONTROUTE), NULL, inp);
>         }
> 
> Richard Wendland wrote:
> >
> > I've been asked (for freebsd-bugs) to open a discussion about PR
> > kern/60889 on freebsd-net, to decide if a recent change to IP should be
> > reversed before 5.2-RELEASE (scheduled this month) to give more time
> > for some serious issues and risks my PR has raised to be considered
> > and tested.  My proposal is to revert to 5.1 behaviour for now.
> >
> > The recent change is to emit a zero fragmentation id when DF is set, with
> > the objective of improving privacy from external id sequence observation,
> > an issue raised by Steve Bellovin's paper in IMW'02.
> >
> > I've identified 4 problems with this change:
> >
> > 1) Even with path_mtu_discovery set, for some reason TCP emits FIN-ACK
> >    without DF.  This causes two problems for this change:
> >
> >    a) Currently the change doesn't really meet its objective for TCP,
> >       it just means FIN-ACK must be observed.
> >
> >    b) Because now just one id is consumed per TCP connection on the
> >       FIN-ACK, for most/many systems id becomes a close approximate
> >       count of the number of TCP connections it has made, which external
> >       observers can see, not possible before this change.  To me this
> >       seems more of a privacy issue for all FreeBSD users than the NAT
> >       issue in Steve Bellovin's paper this change seeks to solve.
> >
> > 2) Linux made exactly this change some time ago, around Linux 2.4.4,
> >    but was forced to back-out the change because (I think) of practical
> >    connectivity issues related to the comment in include/net/ip.h
> >    ip_select_ident() where it now implements an incrementing ip_id for DF:
> >
> >      /* This is only to work around buggy Windows95/2000
> >       * VJ compression implementations.  If the ID field
> >       * does not change, they drop every other packet in
> >       * a TCP stream using header compression.
> >       */
> >
> >    I'm not aware that anyone checked that this buggy Windows95/2000 VJ
> >    compression problem is no longer an issue in practice.  I doubt that
> >    the large web hosters who use FreeBSD would be best-pleased if they
> >    ran into this for even just a few users - especially as there is no
> >    config option to disable this change.
> >
> > 3) This change causes ip_id for non-DF to be output in native
> >    byte order in ip_output.c.  Unfortunately ip_id is still output in
> >    Network Byte Order in ip_mroute.c and raw_ip.c, so this change risks
> >    little-endian machines emitting the same fragmentation id at about
> >    the same time from these different modules, rather than the usual
> >    64k cycle; creating a small but real risk of re-assembly errors.
> >    [This isn't in my PR, I've only just noticed it.]
> >
> > I also have a suspicion that some middle-boxes (like HTTP load-balancers)
> > may clear DF without setting a fresh IP id - clearing DF would save them
> > the bother of routing ICMP fragmentation needed back to the source server.
> > If this is so this is another problem this change could show up which
> > may cause re-assembly errors.  I know the bug is elsewhere, but it would
> > still become a practical problem for some FreeBSD users.
> >
> > So before going with this change I think four things need to be done:
> >
> > 1) TCP changed so FIN-ACK goes out with DF if path_mtu_discovery set.
> >
> > 2) Tests with Windows95/2000 TCP VJ compression (RFC1144) run.
> >
> > 3) ip_id should be emitted in the same byte order everywhere.
> >
> > 4) The change made a config option, so sites can disable it should
> >    they run into problems, just as RANDOM_IP_ID is an option.
> >
> > This all seems too much to do for 5.2-RELEASE, and as I think the problems
> > and risks sufficiently serious, I propose reversing the change until this
> > can be done.  We need a quick decision on this to get it into 5.2-RELEASE.
> >
> > The PR has some more detail and tcpdump output demonstrating the
> > issue:
> >
> >     http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/60889
> >
> > The change to be reversed can be seen at:
> >
> >     http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/ip_output.c#rev1.189
> >     http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/ip_output.c.diff?r1=1.188&r2=1.189
> >
> > NB If anyone can easily run tests with Windows95/2000 TCP VJ compression that
> > would be great (using tcpdump to see if there is abnormal retransmission).
> >
> > NB2 If anyone knows why FIN-ACK goes out without DF that would be helpful.
> > I've quickly looked at the source, and I can't see why.  It's emitted
> > as TCP moves from FIN_WAIT_n state to TIME_WAIT probably at line 3091 of
> > tcp_input.c with a call to tcp_output().  tcp_output() always appears to
> > set IP_DF at line 998 of tcp_output.c, if path_mtu_discovery is enabled.
> > A puzzle.
> >
> > Thanks to Boris Staeblow <bs@dva.in-berlin.de> and Tim Rylance
> > <tkr@puffball.demon.co.uk> for highlighting this change and helping me
> > diagnose the issues.
> >
> >         Richard
> > --
> > Richard Wendland                                richard@wendland.org.uk
> > _______________________________________________
> > freebsd-net@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3FFC97A1.4F397CC3>