Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 4 Apr 2006 17:34:44 +0200
From:      Daniel Hartmeier <daniel@benzedrine.cx>
To:        Max Laier <max@love2party.net>
Cc:        Andrew Thompson <thompsa@freebsd.org>, freebsd-pf@freebsd.org
Subject:   Re: broken ip checksum after frag reassemble of nfs READDIR?
Message-ID:  <20060404153443.GX2684@insomnia.benzedrine.cx>
In-Reply-To: <20060404145704.GW2684@insomnia.benzedrine.cx>
References:  <20060402054532.GF17711@egr.msu.edu> <200604021734.09622.max@love2party.net> <20060404145704.GW2684@insomnia.benzedrine.cx>

next in thread | previous in thread | raw e-mail | index | archive | help
Ok, I found the reason for all these IP checksum problems. The reason is
that OpenBSD's bridge code always recalculates the IP checksum, while
FreeBSD's doesn't.

In OpenBSD we have src/sys/net/if_bridge.c with the following code path.

    bridgeintr_frame()

    Main entry point on input path, gets called for every frame coming
    in on a member interface. It calls

      bridge_filter(IN, real_src_if)

      to filter the incoming frame on the real source interface the
      frame came in on. This corresponds to pfil_run_hooks(), as it
      calls the packet filter with

        pf_test(IN, real_src_if)

        this can not only drop or pass the IP packet unmodified, it can
        also drop a fragment after consumption into the reassembly
        cache, or return a different modified packet, for instance when
        rdr is used, or when a final fragment is consumed and the fully
        reassembled packet is returned instead.


     That completes the input path. pf doesn't really care to set the IP
     checksum correctly when it modifies the IP header in various ways,
     because the output path will fix it:

      if frame should be broadcast
        bridge_broadcast()
          bridge_filter(OUT, real_dst_if)
            pf_test(OUT, real_dst_if)
      else (unicast)
        bridge_filter(OUT, real_dst_if)
          pf_test(OUT, real_dst_if)

      if size fits mtu
        bridge_ifenqueue()
      else
        bridge_fragment()

      bridge_fragment()
        ip_fragment()
        bridge_ifenqueue()

      bridge_ifenqueue()
        IFQ_ENQUEUE(dst_if)
        dst_if->if_start()

What I missed before is in bridge_filter(), right after the pf_test()
call:

                if (pf_test(dir, ifp, &m, eh) != PF_PASS)
                        goto dropit;
                if (m == NULL)
                        goto dropit;

                /* Rebuild the IP header */   
                if (m->m_len < hlen && ((m = m_pullup(m, hlen)) == NULL))
                        return (NULL);
                if (m->m_len < sizeof(struct ip))
                        goto dropit;
                ip = mtod(m, struct ip *);
                ip->ip_sum = 0;
                ip->ip_sum = in_cksum(m, hlen);

FreeBSD has no such part that I can find. Hence, when pf_test() returns
a packet with an invalid IP checksum, nothing fixes the checksum, maybe
except for hardware-checksumming NICs.

Andrew, what do you suggest we do about this? Are the FreeBSD semantics
very clear and state that the IP checksum is pfil hook's responsibility,
and other pfil hooks (besides pf) are doing exactly that? I haven't used
the FreeBSD bridge code with other packet filters beside pf, so I simply
don't know.

If pf should return only IP packets to bridge which have correct IP
checksums already, we can either force an unconditional recomputation in
pf's pfil hook function (which wraps pf_test()), or we can go further
down the road of doing incremental checksum fixups whenever pf changes
the IP header internally. Once that would be complete, OpenBSD's bridge
could remove the unconditional checksum recomputation, too.

But I'm not sure what's cheaper, on average, fixing up the checksum
on each header change (there might be multiple changes per packet
processed), or simply doing it once, unconditionally, at the end.

Right now, we're in the suboptimal middle. pf does some incremental
fixups, but leaves the checksum incorrect in other cases.

Daniel



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20060404153443.GX2684>