From owner-freebsd-net@FreeBSD.ORG Thu Nov 20 12:51:38 2008 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 125AC1065677 for ; Thu, 20 Nov 2008 12:51:38 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 421B78FC12 for ; Thu, 20 Nov 2008 12:51:37 +0000 (UTC) (envelope-from andre@freebsd.org) Received: (qmail 47985 invoked from network); 20 Nov 2008 11:16:38 -0000 Received: from localhost (HELO [127.0.0.1]) ([127.0.0.1]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 20 Nov 2008 11:16:38 -0000 Message-ID: <49255D5B.5040303@freebsd.org> Date: Thu, 20 Nov 2008 13:51:39 +0100 From: Andre Oppermann User-Agent: Thunderbird 1.5.0.14 (Windows/20071210) MIME-Version: 1.0 To: Harti Brandt References: <491F2C47.4050500@dlr.de> <0A4BB2F1-AC9F-4316-94E3-790E2D80F651@freebsd.org> <49201859.2080605@dlr.de> <4921B3C6.5020002@freebsd.org> <4921F2CD.503@freebsd.org> <20081119234543.A90462@beagle.kn.op.dlr.de> In-Reply-To: <20081119234543.A90462@beagle.kn.op.dlr.de> Content-Type: multipart/mixed; boundary="------------040901000405020901010901" Cc: freebsd-net@freebsd.org, bz@freebsd.org, Rui Paulo Subject: Re: TCP and syncache question X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Nov 2008 12:51:38 -0000 This is a multi-part message in MIME format. --------------040901000405020901010901 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Harti Brandt wrote: > Hi Andre, > > On Mon, 17 Nov 2008, Andre Oppermann wrote: > > AO>This is a bit more complicated because of interactions with tcp_input() > AO>where syncache_expand() is called from. > AO> > AO>The old code (as of December 2002) behaved slightly different. It would > AO>not remove the syncache entry when (SND.UNA == SEG.ACK) but send a RST. > AO>The (RCV.NXT =< SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND) test wasn't done at > AO>all. Instead a socket was opened whenever (SND.UNA == SEG.ACK) succeeded. > AO>This gave way to the "LAND" DoS attack which was mostly fixed with a test > AO>for (RCV.IRS < SEG.SEQ). > AO> > AO>See the attached patch for fixed version of syncache_expand(). This patch > AO>is untested though. My development machine is currently down. Harti, Rui > AO>and Bjoern, please have a look at the patch and review it. > > Some small problems: ... > Need another cast here: *lsop = (struct socket *)1. Changed the logic to use a NULL *lsop to differentiate in tcp_input(). Much simpler. > [snip] > > I've re-run my test scripts and they seem to indicate, that the socket is > now kept in the correct state when the incoming segment had an incorrect > ack. I could no yet run the tests for an incorrect seqno, though. This is > because there is an interesting problem in RFC793 (and MIL-STD-1778): the RFC > states on page 36 a general rule that a 'reset (RST) must be sent whenever > a segment arrives which apparently is not intended for the current connection.' The full quote from page 36 is: "As a general rule, reset (RST) must be sent whenever a segment arrives which apparently is not intended for the current connection. A reset must not be sent if it is not clear that this is the case." > I would say, that a segment carrying a sequence number at irs (when without SYN) > and below irs (in any case) cannot belong to the current connection - those > sequence numbers just don't exist for the connection. On the other hand p.69 > says that we must send an ACK. If this were an ITU-T standard, things > would be clear, because the prosa description would be normative, not the > algorithm. The more specific rule wins. Sending the "challenge" ACK is done under the assumption that the remote end will send a reset to our "challenge" ACK if such an connection doesn't exist there. > I've tried to follow the classical BSD code in Stevens and > it seems that before syncaches and stuff, an ACK was sent for a bad > sequence number. So I'll change my tests (probably tomorrow in the evening) > and check that the patch is correct for this case too. At a first glance > a nice SYN+ACK came out... An updated patch is attached. -- Andre --------------040901000405020901010901 Content-Type: text/plain; name="syncache_expand_fix-20081120.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="syncache_expand_fix-20081120.diff" --- tcp_input.c.1.390 Mon Nov 17 21:33:25 2008 +++ tcp_input.c.1.390.mod Thu Nov 20 08:25:36 2008 @@ -642,10 +642,13 @@ findpcb: if (!syncache_expand(&inc, &to, th, &so, m)) { /* * No syncache entry or ACK was not - * for our SYN/ACK. Send a RST. + * for our SYN/ACK. Send a RST or + * an ACK for re-synchronization. * NB: syncache did its own logging * of the failure cause. */ + if (so == NULL) + goto dropunlock; rstreason = BANDLIM_RST_OPENPORT; goto dropwithreset; } --- tcp_syncache.c.1.160 Mon Nov 17 16:49:01 2008 +++ tcp_syncache.c.1.160.mod Thu Nov 20 12:13:26 2008 @@ -823,53 +823,39 @@ syncache_expand(struct in_conninfo *inc, if (sc == NULL) { /* * There is no syncache entry, so see if this ACK is - * a returning syncookie. To do this, first: - * A. See if this socket has had a syncache entry dropped in - * the past. We don't want to accept a bogus syncookie - * if we've never received a SYN. - * B. check that the syncookie is valid. If it is, then - * cobble up a fake syncache entry, and return. + * a returning syncookie. If the syncookie is valid, + * cobble up a fake syncache entry and create a socket. + * + * NB: Syncache head is locked for the syncookie access. */ if (!tcp_syncookies) { - SCH_UNLOCK(sch); if ((s = tcp_log_addrs(inc, th, NULL, NULL))) log(LOG_DEBUG, "%s; %s: Spurious ACK, " "segment rejected (syncookies disabled)\n", s, __func__); - goto failed; + goto sendrst; } bzero(&scs, sizeof(scs)); sc = syncookie_lookup(inc, sch, &scs, to, th, *lsop); - SCH_UNLOCK(sch); if (sc == NULL) { if ((s = tcp_log_addrs(inc, th, NULL, NULL))) log(LOG_DEBUG, "%s; %s: Segment failed " "SYNCOOKIE authentication, segment rejected " "(probably spoofed)\n", s, __func__); - goto failed; + goto sendrst; } - } else { - /* Pull out the entry to unlock the bucket row. */ - TAILQ_REMOVE(&sch->sch_bucket, sc, sc_hash); - sch->sch_length--; - V_tcp_syncache.cache_count--; - SCH_UNLOCK(sch); + goto expand; /* fully validated through syncookie */ } /* * Segment validation: - * ACK must match our initial sequence number + 1 (the SYN|ACK). - */ - if (th->th_ack != sc->sc_iss + 1 && !TOEPCB_ISSET(sc)) { - if ((s = tcp_log_addrs(inc, th, NULL, NULL))) - log(LOG_DEBUG, "%s; %s: ACK %u != ISS+1 %u, segment " - "rejected\n", s, __func__, th->th_ack, sc->sc_iss); - goto failed; - } - - /* + * * The SEQ must fall in the window starting at the received * initial receive sequence number + 1 (the SYN). + * If not the segment may be from an earlier connection. + * We send an ACK to re-synchronize the connection and keep + * the syncache entry without ajusting its timers. + * See RFC793 page 69, first check sequence number [SYN_RECEIVED]. */ if ((SEQ_LEQ(th->th_seq, sc->sc_irs) || SEQ_GT(th->th_seq, sc->sc_irs + sc->sc_wnd)) && @@ -877,14 +863,41 @@ syncache_expand(struct in_conninfo *inc, if ((s = tcp_log_addrs(inc, th, NULL, NULL))) log(LOG_DEBUG, "%s; %s: SEQ %u != IRS+1 %u, segment " "rejected\n", s, __func__, th->th_seq, sc->sc_irs); - goto failed; + (void) syncache_respond(sc); + *lsop = NULL; /* prevent RST */ + goto sendrstkeep; + } + + /* + * ACK must match our initial sequence number + 1 (the SYN|ACK). + * If not the segment may be from an earlier connection. We send + * a RST but keep the syncache entry without ajusting its timers. + * See RFC793 page 72, fifth check the ACK field, [SYN_RECEIVED]. + */ + if (th->th_ack != sc->sc_iss + 1 && !TOEPCB_ISSET(sc)) { + if ((s = tcp_log_addrs(inc, th, NULL, NULL))) + log(LOG_DEBUG, "%s; %s: ACK %u != ISS+1 %u, segment " + "rejected\n", s, __func__, th->th_ack, sc->sc_iss); + goto sendrstkeep; } + /* + * Remove the entry to unlock the bucket row. + * Tests from now on are fatal and remove the syncache entry. + */ + TAILQ_REMOVE(&sch->sch_bucket, sc, sc_hash); + sch->sch_length--; + V_tcp_syncache.cache_count--; + + /* + * If timestamps were not negotiated they must not show up later. + * See RFC1312bis, section 1.3, second paragraph + */ if (!(sc->sc_flags & SCF_TIMESTAMP) && (to->to_flags & TOF_TS)) { if ((s = tcp_log_addrs(inc, th, NULL, NULL))) log(LOG_DEBUG, "%s; %s: Timestamp not expected, " "segment rejected\n", s, __func__); - goto failed; + goto sendrst; } /* * If timestamps were negotiated the reflected timestamp @@ -896,9 +909,11 @@ syncache_expand(struct in_conninfo *inc, log(LOG_DEBUG, "%s; %s: TSECR %u != TS %u, " "segment rejected\n", s, __func__, to->to_tsecr, sc->sc_ts); - goto failed; + goto sendrst; } +expand: + SCH_UNLOCK(sch); *lsop = syncache_socket(sc, *lsop, m); if (*lsop == NULL) @@ -906,16 +921,18 @@ syncache_expand(struct in_conninfo *inc, else V_tcpstat.tcps_sc_completed++; -/* how do we find the inp for the new socket? */ if (sc != &scs) syncache_free(sc); return (1); -failed: - if (sc != NULL && sc != &scs) + +sendrst: + if (sc != &scs) syncache_free(sc); +sendrstkeep: + SCH_LOCK_ASSERT(sch); + SCH_UNLOCK(sch); if (s != NULL) free(s, M_TCPLOG); - *lsop = NULL; return (0); } --------------040901000405020901010901--