From owner-freebsd-net@FreeBSD.ORG Wed Nov 19 23:05:00 2008 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E9B481065670; Wed, 19 Nov 2008 23:05:00 +0000 (UTC) (envelope-from Hartmut.Brandt@dlr.de) Received: from smtp-3.dlr.de (smtp-3.dlr.de [195.37.61.187]) by mx1.freebsd.org (Postfix) with ESMTP id 7D61D8FC18; Wed, 19 Nov 2008 23:05:00 +0000 (UTC) (envelope-from Hartmut.Brandt@dlr.de) Received: from beagle.kn.op.dlr.de ([129.247.178.136]) by smtp-3.dlr.de over TLS secured channel with Microsoft SMTPSVC(6.0.3790.1830); Thu, 20 Nov 2008 00:04:58 +0100 Date: Thu, 20 Nov 2008 00:04:54 +0100 (CET) From: Harti Brandt X-X-Sender: brandt_h@beagle.kn.op.dlr.de To: Andre Oppermann In-Reply-To: <4921F2CD.503@freebsd.org> Message-ID: <20081119234543.A90462@beagle.kn.op.dlr.de> References: <491F2C47.4050500@dlr.de> <0A4BB2F1-AC9F-4316-94E3-790E2D80F651@freebsd.org> <49201859.2080605@dlr.de> <4921B3C6.5020002@freebsd.org> <4921F2CD.503@freebsd.org> X-OpenPGP-Key: harti@freebsd.org MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-OriginalArrivalTime: 19 Nov 2008 23:04:58.0119 (UTC) FILETIME=[3E3AB170:01C94A9B] Cc: freebsd-net@freebsd.org, bz@freebsd.org, Rui Paulo Subject: Re: TCP and syncache question X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Harti Brandt List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Nov 2008 23:05:01 -0000 Hi Andre, On Mon, 17 Nov 2008, Andre Oppermann wrote: AO>This is a bit more complicated because of interactions with tcp_input() AO>where syncache_expand() is called from. AO> AO>The old code (as of December 2002) behaved slightly different. It would AO>not remove the syncache entry when (SND.UNA == SEG.ACK) but send a RST. AO>The (RCV.NXT =< SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND) test wasn't done at AO>all. Instead a socket was opened whenever (SND.UNA == SEG.ACK) succeeded. AO>This gave way to the "LAND" DoS attack which was mostly fixed with a test AO>for (RCV.IRS < SEG.SEQ). AO> AO>See the attached patch for fixed version of syncache_expand(). This patch AO>is untested though. My development machine is currently down. Harti, Rui AO>and Bjoern, please have a look at the patch and review it. Some small problems: AO>--- tcp_input.c.1.390 Mon Nov 17 21:33:25 2008 AO>+++ tcp_input.c.1.390.mod Mon Nov 17 21:35:22 2008 AO>@@ -642,10 +642,13 @@ findpcb: AO> if (!syncache_expand(&inc, &to, th, &so, m)) { AO> /* AO> * No syncache entry or ACK was not AO>- * for our SYN/ACK. Send a RST. AO>+ * for our SYN/ACK. Send a RST or AO>+ * an ACK for re-synchronization. AO> * NB: syncache did its own logging AO> * of the failure cause. AO> */ AO>+ if (so == 1) Need a cast here: if (so == (struct socket *)1) AO>+ goto dropunlock; AO> rstreason = BANDLIM_RST_OPENPORT; AO> goto dropwithreset; AO> } AO>--- tcp_syncache.c.1.160 Mon Nov 17 16:49:01 2008 AO>+++ tcp_syncache.c.1.160.mod Mon Nov 17 23:35:39 2008 AO>@@ -817,59 +817,47 @@ syncache_expand(struct in_conninfo *inc, AO> INP_INFO_WLOCK_ASSERT(&V_tcbinfo); AO> KASSERT((th->th_flags & (TH_RST|TH_ACK|TH_SYN)) == TH_ACK, AO> ("%s: can handle only ACK", __func__)); AO>+ *lsop = NULL; This leads to a panic on sonewconn. *lsop points to the listening socket when the function is entered and this socket is needed below in syncache_socket(). I just removed this line. AO> AO> sc = syncache_lookup(inc, &sch); /* returns locked sch */ AO> SCH_LOCK_ASSERT(sch); AO> if (sc == NULL) { AO> /* AO> * There is no syncache entry, so see if this ACK is AO>- * a returning syncookie. To do this, first: AO>- * A. See if this socket has had a syncache entry dropped in AO>- * the past. We don't want to accept a bogus syncookie AO>- * if we've never received a SYN. AO>- * B. check that the syncookie is valid. If it is, then AO>- * cobble up a fake syncache entry, and return. AO>+ * a returning syncookie. If the syncookie is valid, AO>+ * cobble up a fake syncache entry and create a socket. AO>+ * AO>+ * NB: Syncache head is locked for the syncookie access. AO> */ AO> if (!tcp_syncookies) { AO>- SCH_UNLOCK(sch); AO> if ((s = tcp_log_addrs(inc, th, NULL, NULL))) AO> log(LOG_DEBUG, "%s; %s: Spurious ACK, " AO> "segment rejected (syncookies disabled)\n", AO> s, __func__); AO>- goto failed; AO>+ goto sendrst; AO> } AO> bzero(&scs, sizeof(scs)); AO> sc = syncookie_lookup(inc, sch, &scs, to, th, *lsop); AO>- SCH_UNLOCK(sch); AO> if (sc == NULL) { AO> if ((s = tcp_log_addrs(inc, th, NULL, NULL))) AO> log(LOG_DEBUG, "%s; %s: Segment failed " AO> "SYNCOOKIE authentication, segment rejected " AO> "(probably spoofed)\n", s, __func__); AO>- goto failed; AO>+ goto sendrst; AO> } AO>- } else { AO>- /* Pull out the entry to unlock the bucket row. */ AO>- TAILQ_REMOVE(&sch->sch_bucket, sc, sc_hash); AO>- sch->sch_length--; AO>- V_tcp_syncache.cache_count--; AO>- SCH_UNLOCK(sch); AO>+ goto expand; /* fully validated through syncookie */ AO> } AO>+ SCH_LOCK_ASSERT(sch); AO> AO> /* AO> * Segment validation: AO>- * ACK must match our initial sequence number + 1 (the SYN|ACK). AO>- */ AO>- if (th->th_ack != sc->sc_iss + 1 && !TOEPCB_ISSET(sc)) { AO>- if ((s = tcp_log_addrs(inc, th, NULL, NULL))) AO>- log(LOG_DEBUG, "%s; %s: ACK %u != ISS+1 %u, segment " AO>- "rejected\n", s, __func__, th->th_ack, sc->sc_iss); AO>- goto failed; AO>- } AO>- AO>- /* AO>+ * AO> * The SEQ must fall in the window starting at the received AO> * initial receive sequence number + 1 (the SYN). AO>+ * If not the segment may be from an earlier connection. We send AO>+ * an ACK to re-synchronize the connection and keep the syncache AO>+ * entry without ajusting its timers. AO>+ * See RFC793 page 69, first check sequence number [SYN_RECEIVED]. AO> */ AO> if ((SEQ_LEQ(th->th_seq, sc->sc_irs) || AO> SEQ_GT(th->th_seq, sc->sc_irs + sc->sc_wnd)) && AO>@@ -877,14 +865,41 @@ syncache_expand(struct in_conninfo *inc, AO> if ((s = tcp_log_addrs(inc, th, NULL, NULL))) AO> log(LOG_DEBUG, "%s; %s: SEQ %u != IRS+1 %u, segment " AO> "rejected\n", s, __func__, th->th_seq, sc->sc_irs); AO>- goto failed; AO>+ (void) syncache_respond(sc); AO>+ *lsop = 1; /* prevent RST */ Need another cast here: *lsop = (struct socket *)1. [snip] I've re-run my test scripts and they seem to indicate, that the socket is now kept in the correct state when the incoming segment had an incorrect ack. I could no yet run the tests for an incorrect seqno, though. This is because there is an interesting problem in RFC793 (and MIL-STD-1778): the RFC states on page 36 a general rule that a 'reset (RST) must be sent whenever a segment arrives which apparently is not intended for the current connection.' I would say, that a segment carrying a sequence number at irs (when without SYN) and below irs (in any case) cannot belong to the current connection - those sequence numbers just don't exist for the connection. On the other hand p.69 says that we must send an ACK. If this were an ITU-T standard, things would be clear, because the prosa description would be normative, not the algorithm. I've tried to follow the classical BSD code in Stevens and it seems that before syncaches and stuff, an ACK was sent for a bad sequence number. So I'll change my tests (probably tomorrow in the evening) and check that the patch is correct for this case too. At a first glance a nice SYN+ACK came out... harti