From owner-freebsd-net@FreeBSD.ORG Mon Jun 30 17:04:46 2014
Sender: Tom Jones
Date: Mon, 30 Jun 2014 18:04:56 +0100
From: Tom Jones
To: freebsd-net@freebsd.org
Subject: [PATCH] Implementation of draft-ietf-tcpm-newcwv-06
Message-ID: <20140630170453.GA21404@gmail.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="SLDf9lqlvOQaIe6s"
Content-Disposition: inline
Hackerspace: 57North Hacklab
User-Agent: Mutt/1.5.23 (2014-03-12)
List-Id: Networking and TCP/IP with FreeBSD

--SLDf9lqlvOQaIe6s
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hello,

Attached is a patch which implements draft-ietf-tcpm-newcwv-06, "Updating TCP
to support Rate-Limited Traffic". The patch is a port of the Linux
implementation by Raffaello Secchi with influence from Aris
Angelogiannopoulos's patch set that was sent to the list earlier this year.

-- 
Tom                 | I don't see how we are going to build the dystopian megapoli
@adventureloop      | we were promised in 80s and early 90s cyberpunk fiction if
adventurist.me      | you guys keep complaining.

--SLDf9lqlvOQaIe6s
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline; filename="newcwv.patch"

Index: sys/conf/files
===================================================================
--- sys/conf/files	(revision 268040)
+++ sys/conf/files	(working copy)
@@ -3380,6 +3380,7 @@ netinet/tcp_reass.c		optional inet | inet6
 netinet/tcp_sack.c		optional inet | inet6
 netinet/tcp_subr.c		optional inet | inet6
 netinet/tcp_syncache.c		optional inet | inet6
+netinet/tcp_newcwv.c		optional inet | inet6
 netinet/tcp_timer.c		optional inet | inet6
 netinet/tcp_timewait.c		optional inet | inet6
 netinet/tcp_usrreq.c		optional inet | inet6
Index: sys/netinet/tcp_input.c
===================================================================
--- sys/netinet/tcp_input.c	(revision 268040)
+++ sys/netinet/tcp_input.c	(working copy)
@@ -105,6 +105,7 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
 #ifdef TCPDEBUG
 #include
 #endif /* TCPDEBUG */
@@ -174,6 +175,11 @@ SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, abc_l_var
     &VNET_NAME(tcp_abc_l_var), 2,
     "Cap the max cwnd increment during slow-start to this number of segments");

+VNET_DEFINE(int, tcp_do_newcwv) = 0;
+SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, newcwv, CTLFLAG_RW,
+    &VNET_NAME(tcp_do_newcwv), 0,
+    "Enable draft-ietf-tcpm-newcwv-06 (New Congestion Window Validation)");
+
 static SYSCTL_NODE(_net_inet_tcp, OID_AUTO, ecn, CTLFLAG_RW, 0, "TCP ECN");
 VNET_DEFINE(int, tcp_do_ecn) = 0;
@@ -309,6 +315,12 @@ cc_ack_received(struct tcpcb *tp, struct tcphdr *t
 		tp->ccv->curack = th->th_ack;
 		CC_ALGO(tp)->ack_received(tp->ccv, type);
 	}
+
+	/*
+	 * Update the draft-ietf-tcpm-newcwv-06 pipeack
+	 */
+	if(V_tcp_do_newcwv && !IN_FASTRECOVERY(tp->t_flags))
+		tcp_newcwv_update_pipeack(tp);
 }

 static void inline
@@ -378,6 +390,12 @@ cc_conn_init(struct tcpcb *tp)
 			tp->snd_cwnd = 4 * tp->t_maxseg;
 	}

+	/*
+	 * Initialise NewCWV state
+	 */
+	tp->init_cwnd = tp->snd_cwnd;
+	tcp_newcwv_reset(tp);
+
 	if (CC_ALGO(tp)->conn_init != NULL)
 		CC_ALGO(tp)->conn_init(tp->ccv);
 }
@@ -426,6 +444,11 @@ cc_cong_signal(struct tcpcb *tp, struct tcphdr *th
 		tp->t_badrxtwin = 0;
 		break;
 	}
+
+	if (V_tcp_do_newcwv &&
+	    (type == CC_NDUPACK || type == CC_ECN) &&
+	    tp->pipeack <= (tp->snd_cwnd >> 1) )
+		tcp_newcwv_enter_recovery(tp);

 	if (CC_ALGO(tp)->cong_signal != NULL) {
 		if (th != NULL)
@@ -447,6 +470,13 @@ cc_post_recovery(struct tcpcb *tp, struct tcphdr *
 	}
 	/* XXXLAS: EXIT_RECOVERY ? */
 	tp->t_bytes_acked = 0;
+
+	if(V_tcp_do_newcwv) {
+		if(tp->loss_flight_size)
+			tcp_newcwv_end_recovery(tp);
+		tcp_newcwv_reset(tp);
+	}
+	tp->loss_flight_size = 0;
 }

 #ifdef TCP_SIGNATURE
Index: sys/netinet/tcp_newcwv.c
===================================================================
--- sys/netinet/tcp_newcwv.c	(revision 0)
+++ sys/netinet/tcp_newcwv.c	(working copy)
@@ -0,0 +1,174 @@
+/*
+ * Copyright (c) 2014 Tom Jones
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ */
+
+#include
+
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+
+/*
+ * An implementation of NewCWV (draft-ietf-tcpm-newcwv-06) for FreeBSD.
+ * Based on the Linux implementation by Raffaello Secchi and an initial
+ * implementation of draft-ietf-tcpm-newcwv-00 by Aris Angelogiannopoulos.
+ */
+
+#define nextbin(x) (((x)+1) & 0x03)
+#define prevbin(x) (((x)-1) & 0x03)
+
+#define NCWV_UNDEF 0xFFFFFFFF
+#define NCWV_FIVEMINS (300*hz)
+
+void add_element(struct tcpcb *,u_int32_t);
+u_int32_t remove_expired_elements(struct tcpcb *);
+
+void
+tcp_newcwv_update_pipeack(struct tcpcb *tp)
+{
+	u_int32_t tmp_pipeack;
+	tp->psp = MAX(3 * tp->t_srtt,hz);
+
+	if (tp->snd_una >= tp->prev_snd_nxt) {
+		/* get the pipeack sample */
+		tmp_pipeack = tp->snd_una - tp->prev_snd_una;
+
+		tp->prev_snd_una = tp->snd_una;
+		tp->prev_snd_nxt = tp->snd_nxt;
+
+		/* create a new element at the end of current pmp */
+		if(ticks > tp->time_stamp[tp->head] + (tp->psp >> 2))
+			add_element(tp,tmp_pipeack);
+		else
+			tp->psample[tp->head] = tmp_pipeack;
+	}
+
+	tp->pipeack = remove_expired_elements(tp);
+
+	/* check if cwnd is validated */
+	if (tp->pipeack == NCWV_UNDEF ||
+	    ((tp->pipeack << 1) >= (tp->snd_cwnd * tp->t_maxseg))) {
+		tp->cwnd_valid_ts = ticks;
+	}
+}
+
+void
+add_element(struct tcpcb *tp,u_int32_t value)
+{
+	tp->head = nextbin(tp->head);
+	tp->psample[tp->head] = value;
+	tp->time_stamp[tp->head] = ticks;
+}
+
+u_int32_t
+remove_expired_elements(struct tcpcb *tp)
+{
+	uint8_t head = tp->head;
+	u_int32_t tmp = tp->psample[head];
+
+	while(tp->psample[head] != NCWV_UNDEF) {
+		/* remove the element if expired */
+		if (tp->time_stamp[head] < ticks - tp->psp) {
+			tp->psample[head] = NCWV_UNDEF;
+			return tmp;
+		}
+
+		/* search for the max pipeack */
+		if(tp->psample[head] > tmp)
+			tmp = tp->psample[head];
+
+		head = prevbin(head);
+		if(head == tp->head)
+			return tmp;
+	}
+
+	return tmp;
+}
+
+/* Initialise NewCWV state */
+void
+tcp_newcwv_reset(struct tcpcb *tp)
+{
+	tp->prev_snd_una = tp->snd_una;
+	tp->prev_snd_nxt = tp->snd_nxt;
+	tp->cwnd_valid_ts = ticks;
+	tp->loss_flight_size = 0;
+
+	tp->head = 0;
+	tp->psample[0] = NCWV_UNDEF;
+	tp->pipeack = NCWV_UNDEF;
+}
+
+/* NewCWV actions at loss detection */
+void
+tcp_newcwv_enter_recovery(struct tcpcb *tp)
+{
+	u_int32_t pipe;
+
+	if(tp->pipeack == NCWV_UNDEF)
+		return;
+
+	tp->prior_retrans = tp->t_sndrexmitpack;
+
+	/* Calculate the flight size */
+	u_int32_t awnd = (tp->snd_nxt - tp->snd_fack) + tp->sackhint.sack_bytes_rexmit;
+	tp->loss_flight_size = awnd;
+
+	pipe = MAX(tp->pipeack,tp->loss_flight_size);
+	tp->snd_cwnd = MAX(pipe >> 1,1);
+}
+
+/* NewCWV actions at the end of recovery */
+void
+tcp_newcwv_end_recovery(struct tcpcb *tp)
+{
+	u_int32_t retrans,pipe;
+
+	retrans = (tp->t_sndrexmitpack - tp->prior_retrans) * tp->t_maxseg;
+	pipe = MAX(tp->pipeack,tp->loss_flight_size) - retrans;
+
+	/* Ensure that snd_ssthresh is non 0 */
+	tp->snd_ssthresh = MAX(pipe >> 1,1);
+	tp->snd_cwnd = tp->snd_ssthresh;
+}
+
+void
+tcp_newcwv_datalim_closedown(struct tcpcb *tp)
+{
+	while ((ticks - tp->cwnd_valid_ts) > NCWV_FIVEMINS &&
+	    tp->snd_cwnd > tp->init_cwnd) {
+
+		tp->cwnd_valid_ts += NCWV_FIVEMINS;
+		tp->snd_ssthresh = MAX( (3 * tp->snd_cwnd ) >> 2,tp->snd_ssthresh);
+		tp->snd_cwnd = MAX(tp->snd_cwnd >> 1, tp->init_cwnd);
+	}
+}

Property changes on: sys/netinet/tcp_newcwv.c
___________________________________________________________________
Added: svn:mime-type
## -0,0 +1 ##
+text/plain
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: sys/netinet/tcp_newcwv.h
===================================================================
--- sys/netinet/tcp_newcwv.h	(revision 0)
+++ sys/netinet/tcp_newcwv.h	(working copy)
@@ -0,0 +1,39 @@
+/*
+ * Copyright (c) 2014 Tom Jones
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ */
+
+#ifndef _NETINET_TCP_NEWCWV_H_
+#define _NETINET_TCP_NEWCWV_H_
+
+#include
+
+void tcp_newcwv_update_pipeack(struct tcpcb *);
+void tcp_newcwv_reset(struct tcpcb *);
+void tcp_newcwv_enter_recovery(struct tcpcb *);
+void tcp_newcwv_end_recovery(struct tcpcb *);
+void tcp_newcwv_datalim_closedown(struct tcpcb *);
+
+#endif /* _NETINET_TCP_NEWCWV_H_ */

Property changes on: sys/netinet/tcp_newcwv.h
___________________________________________________________________
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:mime-type
## -0,0 +1 ##
+text/plain
\ No newline at end of property
Index: sys/netinet/tcp_output.c
===================================================================
--- sys/netinet/tcp_output.c	(revision 268040)
+++ sys/netinet/tcp_output.c	(working copy)
@@ -74,6 +74,7 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
 #ifdef TCPDEBUG
 #include
 #endif
@@ -691,6 +692,10 @@ send:
 #endif
 	hdrlen = sizeof (struct tcpiphdr);

+	/* Trigger the newcwv timer */
+	if(V_tcp_do_newcwv)
+		tcp_newcwv_datalim_closedown(tp);
+
 	/*
 	 * Compute options for segment.
 	 * We only have to care about SYN and established connection
Index: sys/netinet/tcp_subr.c
===================================================================
--- sys/netinet/tcp_subr.c	(revision 268040)
+++ sys/netinet/tcp_subr.c	(working copy)
@@ -800,6 +800,7 @@ tcp_newtcpcb(struct inpcb *inp)
 		tp->t_flags = (TF_REQ_SCALE|TF_REQ_TSTMP);
 	if (V_tcp_do_sack)
 		tp->t_flags |= TF_SACK_PERMIT;
+
 	TAILQ_INIT(&tp->snd_holes);
 	tp->t_inpcb = inp;	/* XXX */
 	/*
Index: sys/netinet/tcp_var.h
===================================================================
--- sys/netinet/tcp_var.h	(revision 268040)
+++ sys/netinet/tcp_var.h	(working copy)
@@ -172,6 +172,20 @@ struct tcpcb {
 	int	t_sndzerowin;		/* zero-window updates sent */
 	u_int	t_badrxtwin;		/* window for retransmit recovery */
 	u_char	snd_limited;		/* segments limited transmitted */
+/* NewCWV related state */
+	u_int32_t pipeack;
+	u_int32_t psp;			/* pipeack sampling period */
+
+	u_int32_t head;
+	u_int32_t psample[4];		/* pipe ack samples */
+	u_int32_t time_stamp[4];	/* time stamp samples */
+	u_int32_t prev_snd_una;		/* previous snd_una in this sample */
+	u_int32_t prev_snd_nxt;		/* previous snd_nxt in this sample */
+
+	u_int32_t loss_flight_size;	/* flightsize at loss detection */
+	u_int32_t prior_retrans;	/* Retransmission before going into FR */
+	u_int32_t cwnd_valid_ts;	/* last time cwnd was found valid */
+	u_int32_t init_cwnd;		/* The initial cwnd */
 /* SACK related state */
 	int	snd_numholes;		/* number of holes seen by sender */
 	TAILQ_HEAD(sackhole_head, sackhole) snd_holes;
@@ -605,6 +619,7 @@ VNET_DECLARE(int, tcp_recvspace);
 VNET_DECLARE(int, path_mtu_discovery);
 VNET_DECLARE(int, tcp_do_rfc3465);
 VNET_DECLARE(int, tcp_abc_l_var);
+VNET_DECLARE(int, tcp_do_newcwv);
 #define	V_tcb			VNET(tcb)
 #define	V_tcbinfo		VNET(tcbinfo)
 #define	V_tcp_mssdflt		VNET(tcp_mssdflt)
@@ -617,6 +632,7 @@ VNET_DECLARE(int, tcp_abc_l_var);
 #define	V_path_mtu_discovery	VNET(path_mtu_discovery)
 #define	V_tcp_do_rfc3465	VNET(tcp_do_rfc3465)
 #define	V_tcp_abc_l_var		VNET(tcp_abc_l_var)
+#define	V_tcp_do_newcwv		VNET(tcp_do_newcwv)

 VNET_DECLARE(int, tcp_do_sack);			/* SACK enabled/disabled */
 VNET_DECLARE(int, tcp_sc_rst_sock_fail);	/* RST on sock alloc failure */

--SLDf9lqlvOQaIe6s--