From owner-svn-src-stable-11@freebsd.org Thu May 24 11:02:22 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 225CFEB171C; Thu, 24 May 2018 11:02:22 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C8B12840B1; Thu, 24 May 2018 11:02:21 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id A9D0411669; Thu, 24 May 2018 11:02:21 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w4OB2LBg002196; Thu, 24 May 2018 11:02:21 GMT (envelope-from ae@FreeBSD.org) Received: (from ae@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w4OB2Lre002195; Thu, 24 May 2018 11:02:21 GMT (envelope-from ae@FreeBSD.org) Message-Id: <201805241102.w4OB2Lre002195@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: ae set sender to ae@FreeBSD.org using -f From: "Andrey V. Elsukov" Date: Thu, 24 May 2018 11:02:21 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r334149 - stable/11/sys/netpfil/ipfw X-SVN-Group: stable-11 X-SVN-Commit-Author: ae X-SVN-Commit-Paths: stable/11/sys/netpfil/ipfw X-SVN-Commit-Revision: 334149 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 May 2018 11:02:22 -0000 Author: ae Date: Thu May 24 11:02:21 2018 New Revision: 334149 URL: https://svnweb.freebsd.org/changeset/base/334149 Log: MFC r333986: Remove check for matching the rulenum, ruleid and rule pointer from dyn_lookup_ipv[46]_state_locked(). These checks are remnants of not ready to be committed code, and they are there by accident. Due to the race these checks can lead to creating of duplicate states when concurrent threads in the same time will try to add state for two packets of the same flow, but in reverse directions and matched by different parent rules. Reported by: lev MFC r334039: Restore the ability to keep states after parent rule deletion. This feature is disabled by default and was removed when dynamic states implementation changed to be lockless. Now it is reimplemented with small differences - when dyn_keep_states sysctl variable is enabled, dyn_match_ipv[46]_state() function doesn't match child states of deleted rule. And thus they are keept alive until expired. ipfw_dyn_lookup_state() function does check that state was not orphaned, and if so, it returns pointer to default_rule and its position in the rules map. The main visible difference is that orphaned states still have the same rule number that they have before parent rule deleted, because now a state has many fields related to rule and changing them all atomically to point to default_rule seems hard enough. Reported by: Approved by: re (kib) Modified: stable/11/sys/netpfil/ipfw/ip_fw_dynamic.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/netpfil/ipfw/ip_fw_dynamic.c ============================================================================== --- stable/11/sys/netpfil/ipfw/ip_fw_dynamic.c Thu May 24 10:55:26 2018 (r334148) +++ stable/11/sys/netpfil/ipfw/ip_fw_dynamic.c Thu May 24 11:02:21 2018 (r334149) @@ -310,6 +310,9 @@ static VNET_DEFINE(struct callout, dyn_timeout); static VNET_DEFINE(uint32_t, curr_max_length); #define V_curr_max_length VNET(curr_max_length) +static VNET_DEFINE(uint32_t, dyn_keep_states); +#define V_dyn_keep_states VNET(dyn_keep_states) + static VNET_DEFINE(uma_zone_t, dyn_data_zone); static VNET_DEFINE(uma_zone_t, dyn_parent_zone); static VNET_DEFINE(uma_zone_t, dyn_ipv4_zone); @@ -360,6 +363,7 @@ static VNET_DEFINE(uint32_t, dyn_max); /* max # of dy static VNET_DEFINE(uint32_t, dyn_count); /* number of states */ static VNET_DEFINE(uint32_t, dyn_parent_max); /* max # of parent states */ static VNET_DEFINE(uint32_t, dyn_parent_count); /* number of parent states */ + #define V_dyn_max VNET(dyn_max) #define V_dyn_count VNET(dyn_count) #define V_dyn_parent_max VNET(dyn_parent_max) @@ -474,7 +478,11 @@ SYSCTL_U32(_net_inet_ip_fw, OID_AUTO, dyn_short_lifeti SYSCTL_U32(_net_inet_ip_fw, OID_AUTO, dyn_keepalive, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(dyn_keepalive), 0, "Enable keepalives for dynamic states."); +SYSCTL_U32(_net_inet_ip_fw, OID_AUTO, dyn_keep_states, + CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(dyn_keep_states), 0, + "Do not flush dynamic states on rule deletion"); + #ifdef IPFIREWALL_DYNDEBUG #define DYN_DEBUG(fmt, ...) do { \ printf("%s: " fmt "\n", __func__, __VA_ARGS__); \ @@ -489,8 +497,7 @@ static struct dyn_ipv6_state *dyn_lookup_ipv6_state( const struct ipfw_flow_id *, uint32_t, const void *, struct ipfw_dyn_info *, int); static int dyn_lookup_ipv6_state_locked(const struct ipfw_flow_id *, - uint32_t, const void *, int, const void *, uint32_t, uint16_t, uint32_t, - uint16_t); + uint32_t, const void *, int, uint32_t, uint16_t); static struct dyn_ipv6_state *dyn_alloc_ipv6_state( const struct ipfw_flow_id *, uint32_t, uint16_t, uint8_t); static int dyn_add_ipv6_state(void *, uint32_t, uint16_t, uint8_t, @@ -546,7 +553,7 @@ static void dyn_update_proto_state(struct dyn_data *, struct dyn_ipv4_state *dyn_lookup_ipv4_state(const struct ipfw_flow_id *, const void *, struct ipfw_dyn_info *, int); static int dyn_lookup_ipv4_state_locked(const struct ipfw_flow_id *, - const void *, int, const void *, uint32_t, uint16_t, uint32_t, uint16_t); + const void *, int, uint32_t, uint16_t); static struct dyn_ipv4_state *dyn_alloc_ipv4_state( const struct ipfw_flow_id *, uint16_t, uint8_t); static int dyn_add_ipv4_state(void *, uint32_t, uint16_t, uint8_t, @@ -1065,8 +1072,7 @@ restart: */ static int dyn_lookup_ipv4_state_locked(const struct ipfw_flow_id *pkt, - const void *ulp, int pktlen, const void *parent, uint32_t ruleid, - uint16_t rulenum, uint32_t bucket, uint16_t kidx) + const void *ulp, int pktlen, uint32_t bucket, uint16_t kidx) { struct dyn_ipv4_state *s; int dir; @@ -1077,15 +1083,6 @@ dyn_lookup_ipv4_state_locked(const struct ipfw_flow_id if (s->proto != pkt->proto || s->kidx != kidx) continue; - /* - * XXXAE: Install synchronized state only when there are - * no matching states. - */ - if (pktlen != 0 && ( - s->data->parent != parent || - s->data->ruleid != ruleid || - s->data->rulenum != rulenum)) - continue; if (s->sport == pkt->src_port && s->dport == pkt->dst_port && s->src == pkt->src_ip && s->dst == pkt->dst_ip) { @@ -1227,8 +1224,7 @@ restart: */ static int dyn_lookup_ipv6_state_locked(const struct ipfw_flow_id *pkt, uint32_t zoneid, - const void *ulp, int pktlen, const void *parent, uint32_t ruleid, - uint16_t rulenum, uint32_t bucket, uint16_t kidx) + const void *ulp, int pktlen, uint32_t bucket, uint16_t kidx) { struct dyn_ipv6_state *s; int dir; @@ -1239,15 +1235,6 @@ dyn_lookup_ipv6_state_locked(const struct ipfw_flow_id if (s->proto != pkt->proto || s->kidx != kidx || s->zoneid != zoneid) continue; - /* - * XXXAE: Install synchronized state only when there are - * no matching states. - */ - if (pktlen != 0 && ( - s->data->parent != parent || - s->data->ruleid != ruleid || - s->data->rulenum != rulenum)) - continue; if (s->sport == pkt->src_port && s->dport == pkt->dst_port && IN6_ARE_ADDR_EQUAL(&s->src, &pkt->src_ip6) && IN6_ARE_ADDR_EQUAL(&s->dst, &pkt->dst_ip6)) { @@ -1407,18 +1394,32 @@ ipfw_dyn_lookup_state(const struct ip_fw_args *args, c * that will be added into head of this bucket. * And the state that we currently have matched * should be deleted by dyn_expire_states(). + * + * In case when dyn_keep_states is enabled, return + * pointer to default rule and corresponding f_pos + * value. + * XXX: In this case we lose the cache efficiency, + * since f_pos is not cached, because it seems + * there is no easy way to atomically switch + * all fields related to parent rule of given + * state. */ - if (V_layer3_chain.map[data->f_pos] == rule) + if (V_layer3_chain.map[data->f_pos] == rule) { data->chain_id = V_layer3_chain.id; - else { + info->f_pos = data->f_pos; + } else if (V_dyn_keep_states != 0) { + rule = V_layer3_chain.default_rule; + info->f_pos = V_layer3_chain.n_rules - 1; + } else { rule = NULL; info->direction = MATCH_NONE; DYN_DEBUG("rule %p [%u, %u] is considered " "invalid in data %p", rule, data->ruleid, data->rulenum, data); + /* info->f_pos doesn't matter here. */ } - } - info->f_pos = data->f_pos; + } else + info->f_pos = data->f_pos; } DYNSTATE_CRITICAL_EXIT(); #if 0 @@ -1594,8 +1595,8 @@ dyn_add_ipv4_state(void *parent, uint32_t ruleid, uint * Bucket version has been changed since last lookup, * do lookup again to be sure that state does not exist. */ - if (dyn_lookup_ipv4_state_locked(pkt, ulp, pktlen, parent, - ruleid, rulenum, bucket, kidx) != 0) { + if (dyn_lookup_ipv4_state_locked(pkt, ulp, pktlen, + bucket, kidx) != 0) { DYN_BUCKET_UNLOCK(bucket); return (EEXIST); } @@ -1726,7 +1727,7 @@ dyn_add_ipv6_state(void *parent, uint32_t ruleid, uint * do lookup again to be sure that state does not exist. */ if (dyn_lookup_ipv6_state_locked(pkt, zoneid, ulp, pktlen, - parent, ruleid, rulenum, bucket, kidx) != 0) { + bucket, kidx) != 0) { DYN_BUCKET_UNLOCK(bucket); return (EEXIST); } @@ -2119,7 +2120,8 @@ dyn_match_ipv4_state(struct dyn_ipv4_state *s, const i if (s->type == O_LIMIT) return (dyn_match_range(s->data->rulenum, s->data->set, rt)); - if (dyn_match_range(s->data->rulenum, s->data->set, rt)) + if (V_dyn_keep_states == 0 && + dyn_match_range(s->data->rulenum, s->data->set, rt)) return (1); return (0); @@ -2137,7 +2139,8 @@ dyn_match_ipv6_state(struct dyn_ipv6_state *s, const i if (s->type == O_LIMIT) return (dyn_match_range(s->data->rulenum, s->data->set, rt)); - if (dyn_match_range(s->data->rulenum, s->data->set, rt)) + if (V_dyn_keep_states == 0 && + dyn_match_range(s->data->rulenum, s->data->set, rt)) return (1); return (0);