Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 05 May 2013 21:00:40 +0400
From:      "Alexander V. Chernikov" <melifaro@ipfw.ru>
To:        Luigi Rizzo <rizzo@iet.unipi.it>
Cc:        freebsd-ipfw@freebsd.org, luigi@freebsd.org
Subject:   Re: [patch] ipfw interface tracking and opcode rewriting
Message-ID:  <51869038.8060302@ipfw.ru>
In-Reply-To: <20130424202308.GA11146@onelab2.iet.unipi.it>
References:  <517801D3.5040502@FreeBSD.org> <20130424162349.GA8439@onelab2.iet.unipi.it> <51780C49.7000204@FreeBSD.org> <20130424190930.GA10395@onelab2.iet.unipi.it> <51783798.4020004@FreeBSD.org> <20130424202308.GA11146@onelab2.iet.unipi.it>

next in thread | previous in thread | raw e-mail | index | archive | help
This is a multi-part message in MIME format.
--------------080306070304010006020308
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

On 25.04.2013 00:23, Luigi Rizzo wrote:
> On Wed, Apr 24, 2013 at 11:50:48PM +0400, Alexander V. Chernikov wrote:
>> On 24.04.2013 23:09, Luigi Rizzo wrote:
>>> On Wed, Apr 24, 2013 at 08:46:01PM +0400, Alexander V. Chernikov wrote:
>>>> On 24.04.2013 20:23, Luigi Rizzo wrote:
> ...
>>>> Well, actually I'm thinking of the next 2 steps:
>>>> 1) making kernel rule header more compact (20 bytes instead of 48) and
>>>> making it invisible for userland.
>>>> This involves rule counters to be stored separately (and possibly as
>>>> pcpu-based ones).
>>>> 2) since ruleset is now nearly readonly and more or less compact we can
>>>> try to store it in
>>>> contiguous address space to optimize cache line usage.
>>> certainly a worthwhile goal (also using gleb's new counters)
>>> but i suspect that compacting rules are a second order effect.
>>> I a bit skeptical they make a big difference on the in-kernel
>>> version of ipfw. You might see some difference in the
Well, direct tests with 1-2 rules shows results nearly within the limits 
of error with all 3 approaches. Production shows better results, but 
there are too many factors here..

>> My current numbers are ~5mpps of IPv4 forwarding with ipfw turned on (1
>> rule) for vlans over ixgbe, with 60% cpu usage (2xE5646).
>
> yes, but that means about 1mpps per core. the userspace version,
> i have been told, reached 14.2mpps with one core when running
> on netmap interfaces (single rule). So the impact of 20-30ns
> per rule is much higher in the second case.

>
> Speaking of which -- it would be better to report performance
> in terms of ns per packet (or per rule), rather than Mpps, because
> it is much easier to compare results.
Yes, this is true if you can scale linearly. With kernel-based 
forwarding this is not the case, actually speed is more or less constant 
(1mpps per core), and (nearly) all my patches are related to
decrease locking (e.g. scaling, not code optimization).

> Saying "20% better" or "300Kpps more" requires a lot more
> context to understand how much things improved.
>
>> For lagg with 2x ixgbe it is ~7mpps with the same 60% usage.
>> (And, say, 70% of CPU usage on our production is ipfw, despite low
>> number of rules).
>>> userspace version, which runs on top of netmap.
>> We are preparing to move forward in this direction (and thinking of
>> 20-30mpps as our goal).
>> (And I hope some changes of kernel-based version can migrate to userland
>> one :))
>
> yes, my goal is to have a single source tree that builds both in kernel
> and userspace. The diffs are very small, I just have a little bit
> of glue code to hook the packet path and the control socket.

Yep. While such changes may be not significant in kernel-based ipfw 
version, there is netmap-based one.

Personally I think that userland ipfw definitely should have some 
interface tracking anyway (since one of the fastest/easiest way to speed 
up forwarding is to retain control plain the same, while making 
forwarding plane configurable based on routing socket messages) and 
having this functionality can help kernel a bit (ipfw nat interface 
tracking, for example)

Opcode rewriting can help in keeping ABI the same while changing 
everything inside kernel. It can be currently used to eliminate
O_IP6_SRC_ME, O_IP6_DST_ME, O_IPPRECEDENCE and O_IPTOS opcodes,


The last one splits kernel/user ip_fw rule headers and prepares 
interface counters to be switched to per-cpu  (and here there is 
definitely visible performance difference even in kernel-based ipfw).

What do you think?


>
> cheers
> luigi
>


--------------080306070304010006020308
Content-Type: text/plain; charset=UTF-8;
 name="ipfw_rule_abi.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="ipfw_rule_abi.diff"

Index: sys/netinet/ip_fw.h
===================================================================
--- sys/netinet/ip_fw.h	(revision 250230)
+++ sys/netinet/ip_fw.h	(working copy)
@@ -468,6 +468,7 @@ typedef struct _ipfw_insn_icmp6 {
                        */
 } ipfw_insn_icmp6;
 
+#ifndef _KERNEL
 /*
  * Here we have the structure representing an ipfw rule.
  *
@@ -489,10 +490,6 @@ typedef struct _ipfw_insn_icmp6 {
  *	(at ACTION_PTR(r)) MUST be O_LOG
  *  + if a rule has an "altq" option, it comes after "log"
  *  + if a rule has an O_TAG option, it comes after "log" and "altq"
- *
- * NOTE: we use a simple linked list of rules because we never need
- * 	to delete a rule without scanning the list. We do not use
- *	queue(3) macros for portability and readability.
  */
 
 struct ip_fw {
@@ -521,6 +518,7 @@ struct ip_fw {
 
 #define RULESIZE(rule)  (sizeof(struct ip_fw) + \
 	((struct ip_fw *)(rule))->cmd_len * 4 - 4)
+#endif
 
 #if 1 // should be moved to in.h
 /*
Index: sys/netpfil/ipfw/ip_fw2.c
===================================================================
--- sys/netpfil/ipfw/ip_fw2.c	(revision 250246)
+++ sys/netpfil/ipfw/ip_fw2.c	(working copy)
@@ -2637,8 +2637,7 @@ vnet_ipfw_init(const void *unused)
 	chain->n_rules = 1;
 	chain->static_len = sizeof(struct ip_fw);
 	chain->map = malloc(sizeof(struct ip_fw *), M_IPFW, M_WAITOK | M_ZERO);
-	if (chain->map)
-		rule = malloc(chain->static_len, M_IPFW, M_WAITOK | M_ZERO);
+	rule = ipfw_alloc_rule(chain, sizeof(struct ip_fw));
 
 	/* Set initial number of tables */
 	V_fw_tables_max = default_fw_tables;
Index: sys/netpfil/ipfw/ip_fw_private.h
===================================================================
--- sys/netpfil/ipfw/ip_fw_private.h	(revision 250230)
+++ sys/netpfil/ipfw/ip_fw_private.h	(working copy)
@@ -212,6 +212,63 @@ VNET_DECLARE(int, autoinc_step);
 VNET_DECLARE(unsigned int, fw_tables_max);
 #define V_fw_tables_max		VNET(fw_tables_max)
 
+#ifdef _KERNEL
+typedef struct ip_fw_cntr {
+	uint64_t	pcnt;	   /* Packet counter		*/
+	uint64_t	bcnt;	   /* Byte counter		 */
+	uint32_t	timestamp;      /* tv_sec of last match	 */
+} ip_fw_cntr;
+
+/*
+ * Here we have the structure representing an ipfw rule.
+ *
+ * It starts with a general area (with link fields and counters)
+ * followed by an array of one or more instructions, which the code
+ * accesses as an array of 32-bit values.
+ *
+ * Given a rule pointer  r:
+ *
+ *  r->cmd		is the start of the first instruction.
+ *  ACTION_PTR(r)	is the start of the first action (things to do
+ *			once a rule matched).
+ *
+ * When assembling instruction, remember the following:
+ *
+ *  + if a rule has a "keep-state" (or "limit") option, then the
+ *	first instruction (at r->cmd) MUST BE an O_PROBE_STATE
+ *  + if a rule has a "log" option, then the first action
+ *	(at ACTION_PTR(r)) MUST be O_LOG
+ *  + if a rule has an "altq" option, it comes after "log"
+ *  + if a rule has an O_TAG option, it comes after "log" and "altq"
+ *
+ * NOTE: we use a simple linked list of rules because we never need
+ * 	to delete a rule without scanning the list. We do not use
+ *	queue(3) macros for portability and readability.
+ */
+
+struct ip_fw {
+	uint16_t	act_ofs;	/* offset of action in 32-bit units */
+	uint16_t	cmd_len;	/* # of 32-bit words in cmd     */
+	uint16_t	rulenum;	/* rule number		  */
+	uint8_t set;	    /* rule set (0..31)	     */
+#define	RESVD_SET	31      /* set for default and persistent rules */
+	uint8_t	 _pad;	   /* padding		      */
+	struct ip_fw_cntr	*cntr;  /* Pointer to rule counters     */
+	struct ip_fw    *x_next;	/* linked list of rules	 */
+	struct ip_fw    *next_rule;     /* ptr to next [skipto] rule    */
+	uint32_t	id;	     /* rule id */
+
+	ipfw_insn	cmd[1];	 /* storage for commands	 */
+};
+
+/* Kernel rule length */
+#define RULESIZE(rule)  (sizeof(struct ip_fw) + \
+	((struct ip_fw *)(rule))->cmd_len * 4 - 4)
+
+#define ACTION_PTR(rule)			       \
+       (ipfw_insn *)( (u_int32_t *)((rule)->cmd) + ((rule)->act_ofs) )
+#endif
+
 struct ip_fw_chain {
 	struct ip_fw	*rules;		/* list of rules */
 	struct ip_fw	*reap;		/* list of rules to reap */
@@ -238,9 +295,9 @@ struct sockopt;	/* used by tcp_var.h */
 
 /* Macro for working with various counters */
 #define	IPFW_INC_RULE_COUNTER(_cntr, _bytes)	do {	\
-	(_cntr)->pcnt++;				\
-	(_cntr)->bcnt += _bytes;			\
-	(_cntr)->timestamp = time_uptime;		\
+	(_cntr)->cntr->pcnt++;				\
+	(_cntr)->cntr->bcnt += _bytes;			\
+	(_cntr)->cntr->timestamp = time_uptime;		\
 	} while (0)
 
 #define	IPFW_INC_DYN_COUNTER(_cntr, _bytes)	do {		\
@@ -249,9 +306,9 @@ struct sockopt;	/* used by tcp_var.h */
 	} while (0)
 
 #define	IPFW_ZERO_RULE_COUNTER(_cntr) do {		\
-	(_cntr)->pcnt = 0;				\
-	(_cntr)->bcnt = 0;				\
-	(_cntr)->timestamp = 0;				\
+	(_cntr)->cntr->pcnt = 0;			\
+	(_cntr)->cntr->bcnt = 0;			\
+	(_cntr)->cntr->timestamp = 0;			\
 	} while (0)
 
 #define	IPFW_ZERO_DYN_COUNTER(_cntr) do {		\
@@ -298,6 +355,7 @@ int ipfw_find_rule(struct ip_fw_chain *chain, uint
 int ipfw_add_rule(struct ip_fw_chain *chain, struct ip_fw *input_rule);
 int ipfw_ctl(struct sockopt *sopt);
 int ipfw_chk(struct ip_fw_args *args);
+struct ip_fw *ipfw_alloc_rule(struct ip_fw_chain *chain, size_t rulesize);
 void ipfw_reap_rules(struct ip_fw *head);
 
 /* In ip_fw_table.c */
Index: sys/netpfil/ipfw/ip_fw_sockopt.c
===================================================================
--- sys/netpfil/ipfw/ip_fw_sockopt.c	(revision 250230)
+++ sys/netpfil/ipfw/ip_fw_sockopt.c	(working copy)
@@ -73,6 +73,27 @@ MALLOC_DEFINE(M_IPFW, "IpFw/IpAcct", "IpFw/IpAcct
  * static variables followed by global ones (none in this file)
  */
 
+struct ip_fw *
+ipfw_alloc_rule(struct ip_fw_chain *chain, size_t rulesize)
+{
+	struct ip_fw *rule;
+	struct ip_fw_cntr *cntr;
+
+	rule = malloc(rulesize, M_IPFW, M_WAITOK | M_ZERO);
+	cntr = malloc(sizeof(struct ip_fw_cntr), M_IPFW, M_WAITOK | M_ZERO);
+	rule->cntr = cntr;
+
+	return (rule);
+}
+
+static void
+free_rule(struct ip_fw *rule)
+{
+
+	free(rule->cntr, M_IPFW);
+	free(rule, M_IPFW);
+}
+
 /*
  * Find the smallest rule >= key, id.
  * We could use bsearch but it is so simple that we code it directly
@@ -158,20 +179,27 @@ ipfw_add_rule(struct ip_fw_chain *chain, struct ip
 	struct ip_fw *rule;
 	int i, l, insert_before;
 	struct ip_fw **map;	/* the new array of pointers */
+	struct ip_fw_cntr *cntr;
 
 	if (chain->rules == NULL || input_rule->rulenum > IPFW_DEFAULT_RULE-1)
 		return (EINVAL);
 
 	l = RULESIZE(input_rule);
-	rule = malloc(l, M_IPFW, M_WAITOK | M_ZERO);
+	rule = ipfw_alloc_rule(chain, l);
 	/* get_map returns with IPFW_UH_WLOCK if successful */
 	map = get_map(chain, 1, 0 /* not locked */);
 	if (map == NULL) {
-		free(rule, M_IPFW);
-		return ENOSPC;
+		free_rule(rule);
+		return (ENOSPC);
 	}
 
+	/* Save old data */
+	cntr = rule->cntr;
+
 	bcopy(input_rule, rule, l);
+
+	rule->cntr = cntr;
+
 	/* clear fields not settable from userland */
 	rule->x_next = NULL;
 	rule->next_rule = NULL;
@@ -220,7 +248,7 @@ ipfw_reap_rules(struct ip_fw *head)
 
 	while ((rule = head) != NULL) {
 		head = head->x_next;
-		free(rule, M_IPFW);
+		free_rule(rule);
 	}
 }
 
@@ -830,7 +858,6 @@ bad_size:
 	return EINVAL;
 }
 
-
 /*
  * Translation of requests for compatibility with FreeBSD 7.2/8.
  * a static variable tells us if we have an old client from userland,
@@ -859,15 +886,44 @@ struct ip_fw7 {
 	ipfw_insn	cmd[1];		/* storage for commands     */
 };
 
-	int convert_rule_to_7(struct ip_fw *rule);
-int convert_rule_to_8(struct ip_fw *rule);
-
-#ifndef RULESIZE7
-#define RULESIZE7(rule)  (sizeof(struct ip_fw7) + \
+#define	RULESIZE7(rule)	(sizeof(struct ip_fw7) + \
 	((struct ip_fw7 *)(rule))->cmd_len * 4 - 4)
-#endif
 
+/* Size difference between FreeBSD7 userland and kernel */
+#define	RULEDIFF7	(sizeof(struct ip_fw7) - sizeof(struct ip_fw))
 
+struct ip_fw8 {
+	struct ip_fw    *x_next;	/* linked list of rules	 */
+	struct ip_fw    *next_rule;     /* ptr to next [skipto] rule    */
+	/* 'next_rule' is used to pass up 'set_disable' status	  */
+
+	uint16_t	act_ofs;	/* offset of action in 32-bit units */
+	uint16_t	cmd_len;	/* # of 32-bit words in cmd     */
+	uint16_t	rulenum;	/* rule number		  */
+	uint8_t set;	    /* rule set (0..31)	     */
+#define	RESVD_SET	31      /* set for default and persistent rules */
+	uint8_t	 _pad;	   /* padding		      */
+	uint32_t	id;	     /* rule id */
+
+	/* These fields are present in all rules.			*/
+	uint64_t	pcnt;	   /* Packet counter		*/
+	uint64_t	bcnt;	   /* Byte counter		 */
+	uint32_t	timestamp;      /* tv_sec of last match	 */
+
+	ipfw_insn	cmd[1];	 /* storage for commands	 */
+};
+
+#define	RULESIZE8(rule)	(sizeof(struct ip_fw8) + \
+	((struct ip_fw8 *)(rule))->cmd_len * 4 - 4)
+
+/* Size difference between FreeBSD8 userland and kernel */
+#define	RULEDIFF8	(sizeof(struct ip_fw8) - sizeof(struct ip_fw))
+
+static int convert_rule_from_7(void *src, struct ip_fw *rule);
+static int convert_rule_to_7(struct ip_fw *rule, void *dst_rule, size_t *psize);
+static int convert_rule_from_8(void *src, struct ip_fw *rule);
+static int convert_rule_to_8(struct ip_fw *rule, void *dst_rule, size_t *psize);
+
 /*
  * Copy the static and dynamic rules to the supplied buffer
  * and return the amount of space actually used.
@@ -878,56 +934,21 @@ ipfw_getrules(struct ip_fw_chain *chain, void *buf
 {
 	char *bp = buf;
 	char *ep = bp + space;
-	struct ip_fw *rule, *dst;
+	struct ip_fw *rule;
 	int l, i;
-	time_t	boot_seconds;
 
-        boot_seconds = boottime.tv_sec;
 	for (i = 0; i < chain->n_rules; i++) {
 		rule = chain->map[i];
 
-		if (is7) {
-		    /* Convert rule to FreeBSd 7.2 format */
-		    l = RULESIZE7(rule);
-		    if (bp + l + sizeof(uint32_t) <= ep) {
-			int error;
-			bcopy(rule, bp, l + sizeof(uint32_t));
-			error = convert_rule_to_7((struct ip_fw *) bp);
-			if (error)
-				return 0; /*XXX correct? */
-			/*
-			 * XXX HACK. Store the disable mask in the "next"
-			 * pointer in a wild attempt to keep the ABI the same.
-			 * Why do we do this on EVERY rule?
-			 */
-			bcopy(&V_set_disable,
-				&(((struct ip_fw7 *)bp)->next_rule),
-				sizeof(V_set_disable));
-			if (((struct ip_fw7 *)bp)->timestamp)
-			    ((struct ip_fw7 *)bp)->timestamp += boot_seconds;
-			bp += l;
-		    }
-		    continue; /* go to next rule */
-		}
-
-		/* normal mode, don't touch rules */
-		l = RULESIZE(rule);
-		if (bp + l > ep) { /* should not happen */
-			printf("overflow dumping static rules\n");
-			break;
-		}
-		dst = (struct ip_fw *)bp;
-		bcopy(rule, dst, l);
-		/*
-		 * XXX HACK. Store the disable mask in the "next"
-		 * pointer in a wild attempt to keep the ABI the same.
-		 * Why do we do this on EVERY rule?
-		 */
-		bcopy(&V_set_disable, &dst->next_rule, sizeof(V_set_disable));
-		if (dst->timestamp)
-			dst->timestamp += boot_seconds;
+		if (is7)
+			l = convert_rule_to_7(rule, bp, &space);
+		else
+			l = convert_rule_to_8(rule, bp, &space);
+		if (l == 0)
+			return (0);
 		bp += l;
 	}
+
 	ipfw_get_dynamic(chain, &bp, ep); /* protected by the dynamic lock */
 	return (bp - (char *)buf);
 }
@@ -999,6 +1020,12 @@ ipfw_ctl(struct sockopt *sopt)
 			int len = 0, want;
 
 			size = chain->static_len;
+			/*
+			 * Currently the only (size-related) differece between
+			 * kernel and 7/8 userland are rule headers.
+			 * Calculate size change depending on version.
+			 */
+			size += chain->n_rules * (is7 ? RULEDIFF7 : RULEDIFF8);
 			size += ipfw_dyn_len();
 			if (size >= sopt->sopt_valsize)
 				break;
@@ -1023,9 +1050,14 @@ ipfw_ctl(struct sockopt *sopt)
 		break;
 
 	case IP_FW_ADD:
-		rule = malloc(RULE_MAXSIZE, M_TEMP, M_WAITOK);
-		error = sooptcopyin(sopt, rule, RULE_MAXSIZE,
-			sizeof(struct ip_fw7) );
+		buf = malloc(RULE_MAXSIZE * 2, M_TEMP, M_WAITOK);
+		rule = (struct ip_fw *)((char *)buf + RULE_MAXSIZE);
+		/*
+		 * Note that ip_fw7 has the smallest size between all other
+		 * userland rule structures.
+		 */
+		error = sooptcopyin(sopt, buf, RULE_MAXSIZE,
+			sizeof(struct ip_fw7));
 
 		/*
 		 * If the size of commands equals RULESIZE7 then we assume
@@ -1037,32 +1069,33 @@ ipfw_ctl(struct sockopt *sopt)
 		 *       the ipfw binary may crash or loop infinitly...
 		 */
 		if (sopt->sopt_valsize == RULESIZE7(rule)) {
-		    is7 = 1;
-		    error = convert_rule_to_8(rule);
-		    if (error)
-			return error;
-		    if (error == 0)
-			error = check_ipfw_struct(rule, RULESIZE(rule));
+			is7 = 1;
+			error = convert_rule_from_7(buf, rule);
 		} else {
-		    is7 = 0;
+			is7 = 0;
+			error = convert_rule_from_8(buf, rule);
+		}
+
 		if (error == 0)
-			error = check_ipfw_struct(rule, sopt->sopt_valsize);
+			error = check_ipfw_struct(rule, RULESIZE(rule));
+
+		if (error != 0)
+			return (error);
+
+		/* locking is done within ipfw_add_rule() */
+		error = ipfw_add_rule(chain, rule);
+		if (!error && sopt->sopt_dir == SOPT_GET) {
+			if (is7)
+				size = convert_rule_to_7(rule, buf, NULL);
+			else
+				size = convert_rule_to_8(rule, buf, NULL);
+
+			if (size == 0)
+				return (EINVAL);
+
+			error = sooptcopyout(sopt, buf, size);
 		}
-		if (error == 0) {
-			/* locking is done within ipfw_add_rule() */
-			error = ipfw_add_rule(chain, rule);
-			size = RULESIZE(rule);
-			if (!error && sopt->sopt_dir == SOPT_GET) {
-				if (is7) {
-					error = convert_rule_to_7(rule);
-					size = RULESIZE7(rule);
-					if (error)
-						return error;
-				}
-				error = sooptcopyout(sopt, rule, size);
-		}
-		}
-		free(rule, M_TEMP);
+		free(buf, M_TEMP);
 		break;
 
 	case IP_FW_DEL:
@@ -1327,47 +1360,121 @@ ipfw_ctl(struct sockopt *sopt)
 #undef RULE_MAXSIZE
 }
 
+/*
+ * Functions to convert rules between FreeBSD 7 userland and kernel.
+ * No input validation (except general size checking) is done here.
+ */
 
-#define	RULE_MAXSIZE	(256*sizeof(u_int32_t))
+#define        _COPYFIELD(_f)  rule8->_f = rule->_f
+static int
+convert_rule_to_8(struct ip_fw *rule, void *dst_rule, size_t *psize)
+{
+	struct ip_fw8 *rule8;
+	size_t len;
 
-/* Functions to convert rules 7.2 <==> 8.0 */
-int
-convert_rule_to_7(struct ip_fw *rule)
+	if (psize != NULL && *psize < RULESIZE(rule) + RULEDIFF8)
+		return (0);
+
+	rule8 = (struct ip_fw8 *)dst_rule;
+	memset(rule8, 0, sizeof(struct ip_fw8));
+
+	_COPYFIELD(act_ofs);
+	_COPYFIELD(cmd_len);
+	_COPYFIELD(rulenum);
+	_COPYFIELD(set);
+	_COPYFIELD(id);
+
+	/*
+	 * Store the disable mask in the "next_rule" field.
+	 * Why do we do this on EVERY rule?
+	 */
+	memcpy(&rule8->next_rule, &V_set_disable, sizeof(V_set_disable));
+	if (rule8->timestamp > 0)
+		rule8->timestamp += boottime.tv_sec;
+
+	/* Export counters */
+	if (rule->cntr != NULL) {
+		rule8->bcnt = rule->cntr->bcnt;
+		rule8->pcnt = rule->cntr->pcnt;
+		rule8->timestamp = rule->cntr->timestamp;
+	}
+
+	memcpy(rule8->cmd, rule->cmd, rule->cmd_len * sizeof(uint32_t));
+
+	len = RULESIZE8(rule8);
+	if (psize != NULL)
+		*psize -= len;
+
+	return (len);
+}
+#undef _COPYFIELD
+
+#define        _COPYFIELD(_f)  rule->_f = rule8->_f
+static int
+convert_rule_from_8(void *src, struct ip_fw *rule)
 {
-	/* Used to modify original rule */
-	struct ip_fw7 *rule7 = (struct ip_fw7 *)rule;
-	/* copy of original rule, version 8 */
-	struct ip_fw *tmp;
+	struct ip_fw8 *rule8;
 
-	/* Used to copy commands */
+	rule8 = (struct ip_fw8 *)src;
+	memset(rule, 0, sizeof(struct ip_fw));
+
+	_COPYFIELD(act_ofs);
+	_COPYFIELD(cmd_len);
+	_COPYFIELD(rulenum);
+	_COPYFIELD(set);
+	_COPYFIELD(id);
+
+	memcpy(rule->cmd, rule8->cmd, rule8->cmd_len * sizeof(uint32_t));
+
+	return (0);
+}
+#undef _COPYFIELD
+
+
+static int
+convert_rule_to_7(struct ip_fw *rule, void *dst_rule, size_t *psize)
+{
 	ipfw_insn *ccmd, *dst;
 	int ll = 0, ccmdlen = 0;
+	struct ip_fw7 *rule7;
+	size_t len;
 
-	tmp = malloc(RULE_MAXSIZE, M_TEMP, M_NOWAIT | M_ZERO);
-	if (tmp == NULL) {
-		return 1; //XXX error
-	}
-	bcopy(rule, tmp, RULE_MAXSIZE);
+	if (psize != NULL && *psize < RULESIZE(rule) + RULEDIFF7)
+		return (0);
 
+	rule7 = (struct ip_fw7 *)dst_rule;
+	memset(rule7, 0, sizeof(struct ip_fw7));
+
 	/* Copy fields */
-	rule7->_pad = tmp->_pad;
-	rule7->set = tmp->set;
-	rule7->rulenum = tmp->rulenum;
-	rule7->cmd_len = tmp->cmd_len;
-	rule7->act_ofs = tmp->act_ofs;
-	rule7->next_rule = (struct ip_fw7 *)tmp->next_rule;
-	rule7->next = (struct ip_fw7 *)tmp->x_next;
-	rule7->cmd_len = tmp->cmd_len;
-	rule7->pcnt = tmp->pcnt;
-	rule7->bcnt = tmp->bcnt;
-	rule7->timestamp = tmp->timestamp;
+	rule7->_pad = rule->_pad;
+	rule7->set = rule->set;
+	rule7->rulenum = rule->rulenum;
+	rule7->cmd_len = rule->cmd_len;
+	rule7->act_ofs = rule->act_ofs;
+	rule7->next_rule = (struct ip_fw7 *)rule->next_rule;
+	rule7->next = (struct ip_fw7 *)rule->x_next;
+	rule7->cmd_len = rule->cmd_len;
 
+	if (rule->cntr != NULL) {
+		rule7->pcnt = rule->cntr->pcnt;
+		rule7->bcnt = rule->cntr->bcnt;
+		rule7->timestamp = rule->cntr->timestamp;
+	}
+
+	/*
+	 * Store the disable mask in the "next"
+	 * Why do we do this on EVERY rule?
+	 */
+	memcpy(&rule7->next_rule, &V_set_disable, sizeof(V_set_disable));
+	if (rule7->timestamp > 0)
+		rule7->timestamp += boottime.tv_sec;
+
 	/* Copy commands */
-	for (ll = tmp->cmd_len, ccmd = tmp->cmd, dst = rule7->cmd ;
-			ll > 0 ; ll -= ccmdlen, ccmd += ccmdlen, dst += ccmdlen) {
+	for (ll = rule->cmd_len, ccmd = rule->cmd, dst = rule7->cmd; ll > 0;
+	    ll -= ccmdlen, ccmd += ccmdlen, dst += ccmdlen) {
 		ccmdlen = F_LEN(ccmd);
 
-		bcopy(ccmd, dst, F_LEN(ccmd)*sizeof(uint32_t));
+		memcpy(dst, ccmd, F_LEN(ccmd) * sizeof(uint32_t));
 
 		if (dst->opcode > O_NAT)
 			/* O_REASS doesn't exists in 7.2 version, so
@@ -1378,37 +1485,43 @@ ipfw_ctl(struct sockopt *sopt)
 		if (ccmdlen > ll) {
 			printf("ipfw: opcode %d size truncated\n",
 				ccmd->opcode);
-			return EINVAL;
+			return (EINVAL);
 		}
 	}
-	free(tmp, M_TEMP);
 
-	return 0;
+	len = RULESIZE7(rule7);
+	if (psize != NULL)
+		*psize -= len;
+
+	return (len);
 }
 
-int
-convert_rule_to_8(struct ip_fw *rule)
+static int
+convert_rule_from_7(void *src, struct ip_fw *rule)
 {
-	/* Used to modify original rule */
-	struct ip_fw7 *rule7 = (struct ip_fw7 *) rule;
-
 	/* Used to copy commands */
 	ipfw_insn *ccmd, *dst;
 	int ll = 0, ccmdlen = 0;
+	struct ip_fw7 *rule7;
+	
+	rule7 = (struct ip_fw7 *)src;
+	memset(rule, 0, sizeof(struct ip_fw));
 
-	/* Copy of original rule */
-	struct ip_fw7 *tmp = malloc(RULE_MAXSIZE, M_TEMP, M_NOWAIT | M_ZERO);
-	if (tmp == NULL) {
-		return 1; //XXX error
-	}
+	rule->_pad = rule7->_pad;
+	rule->set = rule7->set;
+	rule->rulenum = rule7->rulenum;
+	rule->cmd_len = rule7->cmd_len;
+	rule->act_ofs = rule7->act_ofs;
+	rule->next_rule = (struct ip_fw *)rule7->next_rule;
+	rule->x_next = (struct ip_fw *)rule7->next;
+	rule->cmd_len = rule7->cmd_len;
+	rule->id = 0; /* XXX see if is ok = 0 */
 
-	bcopy(rule7, tmp, RULE_MAXSIZE);
-
-	for (ll = tmp->cmd_len, ccmd = tmp->cmd, dst = rule->cmd ;
+	for (ll = rule7->cmd_len, ccmd = rule7->cmd, dst = rule->cmd ;
 			ll > 0 ; ll -= ccmdlen, ccmd += ccmdlen, dst += ccmdlen) {
 		ccmdlen = F_LEN(ccmd);
 		
-		bcopy(ccmd, dst, F_LEN(ccmd)*sizeof(uint32_t));
+		memcpy(dst, ccmd, F_LEN(ccmd) * sizeof(uint32_t));
 
 		if (dst->opcode > O_NAT)
 			/* O_REASS doesn't exists in 7.2 version, so
@@ -1419,25 +1532,11 @@ ipfw_ctl(struct sockopt *sopt)
 		if (ccmdlen > ll) {
 			printf("ipfw: opcode %d size truncated\n",
 			    ccmd->opcode);
-			return EINVAL;
+			return (EINVAL);
 		}
 	}
 
-	rule->_pad = tmp->_pad;
-	rule->set = tmp->set;
-	rule->rulenum = tmp->rulenum;
-	rule->cmd_len = tmp->cmd_len;
-	rule->act_ofs = tmp->act_ofs;
-	rule->next_rule = (struct ip_fw *)tmp->next_rule;
-	rule->x_next = (struct ip_fw *)tmp->next;
-	rule->cmd_len = tmp->cmd_len;
-	rule->id = 0; /* XXX see if is ok = 0 */
-	rule->pcnt = tmp->pcnt;
-	rule->bcnt = tmp->bcnt;
-	rule->timestamp = tmp->timestamp;
-
-	free (tmp, M_TEMP);
-	return 0;
+	return (0);
 }
 
 /* end of file */

--------------080306070304010006020308--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?51869038.8060302>