From owner-freebsd-ipfw@FreeBSD.ORG  Wed Jul  2 14:43:25 2003
Return-Path: <owner-freebsd-ipfw@FreeBSD.ORG>
Delivered-To: freebsd-ipfw@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 4CD3D37B401; Wed,  2 Jul 2003 14:43:25 -0700 (PDT)
Received: from mail2.fishnavy.inf.cu (mail2.fishnavy.inf.cu [200.55.133.86])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 5BECC43FDF; Wed,  2 Jul 2003 14:43:09 -0700 (PDT)
	(envelope-from jemilio@fishnavy.inf.cu)
Received: from jemilio (proxy.cubamar.cu [200.55.133.82] (may be forged))
	h62MlJ9N000193;	Wed, 2 Jul 2003 17:47:28 -0500 (EST)
Message-ID: <00f501c340e3$9a4d31d0$85060a0a@CONDOR.CU>
From: =?iso-8859-1?Q?Jos=E9_Emilio_N=FA=F1ez_May=E1ns?=
	<jemilio@fishnavy.inf.cu>
To: "Luigi Rizzo" <rizzo@icir.org>, <rwatson@FreeBSD.org>,
	<ipfw@FreeBSD.org>
Date: Wed, 2 Jul 2003 17:47:49 -0400
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2800.1106
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
Subject: Another algorithm for matching static firewall rules on a network
	packet
X-BeenThere: freebsd-ipfw@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: IPFW Technical Discussions <freebsd-ipfw.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-ipfw>,
	<mailto:freebsd-ipfw-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-ipfw>
List-Post: <mailto:freebsd-ipfw@freebsd.org>
List-Help: <mailto:freebsd-ipfw-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-ipfw>,
	<mailto:freebsd-ipfw-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 02 Jul 2003 21:43:25 -0000


Another algorithm for matching static firewall rules on a network packet

PREFACE

Associated with each firewall package is an evaluation algorithm that decides
what action to do (deny, allow, count, etc.) with a network packet according
to the existent fw-rules. This evaluation algorithm consists basically in a
search through the firewall rules for a matching one(s) on the packet. In
FreeBSD 4.5, this is the function "ip_fw_chk" in the file "ip_fw.c", IMHO.

What is to be discussed next is a possible (?) firewall evaluation algorithm.
I'm sorry, but I have no previous experience with the FreeBSD kernel to have a
working implementation, so I can't run any real-life comparison. The algorithm
may be UNPOLISHED. I hope you have more luck than I had.

It's a complex algorithm, so the final idea is communicated in incremental
steps, in order to let you correct more easily any error, bug or
misunderstanding that, I'm positive, may arise. If I could ask you patience, I
shall.

The algorithm shown here shouldn't be covered by a patent currently, because
it could be in use in the RDBMS arena for many years now.

my best wishes,
sincerely,
José Emilio Núñez Mayáns

PS: Although this e-mail is sent to:
     - professor Luigi Rizzo, by its extensive and professional work on the
       dynamic (stateful) firewall (and a lot of others); and to
     - Robert Watson, by the "ugidfw(8)" (File System Firewall Policy in
       FreeBSD-5.1-current), another kind firewall (on filesystem objects)
       (TrustedBSD project);
     - /* it should have been sent to others that have collaborated in the
       firewall code, Ugen J. S. Antsilevich, Poul-Henning Kamp, Alex Nash,
       Archie Cobbs... */

it's an incomplete work and FreeBSD is an open project. And I apologize for
the unintentional mistakes I have made, and consuming your spare time.


TOPICS

SECTION 1. Introduction to the proposed algorithm
SECTION 1.1. The basic algorithm's steps (ip_fw_chk function)
SECTION 1.2. The search data structure
SECTION 2. Informal initial sketch of the algorithm in C
SECTION 3. Details and improvements of the algorithm
SECTION 3.2. Irrelevant fields (omitted or equal to ANY) in a rule
SECTION 3.3. Negated (NOT) values in a fw-rule
SECTION 3.4. Dealing with ranges and masks of values
SECTION 3.7. Frequently-matched rules near the start of the rules list
SECTION 3.7.1. A first approach using a partial merge
SECTION 3.7.2. Another approach using the fields' structure among the rules,
               without a merge of the indexing lists
SECTION 3.7.3. Consolidating both variants of the algorithm
SECTION 3.8. Multi-pass firewalls
SECTION 4. Final words
ANNEX 1. The merge algorithm (of section 3.2)
ANNEX 2. IP_FW_CHK from section 3.7.2
ANNEX 3. How to build the array "break_rule" of each (static) rule
         (from section 3.7.2)
ANNEX 4. IP_FW_CHK from section 3.7.2
ANNEX 90. A useful idea from the dynamic rules


SECTION 1. INTRODUCTION TO THE PROPOSED ALGORITHM

When a list of (static) firewall rules is set up, all the problematic values
(restricted and/or allowed addresses, ports, etc.), are informed to the IPFW
subsystem. When a packet arrives, all the values of its fields are known.
Since those values don't change while scanning (searching) for deciding what
to do with the packet, then, if we can find a way to use this fact in the
search, we may obtain some saving.

The proposed evaluation strategy tries to economize comparisons, which, in the
common sequential-search algorithm may be repeated if two rules share a common
field(s), e.g., in these two rules,

         allow tcp from 192.168.1.0/25 to me smtp
         allow udp from 192.168.3.0/25 to me syslog

the IP-protocol, source and destination addresses, and destination port, are
compared twice versus the respective ones in the packet, but no value has
changed. This could be expensive if IP6 addresses are to be compared.

(But note that, by trying to use the mentioned fact and by trying to save
comparisons between the packet's fields and the firewall's, a waste in
operations can have occurred in some other sense.)


SECTION 1.1. THE BASIC ALGORITHM'S STEPS (ip_fw_chk FUNCTION)

The idea consists basically in an indexing schema.

1.- Separate initially each field of the packet, because the comparisons are
    ultimately done field by field (addresses, ports, flags, ...)  between its
    real value in the packet and what was specified in a fw-rule.

2.- Employ a quick search method for each of the (independent) fields in the
    (projection on that field of the) fw-rules. (For example, for each field,
    a hash-based lookup, or, what is also mentioned in the following, a search
    on a B-tree).

    Each value that appears in the search data structure of a field is there
    because it occurs in a fw-rule in such field. Each value (entry or node in
    the search data structure) has associated an ordered (by "fw_number"
    fw-rule#) list of the rules that have such value for the field (cf.
    section 1.2). These ordered lists were built when the search data
    structure (hash, B-tree, etc) containing the values was.

    In this step, each search for a field returns the above-mentioned list of
    fw-rules associated with the value of the field. I.e., this step as a
    whole returns the ordered lists of rules associated with the values of the
    fields in the packet.

3.- Merge the lists extracting the fw-rules that are common to all the fields
    (coincidences), and these are the fw-rules that have to be applied on the
    packet. As this common sublist will be also ordered by increasing rule
    number ("fw_number"), low-numbered rules will be applied first on the
    packet, preserving the existent semantics of the fw-subsystem.

4.- Finally, apply the list of common fw-rules -from step 3- on the packet,
    until an allow/deny rule is found in the list and then the transversal
    stops. More exactly, in the words of the ipfw(8) man page, the transversal
    on the list ends immediately after a "the search terminates" rule is
    applied (allow, deny, reject, reset, divert, etc.), and continues after a
    "the search continues" rule (count, skip, etc.)

    If no "the search terminates" fw-rule is found in the list, then, after
    the transversal, the default fw-rule is applied to the packet.


SECTION 1.2. THE SEARCH DATA STRUCTURE

Before continuing to the sketch of the basic algorithm in C, let's briefly
expose the search data structure employed in step 2. It would have the
following form:

Overall perspective:

 a field                     main fw rule           another field
 search data struct.           list                search data struct.
  |        |              ("ip_fw_chain_head")       |        |
  V   ...  V                      |                  V   ...  V
 list     list                    V                 list     list
 | |...    |  ----...-........-> rule               | |...    ...
 | |       -----...--..-          |                 | ...
 | ----...->..          |         V                 |
 ---------...->....     -------> ...  <----...-------
                                  |
                                  V
                                 ...
                                  |
                                  V
                            default fw-rule

And detailed for each field,

           search data structure (B-tree, hash, etc.) for the field
             |
           (/* containing several values of the field, mentioned in the
             |          firewall rules -the projection of the fw-rules on that
             |          field */)
             |
           a node or entry: a single value of the field, mentioned by the
             |          fw-rules
             |
            (points to)
             |
               -> an ordered -by "fw_number"- list of the fw-rules matching
                  that value for that field. Each entry in the list contains
                  a pointer to a "struct ip_fw" in the "ip_fw_chain_head"
                  list.

                  A list of this type will be returned, if any, by the step 2
                  (searches) for each field. Step 3 (merge) may not change
                  these lists.

So, a node in the lists has this structure:

       struct list_entry {
           struct list_entry * next;
           struct ip_fw      * rule; /* to its node on the "ip_fw_chain_head"
                                      * list
                                      */
       };

(Since the lists of rules are built only at the same time the data structure
is, the dynamic rules generated in a stateful firewall should not be
represented by this algorithm, if it were considerable the time consumed in
updating the search data structure and its associated lists when the
fw-subsystem adds and removes dynamic rules.)

In section 3, there are topics that complement in minor aspects these data
structures. But the above diagram will be valid.


SECTION 2. INFORMAL INITIAL SKETCH OF THE ALGORITHM IN C

To make the idea of the algorithm clearer, let's expose informally an
**initial** sketch of it in C:

static int
ip_fw_chk( struct ip ** pip, /* rest of parameters here*/ ...)
{

    /* (This function is syntactically incorrect.) */

    /*** 1. GET THE FIELDS OF THE PACKET
     ***/

        get_packet_fields( pip, hlen, oif,
                                 &packet_src_address,
                                 &packet_dest_address,
                                 &packet_protocol,
                                 &packet_in_interface,
                                 &packet_out_interface, ...);

    /*** 2. MAKE A FAST SEARCH FOR EACH FIELD OF THE PACKET ON THE PROJECTION
     ***    OF THE FW-RULES ON THAT FIELD
     ***/

        struct list_entry * ordered_list_for_field[ MAX_FIELDS ];

        for each of the fields of the packet
                 /* src_address, dest_address, protocol, interfaces, ... */
        {
            ordered_list_for_field[ fieldX ] =
                  search_in_projection_of_fwrules_on_FIELDx
                                            ( packet_fieldX_value );

            /* this search could be binary or hash-based. For example, using a
             * B-tree on the values of the field. There would be then several
             * B-trees, one for each field. You may use a type of data
             * data structure for a field and a distinct one for another
             * field.
             *
             * To improve performance, the lists are NOT built by the
             * searches. Each list returned by each search was set up when its
             * search data structure -hash, B-tree-, was.
             *
             * (Thereafter, the overhead of a stateful firewall -adding and
             * removing dynamic rules- could put a load in updating the search
             * data structures and their associated ordered lists, if this
             * algorithm were used for a stateful firewall.)
             */

    }

/*** 3. EXTRACT THE COINCIDENCES IN ALL THE SEARCHES (A MERGE)
 ***/

    struct list_entry * ordered_common_sublist =
             merge_and_extract_coincidences( ordered_list_for_field );

/*** 4. APPLY THE COINCIDENT RULES ON THE PACKET
 ***/

    if ( ! empty( ordered_common_sublist ) )
    {
         /* The following function, "apply_fw_rules", finally applies the
          * rules on the packet.
          *
          * It returns TRUE if a "the search terminates" (allow, deny, reset,
          * unreach, ...) rule was applied -and it would have been the only
          * one of that type to have been applied-, and returns FALSE if all
          * the rules applied belonged to "the search continues" class (and
          * therefore the default fw rule has to be applied next).
          *
          * This function employs some parameters of "ip_fw_chk", e.g., the
          * mbuf "m" and the "next_hop", since "apply_fw_rules()" may modify
          * them if requested by a fw-rule.
          *
          * The last parameter to "apply_fw_rules" is really the return value
          * of this function "ip_fw_chk", if the former returns TRUE.
          */

         int decision_found, ip_fw_chk_returnval ;

         decision_found =
                   apply_fw_rules( ordered_common_sublist,
                                   /* other parameters to ip_fw_chk are passed
                                    * here, e.g., the mbuf "m", the
                                    * "next_hop", etc, since they may be
                                    * modified here
                                    */
                                   &ip_fw_chk_returnval );

         if ( decision_found )  return ip_fw_chk_returnval;

    }

    /* default firewall rule */
      /* other stuff here: free the mbuf parameter "m", increment default
       * rule's statistics, etc.
       */
    return IP_FW_PORT_DENY_FLAG;

}

/**** END of function ip_fw_chk() ****/


SECTION 3. DETAILS AND IMPROVEMENTS OF THE ALGORITHM

There are several details that modify the implementation.

SECTION 3.2. IRRELEVANT FIELDS (OMITTED OR EQUAL TO any) IN A RULE

Normally, there are fw-rules that don't specify a value for a field. E.g., the
rule

                allow tcp from any to me 80 setup ...

doesn't specify any values for the source IP-address/TCP-port. Let's call
these rules by "fieldX is irrelevant" in them.

When a search data structure (B-tree, hash, etc) for a field is built, it
contains known values, explicitly mentioned in the fw-rules. In the case of a
"fieldX is irrelevant" rule, all the implicit, missing values of the fieldX
domain can be generated into the data structure, but, when the domain is large
(IP4 or IP6 addresses), the search on fieldX would be slow, IMHO.

A solution is to have, for each fieldX, an entry "REST of the domain", with
the list of the "fieldX is irrelevant" fw-rules. This entry is independent of
fieldX's search data structure. If the search on the data structure is
unsuccessful, then the value of fieldX in the network packet is classified as
being in the "REST of the domain", and hence, that list is returned in step 2
for fieldX. (Of course, since these "fieldX is irrelevant" rules can also
match every explicit value in the fieldX's search data structure, such rules
can be in the explicit values' lists too, as required from section 1.2 which
stated "an ordered list of fw-rules *matching* that -explicit- value for that
field. Cf section 3.3)

      -- fieldX -------------------------------------------------------------
     |
     |   ------------------------------------------------------------------
     |  | "REST of the domain" entry
     |  |  |
     |  | (points to)
     |  |  |
     |  |    -> ordered list of the "fieldX is irrelevant" fw-rules. This list
     |  |       is returned *only when* the search on the data structure for
     |  |       an explicit value was unsuccessful.
     |   ------------------------------------------------------------------
     |
     |   ------------------------------------------------------------------
     |  | search data structure on fw's explicit fieldX's values (cf. section
     |  | 1.2.)
     |  | Note: the rules in the "REST of the domain" entry may also be
     |  | included in the explicit values' lists (cf. section 3.3).
     |   ------------------------------------------------------------------
      ------------------------------------------------------

    For example, in these two rules,

        add 1000 allow ip  from me  to 192.168.1.1
        add 2000 allow tcp from any to any  established

    taking in consideration the field "protocol" **only**, then the first
    rule is in the "REST" entry, because it didn't request a specific
    IP-family protocol. The second rule is in the list associated to entry
    "tcp" in the search data structure for field "protocol". Since a TCP
    packet can match the first rule also -from the field "protocol" point
    of view, because it specified "ip" that means irrelevant-, then rule
    1000 is in the list of the "tcp" entry too -before the second one, by
    the "fw_number" order:

      field "protocol":

              - "REST of the domain" entry ---> list( ... 1000, ...)

              - search data structure (hash, B-tree, etc.)
                      |
                      | /* has an entry for protocol TCP ...*/
                      V
                    "tcp" entry ---> list( ... 1000, 2000, ... )

A performance penalty of a "REST of the domain" entry (as employed for the
"fieldX is irrelevant" problem of this section 3.2, but see section 3.3) is
that, since some rules in the "REST" list can be included in each explicit
value's list too, these lists can be longer, and the next step, step 3 -the
merge-, will have more work to do exhausting longer lists.

For formal reasons only, the "REST of the domain" solution was shown in this
section, but it is reserved to be applied to section 3.3. The proposed
solution to the "fieldX is irrelevant" problem is that each fw-rule have a
bit-vector indicating which fields the rule needs to be matched (the rule has
explicit values for them) and which fields it ignores. (This bitmask is very
similar to the "fw_flg" and "fw_ipflg" existing in "struct ip_fw", but without
the IP_FW_F_COMMAND and some other bits -e.g., the IP_FW_F_KEEP_S and
IP_FW_F_CHECK_S bits.)

With this bitmask, step 2 (the searches) are unaffected, but in the merge
phase, the coincidence on a tentative fw-rule occurs when it appears in all
the lists of its needed (explicit) fields, and the lists of its irrelevant
fields are ignored.

      The coincidences with a bitmask for relevant fields

         (step 2-searches)
         As usual, array "ordered_list_for_field[ ]" in the code above:

              [field1] -- points to -> ordered list of fw-rules returned by
                          the search on field1 (having its explicit value
                          matching that in the network packet)
               ...

              [fieldN] -- points to -> ordered list of fw-rules returned by
                          the search on field1 (having its explicit value
                          matching that in the network packet)
         --------
         (step 3-merge)
         Be a tentative fw-rule R (with needed-fields bitmask RB):
              R is a coincidence of the searches if every field in R's bitmask
              RB has rule R in the head of its list (in the code above, it is
              pointed to by "ordered_list_for_field[field]->rule")

In Annex 1 there is a possible implementation of this variant for step
3-merge. As a detail, it is immune to "static" rules having no relevant fields
at all (from the "static" firewall point of view), e.g.  the rule

        allow check-state

because the merge in Annex 1 always tries to discard a minimum (possible
coincidence) as soon as possible, but that rule can't be discarded since it
doesn't need any match on any field for it to match the packet: it can only be
replaced by a lesser "fw_number" matching rule.

(For such rules, a question arises in step 2 -the searches-: if the rule
depends on no field, then, in which field/list is it to be stored, for the
rule can be returned later for its processing in step 3 -merge? At first
sight, it seems that in all rules and fields. See section 3.7 for an
alternative.)

    NOTE: about the IPFW2 expressive power and fields both explicit and
          irrelevant in an IPFW2 rule ...

In IPFW2, there may be fields both explicitly requested and irrelevant in a
same rule, e.g., the protocol and destination address in

       deny proto icmp or dst-ip me

because, though they are in different OR-blocks, the merge may incorrectly
interprets the bitmask as an AND. IMHO, this IPFW2 OR-blocks difficulty could
be solved by representing internally the IPFW2 rule in a canonical equivalent
set of simpler rules where each field is either explicitly given or
irrelevant, but not both.

A practical interpretation of such "canonical equivalent set of rules" is not
only merging (step 3: extract coincidences) according to a common fw-rule, but
requiring further a coincidence in the OR-block in that IPFW2 rule. Thus, the
ordered list associated with each explicit value in the search data structure
would not be one of rules, but one of OR-blocks in rules: a coincidence needs
everyone to point to the same OR-block in a rule. Thus, the canonical set is a
kind of disjunctive normal form.


SECTION 3.3. NEGATED (NOT) VALUES IN A FW-RULE

The problem of negated values for a field in a rule and their matching affects
the algorithm. A simple example to proceed:

           add 10000 deny ip from not 192.168.12.0/25 to any

In it, the "IP-SourceAddress" value is negated.

(There may be other, better, solutions to this problem than the following,
IMHO.)

Suppose there is a bitmask "fw_notflg" in the "struct ip_fw" indicating which
fields are negated in a rule. That bitmask would be processed in step 3
(merge) or in step 4 (apply) to exclude a negated rule that matched the value
in the packet. In step 2 (searches), the rule with the negated value would be
in the list of that value, as usual. In the example above:

     - field: "IP source Address"

              search data structure
                       |
                       |
                  192.168.12.0/25 --> list( ..., 10000, ...)

     - If a packet with IP-SourceAddress == "192.168.12.0/25" arrives,

         - In step 2 (searches) that value is matched and its list returned,
           as usual.

         - In step 3 (merge) or 4 (apply), since rule 10000 in the list has
           the "IP-SourceAddress" field negated (ON in the "fw_notflg"
           bitmask), then rule 10000 is discarded.

But, what happens on a packet not having that value in the rule? E.g., in the
example above, with "IP-SourceAddress"== 192.168.3.3. Then, the value
wouldn't be found in the search data structure (or perhaps yes, but rule 10000
wouldn't be in its list), and this is not the intended behavior, because rule
10000 -for field "IP-SourceAddress"- matches indeed, since the value in the
packet -192.168.3.3- does match rule 10000's "not 192.168.12.0/25" condition.

To avoid this "loss", such fw-rule (in the example, 10000) has to be put in
the list of every other explicit value for that field in its search data
structure. E.g., if there are these two rules

           # In real life, it may be after the rule with the "not"
           add  5000 allow tcp from 192.168.55.0/24 to me smtp setup
           add 10000 deny ip from not 192.168.12.0/25 to any

then we would have that rule 10000 matches for the "Ip-SourceAddress" field ==
192.168.55.0/24, and really rule 10000 matches every other explicit value for
field "Ip-SourceAddress" that doesn't intersect with rule 10000's negated
value "192.168.12.0/25".

     - field: "IP source Address"

              search data structure
                       |
                       |
           ------------|-----------
           |           |          |
          ...          |         ...
                       |
                 192.168.55.0/24 --> list( ... 5000, 10000, ...)

But, what happens if a packet arrives not having any of the explicit values
for the field? E.g., in the example above, with an "IP-SourceAddress" not
being "192.168.12.0/25", "192.168.55.0/24" or whatever else is listed in the
firewall. Despite such surprise, the negated rule matches on the packet for
the field. (In the example, rule 10000 matches on the "unexpected" packet for
the field "IP-SourceAddress"). So a "REST of the domain" list is needed for
field "IP-SourceAddress" (cf section 3.2).

So, in the negated field, the rule with the negated value has to appear in all
other explicit values' lists not intersecting the negated value and also in
the "REST of the domain" list. Going back to the beginning of this section,
why has the rule to appear in the negated value's list, if it is precisely
that case that shouldn't match? The negated rule can merely be omitted in its
negated value's list. In the example above,

     - IP source Address (that is the negated field in rule 10000)

          - "REST of the domain" entry --> ( ..., 10000, ...)

          - search data structure
                       |
                  /* original version: complex, with a bitmask */
                       |
                  192.168.12.0/25         /* the negated value */
                       |
                       --> list( ...X, 10000, Y,...)

                  /* new version: simpler: negated rule doesn't appear
                   * in its own negated value's list */
                       |
                  192.168.12.0/25 --> list( ...X, Y, ...)

Therefore, the field "fw_notflag" in "struct ip_fw" is unnecessary for the
problem of this section.

That's the solution proposed for the negated value problem:

  - to employ a "REST of the domain" list for (only) the negated field; and

  - with respect to the negated field, to put the rule with the negated value
    in:

      o the list of every other explicit value that doesn't intersect the
        negated value; and also in

      o the list of the "REST of the domain" for that field.

(A bitmask for representing which fields are negated and which are
"positively" required is not needed.)

There was a critic in section 3.2 against the "REST of the domain" list in
using it for solving the "irrelevant field" problem. It was that the rules in
that "REST" list (which were those where the field was irrelevant for them)
can be included again in every explicit values' list, so these lists can be
longer and this makes the merge analyze more rules. That critic also applies
here to this "REST of the domain" list for the "negated values" problem, but
noting that negated values in rules are not so common in practice as
irrelevant fields, so the "REST of the domain" list, containing only the rules
that negate its field, should be shorter than that in the "irrelevant field"
counterpart.

For example, this brief IPFW2 rule

         deny proto icmp

on the "IP-protocol" field only, is making all other fields in a packet
irrelevant for matching the rule. For the "irrelevant field" problem, this
rule would have to be in every list of every irrelevant field in it if a
relevant fields bitmask weren't used. (E.g., the field "IP source address"
would have to have this rule in its explicit values' and "REST" lists.) For
the "negated value" problem, this rule doesn't have to be included, since it
doesn't have any negated value (and if so it were, then the rule would be
duplicated in some lists in the negated field's data structure only, not in
every other field's).

    NOTE: about negated values and IPFW2 OR-blocks

A natural constraint in any algorithm pretending to be compatible with the
existing sequential search, is that the former has to match the same rule that
the latter with the same input. E.g., consider

    allow src-ip not 192.168.12.0/24 or src-ip not 192.169.13.0/25

If the source IP of the packet is 192.168.13.1, should it be allowed?

This rule shouldn't be in the list of the node 192.168.12.0/24 or in that of
192.169.13.0/25 either, because they are negated there; or to the contrary,
because both masks don't intersect each other and the subsequent cross-over,
the rule should be in both lists. Which to choose? The answer is: choose the
one that coincides with the existing sequential search.


SECTION 3.4. DEALING WITH RANGES AND MASKS OF VALUES

The problem of dealing with ranges and masks is going to be reduced to how to
represent them, in order that step 2 (the searches) returns the associated
list, and no particular processing is done by step 3 (merge) or 4 (apply
rule). (Note that the problem of irrelevant fields -section 3.2- was solved by
the latter approach -in the merge-, while the negated values one was by the
former method -in the data structure and the searches.)

Ranges are dealt first. Let all the values mentioned in a firewall for a
field, including those limiting its ranges used in the rules, be put in order:

   |------------X--------------X------X--.....--X----------|
domain-min    value1         value2 ...      valueN     domain-max
               (values mentioned in the firewall)

Let's call a "minimum segment" that between two consecutive values in the
domain, without including any of both that is also used as a value alone in
the fw. (E.g., a minimum domain would be, in general, a segment from
"value(I)+1" to "value(I+1)-1" where the sum/rest of 1 depends on whether the
value appears in the fw as a value alone or only delimiting a range; if
"domain-min" and "domain-max" are not mentioned in the rules, then another two
minimum segments would be:

     - from "domain-min" to "value(1)-1"
     - from "value(N)+1" to "domain-max"

where there is a similar criterion to the one exposed above on whether to sum/
/rest 1 from a value delimiting these extreme minimum segments.)

Let a B-tree where each interior node is a value, and each leaf is either a
value, or a minimum segment. E.g., in the port numbers domain -supposing the
values used in the fw for this field are 20, 21, 22, 25, 80, 110, 443, 514,
1024 and 3128- the B-tree might be:

                                  443
                     <             |            >
                ------------------------------------
               |                                   |
              80                                   1024
               |                                   |
          -------------                  ---------------------
          |           |                  |                   |
         22           110              514                   3128
          |           |                  |                   |
     ----------   ----------        -----------         -----------
    |         |   |        |        |         |         |         |
   20         25  81-     114-    444-        515-     1025-     3129-
    |         |    -109    -442    -513        -1023    -3127     -65535
--------   -------
|      |   |     |
1-     21  23-   26-
-19         -24   -79

In the example above, any node of the form "A-B", where A and B are two
numbers, is a minimum segment, and note they are all leafs in the tree.

In practice, there is not need to represent all minimum segments of the set of
single values in the firewall for the field. They have to be made only if they
are inside of a range specified by a rule, e.g., in

               minimum         minimum
                segment A       segment B
        ... X---------------X----------------X ...
         a range's          |            the range's
         min value          |            max value
                            |
                  another value in the
                  fw that is incidentally
                  inside the range.

there are only two segments, and if there isn't any other range in the fw for
the field, there won't be more minimal segments required. (In other words, the
domain of the field doesn't have to be represented into a full partition: a
minimum segment only have to be represented in a leaf in the B-tree if it is
inside a range mentioned by a rule.) (A "REST of the domain" entry might be
appealed for the other, meaningless segments in the domain.)

    In the example, if the only range mentioned by the firewall rules
    is that from 1024 to 65535, then the only minimum segments required
    in the B-tree are 1025-3127 and 3129-65535.

Back to the B-tree. The fw-rules in the list associated to a node in the
B-tree are those such that the node's value (or minimum segment) is equal to
or *contained in* the explicit value or range for the field in them. Note
that, since a real range in a fw-rule may be built up of several simpler
segments in the B-tree, then a same fw-rule may be in several nodes' lists at
the same time. E.g., in the tree above and the rule,

       add 20000 allow tcp from my-net to any 1024-65535 setup ...

    the rule's range, 1024-65535, is composed of several nodes in the tree
    above -such nodes are: 1024, 3128, 1025-3127 and 3129-65535. Therefore,
    rule 20000 has to be in the lists of these four nodes.

There are also many solutions possible to represent masks of values. For
example, they can be converted into ranges of values. What is peculiar to
masks, though, is that, for each bit in one of them:

 o  if the bit in a position in the mask is 1, then the corresponding bits in
    its masked value and the real value have to be equal;

 o  if the bit in a position in the mask is 0, the corresponding bit in the
    real value is ignored.

>From this analysis, a pair (mask, masked-value) defines a kind of string
matching, IMHO, in the form of a regular expression of the form of

                      ([01]*.*)*

where the whole length of the pattern is known before-hand. E.g.,

                       192.168.0.0/255.255.0.255

means the following binary pattern:

           11000000'10101000'........'00000000      (length == 32)

where "." matches any bit-value, and "'" is used to separate the bytes.

(Of course, the term "string matching" refers to this class of problems, no
conversion of the real value to a string should be necessary.)

What can be extracted from a firewall is then a set of masks (patterns) and
the problem is matching the real value in this set of patterns. One solution
to this problem is to use a trie of keywords and to build a finite-state
machine to recognize the patterns in the set. (More properly, a solution for
this search in a set of masks **might** be obtained using ideas from the
string matching algorithms.)

Trouble is, first, what have to be matched are patterns, not keywords.
(Although they aren't general reg-exs, they just specify a bit or ignore it.)
Second, ranges of values can also be extracted from the firewall for the same
field. It is difficult to represent a range of values into the finite-state
machine to recognize it. A solution could be to transform the range of values
into a set of masks, but this would augment the number of states in the
automata.


SECTION 3.7. FREQUENTLY-MATCHED RULES NEAR THE START OF THE RULES LIST

Some issues derived from the syntax of a rule has been exposed till now
because they should be taken into account. The order of the rules in the
overall fw-rules list ("ip_fw_chain_head") is another thing that affects the
efficiency of the algorithm. For example, it is common that the first rules in
the list are frequent matches, and they only need a couple of fields for their
matching. The opportunity is to improve the algorithm using these statistical
facts.

     The template given in "/etc/rc.firewall" -extended to deny the
     IANA reserved networks and other measures-, begins with the rules

           allow ip from any to any via lo0
           deny ip from any to 127.0.0.0/8
           deny ip from 127.0.0.0/8 to any
           deny ip from ${inet} to any in recv ${oif}
           deny ip from ${onet} to any in recv ${iif}
           deny ip from any to 10.0.0.0/8 via ${oif}
           deny ip from any to 172.16.0.0/12 via ${oif}
           ... deny networks not assigned by IANA to the public via ${oif}
       *** divert 8668 ip from any to any via ${oif}
           deny ip from 10.0.0.0/8 to any via ${oif}
           deny ip from 172.16.0.0/12 to any via ${oif}
           ... deny networks not assigned by IANA to the public via ${oif}
       *** allow tcp from any to any established  # or check-state
           ...
           ... rest of the rules would follow

     where the "***"-marked rules are frequent hits under normal
     circumstances, and only a couple of fields are relevant for all these
     rules. The point here is not whether the template is used, is that,
     since the frequently-matched rules should tend to be as close as possible
     to the beginning of the list of rules to improve the existing sequential
     search algorithm, and since the lists indexing **all** the fields in the
     packet don't need to be known there, the point is how to use this
     ordering.


SECTION 3.7.1. A FIRST APPROACH USING A PARTIAL MERGE

The simple use here is to limit the searches on the fields of the packet (step
2) -where their lists are obtained-, to only those fields that are strictly
relevant, and then merging the lists obtained so far. The basis for this
laziness is that the searches in the data structures *might* be more time-
consuming than the merge (?).

But a problem arises when only the lists of a subset of fields are merged: the
case where, even if a feasible -though partial- minimum is produced by the
merge, there is a rule before that feasible minimum that doesn't need any of
these known fields, so that rule isn't included in any of the lists obtained
so far. Therefore, the merge of the subset of lists will never give that rule
as a feasible minimum, silently ignoring it. E.g., in

        allow tcp from any          to any 25 setup ...
        allow udp from any          to any 53 ...
        allow ip  from trusted-net  to any
        ...

     the third rule doesn't need any of the fields relevant for its two
     previous rules. If only the lists of the fields relevant for the
     first two rules were merged, the result (coincidence) would always
     be different (perhaps greater) than the third rule.

In short, the set of rules possible to be returned from a partial merge can be
discontinuous in the whole list ("ip_fw_chain_head") of rules, IMHO.

A solution to this problem is that the partial merge be done and its result
known anyway, but each rule before the computed, partial minimum in the
"ip_fw_chain_head" list has to be visited to make sure that the rule doesn't
have a whole different subset of relevant fields. In other words, to keep
visiting each successive rule, skip the unfeasible ones, and if feasible
-"feasible" according to the current minimum and the lists merged that
coincided in it-, then to search for the lists of its still-unknown relevant
fields and to merge with the previously-known lists, until one of these newer
lists proves the rule doesn't match, or all the lists of its relevant fields
prove the rule is matched:

     for each successive rule in the "ip_fw_chain_head" list:
     begin

         if the rule is unfeasible according to one of its relevant fields
                   which is already known and merged into the partial minimum,
            then continue with the next rule in "ip_fw_chain_head".

         for each still-unknown field, relevant for the rule:
         begin
              search for the value of the field in the packet and obtain its
                     list of matching rules (make the field "known")

              merge this list with the lists already known and obtain a
                     partial minimum (coincident rule) in all them

              if the resulting rule (the partial minimum) proves this rule is
                            unfeasible,
                 then continue with the next rule in "ip_fw_chain_head"
         end-for

         the rule has been satisfied by all its fields: apply it on the packet
                (the transversal may finish here if it is an ipfw(8)-"the
                search terminates" rule
                else
                   the search on "ip_fw_chain_head" has to continue,
                       discard current minimum because it was applied on the
                       packet, and merge all known lists again to obtain a
                       new coincident minimum rule.)

     end-for

An implementation is in Annex 2.

   (Note: this solution also solves the problem of in what list associated to
    the data structures a rule without relevant fields at all should be
    stored? (cf section 3.2) Examples of rules without relevant fields are:

         check-state
         count  ip from any to any
         deny   ip from any to any   # the default fw-rule

    Note that their "rfm" -relevant fields mask- is 0.

    The answer in this section, by the previous algorithm, is that these rules
    aren't in any list in the search data structures, they are only referenced
    through the main "ip_fw_chain_head" list, where they will be visited if
    necessary. I.e., these rules are not indexed by the search data
    structures.

    The next algorithm also preserves this situation for these rules.)


SECTION 3.7.2. ANOTHER APPROACH USING THE FIELDS' STRUCTURE AMONG THE RULES,
               WITHOUT A MERGE OF THE INDEXING LISTS

There is a different solution, that while it keeps restricting its analysis to
only the fields relevant for the rule, it doesn't do a real merge of the known
lists to obtain the next feasible -though partial- minimum. Instead, it
recourses to a simpler analysis for not visiting each successive rule, really
jumping over some of them. In a way, it tries to improve the
"lookup_next_rule" function, defining which rule in "ip_fw_chain_head" will be
the one analyzed in the next iteration, by using the lists of matching rules
returned by the independent searches for the values in the packet (step 2).

To explain this variant, a new concept has to play its role here. It is that
of the longest (forward) continuous sequence of rules that share a relevant
field with the current rule (the base point of the sequence). The concept is
intuitive, but may be formalized as follows:

  1.- the longest continuous sequence for a field that starts in a given rule
      is only defined if that field is relevant in that rule.

  2.- if the list of rules consists of only one rule, the longest continuous
      sequence for each of its relevant fields is just the rule itself. (I.e.,
      the sequences are of length 1.)

  3.- if the list of rules consists of more than one rule, and taking without
      losing generality the base point as the first rule in the list, then the
      longest continuous sequence for each of the base point's relevant fields
      depends on whether the field is also relevant for the rule after the
      base point (it would be the "second" rule in the list starting at the
      base point):

         3a.- if the field is not relevant in the next ("second") rule, then
              the sequence is just the first rule itself (the base point), as
              in point 2.

         3b.- if the field is also relevant in the next ("second") rule, then
              the sequence for the field starting in the base point is
              composed of the base point followed by the longest continuous
              sequence for the field that starts at the next ("second") rule.

   Example: in these three rules,

              allow tcp from my-net       to any   setup
              allow udp from another-net  to a-net ...
              deny  ip  from another-net2 to another-net3

     taking as the base point of the sequences the first rule, the
     longest continuous sequence for the field:

       o  "IP-protocol" ends in the second rule;
       o  "Ip-SourceAddress" ends in the third rule;
       o  "TCP-flags" field is composed of just that first rule;
       o  there is no sequence for field "IP-DestinAddress" starting
          at the first rule, since that field isn't relevant there.

The concept is structural to the list of rules, it doesn't depend on a real
network packet.

A practical representation for the longest continuous sequence is not
explicitly by the sequence per-se, but implicitly by giving which is the rule
that breaks -ends- the sequence by not having the field as relevant (point
3a.). (I.e., the rule immediately following in "ip_fw_chain_head" that ending
the sequence.) This breaking-rule always exists, because it is at least the
default fw-rule, where all fields are irrelevant (by point 3a, it breaks any
sequence that reaches just before it). For this reason also, there is no
longest continuous sequence starting at the default fw-rule for any field
(point 1 in the definition). (I.e., the default fw-rule is unable to be a
base-point because it doesn't have any relevant field -point 1 in the
definition above.)

An implementation of how to build these sequences for a base rule is given in
Annex 3.

The usefulness of this concept is related to jumping in a safe way over some
rules in the "ip_fw_chain_head" list using the known lists for the relevant
fields of the rule being analyzed by the loop. "Jumping in a safe way" is
jumping over some rules without missing a still-feasible rule. This is
detailed here.

Suppose the current rule in the main iteration on the "ip_fw_chain_head" list
is "R", the transversal has to continue, and that the next rules in the
relevant lists for "R" are "N[1]", "N[2]", ... "N[m]". (I.e., "N[i]" is the
rule in the head of the relevant list for field "F[i]" in "R" that matches the
value in the packet for "F[i]", and further, that "N[i]" follows "R", i.e.,
"N[i].fw_number > R.fw_number" since the iteration has to continue after "R".)
Let "B[i]" be the rule that breaks the longest continuous sequence for field
"F[i]" that start at rule "R".(Of course, "B[i].fw_number > R.fw_number", by
point 3 of the definition of a longest continuous sequence for field "F[i]".)

Suppose "J" is a rule after "R". May a jump be done in a safe way from "R" to
"J"? What is the safe "J" that is farthest away from rule "R"?

The solution is in the longest continuous sequences starting at "R" for its
fields. Consider a rule "N[i]" -that is ultimately taken from the value of
"F[i]" in the network packet:

 o if "N[i]" is inside the longest continuous sequence of its field "F[i]",
   then jumping directly to "N[i]" from "R" is safe because all rules from
   "R->next" to the previous to "N[i]" require field "F[i]" by definition of
   the longest sequence, but they won't be a coincidence because the next
   matched rule according to the value of field "F[i]" in the packet is
   "N[i]". (I.e., in this case, "N[i]" is the nearest rule from "R" that is
   still feasible, according to field "F[i]".)

 o by the same reason, if a "N[i]" is outside its sequence, all the rules in
   the longest continuous sequence starting at "R" of its field "F[i]" may be
   ignored because the field "F[i]" in the packet matched on rule "N[i]"; but
   the jump can be no further that sequence, because field "F[i]" is
   irrelevant for the rule after this sequence (its breaking-rule "B[i]").
   (I.e., in this case, "B[i]" is the nearest rule from "R" that is still
   feasible, according to field "F[i]".)

Generalizing this reasoning, a safe jump may be to the farthest -greater
"fw_number"- of the set of these rules:

 - the "N[i]"s that are inside the longest continuous sequences starting at
   "R" for their fields "F[i]" (N[i].fw_number < B[i].fw_number);
 - the "B[i]"s for those "N[i]"s outside the longest continuous sequences
   starting at "R" for their fields "F[i]" (N[i].fw_number >= B[i].fw_number).

analyzed for every field "F[i]" relevant at "R". Farther than this maximum, it
is unsafe, i.e., the jump may be over a feasible rule. In a simple expression:

    farthest safe jump = max( { min( N[i], B[i] ) } )

where the "i" ranges over the relevant fields of "R" whose lists are known.
The "ip_fw_chk" function is then:

     while (TRUE)
     begin

         for each of the relevant fields for the rule:
         begin
              if its list of matching rules for the value of the field in the
                     packet is still unknown,
                then
                    search for the value of the field in the packet and obtain
                           that list (make the field "known")
                else
                    get its already-known list from the array-cache

              skip (sync) the rules in (the starting sublist of) this list
                   that are lesser ("fw_number") than the rule being analyzed

              if the rule in the head of this list coincides with the rule
                       being analyzed
                 then
                      good, another coincidence on this rule
                 else
                      next rule to analyze = farthest, safe jump from this
                                               rule that was being analyzed
                      continue main "while" loop on this new rule
         end-for

         the rule has been satisfied by all its fields: apply it on the packet
                (the transversal may finish here if it is an ipfw(8)-"the
                search terminates" rule
                else
                   the search on "ip_fw_chain_head" has to continue,
                       next rule to analyze = farthest, safe jump from this
                                                 rule that was applied on the
                                                 packet
                       continue main "while" loop on this new rule)

     end-while

In Annex 4 there is an "ip_fw_chk" function using this algorithm.

(Note, though, that the farthest, safe jump approach is very dependent on how
long a continuous sequence of rules for a field is, and may be innocently
torpedoed by breaking the continuous sequence with the insertion of a rule
with a different set of relevant fields, or none at all, e.g., by inserting
this rule

        count ip from any to any          # relevant fields == none

in the middle of the continuous sequence for a field.)


SECTION 3.7.3. CONSOLIDATING BOTH VARIANTS OF THE ALGORITHM

Comparing both algorithms (section 3.7.1/annex 2 vs section 3.7.2/annex 4),
the PROS of each are:

 o The merge

   - The merge has a better point of analysis for discarding rules before
     "real_m", since it takes into account -in the merge- all the known lists,
     not only those relevant at current, base point, rule.

   - Its simple comparisons in UNFEASIBLE are done at the beginning of the
     loop on the current "target_rule", continuing immediately with the next
     rule in "ip_fw_chain_head" if appropriate.

 o The farthest, safe, jump

   - It jumps over some continuous sequence of rules, not needing to visit
     each successive rule testing whether is feasible (and analyzing it) or
     not.

   - It's simpler, but nevertheless, tries to search for a list of a unknown
     field favoring those fields that may give the farthest, safe, jump.

So, a natural step is to combine both algorithms, the merge and the farthest
safe jump, basically by leaving the latter compute which rule should be the
next in the iteration on "ip_fw_chain_head", and the former -through its check
UNFEASIBLE- to quickly discard that jump, given all the known lists it has
merged.

In practice, both algorithms fit especially nicely, IMHO, into the beginning
of the list of rules in the existing template in "/etc/rc.firewall":

     (a)   allow ip from any to any via lo0
     (b)   deny ip from any to 127.0.0.0/8
     (c)   deny ip from 127.0.0.0/8 to any
           deny ip from ${inet} to any in recv ${oif}
           deny ip from ${onet} to any in recv ${iif}
     (d)   deny ip from any to 10.0.0.0/8 via ${oif}
     (d)   deny ip from any to 172.16.0.0/12 via ${oif}
     (d)   ... deny networks not assigned by IANA to the public via ${oif}
     ***   divert 8668 ip from any to any via ${oif}
     (e)   deny ip from 10.0.0.0/8 to any via ${oif}
     (e)   deny ip from 172.16.0.0/12 to any via ${oif}
     (e)   ... deny networks not assigned by IANA to the public via ${oif}
     ***   allow tcp from any to any established  # or check-state
           ...

In the rules (a), (b) and (c), the lists of the rules matching the fields
"Input-Interface" (maybe "Output-Interface" if exists), "IP-DestinAddress" and
"IP-SourceAddress" in the packet, have to be known in both algorithms. In the
long sequence of rules in (d) before the "divert natd" rule, the required
fields are the same and are already known, and:

 o   UNFEASIBLE in the merge algorithm may quickly decide if each rule in (d)
     has to be immediately discarded; and
 o   the farthest, safe jump may skip over the whole sequence of (d)s to the
     "divert natd" rule directly.

The same applies to the long sequence of (e) before the other frequently-
matched rule. (We may be even luckier, if the packet doesn't refer to the
interface "${oif}", required for the match of the sequence (d), the "divert
natd" rule, and the sequence (e), because, if so, the farthest safe jump may
be directly to the "allow tcp ... established" or "check-state" rule.)


SECTION 3.8. MULTI-PASS FIREWALLS

   (Luigi Rizzo -et al.- may especially help a ^^LOT^^ here.)

Advanced firewalls can recourse to multiple passes on the list of rules (e.g.,
on "ip_fw_chain_head") before deciding finally to accept or not the packet. At
the end of each intermediate pass, the matched rule in it points to another
kernel-module which has to process the packet. In a way, a rule in the
firewall, re-using its matching capabilities, instructs it to "route" the
packet through another kernel-module for additional processing, and after
that, another pass in the firewall may begin. (Cf. Luigi Rizzo's work.) This
technique is present in "ipfw(8)" to implement:

 o   dummynet (temporary transfer of control to it), if the sysctl-var
     "net.inet.fw.one_pass" is not set (if set, then the dummynet's action
     means "[pipe/queue] and accept the packet");

 o   packet tee'ing, if the search had to continue for action "tee".

A processing loop for the packet in "ipfw" is avoided by transferring the
state of the search between successive passes. E.g., in the current
implementation, this is attained by saving the rule where the previous pass
ended, and then, in the next pass in "ip_fw_chk", resuming processing of the
packet in the next rule. The state of the search between passes is saved in a
local variable in the stack frame of a caller function of "ip_fw_chk", and a
pointer to it is passed down to 'ip_fw_chk".

In the algorithm shown in this paper, the state may also be saved and
transferred between successive passes in the firewall.

If a merge is used, then after the rule was applied on the packet but before
the pass returns, the next "real_m" **could** be calculated for the next pass
(the old "real_m" might be the rule just applied and if so, has to be
discarded):

    ... /* refer to Annex 2 */

    decision_found = apply( target_rule, packet, &ip_fw_chk_returnval );

    if( ( ! decision_found  /* the search has to continue in this pass */
          ||
         ( fw_one_pass == 0 &&      /* there will be a return and then ... */
           ( ip_fw_chk_returnval &           /* ... another pass in the fw */
                          (IP_FW_PORT_DYNT_FLAG|IP_FW_PORT_TEE_FLAG)))
        &&
        ( real_m.fw_number == target_rule->fw_number ) ) {

        /* refer to the existing code in "ip_fw_chk" in Annex 2 to pop a
         * feasible minimum from the stack if possible, and then merging all
         * known lists.
         */
         ...
         ...
    }
    if( decision_found )  return ip_fw_chk_returnval;

There would be a minor change in semantics in this code to what is done in the
current implementation, where "fw_one_pass" -giving whether the firewall is
multi-pass or not- is only checked at the beginning of the new pass. The
difference, in practice, only appears on packets that are being processed just
when a change to "fw_one_pass" happens between the passes, a random event.

The alternative, also viable, is to update "real_m" in the beginning of the
new pass, if it is equal to the "last_rule_processed" by the previous pass.
The code to be inserted at the beginning of "ip_fw_chk" to update it is very
similar to what is done at the end of it (pop from the stack and merge the
lists).

In either case, if the merge is used, the state to be saved between passes is:

     struct  state_of_the_search {

              struct ip_fw          * last_rule_processed;
              struct list_entry     * ordered_list_for_field[ MAX_FIELDS ];
              struct minimum_entry    minimums_stack[ MAX_FIELDS ];
              u_int32                 merge_done_mask, empty_lists_mask;
              struct t_real_m         real_m;
              u_int8                  is_real_m_set, old_fw_one_pass,
                                      map_size, map[ MAX_FIELDS ];
     };

Note: if a new "real_m" is computed only at the beginning of the new pass,
instead of at the end of the previous, the members "real_m", "is_real_m_set"
and "old_fw_one_pass" are unnecessary in the saved state.

In the version of Annex 4 (with the farthest jump, without any merge at all),
the state to be saved is simpler:

     struct  state_of_the_search {

              struct ip_fw          * last_rule_processed;
              struct list_entry     * ordered_list_for_field[ MAX_FIELDS ];
              u_int32                 search_done_mask, empty_lists_mask;
     };

In either implementation, what has to be done at the beginning of "ip_fw_chk"
to resume a pass consists mainly in:

    if( state->last_rule_processed == NULL )      /* first pass */
             target_rule = LIST_FIRST( &ip_fw_chain_head );
    else {
             /* a successive pass */
             if ( fw_one_pass )        /* or "state->old_fw_one_pass" */
                    return 0;          /* multi-pass is OFF */

             target_rule = LIST_NEXT( state->last_rule_processed, next );

             /* Remember the note above, if the merge of the lists is used
              * and "real_m" was not computed at the end of the previous pass
              * and saved in the state. The code mentioned there to compute
              * the new "real_m" has to be put here.
              *
              * (If that is the case but "real_m" weren't updated here, since
              *  the old "real_m" would be "last_rule_processed", then
              *
              *         real_m.fw_number < target_rule->fw_number
              *
              * and then the UNFEASIBLE macro wouldn't work. Consult the
              * comment before the definition of UNFEASIBLE in Annex 2.)
              */
    }

    ... main FOR loop on "target_rule" (the same that in Annex 2) ...

A problem with the approach of transferring the state between the passes of
either variant of this algorithm is the big size of the state to be saved.
Some members in the structure could be discarded at the end of a pass, not
saving them (specially if a merge is used) and then easily calculated again in
the next pass but the size is anyway between 100-500 bytes. For example:

    struct tiny_stack_entry {
           struct ip_fw * minimum_rule;
           u_int8         map_index;
           /* Don't save the copies "fw_number" and "rfm" in the entry */
    };

    struct state_of_the_search {

           u_int16                   last_rule_processed_fw_number;
           struct list_entry       * ordered_list_for_field[ MAX_FIELDS ];
           struct tiny_stack_entry   minimums_stack[ MAX_FIELDS ];
           u_int8                    map[ MAX_FIELDS ];

           /* Perhaps "is_real_m_set", "real_m" and "old_fw_one_pass" might
            * be here too.
            */
    };

    ...
    ...
       /* In the beginning of "ip_fw_chk", if a next pass is detected */
    else {
             /* a successive pass */
             if ( fw_one_pass /* "state->old_fw_one_pass" */ ) return 0;

             target_rule = locate_in_the_ip_fw_chain_head_list_the_rule_after
                                   ( state->last_rule_processed_fw_number );

             merge_done_mask = empty_lists_mask = map_size = 0;
             while( map_size < MAX_FIELDS &&
                    ((field_no = state->map[map_size]) != FIELD_NOT_ASSIGNED))
             {
                 merge_done_mask |= 1<<field_no;
                 if( state->ordered_list_for_field[map_size] == NULL )
                      empty_lists_mask |= 1<<field_no;
                 map_size ++;
             }

             ... /* next, "real_m" would be computed if not updated and
                  * transferred from the previous pass. */
    }

    ... main FOR loop on "target_rule" (the same that in Annex 2) ...

where "map_size", "merge_done_mask" and "empty_lists_mask" are inferred at the
beginning of the new pass. (The state didn't suffer a significant shrinking,
though, and also ignoring issues about the alignment of types in memory.)

So, what could be done with the size of the state is to pray that what was
consumed in stack memory can be returned sooner by a faster search (?).


SECTION 4. FINAL WORDS

Not every field available for being matched by the firewall has to be indexed.
If a field is not indexed -i.e., hasn't a search data structure on its values
and lists associated to them- then its matching is left to step 4 (apply the
rule, cf. section 1.1), before the action happens -i.e., its matching is done
in step 4 before everything there.

Not indexing every field available makes the value of MAX_FIELDS lesser, and
thereafter, the memory consumption by the arrays will be lower.

Really, if no fields is indexed (MAX_FIELDS == 0), then all the matching would
be done to step 4 -really, there wouldn't exist step 2 (independent searches)
or step 3 (merge)- and the code would be very similar to the existing
sequential search.

Finally, each search in the data structure of a field has a different cost,
according to the implementation of this data structure (hash, B-tree, etc.),
and to how large the domain of the field is. Which field in the packet should
be searched for first and which next may be influenced by the estimated cost
of the search.


ANNEX 1. THE MERGE ALGORITHM (OF SECTION 3.2)

This is the merge algorithm (step 3- extract the coincidences) when a mask
"relevant_fields_mask" is attached to each fw-rule indicating which fields are
required (for comparison against the network packet's) and which fields are
irrelevant (cf section 3.2). This flag is a kind of "fw_flg" and "fw_ipflg",
but only for needed fields, without the firewall command and other bits.

Receives the array of lists "ordered_list_for_field" from step 2 (searches)
(cf. section 2).

Each possible field in the packet is represented by a bit mask. I.e., if the
field has index "n" in the search results array "ordered_list_for_field", then
the mask "1<<n" is also used to refer to that field, where appropriate.

    /***********************************************
     ***                                         ***
     *** MERGE PHASE (step 3)                    ***
     ***                                         ***
     ***********************************************
     ***/

static struct list_entry *
merge_and_extract_coincidences( const struct list_entry **
                                ordered_list_for_field )
{

    struct list_entry * ordered_common_sublist;

    u_int32 empty_lists_mask, /* the mask of all the lists that have been
                               * emptied in the transversal in the merge
                               */

          rule_minimum, /* the current -partial- coincidence fw-rule# */

          occurrence_mask,  /* the mask for all the fields where the current
                             * coincident rule has been seen -at their lists'
                             * heads, of course, since the lists are ordered
                             * and the current coincidence is the minimum seen
                             * so far
                             */

          min_relevant_fields,  /* the relevant fields mask for that tentative
                                 * coincident fw-rule (the current minimum
                                 * seen so far)
                                 */

          field_mask,  /* a mask representing the field for the iteration */

          current_rule, current_rule_ptr, current_relevant_fields;
                    /* the corresponding variables for the current rule under
                     * exam by the loop
                     */

    int field_no;        /* the main iterator variable */

    struct ip_fw * rule_min_ptr;     /* the current minimum
                                      * ("struct ip_fw" * version)
                                      */

#define ADVANCE_HEAD_OF_LISTS(occurrence_mask)                     \
              for (u_int f=0, mask=1;  f < field_no;               \
                   f++, mask <<= 1 ) {                             \
                      if( mask & ( occurrence_mask ))              \
                           ordered_list_for_field[f] =             \
                                ordered_list_for_field[f]->next;   \
              }

    /****************************************************/
    /* Begin of the merge                               */

    ordered_common_sublist = NULL;
    empty_lists_mask = 0;

lookup_for_a_new_minimum:  /* a goto label- there are only two in this code */

    rule_minimum = IPFW_DEFAULT_RULE; /* no minimum so far */
    rule_min_ptr = NULL;
    min_relevant_fields = 0;
    occurrence_mask = 0;    /* no minimum, no occurrence */

    field_no = -1;           /* start again from field index == 0 */

    while ( ++ field_no < Packet_protocols_Qtty_fields ) {

 field_mask = 1 << field_no;   /* the mask for the field */

        if( empty_lists_mask & field_mask != 0 )
            /* this list had been already detected to be empty */
            continue;

    try_next_value_in_this_list:        /* the other goto target */

        if( ordered_list_for_field[field_no] == NULL ) {
            /* this list has been emptied */
            empty_lists_mask |= field_mask;

            /* don't waste time on the current minimum if this list was
             * relevant for it
             */
            if ( rule_minimum < IPFW_DEFAULT_RULE &&
                 min_relevant_fields & empty_lists_mask != 0 ) {
                 /* this field (list) was relevant for the current coincidence:
                  * discard this partial coincidence
                  */

                 /* "11111111" */
                 ADVANCE_HEAD_OF_LISTS( occurrence_mask );

                 goto lookup_for_a_new_minimum;
            }
            else
                 /* this list belongs to a field irrelevant for the current
                  * minimum (partial coincident rule)
                  */
                 continue;  /* ignore this list for this minimum */
        }

        assert( ordered_list_for_field[field_no] != NULL );

        /* Get the fw-rule pointed to by the head of this list */

        current_rule_ptr = ordered_list_for_field[field_no]->rule;

        current_rule = current_rule_ptr->fw_number ;
        current_relevant_fields = current_rule_ptr->relevant_fields_mask;

        if( current_rule < rule_minimum ) {

            /* a new minimum for the merge: replace the new minimum, IF
             * POSSIBLE, according to circumstances
             */

            /* Useful to keep in mind: since field_mask is 1<<field_no, it is
             * a power of 2, so (field_mask - 1) represents all the previous
             * (lesser "field_no") we have already visited
             */

            if( current_relevant_fields & ( field_mask - 1 ) != 0
                /* the new minimum required a previous field whose list didn't
                 * have that new minimum in the list's head value
                 */
                ||
                current_relevant_fields & empty_lists_mask != 0 )
                /* the new minimum required a field whose list is already
                 * empty
                 */
            {
                /* there are no conditions to set up this new minimum: its
                 * requirements won't be met
                 */

                /* skip this useless minimum in this list */
                ordered_list_for_field[field_no] =
                                    ordered_list_for_field[field_no]->next;
                /* NOTE: we advanced our head pointer but didn't change the
                 * list in the search data structure
                 */

                goto try_next_value_in_this_list;
            }
            else {
                /* Yes, it is a new minimum and a possible coincidence */
                 /* "22222222" */
                rule_minimum = current_rule;
                rule_min_ptr = current_rule_ptr;
                min_relevant_fields = current_relevant_fields;
                /* init its occurrence mask */
                occurrence_mask = field_mask;
            }
        }
        else
            if ( current_rule == rule_minimum ) {
                /* good, another coincidence for the current minimum. May be
                 * useless (?), but we register it anyway
                 */
                occurrence_mask |= field_mask;
            }
            else {
                assert( current_rule > rule_minimum );

                /* Was this field relevant for the current minimum? */

                if( min_relevant_fields & field_mask != 0 ) {
                    /* sorry, it was relevant: discard the current minimum */

                   /* "33333333" */
                   ADVANCE_HEAD_OF_LISTS( occurrence_mask );

                   goto lookup_for_a_new_minimum;
                }
            }

    }  /* end of the WHILE loop */

    if ( rule_minimum < IPFW_DEFAULT_RULE ) {

        /* a new minimum found and with all its required coincidences */

        assert( min_relevant_fields & occurrence_mask ==
                                                  min_relevant_fields );

        /* emit this new minimum (its "struct ip_fw" pointer) for the common
         * sublist for step 4 (apply the rules on the packet)
         */

        append( ordered_common_sublist, rule_min_ptr );

        /* "44444444" */
        ADVANCE_HEAD_OF_LISTS( occurrence_mask );

        if( is_there_any_remaining_list( empty_lists_mask ) )
                goto lookup_for_a_new_minimum;
    }

    return ordered_common_sublist;
}
/* end of the merge phase */

There is an observation that may lead to improve the run-time and it is that,
when a new minimum is found that replaces an existing minimum, then the new
minimum can finally be appended to the common list or discarded, but anyway,
the replaced -old- minimum can be employed again because it still **is** the
minimum for the first lists checked until the other, newer and lesser, minimum
that replaced it. I.e., the replaced minimum should be saved somewhere, and
this minimum should be the first one to be resumed again -when a new *partial*
minimum is wanted for resuming the merge. Since the old minimum have to be
saved last but restored first, a LIFO policy, then a stack for saving the
replaced minimum may be an optimization, IMHO. Note that the contents of the
stack would be ordered by "fw_number" rule# from its top incrementally to its
bottom.

In the following patches, a stack "stack_minimums_history" is used, that would
be declared local at the beginning of the merge. The contents of an entry in
this stack are

  (old_minimum_ptr, occurrence_mask_til_replaced, field_where_it_was_replaced)

Excuse me very deeply for my poor form of back-referencing for patching:

  /* look for this comment in the code above:
   *         "11111111"
   * for the following patch */

  /* start context */
  ADVANCE_HEAD_OF_LISTS( occurrence_mask );

  /* changes start here */

  while( !empty( stack_minimums_history ) )
  {
       /* the current minimum can't hold its requirements */

              /* We try, with the extra visits we have done with the just
               * discarded minimum -before this "while" loop-, to discard also
               * the minimums we met in the first lists (lower "field_no"s)
               * and are in the stack -and were defeated by the minimum just
               * discarded- for not visiting them again. If one of those
               * previously met minimums (the lowest at the top of the stack)
               * can't be discarded because it has some chance to be a real
               * coincidence, then we instead resume visiting, with it as the
               * new minimum, from the "field_no"s where it was replaced and
               * pushed onto the stack.
               *
               * The alternative to this would be to optimistically continue
               * visiting the higher "field_no" lists hoping to meet a new
               * minimum there. The visiting of the whole array of lists would
               * be then circular, wrapping from
               * "Packet_protocols_Qtty_fields - 1" down to 0 inside the
               * "while" loop, without the need of a goto to
               * "lookup_for_a_new_minimum", and only finishing the "while"
               * when "empty_fields_mask" is all "1"s.
               *
               * Note that our choice, in other words, causes us to exhaust
               * first, if possible, the lower "field_no"s lists
               */

       pop( stack_minimums_history,
       &rule_minimum_ptr, &occurrence_mask, &field_no );

       min_relevant_fields = rule_minimum_ptr->relevant_fields_mask;

       if( min_relevant_fields & empty_lists_mask != 0 ) {
                     /* "empty_lists_mask" is the currently known empty lists
                      * mask, i.e., it was not popped from the stack
                      */
       ADVANCE_HEAD_OF_LISTS( occurrence_mask );
       } else
       break; /* We have just found a possible minimum */

  }

  if( min_relevant_fields & empty_lists_mask != 0 )
       /* the stack_minimums_history was emptied and none of its
               * recorded minimum could keep at least a hope according to the
               * already empty lists
               */
       goto lookup_for_a_new_minimum; /* from the beginning */
  else {
       rule_minimum = rule_minimum_ptr->fw_number ;
              field_mask = 1 << field_no;
              goto try_next_value_in_this_list; /* of the just popped
                                                 * "field_no" */
  }

  /* end of patch for "11111111" */
  ....
  ....


  /* look for this comment in the code above:
   *    "22222222"
   * for the following patch
          */

  /* save the minimum just to be replaced */
  if ( rule_minimum < IPFW_DEFAULT_RULE )
       push( stack_minimums_history,
                               rule_min_ptr, occurrence_mask, field_no );

  /* keep doing what was previously indicated, no change */
  rule_minimum = current_rule;     /* put the new minimum */
  rule_min_ptr = current_rule_ptr;
  min_relevant_fields = current_relevant_fields;
  /* init its occurrence mask */
  occurrence_mask = field_mask;

  /* end of patch for "22222222" */

  ....
  ....

  /* for the code after the comments "33333333" and "44444444" we have
          * to apply the same patch that for "11111111".
          */


The main (outermost) "while" in the merge -the one immediately after the label
"lookup_for_a_new_minimum"- always start with field "field_no = 0", and
increment it consecutively until "field_no" reaches the number of fields in
the packet's protocols. I.e., there is no wrapping of "field_no" inside the
"while" loop back to 0, this can only happen by a goto to
"lookup_for_a_new_minimum", before the "while", where "field_no" is reset to
0. In the previous patches, a goto to "lookup_for_a_new_minimum" happens
because the "stack_minimums_history" was emptied. So, combining both facts, it
can be inferred that the maximum depth that the stack may have is the number
of fields in the packet ("Packet_protocols_Qtty_fields"), that only occurs
when every new field (visited incrementally by its index) defines a new
possible minimum that replaces the immediately previous.  Thereafter,
"stack_minimums_history" may be declared in the code as a fixed-size
(pre-allocated) local array:

   typedef u_int  mask;

   #define MAX_FIELDS   ( sizeof(mask) * 8 )
            /* Really, the number of different fields in IPFW2 is 27, IMHO */

   struct  minimum_entry {
              struct ip_fw * partial_minimum;
              int            field_no_where_it_lost;
           }
              stack_minimums_history[ MAX_FIELDS ];


That avoids to call the heap allocation routines for the stack, that would be
relatively expensive since merely a network packet is being matched. I.e.,
from the perspective of the overhead incurred, the use of a stack is feasible,
IMHO.

(Also, a little patch to keep the algorithm's state clear in that label:

         lookup_for_a_new_minimum:

                  assert( empty( stack_minimums_history ) );

)

A simplification follows the use of the stack to save previous minimums. In
the code above, "occurrence_mask" is used to remember where the current
minimum, "rule_minimum", has occurred so far, in order to use
"ADVANCE_HEAD_OF_LISTS(occurrence_mask)" to discard that minimum when it
proves to be unfeasible. If another minimum improved (replaced) this old
minimum, then the occurrences of the old minimum couldn't be skipped in the
heads of its lists because that was the way to remember that it existed -i.e.,
to find it again in the first lists if needed to be.

But with the "stack_minimums_history", the previous minimum is saved (pushed)
into the stack, so it won't be lost -really, it may be popped out later-, so
not skipping the heads of the lists where that minimum occurred is a very
conservative move. That heads may be skipped, because that minimum won't be
lost, either discarded or saved into the stack. But if the heads may be
skipped, "ADVANCE_HEAD_OF_LISTS(occurrence_mask)" can't be used, and
"occurrence_mask" is useless. So, simply, where an assignment to
"occurrence_mask" happened to record in which lists the minimum occurred, now
it's replaced by an advance in the head of that list.


ANNEX 2. IP_FW_CHK FROM SECTION 3.7.1


struct t_real_m {
        u_int16        fw_number;  /* a copy of the rule's fw_number */
        u_int32        rfm;         /* and its relevant fields mask */
        struct ip_fw * fw_rule_ptr;  /* the rule in ip_fw_chain_head */
     };

struct minimum_entry {
        struct t_real_m  old_minimum;
        u_int8           map_index;   /* the "field_no" where it lost */
        u_int32          merge_done_mask;
     };


static void
merge( struct list_entry   ** ordered_list_for_field,
       u_int8                 starting_at_map_index,
       u_int8               * map,
       u_int8                 map_size,
       struct minimum_entry * minimums_stack,
       u_int8               * is_real_m_set,
       struct t_real_m      * real_m,
       u_int32              * empty_lists_mask,
       u_int32              * merge_done_mask
     );


/*
 * If "is_real_m_set" is TRUE ("real_m" has a value), then "real_m" has to be
 * the current rule being visited by the iteration, or greater than it.
 * ("real_m" may not be lesser than it because it should be then previously
 * discarded or applied on the packet, but nevertheless such "real_m" would be
 * history now.) In symbols, what was expressed here is that
 *
 *        rule_being_visited->fw_number <= real_m.fw_number
 *
 * if "is_real_m_set" is TRUE.
 *
 * Feasibility of rule being visited, i.e., that the rule may be a partial
 * coincidence when we sequentially transverse the "ip_fw_chain_head" list,
 * requires both ("rfm" is the "relevant fields mask" of the rule):
 *
 *  o  rule->rfm & empty_lists_mask == 0
 *
 *  o  rule->rfm & merge_done_mask != 0   ===> (implies)
 *         ===> is_real_m_set && real_m.fw_number == rule->fw_number
 *
 * These are necessary conditions for possible feasibility of a given fw-rule.
 * Really, the first condition is implied by the second, as can be easily
 * proved:
 *
 *       is_real_m_set && real_m.fw_number == rule->fw_number  ===>
 *                                   ===> rule->rfm & empty_lists_mask == 0
 *
 * otherwise the merge would have produced a feasible "real_m" with one of its
 * relevant list known and empty. But, in real-life, since the
 * "merge_done_mask" may be restored from the old minimums stack, but the
 * "empty_lists_mask" not (i.e., "empty_lists_mask" may be ahead of
 * "merge_done_mask" when the latter is retrieved from the stack), the first
 * condition is left explicitly checked.
 *
 * Since we try to discard a rule from analysis for coincidence as soon as
 * possible, we check for its unfeasibility, i.e., the negation of the above
 * conditions.
 */

#define UNFEASIBLE(r)                                                   \
       (                                                                \
            ( (r)->nfm & merge_done_mask != 0 &&                        \
              ( ! is_real_m_set || (r)->fw_number < real_m.fw_number )) \
          ||                                                            \
            ( (r)->nfm & empty_lists_mask != 0 )                        \
       )


static int
ip_fw_chk( struct ip ** pip, /* rest of parameters here ... */ )
{

u_int32   empty_lists_mask, merge_done_mask, preferred_fields, field_mask;
u_int8    is_real_m_set, map_size, map_index;

struct t_real_m   real_m;

struct list_entry    * ordered_list_for_field[ MAX_FIELDS ];
struct minimum_entry   minimums_stack[ MAX_FIELDS ];
u_int8                 map[ MAX_FIELDS ];

struct ip_fw         * target_rule;
struct list_entry    * list4value;  /* The list for the value of the field in
                                     * the network packet.
                                     */

/************************************************************/
/*
 *              Begin of the work of "ip_fw_chk"
 */
/************************************************************/

empty_lists_mask = 0;
merge_done_mask = 0;
is_real_m_set = FALSE;
map_size = 0;

target_rule = LIST_FIRST( &ip_fw_chain_head );

for(; targe_rule; target_rule = LIST_NEXT(target_rule, next)) {

    /* We cannot have a minimum "real_m" that is before this rule
     * "target_rule", because we should have applied that minimum when we
     * visited it, since all of its relevant fields had been merged into
     * "real_m" and hence assured it as a full coincidence.
     *
     *     is_real_m_set ===> (implies)
     *            ===> target_rule->fw_number <= real_m.fw_number
     */

    assert( ! is_real_m_set || target_rule->fw_number <= real_m.fw_number );

    /* We cannot have the "minimums_stack" non-empty if there is no best
     * current minimum set
     *
     *    ! empty( minimums_stack ) ===> is_real_m_set
     */

    assert( empty( minimums_stack ) || is_real_m_set );

    /* This is how "merge_done_mask" might be rebuilt by each iteration if
     * needed to be (since it's in practice preserved among the iterations and
     * by the minimums' stack).
     */

    merge_mask_temp = 0;                   /* just for the assertion */
    for( m=0; m < map_size; m++ )
         merge_mask_temp |= 1 << map[m];
    assert( merge_mask_temp == merge_done_mask );

    /* End of the assertions, back to work ... */

    if( UNFEASIBLE( target_rule ) )
        continue;   /* Nothing to do with this rule */


    /* We calculate which fields are pending for deciding whether this rule
     * applies or has to be discarded.
     *
     * There are two passes here over the set of relevant fields for this rule
     * "target_rule". In the first pass we try to analyze only those of its
     * fields that are also relevant for rule "real_m" but are still unknown,
     * because, as we know we have to search for some unknown fields for this
     * rule "target_rule", we prefer those that also affect "real_m".
     *
     * (Note: the condition
     *
     *            target_rule->fw_number != real_m.fw_number
     *
     * implies, by "! UNFEASIBLE(target_rule)", that
     *
     *         target_rule->rfm & ~merge_done_mask == target_rule->rfm
     *
     * i.e., there are no coincidences between both. But with
     * "~merge_done_mask" is clearer.
     */

    preferred_fields = (is_real_m_set &&
                               target_rule->fw_number != real_m.fw_number) ?
                   (target_rule->rfm & real_m.rfm & ~merge_done_mask) :
                   0;  /* else no preference */

    for( importance = 2; importance; importance -- ) {

       if( preferred_fields == 0 ) goto skip_FORloop_on_field_no;

       for( field_no = 0; preferred_fields; field_no ++ ) {

           field_mask = 1 << field_no;

           if( field_mask & preferred_fields == 0 )
               continue;  /* This field is not required by the rule or is
                           * not preferred in this pass by its importance */

           preferred_fields ^= field_mask;  /* clear its bit */

           /* Search this field in its data structure for the first and unique
            * time
            */

           list4value = search( field_no, packet->fields[field_no] );

           /* We don't know anything yet about the list just returned */
           assert( list4value == NULL || list4value != NULL );

           /* We store this list as a new entry in our array. We don't try to
            * store it at index "field_no", because then the array would be
            * dispersed, and we prefer it to be condensed, growing from 0, to
            * facilitate our merge
            */

           map_index = map_size ++;
           map[map_index] = field_no;
           ordered_list_for_field[map_index] = list4value;

           /* Merge this newly known list with the ones we already have from
            * previous iterations on other "target_rules", and from previous
            * fields from this same rule.
            *
            * (I don't think a CALL instruction is cheap here, but to make the
            * code clearer.)
            */

           merge( ordered_list_for_field,
                  map_index, map, map_size,
                  minimums_stack,
                  &is_real_m_set, &real_m, &empty_lists_mask,
                  &merge_done_mask );

           if( ! is_real_m_set ||
               target_rule->fw_number < real_m.fw_number ) {

               /* there is nothing left we can do for this "target_rule",
                * because we have included one of its distinctive fields in
                * the merge, and, if it were a coincidence (in "real-m"), then
                * "target_rule" should be the "real_m". You may check by
                * yourself that this condition in this "if" is equivalent to
                * UNFEASIBLE( target_rule ) by the asserts above.
                */

               assert( UNFEASIBLE( target_rule ) );

               goto  NEXT_TARGET_RULE;

           }

           assert( is_real_m_set &&
                   target_rule->fw_number == real_m.fw_number );

           assert( ! UNFEASIBLE( target_rule ) );

       } /* end FOR(field_no) merging unknown fields for "target_rule" */

    /* next cycle of lower importance: the remaining unknown fields of
     * "target_rule". Note here that it is possible that
     *
     *     target_rule->fw_number == real_m.fw_number
     *
     * by the end of the previous FOR(field_no) loop, so we mayn't use
     * "real_m" here since it may have been changed. We simply use
     * "merge_done_mask" that has been updated too.
     */

    skip_FORloop_on_field_no:

       preferred_fields = target_rule->rfm & ~merge_done_mask;

    } /* end of the FOR(importance) loop */

    /* By the last two assertions above, here we know this "target_rule" has
     * been matched by the packet, and since we are walking through the
     * "ip_fw_chain_head" list respecting its "fw_number" order, we know we
     * may apply "target_rule" on the packet. We'll immediately  proceed to do
     * so, because this rule may be a "the search terminates" one.
     */

    assert( (target_rule->rfm == 0)  /* A very unusual case, e.g., the rule
                                      *       "count ip from any to any"
                                      * that the merge will always ignore
                                      */
           ||
            ( is_real_m_set && real_m.fw_number == target_rule->fw_number &&
            real_m.rfm & merge_done_mask == real_m.rfm) );

    decision_found = apply( target_rule, packet, &ip_fw_chk_returnval );

    if( ! decision_found &&
         real_m.fw_number == target_rule->fw_number /* i.e., equivalent to
                                                     * target_rule->rfm != 0
                                                     */
      ) {

        /* The rule applied was a "the search continues" rule.
         *
         * Since "target_rule" is a real coincidence in the merge but it has
         * been applied on the packet -it can't be applied twice on it-, we
         * have to forget it.
         */

        do {
             if( empty( minimums_stack )) break;
             pop( minimums_stack, &real_m, &map_index, &merge_done_mask );
        } while( real_m.rfm & empty_lists_mask != 0 );

        if( real_m.fw_number == target_rule->fw_number ||
            real_m.rfm & empty_lists_mask != 0 ) {

            /* nothing could be found in the minimums stack */

            map_index = 0;
            merge_done_mask = 0;
            is_real_m_set = FALSE;
        }

        merge( ordered_list_for_field, map_index, map, map_size,
               minimums_stack, &is_real_m_set, &real_m,
               &empty_lists_mask, &merge_done_mask );
    }

    if( decision_found )  /* "decision_found" is the same that from the return
                           * of the "apply" function above */
            return ip_fw_chk_returnval;

NEXT_TARGET_RULE: ;

}  /* end of the FOR on "target_rule" */

 /* The default fw-rule is linked into the "ip_fw_chain_head" list and is
  * visited in the "for" loop if it is the rule matched. The specificities of
  * the default fw-rule -free the mbuf parameter "m", etc- are dealt with by
  * the "apply" rule function.
  */

} /* end of the "ip_fw_chk" function */

/***********************************************************/
/*
 *
 *     The "merge" function.
 *
 */
/***********************************************************/


static void
merge( struct list_entry   ** ordered_list_for_field,
       u_int8                 starting_at_map_index,
       u_int8               * map,
       u_int8                 map_size,
       struct minimum_entry * minimums_stack,
       u_int8               * is_real_m_set_ptr,
       struct t_real_m      * real_m_ptr,
       u_int32              * empty_lists_mask_ptr,
       u_int32              * merge_done_mas_ptr
     )
{

    /* Avoid pointer references, by using local copies. (In real life, this
     * independent function "merge" shouldn't exist; the following should
     * be directly into "ip_fw_chk".)
     */
    u_int32           empty_lists_mask, merge_done_mask, field_mask;
    u_int8            is_real_m_set, field_no;
    struct t_real_m   real_m;
    struct list_entry    * list4value;


    empty_lists_mask = *empty_lists_mask_ptr;
    merge_done_mask  = *merge_done_mask_ptr;
    is_real_m_set    = *is_real_m_set_ptr;
    if( is_real_m_set ) real_m = *real_m_ptr;

    do {

    DO_LOOP:

        field_no = map[starting_at_map_index];
        list4value = ordered_list_for_field[starting_at_map_index];
        field_mask = 1 << field_no;

        /* The bitmask "merge_done_mask" represents over which fields the
         * merge has been concluded -and "real_m" extracted. Here, we are
         * starting a merge on a field at "starting_at_map_index", so...
         */

        merge_mask_temp = 0;
        for( m=0; m < starting_at_map_index; m++ )
             merge_mask_temp |= 1 << map[m];
        assert( merge_mask_temp == merge_done_mask );

        /* We can have a possible minimum, i.e., have "is_real_m_set" == TRUE,
         * only when we have merged at least one list, i.e.,
         * starting_at_map_index >= 1:
         *
         *     is_real_m_set ===> (implies) ===> starting_at_map_index >= 1
         *
         * This is really a consequence of the above assert, i.e., that a
         * "real_m" can only exists when there has been at least a merge,
         * "merge_done_mask" > 0
         */

        assert( ! is_real_m_set || starting_at_map_index >= 1 );

        /* This comes from the Annex 1: the "minimums_stack" can only save at
         * most one minimum for each visited field, since we don't wrap
         * circularly our merge from "MAX_FIELDs" down to 0.
         */

        assert( depth( minimums_stack ) <= starting_at_map_index );

        /* Real work to be done */

        while( list4value != NULL && UNFEASIBLE( list4value->rule ))

               list4value = LIST_NEXT( list4value, next );

        assert( list4value == NULL || ! UNFEASIBLE( list4value->rule ) );

        /* Save back in our array what is remaining from the list */

        ordered_list_for_field[starting_at_map_index] = list4value;

        /* Analyze all, IMHO, possible cases in the merge. */

        if( ! is_real_m_set ) {

            /* No "real_m" yet */

            if( list4value != NULL ) {

                 /* "list4value->rule" is a feasible rule according to the
                  * previous "while" loop.
                  */

                 assign_min( real_m, list4value->rule );
                 is_real_m_set = TRUE;   /* we have just set it */

            } else
                 empty_lists_mask |= field_mask;

        } else /* we had a previous feasible minimum in "real_m" */
        if( list4value == NULL ) {

               empty_lists_mask |= field_mask;

               /* How this empty list interacts with the previous minimum */

               if( real_m.rfm & empty_lists_mask != 0 ) {

                   /* This list was relevant for the previous minimum: discard
                    * it, and try to rescue another feasible minimum from the
                    * stack.
                    *
                    * Note that the "UNFEASIBLE( of the top of the
                    * minimums_stack )" reduces only to test if one of its
                    * relevant lists is empty, because here we should have
                    * "is_real_m_set == FALSE" from the "if" immediately above
                    * this comment.
                    */

                   while( real_m.rfm & empty_lists_mask != 0 &&
                             ! empty( minimums_stack ))

                         pop( minimums_stack, &real_m, &starting_at_map_index,
                                                   &merge_done_mask);

                   if( real_m.rfm & empty_lists_mask != 0 ) {

                       /* Nothing was found in the stack */

                       is_real_m_set = FALSE;

                       /* We have to merge all known fields again */

                       starting_at_map_index = 0;
                       merge_done_mask = 0;
                   }

                   goto DO_LOOP;

               }
        } else /* we had a previous feasible minimum in "real_m" and this new
                * "list4value" != NULL
                *
                * What is remaining is to compare "list4value->rule->fw_number"
                * against the current feasible minimum
                */

        if( list4value->rule->fw_number < real_m.fw_number ) {

               /* We know from the "while" before this large "if" that
                * "list4value->rule" may be FEASIBLE.
                */

               /* save the old minimum in the stack */

               push( minimums_stack,
                          real_m, starting_at_map_index, merge_done_mask );

               assign_min( real_m, list4value->rule );

               /* And we don't need this rule any more, since we have put it
                * in "real_m". If it is defeated by a higher
                * "starting_at_map_index" minimum, then it will be saved then
                * in the "minimums_stack".
                */

               ordered_list_for_field[starting_at_map_index] =
                                       LIST_NEXT( list4value, next );

        } else /* we had a previous feasible minimum in "real_m" and this new
                * "list4value" != NULL and
                * "list4value->rule->fw_number" >= "real_m.fw_number"
                */

        if( list4value->rule->fw_number == real_m.fw_number )

               /* A coincidence on "real_m": skip this rule, because we
                * already knew it (in "real_m").
                */

               ordered_list_for_field[starting_at_map_index] =
                                       LIST_NEXT( list4value, next );

        else /* we had a previous feasible minimum in "real_m" and this new
              * "list4value" != NULL and
              * "list4value->rule->fw_number" > "real_m.fw_number"
              */

               /* See if this list was relevant for the current feasible
                * minimum, because it hasn't got a coincidence in this list.
                */

               if( real_m.rfm & field_mask != 0 ) {

                   /* This list (field) was relevant for the current feasible
                    * minimum: discard it and try to rescue another minimum
                    * from the "minimums_stack"
                    */
                   real_m.rfm = empty_lists_mask; /* Discard "real_m" */

                   do {

                        if( empty( minimums_stack )) break;

                        pop( minimums_stack, &real_m, &starting_at_map_index,
                                              &merge_done_mask );

                   } while( real_m.rfm & empty_lists_mask != 0 );

                   if( real_m.rfm & empty_lists_mask != 0 ) {

                       /* No minimum in the stack was feasible */

                       is_real_m_set = FALSE;
                       starting_at_map_index = 0; /* merge all lists again */
                       merge_done_mask = 0;
                   }

                   goto DO_LOOP;

        }  /* no more cases to analyze for the "if" of the merge */

        merge_done_mask |= field_mask;

        /* If this list was found before this "if" to be non-empty then a
         * feasible minimum in "real_m" has to be set, since at least we know
         * that the rule at the "list4value->rule" was FEASIBLE according the
         * "while" just before the "if".
         *
         *  field_mask & empty_lists_mask == 0 ===> is_real_m_set
         */

        assert( field_mask & empty_lists_mask != 0 || is_real_m_set );

    } while ( ++ starting_at_map_index < map_size );

    /* Copy back all the local values to the pointer parameters */

    *empty_lists_mask_ptr = empty_lists_mask;
    *merge_done_mask_ptr = merge_done_mask;
    *is_real_m_set_ptr = is_real_m_set;
    if( is_real_m_set ) *real_m_ptr = real_m;

} /* end of function "merge" */


ANNEX 3. HOW TO BUILD THE ARRAY "break_rule" OF EACH (STATIC) RULE
         (FROM SECTION 3.7.2)

Let's call the array with the break rules for the longest continuous sequences
starting at a rule by "break_rule" (cf. section 3.7). This array is indexed by
relevant field in the rule, so in the code, to keep the array compact, another
array "relevant_field" for a rule gives the real field number. Both arrays
have size "number_relevant_fields" for its rule.

We suppose that the "ip_fw_chain_head" list is doubly-linked (there is a
"previous" pointer in the "struct ip_fw", analogous to the existing "next"
pointer there). There is no circularity in the "ip_fw_chain_head" list.

The cases analyzed are when a new fw-rule is added (the function "add_entry"
in "ip_fw.c") or deleted ("del_entry").

CASE 1: A new (static) fw-rule is added.

static int
add_entry( struct ip_fw_head * head, struct ip_fw * new_rule )
{
    ...
    ...

    /* We suppose here that the rule "new_rule" has been already inserted in
     * its correct place into the "head" list, the pointers in the list have
     * been updated, and that both arrays, "new_rule->relevant_field" and
     * "new_rule->break_rule", have been allocated in memory to their size of
     * "new_rule->number_relevant_fields"
     */

    u_int32        interesting_fields_mask;
    u_int8         seq, seq1, field_no;
    struct ip_fw * rule_after;

    /* Update both arrays for "new_rule". (I.e., "new_rule" as the base point
     * for the longest continuous sequences -their starting point.)
     *
     * "new_rule" inherits the same breaking rules than those of the rule
     * after it, for those fields that both rules have in common. The common
     * sequences of the other fields of "new_rule", not in "rule_after", are
     * broken by "rule_after" itself.
     *
     * (We re-use the existing order in the arrays of "rule_after", that are
     *  not modified, to avoid an explicit sort on the "new_rule"'s arrays.)
     */

    rule_after = LIST_NEXT( new_rule, next );
    if( rule_after == NULL )       /* "new_rule" is the default fw-rule */
           goto skip_arrays_build;

    seq = 0;       /* how many sequences have been filled in "new_rule" */

    interesting_fields_mask = new_rule->rfm & rule_after->rfm ;

    for( seq1 = 0; seq1 < rule_after->number_relevant_fields; seq1++ ) {

        field_no = rule_after->relevant_field[seq1];

        if( ( 1 << field_no ) & interesting_fields_mask == 0 )
            /* this field of "rule_after" is not in "new_rule" */
            continue;

        /* "new_rule" and "rule_after" share this field: inherit */

        new_rule->relevant_field[seq] = field_no;
        new_rule->break_rule[seq] = rule_after->break_rule[seq1];
        seq++;  /* a new entry inserted in the arrays of "new_rule" */

    }

    /* Now to the fields of "new_rule" that are not in "rule_after" */

    interesting_fields_mask = new_rule->rfm & ( ~rule_after->rfm );

    for( field_no = 0; field_no < MAX_FIELDS; field_no ++ )

        if( ( 1 << field_no ) & interesting_fields_mask != 0 ) {
            /* The field is relevant in " new_rule" but not in "rule_after" */
            new_rule->relevant_field[seq] = field_no;
            new_rule->break_rule[seq] = rule_after;
        }

    /* The arrays of "new_rule" have been calculated.
     *
     * Now, update the arrays of the rules before "new_rule". I.e., iterate
     * over the previous rules taking each of them as the base point of the
     * longest continuous sequences starting at each.
     */

    /* The fields whose continuous sequences can't reach "new_rule" because
     * they have been broken in the middle.
     */
    u_int32        accum_middle_broken_sequences_mask;
    struct ip_fw * previous_rule;

    accum_middle_broken_sequences_mask = 0;
    previous_rule = LIST_PREVIOUS( new_rule, previous );

    while( accum_middle_broken_sequences_mask != ((1 << MAX_FIELDS ) - 1)
             /* There is a field whose continuous sequence is not broken yet
              * and hence that sequence of "previous_rule", if present, could
              * reach "new_rule".
              */
            && previous_rule != NULL ) {

         int sort_needed = FALSE;

         for( seq = 0; seq < previous_rule->number_relevant_fields; seq++) {

             field_no = previous_rule->relevant_field[seq];
             field_mask = 1 << field_no;

             if( field_mask & new_rule->rfm != 0 )
                 /* The continuous sequence of this "field_no" of
                  * "previous_rule" can't be broken at any rate by "new_rule",
                  * even if the sequence reaches "new_rule", because that
                  * field is also relevant in "new_rule".
                  */
                  continue;  /* Next field of "previous_rule" */

             /* The continuous sequence of this field will be broken by
              * "new_rule", if it reaches there by not being broken by the
              * rules in the middle, between "previous_rule" and "new_rule".
              */
             if( field_mask & accum_middle_broken_sequences_mask == 0 ) {
                 /* The continuous sequence of "field_no" that starts at
                  * "previous_rule" is not broken by any rule in the middle,
                  * so it reaches "new_rule", and is broken there because
                  * "new_rule" doesn't need the sequence's field.
                  */
                 assert( previous_rule->break_rule[seq]->fw_number
                                                   > new_rule->fw_number );

                 previous_rule->break_rule[seq] = new_rule;  /* break */

                 sort_needed = TRUE; /* We have changed the array */
             }
         } /* End of the "for(seq = all fields in previous_rule)" */

         if( sort_needed ) {

             sort previous_rule->relevant_field and
                  previous_rule->break_rule arrays, in descending order
                  according to the value of
                  previous_rule->break_rule[i]->fw_number

         }
         /* This rule "previous_rule" itself breaks all continuous sequence
          * for the fields it doesn't need.
          */
         accum_middle_broken_sequences_mask |= ~ previous_rule->rfm;

         previous_rule = LIST_PREVIOUS( previous_rule, previous );

    } /* End of the "while( accum_middle_broken_sequences_mask ...)" */

skip_arrays_build:
    ...

    /* Ey!! Remember to update the search data structures for the fields and
     * values found in this new fw-rule, and the lists associated to their
     * nodes.
     */

} /* End of the function "add_entry" */

CASE 2: An existing (static) fw-rule is deleted.

Note: for simplicity, we suppose here that the "del_entry" function,
responsible for deleting a rule(s), takes the pointer to the rule it has to
delete, instead that the "fw_number" of all the rules it has to delete.

(The general case used really in the function in "ip_fw.c" is dealt with
similarly to what is done here: simply locate where the block of the given
"fw_number" rules begins and ends. "previous_rule" in the following would
start just before that block, and "rule_after" would be the rule after the
block.)


static int
del_entry( struct ip_fw_head * chainptr, struct ip_fw * rule_to_delete )
{
    u_int8         seq, field_no;
    struct ip_fw * rule_after;
    u_int8         rule_after_inverted_index[MAX_FIELDS];


    rule_after = LIST_NEXT( rule_to_delete, next );

    /* Create the inverted index of "field_no"s for "rule_after */

    for( seq = 0; seq < rule_after->number_relevant_fields; seq++ ) {

         field_no = rule_after->relevant_field[seq];

         rule_after_inverted_index[field_no] = seq;
    }

    /* Update the arrays for the rules previous to "rule_to_delete"
     *
     * The idea for this is very similar to the one used in "add_entry"
     */

    u_int8         seq1;
    u_int32        accum_middle_broken_sequences_mask;
    struct ip_fw * previous_rule;

    accum_middle_broken_sequences_mask = 0;

    previous_rule = LIST_PREVIOUS( previous_rule, previous );

    while ( accum_middle_broken_sequences_mask != (( 1 << MAX_FIELDS ) - 1 )
            && previous_rule != NULL ) {

           u_int sort_needed = FALSE;

           for( seq = 0; seq < previous_rule->number_relevant_fields;
                seq++ ) {

                field_no = previous_rule->relevant_field[seq];

                if( previous_rule->break_rule[seq] == rule_to_delete ) {
                    /* The current breaking rule for this sequence is going to
                     * be deleted soon.  Find which rule after "rule_to_delete"
                     * will be the next breaking rule for this sequence, using
                     * that of "rule_after".
                     */

                    if( (1<<field_no) & rule_after->rfm == 0)
                         /* "rule_after" doesn't have this field, so it will
                          * break this continuous sequence itself.
                          */
                         previous_rule->break_rule[seq] = rule_after;

                    else {
                         /* "rule_after" has this field, so it belongs to the
                          * sequence, and therefore, "previous_rule" inherits
                          * its same breaking rule for this field.
                          */
                         seq1 = rule_after_inverted_index[field_no];

                         previous_rule->break_rule[seq] =
                                              rule_after->break_rule[seq1];
                    }

                    sort_needed = TRUE;

                } /* End of the "if" */

           } /* End of the "for( seq )" */

           if( sort_needed ) {

               sort previous_rule->relevant_field and
                    previous_rule->break_rule arrays, in descending order
                    according to the value of
                    previous_rule->break_rule[i]->fw_number

           }
           /* This rule "previous_rule" itself breaks any continuous sequence
            * for the fields it doesn't need.
            */
           accum_middle_broken_sequences_mask |= ~ previous_rule->rfm;

           previous_rule = LIST_PREVIOUS( previous_rule, previous );

    } /* End of the while( accum_middle_broken_sequences_mask ... )" */

    /* Normal work of freeing "rule_to_delete"s arrays and of deleting
     * "rule_to_delete" from the list.
     * Remember to update the search data structures and their ordered lists
     * deleting from them this "rule_to_delete" and the values exclusive to it.
     */
    ...
    ...

} /* End of function "del_entry" */


ANNEX 4. IP_FW_CHK FROM SECTION 3.7.2

static int
ip_fw_chk( struct ip ** pip, /* rest of parameters here ... */ )
{

    u_int32          empty_lists_mask, search_done_mask, field_mask,
                     preferred_fields;

    u_int8           seq, seq1, field_no;

    struct ip_fw *   block_head, farthest_rule, break_rule, rule4packet;

    struct list_entry * ordered_list_for_field[ MAX_FIELDS ];
    struct list_entry * list4value;  /* The list for the value of the field in
                                      * the network packet.
                                      */

            /* our version of the "lookup_next_rule" function */

#define SAFE_FARTHEST_JUMP( base_rule, last_synced_seq )                     \
                                                                             \
    do {                                                                     \
                                                                             \
        farthest_rule = LIST_NEXT( (base_rule), next );  /* default jump;    \
                                                       * no rule missed, so  \
                                                       * it is a safe jump   \
                                                       */                    \
                                                                             \
        /* look if there is a farther (and safe) jump than merely            \
         * "base_rule->next" */                                              \
                                                                             \
        for( seq1 = 0; seq1 < (base_rule)->number_relevant_fields;           \
             seq1++ ) {                                                      \
                                                                             \
            field_no = (base_rule)->relevant_field[seq1];                    \
                                                                             \
            field_mask = 1 << field_no;                                      \
                                                                             \
            if( field_mask & search_done_mask == 0 )                         \
                                                                             \
                /* The search for this field hasn't been done.               \
                 *                                                           \
                 * We regard searches on the field's data structure as       \
                 * costly operations (why?)                                  \
                 */                                                          \
                                                                             \
                continue;  /* next relevant field */                         \
                                                                             \
            /* This list is known (its field has been searched for it */     \
                                                                             \
            break_rule = (base_rule)->break_rule[seq1];                      \
                                                                             \
            if( break_rule->fw_number <= farthest_rule->fw_number )          \
                                                                             \
                /* We have found that the previously existing                \
                 * "farthest_rule" is farther (and still safe) than the      \
                 * break rule (the "B[i]" in section 3.7.2) the current      \
                 * field at "seq1".                                          \
                 *                                                           \
                 * Since the array "base_rule->break_rule" is decreasingly   \
                 * ordered by "base_rule->break_rule[i]->fw_number", then    \
                 * by transitivity, for any index "i" > "seq1"               \
                 *                                                           \
                 *  base_rule->break_rule[i]->fw_number <=                   \
                 *                                 farthest_rule->fw_number  \
                 *                                                           \
                 * and since, for those fields for any index "i" > "seq1"    \
                 * their corresponding "list4value" in the network packet    \
                 * has to be in the longest continuous sequence for          \
                 * "field_no".(I.e., it has to be safe.)                     \
                 *                                                           \
                 *   list4value->rule->fw_number <                           \
                 *                      base_rule->break_rule[i]->fw_number  \
                 *                                                           \
                 * for being considered in the selection for the farthest    \
                 * rule, so, for these "i" > "seq1",                         \
                 *                                                           \
                 *   list4value->rule->fw_number < farthest_rule->fw_number  \
                 *                                                           \
                 * then we know that the current "farthest_rule" is the      \
                 * farthest jump (and it is safe)                            \
                 */                                                          \
                                                                             \
                break;  /* this "for(seq1)" loop */                          \
                                                                             \
            else /* farthest_rule->fw_number < break_rule->fw_number         \
                  *                                                          \
                  * We have to consult the rule in the list of the value in  \
                  * the packet (the "N[i]" in section 3.7.2), perhaps it's   \
                  * farther than "farthest_rule" and also safe.              \
                  */                                                         \
                                                                             \
                 if( empty_lists_mask & field_mask == 0 ) {                  \
                      /* "ordered_list_for_field[field_no]" not empty */     \
                                                                             \
                      list4value = ordered_list_for_field[field_no];         \
                                                                             \
                      if( seq1 > (last_synced_seq) ) {                       \
                                                                             \
                           /* We have to sync here the list4value->rule to   \
                            * pass this "base_rule" rule, since the jump is  \
                            * forward (in section 3.7.2 terminology:         \
                            *                                                \
                            *       R.fw_number < N[i].fw_number             \
                            */                                               \
                                                                             \
                           while( list4value != NULL &&                      \
                                list4value->rule->fw_number <=               \
                                                     base_rule->fw_number )  \
                                                                             \
                                list4value = LIST_NEXT( list4value, next );  \
                                                                             \
                           /* sync done: store it back */                    \
                           ordered_list_for_field[field_no] = list4value;    \
                                                                             \
                           if( list4value == NULL ) {                        \
                               empty_lists_mask |= field_mask;               \
                               continue;  /* no rule to take from list */    \
                           }                                                 \
                      }                                                      \
                                                                             \
                      rule4packet = list4value->rule;                        \
                                                                             \
                      if( rule4packet->fw_number < break_rule->fw_number     \
                          /* the rule in the list for the packet's value is  \
                           * in the longest continuous sequence for          \
                           * "field_no" (i.e., it's safe)                    \
                           */                                                \
                          && farthest_rule->fw_number <                      \
                                                    rule4packet->fw_number ) \
                                                                             \
                         farthest_rule = rule4packet;                        \
                 }                                                           \
                                                                             \
        } /* end of the for( seq1 ... ) loop */                              \
                                                                             \
    } while( 0 );


/***************************************************************/
/*
 *        Begin of the "ip_fw_chk" function
 *
 */
/***************************************************************/

search_done_mask = 0;
empty_lists_mask = 0;

block_head = LIST_FIRST( &ip_fw_chain_head );

/* We have the default fw-rule in the "ip_fw_chain_head" list, so at least one
 * rule will always match on the packet. Therefore, this loop is not endless.
 */

while( 1 ) {

  start_of_loop_on_current_block_head:

    /* at least a rule will match in the "ip_fw_chain_head" list */

    assert( block_head != NULL );

    if( block_head->rfm & empty_lists_mask != 0 ) {

        /* there are some relevant lists that are already empty, so this rule
         * "block_head" has some relevant parts missing and can't be a
         * coincidence. But this fact may avoid us useless searches on the
         * data structures of its relevant fields which are still unknown.
         *
         * Jump to the farthest away rule that is still feasible from this
         * unfeasible "block_head" rule, according only to its known lists and
         * its already empty lists.
         */

                               /* -1 means to sync all the known lists of this
                                * "block_head" to pass it */
        SAFE_FARTHEST_JUMP( block_head, -1 );

        block_head = farthest_rule;
        continue;       /* analyze this new "block_head" rule */
    }

    /* This rule "block_head" is still feasible since it doesn't have any
     * relevant lists empty: visit the lists of its relevant fields.
     *
     * Prioritize the analysis on the known lists first, in order of their
     * longest continuous sequence that starts at "block_head". The second
     * pass is on the unknown lists, searching them first. This preference
     * tries to avoid (or just delay) unnecessary searches.
     */
    preferred_fields = block_head->rfm & search_done_mask;

    for( importance = 2; importance ; importance--) {

        if( preferred_fields == 0 ) goto skip_FORloop_on_the_fields;

        for( seq=0; preferred_fields ; seq++ ) {

            field_no = block_head->relevant_field[seq];

            field_mask = 1 << field_no;

            if( field_mask & preferred_fields == 0 )
                continue;   /* this field doesn't have this "importance" */

            preferred_fields ^= field_mask;  /* clear its bit */

            if( field_mask & search_done_mask == 0 ) {

                /* a still-unknown list (field) (importance == 1) */

                list4value = search( field_no, packet );

                search_done_mask |= field_mask;

         } else  /* a known list (field) (importance == 2) */
             list4value = ordered_list_for_field[field_no];

            /* put in sync "list4value->rule" to be at least equal to
             * "block_head->fw_number"
             */

         while( list4value != NULL &&
                list4value->rule->fw_number < block_head->fw_number )

              list4value = LIST_NEXT( list4value, next );

         ordered_list_for_field[field_no] = list4value;

            /* Here, the merge is merely reduced to checking for coincidence
             * on "block_head"
             */

         if( list4value != NULL &&
                list4value->rule->fw_number == block_head->fw_number )

             /* Good! A coincidence on this rule "block_head".
                 * Do a favor: sync the list to pass this coincidence. */

             ordered_list_for_field[field_no] =
            LIST_NEXT( list4value, next );

         else {

                if( list4value == NULL ) empty_lists_mask |= field_mask;

                /* We don't have a coincidence on "block_head", so we have to
                 * discard it. The next "block_head" rule will be the farthest
                 * (but safe) jump from this old "block_head".
                 */

                SAFE_FARTHEST_JUMP( block_head, seq );

                block_head = farthest_rule;  /* analyze this rule */

                goto start_of_loop_on_current_block_head;
         }

        } /* end of the "for( seq < block_head->number_relevant_fields )" */

        /* We go to the next iteration on the fields of "block_head" that are
         * still unknown, where we'll have to search for their values in the
         * packet to get their associated lists.
         */
    skip_FORloop_on_the_fields:

        preferred_fields = block_head->rfm & (~search_done_mask);

    }  /* end of "for(importance)" */

    /* all the relevant fields for "block_head" give it as a coincidence, so
     * we may apply "block_head" right now.
     */

    decision_found = apply( block_head, packet, &ip_fh_chk_returnval );

    if( decision_found )
            return ip_fh_chk_returnval;
    else {
         /* It was a "the search continues" rule. (No list is needed to be
          * synced for the jump, that's why
          * "block_head->number_relevant_fields" is specified: it means "all
          * lists synced") */

         SAFE_FARTHEST_JUMP( block_head, block_head->number_relevant_fields );

         block_head = farthest_rule;
    }
} /* end of the "while( 1 )" loop */


ANNEX 90. A USEFUL IDEA FROM THE DYNAMIC RULES

As it was mentioned above, a stateful firewall should not be expressed by this
algorithm because, when a dynamic rule is added or removed, the search data
structures and the ordered lists associated with the values in them, have to
be updated, and this, IMHO, may be overwhelming. (I don't know... do you know
a data structure that allows fast searches on it, and besides, fast inserts
and then deletions possibly just after a net.inet.ip.fw.dyn_X_lifetime period
of time? I give up.) But, returning to the dynamic rules, the search currently
implemented on them -mainly the hash on the packet's addresses and ports-
might be considered a generalization of the proposed algorithm, because *each
independent* field in the packet isn't being searched for, but a *unified key
of some fields* in the packet is:

Proposed algorithm:
    independent searches for each field and then a merge looking for
    coincidences

Current search algorithm in a dynamic firewall:
    an initial search for a compound key of some fields, and then a precise
    search in the short list returned by the hash-lookup.

Of course, a dynamic rule only expresses real values for addresses and ports,
that may be exactly found in a packet, whereas a fw-rule normally expresses a
mask of addresses -a mask that by itself will not be found in any packet-, a
list or range of ports, etc. But the idea of the current implementation is
certainly applicable, instead of searching for each independent field, you
combine some fields and search for the compound key.

This may save searches if fields with a small values-domain are packed in a
key, e.g., the packet's IP and/or TCP options, the IP precedence, and/or TCP
flags. (Since there are not so many -practical- combinations of options and
flags, a direct indexing schema on the key may be sufficient and no extra
searches -as in the case of a hash- would be necessary.) Furthermore, since
only one list would be returned by a search for a key of some fields in
step 2, less number of fields (lists) would have to be merged in step 3.