Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 7 Oct 2002 09:46:53 -0700
From:      "Sam Leffler" <sam@errno.com>
To:        "Julian Elischer" <julian@elischer.org>
Cc:        "Nate Lawson" <nate@root.org>, <freebsd-arch@FreeBSD.ORG>, <freebsd-net@FreeBSD.ORG>
Subject:   Re: CFR: m_tag patch
Message-ID:  <151b01c26e21$26adb690$52557f42@errno.com>
References:  <Pine.BSF.4.21.0210062343510.22932-100000@InterJet.elischer.org>

next in thread | previous in thread | raw e-mail | index | archive | help
> On Sun, 6 Oct 2002, Sam Leffler wrote:
>
> > > 1. Is ordering important or is an SLIST sufficient for all cases?
> >
> > Order is not important.
>
> I do similar in Netgraph but could move to whatever scheme becomes
> standard in the rest of the system. However I have some serious
> comments.
>
> In netgraph I have a separate structure for the packet that contains
> a pointer to the mbuf chain as well a pointer to a malloc'd memory
> buffer that can contain metadata such as is kept here.
> the metadata is of the form [header][field][field][field]....
>
> where each field was defined as:
> struct meta_field_header {
>         u_long  cookie;         /* cookie for the field. Skip fields you
don't
>                                  * know about (same cookie as in messgaes)
*/
>         u_short type;           /* field ID within this cookie scheme */
>         u_short len;            /* total len of this field including extra
>                                  * data */
>         char    data[0];        /* data starts here */
> };
>
> the header for metadata is:
> struct ng_meta {
>         char    priority;       /* -ve is less priority,  0 is default */
>         char    discardability; /* higher is less valuable.. discard first
*/
>         u_short allocated_len;  /* amount malloc'd */
>         u_short used_len;       /* sum of all fields, options etc. */
>         u_short flags;          /* see below.. generic flags */
>         struct meta_field_header options[0];    /* add as (if) needed */
> };
>
> /* Flags for meta-data */
> #define NGMF_TEST       0x01    /* discard at the last moment before
sending */
> #define NGMF_TRACE      0x02    /* trace when handing this data to a node
*/
>
> One metadata collection is associated with each packet. similar to what
> you do except that in 4.x Ipass it as an arguent and in 5.0
> I have a packet header that is passed around that identifies both
> data and metadata. ( I need the header for other reasons anyway)
>
> I show this only to show that I have tackled the same problem with a
> similar but different scheme. My thought was that a packet usually only
> has at most a couple of tags, and the tags are usually short, so that it
> was ok to malloc a bigger chunk and using the length fields walk them.
> If you didn't have room you could malloc a bigger one and copy the
> intitial fields, and then add your new ones at the end. It would
> probably almost never happen..
>

I'm not sure I follow exactly how the above works, but the tag data
structures are very simple and would appear to accomodate your data
structures.  An m_tag is simply a variable-length malloc'd block of memory
that has a type field (and linked list pointer).  What you store in the
space that follows the fixed-length struct m_tag header is up to you.  The
type field is used simply to locate tags once attached to a packet.

You can certainly allocate a tag with larger chunk of memory than you
initially need and store the bookkeeping info in the tag data block.  This
would appear to permit implementation of the above within the m_tag
framework.

> I did however find a need to delete a tag once the packet passed out of
> the scope of the module in question. (the "cookie" represents a
> particular "ABI" the "type" is only valid withing the ABI.
> If you do not support a particular cookie type you cannot interpret the
> contents aof that field) Protocols and such have their own cookies.
>

Yes, this is exactly what the m_tag_id item is for in the m_tag data
structure (what I called a type field above).

> I point this out as a feature because the TAG values need to only be
> defined within their own ABI/API include files.
>
> If a module that supports a particular cookie passes a packet out to
> the rest of the world, it probably should invalidate some fields that
> are set with that particular cookie, in case that packet should in some
> way be redirected back towards a module that does support it, and that
> might cause unexpected results.  For this reason I needed to 'remove'
> fields. I ended leaving them in place and setting the cookie to a
> special "invalid" cookie value that no-one would match. In a similar
> vein, I can imagine wanting to remove one of the 'tags' in your lists.
>

You can either remove tags from the chain attached to an mbuf or invalidate
them as you described--by setting the m_tag_id field to something "invalid".

> Each module has a cookie field that is pretty much guaranteed
> to be unique (time since epoch when written) so in your terms,
> the IP code would only know and recognise TAGS prepended
> with the IP cookie and would handle other tags as opaque data.
> Similarly IPSEC code would only see into tags with the IPSEC cookie
> prepended.. I don't think you should be
> defining the TAG values in a global file, but rather
> each module should identify its own tags and  ignode others.
> If two modules need to share data, a .h file can contain the
> defineitions for that API and the cookie that identifies such tags,
> and both modules would include that shared include.  That way
> the base code need not know about future improvements, which
> has always been "difficult" :-)
>

Changing the way m_tag_id definitions are done is fine with me.  It's done
statically in mbuf.h for compatibility with code I've bringing over from
openbsd.

> >
> > > 2. Is it possible to attach the aux argument to the mbuf chain instead
of
> > > adding it as a new parameter to ip_output?
> > >
> >
> > The "aux argument" _was_ originally attached to the mbuf chain.  The
change
> > to add an extra arg to ip*_output was done to eliminate one of the
biggest
> > uses of the aux mbuf; the socket to use to get IPsec policy.  This is a
> > performance win and worth doing independent of the aux->m_tag switch.
> >
> > One could split the ip_output change out but doing it together avoids
> > converting code that would just eventually be eliminated (unless you did
the
> > ip_output change first and then the m_tag switch).
>
> I'm not sure but it's possible m_aux was invented so that ip_output()
> would not change interfaces to match with was defined in some interface
> method table. I think NetBSD added entries in their
> interface method tables with varargs (yech) to get around some of this..
>
>
> So in summary:
> is it worth making it a linked list?
> how  many tags do you see on packets, and how large are they?
> CAn you use a sche,e suc as that outlined so that each module can
> define its own tags privatly?
>

You can certainly use a scheme as you suggest.  I don't believe doing
something like that is precluded by what I've offered, you'd just layer the
additional logic on top w/o any changes--I think.  Tags tend to be very
small (e.g. a pointer or two ints) and there tend to be only 1 or 2 when
there are any.  In my performance tuning work with IPsec their manipulation
has never shown up as significant in kernel profiles.

    Sam


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?151b01c26e21$26adb690$52557f42>