Date:      Wed, 27 Nov 2002 10:31:39 -0500 (EST)
From:      Robert Watson <rwatson@freebsd.org>
To:        Andrew Gallatin <gallatin@cs.duke.edu>
Cc:        Luigi Rizzo <rizzo@icir.org>, current@freebsd.org
Subject:   Re: mbuf header bloat ?
Message-ID:  <Pine.NEB.3.96L.1021127095837.43889C-100000@fledge.watson.org>
In-Reply-To: <15840.8629.324788.887872@grasshopper.cs.duke.edu>

Andrew,

Thanks for your patience as I finished some research and experimentation
regarding the options there.  Some more details below.

On Sat, 23 Nov 2002, Andrew Gallatin wrote:

> On the contrary, I think that if anything is going to be done, it must
> be done now, so as to not break binary network driver compatibility like
> we did in 4.1.1 when the size of mbufs changed.  Otherwise, we're stuck
> with it until 6.0.

Per an ongoing discussion on -arch, it seems there's a reasonable
consensus that the kernel driver ABI will not be frozen until 5.1, since
we need continued flexibility to mature the fine-grained locking, KSE, and
MAC technologies.  This will allow us some wiggle room in resolving these
sorts of issues. 

> As you eloquently state, there are a number of tradeoffs involved.  On a
> 64-bit platform, 99% of users are paying 40 bytes/pkt for something that
> they will never use.  On x86, 99.99% of users are paying 20 bytes/pkt
> for a feature they will never use.  At least a significant fraction of
> NICs make use of csum offloading (xl, ti, bge, em, myri).
> 
> I propose that we make struct label portion of the pkthdr compile-time
> conditional on MAC.  The assumption is that you will move the MAC label
> to an m_tag sometime after 5.0-RELEASE. 

For a variety of reasons, I'm averse to the notion of compile-time
conditional components in struct mbuf and other vital kernel structures.
One of the design requirements for the MAC Framework was that it be
possible for third-party vendors to distribute security modules that plug
in without necessarily being part of the FreeBSD build infrastructure.
While it is true we currently require options MAC to be compiled into the
kernel, we don't require that you manually integrate module source into
the kernel source so that it builds as part of a kernel.  Because
separately shipped modules build outside the context of a kernel
configuration, a conditionally sized structure would introduce substantial
problems.  However, since we believe that the kernel ABI will not be
frozen until 5.1, if we have an alternative place to put the label that
doesn't expand the pkthdr, then we can change it once we think the
solution is ready.
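
A minimal sketch of the layout concern, assuming invented toy structures
(toy_label, toy_pkthdr_mac, toy_pkthdr_nomac) rather than the real
struct mbuf or struct label definitions.  It only illustrates how a field
placed after a conditionally compiled member can end up at different
offsets in two separately built binaries:

```c
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real FreeBSD definitions. */
struct toy_label {
	int	 l_flags;
	void	*l_perpolicy[4];
};

/* Layout as seen by a kernel built with the label compiled in. */
struct toy_pkthdr_mac {
	void			*rcvif;
	int			 len;
	struct toy_label	 label;
	int			 csum_flags;
};

/* Layout as seen by a module built without the label. */
struct toy_pkthdr_nomac {
	void	*rcvif;
	int	 len;
	int	 csum_flags;
};

/* Returns nonzero when the two builds disagree on where csum_flags is. */
int
toy_layout_mismatch(void)
{
	return (offsetof(struct toy_pkthdr_mac, csum_flags) !=
	    offsetof(struct toy_pkthdr_nomac, csum_flags));
}
```

A module compiled against the second layout but loaded into a kernel using
the first would read and write csum_flags at the wrong offset, which is
the class of problem a fixed layout avoids.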

On the topic of m_tag: I've spent a few days working with m_tag now to see
if it can meet the needs of the MAC Framework.  My conclusion is that, in
the form it's currently in the tree, it cannot meet the requirements. 
However, I believe with a relatively straightforward set of
modifications, it can.  As such, the proposed 5.1 time frame for moving
the MAC Framework to using m_tag is realistic.  I'm currently exchanging
patches with Sam Leffler looking at how to tweak the various protocol
stacks to properly maintain m_tag chains on mbufs when mbufs are copied,
etc.  These problems largely stem from a failure to maintain the tag
chains on mbufs over some of the copy/... operations that occur.  The
result is that the MAC labels stored in mbufs are often discarded or lost,
and many packets float around the system without proper protection.  For
policies that rely on ubiquitous labeling, this results in rapid assertion
failures (yes, we fail very closed :-).  I hope to post patches for these
changes in the next few days once I've had a chance to perform more
extensive testing.  Sam and I are having an ongoing conversation about
whether it would be safe to introduce some of these changes before 5.0.
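
The failure mode described above can be sketched in userland C.  All names
here (toy_tag, toy_pkthdr, toy_copy_fields_only, toy_dup_pkthdr) are
hypothetical and are not the kernel's m_tag API; the sketch only contrasts
a header copy that forgets the tag chain with one that duplicates it:

```c
#include <stdlib.h>

/* Hypothetical toy types; the real m_tag and pkthdr differ. */
struct toy_tag {
	struct toy_tag	*next;
	int		 id;
	int		 value;
};

struct toy_pkthdr {
	int		 len;
	struct toy_tag	*tags;		/* head of the tag chain */
};

/* Prepend a tag.  Returns 0 on success, -1 on allocation failure. */
int
toy_tag_add(struct toy_pkthdr *ph, int id, int value)
{
	struct toy_tag *t = malloc(sizeof(*t));

	if (t == NULL)
		return (-1);
	t->id = id;
	t->value = value;
	t->next = ph->tags;
	ph->tags = t;
	return (0);
}

/* Return the first tag with the given id, or NULL if there is none. */
struct toy_tag *
toy_tag_find(const struct toy_pkthdr *ph, int id)
{
	struct toy_tag *t;

	for (t = ph->tags; t != NULL; t = t->next)
		if (t->id == id)
			return (t);
	return (NULL);
}

/* Free every tag on the chain and leave the chain empty. */
void
toy_tag_free_chain(struct toy_pkthdr *ph)
{
	struct toy_tag *t, *next;

	for (t = ph->tags; t != NULL; t = next) {
		next = t->next;
		free(t);
	}
	ph->tags = NULL;
}

/* Copy that forgets the chain: the copy ends up unlabeled. */
void
toy_copy_fields_only(struct toy_pkthdr *to, const struct toy_pkthdr *from)
{
	to->len = from->len;
	to->tags = NULL;
}

/* Copy that duplicates every tag in order.  Returns 0 or -1. */
int
toy_dup_pkthdr(struct toy_pkthdr *to, const struct toy_pkthdr *from)
{
	struct toy_tag **tail = &to->tags;
	const struct toy_tag *t;

	to->len = from->len;
	to->tags = NULL;
	for (t = from->tags; t != NULL; t = t->next) {
		struct toy_tag *n = malloc(sizeof(*n));

		if (n == NULL) {
			toy_tag_free_chain(to);
			return (-1);
		}
		*n = *t;
		n->next = NULL;
		*tail = n;
		tail = &n->next;
	}
	return (0);
}
```

In this toy model a lookup on the result of toy_copy_fields_only() finds
no tag, while a lookup on the result of toy_dup_pkthdr() finds an
independent copy of each one.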

There are some downsides to moving to m_tag for MAC labels.  One is that
it effectively doubles the number of memory allocations for every packet
delivered through the system when running with MAC, if we maintain the
current semantic that all packets are labeled.  This means
users will pay a higher cost for MAC even if they don't label packets,
which is unfortunate.  I'm currently exploring the impact -- my hope is
that changes to the memory allocators since 4.x, such as the new mbuf
allocator and introduction of UMA, will largely mitigate that effect.  A
fair amount of interest has been expressed in supporting MAC in the
GENERIC kernel eventually: if and when that becomes the case, we may find
that the rationale for moving the label out of the mbuf is reversed.
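
To make the allocation arithmetic concrete, here is a rough sketch that
counts allocator calls per packet.  The counting wrapper and the two toy
packet layouts are hypothetical, and the sketch says nothing about how the
real mbuf allocator or UMA behave; it only shows one request per packet
with an embedded label versus two with a separately allocated one:

```c
#include <stdlib.h>

/* Counts requests made through toy_alloc(); purely illustrative. */
static unsigned long toy_alloc_count;

void *
toy_alloc(size_t n)
{
	toy_alloc_count++;
	return (malloc(n));
}

struct toy_label {
	int	flags;
};

/* Label stored inline in the packet header. */
struct toy_pkt_embedded {
	int			len;
	struct toy_label	label;
};

/* Label stored in a separately allocated object. */
struct toy_pkt_tagged {
	int			 len;
	struct toy_label	*label;
};

/* Allocation requests needed for npkts packets with inline labels. */
unsigned long
toy_allocs_embedded(unsigned int npkts)
{
	unsigned int i;

	toy_alloc_count = 0;
	for (i = 0; i < npkts; i++) {
		struct toy_pkt_embedded *p = toy_alloc(sizeof(*p));

		if (p == NULL)
			break;
		p->len = 0;
		p->label.flags = 0;
		free(p);
	}
	return (toy_alloc_count);
}

/* Allocation requests needed for npkts packets with separate labels. */
unsigned long
toy_allocs_tagged(unsigned int npkts)
{
	unsigned int i;

	toy_alloc_count = 0;
	for (i = 0; i < npkts; i++) {
		struct toy_pkt_tagged *p = toy_alloc(sizeof(*p));

		if (p == NULL)
			break;
		p->len = 0;
		p->label = toy_alloc(sizeof(*p->label));
		if (p->label != NULL)
			p->label->flags = 0;
		free(p->label);
		free(p);
	}
	return (toy_alloc_count);
}
```

Under these assumptions the tagged layout issues twice as many requests as
the embedded one for the same number of packets; whether that matters in
practice depends on the allocator, which this sketch does not model.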

> This will immediately reduce the size of mbufs for the vast majority of
> users, and will prevent a 4.1.1 like flag-day for 3rd party network
> driver vendors.  The only downside is that the few MAC users will not be
> able to use 3rd party binary network drivers until the MAC label is put
> into an m_tag.  This seems fair, as the only people inconvenienced are the
> people who want the labels and they are motivated to move them to an
> m_tag.  But that's easy for me to say, since I don't run MAC, and I may
> be missing something big. 

I think you underestimate the complexity of variably sized key kernel
data structures.  mbuf.h is included all over the kernel, as well as in
many user applications (although often for bogus reasons).  My proposed
strategy is the following:

(1) For 5.0, we either maintain the current storage of the struct label in
    struct mbuf, or move to m_tags if there is a consensus that the set of
    supporting changes is correct (move to m_dup_pkthdr() in a number of
    places, introduce proper handling of wait dispositions for tag
    allocation, and so on).

(2) For 5.1, assuming we're not already in m_tags, we move to m_tags for
    the label.  This is acceptable because we are opting not to freeze the
    kernel driver ABI between 5.0 and 5.1, which will permit
    infrastructural changes necessary to improve the performance and
    stability of the 5.x branch without locking it entirely to the current
    set of structure layout assumptions.

I'd like to continue to explore options for reducing the number of memory
allocations to extend storage on mbufs.  One idea I've been tossing around
is adopting Jeff Roberson's extension model used in struct proc and
related structures. 

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories

