Date:      Fri, 21 Feb 2003 20:13:18 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Bosko Milekic <bmilekic@unixdaemons.com>
Cc:        Hiten Pandya <hiten@unixdaemons.com>, FreeBSD-arch@FreeBSD.ORG
Subject:   Re: Mbuf flags cleanup proposal
Message-ID:  <3E56F8DE.5453DB88@mindspring.com>
References:  <20030221151007.GA60348@unixdaemons.com> <3E5673E7.F3F1FA4F@mindspring.com> <20030221150743.A79345@unixdaemons.com> <3E56B3F5.9EF3F9FE@mindspring.com> <20030221201728.A80661@unixdaemons.com>

Bosko Milekic wrote:
>   In FreeBSD (notice I said "FreeBSD," not TerryBSD, SunOS, Linux, or
>   whatever else), network buffer allocations have for the longest time
>   been done separately and for good enough reason; as a result, FreeBSD
>   has adopted some fairly serious optimizations for what concerns the
>   way they (and supporting structures) are allocated.

The mbufs, historically, were allocated out of a zalloci zone,
which was the only allocator type (again, historically) to support
allocation at interrupt time.  The argument that this allocation
was separate has more to do with all interrupt-time allocation being
separate than it does with a special case for mbufs alone.

Nevertheless, it's generally true that the first thing I do when
converting a FreeBSD kernel for embedded processing in network
equipment is "replace the mbuf allocator".  Generally, I use a
machdep.c KVA preallocated freelist (FWIW), which is a heck of a
lot faster than anything that's ever been committed to the FreeBSD
CVS repository.  As such, there are some good arguments for separation,
or at least layered abstraction, of the mbuf allocator.
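
A minimal sketch of what such a preallocated freelist could look
like, in userland C.  The names (pool_init, buf_alloc, buf_free), the
sizes, and the static array standing in for KVA reserved at boot are
all assumptions for illustration; this is not the code referred to
above:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes; a kernel would reserve this region at boot. */
#define NBUF   64
#define BUFSZ  256

union buf {
    union buf *next;             /* link while the buffer is free */
    unsigned char data[BUFSZ];   /* payload while the buffer is in use */
};

static union buf pool[NBUF];     /* stands in for the preallocated region */
static union buf *freelist;      /* head of the singly linked freelist */

/* Thread every buffer onto the freelist once, at startup. */
void pool_init(void)
{
    freelist = NULL;
    for (size_t i = 0; i < NBUF; i++) {
        pool[i].next = freelist;
        freelist = &pool[i];
    }
}

/* O(1) allocation: pop the head; NULL when the pool is exhausted. */
void *buf_alloc(void)
{
    union buf *b = freelist;
    if (b != NULL)
        freelist = b->next;
    return b;
}

/* O(1) free: push the buffer back onto the head. */
void buf_free(void *p)
{
    union buf *b = p;
    assert(b >= pool && b < pool + NBUF);   /* must come from this pool */
    b->next = freelist;
    freelist = b;
}
```

The sketch does no locking at all; anything usable at interrupt time
would additionally have to block interrupts around the list update,
or keep one list per CPU.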


>   "My" allocator
>   (and by the way, it's wrong to call it _mine_ because it really is the
>   result of a number of different people who have worked on it)
>   maintains those optimizations (for lack of finding equally simple and
>   well-performing replacements) while at the same time taking advantage
>   of parallel processing in the kernel.

I am an extremely vocal advocate of parallelization of code paths,
and have been since I "rescued" Jack Vogel's 1995 SMP code from
oblivion, and Peter and Steve Passe took that code and made it the
basis of the FreeBSD SMP project.

I fully support your efforts in this regard, and, among other things,
that support has taken the form of self-censorship of public criticism,
for the most part.

I understand that other people have worked on the code, but you are
the one who has consistently championed the code.  I'm sorry if that
led me to call it "your allocator" unjustly.


>   Similarly, UMA is a great
>   allocator, and it does similar things for general-purpose allocations
>   in the system.  If you ask me whether or not mbuf allocations can be
>   made to use UMA?  The answer is yes.  If you ask me whether
>   performance is going to be better?  I don't know for sure, but I can
>   tell you that in order to solve the issues I bring up it's going to be
>   difficult, and I _do_ know that if you don't solve them, performance
>   is going to suck, comparatively speaking.

I understand this, as well.  To my mind, it is a matter of will, on
the part of you and Jeff, where some of the internals of Jeff's code
need to have hooks made available for the additional processing the
mbuf code needs to do.

Let me say that I believe that this *will* happen, sooner or later,
and that any change that would make this more difficult later is
going to have to be backed out.  Better that it never went in, as
arguing to change something recently committed is very difficult,
for social, rather than technical reasons.


>   Solving them would require what I think is relatively serious
>   modification to UMA which, in my opinion anyway, would uglify [sic]
>   it.

I understand this as well; what I don't understand is the unwillingness
to discuss it, or to do the "uglification" anyway.

>   I *have* looked at it, and I think that the fact that UMA does
>   allocations for all objects using the same techniques is great and - I
>   can't speak for Jeff - but *I* wouldn't want to hack at it just so
>   that we can get the optimizations/solutions we currently have for mbuf
>   allocations.  And, you know what?  If *you* think it's worth it, why
>   don't *YOU* do it and waste hours on end to, finally, get something
>   that _maybe_ performs as well at the expense of an uglier
>   (not-so-general-and-simple-anymore) allocator.

I think it's possible to do -- as long as there are no changes
that preclude it, such as renaming manifest constants to be
use-specific, etc.  Changes such as the ones you were proposing.

I'm not willing to do it today... I don't think Jeff's code is
finished enough, yet, and I have confidence that he will return
to it, after his diversion into a new scheduler is done.  Until
he does, the excessive locking in the current code is evil.  I
could take care of that, but he has expressed a reluctance to
allow his statistics to become snapshots, rather than exact values,
and I have not wanted to step on his toes over that.
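
To make the exact-versus-snapshot trade-off concrete, here is a
minimal sketch of per-CPU counters that are updated without a lock
and summed on demand.  It is an assumption about one common way to do
this, not a description of the UMA code; NCPU, stat_inc, and
stat_snapshot are hypothetical names:

```c
#include <stdatomic.h>
#include <stddef.h>

#define NCPU 4   /* hypothetical CPU count */

/* One counter per CPU, padded so slots do not share a cache line. */
struct pcpu_stat {
    _Atomic unsigned long allocs;
    char pad[64 - sizeof(_Atomic unsigned long)];
};

static struct pcpu_stat stats[NCPU];

/*
 * Each slot has a single writer (its own CPU), so a relaxed load and
 * store suffice; no lock and no atomic read-modify-write is needed.
 * Assumes the caller cannot migrate between the load and the store.
 */
void stat_inc(unsigned cpu)
{
    unsigned long v =
        atomic_load_explicit(&stats[cpu].allocs, memory_order_relaxed);
    atomic_store_explicit(&stats[cpu].allocs, v + 1, memory_order_relaxed);
}

/*
 * Sum all slots without stopping the writers.  The result is a
 * snapshot: it may miss increments that land while the loop runs.
 */
unsigned long stat_snapshot(void)
{
    unsigned long total = 0;
    for (size_t i = 0; i < NCPU; i++)
        total += atomic_load_explicit(&stats[i].allocs,
                                      memory_order_relaxed);
    return total;
}
```

The reader pays for the lock-free update path by getting a value that
is only approximately current, which is exactly the property in
dispute.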

I will say that, if someone is willing to commit the code, I'm
willing to do the abstraction work.  For me, it's trivially easy
to do this type of work (in fact, I had planned on submitting
patches in the near future which do the necessary indirection for
boot-time selection of an arbitrary scheduler from a list of loaded
modules, which is a task with a similar level of complexity).
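
The kind of indirection meant here can be sketched as a table of
function pointers that each implementation registers, with one entry
selected by name at startup.  Everything below (sched_ops,
sched_register, sched_select, the two toy schedulers) is hypothetical
and only illustrates the pattern; the same shape would apply to a
pluggable mbuf allocator:

```c
#include <stddef.h>
#include <string.h>

/* Operations every implementation must provide. */
struct sched_ops {
    const char *name;
    int (*pick_next)(void);
};

#define MAX_SCHED 8
static const struct sched_ops *registered[MAX_SCHED];
static size_t nregistered;
static const struct sched_ops *active;

/* Called by each loaded module to make itself selectable. */
int sched_register(const struct sched_ops *ops)
{
    if (nregistered == MAX_SCHED)
        return -1;
    registered[nregistered++] = ops;
    return 0;
}

/* Pick the active implementation by name; -1 if no such module. */
int sched_select(const char *name)
{
    for (size_t i = 0; i < nregistered; i++) {
        if (strcmp(registered[i]->name, name) == 0) {
            active = registered[i];
            return 0;
        }
    }
    return -1;
}

/* Callers only ever see this wrapper, never a concrete scheduler. */
int sched_pick_next(void)
{
    return active != NULL ? active->pick_next() : -1;
}

/* Two toy implementations that return distinct values. */
static int fifo_pick(void) { return 1; }
static int prio_pick(void) { return 2; }
const struct sched_ops fifo_sched = { "fifo", fifo_pick };
const struct sched_ops prio_sched = { "prio", prio_pick };
```

A failed sched_select leaves the previous choice in place, so a bad
name at boot does not leave the system without an implementation.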

Don't think that your work is unappreciated, but do realize that
I'm not attacking the manifest constant rename proposal out of a
sense of "trying to jump on a winning bandwagon", or out of a
sense of "trying to interfere with progress", like some others
seem to be (or they would have read the code, and realized some of
the hysteresis and thrashing problems they thought might happen are
not, in fact, possible, even with a GC thread).  I genuinely believe
that unification of the interfaces in the future is the right
direction.  Anything that abstracts complexity for other kernel
programmers is good.

Regards,
-- Terry
