Date:      Mon, 12 Nov 2012 11:49:04 +0100
From:      Andre Oppermann <andre@freebsd.org>
To:        Andre Oppermann <andre@FreeBSD.org>
Cc:        src-committers@freebsd.org, svn-src-user@freebsd.org
Subject:   Re: svn commit: r242910 - in user/andre/tcp_workqueue/sys: kern sys
Message-ID:  <50A0D420.4030106@freebsd.org>
In-Reply-To: <201211120847.qAC8lEAM086331@svn.freebsd.org>
References:  <201211120847.qAC8lEAM086331@svn.freebsd.org>

On 12.11.2012 09:47, Andre Oppermann wrote:
> Author: andre
> Date: Mon Nov 12 08:47:13 2012
> New Revision: 242910
> URL: http://svnweb.freebsd.org/changeset/base/242910
>
> Log:
>    Base the mbuf related limits on the available physical memory or
>    kernel memory, whichever is lower.

The commit message is a bit terse, so I'm going to explain in more
detail:

The overall mbuf-related memory limit must be set so that mbufs
(and clusters of various sizes) can't exhaust physical RAM or KVM.

I've chosen a limit of half the physical RAM or KVM (whichever is
lower) as the baseline.  In any normal scenario we want to leave
at least half of physmem/KVM for other kernel functions and for
userspace, to prevent the system from swapping like hell.  Via a
tunable the limit can be upped to at most 3/4 of physmem/KVM.
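
To illustrate the baseline calculation, here is a small userland
sketch (the function name and input values are mine, purely for
illustration; the kernel derives physmem/KVM at boot):

  #include <stdint.h>
  #include <stdio.h>

  /* Baseline: half the smaller of physmem/KVM, tunable up to 3/4. */
  static uint64_t
  mbuf_mem_limit(uint64_t physmem, uint64_t kvm, int num, int den)
  {
          uint64_t base = physmem < kvm ? physmem : kvm;

          return (base * num / den);
  }

  int
  main(void)
  {
          uint64_t physmem = 16ULL << 30;         /* 16GB RAM, example */
          uint64_t kvm = 512ULL << 30;            /* example KVM size */

          printf("default limit: %ju\n",
              (uintmax_t)mbuf_mem_limit(physmem, kvm, 1, 2));
          printf("maximum limit: %ju\n",
              (uintmax_t)mbuf_mem_limit(physmem, kvm, 3, 4));
          return (0);
  }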

Out of the overall mbuf memory limit, the 2K clusters and the 4K
(page size) clusters each get 1/4, because these are the most
heavily used mbuf sizes.  2K clusters are used for MTU 1500 ethernet
inbound packets.  4K clusters are used whenever possible for sends
on sockets and thus for outbound packets.

The larger cluster sizes of 9K and 16K are each limited to 1/6 of
the overall mbuf memory limit.  Again, when jumbo MTUs are used
these large clusters will end up only on the inbound path.  They
are not used on the outbound path; there it's still 4K.  Yes, that
will stay that way, because otherwise we'd run into lots of
complications in the stack.  And it really isn't a problem, so
don't make a scene.
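
Putting the fractions together, the per-size cluster counts fall
out of a simple division.  A runnable sketch (only the fractions
and cluster sizes are from the commit; the example limit is mine):

  #include <stdint.h>
  #include <stdio.h>

  int
  main(void)
  {
          uint64_t limit = 8ULL << 30;    /* example: 8GB mbuf mem limit */

          printf("2K clusters:  %ju\n", (uintmax_t)(limit / 2048 / 4));
          printf("4K clusters:  %ju\n", (uintmax_t)(limit / 4096 / 4));
          printf("9K clusters:  %ju\n", (uintmax_t)(limit / 9216 / 6));
          printf("16K clusters: %ju\n", (uintmax_t)(limit / 16384 / 6));
          return (0);
  }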

Previously the normal mbufs (256B) weren't limited at all.  This
was wrong, as there are certain places in the kernel that, on
allocation failure of clusters, try to piece together their packet
from smaller mbufs.  The mbuf limit is therefore the sum of the
limits of all other mbuf sizes, plus some more to allow for
standalone mbufs (ACKs, for example) and for sending off a copy of
a cluster.  FYI: every cluster eventually also has an mbuf
associated with it.
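
As a sketch of that relationship (the *2 headroom factor below is
an arbitrary placeholder of mine, not the committed formula):

  #include <stdint.h>
  #include <stdio.h>

  int
  main(void)
  {
          uint64_t limit = 8ULL << 30;    /* example mbuf memory limit */
          uint64_t nclusters = limit / 2048 / 4 + limit / 4096 / 4 +
              limit / 9216 / 6 + limit / 16384 / 6;

          /*
           * Sum of all cluster limits plus headroom for standalone
           * mbufs and cluster copies; the factor is illustrative.
           */
          printf("mbuf limit: %ju\n", (uintmax_t)(nclusters * 2));
          return (0);
  }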

Unfortunately there isn't a way to set an overall limit for all
mbuf memory together, as UMA doesn't support such a cross-zone
limit.
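
What UMA does offer is a per-zone cap, so each zone gets limited
individually, along these lines (zone names as in kern_mbuf.c; a
sketch, not the exact committed code):

  /* Kernel context: cap each mbuf zone separately via UMA. */
  uma_zone_set_max(zone_mbuf, nmbufs);
  uma_zone_set_max(zone_clust, nmbclusters);
  uma_zone_set_max(zone_jumbop, nmbjumbop);
  uma_zone_set_max(zone_jumbo9, nmbjumbo9);
  uma_zone_set_max(zone_jumbo16, nmbjumbo16);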

Let's work out a few examples of sizing:

1GB KVM:
  512MB limit for mbufs
  419,430 mbufs
   65,536 2K mbuf clusters
   32,768 4K mbuf clusters
    9,709 9K mbuf clusters
    5,461 16K mbuf clusters

16GB RAM:
  8GB limit for mbufs
  33,554,432 mbufs
   1,048,576 2K mbuf clusters
     524,288 4K mbuf clusters
     155,344 9K mbuf clusters
      87,381 16K mbuf clusters
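
For the 16GB case the cluster numbers follow directly from the
fractions above:

  8GB / 2048 / 4  = 1,048,576  (2K)
  8GB / 4096 / 4  =   524,288  (4K)
  8GB / 9216 / 6  =   155,344  (9K)
  8GB / 16384 / 6 =    87,381  (16K)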

These defaults should be sufficient for even the most demanding
network loads.  If you do run into these limits, you probably know
exactly what you are doing, and you are expected to tune the values
for your particular purpose.
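
If you do need to raise them, the per-zone counts are loader
tunables; for example in /boot/loader.conf (values illustrative):

  kern.ipc.nmbclusters="4194304"
  kern.ipc.nmbjumbop="2097152"
  kern.ipc.nmbjumbo9="262144"
  kern.ipc.nmbjumbo16="131072"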

There is a side issue with maxfiles, as it relates to the maximum
number of sockets that can be open at the same time.  With today's
web servers and proxy caches there may be some 100K or more sockets
open.  Hence I've divorced maxfiles from maxusers as well.  There
is a relationship between maxfiles and the callout callwheel,
though, which has to be investigated some more to prevent ridiculous
values from being chosen.
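
The direction, roughly, in a sketch (the divisor and floor here
are made-up placeholders, not what the committed code uses):

  /* Hypothetical: scale maxfiles with physical pages, not maxusers. */
  maxfiles = imax(10000, (int)(physpages / 8));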

-- 
Andre



