Date:      Thu, 21 May 2009 19:11:42 -0700 (PDT)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        freebsd-hackers@freebsd.org
Subject:   Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?
Message-ID:  <200905220211.n4M2Bg5b036854@apollo.backplane.com>
References:  <4A14F58F.8000801@rawbw.com> <Pine.GSO.4.64.0905202344420.1483@zeno.ucsd.edu> <4A1594DA.2010707@rawbw.com> <Pine.GSO.4.64.0905211344370.1483@zeno.ucsd.edu>

    There is no such thing as a graceful way to deal with running out of
    memory.  What is a program supposed to do?  Even if it gracefully exits
    it still fails to perform the function for which it was designed.  If
    such a program is used in a script then the script fails as well.   Even
    the best systems (e.g. space shuttle, mars rover, airplane control
    systems) which try to deal with unexpected situations still have to
    have the final option, that being a complete reset.  And even a complete
    reset is no guarantee of recovery (as one of the original Airbus accidents
    during an air show revealed, when the control systems got into a reset loop
    and the pilot could not regain control of the plane).  The most robust
    systems do things like multiple independent built-to-spec programs and
    a voting system, which require 10 times the manpower to code and test,
    something you will likely never see in the open-source world or even
    in most of the commercial application world.

    In fact, it is nearly impossible to write code which gracefully fails
    even if the intent is to fail gracefully (and even if one can figure
    out what a workable graceful failure path would be).  You would
    have to build code paths to deal with the failure conditions,
    significantly increasing the size of the code base, and you would have
    to test every possible failure combination to exercise those code
    paths to make sure they actually work as expected.  If you don't then
    the code paths designed to deal with the failure will themselves
    likely be full of bugs and make the problem worse.  People who try
    to program this way but don't have the massive resources required
    often wind up with seriously bloated and buggy code.
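
    To make the point concrete, here is a minimal sketch (the structure
    and names are invented for illustration) of the unwind ladder that
    checking just a handful of allocations forces on you; every caller
    of such a function then needs its own unwind path on top of it:

        /*
         * Illustrative only: build a hypothetical per-session structure,
         * backing out of every partially-completed allocation on failure.
         */
        #include <stdlib.h>

        struct session {
            char *inbuf;
            char *outbuf;
            char *scratch;
        };

        struct session *
        session_alloc(size_t bufsize)
        {
            struct session *sp;

            if ((sp = malloc(sizeof(*sp))) == NULL)
                return (NULL);
            if ((sp->inbuf = malloc(bufsize)) == NULL)
                goto fail1;
            if ((sp->outbuf = malloc(bufsize)) == NULL)
                goto fail2;
            if ((sp->scratch = malloc(bufsize)) == NULL)
                goto fail3;
            return (sp);
        fail3:
            free(sp->outbuf);
        fail2:
            free(sp->inbuf);
        fail1:
            free(sp);
            return (NULL);
        }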

    So if the system runs out of memory (meaning physical memory + all
    available swap), having a random subset of programs of any size
    start to fail will rapidly result in a completely unusable system
    and only a reboot will get it back into shape.  At least until it
    runs out of memory again.

    --

    The best one can do is make the failures more deterministic.  Killing
    the largest program is one such mechanism.  Knowing how the system will
    react makes it easier to restore the system without necessarily rebooting
    it.  Of course there might have to be exceptions, as you don't want
    your X server to be the program chosen.  Generally, though, having some
    sort of deterministic progression is going to be far better than having
    half a dozen random programs which happen to be trying to allocate memory
    suddenly get an unexpected memory allocation failure.
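
    FreeBSD does give you a hook for that sort of exception.  If I recall
    the madvise(2) interface correctly, a privileged process can ask to be
    exempted from the out-of-swap kill with the MADV_PROTECT advice.  A
    rough sketch:

        /*
         * Rough sketch: a long-running, must-stay-up process (an X
         * server, say) opting out of the out-of-swap process kill.
         * Needs superuser; the address/length arguments are ignored
         * for this particular advice.
         */
        #include <sys/mman.h>
        #include <stdio.h>

        int
        main(void)
        {
            if (madvise(NULL, 0, MADV_PROTECT) == -1)
                perror("madvise(MADV_PROTECT)");
            /* ... the rest of the program's main loop ... */
            return (0);
        }

    Used sparingly, that keeps the progression deterministic while sparing
    the handful of processes the machine cannot do without.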

    Also, it's really a non-problem.  Simply configure a lot of swap... like
    8G or 16G if you really care.  Or 100G.  Then you *DO* get a graceful
    failure which gives you time to figure out what is going on and fix it.
    The graceful failure is that the system starts to page to swap more and
    more heavily, getting slower and slower in the process, but the system
    doesn't
    actually have to kill anything for minutes to hours depending on the
    failure condition.
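
    On FreeBSD that is just a matter of reserving a big enough swap
    partition and listing it in /etc/fstab.  Assuming the partition is
    ad0s1b (substitute whatever your disk really is), the entry would
    look like:

        /dev/ad0s1b    none    swap    sw    0    0

    and a `swapon -a' (or the next reboot) activates it.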

    It's a lot easier to write code which reacts to a system which is
    operating at less than ideal efficiency than it is to write code which
    reacts to the failure of a core function (that of allocating memory).
    One could even monitor swap use and ring the alarm bells if it goes above
    a certain point.
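
    Something as simple as the sketch below, run out of cron, would do.
    It just scrapes swapinfo(8) output, and the 80% threshold is an
    arbitrary choice:

        /*
         * Sketch of a swap-usage alarm: scrape `swapinfo -k' and
         * complain when overall utilization crosses a threshold.
         */
        #include <stdio.h>
        #include <string.h>

        #define ALARM_PCT   80      /* arbitrary threshold */

        int
        main(void)
        {
            FILE *fp;
            char line[256], dev[128];
            long blocks, used, tblocks = 0, tused = 0;

            if ((fp = popen("/usr/sbin/swapinfo -k", "r")) == NULL) {
                perror("popen");
                return (1);
            }
            while (fgets(line, sizeof(line), fp) != NULL) {
                /* skip the header line and any "Total" summary line */
                if (sscanf(line, "%127s %ld %ld", dev, &blocks, &used) != 3)
                    continue;
                if (strcmp(dev, "Total") == 0)
                    continue;
                tblocks += blocks;
                tused += used;
            }
            pclose(fp);

            if (tblocks > 0 && tused * 100 / tblocks >= ALARM_PCT)
                printf("WARNING: swap is %ld%% used\n",
                    tused * 100 / tblocks);
            return (0);
        }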

    Overcommit has never been the problem.  The problem is there is no way
    a large system can gracefully deal with running out of memory, overcommit
    or not.

						-Matt



