Date:      Sat, 17 Jun 95 17:09:22 MDT
From:      terry@cs.weber.edu (Terry Lambert)
To:        taob@gate.sinica.edu.tw (Brian Tao)
Cc:        joerg_wunsch@uriah.heep.sax.de, hackers@FreeBSD.Org
Subject:   Re: bringing up freebsd
Message-ID:  <9506172309.AA16099@cs.weber.edu>
In-Reply-To: <Pine.BSI.3.91.950617153548.212M-100000@aries> from "Brian Tao" at Jun 17, 95 03:37:06 pm

> On Sat, 17 Jun 1995, J Wunsch wrote:
> > 
> > You are strongly urged to have more swap space than physical RAM.  See
> > Terry Lambert's excellent article on "his hobby-horse memory over-
> > commitment" in Usenet (comp.unix.bsd.freebsd.misc).  I forgot the
> > Subject, it was something with somebody who's been asking if it's
> > possible to eliminate all panics.
> 
>     Could someone (Terry?) kindly repost that article here?  I
> remember seeing it go by, but I was in "brief article scan mode" and
> didn't stop long enough to digest what was said.  Thanks.

It was just a rephrasing of my standard diatribe, with an admission
that the reason I go off like that is basically that I'd make the
tradeoff differently were I in control of the universe.  8-).

I'll repeat the whole thing because I rather like the fault tolerance
aspects.


                                        Terry Lambert
                                        terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.

=== BEGIN INCLUDED MATERIAL ================================================
Michael Dillon <michael@junction.net> wrote:
] I was just browsing a WWW page at Amdahl, the mainframe
] manufacturer when I came across the following:
] 
]     http://www.amdahl.com/doc/products/oes/cb.uts/utshist.html
] 
]     UTS 4.2 was engineered to eliminate all kernel panics (other UNIX 
]     operating systems based on a simple port of the base SVR4 source 
]     contain "panic" code that will stop the machine in unexpected 
]     situations). In the development of UTS 4.2, the base SVR4 code
]     was methodically "scrubbed" to create a run-time environment as
]     reliable as the S/390 hardware platform it serves. 
] 
] If they can do it, why can't FreeBSD do the same? I'm thinking that this
] problem is similar to the problems with TCP/IP congestion and that
] solutions could be found similarly.

Is that "marketing eliminated" or "engineering eliminated"?

I guaran-damn-tee you if there is a hardware fault, the machine
is going to suck mud, no matter what they do to the software.

Just like a short in the ethernet will take out a NetWare SFT
(Software Fault Tolerance) server.


Now there *are* two classes of panic.  One is the result of an
unrecoverable failure mode.  UTS has unrecoverable failure modes,
too -- don't let them kid you.  You handle these by panicking.

The second type of panic is one where the kernel agrees to do
something, then reneges on the agreement.  There are a lot of
cases, mostly based on probability, where the kernel will commit
to doing something that it thinks it can most likely do, but is
not 100% certain it can.  For instance, allowing a process to
start at all without knowing beforehand the maximum number of
dirty data pages it will use during its lifetime.

It's possible to get around most of these problems by not allowing
the overcommitting of resources; the problem with that is that on
the average, it's OK to overcommit resources, and doing so
will result in fewer overall resources being required for the
average case.
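
The trade-off can be put in numbers with a toy model.  This is only a
sketch under stated assumptions: the function names, the page counts and
the idea that each process declares a maximum up front are all
hypothetical, and no real kernel accounts for memory this simply.

```python
def admit_strict(declared_max, total_pages):
    """No overcommit: admit a process only if its declared worst case
    still fits in what has not already been reserved."""
    admitted, reserved = [], 0
    for index, worst_case in enumerate(declared_max):
        if reserved + worst_case <= total_pages:
            admitted.append(index)
            reserved += worst_case
    return admitted


def overcommit_outcome(actual_use, total_pages):
    """Overcommit: admit everything, then see whether the pages that
    were really dirtied fit.  Returns (demand, kernel_must_renege)."""
    demand = sum(actual_use)
    return demand, demand > total_pages


# Hypothetical machine: 500 pages, ten processes that each *might*
# dirty 100 pages.
print(admit_strict([100] * 10, 500))        # only the first five are admitted
print(overcommit_outcome([40] * 10, 500))   # typical use: all ten fit
print(overcommit_outcome([60] * 10, 500))   # bad day: the kernel must renege
```

The strict policy never has to go back on a promise, but it runs half
as many processes on the same hardware in the typical case.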

One of my favorite hobby-horses is memory overcommit.  The good
things about memory overcommit are:

o       Your total available memory is swap size + RAM size

o       You don't require real swap for clean text (and data, if
        correctly implemented) pages, since they can be reloaded
        from the file (this is called using the file as backing
        store).

o       Precommitting resources takes time, so not doing it means
        you can start executing code before you have it all in core.

o       The copy costs for the pages can be amortized over the
        runtime of the program.  The plus to this is that it
        grants the appearance of speed; the downside is that it
        actually detracts from overall speed during runtime binding
        (a problem most shared library implementations also have).
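
The first two points are simple arithmetic, sketched here.  The `Image`
class, the field names and the page counts are illustrative assumptions,
not any kernel's data structures; the non-overcommit case assumes every
page of RAM has real swap reserved behind it.

```python
from dataclasses import dataclass


@dataclass
class Image:
    text_pages: int        # clean code pages
    clean_data_pages: int  # initialized data that was never written
    dirty_pages: int       # pages the process has modified


def swap_needed(image, file_backed):
    """Swap pages one image needs.  With the file as backing store,
    clean pages can be reloaded from the file, so only dirty pages
    need real swap."""
    if file_backed:
        return image.dirty_pages
    return image.text_pages + image.clean_data_pages + image.dirty_pages


def total_available(ram_pages, swap_pages, overcommit):
    """With overcommit, RAM and swap add up; without it, usable memory
    is limited to swap size."""
    return ram_pages + swap_pages if overcommit else swap_pages


image = Image(text_pages=200, clean_data_pages=50, dirty_pages=30)
print(swap_needed(image, file_backed=True))    # dirty pages only
print(swap_needed(image, file_backed=False))   # the whole image
print(total_available(4096, 8192, overcommit=True))
print(total_available(4096, 8192, overcommit=False))
```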


The bad things are:

o       Unless your total available memory is limited to swap size
        (meaning that you have real swap space reserved as backing
        store for RAM), you can't guarantee hot shutdown/restart,
        and you can't guarantee enough space to support kernel
        dumps (in case of unrecoverable errors).

o       Using a program file as backing store causes problems: if
        the program was loaded over NFS, the NFS server must stay
        up to swap in pages; therefore the image is fragile to
        network outages (anyone who has used a diskless Sun would
        agree).  The "fix" for this would be to special case remote
        file systems to load remote images entirely into local swap.
        That only works in "dataless" configurations, not "diskless",
        since swap is also remote in the second case.  FreeBSD,
        NetBSD, SVR4, etc. typically don't implement this "fix".

o       Using the program image as a swap store makes the program
        fragile to modification.  This is the purpose of the VTEXT
        flag on an in core vnode on such systems, and attempts to
        modify the image result in an error return of ETXTBSY (a
        non-POSIX error return "extension").  The "fix" for this
        one is to fault the image to swap (and make the VM system
        "prefer" swap pages to disk pages -- something you want
        anyway, since a page reference from swap is much faster
        than one through the file system) and allow the modification
        to proceed.  Again, this is not typically implemented, and
        there are problems if the modification is not local to the
        machine doing the running, since the non-standard VTEXT
        flag is not propagated to a remote host (NFS/RFS).  In
        combination with forcing remotely executed code to local
        swap, this window is (mostly) closed.

o       Delayed startup (obviously: related to the size of the image
        being copied to swap).
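
The remote-image "fix" and its startup cost can be written down as a
small decision rule.  This is a sketch of the policy as described
above, not code from any of the systems named; the function names and
the string labels are made up for illustration.

```python
def backing_store(image_is_remote, swap_is_local):
    """Pick where an executable's pages are faulted in from.

    Copying a remote image into swap only helps when swap itself is
    local (the "dataless" case).  On a "diskless" machine swap is
    remote too, so nothing is gained.
    """
    if image_is_remote and swap_is_local:
        return "local swap"
    return "image file"


def startup_copy_pages(image_pages, store):
    """Pages that must be copied before the program can start; this
    is the delayed-startup cost, proportional to image size."""
    return image_pages if store == "local swap" else 0


for remote, local_swap in [(True, True), (True, False), (False, True)]:
    store = backing_store(remote, local_swap)
    print(remote, local_swap, store, startup_copy_pages(300, store))
```

Only the dataless case pays the copy up front, and in exchange the
running image no longer depends on the NFS server staying up.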


And this is just *one* of the overcommitted resources on the machine.


Obviously, it's a set of trade-offs between what the user is willing
to spend on hardware vs. what they get for their money.
=== END INCLUDED MATERIAL ==================================================


