Date:      Wed, 26 Jun 1996 03:09:58 -0700
From:      David Greenman <davidg@root.com>
To:        John Hay <jhay@mikom.csir.co.za>
Cc:        dyson@freebsd.org, stable@freebsd.org
Subject:   Re: Another try at the vm_pageout.c -stable diff 
Message-ID:  <199606261009.DAA00392@root.com>
In-Reply-To: Your message of "Sat, 26 Jun 1996 09:22:01 +0200." <199606260722.JAA03525@zibbi.mikom.csir.co.za> 

>I'm not sure how relevant this still is. I see that there were a few commits
>during the night (well, night for me here in SA).
>
>David, I have tried your patch, but the kernel dies with "panic: unwire:
>page not in pmap" as soon as the machine starts to swap. It did not say
>anything more.
>
>I also tried the patch that John Dyson sent, but it also died. I have
>written down the essentials, but I did not get a coredump.

   Okay, here's the scoop. The patches that John and I provided both contained
a major error - one of the parameters to a function (a pointer) was missing,
so the kernel used stack garbage instead. That is what was causing your panic.
Now, even with that fixed, the patch still doesn't fix the original problem.
It turns out there are two reasons for this: the recursion tracking mechanism
that is being used doesn't unwind as the stack depth decreases (so it's
broken), and the recursion depth is much deeper than we had originally
thought. I'm of the opinion right now that this is architecturally flawed
and we're going to have to re-implement it. It's not acceptable to release
2.1.5R with this code enabled because it will almost certainly lead to a
stack overflow under certain circumstances, and it's not acceptable to
disable the code because that causes another bug which results in extremely
poor performance (I couldn't believe how bad it really was - I ended up
halting the machine and rebooting it because I was tired of waiting more
than 5 minutes for it to stop thrashing on a test that should take 10
seconds).
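
   To make the "doesn't unwind" point concrete, here is a minimal sketch in
C. It is not the vm_pageout.c code; the struct and function names
(tracker, walk_broken, walk_fixed) are made up for illustration. The
broken version bumps its depth counter on entry but never drops it on
return, so after a few calls it reports a depth that the stack never
actually reached. The fixed version pairs every increment with a decrement.

```c
#include <assert.h>

/* Hypothetical depth tracker; not taken from vm_pageout.c. */
struct tracker {
    int depth;      /* what the tracker believes the current depth is */
    int max_depth;  /* deepest value the tracker has ever reported */
};

/* Broken: counts every entry but never unwinds on return. */
static void walk_broken(struct tracker *t, int levels)
{
    t->depth++;
    if (t->depth > t->max_depth)
        t->max_depth = t->depth;
    if (levels > 1)
        walk_broken(t, levels - 1);
    /* missing: t->depth-- */
}

/* Fixed: each increment on entry is paired with a decrement on exit. */
static void walk_fixed(struct tracker *t, int levels)
{
    t->depth++;
    if (t->depth > t->max_depth)
        t->max_depth = t->depth;
    if (levels > 1)
        walk_fixed(t, levels - 1);
    assert(t->depth > 0);   /* sanity check before unwinding */
    t->depth--;
}
```

   Calling each walker twice with three levels shows the difference: the
real stack depth never goes past three in either case, but the broken
tracker keeps accumulating across calls while the fixed one returns to
zero.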
   The problem with disabling it is that you pretty much have to completely
disable swapping. Without that code in the vmdaemon, the swapping code
effectively stops all your processes from running, but doesn't bother to
actually free up any memory. The system eventually realizes that it needs
to swap in a process and pages out a page or three first to do this. The
process then runs for a _very_ short period and then the pattern repeats.
This is why it appears that pages are going out only one or two at a time
(because they are! :-)). I think the RSS limiting is evil and can't be
effectively implemented given our VM architecture. This is because, while
we can LRU order pages within VM objects, we really have no way of knowing
which object in the process to trim pages out of. The result is that we
effectively get a highly degenerate paging algorithm whenever RSS limiting
gets into the picture. In some operating systems, such
as VMS, where the pages in the process are more or less all one big LRU
ordered glob, RSS limiting can be implemented without this problem. This isn't
the case with FreeBSD, so I'm going to suggest to John that we kill it in both
-stable and -current. We still need to handle whole process swapouts, however.
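
   The per-object LRU problem above can be sketched roughly as follows.
This is a toy model in C, not FreeBSD's VM code: struct object, the
fixed-size arrays and all three function names are assumptions made up for
illustration. Each object keeps its own pages in LRU order, but nothing
orders pages across objects, so the per-object trimmer simply takes the
first object that still has resident pages. The last_use stamps are there
only as ground truth so the outcome can be checked; the per-object trimmer
deliberately never compares them across objects. The global trimmer stands
in for the "one big LRU ordered glob" case.

```c
#include <stddef.h>

#define PAGES_PER_OBJECT 3

/* Hypothetical model: pages are kept in LRU order, index 0 is oldest. */
struct object {
    int last_use[PAGES_PER_OBJECT];  /* ground truth, for checking only */
    int resident[PAGES_PER_OBJECT];  /* 1 if the page is still in memory */
};

/* Evict the oldest resident page of one object.
 * Returns its last_use stamp, or -1 if nothing is resident. */
static int trim_object(struct object *o)
{
    for (int i = 0; i < PAGES_PER_OBJECT; i++) {
        if (o->resident[i]) {
            o->resident[i] = 0;
            return o->last_use[i];
        }
    }
    return -1;
}

/* Per-object policy: with no cross-object ordering, just take the
 * first object that still has resident pages. */
static int trim_per_object(struct object *objs, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        int stamp = trim_object(&objs[k]);
        if (stamp >= 0)
            return stamp;
    }
    return -1;
}

/* Global policy: evict the oldest page across all objects. */
static int trim_global(struct object *objs, size_t n)
{
    struct object *best = NULL;
    int best_idx = -1;

    for (size_t k = 0; k < n; k++) {
        for (int i = 0; i < PAGES_PER_OBJECT; i++) {
            if (!objs[k].resident[i])
                continue;
            if (best == NULL ||
                objs[k].last_use[i] < best->last_use[best_idx]) {
                best = &objs[k];
                best_idx = i;
            }
            break;  /* first resident page is this object's oldest */
        }
    }
    if (best == NULL)
        return -1;
    best->resident[best_idx] = 0;
    return best->last_use[best_idx];
}
```

   With a recently used object listed first and a cold one second, the
per-object trimmer throws out the recently used pages while the cold ones
stay in memory, whereas the global trimmer takes the cold pages first.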
   I'll be looking at this more tonight and I'll discuss the issue with John
tomorrow.

-DG

David Greenman
Core-team/Principal Architect, The FreeBSD Project


