Date:      Thu, 7 Dec 2017 20:12:36 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        Larry McVoy <lm@mcvoy.com>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: OOM problem?
Message-ID:  <80D1ECE3-D983-4DFB-9B28-3F716F73CD47@dsl-only.net>
In-Reply-To: <20171208011430.GA16016@mcvoy.com>
References:  <20171208011430.GA16016@mcvoy.com>

[Just a pointer to a potential example report on the
lists.]

On 2017-Dec-7, at 5:14 PM, Larry McVoy <lm at mcvoy.com> wrote:

> . . .
> It's sort of an ugly problem in that
> when it happens your only recourse is to power cycle the machine, you
> can't kill off the processes causing the problem.

If there is a serial console, can something like, say, CR TILDE CTRL-B
get to the db> prompt? (That sequence is what a kernel built with
options ALT_BREAK_TO_DEBUGGER accepts, for example.)

> . . .
> 
> Here is the problem.  All of these "misbehaved" (by using lots of ram)
> processes go to sleep, I believe in vm_wait().  They are all waiting
> for more ram so the pageout daemon is kicked but to no avail, all the
> ram is tied up in the processes that want more ram.  The pageout daemon
> kicks out what it can but it quickly gets to the point that it scans
> everything and finds nothing (I know this because I added debugging to
> show that's what it is doing).
> 
> The OOM code kicks in and it behaves poorly.  It doesn't kill any of
> the big processes, those are all sleeping without PCATCH on so they are
> skipped.  The OOM code starts killing off anything it can find, it was
> killing getty, ssh, bash, dhclient.  One buglet is that, in my opinion,
> it finds stuff to kill that it probably shouldn't.  Anything that init
> will respawn is fine, anything that would not be respawned should be 
> run as not killable.  Seems like an audit of those processes might be
> in order.

https://lists.freebsd.org/pipermail/freebsd-hackers/2017-December/051890.html

may be an example of the problem on an rpi2, but with a swap
partition in use. I was able to get to the db> prompt and
included some basic information from there. The system was
based on head -r326192.

(I did eventually reboot the rpi2 so I no longer have that
specific context available to examine.)
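
[As a rough illustration of the victim selection Larry describes
above, here is a toy model in Python. It is not the kernel's code:
the Proc class, the killable_sleep flag (standing in for "sleeping
with PCATCH"), the "largest killable process first" rule, and all
of the sizes are assumptions made up for the sketch.]

```python
from dataclasses import dataclass


@dataclass
class Proc:
    name: str
    rss_mb: int           # resident memory in MB (made-up numbers)
    killable_sleep: bool  # False models a sleep without PCATCH


def pick_victim(procs):
    """Return the largest process that can be killed, or None."""
    candidates = [p for p in procs if p.killable_sleep]
    return max(candidates, key=lambda p: p.rss_mb, default=None)


def simulate_oom(procs, needed_mb):
    """Kill victims until needed_mb is freed or no candidate remains."""
    procs = list(procs)
    killed, freed = [], 0
    while freed < needed_mb:
        victim = pick_victim(procs)
        if victim is None:
            break  # only unkillable sleepers are left
        procs.remove(victim)
        killed.append(victim.name)
        freed += victim.rss_mb
    return killed, freed


# Two hypothetical memory hogs stuck waiting for pages, plus the
# small processes named in the quoted text.
procs = [
    Proc("hog1", 60000, False),
    Proc("hog2", 60000, False),
    Proc("getty", 2, True),
    Proc("sshd", 8, True),
    Proc("bash", 4, True),
    Proc("dhclient", 3, True),
]

killed, freed = simulate_oom(procs, needed_mb=1000)
print(killed, freed)
```

[In this toy model every small process is killed, only 17 MB of the
1000 MB wanted is freed, and both hogs survive, which is the shape
of the failure in the quoted description.]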

> I know that you'll ask why no swap?  Just add swap and the problem
> goes away.  Does it?  I don't think so, that's just kicking the can
> down the road.  If we add 256GB of swap now we have a 512GB bag to fill,
> fill that and I think we're right back to where we started.
> 
> . . .



===
Mark Millard
markmi at dsl-only.net



