Date:      Thu, 6 Sep 2018 00:08:28 -0700
From:      bob prohaska <fbsd@www.zefox.net>
To:        Mark Millard <marklmi@yahoo.com>
Cc:        "Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net>, freebsd-arm@freebsd.org, bob prohaska <fbsd@www.zefox.net>
Subject:   Re: RPI3 swap experiments (r338342 with vm.pageout_oom_seq="1024" and 6 GB swap)
Message-ID:  <20180906070828.GC3482@www.zefox.net>
In-Reply-To: <AA5BA4D3-5C06-4E18-B7D3-C8A5D0272F43@yahoo.com>
References:  <20180906003829.GC818@www.zefox.net> <201809060243.w862hq7o058504@pdx.rh.CN85.dnsmgr.net> <20180906042353.GA3482@www.zefox.net> <AA5BA4D3-5C06-4E18-B7D3-C8A5D0272F43@yahoo.com>

On Wed, Sep 05, 2018 at 11:20:14PM -0700, Mark Millard wrote:
> 
> 
> On 2018-Sep-5, at 9:23 PM, bob prohaska <fbsd at www.zefox.net> wrote:
> 
> > On Wed, Sep 05, 2018 at 07:43:52PM -0700, Rodney W. Grimes wrote:
> >> 
> >> What makes you believe that the VM system has any concept about
> >> the speed of swap devices?  IIRC it simply uses them in a round
> >> robin fashion with no knowledge of them being fast or slow, or
> >> shared with file systems or other stuff.
> >> 
> > 
> > Mostly the assertion that the OOMA kills which happened while the system
> > had plenty of free swap were caused by the swap being "too slow". If the
> > machine knows some swap is slow, it seems it should be capable of
> > discerning that other swap is faster.
> 
> If an RPI3 magically had a full-speed/low-latency optane context
> as its swap space, it would still get process kills for buildworld
> buildkernel for vm.pageout_oom_seq=12 for -j4 as I understand
> things at this point. (Presumes still having 1 GiByte of RAM.)
> 
> In other words: the long latency issues you have in your rpi3
> configuration may contribute to the detailed "just when did it
> fail" but low-latency/high-speed I/O would be unlikely to prevent
> kills from eventually happening during the llvm parts of buildworld .
> Free RAM would still be low for "long periods". Increasing
> vm.pageout_oom_seq is essential from what I can tell.
> 
Understood and accepted. I'm using vm.pageout_oom_seq=1024 at present.
The system struggles mightily, but it keeps going and finishes.
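
(In case it helps anyone reproducing this: a setting like the one in the
subject line can live in /boot/loader.conf, and I believe the same knob can
also be changed on a running system with sysctl. Roughly:

    # /boot/loader.conf
    vm.pageout_oom_seq="1024"

    # or, on a running system (assuming the knob is writable at runtime)
    sysctl vm.pageout_oom_seq=1024

Treat these as an illustration of what I mean rather than a recipe.)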

> vm.pageout_oom_seq is about controlling "how long". -j1 builds are
> about keeping less RAM active. (That is also the intent for use of
> LDFLAGS.lld+=-Wl,--no-threads .) Of course, for the workload involved,
> using a context with more RAM can avoid having "low RAM" for
> as long. An aarch64 board with 4 GiByte of RAM and 4 cores possibly
> has no problem for -j4 buildworld buildkernel for head at this
> point: Free RAM might well never be low during such a build in such
> a context.
> 
> (The quotes like "how long" are because I refer to the time
> consequences, the units are not time but I'm avoiding the detail.)
> 
> The killing criteria do not directly measure and test swapping I/O
> latencies or other such as far as I know. Such things are only
> involved indirectly via other consequences of the delays involved
> (when they are involved at all). That is my understanding.
> 
Perhaps I'm being naive here, but when one sees two devices holding
swap, one at ~25% busy and one at ~150% busy, it seems to beg for
a little selective pressure to divert traffic from the busier device
to the less busy one. Maybe it's impossible, maybe it's more trouble
than the VM folks want to invest. But just maybe it's doable and
worthwhile, to take full advantage of a cheap, power-efficient platform.
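
(For reference, the busy percentages above are the sort of thing gstat
reports; watching just the two swap devices is roughly a matter of:

    gstat -f 'da0|mmcsd0'

where the filter string is only an example for my device names.)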

I too am unsure of the metric for "too slow". From earlier discussion
I got the impression it was something like a count of how many cycles
of request and rejection (more likely, deferral) for swap space were
made; after a certain count is reached, OOMA is invoked. That picture
is sure to be simplistic, and may well be flat-out wrong.
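
(To make that picture concrete, here is a toy user-space sketch of the
counting idea as I understand it. The names are hypothetical and this is
not the actual vm_pageout code: just a counter of unproductive passes that
resets whenever the page daemon makes progress, and fires an OOM kill once
it exceeds something like vm.pageout_oom_seq.)

    #include <stdio.h>

    static int pageout_oom_seq = 1024;  /* mirrors vm.pageout_oom_seq */
    static int oom_pass_count = 0;      /* consecutive unproductive passes */

    /* Called at the end of each simulated page-daemon pass. */
    static void
    pageout_pass_done(int pages_freed, int shortage)
    {
            if (pages_freed >= shortage) {
                    oom_pass_count = 0;         /* made progress: reset */
                    return;
            }
            if (++oom_pass_count > pageout_oom_seq) {
                    printf("OOM: would pick and kill the largest process\n");
                    oom_pass_count = 0;         /* start counting again */
            }
    }

    int
    main(void)
    {
            int i;

            /* Simulate a long run of passes that never free enough pages. */
            for (i = 0; i < 3000; i++)
                    pageout_pass_done(0, 100);
            return (0);
    }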

If my picture is not wholly incorrect, it isn't a huge leap to ask for
swap device-by-device, and accept swap from the device that offers it first.
In the da0 vs mmcsd0 case, ask for swap on each in turn, first to say yes gets
the business. The busier one will get beaten in the race by the more
idle device, relieving the bottleneck to the extent of the faster device's
capacity. It isn't perfect, but it's an improvement.
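
(As a strawman only -- hypothetical names, nothing like the real swap
pager -- the selection step I have in mind is roughly: poll each swap
device in turn and hand the page to the first one whose queue isn't
already backed up, falling back to plain round robin if everything is
busy.)

    #include <stddef.h>
    #include <stdio.h>

    struct swapdev {
            const char *name;       /* e.g. "da0", "mmcsd0" */
            int pending;            /* outstanding I/O requests */
            int limit;              /* "busy" threshold for this device */
    };

    /*
     * Pick a device for the next pageout write: the first one that is
     * not backed up wins; if every device is busy, fall back to round
     * robin so no device is starved.
     */
    static struct swapdev *
    pick_swapdev(struct swapdev *devs, size_t ndevs, size_t *rr)
    {
            size_t i;

            for (i = 0; i < ndevs; i++) {
                    struct swapdev *sd = &devs[(*rr + i) % ndevs];

                    if (sd->pending < sd->limit) {
                            *rr = (*rr + i + 1) % ndevs;
                            return (sd);
                    }
            }
            *rr = (*rr + 1) % ndevs;        /* everyone busy: rotate anyway */
            return (&devs[*rr]);
    }

    int
    main(void)
    {
            struct swapdev devs[] = {
                    { "da0", 2, 8 },        /* lightly loaded USB disk */
                    { "mmcsd0", 9, 8 },     /* microSD already backed up */
            };
            size_t rr = 0;

            printf("next write goes to %s\n",
                pick_swapdev(devs, 2, &rr)->name);
            return (0);
    }

Whether anything like this is workable inside the real swap pager is
exactly the question; the sketch is only meant to show how little
arithmetic the policy itself would need.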

Thanks for reading!

bob prohaska
 


