Date:      Sun, 9 Apr 2017 15:27:15 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Mark Millard <markmi@dsl-only.net>
Cc:        freebsd-arm <freebsd-arm@freebsd.org>, freebsd-hackers@freebsd.org, andrew@freebsd.org
Subject:   Re: The arm64 fork-then-swap-out-then-swap-in failures: a program source for exploring them
Message-ID:  <20170409122715.GF1788@kib.kiev.ua>
In-Reply-To: <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>
References:  <4DEA2D76-9F27-426D-A8D2-F07B16575FB9@dsl-only.net> <163B37B0-55D6-498E-8F52-9A95C036CDFA@dsl-only.net> <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>

On Sat, Apr 08, 2017 at 06:02:00PM -0700, Mark Millard wrote:
> [I've identified the code path involved in the arm64 problem of small
> allocations turning into zeros after a later fork-then-swap-out-then-swap-in,
> specifically the ongoing RES(ident memory) size decrease that
> "top -PCwaopid" shows before the fork/swap sequence. Hopefully
> I've also exposed enough related information for someone who
> knows what they are doing to get started on a specific
> investigation, looking for a fix. I'd like a pine64+
> 2GB to have buildworld complete despite the forking and
> swapping involved (yep: for a time, zero RES(ident memory) for
> some processes involved in the build).]

I was not able to follow the walls of text, but I do not think that
pmap_ts_reference() is the real culprit there.

Is my impression right that the issue occurs on fork and looks like
memory corruption, where some page suddenly becomes zero-filled?
And swapping seems to be involved?  It would be interesting to see
whether the problem is reproducible on non-arm64 machines, e.g. armv7 or amd64.
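Something along these lines (untested, and only a sketch of my
understanding of Mark's scenario, not his actual test program) ought
to exercise the same path on any architecture; external memory
pressure, e.g. a parallel buildworld, is still needed to force the
swap-out while the processes sleep:

	#include <err.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <unistd.h>

	#define	NALLOC	4096
	#define	SZ	256		/* "small allocations" */

	int
	main(void)
	{
		char *p[NALLOC];
		int i, j, bad;

		/* Fill many small allocations with a known pattern. */
		for (i = 0; i < NALLOC; i++) {
			if ((p[i] = malloc(SZ)) == NULL)
				err(1, "malloc");
			memset(p[i], 0xa5, SZ);
		}

		/*
		 * Fork, then let both processes sit idle so that external
		 * memory pressure can swap them out and back in.
		 */
		if (fork() == -1)
			err(1, "fork");
		sleep(600);

		/* Count allocations that came back zero-filled. */
		bad = 0;
		for (i = 0; i < NALLOC; i++)
			for (j = 0; j < SZ; j++)
				if (p[i][j] != (char)0xa5) {
					bad++;
					break;
				}
		printf("pid %d: %d corrupted allocations\n",
		    (int)getpid(), bad);
		return (bad != 0);
	}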

If the answers to my two questions are yes, there is probably a bug in
the arm64 pmap's handling of the dirty bit emulation.  ARMv8.0 does not
provide a hardware dirty bit, so pmap interprets an accessed writeable
page as unconditionally dirty.  Moreover, the accessed bit is not
maintained by hardware either; it should instead be set by pmap, and
the arm64 pmap sets the AF bit unconditionally when creating a valid PTE.
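
In pte terms the emulation amounts to roughly the following (an
illustrative sketch only, using the ATTR_* names that appear in the
patch below rather than the exact helper from pmap.c):

	/*
	 * Sketch: with no hardware dirty bit, an accessed (AF) and
	 * writeable (AP read-only bit clear) pte is treated as dirty.
	 */
	static __inline int
	page_dirty_emulated(pt_entry_t l3)
	{
		return ((l3 & ATTR_AF) != 0 &&
		    (l3 & ATTR_AP(ATTR_AP_RO)) == 0);
	}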

Hmm, could you try the following patch?  I have not even compiled it.

diff --git a/sys/arm64/arm64/pmap.c b/sys/arm64/arm64/pmap.c
index 3d5756ba891..55aa402eb1c 100644
--- a/sys/arm64/arm64/pmap.c
+++ b/sys/arm64/arm64/pmap.c
@@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot)
 		    sva += L3_SIZE) {
 			l3 = pmap_load(l3p);
 			if (pmap_l3_valid(l3)) {
+				if ((l3 & ATTR_SW_MANAGED) &&
+				    pmap_page_dirty(l3)) {
+					vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
+					    ~ATTR_MASK));
+				}
 				pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
 				PTE_SYNC(l3p);
 				/* XXX: Use pmap_invalidate_range */
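
The intent of the change: pmap_protect() downgrades the mappings to
read-only, and with the emulation above the only record of a page being
dirty is the pte itself.  If that state is not transferred to the
vm_page with vm_page_dirty() before the downgrade, the VM system could
later consider the page clean and discard it instead of writing it to
swap, and the next fault would bring it back zero-filled, which would
match the symptom described.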


