Date:      Wed, 05 Nov 2014 19:08:38 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-bugs@FreeBSD.org
Subject:   [Bug 194513] zfs recv hangs in state kmem arena
Message-ID:  <bug-194513-8-zb0XA0KoVM@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-194513-8@https.bugs.freebsd.org/bugzilla/>
References:  <bug-194513-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194513

Andriy Gapon <avg@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Needs Triage                |In Discussion
                 CC|                            |alc@FreeBSD.org,
                   |                            |avg@FreeBSD.org,
                   |                            |jeffr@FreeBSD.org

--- Comment #4 from Andriy Gapon <avg@FreeBSD.org> ---
My personal opinion is that the problem is caused by a bug in the combination
of the new vmem-based code and the changes in the page daemon code.

When there is not enough KVA, the code wakes up the page daemon with the
expectation that it will make some more KVA available, but the page daemon may
not actually take any action.

Previously, the page daemon code checked the return value from msleep and made
a page out pass if it had been woken up.  Now the page daemon code performs a
pass only when it is woken up *and* vm_pages_needed is set.  As the comment
before pagedaemon_wakeup() explains, that function is not guaranteed to
actually wake up the page daemon unless vm_page_queue_free_mtx is held.  And
kmem_reclaim() does not hold vm_page_queue_free_mtx when it calls
pagedaemon_wakeup().
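
The difference between the two checks can be illustrated with a toy user-space
model.  This is only a sketch of the behaviour as described above, not FreeBSD
kernel code; struct toy_pagedaemon and all three functions are hypothetical
names made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: the state the page daemon sees when its sleep returns. */
struct toy_pagedaemon {
	bool woken;		/* the sleep ended because of a wakeup */
	bool vm_pages_needed;	/* mirrors the flag named in the comment */
};

/* Old check as described: being woken up is enough to run a pass. */
static bool
old_runs_pass(const struct toy_pagedaemon *pd)
{
	return (pd->woken);
}

/* New check as described: woken up *and* vm_pages_needed set. */
static bool
new_runs_pass(const struct toy_pagedaemon *pd)
{
	return (pd->woken && pd->vm_pages_needed);
}

/* True when the old check would act on a wakeup that the new one ignores. */
static bool
wakeup_is_lost(const struct toy_pagedaemon *pd)
{
	assert(pd != NULL);
	return (old_runs_pass(pd) && !new_runs_pass(pd));
}
```

In this model a wakeup that arrives while vm_pages_needed is clear is acted on
by the old check and ignored by the new one, which is the case a KVA shortage
would hit.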

Additionally, before the switch to vmem, kmem_malloc() used to directly invoke
the vm_lowmem hook and the uma_reclaim() function, as opposed to trying to
wake up the page daemon.

So, the old code would reliably free some KVA if there was any that could be
freed by the vm_lowmem hook and uma_reclaim().
But the new code makes a lame attempt to wake up the page daemon.

I believe that the above explains why you sometimes see processes stuck in
vmem_xalloc() and why your workaround works: when you really force the page
daemon to make a page out pass, it finally frees some KVA by invoking the
vm_lowmem hook and uma_reclaim().

I see two trivial possible solutions:
- hold vm_page_queue_free_mtx in kmem_reclaim() around the pagedaemon_wakeup()
  call
- directly call the vm_lowmem hook and uma_reclaim() instead of
  pagedaemon_wakeup() in kmem_reclaim()
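
To make the second option concrete, here is a toy user-space model that
contrasts a wakeup-based reclaim with a direct one.  It is a sketch under
stated assumptions, not a patch: struct toy_kva, its fields and every toy_*
function are hypothetical stand-ins, and the byte counts are invented.  The
first option is not modelled because it depends on real locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: KVA that the two reclaim paths could give back. */
struct toy_kva {
	size_t free_bytes;		/* KVA currently available */
	size_t lowmem_reclaimable;	/* held by vm_lowmem handlers */
	size_t uma_reclaimable;		/* held in UMA caches */
	bool daemon_acts_on_wakeup;	/* does the daemon's check pass? */
};

/* Stand-in for invoking the vm_lowmem hook. */
static void
toy_vm_lowmem(struct toy_kva *k)
{
	k->free_bytes += k->lowmem_reclaimable;
	k->lowmem_reclaimable = 0;
}

/* Stand-in for uma_reclaim(). */
static void
toy_uma_reclaim(struct toy_kva *k)
{
	k->free_bytes += k->uma_reclaimable;
	k->uma_reclaimable = 0;
}

/* Wakeup-based path: nothing is freed unless the daemon runs a pass. */
static void
toy_kmem_reclaim_wakeup(struct toy_kva *k)
{
	assert(k != NULL);
	if (k->daemon_acts_on_wakeup) {
		toy_vm_lowmem(k);
		toy_uma_reclaim(k);
	}
}

/* Second option: call both reclaim paths directly. */
static void
toy_kmem_reclaim_direct(struct toy_kva *k)
{
	assert(k != NULL);
	toy_vm_lowmem(k);
	toy_uma_reclaim(k);
}
```

In the model, the direct variant frees whatever is reclaimable regardless of
the page daemon's state, while the wakeup variant frees nothing when the
daemon ignores the wakeup.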

Not sure which one would be better.  Maybe there is an even better solution.

-- 
You are receiving this mail because:
You are the assignee for the bug.


