From owner-svn-src-stable-12@freebsd.org Tue Sep  3 19:28:00 2019
From: Konstantin Belousov <kib@FreeBSD.org>
Date: Tue, 3 Sep 2019 19:27:59 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-12@freebsd.org
Subject: svn commit: r351776 - stable/12/sys/vm
Message-Id: <201909031927.x83JRxLr057430@repo.freebsd.org>
X-SVN-Group: stable-12
X-SVN-Commit-Author: kib
X-SVN-Commit-Paths: stable/12/sys/vm
X-SVN-Commit-Revision: 351776
X-SVN-Commit-Repository: base
List-Id: SVN commit messages for only the 12-stable src tree

Author: kib
Date: Tue Sep  3 19:27:59 2019
New Revision: 351776
URL: https://svnweb.freebsd.org/changeset/base/351776

Log:
  MFC r351114:
  Fix OOM handling of some corner cases.

Modified:
  stable/12/sys/vm/vm_fault.c
  stable/12/sys/vm/vm_page.c
  stable/12/sys/vm/vm_pageout.c
  stable/12/sys/vm/vm_pageout.h

Directory Properties:
  stable/12/   (props changed)

Modified: stable/12/sys/vm/vm_fault.c
==============================================================================
--- stable/12/sys/vm/vm_fault.c	Tue Sep  3 19:14:00 2019	(r351775)
+++ stable/12/sys/vm/vm_fault.c	Tue Sep  3 19:27:59 2019	(r351776)
@@ -134,6 +134,18 @@ static void vm_fault_dontneed(const struct faultstate
 static void vm_fault_prefault(const struct faultstate *fs, vm_offset_t addra,
     int backward, int forward, bool obj_locked);
 
+static int vm_pfault_oom_attempts = 3;
+SYSCTL_INT(_vm, OID_AUTO, pfault_oom_attempts, CTLFLAG_RWTUN,
+    &vm_pfault_oom_attempts, 0,
+    "Number of page allocation attempts in page fault handler before it "
+    "triggers OOM handling");
+
+static int vm_pfault_oom_wait = 10;
+SYSCTL_INT(_vm, OID_AUTO, pfault_oom_wait, CTLFLAG_RWTUN,
+    &vm_pfault_oom_wait, 0,
+    "Number of seconds to wait for free pages before retrying "
+    "the page fault handler");
+
 static inline void
 release_page(struct faultstate *fs)
 {
@@ -568,7 +580,7 @@ vm_fault_hold(vm_map_t map, vm_offset_t vaddr, vm_prot
 	vm_pindex_t retry_pindex;
 	vm_prot_t prot, retry_prot;
 	int ahead, alloc_req, behind, cluster_offset, error, era, faultcount;
-	int locked, nera, result, rv;
+	int locked, nera, oom, result, rv;
 	u_char behavior;
 	boolean_t wired;	/* Passed by reference.
 				 */
 	bool dead, hardfault, is_first_object_locked;
@@ -579,7 +591,9 @@ vm_fault_hold(vm_map_t map, vm_offset_t vaddr, vm_prot
 	nera = -1;
 	hardfault = false;
 
-RetryFault:;
+RetryFault:
+	oom = 0;
+RetryFault_oom:
 
 	/*
 	 * Find the backing store object and offset into it to begin the
@@ -825,7 +839,18 @@ RetryFault:;
 		}
 		if (fs.m == NULL) {
 			unlock_and_deallocate(&fs);
-			vm_waitpfault(dset);
+			if (vm_pfault_oom_attempts < 0 ||
+			    oom < vm_pfault_oom_attempts) {
+				oom++;
+				vm_waitpfault(dset,
+				    vm_pfault_oom_wait * hz);
+				goto RetryFault_oom;
+			}
+			if (bootverbose)
+				printf(
+	"proc %d (%s) failed to alloc page on fault, starting OOM\n",
+				    curproc->p_pid, curproc->p_comm);
+			vm_pageout_oom(VM_OOM_MEM_PF);
 			goto RetryFault;
 		}
 	}

Modified: stable/12/sys/vm/vm_page.c
==============================================================================
--- stable/12/sys/vm/vm_page.c	Tue Sep  3 19:14:00 2019	(r351775)
+++ stable/12/sys/vm/vm_page.c	Tue Sep  3 19:27:59 2019	(r351776)
@@ -3062,7 +3062,7 @@ vm_domain_alloc_fail(struct vm_domain *vmd, vm_object_
  * this balance without careful testing first.
  */
 void
-vm_waitpfault(struct domainset *dset)
+vm_waitpfault(struct domainset *dset, int timo)
 {
 
 	/*
@@ -3074,7 +3074,7 @@ vm_waitpfault(struct domainset *dset)
 	if (vm_page_count_min_set(&dset->ds_mask)) {
 		vm_min_waiters++;
 		msleep(&vm_min_domains, &vm_domainset_lock, PUSER | PDROP,
-		    "pfault", 0);
+		    "pfault", timo);
 	} else
 		mtx_unlock(&vm_domainset_lock);
 }

Modified: stable/12/sys/vm/vm_pageout.c
==============================================================================
--- stable/12/sys/vm/vm_pageout.c	Tue Sep  3 19:14:00 2019	(r351775)
+++ stable/12/sys/vm/vm_pageout.c	Tue Sep  3 19:27:59 2019	(r351776)
@@ -1733,6 +1733,12 @@ vm_pageout_oom_pagecount(struct vmspace *vmspace)
 	return (res);
 }
 
+static int vm_oom_ratelim_last;
+static int vm_oom_pf_secs = 10;
+SYSCTL_INT(_vm, OID_AUTO, oom_pf_secs, CTLFLAG_RWTUN, &vm_oom_pf_secs, 0,
+    "");
+static struct mtx vm_oom_ratelim_mtx;
+
 void
 vm_pageout_oom(int shortage)
 {
@@ -1740,9 +1746,31 @@ vm_pageout_oom(int shortage)
 	vm_offset_t size, bigsize;
 	struct thread *td;
 	struct vmspace *vm;
+	int now;
 	bool breakout;
 
 	/*
+	 * For OOM requests originating from vm_fault(), there is a high
+	 * chance that a single large process faults simultaneously in
+	 * several threads.  Also, on an active system running many
+	 * processes of middle-size, like buildworld, all of them
+	 * could fault almost simultaneously as well.
+	 *
+	 * To avoid killing too many processes, rate-limit OOMs
+	 * initiated by vm_fault() time-outs on the waits for free
+	 * pages.
+	 */
+	mtx_lock(&vm_oom_ratelim_mtx);
+	now = ticks;
+	if (shortage == VM_OOM_MEM_PF &&
+	    (u_int)(now - vm_oom_ratelim_last) < hz * vm_oom_pf_secs) {
+		mtx_unlock(&vm_oom_ratelim_mtx);
+		return;
+	}
+	vm_oom_ratelim_last = now;
+	mtx_unlock(&vm_oom_ratelim_mtx);
+
+	/*
 	 * We keep the process bigproc locked once we find it to keep anyone
 	 * from messing with it; however, there is a possibility of
 	 * deadlock if process B is bigproc and one of its child processes
@@ -1806,7 +1834,7 @@ vm_pageout_oom(int shortage)
 			continue;
 		}
 		size = vmspace_swap_count(vm);
-		if (shortage == VM_OOM_MEM)
+		if (shortage == VM_OOM_MEM || shortage == VM_OOM_MEM_PF)
 			size += vm_pageout_oom_pagecount(vm);
 		vm_map_unlock_read(&vm->vm_map);
 		vmspace_free(vm);
@@ -2061,6 +2089,7 @@ vm_pageout(void)
 	p = curproc;
 	td = curthread;
 
+	mtx_init(&vm_oom_ratelim_mtx, "vmoomr", NULL, MTX_DEF);
 	swap_pager_swap_init();
 	for (first = -1, i = 0; i < vm_ndomains; i++) {
 		if (VM_DOMAIN_EMPTY(i)) {

Modified: stable/12/sys/vm/vm_pageout.h
==============================================================================
--- stable/12/sys/vm/vm_pageout.h	Tue Sep  3 19:14:00 2019	(r351775)
+++ stable/12/sys/vm/vm_pageout.h	Tue Sep  3 19:27:59 2019	(r351776)
@@ -79,7 +79,8 @@ extern int vm_page_max_wired;
 extern int vm_pageout_page_count;
 
 #define	VM_OOM_MEM	1
-#define	VM_OOM_SWAPZ	2
+#define	VM_OOM_MEM_PF	2
+#define	VM_OOM_SWAPZ	3
 
 /*
  * vm_lowmem flags.
@@ -96,7 +97,7 @@ extern int vm_pageout_page_count;
  */
 void vm_wait(vm_object_t obj);
-void vm_waitpfault(struct domainset *);
+void vm_waitpfault(struct domainset *, int timo);
 void vm_wait_domain(int domain);
 void vm_wait_min(void);
 void vm_wait_severe(void);