Date: Thu, 12 Nov 2015 14:18:25 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Michael Tuexen
Cc: freebsd-arm <freebsd-arm@freebsd.org>
Subject: Re: Memory management issue on RPi?
Message-ID: <20151112121825.GJ2257@kib.kiev.ua>

On Thu, Nov 12, 2015 at 11:28:07AM +0100, Michael Tuexen wrote:
> Dear all,
>
> I'm experiencing a behaviour I don't expect.
> When running FreeBSD head on an RPi B+ without swap space,
> it shows the following behaviour on the console:
>
> [bsd10:~] tuexen% dd if=/dev/zero of=large_file bs=1m count=1024
> Nov 12 11:22:16 bsd10 kernel: pid 666 (sshd), uid 1002, was killed: out of swap space
> Nov 12 11:22:19 bsd10 kernel: pid 606 (thttpd), uid 65534, was killed: out of swap space
> Nov 12 11:22:24 bsd10 kernel: pid 316 (devd), uid 0, was killed: out of swap space
> Killed
> [bsd10:~] tuexen% Nov 12 11:22:27 bsd10 kernel: pid 676 (dd), uid 1002, was killed: out of swap space
> [bsd10:~] tuexen% uname -a
> FreeBSD bsd10.fh-muenster.de 11.0-CURRENT FreeBSD 11.0-CURRENT #10 r290676: Wed Nov 11 20:23:53 CET 2015 tuexen@bsd10.fh-muenster.de:/home/tuexen/head/sys/arm/compile/RPI-B arm
> [bsd10:~] tuexen% ls -l large_file
> -rw-r--r--  1 tuexen  tuexen  584056832 Nov 12 11:22 large_file
>
> Shouldn't I be able to use dd to generate an almost arbitrarily large file
> (limited by the filesystem, not by the memory)?

This is a known problem with the swap-less OOM. The following patch should
give you immediate relief. You might want to tweak the sysctl
vm.pageout_oom_seq if the default value is not right; it was selected by a
'try and see' approach on a very small (32 or 64 MB) i386 VM.
diff --git a/sys/vm/vm_page.h b/sys/vm/vm_page.h
index a87f682..1fa61eb 100644
--- a/sys/vm/vm_page.h
+++ b/sys/vm/vm_page.h
@@ -227,6 +227,7 @@ struct vm_domain {
 	long vmd_segs;	/* bitmask of the segments */
 	boolean_t vmd_oom;
 	int vmd_pass;	/* local pagedaemon pass */
+	int vmd_oom_seq;
 	int vmd_last_active_scan;
 	struct vm_page vmd_marker; /* marker for pagedaemon private use */
 	struct vm_page vmd_inacthead; /* marker for LRU-defeating insertions */
diff --git a/sys/vm/vm_pageout.c b/sys/vm/vm_pageout.c
index f564fb5..b956e25 100644
--- a/sys/vm/vm_pageout.c
+++ b/sys/vm/vm_pageout.c
@@ -122,7 +122,8 @@ static void vm_pageout_init(void);
 static int vm_pageout_clean(vm_page_t m);
 static int vm_pageout_cluster(vm_page_t m);
 static void vm_pageout_scan(struct vm_domain *vmd, int pass);
-static void vm_pageout_mightbe_oom(struct vm_domain *vmd, int pass);
+static void vm_pageout_mightbe_oom(struct vm_domain *vmd, int page_shortage,
+    int starting_page_shortage);
 
 SYSINIT(pagedaemon_init, SI_SUB_KTHREAD_PAGE, SI_ORDER_FIRST,
     vm_pageout_init, NULL);
@@ -158,6 +159,7 @@ SYSINIT(vmdaemon, SI_SUB_KTHREAD_VM, SI_ORDER_FIRST, kproc_start, &vm_kp);
 int vm_pages_needed;		/* Event on which pageout daemon sleeps */
 int vm_pageout_deficit;		/* Estimated number of pages deficit */
 int vm_pageout_wakeup_thresh;
+static int vm_pageout_oom_seq = 24;
 
 #if !defined(NO_SWAPPING)
 static int vm_pageout_req_swapout;	/* XXX */
@@ -223,6 +225,10 @@ static int pageout_lock_miss;
 SYSCTL_INT(_vm, OID_AUTO, pageout_lock_miss, CTLFLAG_RD,
 	&pageout_lock_miss, 0, "vget() lock misses during pageout");
 
+SYSCTL_INT(_vm, OID_AUTO, pageout_oom_seq,
+	CTLFLAG_RW, &vm_pageout_oom_seq, 0,
+	"back-to-back calls to oom detector to start OOM");
+
 #define VM_PAGEOUT_PAGE_COUNT 16
 int vm_pageout_page_count = VM_PAGEOUT_PAGE_COUNT;
 
@@ -1041,7 +1047,8 @@ vm_pageout_scan(struct vm_domain *vmd, int pass)
 	vm_object_t object;
 	long min_scan;
 	int act_delta, addl_page_shortage, deficit, error, maxlaunder, maxscan;
-	int page_shortage, scan_tick, scanned, vnodes_skipped;
+	int page_shortage, scan_tick, scanned, starting_page_shortage;
+	int vnodes_skipped;
 	boolean_t pageout_ok, queues_locked;
 
 	/*
@@ -1080,6 +1087,7 @@ vm_pageout_scan(struct vm_domain *vmd, int pass)
 		page_shortage = vm_paging_target() + deficit;
 	} else
 		page_shortage = deficit = 0;
+	starting_page_shortage = page_shortage;
 
 	/*
 	 * maxlaunder limits the number of dirty pages we flush per scan.
@@ -1343,6 +1351,12 @@ relock_queues:
 		(void)speedup_syncer();
 
 	/*
+	 * If the inactive queue scan fails repeatedly to meet its
+	 * target, kill the largest process.
+	 */
+	vm_pageout_mightbe_oom(vmd, page_shortage, starting_page_shortage);
+
+	/*
 	 * Compute the number of pages we want to try to move from the
 	 * active queue to the inactive queue.
 	 */
@@ -1453,15 +1467,6 @@ relock_queues:
 		}
 	}
 #endif
-
-	/*
-	 * If we are critically low on one of RAM or swap and low on
-	 * the other, kill the largest process.  However, we avoid
-	 * doing this on the first pass in order to give ourselves a
-	 * chance to flush out dirty vnode-backed pages and to allow
-	 * active pages to be moved to the inactive queue and reclaimed.
-	 */
-	vm_pageout_mightbe_oom(vmd, pass);
 }
 
 static int vm_pageout_oom_vote;
@@ -1472,12 +1477,17 @@ static int vm_pageout_oom_vote;
  * failed to reach free target is premature.
  */
 static void
-vm_pageout_mightbe_oom(struct vm_domain *vmd, int pass)
+vm_pageout_mightbe_oom(struct vm_domain *vmd, int page_shortage,
+    int starting_page_shortage)
 {
 	int old_vote;
 
-	if (pass <= 1 || !((swap_pager_avail < 64 && vm_page_count_min()) ||
-	    (swap_pager_full && vm_paging_target() > 0))) {
+	if (starting_page_shortage <= 0 || starting_page_shortage !=
+	    page_shortage)
+		vmd->vmd_oom_seq = 0;
+	else
+		vmd->vmd_oom_seq++;
+	if (vmd->vmd_oom_seq < vm_pageout_oom_seq) {
 		if (vmd->vmd_oom) {
 			vmd->vmd_oom = FALSE;
 			atomic_subtract_int(&vm_pageout_oom_vote, 1);
@@ -1485,6 +1495,12 @@ vm_pageout_mightbe_oom(struct vm_domain *vmd, int pass)
 		return;
 	}
 
+	/*
+	 * Do not follow the call sequence until OOM condition is
+	 * cleared.
+	 */
+	vmd->vmd_oom_seq = 0;
+
 	if (vmd->vmd_oom)
 		return;
 
@@ -1510,6 +1526,37 @@ vm_pageout_mightbe_oom(struct vm_domain *vmd, int pass)
 	atomic_subtract_int(&vm_pageout_oom_vote, 1);
 }
 
+static long
+vm_pageout_resident_count(struct vmspace *vmspace)
+{
+	vm_map_t map;
+	vm_map_entry_t entry;
+	vm_object_t obj;
+	long res;
+
+	map = &vmspace->vm_map;
+	KASSERT(!map->system_map, ("system map"));
+	sx_assert(&map->lock, SA_LOCKED);
+	res = 0;
+	for (entry = map->header.next; entry != &map->header;
+	    entry = entry->next) {
+		if ((entry->eflags & MAP_ENTRY_IS_SUB_MAP) != 0)
+			continue;
+		obj = entry->object.vm_object;
+		if (obj == NULL)
+			continue;
+		switch (obj->type) {
+		case OBJT_DEFAULT:
+		case OBJT_SWAP:
+		case OBJT_VNODE:
+		case OBJT_PHYS:
+			res += obj->resident_page_count;
+			break;
+		}
+	}
+	return (res);
+}
+
 void
 vm_pageout_oom(int shortage)
 {
@@ -1554,7 +1601,8 @@ vm_pageout_oom(int shortage)
 			if (!TD_ON_RUNQ(td) &&
 			    !TD_IS_RUNNING(td) &&
 			    !TD_IS_SLEEPING(td) &&
-			    !TD_IS_SUSPENDED(td)) {
+			    !TD_IS_SUSPENDED(td) &&
+			    !TD_IS_SWAPPED(td)) {
 				thread_unlock(td);
 				breakout = 1;
 				break;
@@ -1582,12 +1630,13 @@ vm_pageout_oom(int shortage)
 		}
 		PROC_UNLOCK(p);
 		size = vmspace_swap_count(vm);
-		vm_map_unlock_read(&vm->vm_map);
 		if (shortage == VM_OOM_MEM)
-			size += vmspace_resident_count(vm);
+			size += vm_pageout_resident_count(vm);
+		vm_map_unlock_read(&vm->vm_map);
 		vmspace_free(vm);
+
 		/*
-		 * if the this process is bigger than the biggest one
+		 * If this process is bigger than the biggest one,
 		 * remember it.
 		 */
 		if (size > bigsize) {
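
For reference, here is a minimal userland sketch (not part of the patch
itself) of how the new vm.pageout_oom_seq knob could be read and raised
through sysctlbyname(3) once the patch is applied. The value 120 is an
arbitrary example, and setting the knob requires root:

/*
 * Sketch: query and raise vm.pageout_oom_seq, the number of
 * back-to-back failed pagedaemon scans required before the OOM
 * killer is started.  Assumes the patch above is applied; the
 * new value below is an arbitrary example, not a recommendation.
 */
#include <sys/types.h>
#include <sys/sysctl.h>

#include <err.h>
#include <stdio.h>

int
main(void)
{
	int newval, oldval;
	size_t oldlen;

	oldlen = sizeof(oldval);
	if (sysctlbyname("vm.pageout_oom_seq", &oldval, &oldlen,
	    NULL, 0) == -1)
		err(1, "sysctlbyname(get)");
	printf("vm.pageout_oom_seq: %d\n", oldval);

	newval = 120;	/* arbitrary example; larger is more patient */
	if (sysctlbyname("vm.pageout_oom_seq", NULL, NULL,
	    &newval, sizeof(newval)) == -1)
		err(1, "sysctlbyname(set)");	/* needs root */
	return (0);
}

The same adjustment can of course be made from the shell with sysctl(8),
e.g. 'sysctl vm.pageout_oom_seq=120', or persisted in /etc/sysctl.conf;
the point is only that the knob is a plain read-write integer.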