Date: Wed, 25 Jun 2008 08:26:11 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: FreeBSD-gnats-submit@FreeBSD.org Subject: kern/124963: old pagezero fixes for alc Message-ID: <20080625082417.C25463@besplex.bde.org> Resent-Message-ID: <200806242230.m5OMU5AK076872@freefall.freebsd.org>
next in thread | raw e-mail | index | archive | help
>Number: 124963 >Category: kern >Synopsis: old pagezero fixes for alc >Confidential: no >Severity: non-critical >Priority: low >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: change-request >Submitter-Id: current-users >Arrival-Date: Tue Jun 24 22:30:05 UTC 2008 >Closed-Date: >Last-Modified: >Originator: Bruce Evans >Release: FreeBSD 5.2-CURRENT i386 >Organization: >Environment: System: FreeBSD besplex.bde.org 5.2-CURRENT FreeBSD 5.2-CURRENT #4295: Tue Mar 4 09:02:36 EST 2008 bde@besplex.bde.org:/c/obj/usr/src/sys/compile/BESPLEX i386 >Description: alc asked me to submit these patches for the (currently disabled) pagezero thread as a PR. I'm getting tired of maintaining these patches -- there was a patch conflict in them today. Please commit some. They are hard to test since idlezero (sic) is disabled. Disabling idlezero still loses a few percent of performance relative to old versions of FreeBSD with idlezero and no vm_phys in cases where there is enough idle time to actually run idlezero (e.g., for a kernel build starting with sources not cached). In other cases, I don't see much difference for either idlezero or vm_phys. >How-To-Repeat: >Fix: % Index: vm_phys.c % =================================================================== % RCS file: /home/ncvs/src/sys/vm/vm_phys.c,v % retrieving revision 1.9 % diff -u -2 -r1.9 vm_phys.c % --- vm_phys.c 6 Apr 2008 18:09:28 -0000 1.9 % +++ vm_phys.c 19 Apr 2008 23:09:00 -0000 % @@ -41,6 +41,8 @@ % #include <sys/malloc.h> % #include <sys/mutex.h> % +#include <sys/proc.h> % #include <sys/queue.h> % #include <sys/sbuf.h> % +#include <sys/sched.h> % #include <sys/sysctl.h> % #include <sys/vmmeter.h> % @@ -552,7 +554,18 @@ % cnt.v_free_count--; % mtx_unlock(&vm_page_queue_free_mtx); % +#ifndef PREEMPTION_AND_PREEMPTION_WORKS % + if (sched_runnable()) { % + thread_lock(curthread); % + critical_exit(); % + mi_switch(SW_VOL | SWT_IDLE, % + NULL); % + thread_unlock(curthread); % + } else % +#endif % + critical_exit(); % pmap_zero_page_idle(m_tmp); % m_tmp->flags |= PG_ZERO; % mtx_lock(&vm_page_queue_free_mtx); % + critical_enter(); % cnt.v_free_count++; % vm_phys_free_pages(m_tmp, 0); Move stuff here, fix bugs in it, and always configure voluntarily preemption: - preemption should be here since it should be considered for every page. It must be here since preemption while holding the mutex cannot be good, and the caller shouldn't drop the mutex. - involuntary preemption while holding the mutex cannot be good either, so the caller prevents it using critical_enter() (patch originally by jhb). - it may be better better keep critical_enter() until after zeroing the page. - I don't know why I put the voluntary preemption before the page zeroing instead of after. Perhaps just to minimise the ugliness in the else clause. - always configure voluntary preemption, since invountary preemption had never worked at the time when these patches were written, for SMP at least with SCHED_4BSD and no IPI_PREEMPTION. In at least this case, one CPU tended to be left running pagezero for too long, since nothing sent it an IPI to tell it not to, and this was very bad for performance, apparently since the CPU spent its time thrashing caches when it should be doing useful work. % Index: vm_zeroidle.c % =================================================================== % RCS file: /home/ncvs/src/sys/vm/vm_zeroidle.c,v % retrieving revision 1.52 % diff -u -2 -r1.52 vm_zeroidle.c % --- vm_zeroidle.c 17 Apr 2008 04:20:10 -0000 1.52 % +++ vm_zeroidle.c 19 Apr 2008 23:04:56 -0000 % @@ -122,18 +122,14 @@ % % mtx_lock(&vm_page_queue_free_mtx); % + critical_enter(); See above. % for (;;) { % if (vm_page_zero_check()) { % vm_page_zero_idle(); % -#ifndef PREEMPTION % - if (sched_runnable()) { % - thread_lock(curthread); % - mi_switch(SW_VOL | SWT_IDLE, NULL); % - thread_unlock(curthread); % - } % -#endif This was moved after fixing it. % } else { % wakeup_needed = TRUE; % + critical_exit(); % msleep(&zero_state, &vm_page_queue_free_mtx, 0, % "pgzero", hz * 300); % + critical_enter(); % } % } Just always release critical_enter() when releasing the mutex. I think we would prefer to acquire critical_enter() before the mutex and release it after releasing the mutex (which we half do in vm_page_zero_idle(), but msleep() doesn't support that. Races from this of course only cause extra context switches, and I haven't noticed any cases where even the extra context switches prevented by critical_enter() make a difference. Another thing I tried here that didn't seem to make much difference was to run pagezero at a high priority except transiently when it gives up control. critical_enter() does essentially the same thing but more hackishly. >Release-Note: >Audit-Trail: >Unformatted:
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20080625082417.C25463>