From owner-freebsd-hackers  Tue Dec 14 20:45:21 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by hub.freebsd.org (Postfix) with ESMTP id AD0271540F
	for <freebsd-hackers@FreeBSD.ORG>; Tue, 14 Dec 1999 20:45:18 -0800 (PST)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id UAA25923;
	Tue, 14 Dec 1999 20:45:13 -0800 (PST)
	(envelope-from dillon)
Date: Tue, 14 Dec 1999 20:45:13 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <199912150445.UAA25923@apollo.backplane.com>
To: Ed Hall <edhall@screech.weirdnoise.com>
Cc: freebsd-hackers@FreeBSD.ORG, edhall@screech.weirdnoise.com
Subject: Re: VM Scan Rate: Speed Kills on 3.3
References:  <199912150159.RAA15697@screech.weirdnoise.com>
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:Under certain circumstances the VM scan rate can spike into the millions/sec
:(as reported by vmstat) followed quickly by a system lockup (an endless
:loop in vm_pageout()), suggesting that the page queue has been tied in a
:loop.  This effect was observed in a program ported from Solaris that
:updated a large file by mmap()'ing small parts of it.  Although using
:read()/write() eliminates the problem (and with a sizable increase in
:performance as well), there may be other triggers for this bug.
:
:(A side comment: although using mmap() for file updates in FreeBSD
:applications seems to perform quite poorly when compared to read()/write(), 
:this is not the case on some other systems, such as Solaris.  Also,
:there may be cases where the shared memory semantics of mmap() are
:important to an application such that conversion to read()/write() is
:not possible.)
:
:I've attached a small test program that provokes the same behavior as
:...

    This is a known problem that has not been fixed in 3.x; it has been
    mostly fixed in 4.x.  In a low-memory situation the page daemon winds
    up being the only system process left that is capable of cleaning
    (flushing) dirty pages to disk.  However, the page daemon cannot flush
    pages associated with files whose vnodes are locked.  So the lockup
    occurs when some other process (even another system process) holds the
    vnode locked and the system runs out of memory.  The page daemon scans
    through tonnes of pages but can't flush any of them due to the locked
    vnode.
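
    As a rough user-space illustration of that failure mode (a toy model
    only, not kernel code; every toy_* name is made up for this sketch),
    consider a scan over dirty pages that must skip any page whose vnode
    is locked:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's vm_page or vnode. */
struct toy_vnode {
	bool locked;		/* some other process holds the vnode lock */
};

struct toy_page {
	bool dirty;		/* page must be written out before reuse */
	struct toy_vnode *vp;	/* file the page belongs to */
};

struct toy_scan_result {
	size_t scanned;		/* pages examined in this pass */
	size_t flushed;		/* pages actually cleaned in this pass */
};

/*
 * One pass of a toy page-out scan: every page is examined, but a dirty
 * page can only be cleaned if its vnode is not locked.
 */
static struct toy_scan_result
toy_pageout_scan(struct toy_page *pages, size_t n)
{
	struct toy_scan_result r = { 0, 0 };

	for (size_t i = 0; i < n; i++) {
		r.scanned++;
		if (!pages[i].dirty)
			continue;	/* already clean, nothing to do */
		if (pages[i].vp->locked)
			continue;	/* cannot flush: vnode is locked */
		pages[i].dirty = false;	/* pretend the write succeeded */
		r.flushed++;
	}
	return r;
}
```

    If every dirty page belongs to the one locked vnode, each pass scans
    all of the pages and flushes none, so repeating the pass makes no
    progress no matter how fast it runs.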

    There is no simple solution for 3.x.  It may be possible to place a
    workaround in vm_fault to block early on a low-memory condition, before
    memory becomes critical, but it would be a pretty nasty hack.

    The hack below (for 3.x) is not something I would commit to the tree 
    because it is too drastic, but try it and see if it solves your problem.

						-Matt

Index: vm_fault.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_fault.c,v
retrieving revision 1.93.2.4
diff -u -r1.93.2.4 vm_fault.c
--- vm_fault.c	1999/08/29 16:33:30	1.93.2.4
+++ vm_fault.c	1999/12/15 04:43:57
@@ -191,6 +191,10 @@
 RetryFault:;
 	fs.map = map;
 
+	while ((fault_type & VM_PROT_WRITE) && (cnt.v_free_count + cnt.v_cache_count) < cnt.v_free_min) {
+		VM_WAIT;
+	}
+
 	/*
 	 * Find the backing store object and offset into it to begin the
 	 * search.

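    For readers who want to check the arithmetic, the condition tested by
    the added loop can be sketched in user space as below.  This is only
    an approximation under stated assumptions: struct toy_vmmeter and
    toy_should_wait() are hypothetical names, TOY_PROT_WRITE stands in
    for VM_PROT_WRITE, and the real code calls VM_WAIT while the
    condition holds instead of returning a flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for VM_PROT_WRITE; defined here for the sketch. */
#define TOY_PROT_WRITE	0x02

/* Hypothetical mirror of the three counters the patch reads from cnt. */
struct toy_vmmeter {
	unsigned free_count;	/* models cnt.v_free_count */
	unsigned cache_count;	/* models cnt.v_cache_count */
	unsigned free_min;	/* models cnt.v_free_min */
};

/*
 * True when the patch's loop condition would hold: the fault is a write
 * fault and free + cache pages have dropped below free_min.
 */
static bool
toy_should_wait(int fault_type, const struct toy_vmmeter *c)
{
	return (fault_type & TOY_PROT_WRITE) != 0 &&
	    (c->free_count + c->cache_count) < c->free_min;
}
```

    With free_count = 10, cache_count = 5 and free_min = 64, a write
    fault would wait while a read fault would proceed; once free plus
    cache reaches free_min, write faults proceed as well.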
