Date: Fri, 22 Feb 2002 12:42:04 -0800 (PST)
From: Matthew Dillon
Message-Id: <200202222042.g1MKg4u22700@apollo.backplane.com>
To: Andrew Mobbs
Cc: hackers@FreeBSD.ORG
Subject: Re2: msync performance
References: <15478.31998.459219.178549@chiark.greenend.org.uk>

:I recently raised PR 35195
:
:Details are in the PR, but in summary; performing a large amount of
:random IO to a large file through mmap, on a machine with a fair amount
:of free RAM, will cause a following msync to take a significant amount
:of time.
:
:I believe this is because msync walks the dirty buffer list by age,
:therefor will write blocks out in an order liable to cause a lot of
:disk seeks.
:
:My suggestion for a solution would be before starting the IO, to sort
:the dirty buffer list by location on logical disk, and coalesce
:adjacent blocks where possible.
:
:Before I volunteer to implement something like this, please could
:somebody check I'm correct in my analysis, and comment on the
:feasibility of my suggested solution.
:
:Thanks,
:
:--
:Andrew Mobbs - http://www.chiark.greenend.org.uk/~andrewm/

    I've looked at this some more.  I can fairly trivially improve
    sequential write efficiency if msync() is called on a range of
    dirty pages, and I can use the same code when msync() is called
    on a complete file *IF* the file is fairly small (no more than a
    hundred pages or so).

    But we have a serious problem when msync() is called on a very
    large file that may only contain a few dirty pages.  For example,
    if you have a 20GB file and you are mmap()ing portions of it, we
    can't iterate through the file offsets sequentially without eating
    an enormous amount of CPU (as in several seconds' worth of CPU, or
    even several minutes).  In this case we have to scan the object
    page list, which is not sorted.  Even so, the existing msync() code
    *DOES* cluster pages together into 64K chunks (though I notice that
    it does not appear to cluster the raw I/O).

    So, this falls back to your suggested solution: sort object->memq
    (it's the actual page queue that is the problem, not the object
    queue).

    Looking at it some more, I believe this may be a viable solution.
    I am going to work something up.

						-Matt
						Matthew Dillon
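
    The sort-and-coalesce idea discussed above can be illustrated with
    a small userland sketch.  This is not kernel code and not the
    change Matt went on to write: it assumes the dirty pages have
    already been collected into a plain array of page indices, and the
    names struct page_run, cmp_pindex() and coalesce_dirty_pages() are
    hypothetical, invented for this example.  It only shows how
    sorting an unsorted page list by offset turns scattered pages into
    a few contiguous runs that could each be written sequentially.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type: a run of adjacent dirty pages, [start, start + count). */
struct page_run {
    unsigned long start;   /* index of the first page in the run */
    unsigned long count;   /* number of consecutive pages in the run */
};

/* qsort() comparator: ascending order of page index. */
static int cmp_pindex(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;

    return (x > y) - (x < y);
}

/*
 * Sort the n page indices in place, then merge adjacent indices into
 * runs.  Duplicate indices are folded into the run that already covers
 * them.  The caller must provide room for n runs in out[]; the number
 * of runs actually written is returned.
 */
size_t coalesce_dirty_pages(unsigned long *pindex, size_t n,
                            struct page_run *out)
{
    size_t nruns = 0;

    if (n == 0)
        return 0;
    assert(pindex != NULL && out != NULL);

    qsort(pindex, n, sizeof(*pindex), cmp_pindex);

    for (size_t i = 0; i < n; i++) {
        if (nruns > 0) {
            struct page_run *last = &out[nruns - 1];

            if (pindex[i] < last->start + last->count)
                continue;               /* duplicate, already covered */
            if (pindex[i] == last->start + last->count) {
                last->count++;          /* extends the current run */
                continue;
            }
        }
        out[nruns].start = pindex[i];   /* gap: start a new run */
        out[nruns].count = 1;
        nruns++;
    }
    return nruns;
}
```

    Given the page indices 7, 3, 4, 9, 8, 3, the function produces two
    runs: pages 3-4 and pages 7-9.  A real implementation would also
    cap the run length (the existing code clusters into 64K chunks)
    and would work on the VM object's page queue rather than a copied
    array; both are left out of this sketch.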