From owner-cvs-all@FreeBSD.ORG Thu May  6 13:08:42 2004
Date: Thu, 6 May 2004 15:08:40 -0500
From: Alan Cox
To: Bruce Evans
Cc: Alan Cox, cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org
Message-ID: <20040506200840.GT5199@cs.rice.edu>
References: <200405060503.i4653OfT061105@repoman.freebsd.org> <20040507010151.I21163@gamplex.bde.org>
In-Reply-To: <20040507010151.I21163@gamplex.bde.org>
Subject: Re: cvs commit: src/sys/fs/nwfs nwfs_io.c src/sys/fs/smbfs smbfs_io.c src/sys/fs/specfs spec_vnops.c src/sys/kern uipc_syscalls.c vfs_bio.c src/sys/nfsclient nfs_bio.c src/sys/vm swap_page
List-Id: CVS commit messages for the entire tree

On Fri, May 07, 2004 at 01:30:27AM +1000, Bruce Evans wrote:
> On Wed, 5 May 2004, Alan Cox wrote:
>
> > alc         2004/05/05 22:03:24 PDT
> >
> >   FreeBSD src repository
> >
> >   Modified files:
> >     sys/fs/nwfs          nwfs_io.c
> >     sys/fs/smbfs         smbfs_io.c
> >     sys/fs/specfs        spec_vnops.c
> >     sys/kern             uipc_syscalls.c vfs_bio.c
> >     sys/nfsclient        nfs_bio.c
> >     sys/vm               swap_pager.c vm_fault.c vnode_pager.c
> >   Log:
> >   Make vm_page's PG_ZERO flag immutable between the time of the page's
> >   allocation and deallocation.  This flag's principal use is shortly after
> >   allocation.  For such cases, clearing the flag is pointless.  The only
> >   unusual use of PG_ZERO is in vfs_bio_clrbuf().  However, allocbuf() never
> >   requests a prezeroed page.  So, vfs_bio_clrbuf() never sees a prezeroed
>                                                     ^^^^^ rarely(?)
> >   page.
> >
> >   Reviewed by:    tegge@
>
> The request for a prezeroed page is just a preference, so vfs_bio_clrbuf()
> certainly gets prezeroed pages.  This happens whenever there are only
> prezeroed pages to find.  I think vfs_bio_clrbuf() sees them too.  It
> saw them at least once a few years ago when my debugging code for this
> finally triggered.  My kernel at the time had colorizing optimizations
> that probably made using a prezeroed page more likely.

No, HEAD and RELENG_4 differ.  You are describing RELENG_4.  In HEAD,
the flag is only returned if a zeroed page is requested:

	flags = PG_BUSY;
	if (m->flags & PG_ZERO) {
		vm_page_zero_count--;
		if (req & VM_ALLOC_ZERO)
			flags = PG_ZERO | PG_BUSY;
	}

Why did I do this?  Consider the RELENG_4 implementation of PG_ZERO and
VM_ALLOC_ZERO.  The concept in RELENG_4 was to track the state of a page
from birth to death.  See vm/vm_page.h revision 1.55.  This was never
finished and enabled.  We were, however, left with a model in which every
caller to vm_page_alloc() was (potentially) responsible for clearing
PG_ZERO, whether it wanted a prezeroed page or not.  In other words, if
the caller's use of the page was going to dirty it, the caller was always
responsible for clearing the flag.

What was the potential gain of this model?
If a prezeroed page was allocated but never dirtied before being freed,
it would automatically go back to the prezeroed queue.  How likely is
this?  Note that the most common case is captured "manually" by the
pmap.  When the pmap frees a page table page, it knows that the page is
zeroed.  So, it directs vm_page.c to place the page in the prezeroed
queue.

What is the downside of this model?  Clearing the flag requires an
atomic memory op in RELENG_4 and a mutex in HEAD.  Given that atomic
memory ops are getting more expensive with each successive generation
of Intel CPU, this is a dubious tradeoff.  In terms of programming
model, it's too easy for the author of a driver, module, file system,
etc. to forget to clear PG_ZERO, causing havoc.

Thus, prior to this commit, only those callers to vm_page_alloc() that
explicitly asked for a prezeroed page were responsible for clearing
PG_ZERO.  Now, no caller is.

Ultimately, I would like to reenable the RELENG_4-like behavior of
VM_ALLOC_ZERO.  In other words, PG_ZERO could be returned regardless of
VM_ALLOC_ZERO.  Before that happens, vfs_bio_clrbuf() needs further
scrutiny.

> The PG_ZERO optimizations in vfs_bio_clrbuf() were just worse than
> useless because they rarely applied (and they break immutability here).

Agreed.

Regards,
Alan