From: Matthew Dillon
Date: Tue, 13 Sep 2005 15:00:41 -0700 (PDT)
To: Kris Kennaway
Cc: current@freebsd.org
Subject: Re: 'swap_pager: indefinite wait buffer' with swapfile
Message-Id: <200509132200.j8DM0fHI005530@apollo.backplane.com>
References: <20050911075157.GA93947@xor.obsecurity.org>
List-Id: Discussions about the use of FreeBSD-current

:I configured a vnode-backed md and enabled swapping on it.  A few
:hours later after moderate swap use the console showed:
:
:swap_pager: indefinite wait buffer: bufobj: 0, blkno: 889347, size: 8192
:[...repeated...]
:
:The backing store was a sparse file, but there was ample space:

This is likely your problem.  Do NOT use a sparse file for file-backed swap.
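As a rough illustration of the advice above, here is a minimal sketch of how one might check whether a backing file is sparse and how to create one that is fully allocated. The helper names `is_sparse` and `preallocate` are made up for this example. Comparing `st_blocks` against `st_size` is only a heuristic: a filesystem with compression can report fewer blocks even for a file that was written in full.

```python
import os

BLOCK = 512  # POSIX reports st_blocks in 512-byte units


def is_sparse(path):
    """Heuristic: True if the file occupies less disk space than its length."""
    st = os.stat(path)
    return st.st_blocks * BLOCK < st.st_size


def preallocate(path, size, chunk=1 << 20):
    """Create a file of `size` bytes by writing zeros.

    Writing real data (rather than seeking or truncating to the final
    length) asks the filesystem to allocate every block up front.
    """
    zeros = bytes(chunk)
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(zeros[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # push the data out so the blocks are really allocated
```

Writing the zeros by hand does the same job as the usual `dd if=/dev/zero` approach. It does not guarantee that the resulting file is contiguous on disk.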
If you carefully backtrace the other processes on the stuck system I'll bet you will find one stuck in ffs_balloc (or similar) in addition to the one you found stuck in allocbuf().  If you do, then it's the same bug I reported to Kirk not too long ago.

I determined that there was a lock order reversal between the locking of indirect blocks (on files) and related data blocks.  It turns out that the data block must always be locked first because the standard BMAP path holds a locked data block while resolving the indirect block.  However, when a dirty VM page is written to sparse file backing store the indirect block winds up being locked before the data block.

It's only a problem when a VM fault has to allocate the underlying filesystem block (in our case, rtorrent was read()ing directly into a memory mapped sparse file).

In any case, the fix for FFS is /usr/src/sys/vfs/ufs/ffs_balloc.c:1.12 in the DragonFly source tree.  I believe that may fix your swap issue as well, but my first comment still stands:  Do NOT ever use a sparse file for swap backing store.  Massive fragmentation within the file will occur and if the machine swaps regularly then after a while it will no longer be able to optimize swap I/O and will be even MORE sludgy than the normal sludginess you get when you have to page to swap.

-Matt
Matthew Dillon
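
The lock order reversal described above (data block before indirect block on the BMAP path, the opposite order when a dirty page is written to a sparse file) can be illustrated with a toy order checker. This is only a sketch, loosely in the spirit of a lock-order verifier such as FreeBSD's WITNESS. It is not the kernel code or the actual fix. The class `LockOrderChecker` and the lock names "data" and "indirect" are invented for the example.

```python
import threading


class LockOrderChecker:
    """Record the order in which named locks are taken and flag reversals."""

    def __init__(self):
        self._before = set()            # (a, b): a was held when b was acquired
        self._held = threading.local()  # per-thread list of currently held names
        self._mutex = threading.Lock()  # protects self._before

    def acquire(self, name):
        held = getattr(self._held, "stack", None)
        if held is None:
            held = self._held.stack = []
        with self._mutex:
            # First check for a reversal against every lock already held.
            for earlier in held:
                if (name, earlier) in self._before:
                    raise RuntimeError(
                        f"lock order reversal: {earlier} -> {name}, "
                        f"previously seen {name} -> {earlier}"
                    )
            # No reversal found, so record the orderings just observed.
            for earlier in held:
                self._before.add((earlier, name))
        held.append(name)

    def release(self, name):
        self._held.stack.remove(name)


if __name__ == "__main__":
    checker = LockOrderChecker()

    # BMAP-style path: data block first, then the indirect block.
    checker.acquire("data")
    checker.acquire("indirect")
    checker.release("indirect")
    checker.release("data")

    # Page-out to a sparse file: indirect block first, then the data block.
    checker.acquire("indirect")
    try:
        checker.acquire("data")
    except RuntimeError as err:
        print(err)
```

With real locks and two threads, one on each path, each thread can end up holding the lock the other one needs, which is the kind of hang described above.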