From owner-freebsd-stable@FreeBSD.ORG Sat Aug 20 18:01:33 2011
Date: Sat, 20 Aug 2011 12:33:29 -0500
From: Alan Cox
To: "Alexander V. Chernikov"
Cc: Kostik Belousov, freebsd-stable@freebsd.org, perryh@pluto.rain.com, daniel@digsys.bg
In-Reply-To: <4E4CCA6C.8020408@ipfw.ru>
Subject: Re: 32GB limit per swap device?
Reply-To: alc@freebsd.org
List-Id: Production branch of FreeBSD source code

On Thu, Aug 18, 2011 at 3:16 AM, Alexander V. Chernikov wrote:

> On 10.08.2011 19:16, perryh@pluto.rain.com wrote:
>
>> Chuck Swiger wrote:
>>
>>> On Aug 9, 2011, at 7:26 AM, Daniel Kalchev wrote:
>>>
>>>> I am trying to set up 64GB partitions for swap for a system that
>>>> has 64GB of RAM (with the idea to dump kernel core etc). But, on
>>>> 8-stable as of today I get:
>>>>
>>>> WARNING: reducing size to maximum of 67108864 blocks per swap unit
>>>>
>>>> Is there a workaround for this limitation?
>>>>
>>>
> Another interesting question:
>
> The swap pager operates in page blocks (PAGE_SIZE=4k on common arch).
>
> The block device size is passed to swaponsomething() as a number of _disk_
> blocks (i.e. in units of DEV_BSIZE=512). After that, the maximum objects
> check of the kernel b-lists (on top of which the swap pager is built) is
> enforced.
>
> The (possible) problem is that the real object count we will operate on is
> not the value passed to swaponsomething(), since the check is done in the
> wrong units.
>
> We should check the b-list limit against the (X * DEV_BSIZE / PAGE_SIZE)
> value, which is roughly (X / 8), so we should be able to address
> 32*8=256G.
>
> The code should look like this:
>
> Index: vm/swap_pager.c
> ===================================================================
> --- vm/swap_pager.c (revision 223877)
> +++ vm/swap_pager.c (working copy)
> @@ -2129,6 +2129,15 @@ swaponsomething(struct vnode *vp, void *id, u_long
>         u_long mblocks;
>
>         /*
> +        * nblks is in DEV_BSIZE'd chunks, convert to PAGE_SIZE'd chunks.
> +        * First chop nblks off to page-align it, then convert.
> +        *
> +        * sw->sw_nblks is in page-sized chunks now too.
> +        */
> +       nblks &= ~(ctodb(1) - 1);
> +       nblks = dbtoc(nblks);
> +
> +       /*
>          * If we go beyond this, we get overflows in the radix
>          * tree bitmap code.
>          */
> @@ -2138,14 +2147,6 @@ swaponsomething(struct vnode *vp, void *id, u_long
>                     mblocks);
>                 nblks = mblocks;
>         }
> -       /*
> -        * nblks is in DEV_BSIZE'd chunks, convert to PAGE_SIZE'd chunks.
> -        * First chop nblks off to page-align it, then convert.
> -        *
> -        * sw->sw_nblks is in page-sized chunks now too.
> -        */
> -       nblks &= ~(ctodb(1) - 1);
> -       nblks = dbtoc(nblks);
>
>         sp = malloc(sizeof *sp, M_VMPGDATA, M_WAITOK | M_ZERO);
>         sp->sw_vp = vp;
>
>
> (move the page-count conversion before the b-list check)
>
>
> Can someone comment on this?
>

I believe that you are correct. Have you tried testing this change on a
large swap device?

Alan